Excel BI - PowerQuery Challenge 329

excel-challenges
power-query
Transpose the given data from Problem Table into Result table as shown.
Published

March 24, 2026

Challenge Description

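Transpose the data in the Problem table into the Result table as shown. Judging by the solutions below, the Problem table has Country, State, and Cities columns, with each capital marked by a "(C)" suffix. The Result table is a two-column list that gives each country once, followed by its states, and for every state its capital and its other cities ("None" where there are none).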

Solutions

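library(tidyverse)
library(readxl)

path = "Power Query/300-399/329/PQ_Challenge_329.xlsx"
input = read_excel(path, range = "A1:C12")  # Problem table
test  = read_excel(path, range = "E1:F18")  # expected Result table

result = input %>%
  # Fill blank cells downward so every row carries its Country and State
  fill(everything()) %>%
  # Flag capitals by the "(C)" marker and strip the marker from the name
  mutate(Capital = str_remove(Cities, " \\(C\\)"),
         IsCapital = str_detect(Cities, "\\(C\\)")) %>%
  # One row per Country/State: the capital and the remaining cities
  summarise(
    Capital = if (any(IsCapital)) first(Capital[IsCapital]) else "None",
    `Other Cities` = if (any(!IsCapital)) paste(Cities[!IsCapital], collapse = ", ") else "None",
    .by = c(Country, State)
  ) %>%
  # Go from wide to long, keeping the original row order via rn
  mutate(rn = row_number()) %>%
  pivot_longer(-rn, names_to = "Type", values_to = "City") %>%
  # Keep only the first Country row for each country
  filter(!(Type == "Country" & duplicated(City))) %>%
  select(-rn) %>%
  replace_na(list(City = "None"))

# Align column names with the expected table before comparing
colnames(result) = colnames(test)

all.equal(result, test)
# TRUE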
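  • Logic:

    • Reads the Problem table (A1:C12) and the expected Result table (E1:F18) from the workbook

    • Fills blank cells downward, flags capitals by the "(C)" marker, and strips the marker from the city name

    • Aggregates by Country and State into one row holding the capital and a comma-separated list of the other cities, with "None" where either is missing

    • Pivots that table to long form and drops repeated Country rows, so each country appears once, followed by its states

  • Strengths:

    • The whole transformation is a single pipe: one grouped summarise() followed by pivot_longer(), with no explicit loops.
  • Areas for Improvement:

    • The cell ranges are hard-coded, so the code breaks if the tables move or grow. The duplicated(City) filter also checks the whole City column, so it would drop a country whose name matches any earlier value in that column.
  • Gem:

    • Collapsing the data to one row per Country/State pair first turns the final layout into a plain pivot_longer() call.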
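
To make that intermediate shape concrete, here is a minimal pandas sketch on made-up rows. The sample data, the placeholder names (A, A1, Alpha, and so on), and the helper `summarise_state` are assumptions for illustration only; the real Problem table lives in the workbook.

```python
import pandas as pd

# Hypothetical sample standing in for the Problem table (not the workbook data).
problem = pd.DataFrame({
    "Country": ["A", None, None, None, "B", None],
    "State": ["A1", None, "A2", None, "B1", None],
    "Cities": ["Alpha (C)", "Beta", "Gamma", "Delta", "Epsilon (C)", "Zeta"],
})

# Fill the grouping columns downward so every row carries its Country and State.
filled = problem.copy()
filled[["Country", "State"]] = filled[["Country", "State"]].ffill()

# Flag capitals by the " (C)" suffix and keep the name without it.
marker = " (C)"
filled["IsCapital"] = filled["Cities"].str.endswith(marker)
filled["City"] = filled["Cities"].str.replace(marker, "", regex=False)


def summarise_state(group: pd.DataFrame) -> pd.Series:
    """Collapse one Country/State group into its capital and other cities."""
    capitals = group.loc[group["IsCapital"], "City"]
    others = group.loc[~group["IsCapital"], "City"]
    return pd.Series({
        "Capital": capitals.iloc[0] if not capitals.empty else "None",
        "Other Cities": ", ".join(others) if not others.empty else "None",
    })


# One row per Country/State pair, in order of first appearance.
shape = (
    filled.groupby(["Country", "State"], sort=False)[["City", "IsCapital"]]
    .apply(summarise_state)
    .reset_index()
)
print(shape)
```

In this sample, state A2 has no flagged capital, so its Capital cell falls back to "None" while both of its cities land in Other Cities.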
import pandas as pd

path = "300-399/329/PQ_Challenge_329.xlsx"
# nrows counts data rows below the header: A1:C12 holds 11, E1:F18 holds 17
df = pd.read_excel(path, usecols="A:C", nrows=11).ffill()
test = pd.read_excel(path, usecols="E:F", nrows=17).fillna("None")

# Flag capitals by the "(C)" marker and keep the city name without it
df["cap"] = df["Cities"].str.contains(r"\(C\)")
df["city"] = df["Cities"].str.replace(r" \(C\)", "", regex=True)

# One row per Country/State: the capital and the comma-separated other cities
agg = (df.groupby(["Country", "State"], sort=False)[["cap", "city"]]
    .apply(lambda g: pd.Series({
        "Capital": g.loc[g.cap, "city"].drop_duplicates().iloc[0] if g.cap.any() else "None",
        # An empty join gives "", which is falsy, so it falls back to "None"
        "Other Cities": ", ".join(g.loc[~g.cap, "city"].drop_duplicates()) or "None"
    })).reset_index())

# Emit each country once, followed by three rows for each of its states
rows = []
for c in df["Country"].drop_duplicates():
    rows.append(pd.DataFrame({"Type": ["Country"], "City": [c]}))
    for s in df.loc[df["Country"].eq(c), "State"].drop_duplicates():
        r = agg.query("Country == @c and State == @s").iloc[0]
        rows.append(pd.DataFrame({
            "Type": ["State", "Capital", "Other Cities"],
            "City": [s, r["Capital"], r["Other Cities"]]
        }))

result = pd.concat(rows, ignore_index=True)
# Align column names with the expected table before comparing
result.columns = test.columns
print(result.equals(test))
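  • Logic:

    • Reads columns A:C and E:F from the workbook, forward-filling the Problem table and replacing blanks in the expected table with "None"

    • Flags capitals by the "(C)" marker, stores the cleaned city name, and groups by Country and State to get the capital and the comma-separated other cities

    • Loops over countries and their states, appending one Country row and then State, Capital, and Other Cities rows for each state

  • Strengths:

    • The explicit loops mirror the order of the Result table, so the output sequence is easy to follow.
  • Areas for Improvement:

    • As with the R version, the column letters and row counts are hard-coded, so any workbook layout change requires adjustments. Building many one-off DataFrames and calling query() inside a loop would also be slow on larger tables.
  • Gem:

    • The expression ", ".join(...) or "None" handles states with no other cities in one step: an empty join yields an empty string, which is falsy.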
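
The R solution reaches the final layout with pivot_longer() rather than a loop. A rough pandas analogue, offered only as a sketch, could use `melt` on the aggregated table. The `wide` frame below is hypothetical sample data, and the approach assumes that each country's rows sit next to each other.

```python
import pandas as pd

# Hypothetical aggregated table: one row per Country/State pair.
wide = pd.DataFrame({
    "Country": ["A", "A", "B"],
    "State": ["A1", "A2", "B1"],
    "Capital": ["Alpha", "None", "Epsilon"],
    "Other Cities": ["Beta", "Gamma, Delta", "Zeta"],
})

# Blank out repeated country names so each country is emitted only once.
wide["Country"] = wide["Country"].mask(wide["Country"].duplicated())

result = (
    wide.assign(rn=range(len(wide)))              # remember the source row
    .melt(id_vars="rn", var_name="Type", value_name="City")
    .dropna(subset=["City"])                      # drop the blanked countries
    .sort_values("rn", kind="stable")             # regroup rows by source row
    .drop(columns="rn")
    .reset_index(drop=True)
)
print(result)
```

`melt` stacks the columns one after another, so the stable sort on `rn` is what restores the Country, State, Capital, Other Cities order within each source row. The string "None" is ordinary text, so `dropna` leaves it in place.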

Difficulty Level

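This task is moderate:

  • It combines parsing (the "(C)" marker), grouping by Country and State, and a wide-to-long reshape, all common steps in Power Query style problems.

  • The main challenge is reproducing the Result table's structure exactly: the row order, a single row per country, and "None" placeholders for missing values.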
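
When the output does not match exactly, `DataFrame.equals` only returns True or False. One way to see where two tables differ, sketched here on made-up frames (`expected` and `result` are hypothetical stand-ins for the workbook table and the computed output), is `pandas.testing.assert_frame_equal` together with `DataFrame.compare`.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical frames: the second City value differs by a trailing space.
expected = pd.DataFrame({"Type": ["Country", "State"], "City": ["A", "A1"]})
result = pd.DataFrame({"Type": ["Country", "State"], "City": ["A", "A1 "]})

# equals() gives a verdict but no detail.
print(result.equals(expected))

# assert_frame_equal raises an AssertionError that describes the mismatch.
try:
    assert_frame_equal(result, expected)
    message = ""
except AssertionError as err:
    message = str(err)
    print(message)

# compare() keeps only the differing cells, side by side (self vs other).
diff = result.compare(expected)
print(diff)
```

`compare` needs both frames to share the same row and column labels, which is why the solutions above rename the result columns to those of the expected table before checking.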