Excel BI - PowerQuery Challenge 147

excel-challenges
power-query
Cust ID Cust Name Amount Type Smith R1
Published

March 24, 2026

Illustration for Excel BI - PowerQuery Challenge 147

Challenge Description

Cust ID Cust Name Amount Type Smith R1

Solutions

library(tidyverse)
library(readxl)

input = read_excel("Power Query/PQ_Challenge_147.xlsx", range = "A1:D17")
test  = read_excel("Power Query/PQ_Challenge_147.xlsx", range = "F1:I17") %>%
  janitor::clean_names()

reshape <- function(input) {
  input %>%
    janitor::clean_names() %>%
    mutate(nr = row_number()) %>%
    mutate(across(c(cust_id, cust_name, amount, type),
                  ~ ifelse(is.na(.), NA, cumsum(!is.na(.))),
                  .names = "index_{.col}"),
           max_index = pmax(index_cust_id, index_cust_name, index_amount, index_type, na.rm = TRUE)) %>%
    group_by(max_index) %>%
    mutate(across(c(cust_id, cust_name, amount, type),
                  ~ max(., na.rm = TRUE)),
           min_row = min(nr, na.rm = TRUE),
           max_row = max(nr, na.rm = TRUE)) %>%
    ungroup() %>%
    filter(!is.na(max_index)) %>%
    select(-starts_with("index_"), -max_index, -nr) %>%
    distinct() %>%
    mutate(row_seq = map2(min_row, max_row, seq)) %>%
    unnest(row_seq) %>%
    select(-min_row, -max_row, -row_seq) %>%
    group_by(cust_id) %>%
    mutate(type = paste0(type, row_number())) %>%
    ungroup()
}


result = reshape(input)

identical(result, test)
# [1] TRUE
  • Logic:

    • Reads the workbook range needed for the challenge

    • Reshapes the data into the structure required by the result table

    • Aggregates or ranks values at the relevant grouping level

    • Builds helper columns that drive the final output

  • Strengths:

    • The R solution stays close to the workbook logic and keeps the transformation compact.
  • Areas for Improvement:

    • The code assumes the workbook layout and selected ranges remain stable.
  • Gem:

    • The best part of the solution is choosing the right intermediate shape before formatting the final output.
import pandas as pd

input_data = pd.read_excel("PQ_Challenge_147.xlsx", usecols="A:D", nrows=17)
test = pd.read_excel("PQ_Challenge_147.xlsx", usecols="F:I", nrows=17)
test.columns = [c.strip().lower() for c in test.columns]

result = input_data.copy()
result.columns = [c.strip().lower() for c in result.columns]
result["nr"] = range(1, len(result) + 1)
for col in ["cust_id", "cust_name", "amount", "type"]:
    result[f"index_{col}"] = result[col].notna().cumsum().where(result[col].notna())
result["max_index"] = result[[f"index_{c}" for c in ["cust_id", "cust_name", "amount", "type"]]].max(axis=1)

agg = (
    result.groupby("max_index", dropna=True)
    .agg(
        cust_id=("cust_id", "max"),
        cust_name=("cust_name", "max"),
        amount=("amount", "max"),
        type=("type", "max"),
        min_row=("nr", "min"),
        max_row=("nr", "max"),
    )
    .reset_index(drop=True)
)
rows = []
for _, row in agg.iterrows():
    for seq in range(int(row["min_row"]), int(row["max_row"]) + 1):
        rows.append({
            "cust_id": row["cust_id"],
            "cust_name": row["cust_name"],
            "amount": row["amount"],
            "type": row["type"],
            "row_seq": seq,
        })
result2 = pd.DataFrame(rows).drop(columns="row_seq")
result2["type"] = result2.groupby("cust_id").cumcount().add(1).astype(str).radd(result2["type"])

print(result2.equals(test))
  • Logic:

    • Reads the workbook range needed for the challenge

    • Aggregates or ranks values at the relevant grouping level

    • Applies the rule iteratively until the output is complete

  • Strengths:

    • The Python version follows the same workbook rule in a direct pandas-oriented implementation.
  • Areas for Improvement:

    • As with the R version, any workbook layout change would require small adjustments.
  • Gem:

    • The implementation stays close to the source challenge instead of adding unnecessary abstraction.

Difficulty Level

This task is moderate:

  • It combines reshaping, grouping, or parsing steps that are common in Power Query style problems.

  • The main challenge is reproducing the workbook output structure exactly.