Excel BI - PowerQuery Challenge 327

excel-challenges
power-query
Data Answer Expected Shipment Month Mar Vegetables Amount
Published

March 24, 2026

Illustration for Excel BI - PowerQuery Challenge 327

Challenge Description

Data Answer Expected Shipment Month Mar Vegetables Amount

Solutions

library(tidyverse)
library(readxl)
library(janitor)

path <- "Power Query/300-399/327/PQ_Challenge_327.xlsx"
input <- read_excel(path, range = "A2:E21", col_names = FALSE) 
test  <- read_excel(path, range = "G2:H7")

input <- input %>% filter(`...1` != "Shipment Month")
header_rows <- which(grepl("Date", input$`...1`))
end_rows <- c(header_rows[-1] - 1, nrow(input))
sections <- map2(header_rows, end_rows, ~ input[.x:.y, ] %>% row_to_names(1))
sections <- map(sections, ~ t(.) %>% as_tibble(rownames = "Product"))
sections <- map(sections, ~ filter(.x, Product != "Date"))
sections <- map_dfr(sections, ~ pivot_longer(.x, -Product, names_to = "Date", values_to = "Sales"))
result <- sections %>% summarise(Amount <- sum(as.numeric(Sales), na.rm = T), .by = Product) %>%
  arrange(desc(Amount)) %>%
  rename(Vegetables <- Product)

all.equal(result, test)
  • Logic:

    • Reads the workbook range needed for the challenge

    • Reshapes the data into the structure required by the result table

    • Aggregates or ranks values at the relevant grouping level

  • Strengths:

    • The R solution stays close to the workbook logic and keeps the transformation compact.
  • Areas for Improvement:

    • The code assumes the workbook layout and selected ranges remain stable.
  • Gem:

    • The best part of the solution is choosing the right intermediate shape before formatting the final output.
import pandas as pd

path = "300-399/327/PQ_Challenge_327.xlsx"
input = pd.read_excel(path, header=None, usecols="A:E", skiprows=1, nrows=20)
test = pd.read_excel(path, usecols="G:H", skiprows=1, nrows=5)

df = input[~input[0].astype(str).str.contains("Shipment Month", na=False)].copy()
df["block"] = df[0].astype(str).str.contains("Date", na=False).cumsum()

def tidy(g):
    g = g.drop(columns="block").reset_index(drop=True)
    g.columns = g.iloc[0]
    return g.iloc[1:].melt(id_vars="Date", var_name="Vegetables", value_name="Sales")

long = pd.concat([tidy(g) for _, g in df.groupby("block")], ignore_index=True)

result = (
    long.assign(Sales=pd.to_numeric(long.Sales, errors="coerce"))
        .dropna(subset=["Sales"])
        .groupby("Vegetables", as_index=False)["Sales"].sum()
        .rename(columns={"Sales": "Amount"})
        .sort_values("Amount", ascending=False)
        .reset_index(drop=True)
)
result["Amount"] = result["Amount"].astype(int)

print(result.equals(test)) #True
  • Logic:

    • Reads the workbook range needed for the challenge

    • Reshapes the data into the structure required by the result table

    • Aggregates or ranks values at the relevant grouping level

    • Builds helper columns that drive the final output

  • Strengths:

    • The Python version follows the same workbook rule in a direct pandas-oriented implementation.
  • Areas for Improvement:

    • As with the R version, any workbook layout change would require small adjustments.
  • Gem:

    • The implementation stays close to the source challenge instead of adding unnecessary abstraction.

Difficulty Level

This task is moderate:

  • It combines reshaping, grouping, or parsing steps that are common in Power Query style problems.

  • The main challenge is reproducing the workbook output structure exactly.