Excel BI - PowerQuery Challenge 282

excel-challenges
power-query
Pivot the table with year and sales figures. Table 2 has names of departments.
Published

March 24, 2026

Illustration for Excel BI - PowerQuery Challenge 282

Challenge Description

Pivot the table with year and sales figures. Table 2 has names of departments.

Solutions

library(tidyverse)
library(readxl)

path = "Power Query/PQ_Challenge_282.xlsx"
input1 = read_excel(path, range = "A1:A22")
input2 = read_excel(path, range = "C1:C9")
test = read_excel(path, range = "E1:H4")

result = input1 %>%
  mutate(
    invalid = ifelse(Dept == "Employees" | lag(Dept) == "Employees", 1, 0),
    dept = ifelse(Dept %in% input2$Dept, Dept, NA),
    year = ifelse(str_detect(Dept, "^Y[0-9]{4}$"), Dept, NA)
  ) %>%
  replace_na(list(invalid = 0)) %>%
  filter(invalid == 0) %>%
  fill(dept, year) %>%
  filter(Dept != dept, Dept != year) %>%
  mutate(sdept = sum(as.numeric(Dept), na.rm = TRUE), .by = dept) %>%
  pivot_wider(names_from = year, values_from = Dept) %>%
  arrange(desc(sdept)) %>%
  select(-c(sdept, invalid))
  • Logic:

    • Reads the workbook range needed for the challenge

    • Reshapes the data into the structure required by the result table

    • Builds helper columns that drive the final output

    • Uses direct pattern parsing where the workbook encodes logic in text

  • Strengths:

    • The R solution stays close to the workbook logic and keeps the transformation compact.
  • Areas for Improvement:

    • The code assumes the workbook layout and selected ranges remain stable.
  • Gem:

    • The best part of the solution is choosing the right intermediate shape before formatting the final output.
import pandas as pd
import numpy as np

path = "PQ_Challenge_282.xlsx"
input1 = pd.read_excel(path, usecols="A", nrows=22, names=["Dept"])
input2 = pd.read_excel(path, usecols="C", nrows=9, names=["Dept"])
test = pd.read_excel(path, usecols="E:H", nrows=4)

input1["invalid"] = np.where(
    (input1["Dept"] == "Employees") | (input1["Dept"].shift(1) == "Employees"), 1, 0
)
input1["dept"] = np.where(input1["Dept"].isin(input2["Dept"]), input1["Dept"], np.nan)
input1["year"] = input1["Dept"].where(input1["Dept"].str.match(r"Y200[1-3]"), np.nan)

input1 = input1[input1["invalid"] == 0]
input1["dept"] = input1["dept"].ffill()
input1["year"] = input1["year"].ffill()

input1 = input1[(input1["Dept"] != input1["dept"]) & (input1["Dept"] != input1["year"])]
input1["sdept"] = input1.groupby("dept")["Dept"].transform(lambda x: x.astype(float).sum(skipna=True))

input1_pivot = input1.pivot_table(index=["dept","invalid","sdept"], columns="year", values="dept")
input1_pivot = input1_pivot.sort_values(by="sdept", ascending=False).reset_index()
input1_pivot.columns.name = None
input1_pivot = input1_pivot.drop(columns=["invalid", "sdept"])

print(input1_pivot)
  • Logic:

    • Reads the workbook range needed for the challenge

    • Reshapes the data into the structure required by the result table

    • Aggregates or ranks values at the relevant grouping level

  • Strengths:

    • The Python version follows the same workbook rule in a direct pandas-oriented implementation.
  • Areas for Improvement:

    • As with the R version, any workbook layout change would require small adjustments.
  • Gem:

    • The implementation stays close to the source challenge instead of adding unnecessary abstraction.

Difficulty Level

This task is moderate:

  • It combines reshaping, grouping, or parsing steps that are common in Power Query style problems.

  • The main challenge is reproducing the workbook output structure exactly.