Excel BI - PowerQuery Challenge 308

excel-challenges
power-query
Published

March 24, 2026

Challenge Description

Pivot the given table as shown and

Solutions

library(tidyverse)
library(readxl)
library(janitor)

path = "Power Query/300-399/308/PQ_Challenge_308.xlsx"
input = read_excel(path, range = "B2:B6")
test  = read_excel(path, range = "D2:F7")

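result = input %>%
  # Split each line into the company name and the remaining free text
  extract(Data, into = c("Company", "Data"),
          regex = "(Company [A-Z])(.*)") %>%
  mutate(Data = str_trim(str_to_lower(Data))) %>%
  # Collect every "label, separator, number" pair as a list column
  mutate(Data = str_extract_all(Data, "[a-z]+[\\s:-]*\\d+")) %>%
  # One row per pair
  unnest_longer(Data) %>%
  # Helper columns: Data_1 holds the label, Data_2 the numeric value
  mutate(Data_1 = str_extract(Data, "[a-z]+"),
         Data_2 = as.numeric(str_extract(Data, "\\d+"))) %>%
  # Year entries do not appear in the expected result
  filter(Data_1 != "year") %>%
  select(-Data) %>%
  # One column per label, summing repeated labels within a company
  pivot_wider(names_from = Data_1, values_from = Data_2, values_fn = sum) %>%
  rename_with(str_to_title) %>%
  # Append the totals row (janitor)
  adorn_totals("row")

all.equal(result, test, check.attributes = FALSE)
#> [1] TRUE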
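  • Logic:

    • Reads the input range (B2:B6) and the expected result (D2:F7) from the workbook

    • Splits each line into the company name and the remaining text, then extracts every label-number pair with a regular expression

    • Builds helper columns (Data_1 for the label, Data_2 for the numeric value) and drops the year entries

    • Pivots the labels into columns, summing values per company, title-cases the headers, and appends a totals row

  • Strengths:

    • The whole transformation is a single pipe, and one regular expression tolerates the varying separators (spaces, colons, hyphens) between a label and its number.
  • Areas for Improvement:

    • The ranges B2:B6 and D2:F7 are hard-coded, so the code breaks if the workbook layout changes.
  • Gem:

    • Unnesting to one row per label-value pair before calling pivot_wider makes both the summing and the final layout straightforward.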
import pandas as pd

path = "300-399/308/PQ_Challenge_308.xlsx"
# Named `raw` rather than `input` to avoid shadowing the Python built-in
raw = pd.read_excel(path, usecols="B", skiprows=1, nrows=4)
test = pd.read_excel(path, usecols="D:F", skiprows=1, nrows=5)

# Split each line into the company name and the remaining free text
raw[['Company', 'Data']] = raw['Data'].str.extract(r'(Company [A-Z])(.+)')
# Lower-case the text and collect every "label, separator, number" pair
raw['Data'] = raw['Data'].str.lower().str.findall(r'[a-z]+[\s:-]*\d+')
# One row per pair, then separate the label from its value
unnested = raw.explode('Data').dropna()
unnested[['Type', 'Value']] = unnested['Data'].str.extract(r'([a-z]+).*?(\d+)')
unnested['Value'] = unnested['Value'].astype(int)
# Year entries do not appear in the expected result
unnested = unnested[unnested['Type'] != 'year']

# One column per label, summing repeated labels within a company
pivot = unnested.pivot_table(index='Company', columns='Type', values='Value',
                             aggfunc='sum').reset_index()
pivot.columns = [c.title() for c in pivot.columns]
# Append the Total row and fix the column order
totals = {col: 'Total' if col == 'Company' else pivot[col].astype(int).sum()
          for col in pivot.columns}
pivot = pd.concat([pivot, pd.DataFrame([totals])], ignore_index=True)[['Company', 'Revenue', 'Cost']]

print(pivot.equals(test))
# True
  • Logic:

    • Reads the input column (B) and the expected result (D:F) from the workbook

    • Splits off the company name and finds every label-number pair with a regular expression

    • Explodes the pairs to one row each, separates the label from its value, and drops the year entries

    • Pivots the labels into columns with summed values, then appends a Total row and orders the columns as Company, Revenue, Cost

  • Strengths:

    • The Python version mirrors the R pipeline step for step (extract, explode, pivot_table) using vectorized pandas string methods.
  • Areas for Improvement:

    • As with the R version, the row and column ranges are hard-coded, so a workbook layout change requires adjusting usecols, skiprows, and nrows.
  • Gem:

    • str.findall followed by explode turns a variable number of label-value pairs per line into tidy rows without an explicit loop.
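
The parse-then-pivot idea used by both solutions can also be sketched with only the Python standard library, which makes it easy to try without the workbook. The sample rows below are hypothetical (the real workbook text is not reproduced here), and `pivot_rows` is an illustrative name, not part of either solution. The patterns are adapted from the ones in the code above.

```python
import re
from collections import defaultdict

# Hypothetical sample rows; the real workbook text is not reproduced here.
rows = [
    "Company A Revenue: 100 Cost - 40 Year 2024",
    "Company B revenue 250, cost:90",
    "Company A Cost 10",
]

# A company prefix, then label/number pairs separated by any mix of
# spaces, colons, or hyphens (adapted from the solutions above).
COMPANY = re.compile(r"(Company [A-Z])(.*)")
PAIR = re.compile(r"([a-z]+)[\s:-]*(\d+)")


def pivot_rows(rows):
    """Return {company: {label: summed value}} plus a final 'Total' entry."""
    table = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for row in rows:
        match = COMPANY.match(row)
        if match is None:
            continue  # skip lines without a company prefix
        company, rest = match.groups()
        for label, value in PAIR.findall(rest.lower()):
            if label == "year":
                continue  # year entries are not part of the output
            table[company][label.title()] += int(value)
            totals[label.title()] += int(value)
    result = {company: dict(cols) for company, cols in table.items()}
    result["Total"] = dict(totals)
    return result


result = pivot_rows(rows)
for company, cols in result.items():
    print(company, cols)
```

Repeated labels for the same company are summed, which plays the role of `values_fn = sum` in the R version and of the aggregation in `pivot_table`.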

Difficulty Level

This task is moderate:

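  • It combines regex parsing of free text with reshaping (unnest, pivot) and aggregation, all of which are common in Power Query style problems.

  • The main challenge is reproducing the workbook output structure exactly, including the column order, the header casing, and the Total row.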
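
When an exact comparison fails, it helps to see which part of the structure differs. Note that pandas `DataFrame.equals` also requires matching dtypes, so a float column never equals an integer one even when the numbers agree. The following is a minimal sketch of a more forgiving check on plain Python tables; `table_diff` and the sample tables are hypothetical and only illustrate the idea.

```python
from numbers import Number


def table_diff(actual, expected):
    """List differences between two tables given as (header, rows).

    Column order matters; numeric cells are compared by value,
    so 100 and 100.0 count as equal.
    """
    problems = []
    (a_head, a_rows), (e_head, e_rows) = actual, expected
    if a_head != e_head:
        problems.append(f"columns differ: {a_head} vs {e_head}")
    if len(a_rows) != len(e_rows):
        problems.append(f"row count differs: {len(a_rows)} vs {len(e_rows)}")
    for i, (a_row, e_row) in enumerate(zip(a_rows, e_rows)):
        for col, x, y in zip(e_head, a_row, e_row):
            both_numeric = isinstance(x, Number) and isinstance(y, Number)
            same = float(x) == float(y) if both_numeric else x == y
            if not same:
                problems.append(f"row {i}, {col}: {x!r} vs {y!r}")
    return problems


# Hypothetical tables, not the workbook's actual values.
expected = (["Company", "Revenue", "Cost"],
            [["Company A", 100, 50], ["Total", 100, 50]])
as_floats = (["Company", "Revenue", "Cost"],
             [["Company A", 100.0, 50.0], ["Total", 100.0, 50.0]])
swapped = (["Company", "Cost", "Revenue"],
           [["Company A", 50, 100], ["Total", 50, 100]])

print(table_diff(as_floats, expected))  # numeric type alone is not reported
print(table_diff(swapped, expected))    # column order and cell mismatches
```

A report like this points directly at a swapped column or a missing Total row, which a single True or False cannot do.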