Excel BI - PowerQuery Challenge 360

excel-challenges
power-query
Pivot the given data as shown showing total hours for each employee for different project codes. Also insert a Total row and a Total column.
Published

March 24, 2026

Illustration for Excel BI - PowerQuery Challenge 360

Challenge Description

Pivot the given data as shown showing total hours for each employee for different project codes. Also insert a Total row and a Total column.

Solutions

library(tidyverse)
library(readxl)

path <- "Power Query/300-399/360/PQ_Challenge_360.xlsx"
input <- read_excel(path, range = "A1:B101")
test <- read_excel(path, range = "D1:H8")

result = input %>%
  separate_wider_delim(cols = Log, delim = " | ", names = c("Log", "Emp")) %>%
  separate_longer_delim(cols = Emp, delim = ", ") %>%
  mutate(
    Project = str_extract(Log, "\\[(.*)\\]"),
    hours = str_extract(Log, "\\d+(\\.\\d+)?(?=\\s*(h|hr|hrs))") %>%
      as.numeric()
  ) %>%
  summarise(hours = sum(hours, na.rm = TRUE), .by = c(Emp, Project)) %>%
  arrange(Project) %>%
  pivot_wider(names_from = Project, values_from = hours, values_fill = 0) %>%
  janitor::adorn_totals(c("row", "col")) %>%
  arrange(Emp)

all.equal(result, test, check.attributes = FALSE)
# [1] TRUE
  • Logic:

    • Reads the workbook range needed for the challenge

    • Reshapes the data into the structure required by the result table

    • Aggregates or ranks values at the relevant grouping level

    • Builds helper columns that drive the final output

  • Strengths:

    • The R solution stays close to the workbook logic and keeps the transformation compact.
  • Areas for Improvement:

    • The code assumes the workbook layout and selected ranges remain stable.
  • Gem:

    • The best part of the solution is choosing the right intermediate shape before formatting the final output.
import pandas as pd
import re

path = "Power Query/300-399/360/PQ_Challenge_360.xlsx"
input = pd.read_excel(path, usecols="A:B", nrows=101)
test = pd.read_excel(path, usecols="D:H", nrows=7)

input['bracketed_substring'] = input.apply(lambda row: re.search(r'\[(.*?)\]', str(row)).group(1) if re.search(r'\[(.*?)\]', str(row)) else None, axis=1)
input['hours'] = input.apply(lambda row: float(re.search(r'\b(\d+(?:\.\d+)?)\s*(h|hr|hrs)\b', str(row), re.IGNORECASE).group(1)) if re.search(r'\b(\d+(?:\.\d+)?)\s*(h|hr|hrs)\b', str(row), re.IGNORECASE) else None, axis=1)
input['Employee'] = input.apply(lambda row: re.search(r'\|\s*(.*)', str(row)).group(1) if re.search(r'\|\s*(.*)', str(row)) else None, axis=1)
input['Employee'] = input['Employee'].str.split(', ')
input = input.explode('Employee')

summary = input.groupby(['Employee', 'bracketed_substring'])['hours'].sum().astype('int64').reset_index(drop=False)
pivot_table = summary.pivot(index='Employee', columns='bracketed_substring', values='hours').fillna(0).reset_index()
pivot_table.columns.name = None
pivot_table = pivot_table.astype({col: 'int64' for col in pivot_table.columns if col != 'Employee'})
pivot_table['Total'] = pivot_table.drop('Employee', axis=1).sum(axis=1)
total_row = pivot_table.drop('Employee', axis=1).sum()
total_row['Employee'] = 'Total'
pivot_table = pd.concat([pivot_table, total_row.to_frame().T], ignore_index=True)

print(all(pivot_table==test))
# True
  • Logic:

    • Reads the workbook range needed for the challenge

    • Reshapes the data into the structure required by the result table

    • Aggregates or ranks values at the relevant grouping level

    • Uses direct pattern parsing where the workbook encodes logic in text

  • Strengths:

    • The Python version follows the same workbook rule in a direct pandas-oriented implementation.
  • Areas for Improvement:

    • As with the R version, any workbook layout change would require small adjustments.
  • Gem:

    • The implementation stays close to the source challenge instead of adding unnecessary abstraction.

Difficulty Level

This task is moderate:

  • It combines reshaping, grouping, or parsing steps that are common in Power Query style problems.

  • The main challenge is reproducing the workbook output structure exactly.