Excel BI - PowerQuery Challenge 369

excel-challenges
power-query
HxId,ParentId,CreatedDate,OldValue,NewValue ParentId Status Path Lifecycle Days H001,APP-101,2024-01-01,Created APP-101
Published

March 24, 2026

Illustration for Excel BI - PowerQuery Challenge 369

Challenge Description

HxId,ParentId,CreatedDate,OldValue,NewValue ParentId Status Path Lifecycle Days H001,APP-101,2024-01-01,Created APP-101

Solutions

library(tidyverse)
library(readxl)

path <- "Power Query/300-399/369/PQ_Challenge_369.xlsx"
input <- read_excel(path, range = "A1:A21", col_names = FALSE)
test <- read_excel(path, range = "C1:E4")

result <- input %>%
  separate_wider_delim(
    col = 1,
    delim = ",",
    names = c("HxId", "ParentId", "CreatedDate", "OldValue", "NewValue"),
    too_few = "align_start"
  ) %>%
  janitor::row_to_names(1) %>%
  mutate(
    CreatedDate = as.Date(CreatedDate),
    Status = coalesce(NewValue, OldValue)
  ) %>%
  arrange(ParentId, CreatedDate) %>%
  summarise(
    `Status Path` = str_c(Status, collapse = " > "),
    `Lifecycle Days` = as.integer(max(CreatedDate) - min(CreatedDate)),
    .by = ParentId
  )

all.equal(result, test)
#> [1] TRUE
  • Logic:

    • Reads the workbook range needed for the challenge

    • Aggregates or ranks values at the relevant grouping level

    • Builds helper columns that drive the final output

    • Uses direct pattern parsing where the workbook encodes logic in text

  • Strengths:

    • The R solution stays close to the workbook logic and keeps the transformation compact.
  • Areas for Improvement:

    • The code assumes the workbook layout and selected ranges remain stable.
  • Gem:

    • The best part of the solution is choosing the right intermediate shape before formatting the final output.
import pandas as pd

path = "Power Query/300-399/369/PQ_Challenge_369.xlsx"

input = pd.read_excel(path, usecols="A", nrows=21, header=None)
test = pd.read_excel(path, usecols="C:E", nrows=3)

input = input[0].str.split(",", expand=True)
input.columns = input.iloc[0]
input = input[1:]

input["CreatedDate"] = pd.to_datetime(input["CreatedDate"], errors='coerce')
input["Status"] = input["NewValue"].combine_first(input["OldValue"])
input = input.sort_values(by=["ParentId", "CreatedDate"])

result = input.groupby("ParentId").agg(
    Status_Path=("Status", lambda x: " > ".join(x.dropna())),
    Lifecycle_Days=("CreatedDate", lambda x: (x.max() - x.min()).days)
).reset_index().rename(columns={"Status_Path": "Status Path", "Lifecycle_Days": "Lifecycle Days"})
print(result.equals(test))
# > True
  • Logic:

    • Reads the workbook range needed for the challenge

    • Aggregates or ranks values at the relevant grouping level

  • Strengths:

    • The Python version follows the same workbook rule in a direct pandas-oriented implementation.
  • Areas for Improvement:

    • As with the R version, any workbook layout change would require small adjustments.
  • Gem:

    • The implementation stays close to the source challenge instead of adding unnecessary abstraction.

Difficulty Level

This task is moderate:

  • It combines reshaping, grouping, or parsing steps that are common in Power Query style problems.

  • The main challenge is reproducing the workbook output structure exactly.