Excel BI - PowerQuery Challenge 209

excel-challenges
power-query
Published: March 24, 2026


Challenge Description

Populate the end date for every owner in a pivot layout, as shown in the workbook. Here, End Date = Start Date + the maximum Duration Days among the tasks.
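
The rule itself is plain date arithmetic. As a minimal sketch, assuming a single row with a made-up start date and hypothetical task durations of 3, 7, and 5 days (none of these values come from the challenge workbook), the end date is the start date plus the longest duration:

```python
import pandas as pd

# Hypothetical values for illustration; they are not taken from the workbook.
start_date = pd.Timestamp("2024-01-10")
task_durations = [3, 7, 5]  # Duration Days of the tasks listed on one row

# End Date = Start Date + max of the task durations
end_date = start_date + pd.Timedelta(days=max(task_durations))
print(end_date.date())  # 2024-01-17
```

Only the longest task matters here, so the shorter durations (3 and 5) do not affect the result.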

Solutions

library(tidyverse)
library(readxl)

path = 'Power Query/PQ_Challenge_209.xlsx'
input1 = read_excel(path, range = "A2:C10")
input2 = read_excel(path, range = "A13:C17")
test  = read_excel(path, range = "F1:J5") %>%
  mutate(across(-1, as.Date))

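i1 = input1 %>%
  # number repeated rows of the same Process so each row forms its own group
  mutate(process_part = row_number(), .by = Process) %>%
  # one row per task, then attach the columns of the second table
  separate_rows(Task, sep = ", ") %>%
  left_join(input2, by = c("Task")) %>%
  # End Date = Start Date + longest task duration within the group
  mutate(max_dur = max(`Duration Days`, na.rm = TRUE),
         end_date = as.Date(`Start Date`) + max_dur,
         .by = c(Process, process_part)) %>%
  select(Owner, Process, end_date) %>%
  # owners become columns; fix the column order to match the expected table
  pivot_wider(names_from = Owner, values_from = end_date) %>%
  select(Process, Anne, Lisa, Nathan, Robert)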

identical(i1, test)
# [1] TRUE
  • Logic:

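    • Reads the two input tables (A2:C10 and A13:C17) and the expected result (F1:J5) from the workbook

    • Splits the comma-separated Task column into one row per task, joins the second table on Task, and pivots the end dates into one column per Owner

    • Adds process_part to keep repeated rows of the same Process apart, and max_dur to hold the longest Duration Days in each group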

  • Strengths:

    • The R solution follows the workbook rule directly and keeps the whole transformation in a single pipe.
  • Areas for Improvement:

    • The cell ranges and the owner names in the final select() are hard-coded, so the code only works while the workbook layout and the list of owners stay the same.
  • Gem:

    • The best part of the solution is choosing the right intermediate shape: with one row per task, both the group maximum and the final pivot become single steps.
import pandas as pd

path = "PQ_Challenge_209.xlsx"
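# Read the two input tables and the expected result from fixed positions
input1 = pd.read_excel(path, usecols="A:C", skiprows=1, nrows=8)
input2 = pd.read_excel(path, usecols="A:C", skiprows=12, nrows=4)
test = pd.read_excel(path, usecols="F:J", nrows=4)
# Drop the '.1' suffix from header names; regex=False treats '.' literally
test.columns = test.columns.str.replace('.1', '', regex=False)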

# Number repeated rows of the same Process so each row forms its own group
input1['process_part'] = input1.groupby('Process').cumcount() + 1
# One row per task, then attach the columns of the second table
input1['Task'] = input1['Task'].str.split(', ')
input1 = input1.explode('Task')
i1 = input1.merge(input2, on='Task', how='left')
# End Date = Start Date + longest task duration within the group
i1['max_dur'] = i1.groupby(['Process', 'process_part'])['Duration Days'].transform('max')
i1['end_date'] = pd.to_datetime(i1['Start Date']) + pd.to_timedelta(i1['max_dur'], unit='D')
# Owners become columns; fix the column order to match the expected table
i1 = i1.pivot_table(index='Process', columns='Owner', values='end_date', aggfunc='first').reset_index()
i1 = i1[['Process', 'Anne', 'Lisa', 'Nathan', 'Robert']]
i1.columns.name = None

print(i1.equals(test)) # True
  • Logic:

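    • Reads the two input tables and the expected result from fixed columns and row offsets (usecols, skiprows, nrows)

    • Splits and explodes the Task column into one row per task, merges the second table on Task, and pivots the end dates into one column per Owner

    • Computes the maximum Duration Days per Process and process_part with groupby(...).transform('max')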

  • Strengths:

    • The Python version applies the same workbook rule and mirrors the R pipeline step for step (explode, merge, grouped transform, pivot_table), so the two solutions are easy to compare.
  • Areas for Improvement:

    • As with the R version, the columns, row offsets, and owner names are hard-coded, so any change to the workbook layout would require adjusting them.
  • Gem:

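    • The implementation stays close to the source challenge instead of adding unnecessary abstraction: transform('max') writes each group's maximum back onto its rows, so no separate aggregate-and-rejoin step is needed.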
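
One way to loosen the dependence on hard-coded owner names is to derive the column list from the pivoted frame. The following is only a sketch on hypothetical long-format rows (the `long_df` frame and its values are made up for illustration and are not the workbook data); it assumes that alphabetical owner order is what the expected table uses.

```python
import pandas as pd

# Hypothetical long-format rows (one per process and owner); not the workbook data.
long_df = pd.DataFrame({
    "Process": ["P1", "P1", "P2"],
    "Owner": ["Lisa", "Anne", "Anne"],
    "end_date": pd.to_datetime(["2024-01-17", "2024-01-17", "2024-02-05"]),
})

wide = long_df.pivot_table(index="Process", columns="Owner",
                           values="end_date", aggfunc="first").reset_index()
wide.columns.name = None

# Build the owner list from the data instead of typing the names by hand.
owners = sorted(c for c in wide.columns if c != "Process")
wide = wide[["Process"] + owners]
print(wide.columns.tolist())  # ['Process', 'Anne', 'Lisa']
```

A process with no row for a given owner ends up with a missing value (NaT) in that owner's column, which is the usual behaviour of a pivot on sparse data.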

Difficulty Level

This task is moderate:

  • It combines parsing (splitting the comma-separated tasks), a lookup join, a grouped maximum, and a pivot, all of which are common in Power Query style problems.

  • The main challenge is reproducing the workbook output structure exactly, including column order and date types, so that the final equality check passes.
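
When an exact-match check fails, a bare True/False gives no hint about the cause. As a hedged sketch, `pandas.testing.assert_frame_equal` can be used in place of `equals` to report the first difference it finds. The two frames below are hypothetical and differ only in that one date column was left as text.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical frames: same values, but one date column was left as text.
expected = pd.DataFrame({"Process": ["P1"], "Anne": pd.to_datetime(["2024-01-17"])})
result = pd.DataFrame({"Process": ["P1"], "Anne": ["2024-01-17"]})

print(result.equals(expected))  # False, with no explanation

try:
    assert_frame_equal(result, expected)
    message = ""
except AssertionError as err:
    message = str(err)  # describes the first difference found
    print(message)
```

Converting the text column with `pd.to_datetime` before comparing would make both checks pass in this toy case.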