library(tidyverse)
library(readxl)
path = "Excel/645 Align WBS Data.xlsx"
input = read_excel(path, range = "A1:B30")
test = read_excel(path, range = "E1:J30")
result = reduce(0:5, function(data, i) {
col = as.character(i)
data = data %>%
mutate(!!col := if_else(str_detect(WBS, paste0("WBS_", col)), WBS, NA_character_))
data = if (i == 0) {
data %>% fill(!!sym(col), .direction = "down")
} else {
data %>% group_by(!!sym(as.character(i - 1))) %>%
fill(!!sym(col), .direction = "down") %>%
ungroup()
}
data
}, .init = input) %>%
mutate(across(everything(), ~replace_na(.x, "XXX"))) %>%
select(-WBS, -ID)
all.equal(result, test, check.attributes = FALSE) # TRUEExcel BI - Excel Challenge 645
excel-challenges
excel-formulas
🔰 ID WBS WBS_0 XXX WBS_1 WBS_2A MLSSIP1000 SIPEXESTR00005 SIPEXESTR00015 WBS_2B

Challenge Description
🔰 ID WBS WBS_0 XXX WBS_1 WBS_2A MLSSIP1000 SIPEXESTR00005 SIPEXESTR00015 WBS_2B
Solutions
- Logic: Read the workbook ranges needed for the challenge; Derive the required intermediate columns; Aggregate or rank the data at the required grouping level.
- Strengths: The transformation is organized around the correct grouping level, which keeps the business logic clear.
- Areas for Improvement: The solution assumes the workbook layout and selected ranges remain stable, so any structural change in the sheet would require small adjustments.
- Gem: The key move is solving the problem at the right grain before shaping the final output.
import pandas as pd
import numpy as np
path = "645 Align WBS Data.xlsx"
input = pd.read_excel(path, usecols="A:B", nrows=30)
test = pd.read_excel(path, usecols="E:J", nrows=30)
def process_data(input):
for i in range(6):
col = str(i)
input[col] = np.where(input['WBS'].str.contains(f'WBS_{col}'), input['WBS'], np.nan)
input[col] = input[col].ffill() if i == 0 else input.groupby(str(i - 1))[col].ffill()
return input.fillna("XXX").drop(columns=['WBS', 'ID'])
result = process_data(input)
result.columns = test.columns
print(result.equals(test)) # TrueThe Python version follows the same grouped logic and keeps the transformation explicit in a dataframe pipeline.
Difficulty Level
Medium
The individual steps are manageable, but the correct transformation pattern is not obvious from the raw data.