library(tidyverse)
library(readxl)
path = "Power Query/300-399/330/PQ_Challenge_330.xlsx"
input = read_excel(path, range = "A1:B35")
test = read_excel(path, range = "D1:F12")
result = input %>%
mutate(Year = ifelse(Data1 == "Year", Data2, NA)) %>%
fill(Year) %>%
filter(str_detect(Data1, "Year|TOTAL|Continent", negate = TRUE),
Data2 > 0) %>%
select(Continent = Data1, Year, Sales = Data2) %>%
mutate(across(c(Year, Sales), as.integer))
all.equal(result, test)
# [1] TRUEExcel BI - PowerQuery Challenge 330
excel-challenges
power-query
Data1 Data2 Continent Year Sales Asia

Challenge Description
Data1 Data2 Continent Year Sales Asia
Solutions
Logic:
Reads the workbook range needed for the challenge
Builds helper columns that drive the final output
Uses direct pattern parsing where the workbook encodes logic in text
Strengths:
- The R solution stays close to the workbook logic and keeps the transformation compact.
Areas for Improvement:
- The code assumes the workbook layout and selected ranges remain stable.
Gem:
- The best part of the solution is choosing the right intermediate shape before formatting the final output.
import pandas as pd
path = "300-399/330/PQ_Challenge_330.xlsx"
input = pd.read_excel(path, usecols="A:B", nrows=35)
test = pd.read_excel(path, usecols="D:F", nrows=11)
input['Year'] = input['Data2'].where(input['Data1'] == 'Year').ffill()
mask = (~input['Data1'].str.contains("Year|TOTAL|Continent", na=False) & pd.to_numeric(input['Data2'], errors='coerce').gt(0))
result = input.loc[mask, ['Data1', 'Year', 'Data2']].rename(columns={'Data1': 'Continent', 'Data2': 'Sales'}).reset_index(drop=True)
result[['Year', 'Sales']] = result[['Year', 'Sales']].astype(int)
print(result.equals(test))Logic:
- Reads the workbook range needed for the challenge
Strengths:
- The Python version follows the same workbook rule in a direct pandas-oriented implementation.
Areas for Improvement:
- As with the R version, any workbook layout change would require small adjustments.
Gem:
- The implementation stays close to the source challenge instead of adding unnecessary abstraction.
Difficulty Level
This task is moderate:
It combines reshaping, grouping, or parsing steps that are common in Power Query style problems.
The main challenge is reproducing the workbook output structure exactly.