library(tidyverse)
library(readxl)
path = "Power Query/PQ_Challenge_281.xlsx"
input = read_excel(path, range = "A1:A9")
test = read_excel(path, range = "C1:D10")
# Solution 1
result = input %>%
separate_rows(Data, sep = ", ") %>%
filter(str_detect(Data, "\\(C\\)")) %>%
separate(Data, into = c("State", "Capital"), sep = " \\(C\\) - ")
# Solution 2
result <- input %>%
separate_rows(Data, sep = ", ") %>%
extract(Data, c("State", "Capital"), "^(.*?) \\(C\\) - (.*)$") %>%
drop_na()Excel BI - PowerQuery Challenge 281
excel-challenges
power-query
Extract the capitals of different states. States appear after dash (‘-’) and capitals are followed by ‘(C)’.

Challenge Description
Extract the capitals of different states. States appear after dash (“-”) and capitals are followed by “(C)”.
Solutions
Logic:
Reads the workbook range needed for the challenge
Uses direct pattern parsing where the workbook encodes logic in text
Strengths:
- The R solution stays close to the workbook logic and keeps the transformation compact.
Areas for Improvement:
- The code assumes the workbook layout and selected ranges remain stable.
Gem:
- The best part of the solution is choosing the right intermediate shape before formatting the final output.
import pandas as pd
path = "PQ_Challenge_281.xlsx"
input = pd.read_excel(path, usecols="A", nrows=8)
test = pd.read_excel(path, usecols="C:D", nrows=10)
result = input.assign(Data=input['Data'].str.split(',')).explode('Data')
result[['Capital', 'State']] = result['Data'].str.split(r' \(C\) - ', expand=True)
result = result.drop(columns=['Data']).sort_values(by=['State'])
result = result[["State", "Capital"]].reset_index(drop=True)
result = result.dropna()
print(result)
print(test)Logic:
Reads the workbook range needed for the challenge
Builds helper columns that drive the final output
Strengths:
- The Python version follows the same workbook rule in a direct pandas-oriented implementation.
Areas for Improvement:
- As with the R version, any workbook layout change would require small adjustments.
Gem:
- The implementation stays close to the source challenge instead of adding unnecessary abstraction.
Difficulty Level
This task is moderate:
It combines reshaping, grouping, or parsing steps that are common in Power Query style problems.
The main challenge is reproducing the workbook output structure exactly.