Excel BI - PowerQuery Challenge 190

excel-challenges
power-query
Split the data into columns.
Published

March 24, 2026

Illustration for Excel BI - PowerQuery Challenge 190

Challenge Description

Split the data into columns.

Solutions

library(tidyverse)
library(readxl)
library(rebus)

input = read_excel("Power Query/PQ_Challenge_190.xlsx", range = "A1:A3")
test  = read_excel("Power Query/PQ_Challenge_190.xlsx", range = "A6:E8")

name_pattern = "Name:" %R% capture(one_or_more(WRD)) %R% "Org:"
org_pattern = "Org:" %R% capture(one_or_more(WRD)) %R% "City:"
city_pattern = "City:" %R% capture(one_or_more(WRD)) %R% "FromDate:"
from_date_pattern = "FromDate:" %R% capture(one_or_more(WRD)) %R% "ToDate:"
to_date_pattern = "ToDate:" %R% capture(one_or_more(WRD))

extract_and_space <- function(a, name_pattern) {
  extracted <- str_match(a, name_pattern)
  result <- extracted %>% 
    pluck(2) %>%
    {if (is.na(.)) extracted %>% pluck(1) else .} %>%
    str_replace_all("([a-z])([A-Z])", "\\1 \\2") %>%
    str_replace_all("([A-Z])([A-Z][a-z])", "\\1 \\2")
  
  return(result)
}

result = input %>%
  mutate(Name = map_chr(Data, ~extract_and_space(.x, name_pattern)),
         Org = map_chr(Data, ~ str_match(.x, org_pattern) %>% pluck(2)),
         City = map_chr(Data, ~ extract_and_space(.x, city_pattern)),
         `From Date` = map_chr(Data, ~ str_match(.x, from_date_pattern) %>% pluck(2)),
         `To Date` = map_chr(Data, ~ str_match(.x, to_date_pattern) %>% pluck(2))) %>%
  mutate(`From Date` = ymd(`From Date`) %>% as.POSIXct(),
         `To Date` = ymd(`To Date`) %>% as.POSIXct()) %>%
  select(-Data)

identical(result, test)
# [1] TRUE         

### Solution 2 ------

pattern =  'Name:(\\w+)Org:(\\w+)City:(\\w+)FromDate:(\\d+)ToDate:(\\d+)'

result2 <- input %>%
  extract(Data, into = c("Name", "Org", "City", "From Date", "To Date"), regex = pattern, remove = FALSE) %>%
  mutate(across(c(`From Date`, `To Date`), ~ ymd(.x) %>% as.POSIXct())) %>%
  mutate(across(c(Name, City), ~ str_replace_all(.x, "([A-Z])", " \\1") %>% trimws(which = "left"))) %>%
  select(-Data)

identical(result2, test)
# [1] TRUE
  • Logic:

    • Reads the workbook range needed for the challenge

    • Builds helper columns that drive the final output

    • Uses direct pattern parsing where the workbook encodes logic in text

  • Strengths:

    • The R solution stays close to the workbook logic and keeps the transformation compact.
  • Areas for Improvement:

    • The code assumes the workbook layout and selected ranges remain stable.
  • Gem:

    • The best part of the solution is choosing the right intermediate shape before formatting the final output.
import pandas as pd
import re

input = pd.read_excel("PQ_Challenge_190.xlsx", usecols="A", nrows = 2)
test = pd.read_excel("PQ_Challenge_190.xlsx",  usecols= "A:E", nrows = 2, skiprows=5)

name_pattern = "Name:([\w]+)Org:"
org_pattern = "Org:([\w]+)City:"
city_pattern = "City:([\w]+)FromDate:"
from_date_pattern = "FromDate:([\w]+)ToDate:"
to_date_pattern = "ToDate:([\w]+)"

def extract_and_space(a, pattern):
    return re.sub(r"([A-Z])", r" \1", re.search(pattern, a).group(1)).lstrip()

result = input.copy()
result["Name"] = result["Data"].apply(lambda x: extract_and_space(x, name_pattern))
result["Org"] = result["Data"].apply(lambda x: re.search(org_pattern, x).group(1))
result["City"] = result["Data"].apply(lambda x: extract_and_space(x, city_pattern))
result["From Date"] = result["Data"].apply(lambda x: re.search(from_date_pattern, x).group(1))
result["To Date"] = result["Data"].apply(lambda x: re.search(to_date_pattern, x).group(1))
result["From Date"] = pd.to_datetime(result["From Date"])
result["To Date"] = pd.to_datetime(result["To Date"])
result = result.drop(columns=["Data"])

print(result.equals(test)) # True
  • Logic:

    • Reads the workbook range needed for the challenge

    • Uses direct pattern parsing where the workbook encodes logic in text

  • Strengths:

    • The Python version follows the same workbook rule in a direct pandas-oriented implementation.
  • Areas for Improvement:

    • As with the R version, any workbook layout change would require small adjustments.
  • Gem:

    • The implementation stays close to the source challenge instead of adding unnecessary abstraction.

Difficulty Level

This task is moderate:

  • It combines reshaping, grouping, or parsing steps that are common in Power Query style problems.

  • The main challenge is reproducing the workbook output structure exactly.