Excel BI - PowerQuery Challenge 349

excel-challenges
power-query
Extract First, Middle, Last Names & Duration of US Presidents
Published

March 24, 2026

Illustration for Excel BI - PowerQuery Challenge 349

Challenge Description

Extract First, Middle, Last Names & Duration of US Presidents

Solutions

library(tidyverse)
library(readxl)

path <- "Power Query/300-399/349/PQ_Challenge_349.xlsx"
input <- read_excel(path, range = "A1:A70")
test <- read_excel(path, range = "C1:I70") %>%
  replace_na(list(`Middle Name` = ""))

result = input %>%
  mutate(`Precidency #` = row_number()) %>%
  mutate(
    `Full Name` = str_extract(`US Presidents`, "^[^(]+"),
    Duration = str_extract(`US Presidents`, "\\(([^)]+)\\)") %>%
      str_remove_all("\\(|\\)")
  ) %>%
  mutate(
    Name_parts = str_split(str_trim(`Full Name`), " ", simplify = FALSE)
  ) %>%
  mutate(
    `First Name` = map_chr(Name_parts, ~ .x[1]),
    `Last Name` = map_chr(Name_parts, ~ .x[length(.x)]),
    `Middle Name` = map_chr(Name_parts, function(x) {
      if (length(x) <= 2) "" else paste(x[2:(length(x) - 1)], collapse = " ")
    })
  ) %>%
  select(-Name_parts) %>%
  mutate(num_presidencies = row_number(), .by = `Full Name`) %>%
  mutate(
    `Dynasty Flag` = ifelse(n_distinct(`Full Name`) > 1, "Yes", "No"),
    .by = `Last Name`
  ) %>%
  mutate(
    `Term Check` = case_when(
      num_presidencies == 1 ~ "First Term",
      num_presidencies > 1 &
        lag(`Full Name`) == `Full Name` ~ "Re-elected (Consecutive)",
      TRUE ~ "Re-elected (Non-Consecutive)"
    )
  ) %>%
  select(
    `Precidency #`,
    `First Name`,
    `Middle Name`,
    `Last Name`,
    `Duration`,
    `Term Check`,
    `Dynasty Flag`
  )

all.equal(result, test, check.attributes = FALSE)
  • Logic:

    • Reads the workbook range needed for the challenge

    • Builds helper columns that drive the final output

    • Uses direct pattern parsing where the workbook encodes logic in text

  • Strengths:

    • The R solution stays close to the workbook logic and keeps the transformation compact.
  • Areas for Improvement:

    • The code assumes the workbook layout and selected ranges remain stable.
  • Gem:

    • The best part of the solution is choosing the right intermediate shape before formatting the final output.
import pandas as pd
import numpy as np
import re

input_path = "Power Query/300-399/349/PQ_Challenge_349.xlsx"

input_df = pd.read_excel(input_path, usecols="A", nrows=70)
test_df = pd.read_excel(input_path, usecols="C:I", nrows=70)
test_df["Middle Name"] = test_df["Middle Name"].fillna("")

def extract_full_name(s):
    match = re.match(r"^[^(]+", s)
    return match.group(0).strip() if match else ""

def extract_duration(s):
    match = re.search(r"\(([^)]+)\)", s)
    return match.group(1) if match else ""

def split_name(full_name):
    parts = full_name.strip().split()
    first = parts[0] if parts else ""
    last = parts[-1] if len(parts) > 1 else ""
    middle = " ".join(parts[1:-1]) if len(parts) > 2 else ""
    return first, middle, last

result = input_df.copy()
result["Presidency #"] = np.arange(1, len(result) + 1)
result["Full Name"] = result["US Presidents"].apply(extract_full_name)
result["Duration"] = result["US Presidents"].apply(extract_duration)

result[["First Name", "Middle Name", "Last Name"]] = result["Full Name"].apply(
    lambda x: pd.Series(split_name(x))
)

result["num_presidencies"] = result.groupby("Full Name").cumcount() + 1

last_name_counts = result.groupby("Last Name")["Full Name"].nunique()
result["Dynasty Flag"] = result["Last Name"].map(lambda x: "Yes" if last_name_counts[x] > 1 else "No")

def term_check(row, prev_row):
    if row["num_presidencies"] == 1:
        return "First Term"
    elif prev_row is not None and prev_row["Full Name"] == row["Full Name"]:
        return "Re-elected (Consecutive)"
    else:
        return "Re-elected (Non-Consecutive)"

result["Term Check"] = [
    term_check(row, result.iloc[i - 1] if i > 0 else None)
    for i, row in result.iterrows()
]

result = result[
    [
        "Presidency #",
        "First Name",
        "Middle Name",
        "Last Name",
        "Duration",
        "Term Check",
        "Dynasty Flag",
    ]
] 
print(result.equals(test_df))
  • Logic:

    • Reads the workbook range needed for the challenge

    • Aggregates or ranks values at the relevant grouping level

    • Uses direct pattern parsing where the workbook encodes logic in text

    • Applies the rule iteratively until the output is complete

  • Strengths:

    • The Python version follows the same workbook rule in a direct pandas-oriented implementation.
  • Areas for Improvement:

    • As with the R version, any workbook layout change would require small adjustments.
  • Gem:

    • The implementation stays close to the source challenge instead of adding unnecessary abstraction.

Difficulty Level

This task is moderate:

  • It combines reshaping, grouping, or parsing steps that are common in Power Query style problems.

  • The main challenge is reproducing the workbook output structure exactly.