Excel BI - PowerQuery Challenge 311

excel-challenges
power-query
Column1 Column2 Column3 Column4 Clinic Patient Name
Published

March 24, 2026

Illustration for Excel BI - PowerQuery Challenge 311

Challenge Description

Column1 Column2 Column3 Column4 Clinic Patient Name

Solutions

library(tidyverse)
library(readxl)
library(janitor)

path = "Power Query/300-399/311/PQ_Challenge_311.xlsx"
input = read_excel(path, range = "A1:D13")
test  = read_excel(path, range = "F1:I10") %>%
  arrange(Clinic, `Patient Name`)

result = input %>%
  mutate(Clinic = ifelse(Column1 == "Clinic", Column2, NA)) %>%
  fill(Clinic) %>%
  filter(Column1 != "Clinic") %>%
  mutate(Clinic = ifelse(row_number() == 1, "Clinic", Clinic)) %>%
  row_to_names(row_number = 1) %>%
  pivot_longer(-c(Clinic, Patients), names_to = "type", values_to = "date") %>%
  mutate(date = as.POSIXct(as.Date(as.numeric(date), origin = "1899-12-30"))) %>%
  na.omit() %>%
  group_by(Patients) %>%
  slice_tail(n = 1) %>%
  ungroup() %>%
  mutate(Status = case_when(
    str_detect(type, "Reg") ~ "Registered",
    str_detect(type, "In") ~ "Admitted",
    str_detect(type, "Out") ~ "Discharged"
  )) %>%
  arrange(Clinic) %>%
  select(Clinic, `Patient Name` = Patients, Status, `Last Status Date` = date) 

all.equal(result, test, check.attributes = FALSE, check.names = FALSE)
# different dates in provided solution
  • Logic:

    • Reads the workbook range needed for the challenge

    • Reshapes the data into the structure required by the result table

    • Aggregates or ranks values at the relevant grouping level

    • Builds helper columns that drive the final output

  • Strengths:

    • The R solution stays close to the workbook logic and keeps the transformation compact.
  • Areas for Improvement:

    • The code assumes the workbook layout and selected ranges remain stable.
  • Gem:

    • The best part of the solution is choosing the right intermediate shape before formatting the final output.
import pandas as pd
from datetime import datetime, timedelta
import re

path = "300-399/311/PQ_Challenge_311.xlsx"
input = pd.read_excel(path, usecols="A:D", nrows=13)
test = pd.read_excel(path, usecols="F:I", nrows=10).sort_values(['Clinic', 'Patient Name']).reset_index(drop=True)

input['Clinic'] = input.apply(lambda row: row['Column2'] if row['Column1'] == 'Clinic' else None, axis=1)
input['Clinic'] = input['Clinic'].ffill()
input = input[input['Column1'] != 'Clinic'].reset_index(drop=True)
input.loc[0, 'Clinic'] = 'Clinic'
input.columns = input.iloc[0]
input = input[1:].reset_index(drop=True)

id_vars = ['Clinic', 'Patients']
value_vars = [col for col in input.columns if col not in id_vars]
df_long = input.melt(id_vars=id_vars, value_vars=value_vars, var_name='type', value_name='date')

df_long = df_long.dropna(subset=['date'])
df_last = df_long.sort_values('date').groupby('Patients').tail(1).reset_index(drop=True)

status_map = {
    'Reg': 'Registered',
    'In': 'Admitted',
    'Out': 'Discharged'
}
df_last['Status'] = [
    next((v for k, v in status_map.items() if k in str(t)), None)
    for t in df_last['type']
]

result = df_last.sort_values(['Clinic','Patients']).rename(
    columns={'Patients': 'Patient Name', 'date': 'Last Status Date'}
)[['Clinic', 'Patient Name', 'Status', 'Last Status Date']].reset_index(drop=True)

print(result)
  • Logic:

    • Reads the workbook range needed for the challenge

    • Reshapes the data into the structure required by the result table

    • Aggregates or ranks values at the relevant grouping level

    • Applies the rule iteratively until the output is complete

  • Strengths:

    • The Python version follows the same workbook rule in a direct pandas-oriented implementation.
  • Areas for Improvement:

    • As with the R version, any workbook layout change would require small adjustments.
  • Gem:

    • The implementation stays close to the source challenge instead of adding unnecessary abstraction.

Difficulty Level

This task is moderate:

  • It combines reshaping, grouping, or parsing steps that are common in Power Query style problems.

  • The main challenge is reproducing the workbook output structure exactly.