Excel BI - PowerQuery Challenge 174

Published: March 24, 2026


Challenge Description

Columns involved: Emp, From Date, To Date, Sales, Monthly Sales, Running Total

Solutions

library(tidyverse)
library(readxl)
library(padr)

input = read_excel("Power Query/PQ_Challenge_174.xlsx", range = "A1:D5")
test  = read_excel("Power Query/PQ_Challenge_174.xlsx", range = "F1:J20")

result = input %>%
  pivot_longer(cols = -c(1, 4), names_to = "date", values_to = "value") %>%
  select(-date) %>%
  group_by(Emp) %>%
  pad() %>%
  fill(Sales, .direction = "down") %>%
  mutate(days = n(),
         daily_sales = Sales / days,
         month = floor_date(value, "month"),
         year = year(value)) %>%
  ungroup() %>%
  summarise(`Monthly Sales` = sum(daily_sales), 
            `From Date` = min(value),
            `To Date` = max(value), 
            .by = c("Emp", "month", "year")) %>%
  mutate(`Running Total` = cumsum(`Monthly Sales`), .by = c("Emp", "year")) %>%
  select(Emp, `From Date`, `To Date`, `Monthly Sales`, `Running Total`) %>%
  mutate(across(c(4:5), ~round(., digits = 2)))

# Not all results match the expected table because of floating-point precision,
# but the output structure is the same.
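  • Logic:

    • Reads the input table (A1:D5) and the expected result (F1:J20) from the workbook.

    • Stacks the two date columns into one with pivot_longer(), then uses padr::pad() per employee to add a row for every day in the range and fill() to carry Sales down.

    • Divides each employee's Sales evenly across the days in the range, then sums the daily amounts per employee and month. The first and last day in each group become From Date and To Date.

    • Builds the helper columns days, daily_sales, month, and year, plus a running total that restarts for each employee and year.

  • Strengths:

    • The whole transformation is a single pipe, and expanding to daily rows makes the split across month boundaries straightforward.
  • Areas for Improvement:

    • The ranges A1:D5 and F1:J20 are hard-coded, so the code breaks if the workbook layout changes.

    • The day count is taken per employee, which assumes each employee has a single date range.

    • The script reads test but never compares it with result, and its closing comment notes that some values differ from the expected output.
  • Gem:

    • Expanding each range to one row per day is the key step: once the data is daily, the monthly split and the running total are ordinary grouped aggregations.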
import pandas as pd
from pandas.tseries.offsets import MonthEnd

input = pd.read_excel("PQ_Challenge_174.xlsx", sheet_name="Sheet1",  usecols="A:D", nrows=4)
test = pd.read_excel("PQ_Challenge_174.xlsx", sheet_name="Sheet1",  usecols="F:J", nrows=20)
test.columns = ["Emp", "From Date", "To Date", "Monthly Sales", "Running Total"]

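# Mimics R's padr::pad(): inserts a row for every missing date in date_col
def pad(df, date_col, freq='D'):
    df = df.copy()  # avoid modifying the caller's frame
    df[date_col] = pd.to_datetime(df[date_col])
    df = df.set_index(date_col)
    df = df.asfreq(freq)  # inserted rows hold NaN until they are forward-filled
    df = df.reset_index()
    return df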

result = input.melt(id_vars=["Emp", "Sales"], var_name="date", value_name="value").sort_values(["Emp", "value"]).reset_index(drop=True)    
result = result.groupby("Emp").apply(lambda x: pad(x, "value"))
# fillna(method='ffill') is deprecated in recent pandas; ffill() is the equivalent call
result = result.ffill().reset_index(drop=True)
result["days"] = result.groupby("Emp")["value"].transform("count")
result["daily_sales"] = result["Sales"] / result["days"]
result["month"] = result["value"].dt.to_period("M").dt.to_timestamp() 
result["year"] = result["value"].dt.year
result = result.groupby(["Emp", "month", "year"]).agg({"daily_sales": "sum", "value": ["min", "max"]})
result.columns = ["Monthly Sales", "From Date", "To Date"]
result["Running Total"] = result.groupby(["Emp", "year"])["Monthly Sales"].cumsum()
result = result.reset_index()
result = result[["Emp", "From Date", "To Date", "Monthly Sales", "Running Total"]]
result[["Monthly Sales", "Running Total"]] = result[["Monthly Sales", "Running Total"]].round(2)

print(result)
print(test)

# results comparison fails due to floating point precision
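  • Logic:

    • Reads the input table (columns A:D, 4 rows) and the expected result (columns F:J, 20 rows) from Sheet1.

    • Melts the two date columns into one, then pads each employee's range to one row per day with a small pad() helper built on asfreq() and forward-fills the gaps.

    • Divides Sales evenly across the days, sums the daily amounts per employee and month, and adds a running total per employee and year.

  • Strengths:

    • The Python version mirrors the R pipeline step by step, which makes the two solutions easy to compare.
  • Areas for Improvement:

    • As with the R version, the sheet name, column letters, and row counts are hard-coded, so any workbook layout change requires adjustments.

    • The script prints result and test side by side instead of comparing them, and the name input shadows the Python built-in.
  • Gem:

    • The pad() helper reproduces padr::pad() in a few lines, so no extra dependency is needed.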
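The closing comment in the Python solution says that comparing the result with the expected table fails because of floating-point precision. One common way around this is to compare within a tolerance instead of testing for exact equality. The sketch below is only an illustration: the two small frames are made-up stand-ins for result and test, and the tolerance of 0.01 is an assumption that would need to be adjusted for the real workbook.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the computed and expected tables (made-up values).
result = pd.DataFrame({
    "Emp": ["A", "A"],
    "Monthly Sales": [0.1 + 0.2, 100.005],
    "Running Total": [0.1 + 0.2, 100.305],
})
test = pd.DataFrame({
    "Emp": ["A", "A"],
    "Monthly Sales": [0.3, 100.01],
    "Running Total": [0.3, 100.31],
})

numeric_cols = ["Monthly Sales", "Running Total"]

# Exact comparison is brittle: 0.1 + 0.2 is not exactly 0.3 in binary floating point.
exact_match = result[numeric_cols].equals(test[numeric_cols])

# Tolerance-based comparison: values may differ by at most 0.01 (assumed tolerance).
close_match = np.allclose(
    result[numeric_cols].to_numpy(),
    test[numeric_cols].to_numpy(),
    rtol=0,
    atol=0.01,
)

# pandas offers the same idea for whole frames; this raises AssertionError on a mismatch.
pd.testing.assert_frame_equal(result, test, check_exact=False, rtol=0, atol=0.01)

print("exact:", exact_match, "within tolerance:", close_match)
```

A tolerance check only confirms that the numbers are close. If values still differ by more than rounding error, the cause is more likely in the allocation logic than in floating-point arithmetic.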

Difficulty Level

This task is moderate:

  • It combines reshaping, date padding, and grouped aggregation, all common steps in Power Query style problems.

  • The main challenge is splitting each sales figure across month boundaries so that the monthly values and running totals match the workbook output exactly.
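The split across month boundaries is easier to follow on a tiny example. The sketch below uses a single made-up employee whose range spans two months. The employee, dates, and sales figure are hypothetical and are not taken from the challenge workbook. It follows the same idea as the solutions above: expand to daily rows, prorate evenly, then aggregate by month.

```python
import pandas as pd

# Hypothetical input: one employee, a 4-day range crossing a month boundary.
emp, start, end, sales = "A", "2024-01-30", "2024-02-02", 400.0

# One row per day in the range (Jan 30, Jan 31, Feb 1, Feb 2).
days = pd.date_range(start, end, freq="D")
daily = pd.DataFrame({"Emp": emp, "value": days, "daily_sales": sales / len(days)})

# Helper columns used for grouping.
daily["year"] = daily["value"].dt.year
daily["month"] = daily["value"].dt.to_period("M")

# Sum the daily amounts per employee and month; first/last day become the new range.
monthly = daily.groupby(["Emp", "year", "month"], as_index=False).agg(
    **{
        "From Date": ("value", "min"),
        "To Date": ("value", "max"),
        "Monthly Sales": ("daily_sales", "sum"),
    }
)

# Running total restarts for each employee and year.
monthly["Running Total"] = monthly.groupby(["Emp", "year"])["Monthly Sales"].cumsum()

print(monthly[["Emp", "From Date", "To Date", "Monthly Sales", "Running Total"]])
```

With these made-up numbers, each day carries 100, so January and February each receive 200 and the running total ends at 400.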