Excel BI - PowerQuery Challenge 340

excel-challenges
power-query
Find the total revenue for each group and list the group or groups whose total is nearest to it in absolute difference.
Published

March 24, 2026

Challenge Description

Find the total revenue for each group, then list the other group or groups whose total is nearest to it, measured by absolute difference. The expected output has four columns: Group, Revenue, Next Nearest Group, and Next Nearest Group Revenue.
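To make the rule concrete, here is a minimal sketch in plain Python. The totals and the helper name `nearest_groups` are invented for illustration and are not the challenge data; the sketch only shows how "nearest in absolute difference" behaves, including a tie.

```python
# Hypothetical group totals, invented for illustration (not the challenge data).
totals = {"A": 100, "B": 130, "C": 70, "D": 400}


def nearest_groups(totals, group):
    """Return (names, gap) for the other groups whose total is closest to `group`'s."""
    # Absolute difference between this group's total and every other group's total.
    gaps = {g: abs(t - totals[group]) for g, t in totals.items() if g != group}
    smallest = min(gaps.values())
    # Keep every group at the smallest gap, so ties are reported together.
    names = sorted(g for g, gap in gaps.items() if gap == smallest)
    return names, smallest


for g in totals:
    print(g, nearest_groups(totals, g))
```

In this made-up data, A is 30 away from both B and C, so both are listed for A, while D's nearest group is B even though the gap is 270.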

Solutions

library(tidyverse)
library(readxl)

path <- "Power Query/300-399/340/PQ_Challenge_340.xlsx"
input <- read_excel(path, range = "A1:B16")
test  <- read_excel(path, range = "D1:G6")

# Total revenue per group
result <- input %>%
  summarise(Revenue = sum(Revenue), .by = Group)

# Pair every group with every other group, then keep each group's nearest pair(s)
grid <- crossing(x = result$Group, y = result$Group) %>%
  filter(x != y) %>%
  left_join(result, by = c("x" = "Group")) %>%
  left_join(result, by = c("y" = "Group"), suffix = c(".x", ".y")) %>%
  mutate(diff = abs(Revenue.x - Revenue.y)) %>%
  filter(diff == min(diff), .by = x) %>%   # ties are kept
  summarise(Revenue = first(Revenue.x),
            `Next Nearest Group` = paste0(y, collapse = ", "),
            `Next Nearest Group Revenue` = first(Revenue.y),
            .by = x) %>%
  rename(Group = x)

# Compare with the expected result
all.equal(grid, test)
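  • Logic:

    • Reads the input table (A1:B16) and the expected result (D1:G6) from the workbook.

    • Sums Revenue per Group with summarise() and .by.

    • Pairs every group with every other group using crossing(), computes the absolute difference between the two totals, and keeps the pairs at each group's minimum difference.

    • Collapses tied groups into one comma-separated string and renames x to Group.

  • Strengths:

    • The whole transformation is one short pipeline, and filtering on diff == min(diff) keeps ties without any special handling.
  • Areas for Improvement:

    • The ranges A1:B16 and D1:G6 are hard-coded, so the script breaks if the workbook layout changes.

    • When several groups tie, all their names are listed but only the first tied group's revenue is reported.
  • Gem:

    • Building the pairwise grid of groups first turns "find the nearest group" into a plain filter followed by a summarise.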
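The intermediate shape discussed above, one row per ordered pair of groups, can also be sketched in pandas with a cross merge. This is an illustrative sketch under assumptions: the totals are made up, and the names `totals`, `pairs`, and `nearest` are hypothetical rather than taken from either solution.

```python
import pandas as pd

# Hypothetical per-group totals, invented for illustration.
totals = pd.DataFrame({"Group": ["A", "B", "C"], "Revenue": [100, 130, 70]})

# Cross join the totals with themselves: one row per ordered pair of groups.
pairs = totals.merge(totals, how="cross", suffixes=("_x", "_y"))

# Drop self-pairs and measure how far apart the two totals are.
pairs = pairs[pairs["Group_x"] != pairs["Group_y"]].copy()
pairs["diff"] = (pairs["Revenue_x"] - pairs["Revenue_y"]).abs()

# Keep, for each left-hand group, only the rows at its minimum distance.
min_per_group = pairs.groupby("Group_x")["diff"].transform("min")
nearest = pairs[pairs["diff"] == min_per_group]
print(nearest)
```

Once the data is in this long pairwise form, the remaining work is a filter and an aggregation, which is why both solutions stay short.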
import itertools

import pandas as pd

path = "Power Query/300-399/340/PQ_Challenge_340.xlsx"
# Input table (header plus 15 rows, matching A1:B16) and expected result (D1:G6)
data = pd.read_excel(path, usecols="A:B", nrows=15)
test = pd.read_excel(path, usecols="D:G", nrows=5).rename(columns=lambda col: col.replace('.1', ''))

# Total revenue per group
result = data.groupby("Group", as_index=False)["Revenue"].sum()

# Every ordered pair of distinct groups
groups = result["Group"].tolist()
pairs = [(x, y) for x, y in itertools.product(groups, repeat=2) if x != y]
grid = pd.DataFrame(pairs, columns=["x", "y"])

# Attach each side's total revenue
grid = grid.merge(result.rename(columns={"Group": "x", "Revenue": "Revenue_x"}), on="x")
grid = grid.merge(result.rename(columns={"Group": "y", "Revenue": "Revenue_y"}), on="y")

# Absolute difference between the two totals
grid["diff"] = (grid["Revenue_x"] - grid["Revenue_y"]).abs()

# Keep only the pairs at each group's minimum difference (ties survive)
grid = grid[grid["diff"] == grid.groupby("x")["diff"].transform("min")]

out = (
    grid.groupby("x", as_index=False)
    .agg(
        Revenue=("Revenue_x", "first"),
        **{
            "Next Nearest Group": ("y", lambda s: ", ".join(sorted(s))),
            "Next Nearest Group Revenue": ("Revenue_y", "first"),
        }
    )
    .rename(columns={"x": "Group"})
)

print(out.equals(test))
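  • Logic:

    • Reads the input columns (A:B) and the expected result (D:G) from the workbook, stripping the ".1" suffix that pandas adds to duplicated column names.

    • Sums Revenue per Group with groupby.

    • Builds every ordered pair of distinct groups with itertools.product, merges the totals onto both sides, and keeps the pairs at each group's minimum absolute difference.

    • Aggregates per group, joining the names of tied groups with ", ".

  • Strengths:

    • The Python version mirrors the R pipeline step by step, so the two are easy to compare.
  • Areas for Improvement:

    • As with the R version, the column letters and row counts are hard-coded, so any workbook layout change would require small adjustments.

    • DataFrame.equals also requires matching dtypes, so the final check can return False even when the values agree.
  • Gem:

    • Sorting the tied group names before joining them makes the output deterministic.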
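One way to soften the layout assumption noted above is to validate the loaded frame before aggregating. The following is a hedged sketch, not part of the original solution: `prepare_input` and `REQUIRED` are hypothetical names, and the frame below stands in for the sheet so the example runs without the workbook.

```python
import pandas as pd

# Column names the solutions rely on.
REQUIRED = ["Group", "Revenue"]


def prepare_input(raw: pd.DataFrame) -> pd.DataFrame:
    """Check the expected columns exist, then drop fully empty rows."""
    missing = [c for c in REQUIRED if c not in raw.columns]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    # Keep only the needed columns and discard rows where both are empty.
    return raw[REQUIRED].dropna(how="all").reset_index(drop=True)


# Hypothetical frame standing in for the sheet: an extra column and a blank trailing row.
raw = pd.DataFrame({
    "Group": ["A", "B", "A", None],
    "Revenue": [10, 20, 5, None],
    "Notes": ["", "", "", None],
})

clean = prepare_input(raw)
print(clean.groupby("Group", as_index=False)["Revenue"].sum())
```

With a guard like this, reading a few extra rows or columns from the sheet fails loudly or is ignored, instead of silently changing the totals.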

Difficulty Level

This task is easy to moderate:

  • The rule is simple to state, but the output requires pairing the group totals with each other and handling ties correctly.