Omid - Challenge 378

🔰 Table Transformation!
Published: March 24, 2026


Challenge Description

🔰 Table Transformation!

Solutions

library(tidyverse)
library(readxl)

path <- "300-399/378/CH-378 Table Transformation.xlsx"
input <- read_excel(path, range = "B3:B12")
test <- read_excel(path, range = "D3:F9")

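result <- input %>%
  # One row per comma-separated token
  separate_longer_delim(col = 1, delim = ", ") %>%
  mutate(
    # Tokens longer than 3 characters are dates, letters-only tokens are products, the rest are sales
    type = case_when(
      str_length(Col1) > 3 ~ "Date",
      str_detect(Col1, "^[A-Za-z]+$") ~ "Product",
      TRUE ~ "Sale"
    )
  ) %>%
  # Number tokens within each type so the n-th Date, Product, and Sale share a row
  mutate(rn = row_number(), .by = type) %>%
  pivot_wider(names_from = type, values_from = Col1) %>%
  select(-rn) %>%
  mutate(Sale = as.numeric(Sale))

all.equal(result, test)
# The transformation is correct, but the check cannot confirm it because R reads the dates differently.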
  • Logic:

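    • Reads the input column (B3:B12) and the expected output (D3:F9) from the workbook

    • Splits each comma-separated cell into one row per token with separate_longer_delim()

    • Labels every token as Date, Product, or Sale, then numbers the tokens within each label (rn) so that pivot_wider() can place the n-th token of each kind on the same row

    • Classifies tokens by string length and a letters-only regular expression instead of relying on manual cleanup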

  • Strengths:

    • The R solution expresses the whole transformation as one compact pipeline: split, classify, number, pivot.
  • Areas for Improvement:

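    • The ranges B3:B12 and D3:F9 are hard-coded, so the code assumes the sheet structure remains stable. The length rule also labels any token longer than three characters as a date, including a longer product name or sale value.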
  • Gem:

    • The strongest part of the solution is choosing the right intermediate representation, a long table of tokens tagged with a type and a per-type row number, before pivoting to the final output.
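
The classification step above treats any token longer than three characters as a date. A stricter alternative, sketched here in Python, tests each token against explicit patterns instead. The formats in DATE_FORMATS and the "Unknown" fallback are assumptions made for illustration; the post does not show which date format the workbook uses.

```python
import re
from datetime import datetime

# Assumed date formats (hypothetical); adjust to whatever the workbook contains.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def classify(token: str) -> str:
    """Label a token as Sale, Product, Date, or Unknown."""
    token = token.strip()
    # Plain integers or decimals are sales, whatever their length.
    if re.fullmatch(r"\d+(\.\d+)?", token):
        return "Sale"
    # Letters-only tokens are products, whatever their length.
    if re.fullmatch(r"[A-Za-z]+", token):
        return "Product"
    # Anything else must parse as a date in one of the assumed formats.
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(token, fmt)
            return "Date"
        except ValueError:
            continue
    return "Unknown"
```

With this ordering a long product name or a four-digit sale is no longer mistaken for a date. The trade-off is that a date written as digits only would be labeled Sale, so the patterns need to match the real data.
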
import pandas as pd
import re

path = "300-399/378/CH-378 Table Transformation.xlsx"
input_df = pd.read_excel(path, usecols="B", skiprows=2, nrows=9, dtype=str)
test = pd.read_excel(path, usecols="D:F", skiprows=2, nrows=6)

# One row per comma-separated token
df = (
    input_df["Col1"]
    .str.split(", ")
    .explode()
    .reset_index(drop=True)
    .to_frame()
)


def classify(val):
    # Long tokens are dates, letters-only tokens are products, the rest are sales
    if len(val) > 3:
        return "Date"
    elif re.match(r"^[A-Za-z]+$", val):
        return "Product"
    else:
        return "Sale"


df["type"] = df["Col1"].apply(classify)
# Position of each token within its type; used as the row key for the pivot
df["rn"] = df.groupby("type").cumcount()
result = (
    df.pivot(index="rn", columns="type", values="Col1")
    .reset_index(drop=True)
)
result.columns.name = None
result["Sale"] = pd.to_numeric(result["Sale"])

print(result.equals(test))
# The dates are formatted differently, but the transformation is correct.
  • Logic:

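    • Reads column B (input, as text) and columns D:F (expected output) from the workbook

    • Splits each comma-separated cell and explodes it into one row per token

    • Labels every token as Date, Product, or Sale, then numbers the tokens within each label with groupby("type").cumcount() so that pivot() can align them row by row

    • Classifies tokens by string length and a letters-only regular expression instead of relying on manual cleanup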

  • Strengths:

    • The Python version mirrors the R pipeline step for step using str.split(), explode(), groupby().cumcount(), and pivot().
  • Areas for Improvement:

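    • The column letters and the skiprows/nrows offsets are hard-coded, so any change to the sheet layout requires adjusting them. The Date column also stays as text, which is why the final equality check is not conclusive.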
  • Gem:

    • The per-type counter from cumcount() is all the pivot needs as a row key, so the implementation stays close to the rule without adding unnecessary abstraction.
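
Both solutions note that the final equality check is inconclusive because the dates are read differently. One possible way to make the check meaningful is to convert the Date column on both sides to the same datetime type before comparing. The helper name same_table is hypothetical, and the sketch assumes the result holds dates as text that pandas can parse while the expected table holds real datetimes.

```python
import pandas as pd


def same_table(result: pd.DataFrame, expected: pd.DataFrame,
               date_col: str = "Date") -> bool:
    """Compare two tables after converting their date column to datetimes."""
    left = result.copy()
    right = expected.copy()
    for frame in (left, right):
        # Parse text dates, leave real datetimes untouched, drop any time part.
        parsed = pd.to_datetime(frame[date_col]).astype("datetime64[ns]")
        frame[date_col] = parsed.dt.normalize()
    try:
        # check_dtype=False tolerates int vs float in Sale;
        # check_like=True ignores column order.
        pd.testing.assert_frame_equal(
            left, right, check_dtype=False, check_like=True
        )
    except AssertionError:
        return False
    return True
```

Calling same_table(result, test) in place of result.equals(test) would then report whether the values agree, independent of how the dates were read.
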

Difficulty Level

This task is moderate:

  • The core logic is clear, but the correct transformation pattern is not obvious from the raw input.

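  • The challenge combines parsing (classifying each token), grouping (numbering tokens within each type), and reshaping (splitting to long form, then pivoting to wide form).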
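
The pattern that is hard to spot from the raw input becomes easier to follow on a tiny sample. The two rows below are invented for illustration and are not taken from the workbook; the classifier mirrors the length and letters-only rule used in the solutions.

```python
import re

import pandas as pd

# Hypothetical input: each cell mixes a date, a product, and a sale.
raw = pd.Series(["2024-01-05, A, 10", "B, 20, 2024-01-06"], name="Col1")

# Step 1: one row per token.
tokens = raw.str.split(", ").explode().reset_index(drop=True).to_frame()


def classify(token: str) -> str:
    if len(token) > 3:
        return "Date"
    if re.fullmatch(r"[A-Za-z]+", token):
        return "Product"
    return "Sale"


# Step 2: tag each token, then number it within its type.
tokens["type"] = tokens["Col1"].map(classify)
tokens["rn"] = tokens.groupby("type").cumcount()

# Step 3: pivot so the n-th Date, Product, and Sale land on the same row.
wide = tokens.pivot(index="rn", columns="type", values="Col1").reset_index(drop=True)
wide.columns.name = None
wide["Sale"] = pd.to_numeric(wide["Sale"])

# wide now has two rows: (2024-01-05, A, 10) and (2024-01-06, B, 20)
print(wide)
```

The per-type counter rn is what makes the pivot possible: without it there is no key saying which date belongs with which product and sale.
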