Omid - Challenge 58

data-challenges

advanced-exercises

🔰 In this challenge, we aim to find an efficient way to calculate the stepped tax based on the tax rates presented in the question table.

Published

March 24, 2026

Challenge Description

🔰 In this challenge, we aim to find an efficient way to calculate the stepped tax based on the tax rates presented in the question table.

Solutions

library(tidyverse)
library(readxl)

input1 = read_excel("files/CH-058 Stepped Tax.xlsx", range = "B2:D7")
input2 = read_excel("files/CH-058 Stepped Tax.xlsx", range = "F2:G7")
test   = read_excel("files/CH-058 Stepped Tax.xlsx", range = "H2:H7")

input1$To = ifelse(input1$To == "Over", Inf, input1$To) %>% as.numeric()

result = input1 %>% 
  mutate(key = 1) %>%
  full_join(input2 %>% mutate(key = 1), by = "key") %>%
  select(-key) %>%
  filter(From <= To) %>%
  mutate(income_over_threshold = Income - From,
         income_in_threshold = ifelse(Income >= From & Income <= To  , T, F)) %>%
  filter(income_over_threshold >= 0) %>%
  arrange(`Person ID`) %>%
  mutate(tax = ifelse(income_in_threshold, 
                      income_over_threshold * `Tax Rate`,
                      (To - From) * `Tax Rate`)) %>%
  summarise(Tax = sum(tax), .by = c(`Person ID`, Income)) %>%
  select(Tax)

all(round(result$Tax, 1) == round(test$Tax, 1))
# TRUE

Logic:
- Reads the workbook ranges needed for the challenge
- Aggregates or ranks values at the relevant grouping level
- Builds the intermediate columns that drive the final result
Strengths:
- The R solution stays close to the workbook rule and keeps the transformation compact.
Areas for Improvement:
- The code assumes the sheet structure and source ranges remain stable.
Gem:
- The strongest part of the solution is choosing the right intermediate representation before shaping the final output.

import pandas as pd
import numpy as np
input1 = pd.read_excel("CH-058 Stepped Tax.xlsx", usecols="B:D", skiprows=1, nrows = 6)
input2 = pd.read_excel("CH-058 Stepped Tax.xlsx", usecols="F:G", skiprows=1, nrows = 6)
test   = pd.read_excel("CH-058 Stepped Tax.xlsx", usecols="H", skiprows=1, nrows = 6)

input1.loc[4, 'To'] = float('inf')
input1['key'] = 1
input2['key'] = 1
output = pd.merge(input1, input2, on='key')
output['income_over_threshold'] = output["Income"] - output["From"]
output['income_in_threshold'] = np.where((output["Income"] >= output["From"]) & (output["Income"] <= output["To"]), True, False)
output = output[output['income_over_threshold'] > 0].sort_values(by = ["Person ID"]).reset_index(drop = True)
output['tax'] = np.where(output['income_in_threshold'], 
                         output['income_over_threshold'] * output['Tax Rate'], 
                         (output['To'] - output['From']) * output['Tax Rate'])
output = output.groupby('Person ID').agg({'tax': 'sum'}).astype("float64").reset_index()

output['tax'] = output['tax'].round(2)
test['tax'] = test['Tax'].round(2)

print(all(output['tax'] == test['tax'])) # True

Logic:
- Reads the workbook ranges needed for the challenge
- Aggregates or ranks values at the relevant grouping level
Strengths:
- The Python version follows the same rule in a direct dataframe-oriented implementation.
Areas for Improvement:
- The code assumes the workbook layout remains stable, so any sheet redesign would require small adjustments.
Gem:
- The implementation stays close to the original workbook rule instead of adding unnecessary abstraction.

Difficulty Level

This task is moderate:

The business rule is readable, but the workbook still requires careful implementation to reach the expected layout.

Challenge Description

Solutions

Difficulty Level

Related Challenges