Excel BI - Excel Challenge 902

excel-challenges

excel-formulas

🔰 Calculate the Duration for Different categories.

Published

March 24, 2026

Illustration for Excel BI - Excel Challenge 902

Challenge Description

🔰 Calculate the Duration for Different categories. If numeric portion of EmpID is >150, then duration will be 20% more as a seniority surcharge. Also duration will be multiplied by 1.5 for Beta category and 2 for Gamma category.

Solutions

library(tidyverse)
library(readxl)

path <- "Excel/900-999/902/902 Total Hours.xlsx"
input <- read_excel(path, range = "A2:C36")
test <- read_excel(path, range = "E2:F6")

result = input %>%
  separate(Timing, into = c("Start", "End"), sep = "-") %>%
  mutate(
    Start = as.POSIXct(Start, format = "%H:%M"),
    End = as.POSIXct(End, format = "%H:%M"),
    Duration = as.numeric(difftime(
      if_else(End < Start, End + lubridate::days(1), End),
      Start,
      units = "hours"
    )),
    Seniority = ifelse(
      as.numeric(str_extract(EmpID, "\\d+")) > 150,
      TRUE,
      FALSE
    )
  ) %>%
  mutate(
    `Total Hours` = Duration *
      ifelse(Seniority, 1.2, 1) *
      case_when(
        Category == "ALPHA" ~ 1,
        Category == "BETA" ~ 1.5,
        Category == "GAMMA" ~ 2
      )
  ) %>%
  summarise(`Total Hours` = sum(`Total Hours`), .by = Category) %>%
  janitor::adorn_totals("row")

all.equal(result, test, check.attributes = FALSE)
# [1] TRUE

Logic: Read the workbook ranges needed for the challenge; Derive the required intermediate columns; Parse the packed text or string structure; Aggregate or rank the data at the required grouping level.
Strengths: The solution stays close to the text pattern itself, which makes the extraction logic easy to audit.
Areas for Improvement: The solution assumes the workbook layout and selected ranges remain stable, so any structural change in the sheet would require small adjustments.
Gem: A small number of well-targeted text patterns does most of the heavy lifting.

import pandas as pd
import numpy as np
from datetime import timedelta

path = "Excel/900-999/902/902 Total Hours.xlsx"
input_data = pd.read_excel(path, usecols="A:C", skiprows=1, nrows=35)
test = pd.read_excel(path, usecols="E:F", skiprows=1, nrows=4).rename(columns=lambda x: x.replace(".1", ""))
test['Total Hours'] = test['Total Hours'].round(2)

input_data[['Start', 'End']] = input_data['Timing'].str.split('-', expand=True)
input_data['Start'] = pd.to_datetime(input_data['Start'], format="%H:%M")
input_data['End'] = pd.to_datetime(input_data['End'], format="%H:%M")

input_data['Duration'] = input_data.apply(
    lambda row: (row['End'] + timedelta(days=1) if row['End'] < row['Start'] else row['End']) - row['Start'], axis=1
).dt.total_seconds() / 3600

input_data['Total Hours'] = input_data['Duration'] * \
    np.where(input_data['EmpID'].str.extract(r'(\d+)')[0].astype(int) > 150, 1.2, 1) * \
    input_data['Category'].map({"ALPHA": 1, "BETA": 1.5, "GAMMA": 2})

result = input_data.groupby('Category')['Total Hours'].sum().reset_index()
result.loc[len(result)] = ['Total', result['Total Hours'].sum()]
result['Total Hours'] = result['Total Hours'].round(2)

print(test.equals(result))

The Python version follows the same grouped logic and keeps the transformation explicit in a dataframe pipeline.

Difficulty Level

Medium

The individual steps are manageable, but the correct transformation pattern is not obvious from the raw data.