library(tidyverse)
library(readxl)
path = "files/CH-107 Matching Tables.xlsx"
T1 = read_excel(path, range = "B2:C9")
T2 = read_excel(path, range = "E2:F9")
test = read_excel(path, range = "H2:I12")
T_full = tibble(`Question ID` = str_c("Q-", 1:10)) %>%
  full_join(T1, by = "Question ID") %>%
  full_join(T2, by = "Question ID") %>%
  arrange(desc(parse_number(`Question ID`))) %>%
  mutate(Response = case_when(
    is.na(Response.x) & !is.na(Response.y) ~ Response.y,
    !is.na(Response.x) & is.na(Response.y) ~ Response.x,
    !is.na(Response.x) & !is.na(Response.y) ~ Response.y,
    TRUE ~ Response.x
  )) %>%
  select(-Response.x, -Response.y)
identical(T_full, test)
#> [1] TRUE
Omid - Challenge 107
🔰 Challenge 107: Matching Tables

Challenge Description
Solutions
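Logic:
- Reads the three workbook ranges the challenge needs: the two source tables (B2:C9 and E2:F9) and the expected result (H2:I12).
- Builds a scaffold of IDs Q-1 to Q-10 and full-joins both tables onto it, which produces the intermediate columns Response.x and Response.y.
- Resolves each row with case_when: the second table's response wins when it is present, otherwise the first table's response is used.
- Parses the number out of each Question ID with parse_number so that rows are ordered by numeric value rather than as text.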
Strengths:
- The R solution stays close to the workbook rule and keeps the transformation compact.
Areas for Improvement:
- The code assumes the sheet structure and source ranges remain stable.
Gem:
- The strongest part of the solution is choosing the right intermediate representation: both responses sit side by side (Response.x, Response.y) before they are collapsed into the final Response column.
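The intermediate representation can be illustrated on a small in-memory example. This is a sketch in pandas, not the author's code: the sample rows are invented for illustration, the suffixes `_t1` and `_t2` are arbitrary names, and `combine_first` is swapped in for the four `case_when` branches, which all reduce to "take the second table's response when present, otherwise the first table's".

```python
import pandas as pd

# Invented sample data; the real workbook values are not shown in this write-up.
t1 = pd.DataFrame({"Question ID": ["Q-1", "Q-2", "Q-3"], "Response": ["A", "B", "C"]})
t2 = pd.DataFrame({"Question ID": ["Q-2", "Q-4"], "Response": ["X", "D"]})

# Scaffold of every expected ID, so unanswered questions survive the joins.
scaffold = pd.DataFrame({"Question ID": [f"Q-{i}" for i in range(1, 6)]})

# Left joins keep the scaffold order; the second join creates the suffixed columns.
wide = (
    scaffold
    .merge(t1, on="Question ID", how="left")
    .merge(t2, on="Question ID", how="left", suffixes=("_t1", "_t2"))
)

# Second table wins; fall back to the first table where the second is missing.
wide["Response"] = wide["Response_t2"].combine_first(wide["Response_t1"])
result = wide[["Question ID", "Response"]]
print(result)
```

In this toy data Q-2 appears in both tables and ends up with the second table's value, while Q-5 appears in neither and stays missing.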
import pandas as pd
path = "CH-107 Matching Tables.xlsx"
T1 = pd.read_excel(path, usecols="B:C", skiprows=1, nrows=7, names=["Question ID", "Response"])
T2 = pd.read_excel(path, usecols="E:F", skiprows=1, nrows=7, names=["Question ID", "Response"])
test = pd.read_excel(path, usecols="H:I", skiprows=1, names=["Question ID", "Response"])
T_full = pd.merge(test["Question ID"], T1, on="Question ID", how="outer")
T_full = pd.merge(T_full, T2, on="Question ID", how="outer")
# Raw string avoids Python's invalid-escape warning for "\d"
T_full["Number"] = T_full["Question ID"].str.extract(r"(\d+)").astype(int)
T_full = T_full.sort_values(by="Number", ascending=True).reset_index(drop=True)
T_full["Response"] = T_full["Response_y"].fillna(T_full["Response_x"])
T_full = T_full.drop(columns=["Response_x", "Response_y", "Number"])
print(T_full.equals(test))  # True
Logic:
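- Reads the workbook ranges needed for the challenge, naming the columns explicitly on read.
- Outer-merges both tables onto the expected Question IDs, then takes Response_y and falls back to Response_x where it is missing.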
Strengths:
- The Python version follows the same rule in a direct dataframe-oriented implementation.
Areas for Improvement:
- The code assumes the workbook layout remains stable, so any sheet redesign would require small adjustments.
Gem:
- The implementation stays close to the original workbook rule instead of adding unnecessary abstraction.
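One way to make the layout assumption explicit is to validate each table right after it is read. The sketch below is an illustration only: `check_table`, `EXPECTED_COLUMNS`, and `ID_PATTERN` are hypothetical names, not part of pandas or of the original solution. Because the solution passes `names=` to `read_excel`, the column names always match, so the ID pattern check is the one that would catch a shifted range.

```python
import re

import pandas as pd

EXPECTED_COLUMNS = ["Question ID", "Response"]
ID_PATTERN = re.compile(r"Q-\d+")  # assumed ID format, based on the "Q-1" ... "Q-10" IDs


def check_table(df: pd.DataFrame, name: str) -> pd.DataFrame:
    """Hypothetical helper: fail early if a range no longer matches the expected layout."""
    if list(df.columns) != EXPECTED_COLUMNS:
        raise ValueError(f"{name}: expected columns {EXPECTED_COLUMNS}, got {list(df.columns)}")
    # Missing or non-text IDs count as invalid, which is what a shifted range tends to produce.
    is_valid = df["Question ID"].map(
        lambda value: isinstance(value, str) and ID_PATTERN.fullmatch(value) is not None
    ).astype(bool)
    if not is_valid.all():
        bad = df.loc[~is_valid, "Question ID"].tolist()
        raise ValueError(f"{name}: unexpected Question ID values {bad}")
    return df


# Invented sample frame standing in for one of the workbook ranges.
ok = pd.DataFrame({"Question ID": ["Q-1", "Q-2"], "Response": ["A", "B"]})
check_table(ok, "T1")  # passes silently and returns the frame unchanged
```

Wrapping each `read_excel` call in such a check turns a silent mismatch into an immediate error that names the affected table.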
Difficulty Level
This task is moderate:
- The core rule is clear, but the pattern that implements it (join both tables side by side, then pick one response per row) is not obvious from the raw input.
- The challenge combines joining, coalescing, and parsing steps.
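The parsing step matters because Question IDs do not sort correctly as plain text. The following minimal sketch uses only the standard library; `question_number` is a hypothetical helper name, and the IDs are generated to match the "Q-1" to "Q-10" format used in the solutions.

```python
import re


def question_number(question_id: str) -> int:
    """Hypothetical helper: extract the numeric part of an ID such as 'Q-10'."""
    match = re.search(r"\d+", question_id)
    if match is None:
        raise ValueError(f"no number found in {question_id!r}")
    return int(match.group())


ids = [f"Q-{i}" for i in range(1, 11)]

text_order = sorted(ids)                          # compares character by character
numeric_order = sorted(ids, key=question_number)  # compares the parsed numbers

print(text_order[:3])     # ['Q-1', 'Q-10', 'Q-2']
print(numeric_order[:3])  # ['Q-1', 'Q-2', 'Q-3']
```

Sorting as text places "Q-10" directly after "Q-1", which is why both solutions sort on the extracted number instead of the raw ID.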