R vs Python Comparison¶

Auto-generated by benchmarks/compare.py on 2026-04-23 10:06

This page compares results from the Darwin EU R packages and their OMOPy Python equivalents, both run against the same synthea_1k.duckdb dataset (10,681 patients, OMOP CDM v5.3).

Cohort Overview¶

The benchmarks build their cohorts and codelists from three clinical concepts and one keyword search, all drawn from the full 10,681-patient database:

Coronary Arteriosclerosis (concept 317576) — condition cohort, 1,243 subjects
Clopidogrel (concept 1322184) — drug cohort, 1,473 subjects
Simvastatin (concept 1539403) — drug cohort, used in treatment patterns
"coronary" keyword search — vocabulary-based codelist generation

The diagram below shows which cohort feeds each benchmark, explaining why subject counts differ across sections.

[Figure: Cohort Overview]

Timing & Row-Count Summary¶

#	Benchmark	R Package	OMOPy Module	R Time	Python Time	R Rows	Python Rows
01	CDM Snapshot	CDMConnector	`omopy.connector`	2.42s	10.69s	1	1
02	Cohort Generation	CDMConnector	`omopy.connector`	3.54s	6.35s	1	1
03	Patient Profiles	PatientProfiles	`omopy.profiles`	4.80s	5.69s	100	100
04	Cohort Characteristics	CohortCharacteristics	`omopy.characteristics`	5.42s	4.60s	51	34
05	Incidence	IncidencePrevalence	`omopy.incidence`	9.21s	26.34s	716	672
06	Drug Utilisation	DrugUtilisation	`omopy.drug`	13.67s	10.27s	58	148
07	Survival	CohortSurvival	`omopy.survival`	10.86s	9.24s	90860	90843
08	Codelist Generation	CodelistGenerator	`omopy.codelist`	3.61s	4.72s	1761	1985
09	Treatment Patterns	TreatmentPatterns	`omopy.treatment`	9.00s	4.66s	0	0
10	Drug Diagnostics	DrugExposureDiagnostics	`omopy.drug_diagnostics`	7.17s	6.00s	12	19

Value Concordance¶

The tables below compare specific output values between R and Python for each benchmark. They check whether OMOPy reproduces the R values themselves, not just similar row counts; rows that differ are flagged and explained in the note under the relevant benchmark.

01 — CDM Snapshot¶

R: CDMConnector · Python: omopy.connector

Metric	R	Python	Match
CDM Version	5.3.1	5.3.1	✅
Vocabulary Version	v5.0 22-JUN-22	v5.0 22-JUN-22	✅
Person Count	10,681	10,681	✅
Observation Period Count	10,681	10,681	✅
Earliest Obs Start	1926-08-15	1926-08-15	✅
Latest Obs End	2023-06-20	2023-06-20	✅

02 — Cohort Generation¶

R: CDMConnector · Python: omopy.connector

Metric	R	Python	Match
n_records	1,243	1,243	✅
n_subjects	1,243	1,243	✅

03 — Patient Profiles¶

R: PatientProfiles · Python: omopy.profiles

Metric	R	Python	Match
Row Count	100	100	✅
Mean Age	56.10	56.10	✅
Sex = Female	23	23	✅
Sex = Male	77	77	✅
Subject ID Overlap	100/100	100/100	✅

04 — Cohort Characteristics¶

R: CohortCharacteristics · Python: omopy.characteristics

Metric	R	Python	Match
Number records (count)	1,243	1,243	✅
Number subjects (count)	1,243	1,243	✅
Age (mean)	57.14	57.14	≈
Age (sd)	23.94	23.94	≈
Age (median)	63	63	✅
Age (q25)	46	46	✅
Age (q75)	75	75	✅
Age (min)	0	0	✅
Age (max)	98	98	✅
Prior observation (mean)	1,775.47	1,775.47	≈
Prior observation (sd)	3,296.43	3,296.43	≈
Prior observation (median)	371	371	✅
Prior observation (q25)	0	0	✅
Prior observation (q75)	2,552	2,555	≈
Prior observation (min)	0	0	✅
Prior observation (max)	29,295	29,295	✅
Future observation (mean)	5,015.84	5,015.84	≈
Future observation (sd)	4,887.62	4,887.62	≈
Future observation (median)	3,710	3,710	✅
Future observation (q25)	1,484	1,484	✅
Future observation (q75)	7,049	7,049	✅
Future observation (min)	0	0	✅
Future observation (max)	29,680	29,680	✅
Sex=Female (count)	292	292	✅
Sex=Female (percentage)	23.49	23.49	≈
Sex=Male (count)	951	951	✅
Sex=Male (percentage)	76.51	76.51	≈

05 — Incidence¶

R: IncidencePrevalence · Python: omopy.incidence

Metric	R	Python	Match
Numerator / Events (sum)	1,220	1,220	✅
Total Denominator (sum)	105,911	122,662	❌
Events (2000)	31	31	✅
Incidence/100K pys (2000)	5,243	3,439	❌
Events (2005)	31	31	✅
Incidence/100K pys (2005)	3,958	2,544	❌
Events (2010)	36	36	✅
Incidence/100K pys (2010)	4,364	2,842	❌
Events (2015)	17	17	✅
Incidence/100K pys (2015)	283	265	❌
Events (2020)	26	26	✅
Incidence/100K pys (2020)	416	388	❌

Note: Both implementations identify the same 1,220 events, so the numerator matches exactly. The rate differences stem from how denominator person-time is calculated: R's generateDenominatorCohortSet() trims observation time outside the study window more aggressively, whereas OMOPy counts the full overlap between each observation period and each calendar year. OMOPy therefore reports a larger total denominator (122,662 vs 105,911) and correspondingly lower rates. This is a known algorithmic difference under investigation.
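To make the denominator effect concrete, the sketch below inverts the usual rate formula (events divided by person-years, scaled to 100,000) to recover the person-time implied by each reported rate for the year 2000. It assumes both packages use this standard definition; the function names are illustrative and do not belong to either package.

```python
def incidence_per_100k(events: int, person_years: float) -> float:
    """Incidence rate per 100,000 person-years."""
    return events / person_years * 100_000


def implied_person_years(events: int, rate_per_100k: float) -> float:
    """Invert the rate formula to recover the person-time behind a reported rate."""
    return events / rate_per_100k * 100_000


# Values for calendar year 2000, taken from the incidence table above.
events_2000 = 31
r_pys = implied_person_years(events_2000, 5_243)   # R rate
py_pys = implied_person_years(events_2000, 3_439)  # Python rate

print(f"R implied person-years:      {r_pys:.1f}")
print(f"Python implied person-years: {py_pys:.1f}")
print(f"Python / R person-time:      {py_pys / r_pys:.2f}")
```

With identical event counts, the Python rate corresponds to roughly 901 person-years against roughly 591 for R, which is the gap the note attributes to the denominator calculation.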

06 — Drug Utilisation¶

R: DrugUtilisation · Python: omopy.drug

Metric	R	Python	Match
Number records	1,756	1,473	❌
Number subjects	1,473	1,473	✅
Number eras (mean)	1	1.18	❌
Initial quantity (mean)	0	0	✅
Cumulative quantity (mean)	0	0	✅

Note: Subject counts match exactly (1,473). The 283 extra R records come from R's DrugUtilisation::generateIngredientCohortSet() producing overlapping exposure intervals before collapsing, whereas OMOPy deduplicates during cohort construction.
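The effect of collapsing overlapping exposures on record counts can be sketched with a generic interval merge. This is a minimal illustration, not the logic of DrugUtilisation or OMOPy: the function name is hypothetical, the dates are made up, and no gap tolerance between exposures is modelled.

```python
from datetime import date


def collapse_overlaps(intervals):
    """Merge overlapping (start, end) intervals for a single subject."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it instead of adding a record.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical exposures for one subject.
exposures = [
    (date(2020, 1, 1), date(2020, 1, 30)),
    (date(2020, 1, 15), date(2020, 2, 14)),  # overlaps the first exposure
    (date(2020, 6, 1), date(2020, 6, 30)),   # separate exposure
]
eras = collapse_overlaps(exposures)
print(len(exposures), "raw records ->", len(eras), "collapsed records")
```

Whether overlaps are merged before or after records are counted changes the record total without changing the number of subjects, which is the pattern seen in the table.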

07 — Survival¶

R: CohortSurvival · Python: omopy.survival

Metric	R	Python	Match
Row Count	90,860	90,843	≈
Survival @ 1-year	0.9690 (day 365)	0.9690 (day 365)	✅
Survival @ 3-year	0.9084 (day 1095)	0.9084 (day 1095)	✅
Survival @ 5-year	0.8588 (day 1825)	0.8588 (day 1825)	✅

08 — Codelist Generation¶

R: CodelistGenerator · Python: omopy.codelist

Metric	R	Python	Match
Total Concepts	1,761	1,985	❌
Shared Concepts	1,761	1,761	✅
R-only Concepts	0	0	✅
Python-only Concepts	0	224	ℹ️
R concepts in Python	100.0%	—	ℹ️

Note: 100% of R concepts are found by Python. The 224 extra Python concepts come from broader descendant traversal in the OMOP vocabulary — a coverage advantage, not an error.
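The metrics in this table are plain set operations on concept IDs. The sketch below shows one way to derive them; the function name and the toy concept IDs are hypothetical and are not taken from benchmarks/compare.py.

```python
def compare_codelists(r_ids, py_ids):
    """Summarise the overlap between two collections of concept IDs."""
    r_set, py_set = set(r_ids), set(py_ids)
    shared = r_set & py_set
    return {
        "total_r": len(r_set),
        "total_python": len(py_set),
        "shared": len(shared),
        "r_only": len(r_set - py_set),
        "python_only": len(py_set - r_set),
        # Share of R concepts that Python also found.
        "r_in_python_pct": 100.0 * len(shared) / len(r_set) if r_set else 0.0,
    }


# Toy concept IDs (made up for illustration).
r_ids = [101, 102, 103]
py_ids = [101, 102, 103, 201, 202]
summary = compare_codelists(r_ids, py_ids)
print(summary)
```

In the benchmark itself the same arithmetic gives 1,985 − 1,761 = 224 Python-only concepts and 0 R-only concepts.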

09 — Treatment Patterns¶

R: TreatmentPatterns · Python: omopy.treatment

Metric	R	Python	Match
Row Count	0	0	✅

Note: Both return 0 rows. Synthea's concept-based drug cohorts yield no matches in drug_exposure. This is a data limitation, not a code issue.

10 — Drug Diagnostics¶

R: DrugExposureDiagnostics · Python: omopy.drug_diagnostics

Metric	R	Python	Match
Check Count	12	19	ℹ️
Shared Checks	0 of 12 (R)	0 of 5 (Py)	ℹ️

Note: The R benchmark saves a 12-row summary table (check_name, n_rows), while Python saves 19 detail rows with 43 columns. This is a benchmark script format difference, not a code difference. Both run the same 5 checks successfully.

Concordance Summary¶

55 / 64 checks passed (86%)

✅ = exact match
≈ = within 2% relative tolerance (acceptable for floating-point / boundary differences)
ℹ️ = informational difference (expected, see Known Differences)
❌ = differs (see Known Differences for explanation)
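A minimal sketch of how the ✅ / ≈ / ❌ markers could be assigned to numeric pairs is shown below. It is an assumption about the rule, not the code in benchmarks/compare.py: in particular, measuring the relative difference against the larger of the two magnitudes is a choice made here for illustration, and the ℹ️ marker is not modelled.

```python
def classify(r_value: float, py_value: float, rel_tol: float = 0.02) -> str:
    """Return the concordance marker for one pair of numeric values."""
    if r_value == py_value:
        return "✅"
    # Relative difference against the larger magnitude (an assumption).
    # The scale cannot be zero here because equal values returned above.
    scale = max(abs(r_value), abs(py_value))
    if abs(r_value - py_value) / scale <= rel_tol:
        return "≈"
    return "❌"


# Pairs taken from the tables above.
print(classify(1_243, 1_243))    # cohort record count
print(classify(2_552, 2_555))    # prior observation q75
print(classify(1_756, 1_473))    # drug utilisation record count

# Pass rate as reported in the summary line.
pass_rate = round(55 / 64 * 100)
print(f"{pass_rate}%")
```

Rows such as Age (mean) show identical displayed values but carry the ≈ marker; a comparison on unrounded values, as in this sketch, would produce that result when the underlying numbers differ only in later decimal places.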

Quality Assurance¶

Test Suite¶

OMOPy maintains a comprehensive automated quality workflow to help ensure correctness:

Unit and integration tests cover core functionality across the project
Continuous integration runs checks on repository changes
Ruff linting and formatting are applied as part of the development workflow
Pre-commit hooks help catch non-conforming changes before they are committed

OMOP CDM Conformance¶

Both R and Python operate on the same DuckDB database (synthea_1k.duckdb)
CDM version 5.3.1, vocabulary v5.0 22-JUN-22
Schema: main with all 37 standard OMOP CDM tables
Data generated by Synthea with 10,681 synthetic patients

API Design Philosophy¶

OMOPy follows the OHDSI R package APIs as closely as possible:

Function names use Python convention (snake_case) but map 1:1 to R equivalents
Output schemas follow the omop_result / summarised_result format
Concept sets, cohort definitions, and CDM references work the same way
See R Package Mapping for the complete correspondence table
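The naming convention can be illustrated with a mechanical camelCase to snake_case conversion. This is only a sketch of the convention: the helper is hypothetical, and the names it produces are not confirmed OMOPy function names, so the R Package Mapping remains the authoritative reference.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert an R-style camelCase name to Python-style snake_case."""
    # Insert "_" before every uppercase letter that is not at the start, then lowercase.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


# R function names mentioned elsewhere on this page.
for r_name in ("generateDenominatorCohortSet", "generateIngredientCohortSet"):
    print(r_name, "->", camel_to_snake(r_name))
```

A purely mechanical rule like this splits acronyms letter by letter, which is one reason an explicit mapping table is more reliable than inferring names.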

General Notes on Differences¶

Area	Explanation
Column ordering	Python and R may order columns differently (e.g. `additional_name` position). Semantically identical.
`NA` vs `""`	R uses `NA` for missing categorical levels; Python uses empty string.
Casing	Some R packages use lowercase (`number records`); OMOPy capitalises the first word (`Number records`).
Floating-point precision	Minor rounding differences (e.g. `57.14` vs `57.1360`) due to different numeric libraries.
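The presentation differences listed above can be neutralised before values are compared. The sketch below shows one possible normalisation step for a single result row; the function name, the two-decimal rounding, and the handling of the literal string "NA" (as it would appear in exported R output) are assumptions made for illustration.

```python
def normalise_row(row: dict, ndigits: int = 2) -> dict:
    """Return a copy of one result row with presentation differences removed."""
    clean = {}
    for key, value in row.items():
        if value is None or value == "" or value == "NA":
            value = None                   # unify R's NA and Python's empty string
        elif isinstance(value, float):
            value = round(value, ndigits)  # absorb minor rounding differences
        elif isinstance(value, str):
            value = value.strip().lower()  # ignore casing
        clean[key.lower()] = value
    return clean


# Same row as each side might report it; dicts compare equal regardless of key order.
r_row = {"variable_name": "number records", "additional_name": "NA", "estimate": 57.14}
py_row = {"estimate": 57.1360, "variable_name": "Number records", "additional_name": ""}

print(normalise_row(r_row) == normalise_row(py_row))
```

Comparing dictionaries rather than positional tuples is what makes column ordering irrelevant in this sketch.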

How to Reproduce¶

# 1. Generate the test database (requires R)
Rscript benchmarks/generate_synthea_1k.R

# 2. Install R packages (one-time)
Rscript benchmarks/r/install_packages.R

# 3. Run R benchmarks
Rscript benchmarks/r/run_all.R

# 4. Run Python benchmarks
python benchmarks/python/run_all.py

# 5. Generate this comparison page
python benchmarks/compare.py

Notes¶

R Time and Python Time include CDM connection overhead
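R Rows and Python Rows show result-set size; because output schemas differ between R and Python, differing row counts do not by themselves indicate differing results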
Times are wall-clock, single-run, not averaged
The dataset is Synthea-generated with ~10K synthetic patients
See R Package Mapping for module correspondence
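Because the reported times come from a single wall-clock run, they are sensitive to noise. The sketch below shows how a single-run measurement can be taken with the standard library and how repeating the run gives a more stable summary. The helper names and the stand-in workload are hypothetical and unrelated to the benchmark scripts.

```python
import statistics
import time


def time_once(fn) -> float:
    """Wall-clock seconds for one call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def time_repeated(fn, runs: int = 5) -> dict:
    """Summarise several wall-clock measurements of fn."""
    samples = [time_once(fn) for _ in range(runs)]
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def workload():
    # Stand-in for a benchmark step.
    sum(i * i for i in range(100_000))


single = time_once(workload)
summary = time_repeated(workload, runs=5)
print(f"single run: {single:.4f}s")
print(f"median of 5: {summary['median']:.4f}s")
```

Small gaps between the R and Python columns in the timing table are best read with this variability in mind.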