R vs Python Comparison

Auto-generated by benchmarks/compare.py on 2026-04-23 10:06

This page compares results from the Darwin EU R packages and their OMOPy Python equivalents, both run against the same synthea_1k.duckdb dataset (10,681 patients, OMOP CDM v5.3).


Cohort Overview

The benchmarks use three clinical concepts and one keyword search to build cohorts from the full 10,681-patient database:

  • Coronary Arteriosclerosis (concept 317576) — condition cohort, 1,243 subjects
  • Clopidogrel (concept 1322184) — drug cohort, 1,473 subjects
  • Simvastatin (concept 1539403) — drug cohort, used in treatment patterns
  • "coronary" keyword search — vocabulary-based codelist generation

The diagram below shows which cohort feeds each benchmark, explaining why subject counts differ across sections.

[Diagram: Cohort Overview]


Timing & Row-Count Summary

| # | Benchmark | R Package | OMOPy Module | R Time | Python Time | R Rows | Python Rows |
|---|---|---|---|---|---|---|---|
| 01 | CDM Snapshot | CDMConnector | omopy.connector | 2.42s | 10.69s | 1 | 1 |
| 02 | Cohort Generation | CDMConnector | omopy.connector | 3.54s | 6.35s | 1 | 1 |
| 03 | Patient Profiles | PatientProfiles | omopy.profiles | 4.80s | 5.69s | 100 | 100 |
| 04 | Cohort Characteristics | CohortCharacteristics | omopy.characteristics | 5.42s | 4.60s | 51 | 34 |
| 05 | Incidence | IncidencePrevalence | omopy.incidence | 9.21s | 26.34s | 716 | 672 |
| 06 | Drug Utilisation | DrugUtilisation | omopy.drug | 13.67s | 10.27s | 58 | 148 |
| 07 | Survival | CohortSurvival | omopy.survival | 10.86s | 9.24s | 90860 | 90843 |
| 08 | Codelist Generation | CodelistGenerator | omopy.codelist | 3.61s | 4.72s | 1761 | 1985 |
| 09 | Treatment Patterns | TreatmentPatterns | omopy.treatment | 9.00s | 4.66s | 0 | 0 |
| 10 | Drug Diagnostics | DrugExposureDiagnostics | omopy.drug_diagnostics | 7.17s | 6.00s | 12 | 19 |

Value Concordance

The tables below compare specific output values between R and Python for each benchmark, to check that agreement goes beyond similar row counts.

01 — CDM Snapshot

R: CDMConnector · Python: omopy.connector

| Metric | R | Python | Match |
|---|---|---|---|
| CDM Version | 5.3.1 | 5.3.1 | |
| Vocabulary Version | v5.0 22-JUN-22 | v5.0 22-JUN-22 | |
| Person Count | 10,681 | 10,681 | |
| Observation Period Count | 10,681 | 10,681 | |
| Earliest Obs Start | 1926-08-15 | 1926-08-15 | |
| Latest Obs End | 2023-06-20 | 2023-06-20 | |

02 — Cohort Generation

R: CDMConnector · Python: omopy.connector

| Metric | R | Python | Match |
|---|---|---|---|
| n_records | 1,243 | 1,243 | |
| n_subjects | 1,243 | 1,243 | |

03 — Patient Profiles

R: PatientProfiles · Python: omopy.profiles

| Metric | R | Python | Match |
|---|---|---|---|
| Row Count | 100 | 100 | |
| Mean Age | 56.10 | 56.10 | |
| Sex = Female | 23 | 23 | |
| Sex = Male | 77 | 77 | |
| Subject ID Overlap | 100/100 | 100/100 | |

04 — Cohort Characteristics

R: CohortCharacteristics · Python: omopy.characteristics

| Metric | R | Python | Match |
|---|---|---|---|
| Number records (count) | 1,243 | 1,243 | |
| Number subjects (count) | 1,243 | 1,243 | |
| Age (mean) | 57.14 | 57.14 | |
| Age (sd) | 23.94 | 23.94 | |
| Age (median) | 63 | 63 | |
| Age (q25) | 46 | 46 | |
| Age (q75) | 75 | 75 | |
| Age (min) | 0 | 0 | |
| Age (max) | 98 | 98 | |
| Prior observation (mean) | 1,775.47 | 1,775.47 | |
| Prior observation (sd) | 3,296.43 | 3,296.43 | |
| Prior observation (median) | 371 | 371 | |
| Prior observation (q25) | 0 | 0 | |
| Prior observation (q75) | 2,552 | 2,555 | |
| Prior observation (min) | 0 | 0 | |
| Prior observation (max) | 29,295 | 29,295 | |
| Future observation (mean) | 5,015.84 | 5,015.84 | |
| Future observation (sd) | 4,887.62 | 4,887.62 | |
| Future observation (median) | 3,710 | 3,710 | |
| Future observation (q25) | 1,484 | 1,484 | |
| Future observation (q75) | 7,049 | 7,049 | |
| Future observation (min) | 0 | 0 | |
| Future observation (max) | 29,680 | 29,680 | |
| Sex=Female (count) | 292 | 292 | |
| Sex=Female (percentage) | 23.49 | 23.49 | |
| Sex=Male (count) | 951 | 951 | |
| Sex=Male (percentage) | 76.51 | 76.51 | |

05 — Incidence

R: IncidencePrevalence · Python: omopy.incidence

| Metric | R | Python | Match |
|---|---|---|---|
| Numerator / Events (sum) | 1,220 | 1,220 | |
| Total Denominator (sum) | 105,911 | 122,662 | |
| Events (2000) | 31 | 31 | |
| Incidence/100K pys (2000) | 5,243 | 3,439 | |
| Events (2005) | 31 | 31 | |
| Incidence/100K pys (2005) | 3,958 | 2,544 | |
| Events (2010) | 36 | 36 | |
| Incidence/100K pys (2010) | 4,364 | 2,842 | |
| Events (2015) | 17 | 17 | |
| Incidence/100K pys (2015) | 283 | 265 | |
| Events (2020) | 26 | 26 | |
| Incidence/100K pys (2020) | 416 | 388 | |

Note: Both implementations identify the same 1,220 events, so the numerator matches exactly. The rate differences stem from the denominator person-time calculation: R's generateDenominatorCohortSet() excludes more observation time outside the study window, while OMOPy includes the full overlap of each observation period with each calendar year. OMOPy's total denominator is therefore larger (122,662 vs 105,911) and its rates are lower. This is a known algorithmic difference under investigation.
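To make the denominator effect concrete, here is a minimal sketch of how a per-year incidence rate follows from events and person-time. The function names, the example observation period, and the two denominators are hypothetical, and this is not the implementation used by either package. It only illustrates that the same numerator over a larger denominator gives a lower rate.

```python
from datetime import date


def person_days_in_year(obs_start: date, obs_end: date, year: int) -> int:
    """Days of an observation period that fall inside a calendar year (inclusive)."""
    start = max(obs_start, date(year, 1, 1))
    end = min(obs_end, date(year, 12, 31))
    if end < start:
        return 0  # no overlap with this calendar year
    return (end - start).days + 1


def incidence_per_100k_pys(events: int, person_days: float) -> float:
    """Incidence rate per 100,000 person-years, assuming 365.25 days per year."""
    person_years = person_days / 365.25
    return events / person_years * 100_000


# Hypothetical person observed from mid-1999 to the end of March 2000:
# only January to March contribute person-time to the year 2000.
days_2000 = person_days_in_year(date(1999, 7, 1), date(2000, 3, 31), 2000)

# Same 31 events over two hypothetical denominators (in person-days).
rate_small_denominator = incidence_per_100k_pys(31, 200_000)
rate_large_denominator = incidence_per_100k_pys(31, 300_000)
```

How each implementation trims observation time at the study-window boundaries decides which person-days are counted, which is where the two sets of rates diverge.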

06 — Drug Utilisation

R: DrugUtilisation · Python: omopy.drug

| Metric | R | Python | Match |
|---|---|---|---|
| Number records | 1,756 | 1,473 | |
| Number subjects | 1,473 | 1,473 | |
| Number eras (mean) | 1 | 1.18 | |
| Initial quantity (mean) | 0 | 0 | |
| Cumulative quantity (mean) | 0 | 0 | |

Note: Subject counts match exactly (1,473). The 283 extra R records come from R's DrugUtilisation::generateIngredientCohortSet() producing overlapping exposure intervals before collapsing, whereas OMOPy deduplicates during cohort construction.
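The effect of collapsing overlapping exposures on record counts can be sketched as a standard interval merge. This is an illustration only: collapse_exposures, its gap_days parameter, and the sample dates are hypothetical, and neither DrugUtilisation nor OMOPy is claimed to work exactly this way.

```python
from datetime import date, timedelta


def collapse_exposures(intervals, gap_days=0):
    """Merge (start, end) intervals that overlap or start within gap_days of the previous end."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + timedelta(days=gap_days):
            # This exposure overlaps the current era, so extend the era.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # No overlap: start a new era.
            merged.append((start, end))
    return merged


# Three hypothetical exposure records for one subject.
exposures = [
    (date(2020, 1, 1), date(2020, 1, 30)),
    (date(2020, 1, 15), date(2020, 2, 14)),  # overlaps the first record
    (date(2020, 6, 1), date(2020, 6, 30)),
]
eras = collapse_exposures(exposures)  # two eras from three records
```

Counting records before or after a step like this changes the record total without changing the number of subjects.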

07 — Survival

R: CohortSurvival · Python: omopy.survival

| Metric | R | Python | Match |
|---|---|---|---|
| Row Count | 90,860 | 90,843 | |
| Survival @ 1-year | 0.9690 (day 365) | 0.9690 (day 365) | |
| Survival @ 3-year | 0.9084 (day 1095) | 0.9084 (day 1095) | |
| Survival @ 5-year | 0.8588 (day 1825) | 0.8588 (day 1825) | |

08 — Codelist Generation

R: CodelistGenerator · Python: omopy.codelist

| Metric | R | Python | Match |
|---|---|---|---|
| Total Concepts | 1,761 | 1,985 | |
| Shared Concepts | 1,761 | 1,761 | |
| R-only Concepts | 0 | 0 | |
| Python-only Concepts | 0 | 224 | ℹ️ |
| R concepts in Python | 100.0% | | ℹ️ |

Note: 100% of R concepts are found by Python. The 224 extra Python concepts come from broader descendant traversal in the OMOP vocabulary; this reflects wider coverage rather than an error.
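The overlap metrics in the table above are plain set arithmetic. The sketch below uses synthetic concept ids sized like the two codelists; compare_codelists is a hypothetical helper, not part of benchmarks/compare.py as far as this page shows.

```python
def compare_codelists(r_concepts, py_concepts):
    """Summarise the overlap between two collections of concept ids."""
    r_set, py_set = set(r_concepts), set(py_concepts)
    shared = r_set & py_set
    return {
        "shared": len(shared),
        "r_only": len(r_set - py_set),
        "python_only": len(py_set - r_set),
        # Share of R concepts that Python also found, as a percentage.
        "r_in_python_pct": 100.0 * len(shared) / len(r_set) if r_set else 0.0,
    }


# Synthetic ids: 1,761 "R" concepts, all contained in 1,985 "Python" concepts.
r_ids = range(1, 1762)
py_ids = range(1, 1986)
summary = compare_codelists(r_ids, py_ids)
```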

09 — Treatment Patterns

R: TreatmentPatterns · Python: omopy.treatment

| Metric | R | Python | Match |
|---|---|---|---|
| Row Count | 0 | 0 | |

Note: Both return 0 rows. Synthea's concept-based drug cohorts yield no matches in drug_exposure. This is a data limitation, not a code issue.

10 — Drug Diagnostics

R: DrugExposureDiagnostics · Python: omopy.drug_diagnostics

| Metric | R | Python | Match |
|---|---|---|---|
| Check Count | 12 | 19 | ℹ️ |
| Shared Checks | 0 of 12 (R) / 5 (Py) | | ℹ️ |

Note: The R benchmark saves a 12-row summary table (check_name, n_rows), while Python saves 19 detail rows with 43 columns. This is a benchmark script format difference, not a code difference. Both run the same 5 checks successfully.


Concordance Summary

55 / 64 checks passed (86%)

  • ✅ = exact match
  • ≈ = within 2% relative tolerance (acceptable for floating-point / boundary differences)
  • ℹ️ = informational difference (expected, see the note under the relevant benchmark)
  • ❌ = differs (see the note under the relevant benchmark for an explanation)
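The exact-match and 2% tolerance rules in the legend can be sketched as follows. The classify function is hypothetical, and measuring the relative difference against the larger of the two magnitudes is an assumption; the page does not say which value the tolerance is relative to. Informational rows are labelled by hand and are not derived here.

```python
def classify(r_value, py_value, rel_tol=0.02):
    """Label a pair of numeric results as 'exact', 'approx', or 'differs'."""
    if r_value == py_value:
        return "exact"
    # Assumption: the tolerance is relative to the larger magnitude.
    scale = max(abs(r_value), abs(py_value))
    if scale and abs(r_value - py_value) / scale <= rel_tol:
        return "approx"
    return "differs"


# Values taken from the tables on this page.
age_mean = classify(57.14, 57.14)          # identical values
prior_obs_q75 = classify(2552, 2555)       # about 0.1% apart
incidence_2000 = classify(5243, 3439)      # about 34% apart

# Headline figure from the summary: 55 of 64 checks passed.
pass_pct = round(100 * 55 / 64)
```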

Quality Assurance

Test Suite

OMOPy uses an automated quality workflow to help ensure correctness:

  • Unit and integration tests cover core functionality across the project
  • Continuous integration runs checks on repository changes
  • Ruff linting and formatting are applied as part of the development workflow
  • Pre-commit hooks help catch non-conforming changes before they are committed

OMOP CDM Conformance

  • Both R and Python operate on the same DuckDB database (synthea_1k.duckdb)
  • CDM version 5.3.1, vocabulary v5.0 22-JUN-22
  • Schema: main with all 37 standard OMOP CDM tables
  • Data generated by Synthea with 10,681 synthetic patients

API Design Philosophy

OMOPy follows the OHDSI R package APIs as closely as possible:

  • Function names use Python convention (snake_case) but map 1:1 to R equivalents
  • Output schemas follow the omop_result / summarised_result format
  • Concept sets, cohort definitions, and CDM references work the same way
  • See R Package Mapping for the complete correspondence table
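As an illustration of the naming convention, the sketch below converts camelCase R function names to snake_case. The helper is hypothetical, and whether OMOPy exposes exactly the resulting names is an assumption; the R Package Mapping page remains the authoritative list.

```python
import re


def r_name_to_snake_case(name: str) -> str:
    """Convert a camelCase R function name to snake_case."""
    # Insert "_" at every position where a lowercase letter or digit
    # is followed by an uppercase letter, then lowercase everything.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


# R function names mentioned on this page.
denominator_name = r_name_to_snake_case("generateDenominatorCohortSet")
ingredient_name = r_name_to_snake_case("generateIngredientCohortSet")
```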

General Notes on Differences

| Area | Explanation |
|---|---|
| Column ordering | Python and R may order columns differently (e.g. additional_name position). Semantically identical. |
| NA vs "" | R uses NA for missing categorical levels; Python uses empty string. |
| Casing | Some R packages use lowercase (number records); OMOPy uses sentence case (Number records). |
| Floating-point precision | Minor rounding differences (e.g. 57.14 vs 57.1360) due to different numeric libraries. |
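A comparison script can neutralise these differences before checking values. The sketch below is one possible approach with hypothetical helper names; it assumes R's NA arrives either as None or as the string "NA", which depends on how the results were exported.

```python
def normalise_label(value):
    """Map missing markers to None and make labels case-insensitive."""
    # Assumption: R's NA shows up as None or the literal string "NA".
    if value is None or value in ("", "NA"):
        return None
    return value.strip().lower()


def same_at_precision(r_value: float, py_value: float, decimals: int = 2) -> bool:
    """Compare two numbers after rounding both to the same number of decimals."""
    return round(r_value, decimals) == round(py_value, decimals)


labels_match = normalise_label("number records") == normalise_label("Number records")
missing_match = normalise_label("NA") == normalise_label("")
ages_match = same_at_precision(57.14, 57.1360)
```

Column ordering can be handled the same way, by selecting columns by name rather than by position before comparing.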

How to Reproduce

# 1. Generate the test database (requires R)
Rscript benchmarks/generate_synthea_1k.R

# 2. Install R packages (one-time)
Rscript benchmarks/r/install_packages.R

# 3. Run R benchmarks
Rscript benchmarks/r/run_all.R

# 4. Run Python benchmarks
python benchmarks/python/run_all.py

# 5. Generate this comparison page
python benchmarks/compare.py

Notes

  • R Time and Python Time include CDM connection overhead
  • Rows shows result set size (schemas differ between R and Python)
  • Times are wall-clock, single-run, not averaged
  • The dataset is Synthea-generated with ~10K synthetic patients
  • See R Package Mapping for module correspondence