R Package Mapping

OMOPy reimplements the OHDSI / DARWIN-EU R package ecosystem as a single Python package developed in one monorepo. The table below shows how each R package maps to an OMOPy module.

For the full development history, design decisions, and technical details behind each module, see the Audit Trail.

Package ↔ Module mapping

| OHDSI R Package | OMOPy Module | Phase | Description |
| --- | --- | --- | --- |
| omopgenerics | omopy.generics | 0 | Core type system — CDM schema, codelists, cohort tables, summarised results |
| CDMConnector | omopy.connector | 1–2 | Database connection, CDM reference, cohort generation, CIRCE engine, subsetting, snapshots |
| PatientProfiles | omopy.profiles | 3A | Patient-level enrichment — demographics, intersections (flag/count/date/days), death |
| CodelistGenerator | omopy.codelist | 3B | Vocabulary-based code list generation, hierarchy traversal, diagnostics |
| visOmopResults | omopy.vis | 3C | Formatting, tabulation, and plotting of SummarisedResult objects |
| CohortCharacteristics | omopy.characteristics | 4A | Cohort characterisation — summarise, table, and plot functions for demographics, timing, overlap |
| IncidencePrevalence | omopy.incidence | 4B | Denominator generation, incidence rate and prevalence estimation with confidence intervals |
| DrugUtilisation | omopy.drug | 5A | Drug cohort generation, daily dose, utilisation metrics, indication, treatment, dose coverage |
| CohortSurvival | omopy.survival | 5B | Kaplan-Meier and Aalen-Johansen competing-risk survival analysis |
| TreatmentPatterns | omopy.treatment | 6A | Sequential treatment pathway computation, Sankey/sunburst visualisation |
| DrugExposureDiagnostics | omopy.drug_diagnostics | 6B | 12 diagnostic checks on drug exposure records (missingness, duration, dose, etc.) |
| PregnancyIdentifier | omopy.pregnancy | 7A | HIPPS algorithm for pregnancy episode identification |
| TestGenerator | omopy.testing | 8A | Synthetic OMOP CDM test data generation |
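
When migrating an R analysis script, it can help to have the mapping above available programmatically. The sketch below restates the table as a plain dictionary with a small lookup helper. The names `R_TO_OMOPY` and `omopy_module_for` are hypothetical and are not part of OMOPy; only the package and module names come from the table.

```python
# Hypothetical helper: the table above as a lookup, not an OMOPy API.
R_TO_OMOPY = {
    "omopgenerics": "omopy.generics",
    "CDMConnector": "omopy.connector",
    "PatientProfiles": "omopy.profiles",
    "CodelistGenerator": "omopy.codelist",
    "visOmopResults": "omopy.vis",
    "CohortCharacteristics": "omopy.characteristics",
    "IncidencePrevalence": "omopy.incidence",
    "DrugUtilisation": "omopy.drug",
    "CohortSurvival": "omopy.survival",
    "TreatmentPatterns": "omopy.treatment",
    "DrugExposureDiagnostics": "omopy.drug_diagnostics",
    "PregnancyIdentifier": "omopy.pregnancy",
    "TestGenerator": "omopy.testing",
}

# Case-insensitive index so "cdmconnector" and "CDMConnector" both resolve.
_LOWERCASE_INDEX = {name.lower(): module for name, module in R_TO_OMOPY.items()}


def omopy_module_for(r_package: str) -> str:
    """Return the OMOPy module name that replaces the given R package."""
    try:
        return _LOWERCASE_INDEX[r_package.strip().lower()]
    except KeyError:
        raise KeyError(f"No OMOPy module is listed for R package {r_package!r}") from None


print(omopy_module_for("CDMConnector"))
print(omopy_module_for("drugutilisation"))
```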

Key technology differences

The table below summarises the main technology substitutions made in the Python rewrite:

| Concern | R ecosystem | OMOPy (Python) |
| --- | --- | --- |
| Lazy SQL | dbplyr | Ibis |
| DataFrames | tibble / data.frame | Polars |
| Data models | S4 classes / R6 | Pydantic BaseModel |
| Plotting | ggplot2 + plotly | Plotly |
| Tables | gt | great_tables |
| Survival | survival (R) | lifelines + custom Aalen-Johansen |
| Statistics | stats (R) | SciPy |
| Package manager | renv | uv |

Design philosophy

  1. Single package. All 13 R packages are consolidated into one installable Python package (pip install omopy) with sub-modules.
  2. Clean-room implementation. Code was written against specifications and documentation only — no R source code was consulted.
  3. Lazy by default. Database queries are built as Ibis expressions and only executed when .collect() is called.
  4. Standardised output. All analytics produce SummarisedResult objects (the Python equivalent of summarised_result in omopgenerics), enabling consistent downstream formatting, tabulation, and plotting.
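
The "lazy by default" principle can be illustrated with a minimal sketch that uses only the standard library. The `LazyQuery` class below is a toy written for this illustration; it is not OMOPy's or Ibis's API, and its `filter`, `sql`, and `collect` methods are assumptions that only mirror the behaviour described above: operations are recorded while the query is built, and nothing reaches the database until `collect()` is called.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyQuery:
    """Toy deferred query (hypothetical): records filters, runs nothing until collect()."""

    con: sqlite3.Connection
    table: str
    predicates: tuple = ()
    params: tuple = ()

    def filter(self, predicate: str, *params) -> "LazyQuery":
        # Returns a new description of the query; no SQL is sent to the database here.
        return LazyQuery(
            self.con, self.table, self.predicates + (predicate,), self.params + params
        )

    def sql(self) -> str:
        # Render the pending query so it can be inspected before execution.
        where = " AND ".join(self.predicates)
        return f"SELECT * FROM {self.table}" + (f" WHERE {where}" if where else "")

    def collect(self) -> list:
        # The only place where the query is actually executed.
        return self.con.execute(self.sql(), self.params).fetchall()


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE person (person_id INTEGER, year_of_birth INTEGER)")
con.executemany("INSERT INTO person VALUES (?, ?)", [(1, 1950), (2, 1985), (3, 2001)])

# Build the query first; nothing has been executed yet.
query = LazyQuery(con, "person").filter("year_of_birth >= ?", 1980)
print(query.sql())

# A row inserted after the query was built is still picked up,
# because execution is deferred until collect().
con.execute("INSERT INTO person VALUES (4, 1990)")
rows = sorted(query.collect())
print(rows)
```

In a real lazy backend the recorded operations form an expression tree that is compiled to SQL for the target database. The toy version keeps raw SQL fragments only to stay short.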