R Package Mapping
OMOPy reimplements the OHDSI / DARWIN-EU R package ecosystem as a single Python monorepo package. The table below shows how each R package maps to an OMOPy module.
For the full development history, design decisions, and technical details behind each module, see the Audit Trail.
Package ↔ Module mapping
| OHDSI R Package | OMOPy Module | Phase | Description |
|---|---|---|---|
| omopgenerics | omopy.generics | 0 | Core type system: CDM schema, codelists, cohort tables, summarised results |
| CDMConnector | omopy.connector | 1–2 | Database connection, CDM reference, cohort generation, CIRCE engine, subsetting, snapshots |
| PatientProfiles | omopy.profiles | 3A | Patient-level enrichment: demographics, intersections (flag/count/date/days), death |
| CodelistGenerator | omopy.codelist | 3B | Vocabulary-based code list generation, hierarchy traversal, diagnostics |
| visOmopResults | omopy.vis | 3C | Formatting, tabulation, and plotting of SummarisedResult objects |
| CohortCharacteristics | omopy.characteristics | 4A | Cohort characterisation: summarise, table, and plot functions for demographics, timing, overlap |
| IncidencePrevalence | omopy.incidence | 4B | Denominator generation, incidence rate and prevalence estimation with confidence intervals |
| DrugUtilisation | omopy.drug | 5A | Drug cohort generation, daily dose, utilisation metrics, indication, treatment, dose coverage |
| CohortSurvival | omopy.survival | 5B | Kaplan-Meier and Aalen-Johansen competing-risk survival analysis |
| TreatmentPatterns | omopy.treatment | 6A | Sequential treatment pathway computation, Sankey/sunburst visualisation |
| DrugExposureDiagnostics | omopy.drug_diagnostics | 6B | 12 diagnostic checks on drug exposure records (missingness, duration, dose, etc.) |
| PregnancyIdentifier | omopy.pregnancy | 7A | HIPPS algorithm for pregnancy episode identification |
| TestGenerator | omopy.testing | 8A | Synthetic OMOP CDM test data generation |
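
When porting an R script, it can help to look up the OMOPy module for each `library()` call. The sketch below simply encodes the table above as a dictionary; the `R_TO_OMOPY` constant and the `omopy_module_for` helper are hypothetical names used for illustration and are not part of OMOPy itself.

```python
# Mapping copied from the table above: OHDSI R package -> OMOPy module path.
# R_TO_OMOPY and omopy_module_for are illustrative names, not OMOPy API.
R_TO_OMOPY = {
    "omopgenerics": "omopy.generics",
    "CDMConnector": "omopy.connector",
    "PatientProfiles": "omopy.profiles",
    "CodelistGenerator": "omopy.codelist",
    "visOmopResults": "omopy.vis",
    "CohortCharacteristics": "omopy.characteristics",
    "IncidencePrevalence": "omopy.incidence",
    "DrugUtilisation": "omopy.drug",
    "CohortSurvival": "omopy.survival",
    "TreatmentPatterns": "omopy.treatment",
    "DrugExposureDiagnostics": "omopy.drug_diagnostics",
    "PregnancyIdentifier": "omopy.pregnancy",
    "TestGenerator": "omopy.testing",
}


def omopy_module_for(r_package: str) -> str:
    """Return the OMOPy module path for an R package name (case-insensitive)."""
    lookup = {name.lower(): module for name, module in R_TO_OMOPY.items()}
    try:
        return lookup[r_package.lower()]
    except KeyError:
        raise KeyError(f"No OMOPy module listed for R package {r_package!r}") from None


# Example: which module replaces library(CohortSurvival)?
survival_module = omopy_module_for("CohortSurvival")
```

The returned string could then be passed to `importlib.import_module`, assuming OMOPy is installed in the environment.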
Key technology differences
The table below summarises the main technology substitutions made in the Python rewrite:
| Concern | R ecosystem | OMOPy (Python) |
|---|---|---|
| Lazy SQL | dbplyr | Ibis |
| DataFrames | tibble / data.frame | Polars |
| Data models | S4 classes / R6 | Pydantic BaseModel |
| Plotting | ggplot2 + plotly | Plotly |
| Tables | gt | great_tables |
| Survival | survival (R) | lifelines + custom Aalen-Johansen |
| Statistics | stats (R) | SciPy |
| Package manager | renv | uv |
Design philosophy
- Single package. All 13 R packages are consolidated into one installable Python package (`pip install omopy`) with sub-modules.
- Clean-room implementation. Code was written against specifications and documentation only; no R source code was consulted.
- Lazy by default. Database queries are built as Ibis expressions and only executed when `.collect()` is called.
- Standardised output. All analytics produce `SummarisedResult` objects (the Python equivalent of `summarised_result` in omopgenerics), enabling consistent downstream formatting, tabulation, and plotting.
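
To make the "lazy by default" principle concrete, here is a minimal pure-Python sketch of deferred execution. It is a stand-in written for illustration only: it does not use Ibis or any OMOPy API, and `LazyQuery`, `fake_person_table`, and the sample rows are all hypothetical. The point is only that chaining operations records a plan, and the data source is not touched until `collect()` is called.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

Row = dict  # one record, e.g. {"person_id": 1, "year_of_birth": 1980}


@dataclass(frozen=True)
class LazyQuery:
    """Hypothetical stand-in for a lazy expression: stores steps, runs nothing."""
    source: Callable[[], Iterable[Row]]
    steps: tuple = ()

    def filter(self, predicate: Callable[[Row], bool]) -> "LazyQuery":
        # Record the filter; return a new query without reading any data.
        return LazyQuery(self.source, self.steps + (("filter", predicate),))

    def select(self, *columns: str) -> "LazyQuery":
        # Record the projection; still no data access.
        return LazyQuery(self.source, self.steps + (("select", columns),))

    def collect(self) -> list:
        # Only here is the source read and the recorded steps applied in order.
        rows = list(self.source())
        for kind, arg in self.steps:
            if kind == "filter":
                rows = [row for row in rows if arg(row)]
            else:
                rows = [{col: row[col] for col in arg} for row in rows]
        return rows


reads = []  # grows by one entry each time the fake table is scanned


def fake_person_table():
    reads.append("scan")
    return [
        {"person_id": 1, "year_of_birth": 1980},
        {"person_id": 2, "year_of_birth": 1955},
    ]


query = (
    LazyQuery(fake_person_table)
    .filter(lambda row: row["year_of_birth"] < 1970)
    .select("person_id")
)
reads_before_collect = len(reads)  # the table has not been scanned yet
result = query.collect()           # the single scan happens here
```

In a real backend the recorded steps would be compiled to SQL and run in the database rather than applied to Python lists, but the user-facing contract is the same: building a query is cheap, and cost is incurred only at collection time.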