omopy.treatment¶
Treatment pathway analysis — compute sequential treatment pathways from OMOP CDM cohort data, summarise frequencies and durations, and visualise results as Sankey diagrams, sunburst charts, and box plots.
This module is the Python equivalent of the R TreatmentPatterns package.
Plot rendering uses plotly; table rendering delegates to omopy.vis.
Core Types¶
Pydantic models for defining cohort roles and storing pathway results.
CohortSpec¶
Bases: BaseModel
Specification for a cohort used in pathway computation.
Parameters¶
cohort_id
The cohort_definition_id in the cohort table.
cohort_name
Human-readable name for this cohort.
type
Role: "target" (defines observation window), "event"
(treatment to track), or "exit" (appended after processing).
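As a rough illustration of the three roles, the sketch below uses a stdlib dataclass as a stand-in for CohortSpec. The class name CohortSpecSketch, the cohort IDs, and the cohort names are hypothetical; the real class is a Pydantic model and may validate differently.

```python
from dataclasses import dataclass
from typing import Literal, get_args

Role = Literal["target", "event", "exit"]


@dataclass(frozen=True)
class CohortSpecSketch:
    """Stdlib stand-in mirroring the documented CohortSpec fields."""

    cohort_id: int
    cohort_name: str
    type: Role

    def __post_init__(self) -> None:
        # Reject roles outside the three documented values.
        if self.type not in get_args(Role):
            raise ValueError(f"unknown cohort role: {self.type!r}")


# Hypothetical cohort IDs and names, for illustration only.
cohorts = [
    CohortSpecSketch(1, "hypertension", "target"),
    CohortSpecSketch(2, "drug_a", "event"),
    CohortSpecSketch(3, "drug_b", "event"),
    CohortSpecSketch(4, "death", "exit"),
]

# Event cohorts are the treatments that get tracked along the pathway.
event_ids = [c.cohort_id for c in cohorts if c.type == "event"]
```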
PathwayResult¶
Bases: BaseModel
Result container returned by compute_pathways().
Contains patient-level treatment history, attrition tracking,
and metadata. This is not a SummarisedResult; it is an
intermediate representation that summarise_treatment_pathways()
converts into the standardised format.
Attributes¶
treatment_history
Patient-level treatment history with columns:
person_id, index_year, event_cohort_id,
event_cohort_name, event_start_date, event_end_date,
duration_era, event_seq, age, sex,
target_cohort_id, target_cohort_name.
attrition
Step-by-step record/subject counts through the pipeline.
cohorts
The cohort specifications used.
cdm_name
Name of the CDM database.
arguments
Dictionary of all arguments passed to compute_pathways().
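To show how the treatment_history columns relate to a pathway, the sketch below orders each person's events by event_seq and joins the names. The rows are made up, and the "-" separator between steps is an assumption, not something this page specifies.

```python
from collections import defaultdict

# Hypothetical rows using a subset of the documented treatment_history columns.
rows = [
    {"person_id": 1, "event_seq": 2, "event_cohort_name": "drug_b"},
    {"person_id": 1, "event_seq": 1, "event_cohort_name": "drug_a"},
    {"person_id": 2, "event_seq": 1, "event_cohort_name": "drug_a+drug_b"},
]


def pathways_by_person(rows, sep="-"):
    """Order each person's events by event_seq and join the names.

    The separator is an assumption made for this sketch.
    """
    events = defaultdict(list)
    for row in rows:
        events[row["person_id"]].append(
            (row["event_seq"], row["event_cohort_name"])
        )
    return {
        person: sep.join(name for _, name in sorted(seq))
        for person, seq in events.items()
    }


paths = pathways_by_person(rows)
```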
Pathway Computation¶
Compute sequential treatment pathways from cohort data.
compute_pathways¶
compute_pathways(
    cohort: CohortTable,
    cdm: CdmReference,
    cohorts: list[CohortSpec],
    *,
    start_anchor: Literal["start_date", "end_date"] = "start_date",
    window_start: int = 0,
    end_anchor: Literal["start_date", "end_date"] = "end_date",
    window_end: int = 0,
    min_era_duration: int = 0,
    split_event_cohorts: list[int] | None = None,
    split_time: list[int] | None = None,
    era_collapse_size: int = 30,
    combination_window: int = 30,
    min_post_combination_duration: int = 30,
    filter_treatments: Literal["first", "changes", "all"] = "first",
    max_path_length: int = 5,
    overlap_method: Literal["truncate", "keep"] = "truncate",
    concat_targets: bool = True,
) -> PathwayResult
Compute treatment pathways from cohort data.
Takes a CohortTable with target, event, and optionally exit
cohorts, and computes sequential treatment pathways through a
multi-step pipeline: data ingestion, treatment history construction,
optional event splitting, era collapse, combination detection,
treatment filtering, and pathway sequencing.
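To make the era-collapse step concrete, here is a minimal sketch that merges consecutive eras of the same drug when the gap between them is small. It is not the library's implementation: the function name collapse_eras, the tuple layout, and the choice to treat a gap equal to the threshold as mergeable are all assumptions.

```python
from datetime import date


def collapse_eras(eras, era_collapse_size=30):
    """Merge chronologically adjacent eras of the same drug.

    `eras` is a list of (drug, start, end) tuples for one person.
    Two eras merge when the drug matches and the gap in days between
    them is at most `era_collapse_size` (inclusive by assumption).
    """
    merged = []
    for drug, start, end in sorted(eras, key=lambda e: e[1]):
        if merged:
            prev_drug, prev_start, prev_end = merged[-1]
            gap = (start - prev_end).days
            if prev_drug == drug and gap <= era_collapse_size:
                # Extend the previous era instead of starting a new one.
                merged[-1] = (drug, prev_start, max(prev_end, end))
                continue
        merged.append((drug, start, end))
    return merged


eras = [
    ("drug_a", date(2020, 1, 1), date(2020, 1, 31)),
    ("drug_a", date(2020, 2, 15), date(2020, 3, 15)),  # 15-day gap: merged
    ("drug_b", date(2020, 6, 1), date(2020, 6, 30)),
    ("drug_a", date(2020, 9, 1), date(2020, 9, 30)),   # follows drug_b: kept
]
collapsed = collapse_eras(eras)
```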
Parameters¶
cohort
A CohortTable containing all target, event, and exit cohorts.
cdm
The CdmReference for demographic data and CDM metadata.
cohorts
List of CohortSpec objects defining the role of each cohort.
start_anchor
Anchor for observation window start: "start_date" or
"end_date" of the target cohort.
window_start
Day offset from start_anchor for the observation window start.
end_anchor
Anchor for observation window end: "start_date" or
"end_date" of the target cohort.
window_end
Day offset from end_anchor for the observation window end.
min_era_duration
Minimum duration in days for an event era to be included.
split_event_cohorts
Cohort IDs to split into acute/therapy based on duration.
split_time
Day cutoffs for splitting (parallel to split_event_cohorts).
era_collapse_size
Maximum gap in days within which consecutive same-drug eras are
merged.
combination_window
Minimum overlap in days for two drugs to be considered a
combination treatment.
min_post_combination_duration
Minimum duration in days for eras flanking combinations.
filter_treatments
Strategy: "first" (keep first occurrence of each drug),
"changes" (remove consecutive duplicates), "all" (keep
everything).
max_path_length
Maximum number of treatment steps in a pathway.
overlap_method
How to handle overlaps too short to count as combinations: "truncate"
clips the earlier era, "keep" preserves original dates.
concat_targets
If True, treat multiple target entries per person as a
single continuous observation.
Returns¶
PathwayResult
Contains patient-level treatment history, attrition, cohort specifications, and metadata.
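The three filter_treatments strategies and the max_path_length cap can be sketched on a plain list of treatment names. This is an illustration of the documented behaviour, not the library's code; the helper name and the assumption that truncation happens after filtering are mine.

```python
def filter_sequence(sequence, mode="first", max_path_length=5):
    """Apply a filter_treatments-style strategy to one person's sequence."""
    if mode == "first":
        # Keep only the first occurrence of each drug.
        seen = set()
        kept = []
        for drug in sequence:
            if drug not in seen:
                seen.add(drug)
                kept.append(drug)
    elif mode == "changes":
        # Drop a drug when it repeats the immediately preceding one.
        kept = [
            drug
            for i, drug in enumerate(sequence)
            if i == 0 or drug != sequence[i - 1]
        ]
    elif mode == "all":
        kept = list(sequence)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumed order of operations: cap the path length after filtering.
    return kept[:max_path_length]


seq = ["A", "A", "B", "A", "C"]
first = filter_sequence(seq, "first")
changes = filter_sequence(seq, "changes")
everything = filter_sequence(seq, "all", max_path_length=2)
```

With the sequence above, "first" drops the later repeats of A, "changes" only drops the immediate repeat, and "all" keeps every entry up to the length cap.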
Summarise Functions¶
Aggregate pathway results into frequencies and duration statistics.
summarise_treatment_pathways¶
summarise_treatment_pathways(
    result: PathwayResult,
    *,
    age_window: int | list[int] = 10,
    min_cell_count: int = 5,
    strata: list[str | list[str]] | None = None,
    include_none_paths: bool = False,
) -> SummarisedResult
Aggregate treatment pathways into a SummarisedResult.
Converts the patient-level PathwayResult from compute_pathways()
into aggregate pathway frequency counts, optionally stratified by
age group, sex, and/or index year.
Parameters¶
result
Output from compute_pathways().
age_window
Age bin size (single int) or list of breakpoints for age groups.
min_cell_count
Minimum frequency for a pathway to be included; rarer pathways are suppressed to protect privacy.
strata
Additional stratification columns. Built-in strata for age, sex,
and index_year are always available via the PathwayResult
demographics.
include_none_paths
If True, include persons with no treatment events as a
"None" pathway.
Returns¶
SummarisedResult
With result_type="summarise_treatment_pathways".
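The aggregation idea can be sketched in a few lines: count identical pathways, drop those below min_cell_count, and bin ages by a fixed window. This is a simplified illustration; the helper names and the "40-49" label format are assumptions, and the list-of-breakpoints form of age_window is not covered.

```python
from collections import Counter


def pathway_frequencies(paths, min_cell_count=5):
    """Count pathways and keep those seen at least min_cell_count times."""
    counts = Counter(paths)
    return {
        path: n for path, n in counts.most_common() if n >= min_cell_count
    }


def age_group(age, age_window=10):
    """Bin an age into a fixed-width group (label format is assumed)."""
    lower = (age // age_window) * age_window
    return f"{lower}-{lower + age_window - 1}"


# Hypothetical per-person pathways.
paths = ["A-B"] * 6 + ["A"] * 5 + ["B-C"] * 2
freqs = pathway_frequencies(paths, min_cell_count=5)
group = age_group(47)
```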
summarise_event_duration¶
Summarise duration statistics of treatment events.
Computes min, Q1, median, Q3, max, mean, and SD of event durations, broken down by:
- Overall: all events combined
- Per treatment line (1st-line, 2nd-line, etc.)
- Per individual treatment (drug name)
- Per individual treatment per line
Parameters¶
result
Output from compute_pathways().
min_cell_count
Minimum number of events required for a statistic to be included.
Returns¶
SummarisedResult
With result_type="summarise_event_duration".
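The statistics listed above can be reproduced on a small sample with the standard library. This is only a sketch: the quantile method ("inclusive"), the use of the sample standard deviation, and the (line, drug, duration) tuple layout are assumptions and may differ from what the library computes.

```python
import statistics
from collections import defaultdict


def duration_summary(durations):
    """Return min, Q1, median, Q3, max, mean, and SD for a list of days."""
    q1, median, q3 = statistics.quantiles(durations, n=4, method="inclusive")
    return {
        "min": min(durations),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(durations),
        "mean": statistics.mean(durations),
        "sd": statistics.stdev(durations),  # sample SD, by assumption
    }


# Hypothetical events: (treatment line, drug, duration in days).
events = [
    (1, "drug_a", 10),
    (1, "drug_b", 20),
    (1, "drug_a", 30),
    (2, "drug_b", 40),
    (2, "drug_b", 50),
]

# Overall: all events combined.
overall = duration_summary([days for _, _, days in events])

# Per treatment line.
by_line = defaultdict(list)
for line, _, days in events:
    by_line[line].append(days)
per_line = {line: duration_summary(d) for line, d in by_line.items()}
```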
Table Functions¶
Format summarised results as publication-ready tables using
omopy.vis.vis_omop_table().
table_treatment_pathways¶
table_treatment_pathways(
    result: SummarisedResult,
    *,
    type: Literal["gt", "polars"] | None = None,
    header: list[str] | None = None,
    group_column: list[str] | None = None,
    hide: list[str] | None = None,
    style: Any | None = None,
) -> Any
Format treatment pathway results as a display-ready table.
Parameters¶
result
A SummarisedResult with result_type="summarise_treatment_pathways".
type
Output format: "gt" for great_tables.GT, "polars" for
a Polars DataFrame. None falls back to "polars".
header
Columns to use as multi-level headers.
group_column
Columns to use for row grouping.
hide
Columns to hide from the output.
style
Optional TableStyle for customisation.
Returns¶
great_tables.GT | polars.DataFrame
Formatted table.
table_event_duration¶
table_event_duration(
    result: SummarisedResult,
    *,
    type: Literal["gt", "polars"] | None = None,
    header: list[str] | None = None,
    group_column: list[str] | None = None,
    hide: list[str] | None = None,
    style: Any | None = None,
) -> Any
Format event duration results as a display-ready table.
Parameters¶
result
A SummarisedResult with result_type="summarise_event_duration".
type
Output format: "gt" for great_tables.GT, "polars" for
a Polars DataFrame.
header
Columns to use as multi-level headers.
group_column
Columns to use for row grouping.
hide
Columns to hide from the output.
style
Optional TableStyle for customisation.
Returns¶
great_tables.GT | polars.DataFrame
Formatted table.
Plot Functions¶
Sankey diagrams, sunburst charts, and event duration box plots.
plot_sankey¶
plot_sankey(
    result: SummarisedResult,
    *,
    group_combinations: bool = False,
    colors: dict[str, str] | list[str] | None = None,
    max_paths: int = 20,
    title: str = "Treatment Pathways",
) -> Any
Create a Sankey diagram of treatment pathways.
Each treatment line is represented as a column of nodes. Links flow from one treatment step to the next, with width proportional to patient count.
Parameters¶
result
A SummarisedResult with
result_type="summarise_treatment_pathways".
group_combinations
If True, replace combination treatments (e.g. "A+B")
with a generic "Combination" label.
colors
Optional color mapping. Either a dict mapping treatment names
to hex colors, or a list of hex colors to cycle through.
max_paths
Maximum number of pathways to display (top N by frequency).
title
Chart title.
Returns¶
plotly.graph_objects.Figure
Sankey diagram figure.
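The node-and-link structure behind a Sankey diagram can be sketched without plotly. The helper below turns pathway counts into label, source, target, and value lists, the shape that plotly's Sankey trace takes for its node and link arguments. The function name, the "-" step separator, and the "+" combination marker are assumptions for this sketch, and drug names containing those characters would need different handling.

```python
from collections import Counter


def sankey_links(path_counts, group_combinations=False, max_paths=20, sep="-"):
    """Build Sankey node labels and links from {pathway: count}."""
    # Keep only the top-N pathways by frequency.
    top = sorted(path_counts.items(), key=lambda kv: kv[1], reverse=True)
    top = top[:max_paths]

    labels = []        # one label per (treatment line, treatment) node
    index = {}         # (step, treatment) -> node index
    flows = Counter()  # (source index, target index) -> patient count

    for path, count in top:
        steps = path.split(sep)
        if group_combinations:
            steps = ["Combination" if "+" in s else s for s in steps]
        nodes = []
        for step, name in enumerate(steps, start=1):
            key = (step, name)
            if key not in index:
                index[key] = len(labels)
                labels.append(f"{step}. {name}")
            nodes.append(index[key])
        # Link each treatment step to the next one in the pathway.
        for src, dst in zip(nodes, nodes[1:]):
            flows[(src, dst)] += count

    return {
        "label": labels,
        "source": [src for src, _ in flows],
        "target": [dst for _, dst in flows],
        "value": list(flows.values()),
    }


# Hypothetical pathway counts.
counts = {"A-B": 10, "A-C": 5, "A+B-C": 3, "B": 2}
links = sankey_links(counts)
grouped = sankey_links(counts, group_combinations=True)
```

Single-step pathways such as "B" produce a node but no link, since there is no next step to flow into.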
plot_sunburst¶
plot_sunburst(
    result: SummarisedResult,
    *,
    group_combinations: bool = False,
    colors: dict[str, str] | list[str] | None = None,
    max_paths: int = 30,
    title: str = "Treatment Pathways",
    unit: str = "percent",
) -> Any
Create a sunburst chart of treatment pathways.
Inner ring represents first-line treatment; outer rings represent subsequent treatment lines.
Parameters¶
result
A SummarisedResult with
result_type="summarise_treatment_pathways".
group_combinations
If True, replace combination treatments with "Combination".
colors
Optional color mapping for treatments.
max_paths
Maximum number of pathways to display.
title
Chart title.
unit
"percent" or "count" for hover labels.
Returns¶
plotly.graph_objects.Figure
Sunburst chart figure.
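The ring structure described above corresponds to a tree in which each pathway prefix is a node. A minimal sketch, assuming "-" separates treatment steps and using a hypothetical helper name, builds the ids, labels, parents, and values lists that a sunburst chart is typically drawn from. Each node's value is the total count of all pathways passing through it.

```python
def sunburst_nodes(path_counts, sep="-"):
    """Build sunburst ids/labels/parents/values from {pathway: count}."""
    values = {}
    parents = {}
    for path, count in path_counts.items():
        steps = path.split(sep)
        for depth in range(1, len(steps) + 1):
            node_id = sep.join(steps[:depth])
            # A node accumulates every pathway that starts with its prefix.
            values[node_id] = values.get(node_id, 0) + count
            # First-line treatments hang off the root (empty parent id).
            parents[node_id] = sep.join(steps[: depth - 1])
    ids = list(values)
    return {
        "ids": ids,
        "labels": [node_id.split(sep)[-1] for node_id in ids],
        "parents": [parents[node_id] for node_id in ids],
        "values": [values[node_id] for node_id in ids],
    }


# Hypothetical pathway counts.
counts = {"A": 4, "A-B": 10, "A-C": 5, "B": 2}
nodes = sunburst_nodes(counts)

# Share of all patients per node, e.g. for percent-style hover labels.
total = sum(counts.values())
percent = [round(100 * v / total, 1) for v in nodes["values"]]
```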
plot_event_duration¶
plot_event_duration(
    result: SummarisedResult,
    *,
    min_cell_count: int = 0,
    treatment_groups: str = "both",
    event_lines: list[int] | None = None,
    include_overall: bool = True,
    title: str = "Event Duration",
) -> Any
Create box plots of treatment event durations.
Parameters¶
result
A SummarisedResult with
result_type="summarise_event_duration".
min_cell_count
Exclude events whose count is below this threshold.
treatment_groups
"both" (mono + combination + individual), "group"
(mono/combination only), "individual" (per-drug only).
event_lines
Pathway positions to include (None = all).
include_overall
Include the "overall" line aggregation.
title
Chart title.
Returns¶
plotly.graph_objects.Figure
Box plot figure.
Mock Data¶
mock_treatment_pathways¶
mock_treatment_pathways(
    *,
    n_targets: int = 1,
    n_drugs: int = 4,
    n_pathways: int = 15,
    include_duration: bool = True,
    seed: int = 42,
) -> SummarisedResult
Generate a mock SummarisedResult for treatment pathways.
Creates synthetic data representative of
summarise_treatment_pathways() and optionally
summarise_event_duration() output, useful for testing
table/plot functions without requiring a database.
Parameters¶
n_targets
Number of target cohorts to simulate.
n_drugs
Number of distinct drug treatments to include.
n_pathways
Number of distinct pathways to generate per target.
include_duration
If True, also include summarise_event_duration rows.
seed
Random seed for reproducibility.
Returns¶
SummarisedResult
With result_type values "summarise_treatment_pathways"
and optionally "summarise_event_duration".