
omopy.treatment

Treatment pathway analysis — compute sequential treatment pathways from OMOP CDM cohort data, summarise frequencies and durations, and visualise results as Sankey diagrams, sunburst charts, and box plots.

This module is the Python equivalent of the R TreatmentPatterns package. Plot rendering uses plotly; table rendering delegates to omopy.vis.

Core Types

Pydantic models for defining cohort roles and storing pathway results.

CohortSpec

Bases: BaseModel

Specification for a cohort used in pathway computation.

Parameters

  • cohort_id: The cohort_definition_id in the cohort table.
  • cohort_name: Human-readable name for this cohort.
  • type: Role of the cohort: "target" (defines the observation window), "event" (treatment to track), or "exit" (appended after processing).
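As a minimal sketch of how the three roles might be declared together, the example below uses a standard-library dataclass named CohortSpecSketch as a stand-in for CohortSpec, so that it runs without omopy installed. The stand-in class, the cohort IDs, and the cohort names are all hypothetical; only the three field names and the three role values come from the description above.

```python
from dataclasses import dataclass
from typing import Literal


# Stand-in for CohortSpec (a Pydantic model in the library). A frozen
# dataclass is used here only so the sketch has no third-party dependencies.
@dataclass(frozen=True)
class CohortSpecSketch:
    cohort_id: int
    cohort_name: str
    type: Literal["target", "event", "exit"]

    def __post_init__(self) -> None:
        # Reject roles other than the three documented ones.
        if self.type not in ("target", "event", "exit"):
            raise ValueError(f"unknown cohort role: {self.type!r}")


# Hypothetical cohort IDs and names, chosen for illustration only.
cohorts = [
    CohortSpecSketch(1, "hypertension", "target"),
    CohortSpecSketch(2, "lisinopril", "event"),
    CohortSpecSketch(3, "amlodipine", "event"),
    CohortSpecSketch(4, "death", "exit"),
]

# Group cohort names by role to check that every role is covered.
roles: dict[str, list[str]] = {}
for spec in cohorts:
    roles.setdefault(spec.type, []).append(spec.cohort_name)
```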

PathwayResult

Bases: BaseModel

Result container returned by compute_pathways().

Contains patient-level treatment history, attrition tracking, and metadata. This is not a SummarisedResult; it is an intermediate representation that summarise_treatment_pathways() converts into the standardised format.

Attributes

  • treatment_history: Patient-level treatment history with columns person_id, index_year, event_cohort_id, event_cohort_name, event_start_date, event_end_date, duration_era, event_seq, age, sex, target_cohort_id, target_cohort_name.
  • attrition: Step-by-step record and subject counts through the pipeline.
  • cohorts: The cohort specifications used.
  • cdm_name: Name of the CDM database.
  • arguments: Dictionary of all arguments passed to compute_pathways().

Pathway Computation

Compute sequential treatment pathways from cohort data.

compute_pathways

compute_pathways(
    cohort: CohortTable,
    cdm: CdmReference,
    cohorts: list[CohortSpec],
    *,
    start_anchor: Literal[
        "start_date", "end_date"
    ] = "start_date",
    window_start: int = 0,
    end_anchor: Literal[
        "start_date", "end_date"
    ] = "end_date",
    window_end: int = 0,
    min_era_duration: int = 0,
    split_event_cohorts: list[int] | None = None,
    split_time: list[int] | None = None,
    era_collapse_size: int = 30,
    combination_window: int = 30,
    min_post_combination_duration: int = 30,
    filter_treatments: Literal[
        "first", "changes", "all"
    ] = "first",
    max_path_length: int = 5,
    overlap_method: Literal[
        "truncate", "keep"
    ] = "truncate",
    concat_targets: bool = True,
) -> PathwayResult

Compute treatment pathways from cohort data.

Takes a CohortTable with target, event, and optionally exit cohorts, and computes sequential treatment pathways through a multi-step pipeline: data ingestion, treatment history construction, optional event splitting, era collapse, combination detection, treatment filtering, and pathway sequencing.
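To make the era collapse step of this pipeline concrete, here is a small standalone sketch. It is not the library's implementation: the function name collapse_eras, the tuple layout, and the choice to compare each era only with the previous era of the same drug (ignoring eras of other drugs in between) are assumptions made for illustration.

```python
from datetime import date


def collapse_eras(eras, era_collapse_size=30):
    """Merge same-drug eras separated by at most `era_collapse_size` days.

    `eras` is a list of (drug, start, end) tuples for one person.
    """
    merged = []
    # Sorting by drug and then start date places same-drug eras side by side.
    for drug, start, end in sorted(eras):
        if merged:
            last_drug, last_start, last_end = merged[-1]
            gap = (start - last_end).days
            if drug == last_drug and gap <= era_collapse_size:
                # Extend the previous era instead of starting a new one.
                merged[-1] = (last_drug, last_start, max(last_end, end))
                continue
        merged.append((drug, start, end))
    # Return the eras in chronological order.
    return sorted(merged, key=lambda era: era[1])


eras = [
    ("A", date(2020, 1, 1), date(2020, 1, 31)),
    ("A", date(2020, 2, 15), date(2020, 3, 15)),  # 15-day gap: merged
    ("A", date(2020, 6, 1), date(2020, 6, 30)),   # 78-day gap: kept separate
    ("B", date(2020, 4, 1), date(2020, 4, 20)),
]
collapsed = collapse_eras(eras)
```

With the default of 30 days, the first two eras of drug A become one era from 2020-01-01 to 2020-03-15, while the June era stays separate.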

Parameters

  • cohort: A CohortTable containing all target, event, and exit cohorts.
  • cdm: The CdmReference for demographic data and CDM metadata.
  • cohorts: List of CohortSpec defining the role of each cohort.
  • start_anchor: Anchor for the observation window start: "start_date" or "end_date" of the target cohort.
  • window_start: Day offset from start_anchor for the observation window start.
  • end_anchor: Anchor for the observation window end: "start_date" or "end_date" of the target cohort.
  • window_end: Day offset from end_anchor for the observation window end.
  • min_era_duration: Minimum duration in days for an event era to be included.
  • split_event_cohorts: Cohort IDs to split into acute/therapy based on duration.
  • split_time: Day cutoffs for splitting (parallel to split_event_cohorts).
  • era_collapse_size: Maximum gap in days within which consecutive same-drug eras are merged.
  • combination_window: Minimum overlap in days for two drugs to be considered a combination treatment.
  • min_post_combination_duration: Minimum duration in days for eras flanking combinations.
  • filter_treatments: Strategy: "first" (keep the first occurrence of each drug), "changes" (remove consecutive duplicates), or "all" (keep everything).
  • max_path_length: Maximum number of treatment steps in a pathway.
  • overlap_method: How to handle short overlaps that are not combinations: "truncate" clips the earlier era, "keep" preserves the original dates.
  • concat_targets: If True, treat multiple target entries per person as a single continuous observation.
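The three filter_treatments strategies and the max_path_length cap can be illustrated on a plain list of drug names. This is a sketch of the behaviour as described above, not the library's code; the helper name filter_sequence is hypothetical, and applying the length cap after filtering is an assumption.

```python
def filter_sequence(drugs, filter_treatments="first", max_path_length=5):
    """Apply a filtering strategy to one person's ordered drug sequence."""
    if filter_treatments == "first":
        # Keep only the first occurrence of each drug.
        seen = set()
        kept = []
        for drug in drugs:
            if drug not in seen:
                seen.add(drug)
                kept.append(drug)
    elif filter_treatments == "changes":
        # Drop a drug when it repeats the immediately preceding step.
        kept = [d for i, d in enumerate(drugs) if i == 0 or d != drugs[i - 1]]
    elif filter_treatments == "all":
        kept = list(drugs)
    else:
        raise ValueError(f"unknown strategy: {filter_treatments!r}")
    # Assumed here: the path is truncated after filtering.
    return kept[:max_path_length]


sequence = ["A", "A", "B", "A", "C", "C"]
first = filter_sequence(sequence, "first")      # ['A', 'B', 'C']
changes = filter_sequence(sequence, "changes")  # ['A', 'B', 'A', 'C']
everything = filter_sequence(sequence, "all")   # ['A', 'A', 'B', 'A', 'C']
```

Note how "changes" keeps the return to drug A as a separate step, whereas "first" discards it.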

Returns

PathwayResult: Contains patient-level treatment history, attrition, cohort specifications, and metadata.

Summarise Functions

Aggregate pathway results into frequencies and duration statistics.

summarise_treatment_pathways

summarise_treatment_pathways(
    result: PathwayResult,
    *,
    age_window: int | list[int] = 10,
    min_cell_count: int = 5,
    strata: list[str | list[str]] | None = None,
    include_none_paths: bool = False,
) -> SummarisedResult

Aggregate treatment pathways into a SummarisedResult.

Converts the patient-level PathwayResult from compute_pathways() into aggregate pathway frequency counts, optionally stratified by age group, sex, and/or index year.
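The aggregation idea can be sketched with a Counter over per-person pathways. This is an illustration only: the function name count_pathways, the "-" separator between steps, and the inclusive threshold (counts equal to min_cell_count are kept) are assumptions, and the real function returns a SummarisedResult rather than a dict.

```python
from collections import Counter


def count_pathways(paths, min_cell_count=5, include_none_paths=False):
    """Count identical pathways and drop those below `min_cell_count`.

    `paths` holds one ordered list of treatment names per person;
    an empty list means the person had no treatment events.
    """
    labels = []
    for steps in paths:
        if steps:
            labels.append("-".join(steps))
        elif include_none_paths:
            labels.append("None")
    counts = Counter(labels)
    return {path: n for path, n in counts.items() if n >= min_cell_count}


# Hypothetical cohort: 6 persons on A then B, 5 on A only, 2 on C, 7 untreated.
paths = [["A", "B"]] * 6 + [["A"]] * 5 + [["C"]] * 2 + [[]] * 7
frequencies = count_pathways(paths)
with_none = count_pathways(paths, include_none_paths=True)
```

Here the pathway "C" is suppressed because only two persons follow it, and the untreated persons appear only when include_none_paths is set.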

Parameters

  • result: Output from compute_pathways().
  • age_window: Age bin size (single int) or list of breakpoints for age groups.
  • min_cell_count: Minimum frequency for a pathway to be included (privacy).
  • strata: Additional stratification columns. Built-in strata for age, sex, and index_year are always available via the PathwayResult demographics.
  • include_none_paths: If True, include persons with no treatment events as a "None" pathway.
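The two accepted forms of age_window can be sketched as follows. The helper name age_group and the label formats ("40-49", "<18", "65+") are assumptions for illustration; the labels produced by the library may differ.

```python
from bisect import bisect_right


def age_group(age, age_window=10):
    """Return an age-group label for a fixed bin width or explicit breakpoints."""
    if isinstance(age_window, int):
        # Fixed-width bins: 0-9, 10-19, ... for a width of 10.
        lower = (age // age_window) * age_window
        return f"{lower}-{lower + age_window - 1}"
    # Explicit breakpoints: each breakpoint opens a new group.
    breaks = sorted(age_window)
    position = bisect_right(breaks, age)
    if position == 0:
        return f"<{breaks[0]}"
    if position == len(breaks):
        return f"{breaks[-1]}+"
    return f"{breaks[position - 1]}-{breaks[position] - 1}"


fixed = age_group(47)                 # bin width of 10
child = age_group(10, [18, 65])       # below the first breakpoint
adult = age_group(18, [18, 65])       # a breakpoint belongs to the upper group
senior = age_group(70, [18, 65])      # at or above the last breakpoint
```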

Returns

SummarisedResult: With result_type="summarise_treatment_pathways".

summarise_event_duration

summarise_event_duration(
    result: PathwayResult, *, min_cell_count: int = 0
) -> SummarisedResult

Summarise duration statistics of treatment events.

Computes min, Q1, median, Q3, max, mean, and SD of event durations, broken down by:

  • Overall: all events combined
  • Per treatment line (1st-line, 2nd-line, etc.)
  • Per individual treatment (drug name)
  • Per individual treatment per line
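The statistics and the four breakdown levels listed above can be reproduced on toy data with the standard library. This sketch is not the library's code: the event tuples, the group keys, and the use of the "inclusive" quantile method are assumptions, and the quantile definition used by omopy may differ.

```python
import statistics
from collections import defaultdict


def duration_summary(durations):
    """Return min, quartiles, max, mean, and sample SD of a list of durations."""
    q1, median, q3 = statistics.quantiles(durations, n=4, method="inclusive")
    return {
        "min": min(durations),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(durations),
        "mean": statistics.mean(durations),
        "sd": statistics.stdev(durations),
    }


# Hypothetical events: (treatment line, drug name, duration in days).
events = [(1, "A", 10), (1, "A", 20), (1, "B", 30), (2, "B", 40), (2, "B", 50)]

groups = defaultdict(list)
for line, drug, days in events:
    groups[("overall", "overall")].append(days)     # all events combined
    groups[(f"line {line}", "overall")].append(days)  # per treatment line
    groups[("overall", drug)].append(days)          # per individual treatment
    groups[(f"line {line}", drug)].append(days)     # per treatment per line

# Quartiles and SD need at least two observations, so smaller groups are skipped.
summary = {key: duration_summary(v) for key, v in groups.items() if len(v) >= 2}
overall = summary[("overall", "overall")]
```

In this toy data the group for drug B in line 1 has a single event, so it is left out of the summary.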

Parameters

  • result: Output from compute_pathways().
  • min_cell_count: Minimum number of events required for a statistic to be included.

Returns

SummarisedResult: With result_type="summarise_event_duration".

Table Functions

Format summarised results as publication-ready tables using omopy.vis.vis_omop_table().

table_treatment_pathways

table_treatment_pathways(
    result: SummarisedResult,
    *,
    type: Literal["gt", "polars"] | None = None,
    header: list[str] | None = None,
    group_column: list[str] | None = None,
    hide: list[str] | None = None,
    style: Any | None = None,
) -> Any

Format treatment pathway results as a display-ready table.

Parameters

  • result: A SummarisedResult with result_type="summarise_treatment_pathways".
  • type: Output format: "gt" for great_tables.GT, "polars" for a Polars DataFrame. Default is "polars".
  • header: Columns to use as multi-level headers.
  • group_column: Columns to use for row grouping.
  • hide: Columns to hide from the output.
  • style: Optional TableStyle for customisation.

Returns

great_tables.GT | polars.DataFrame: Formatted table.

table_event_duration

table_event_duration(
    result: SummarisedResult,
    *,
    type: Literal["gt", "polars"] | None = None,
    header: list[str] | None = None,
    group_column: list[str] | None = None,
    hide: list[str] | None = None,
    style: Any | None = None,
) -> Any

Format event duration results as a display-ready table.

Parameters

  • result: A SummarisedResult with result_type="summarise_event_duration".
  • type: Output format: "gt" for great_tables.GT, "polars" for a Polars DataFrame.
  • header: Columns to use as multi-level headers.
  • group_column: Columns to use for row grouping.
  • hide: Columns to hide from the output.
  • style: Optional TableStyle for customisation.

Returns

great_tables.GT | polars.DataFrame: Formatted table.

Plot Functions

Sankey diagrams, sunburst charts, and event duration box plots.

plot_sankey

plot_sankey(
    result: SummarisedResult,
    *,
    group_combinations: bool = False,
    colors: dict[str, str] | list[str] | None = None,
    max_paths: int = 20,
    title: str = "Treatment Pathways",
) -> Any

Create a Sankey diagram of treatment pathways.

Each treatment line is represented as a column of nodes. Links flow from one treatment step to the next, with width proportional to patient count.
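The node-and-link layout described above can be sketched in plain Python. The helper name sankey_links, the "-" step separator, and the node label format are assumptions; the sketch only shows how pathway counts could be turned into the label, source, target, and value lists that a Sankey trace expects.

```python
from collections import defaultdict


def sankey_links(pathway_counts, sep="-"):
    """Convert pathway counts into Sankey node labels and weighted links."""
    labels = []   # node labels; a node's position in this list is its id
    index = {}    # (step number, treatment name) -> node id
    flows = defaultdict(int)
    for path, n in pathway_counts.items():
        node_ids = []
        for step, name in enumerate(path.split(sep), start=1):
            key = (step, name)
            if key not in index:
                # One node per treatment per treatment line (column).
                index[key] = len(labels)
                labels.append(f"{step}. {name}")
            node_ids.append(index[key])
        # Link each step to the next; width accumulates the patient count.
        for src, dst in zip(node_ids, node_ids[1:]):
            flows[(src, dst)] += n
    source = [src for src, _ in flows]
    target = [dst for _, dst in flows]
    value = list(flows.values())
    return labels, source, target, value


# Hypothetical pathway counts.
counts = {"A-B": 10, "A-C": 5, "B-C": 3, "A-B-C": 2}
labels, source, target, value = sankey_links(counts)
```

These lists match the shape plotly expects, for example plotly.graph_objects.Sankey(node=dict(label=labels), link=dict(source=source, target=target, value=value)). Single-step pathways produce no link in this sketch.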

Parameters

  • result: A SummarisedResult with result_type="summarise_treatment_pathways".
  • group_combinations: If True, replace combination treatments (e.g. "A+B") with a generic "Combination" label.
  • colors: Optional color mapping. Either a dict mapping treatment names to hex colors, or a list of hex colors to cycle through.
  • max_paths: Maximum number of pathways to display (top N by frequency).
  • title: Chart title.
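The effect of group_combinations and max_paths can be sketched on a dict of pathway counts. Both helper names and the "-" step separator are hypothetical; only the "+" combination marker and the "Combination" label come from the parameter description above.

```python
from collections import Counter


def group_combination_steps(pathway_counts, step_sep="-", combo_sep="+"):
    """Relabel every combination step as "Combination" and re-aggregate."""
    grouped = Counter()
    for path, n in pathway_counts.items():
        steps = [
            "Combination" if combo_sep in step else step
            for step in path.split(step_sep)
        ]
        # Pathways that become identical after relabelling are summed.
        grouped[step_sep.join(steps)] += n
    return dict(grouped)


def top_paths(pathway_counts, max_paths=20):
    """Keep the `max_paths` most frequent pathways."""
    ranked = sorted(pathway_counts.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:max_paths])


# Hypothetical pathway counts.
counts = {"A-A+B": 4, "A-B+C": 3, "B": 2, "C": 9}
grouped = group_combination_steps(counts)
top_two = top_paths(grouped, max_paths=2)
```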

Returns

plotly.graph_objects.Figure: Sankey diagram figure.

plot_sunburst

plot_sunburst(
    result: SummarisedResult,
    *,
    group_combinations: bool = False,
    colors: dict[str, str] | list[str] | None = None,
    max_paths: int = 30,
    title: str = "Treatment Pathways",
    unit: str = "percent",
) -> Any

Create a sunburst chart of treatment pathways.

Inner ring represents first-line treatment; outer rings represent subsequent treatment lines.
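The ring structure can be sketched as a hierarchy in which every pathway prefix is a node whose parent is the prefix one step shorter. The helper name sunburst_nodes and the "-" step separator are assumptions; the sketch only prepares the id, label, parent, and value lists that a sunburst trace expects.

```python
from collections import defaultdict


def sunburst_nodes(pathway_counts, sep="-"):
    """Build ids, labels, parents, and values for a sunburst hierarchy."""
    totals = defaultdict(int)
    for path, n in pathway_counts.items():
        steps = path.split(sep)
        # Every prefix of the pathway receives this pathway's patient count,
        # so an inner segment covers all pathways that start with it.
        for depth in range(1, len(steps) + 1):
            totals[sep.join(steps[:depth])] += n
    ids = list(totals)
    labels = [node.split(sep)[-1] for node in ids]
    # First-line treatments have the empty string (the root) as parent.
    parents = [sep.join(node.split(sep)[:-1]) for node in ids]
    values = [totals[node] for node in ids]
    return ids, labels, parents, values


# Hypothetical pathway counts.
counts = {"A": 5, "A-B": 3, "B-A": 2}
ids, labels, parents, values = sunburst_nodes(counts)

# Share of all persons per segment, as a "percent" unit might display it.
total = sum(counts.values())
percent = [100 * v / total for v in values]
```

These lists fit plotly.graph_objects.Sunburst(ids=ids, labels=labels, parents=parents, values=values, branchvalues="total"), where each parent's value includes its children.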

Parameters

  • result: A SummarisedResult with result_type="summarise_treatment_pathways".
  • group_combinations: If True, replace combination treatments with "Combination".
  • colors: Optional color mapping for treatments.
  • max_paths: Maximum number of pathways to display.
  • title: Chart title.
  • unit: "percent" or "count" for hover labels.

Returns

plotly.graph_objects.Figure: Sunburst chart figure.

plot_event_duration

plot_event_duration(
    result: SummarisedResult,
    *,
    min_cell_count: int = 0,
    treatment_groups: str = "both",
    event_lines: list[int] | None = None,
    include_overall: bool = True,
    title: str = "Event Duration",
) -> Any

Create box plots of treatment event durations.

Parameters

  • result: A SummarisedResult with result_type="summarise_event_duration".
  • min_cell_count: Filter out events with a count below this threshold.
  • treatment_groups: "both" (mono + combination + individual), "group" (mono/combination only), or "individual" (per-drug only).
  • event_lines: Pathway positions to include (None = all).
  • include_overall: Include the "overall" line aggregation.
  • title: Chart title.

Returns

plotly.graph_objects.Figure: Box plot figure.

Mock Data

mock_treatment_pathways

mock_treatment_pathways(
    *,
    n_targets: int = 1,
    n_drugs: int = 4,
    n_pathways: int = 15,
    include_duration: bool = True,
    seed: int = 42,
) -> SummarisedResult

Generate a mock SummarisedResult for treatment pathways.

Creates synthetic data representative of summarise_treatment_pathways() and optionally summarise_event_duration() output, useful for testing table/plot functions without requiring a database.

Parameters

  • n_targets: Number of target cohorts to simulate.
  • n_drugs: Number of distinct drug treatments to include.
  • n_pathways: Number of distinct pathways to generate per target.
  • include_duration: If True, also include summarise_event_duration rows.
  • seed: Random seed for reproducibility.
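The role of the seed can be illustrated with a standalone generator of synthetic pathway counts. This is not how mock_treatment_pathways is implemented: the function name mock_pathway_counts, the drug names, the path length limit of three steps, and the count range are all hypothetical, and the result is a plain dict rather than a SummarisedResult.

```python
import math
import random


def mock_pathway_counts(n_drugs=4, n_pathways=15, seed=42):
    """Generate reproducible synthetic pathway counts for one target cohort."""
    max_len = min(3, n_drugs)
    # Number of distinct ordered pathways without repeated drugs.
    possible = sum(math.perm(n_drugs, k) for k in range(1, max_len + 1))
    if n_pathways > possible:
        raise ValueError(f"only {possible} distinct pathways are possible")

    rng = random.Random(seed)  # a private generator keeps the output repeatable
    drugs = [f"drug_{i + 1}" for i in range(n_drugs)]
    paths = set()
    while len(paths) < n_pathways:
        length = rng.randint(1, max_len)
        paths.add("-".join(rng.sample(drugs, length)))
    # Sorting makes the assignment of counts independent of set ordering.
    return {path: rng.randint(5, 200) for path in sorted(paths)}


first_run = mock_pathway_counts()
second_run = mock_pathway_counts()
```

Two calls with the same seed return identical dictionaries, which is what makes seeded mock data suitable for snapshot-style tests of table and plot functions.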

Returns

SummarisedResult: With result_type values "summarise_treatment_pathways" and optionally "summarise_event_duration".