omopy.survival¶

Cohort survival analysis: Kaplan-Meier estimation, competing risk cumulative incidence, risk tables, and survival plots.

This module is the Python equivalent of the R CohortSurvival package. Kaplan-Meier estimation uses lifelines; competing risk analysis uses a custom Aalen-Johansen implementation. Table rendering delegates to omopy.vis; plot rendering uses plotly.

Core Estimation¶

Compute survival curves, summary statistics, risk tables, and attrition from target and outcome cohorts.

estimate_single_event_survival ¶

estimate_single_event_survival(
    cdm: CdmReference,
    target_cohort_table: str,
    outcome_cohort_table: str,
    *,
    target_cohort_id: int | list[int] | None = None,
    outcome_cohort_id: int | list[int] | None = None,
    outcome_date_variable: str = "cohort_start_date",
    outcome_washout: int | float = float("inf"),
    censor_on_cohort_exit: bool = False,
    censor_on_date: str | None = None,
    follow_up_days: int | float = float("inf"),
    strata: list[str] | None = None,
    event_gap: int = 30,
    estimate_gap: int = 1,
    restricted_mean_follow_up: int | None = None,
    minimum_survival_days: int = 1,
) -> SummarisedResult

Estimate Kaplan-Meier survival for a single event.

Parameters¶

cdm: CDM reference.
target_cohort_table: Name of the target (exposure/index) cohort table.
outcome_cohort_table: Name of the outcome cohort table.
target_cohort_id: Which target cohort IDs to analyse. None = all.
outcome_cohort_id: Which outcome cohort IDs to analyse. None = all.
outcome_date_variable: Date column in outcome cohort for event timing.
outcome_washout: Days before index to exclude prior events. inf = entire history.
censor_on_cohort_exit: Censor at target cohort end date rather than observation end.
censor_on_date: Column name with a censoring date.
follow_up_days: Maximum follow-up in days. inf = no cap.
strata: Column names for stratification (must be in target cohort).
event_gap: Interval width in days for the risk table.
estimate_gap: Step size in days for the survival curve estimates.
restricted_mean_follow_up: Time horizon for restricted mean survival. None = max observed.
minimum_survival_days: Exclude persons with follow-up < this many days.

Returns¶

SummarisedResult: Result with survival_estimates, survival_events, survival_summary, and survival_attrition result types.
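
To make the estimator concrete, the following is a minimal pure-Python sketch of the Kaplan-Meier product-limit formula, S(t) = prod over event times t_i <= t of (1 - d_i / n_i). It is an illustration only, not the omopy implementation (which, per the module description, uses lifelines); the function name kaplan_meier and the toy data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- follow-up in days for each person
    events -- 1 if the outcome occurred at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # everyone leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]  # drop events and censored persons after time t
    return curve


# Hypothetical toy data: six persons, two of them censored (event = 0).
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"day {t:>2}: S(t) = {s:.4f}")
```

Censored persons contribute to the risk set up to their censoring time but never trigger a step in the curve, which is why the estimate only changes on days 5, 8, 12, and 20 here.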

estimate_competing_risk_survival ¶

estimate_competing_risk_survival(
    cdm: CdmReference,
    target_cohort_table: str,
    outcome_cohort_table: str,
    competing_outcome_cohort_table: str,
    *,
    target_cohort_id: int | list[int] | None = None,
    outcome_cohort_id: int | list[int] | None = None,
    outcome_date_variable: str = "cohort_start_date",
    outcome_washout: int | float = float("inf"),
    competing_outcome_cohort_id: int | list[int] | None = None,
    competing_outcome_date_variable: str = "cohort_start_date",
    competing_outcome_washout: int | float = float("inf"),
    censor_on_cohort_exit: bool = False,
    censor_on_date: str | None = None,
    follow_up_days: int | float = float("inf"),
    strata: list[str] | None = None,
    event_gap: int = 30,
    estimate_gap: int = 1,
    restricted_mean_follow_up: int | None = None,
    minimum_survival_days: int = 1,
) -> SummarisedResult

Estimate cumulative incidence with competing risks.

Uses the Aalen-Johansen estimator to compute cumulative incidence functions (CIF) in the presence of a competing event.

Parameters¶

cdm: CDM reference.
target_cohort_table: Name of the target cohort table.
outcome_cohort_table: Name of the primary outcome cohort table.
competing_outcome_cohort_table: Name of the competing outcome cohort table.
target_cohort_id, outcome_cohort_id: Cohort ID filters.
outcome_date_variable: Date column for primary outcome timing.
outcome_washout: Washout period for primary outcome.
competing_outcome_cohort_id: Which competing outcome cohort IDs to use.
competing_outcome_date_variable: Date column for competing outcome timing.
competing_outcome_washout: Washout period for competing outcome.
censor_on_cohort_exit: Censor at target cohort end.
censor_on_date: Column with censoring date.
follow_up_days: Maximum follow-up in days.
strata: Stratification columns.
event_gap, estimate_gap: Risk table interval and estimate step.
restricted_mean_follow_up: Time horizon for restricted mean.
minimum_survival_days: Minimum follow-up filter.

Returns¶

SummarisedResult: Result with competing risk cumulative incidence estimates.
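
As a rough illustration of the Aalen-Johansen estimator for two competing event types, the sketch below accumulates each cumulative incidence as the overall event-free survival just before t multiplied by the cause-specific hazard d_k / n at t. This is a textbook version written for this page, not the custom implementation inside omopy; the function name, the status coding (0/1/2), and the toy data are assumptions.

```python
def aalen_johansen(times, status):
    """Cumulative incidence for a primary and a competing event.

    status -- 0 = censored, 1 = primary outcome, 2 = competing outcome
    Returns [(t, cif_primary, cif_competing)] at each distinct event time.
    """
    at_risk = len(times)
    overall_surv = 1.0  # probability of being free of both events just before t
    cif = {1: 0.0, 2: 0.0}
    out = []
    for t in sorted(set(times)):
        here = [s for u, s in zip(times, status) if u == t]
        d = {k: here.count(k) for k in (1, 2)}
        if d[1] + d[2] > 0:
            for k in (1, 2):
                # Increment: P(event-free before t) * cause-specific hazard at t.
                cif[k] += overall_surv * d[k] / at_risk
            overall_surv *= 1.0 - (d[1] + d[2]) / at_risk
            out.append((t, cif[1], cif[2]))
        at_risk -= len(here)
    return out


# Hypothetical toy data: two primary events, two competing events, two censored.
times = [2, 4, 4, 6, 9, 10]
status = [1, 2, 0, 1, 0, 2]
cif = aalen_johansen(times, status)
for t, primary, competing in cif:
    print(f"day {t:>2}: CIF primary = {primary:.4f}, competing = {competing:.4f}")
```

Unlike 1 - S(t) from a Kaplan-Meier fit that treats the competing event as censoring, the two incidence curves here plus the overall event-free survival always sum to 1.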

Add Survival Columns¶

Enrich a cohort table with survival time and event status columns.

add_cohort_survival ¶

add_cohort_survival(
    x: CdmTable,
    cdm: CdmReference | None = None,
    *,
    outcome_cohort_table: str | CohortTable,
    outcome_cohort_id: int = 1,
    outcome_date_variable: str = "cohort_start_date",
    outcome_washout: int | float = float("inf"),
    censor_on_cohort_exit: bool = False,
    censor_on_date: str | None = None,
    follow_up_days: int | float = float("inf"),
    time_column: str = "time",
    status_column: str = "status",
) -> CdmTable

Add survival time and status columns to a cohort table.

Computes days-to-event/censoring for each person in the input cohort, relative to an outcome cohort.

Parameters¶

x: Input cohort table (must have subject_id, cohort_start_date, cohort_end_date).
cdm: CDM reference. If None, uses x.cdm.
outcome_cohort_table: Name of the outcome cohort table in the CDM, or a CohortTable.
outcome_cohort_id: Which cohort definition ID in the outcome cohort to use.
outcome_date_variable: Date column in the outcome cohort for the event date.
outcome_washout: Number of days before index date to check for prior outcome. float('inf') means check the entire prior history.
censor_on_cohort_exit: If True, censor at cohort_end_date instead of observation end.
censor_on_date: Column name in x containing a date to censor on.
follow_up_days: Maximum follow-up time in days. float('inf') = no cap.
time_column: Name of the output time column.
status_column: Name of the output status column.

Returns¶

CdmTable: Input table with time_column and status_column added.
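
The per-person logic behind the time and status columns can be sketched as below. This is one plausible reading of the censoring parameters described above, written with the standard library only; it is not taken from the omopy source. The function name is hypothetical, the washout check is omitted, and treating an event on the censoring day as an event is an assumption.

```python
from datetime import date
import math


def survival_time_and_status(
    index_date,
    cohort_end,
    observation_end,
    outcome_date=None,
    censor_on_cohort_exit=False,
    censor_date=None,
    follow_up_days=math.inf,
):
    """Return (time_in_days, status) for one person; 1 = event, 0 = censored."""
    # Start from observation end, then apply the optional earlier censoring points.
    censor_at = observation_end
    if censor_on_cohort_exit:
        censor_at = min(censor_at, cohort_end)
    if censor_date is not None:
        censor_at = min(censor_at, censor_date)
    censor_days = min((censor_at - index_date).days, follow_up_days)

    if outcome_date is not None:
        event_days = (outcome_date - index_date).days
        # Assumption: an outcome on or before the censoring day counts as an event.
        if 0 <= event_days <= censor_days:
            return event_days, 1
    return censor_days, 0


index = date(2020, 1, 1)
cohort_end = date(2020, 3, 1)
obs_end = date(2021, 1, 1)

with_event = survival_time_and_status(
    index, cohort_end, obs_end, outcome_date=date(2020, 2, 10)
)  # (40, 1)
exit_censored = survival_time_and_status(
    index, cohort_end, obs_end, outcome_date=date(2020, 6, 1), censor_on_cohort_exit=True
)  # (60, 0): the outcome falls after cohort exit
capped = survival_time_and_status(
    index, cohort_end, obs_end, outcome_date=date(2020, 2, 10), follow_up_days=30
)  # (30, 0): the outcome falls after the follow-up cap
```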

Result Conversion¶

Convert a long-format SummarisedResult into structured wide-format DataFrames for estimates, events, summary, and attrition.

as_survival_result ¶

as_survival_result(
    result: SummarisedResult,
) -> dict[str, pl.DataFrame]

Convert a SummarisedResult to a structured survival result.

Extracts and pivots the four types of survival data embedded in the SummarisedResult:

estimates: time-point survival/CIF values in wide format
events: risk table data per interval
summary: summary statistics (median, RMST, quantiles)
attrition: attrition tracking

Parameters¶

result A SummarisedResult from estimate_single_event_survival() or estimate_competing_risk_survival().

Returns¶

dict: Dictionary with keys "estimates", "events", "summary", "attrition", each containing a wide-format Polars DataFrame.
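
The long-to-wide step can be pictured with plain dictionaries. In the sketch below, the column names estimate_name and estimate_value, the string-typed values, and the estimate labels are assumptions chosen for illustration; the real function returns Polars DataFrames and its exact column layout is not documented here.

```python
from collections import defaultdict


def pivot_wide(rows, index_cols, name_col="estimate_name", value_col="estimate_value"):
    """Pivot long rows (one estimate per row) into one wide row per index key."""
    wide = defaultdict(dict)
    for row in rows:
        key = tuple(row[c] for c in index_cols)
        wide[key][row[name_col]] = float(row[value_col])
    # Merge the index columns back in front of the pivoted estimate columns.
    return [{**dict(zip(index_cols, key)), **values} for key, values in wide.items()]


# Hypothetical long-format rows: three estimates at each of two time points.
long_rows = [
    {"time": 0, "estimate_name": "estimate", "estimate_value": "1.0"},
    {"time": 0, "estimate_name": "lower", "estimate_value": "1.0"},
    {"time": 0, "estimate_name": "upper", "estimate_value": "1.0"},
    {"time": 30, "estimate_name": "estimate", "estimate_value": "0.9"},
    {"time": 30, "estimate_name": "lower", "estimate_value": "0.85"},
    {"time": 30, "estimate_name": "upper", "estimate_value": "0.95"},
]
wide_rows = pivot_wide(long_rows, index_cols=["time"])
for row in wide_rows:
    print(row)
```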

Table Functions¶

Format survival results as publication-ready tables using omopy.vis.vis_omop_table().

table_survival ¶

table_survival(
    result: SummarisedResult,
    *,
    times: list[int] | None = None,
    time_scale: str = "days",
    header: str | list[str] = "estimate",
    estimates: list[str] | None = None,
    type: str = "gt",
    **kwargs: Any,
) -> Any

Create a summary table of survival results.

Shows median survival, restricted mean survival, and optionally survival estimates at specific time points.

Parameters¶

result: SummarisedResult from a survival estimation function.
times: Optional list of time points (in days) to include estimates for.
time_scale: Scale for display: "days", "months", or "years".
header: Column(s) to use as header.
estimates: Which summary estimates to include. Default: median + RMST.
type: Output type ("gt" for great-tables).

Returns¶

great_tables.GT or pl.DataFrame: Rendered table.
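
For readers unfamiliar with the two default summary statistics, the sketch below shows how a median survival time and a restricted mean survival time (RMST) can be read off a step curve. The helper names and the curve values are hypothetical, and omopy may compute these differently (for example via lifelines).

```python
def median_survival(curve):
    """Smallest time at which S(t) <= 0.5, or None if the curve never gets there."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


def restricted_mean(curve, horizon):
    """Area under the step function S(t) from day 0 up to the horizon."""
    area, prev_t, prev_s = 0.0, 0, 1.0
    for t, s in curve:
        if t >= horizon:
            break
        area += prev_s * (t - prev_t)  # rectangle up to the next step
        prev_t, prev_s = t, s
    return area + prev_s * (horizon - prev_t)  # final partial rectangle


# Hypothetical step curve as (day, S(t)) pairs; S(t) = 1 before the first step.
curve = [(10, 0.8), (20, 0.6), (30, 0.4), (40, 0.2)]
median = median_survival(curve)             # 30
rmst = restricted_mean(curve, horizon=35)   # 10 + 8 + 6 + 2 = 26 days
print(median, round(rmst, 6))
```

The horizon plays the role that restricted_mean_follow_up plays in the estimation functions: the mean is only defined up to a chosen time point because the tail of the curve is usually unobserved.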

table_survival_events ¶

table_survival_events(
    result: SummarisedResult,
    *,
    event_gap: int | None = None,
    header: str | list[str] = "estimate",
    type: str = "gt",
    **kwargs: Any,
) -> Any

Create a risk table showing events per time interval.

Parameters¶

result: SummarisedResult from a survival estimation function.
event_gap: Override the event gap interval. None uses the original.
header: Column(s) to use as header.
type: Output type.

Returns¶

great_tables.GT or pl.DataFrame
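
A risk table of this kind can be sketched by binning follow-up times into intervals of width event_gap. The interval convention used below (half-open intervals [start, start + event_gap), number at risk counted at the start of each interval) is an assumption for illustration and may differ from what omopy reports.

```python
def risk_table(times, events, event_gap=30):
    """Return rows of (interval_start, n_at_risk, n_events, n_censored)."""
    last = max(times)
    rows = []
    start = 0
    while start <= last:
        end = start + event_gap
        # Persons still under follow-up when the interval opens.
        n_risk = sum(1 for t in times if t >= start)
        n_event = sum(1 for t, e in zip(times, events) if start <= t < end and e == 1)
        n_censor = sum(1 for t, e in zip(times, events) if start <= t < end and e == 0)
        rows.append((start, n_risk, n_event, n_censor))
        start = end
    return rows


# Hypothetical follow-up data: four events and two censored persons.
times = [10, 25, 40, 45, 70, 95]
events = [1, 0, 1, 1, 0, 1]
table = risk_table(times, events, event_gap=30)
for row in table:
    print(row)
```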

table_survival_attrition ¶

table_survival_attrition(
    result: SummarisedResult,
    *,
    type: str = "gt",
    **kwargs: Any,
) -> Any

Create an attrition table.

Parameters¶

result: SummarisedResult from a survival estimation function.
type: Output type.

Returns¶

great_tables.GT or pl.DataFrame

options_table_survival ¶

options_table_survival() -> dict[str, Any]

Return default options for table_survival().

Returns¶

dict: Dictionary of default option values.

Plot Functions¶

Kaplan-Meier and cumulative incidence curves with confidence interval ribbons, risk tables, and faceting support.

plot_survival ¶

plot_survival(
    result: SummarisedResult,
    *,
    ribbon: bool = True,
    facet: str | list[str] | None = None,
    colour: str | None = None,
    cumulative_failure: bool = False,
    risk_table: bool = False,
    risk_interval: int = 30,
    log_log: bool = False,
    time_scale: str = "days",
    style: dict[str, Any] | None = None,
) -> Any

Create a survival curve plot.

Generates a Kaplan-Meier survival curve or cumulative incidence/failure plot from a SummarisedResult.

Parameters¶

result: SummarisedResult from a survival estimation function.
ribbon: Show 95% confidence interval as a shaded ribbon.
facet: Column(s) to facet by.
colour: Column to colour by.
cumulative_failure: If True, plot 1 - S(t) (cumulative failure) instead of S(t).
risk_table: If True, add a risk table below the plot.
risk_interval: Interval for risk table display (in days).
log_log: If True, use log(-log(S(t))) y-axis scale.
time_scale: Time axis scale: "days", "months", or "years".
style: Additional plotly style overrides.

Returns¶

plotly.graph_objects.Figure: A plotly figure with the survival plot.
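
The axis transformations named in the parameters can be expressed in a few lines. The sketch below is illustrative only: the conversion factors (365.25 days per year, one twelfth of that per month), the precedence of log_log over cumulative_failure, and the dropping of points where log(-log(S)) is undefined are all assumptions, not documented omopy behaviour.

```python
import math

# Assumed conversion factors from days to the display unit.
DAYS_PER_UNIT = {"days": 1.0, "months": 365.25 / 12, "years": 365.25}


def transform_curve(curve, cumulative_failure=False, log_log=False, time_scale="days"):
    """Apply the y-axis transform and rescale the time axis of (day, S(t)) pairs."""
    out = []
    for t, s in curve:
        if log_log:
            if not 0.0 < s < 1.0:
                continue  # log(-log(S)) is undefined at S = 0 and S = 1
            y = math.log(-math.log(s))
        elif cumulative_failure:
            y = 1.0 - s
        else:
            y = s
        out.append((t / DAYS_PER_UNIT[time_scale], y))
    return out


# Hypothetical two-point curve.
curve = [(0, 1.0), (365.25, 0.5)]
failure_in_years = transform_curve(curve, cumulative_failure=True, time_scale="years")
log_log_points = transform_curve(curve, log_log=True)
print(failure_in_years)
print(log_log_points)
```

On the log-log scale, curves for groups that satisfy proportional hazards appear roughly parallel, which is the usual reason to enable this view.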

available_survival_grouping ¶

available_survival_grouping(
    result: SummarisedResult, *, varying: bool = False
) -> list[str]

List columns available for faceting or colouring survival plots.

Parameters¶

result: A SummarisedResult from a survival estimation function.
varying: If True, only return columns with more than one unique value.

Returns¶

list[str]: Column names available for grouping.
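
The effect of varying can be shown on plain rows. This sketch assumes a result can be viewed as a list of dictionaries and that the candidate grouping columns are known in advance; both the helper name and the column names are hypothetical.

```python
def grouping_columns(rows, candidates, varying=False):
    """Return candidate columns present in every row, optionally only varying ones."""
    cols = [c for c in candidates if all(c in r for r in rows)]
    if varying:
        # Keep only columns that take more than one distinct value.
        cols = [c for c in cols if len({r[c] for r in rows}) > 1]
    return cols


# Hypothetical result rows: only "sex" differs between them.
rows = [
    {"cdm_name": "mock", "target_cohort": "target", "sex": "Female"},
    {"cdm_name": "mock", "target_cohort": "target", "sex": "Male"},
]
candidates = ["cdm_name", "target_cohort", "sex"]
all_cols = grouping_columns(rows, candidates)
varying_cols = grouping_columns(rows, candidates, varying=True)
print(all_cols, varying_cols)
```

Restricting to varying columns is useful when choosing facet or colour, since a column with a single value produces only one panel or one colour.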

Mock Data¶

mock_survival ¶

mock_survival(
    n_persons: int = 200,
    *,
    seed: int = 42,
    target_name: str = "target",
    outcome_name: str = "outcome",
    competing_name: str = "competing",
    event_rate: float = 0.3,
    competing_rate: float = 0.15,
    max_follow_up: int = 3650,
    include_strata: bool = True,
) -> CdmReference

Create a mock CDM with cohort tables for survival analysis testing.

Generates synthetic person, observation_period, target cohort, outcome cohort, and competing risk cohort tables.

Parameters¶

n_persons: Number of persons to simulate.
seed: Random seed for reproducibility.
target_name: Name of the target cohort.
outcome_name: Name of the outcome cohort.
competing_name: Name of the competing risk cohort.
event_rate: Proportion of persons who experience the primary event.
competing_rate: Proportion of persons who experience the competing event.
max_follow_up: Maximum follow-up time in days.
include_strata: If True, add sex and age_group columns to the target cohort for stratified analysis.

Returns¶

CdmReference: CDM with person, observation_period, target cohort, outcome cohort, and competing cohort tables.
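
To give a feel for what seeded synthetic follow-up data looks like, here is a standard-library sketch that draws a status and a follow-up time per person. It does not reproduce mock_survival(): it returns a flat list rather than a CdmReference, and drawing the two event types as mutually exclusive with uniform follow-up times is an assumption made for simplicity.

```python
import random


def simulate_follow_up(
    n_persons=200, seed=42, event_rate=0.3, competing_rate=0.15, max_follow_up=3650
):
    """Return one dict per person with subject_id, time (days), and status.

    status -- 0 = censored, 1 = primary event, 2 = competing event
    """
    rng = random.Random(seed)  # local generator so the global state is untouched
    people = []
    for subject_id in range(1, n_persons + 1):
        u = rng.random()
        if u < event_rate:
            status = 1
        elif u < event_rate + competing_rate:
            status = 2
        else:
            status = 0
        time = rng.randint(1, max_follow_up)
        people.append({"subject_id": subject_id, "time": time, "status": status})
    return people


people = simulate_follow_up()
same_again = simulate_follow_up()
n_primary = sum(1 for p in people if p["status"] == 1)
print(len(people), n_primary)
```

Using a dedicated random.Random instance means that repeated calls with the same seed return identical data, which is the property that makes mock data suitable for tests.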