omopy.survival¶
Cohort survival analysis — Kaplan-Meier estimation, competing risk cumulative incidence, risk tables, and survival plots.
This module is the Python equivalent of the R CohortSurvival package.
Kaplan-Meier estimation uses lifelines; competing risk analysis uses a custom Aalen-Johansen implementation.
Table rendering delegates to omopy.vis; plot rendering uses plotly.
Core Estimation¶
Compute survival curves, summary statistics, risk tables, and attrition from target and outcome cohorts.
estimate_single_event_survival¶
estimate_single_event_survival(
cdm: CdmReference,
target_cohort_table: str,
outcome_cohort_table: str,
*,
target_cohort_id: int | list[int] | None = None,
outcome_cohort_id: int | list[int] | None = None,
outcome_date_variable: str = "cohort_start_date",
outcome_washout: int | float = float("inf"),
censor_on_cohort_exit: bool = False,
censor_on_date: str | None = None,
follow_up_days: int | float = float("inf"),
strata: list[str] | None = None,
event_gap: int = 30,
estimate_gap: int = 1,
restricted_mean_follow_up: int | None = None,
minimum_survival_days: int = 1,
) -> SummarisedResult
Estimate Kaplan-Meier survival for a single event.
Parameters¶
cdm
CDM reference.
target_cohort_table
Name of the target (exposure/index) cohort table.
outcome_cohort_table
Name of the outcome cohort table.
target_cohort_id
Which target cohort IDs to analyse. None = all.
outcome_cohort_id
Which outcome cohort IDs to analyse. None = all.
outcome_date_variable
Date column in outcome cohort for event timing.
outcome_washout
Days before index to exclude prior events. inf = entire history.
censor_on_cohort_exit
Censor at target cohort end date rather than observation end.
censor_on_date
Column name with a censoring date.
follow_up_days
Maximum follow-up in days. inf = no cap.
strata
Column names for stratification (must be in target cohort).
event_gap
Interval width in days for the risk table.
estimate_gap
Step size in days for the survival curve estimates.
restricted_mean_follow_up
Time horizon for restricted mean survival. None = max observed.
minimum_survival_days
Exclude persons with fewer than this many days of follow-up.
Returns¶
SummarisedResult
Result with survival_estimates, survival_events,
survival_summary, and survival_attrition result types.
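To make the quantities above concrete, here is a minimal pure-Python sketch of the Kaplan-Meier (product-limit) estimator and of a restricted mean survival time computed as the area under the curve up to a horizon. It is not the omopy implementation (which, per the module description, relies on lifelines); the helper names `kaplan_meier` and `restricted_mean` and the sample data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimator).

    times  -- follow-up in days for each person
    events -- 1 if the outcome was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # everyone leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def restricted_mean(curve, horizon):
    """Area under the step function S(t) from day 0 to `horizon`."""
    area, prev_t, prev_s = 0.0, 0, 1.0
    for t, s in curve:
        if t >= horizon:
            break
        area += (t - prev_t) * prev_s
        prev_t, prev_s = t, s
    return area + (horizon - prev_t) * prev_s


# Illustrative data: six persons, two of them censored (event flag 0).
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
rmst = restricted_mean(curve, horizon=20)
```

In this sketch the horizon plays the role that restricted_mean_follow_up plays in the signature above: the mean is taken over the first `horizon` days only.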
estimate_competing_risk_survival¶
estimate_competing_risk_survival(
cdm: CdmReference,
target_cohort_table: str,
outcome_cohort_table: str,
competing_outcome_cohort_table: str,
*,
target_cohort_id: int | list[int] | None = None,
outcome_cohort_id: int | list[int] | None = None,
outcome_date_variable: str = "cohort_start_date",
outcome_washout: int | float = float("inf"),
competing_outcome_cohort_id: int | list[int] | None = None,
competing_outcome_date_variable: str = "cohort_start_date",
competing_outcome_washout: int | float = float("inf"),
censor_on_cohort_exit: bool = False,
censor_on_date: str | None = None,
follow_up_days: int | float = float("inf"),
strata: list[str] | None = None,
event_gap: int = 30,
estimate_gap: int = 1,
restricted_mean_follow_up: int | None = None,
minimum_survival_days: int = 1,
) -> SummarisedResult
Estimate cumulative incidence with competing risks.
Uses the Aalen-Johansen estimator to compute cumulative incidence functions (CIF) in the presence of a competing event.
Parameters¶
cdm
CDM reference.
target_cohort_table
Name of the target cohort table.
outcome_cohort_table
Name of the primary outcome cohort table.
competing_outcome_cohort_table
Name of the competing outcome cohort table.
target_cohort_id, outcome_cohort_id
Cohort ID filters.
outcome_date_variable
Date column for primary outcome timing.
outcome_washout
Washout period for primary outcome.
competing_outcome_cohort_id
Which competing outcome cohort IDs to use.
competing_outcome_date_variable
Date column for competing outcome timing.
competing_outcome_washout
Washout period for competing outcome.
censor_on_cohort_exit
Censor at target cohort end.
censor_on_date
Column with censoring date.
follow_up_days
Maximum follow-up in days.
strata
Stratification columns.
event_gap, estimate_gap
Risk table interval and estimate step.
restricted_mean_follow_up
Time horizon for restricted mean.
minimum_survival_days
Minimum follow-up filter.
Returns¶
SummarisedResult
Result with competing risk cumulative incidence estimates.
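As an illustration of what the Aalen-Johansen estimator computes, the following pure-Python sketch accumulates the cumulative incidence of the outcome of interest while a competing event removes persons from the risk set. It is a simplified sketch rather than omopy's custom implementation; the function name `aalen_johansen`, the status coding (0, 1, 2), and the sample data are assumptions made for the example.

```python
def aalen_johansen(times, status):
    """Cumulative incidence of event type 1 with event type 2 competing.

    status: 0 = censored, 1 = outcome of interest, 2 = competing event.
    Returns [(t, CIF_1(t))] at each time where an outcome of interest occurs.
    """
    at_risk = len(times)
    event_free = 1.0  # probability of no event of either type just before t
    cif = 0.0
    curve = []
    for t in sorted(set(times)):
        here = [s for u, s in zip(times, status) if u == t]
        d1 = here.count(1)
        d2 = here.count(2)
        if d1:
            # Increment: P(event-free just before t) * cause-1 hazard at t.
            cif += event_free * d1 / at_risk
            curve.append((t, cif))
        event_free *= 1 - (d1 + d2) / at_risk
        at_risk -= len(here)
    return curve


# Illustrative data: one competing event (day 4) and two censored persons.
times = [2, 4, 4, 6, 9, 10]
status = [1, 2, 0, 1, 0, 1]
cif_curve = aalen_johansen(times, status)
```

Unlike one minus a Kaplan-Meier estimate that treats the competing event as censoring, the increments here are weighted by the probability of being free of both event types, so the cause-specific incidences cannot sum to more than 1.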
Add Survival Columns¶
Enrich a cohort table with survival time and event status columns.
add_cohort_survival¶
add_cohort_survival(
x: CdmTable,
cdm: CdmReference | None = None,
*,
outcome_cohort_table: str | CohortTable,
outcome_cohort_id: int = 1,
outcome_date_variable: str = "cohort_start_date",
outcome_washout: int | float = float("inf"),
censor_on_cohort_exit: bool = False,
censor_on_date: str | None = None,
follow_up_days: int | float = float("inf"),
time_column: str = "time",
status_column: str = "status",
) -> CdmTable
Add survival time and status columns to a cohort table.
Computes days-to-event/censoring for each person in the input cohort, relative to an outcome cohort.
Parameters¶
x
Input cohort table (must have subject_id, cohort_start_date,
cohort_end_date).
cdm
CDM reference. If None, uses x.cdm.
outcome_cohort_table
Name of the outcome cohort table in the CDM, or a CohortTable.
outcome_cohort_id
Which cohort definition ID in the outcome cohort to use.
outcome_date_variable
Date column in the outcome cohort for the event date.
outcome_washout
Number of days before index date to check for prior outcome.
float('inf') means check the entire prior history.
censor_on_cohort_exit
If True, censor at cohort_end_date instead of observation end.
censor_on_date
Column name in x containing a date to censor on.
follow_up_days
Maximum follow-up time in days. float('inf') = no cap.
time_column
Name of the output time column.
status_column
Name of the output status column.
Returns¶
CdmTable
Input table with time_column and status_column added.
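The days-to-event/censoring logic can be sketched for a single person as follows. This is an assumed simplification, not the library's code: washout handling is omitted, the tie-breaking between event and censoring dates is a guess, and the helper `time_and_status` and its arguments are hypothetical.

```python
from datetime import date


def time_and_status(index, obs_end, outcome=None, cohort_end=None,
                    follow_up_days=float("inf")):
    """Return (time, status) for one person; status 1 = event, 0 = censored.

    index      -- cohort_start_date of the target cohort entry
    obs_end    -- end of the observation period
    outcome    -- outcome date, or None if the person never has the outcome
    cohort_end -- pass cohort_end_date to mimic censor_on_cohort_exit=True
    """
    censor = obs_end if cohort_end is None else min(obs_end, cohort_end)
    censor_days = min((censor - index).days, follow_up_days)
    if outcome is not None and outcome >= index:
        event_days = (outcome - index).days
        if event_days <= censor_days:
            return event_days, 1
    return censor_days, 0


index = date(2020, 1, 1)
obs_end = date(2020, 12, 31)
outcome = date(2020, 3, 1)

with_event = time_and_status(index, obs_end, outcome)
no_event = time_and_status(index, obs_end)
exit_censored = time_and_status(index, obs_end, outcome,
                                cohort_end=date(2020, 2, 1))
capped = time_and_status(index, obs_end, outcome, follow_up_days=30)
```

The last two calls show how an earlier censoring point turns an observed outcome into a censored record: the outcome on day 60 falls after the cohort exit on day 31, and after the 30-day follow-up cap.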
Result Conversion¶
Convert a long-format SummarisedResult into structured wide-format
DataFrames for estimates, events, summary, and attrition.
as_survival_result¶
Convert a SummarisedResult to a structured survival result.
Extracts and pivots the four types of survival data embedded in the SummarisedResult:
- estimates: time-point survival/CIF values in wide format
- events: risk table data per interval
- summary: summary statistics (median, RMST, quantiles)
- attrition: attrition tracking
Parameters¶
result
A SummarisedResult from estimate_single_event_survival()
or estimate_competing_risk_survival().
Returns¶
dict
Dictionary with keys "estimates", "events", "summary",
"attrition", each containing a wide-format Polars DataFrame.
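The long-to-wide pivot described above can be pictured with a small stdlib sketch. The row layout is a deliberately simplified stand-in for a SummarisedResult: the column names `time`, `estimate_name`, and `estimate_value`, the estimate labels, and the helper `pivot_wide` are all assumptions for illustration, and the real function returns Polars DataFrames rather than lists of dicts.

```python
from collections import defaultdict

# Hypothetical, simplified long-format rows: one row per (time, estimate name).
long_rows = [
    {"time": 0, "estimate_name": "estimate", "estimate_value": "1.0"},
    {"time": 0, "estimate_name": "estimate_95CI_lower", "estimate_value": "1.0"},
    {"time": 30, "estimate_name": "estimate", "estimate_value": "0.9"},
    {"time": 30, "estimate_name": "estimate_95CI_lower", "estimate_value": "0.8"},
]


def pivot_wide(rows, index, names, values):
    """Turn one-row-per-estimate data into one row per index value."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row[index]][row[names]] = float(row[values])
    return [{index: key, **cols} for key, cols in sorted(wide.items())]


wide_rows = pivot_wide(long_rows, "time", "estimate_name", "estimate_value")
```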
Table Functions¶
Format survival results as publication-ready tables using
omopy.vis.vis_omop_table().
table_survival¶
table_survival(
result: SummarisedResult,
*,
times: list[int] | None = None,
time_scale: str = "days",
header: str | list[str] = "estimate",
estimates: list[str] | None = None,
type: str = "gt",
**kwargs: Any,
) -> Any
Create a summary table of survival results.
Shows median survival, restricted mean survival, and optionally survival estimates at specific time points.
Parameters¶
result
SummarisedResult from a survival estimation function.
times
Optional list of time points (in days) to include estimates for.
time_scale
Scale for display: "days", "months", or "years".
header
Column(s) to use as header.
estimates
Which summary estimates to include. Default: median + RMST.
type
Output type ("gt" for great-tables).
Returns¶
great_tables.GT or pl.DataFrame
Rendered table.
table_survival_events¶
table_survival_attrition¶
options_table_survival¶
Plot Functions¶
Kaplan-Meier and cumulative incidence curves with confidence interval ribbons, risk tables, and faceting support.
plot_survival¶
plot_survival(
result: SummarisedResult,
*,
ribbon: bool = True,
facet: str | list[str] | None = None,
colour: str | None = None,
cumulative_failure: bool = False,
risk_table: bool = False,
risk_interval: int = 30,
log_log: bool = False,
time_scale: str = "days",
style: dict[str, Any] | None = None,
) -> Any
Create a survival curve plot.
Generates a Kaplan-Meier survival curve or cumulative incidence/failure plot from a SummarisedResult.
Parameters¶
result
SummarisedResult from a survival estimation function.
ribbon
Show 95% confidence interval as a shaded ribbon.
facet
Column(s) to facet by.
colour
Column to colour by.
cumulative_failure
If True, plot 1 - S(t) (cumulative failure) instead of S(t).
risk_table
If True, add a risk table below the plot.
risk_interval
Interval for risk table display (in days).
log_log
If True, use log(-log(S(t))) y-axis scale.
time_scale
Time axis scale: "days", "months", or "years".
style
Additional plotly style overrides.
Returns¶
plotly.graph_objects.Figure
A plotly figure with the survival plot.
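The two y-axis transformations named in the parameters, cumulative failure 1 - S(t) and log(-log(S(t))), are simple pointwise functions of the survival estimate. The sketch below shows them in isolation; the helper names are hypothetical, and how plot_survival treats the boundary values S(t) = 0 and S(t) = 1 is not stated in the source, so returning None there is an assumption.

```python
import math


def cumulative_failure(s):
    """1 - S(t): probability that the event has occurred by time t."""
    return 1.0 - s


def log_minus_log(s):
    """log(-log(S(t))); undefined at S(t) = 0 or 1, where None is returned."""
    if not 0.0 < s < 1.0:
        return None
    return math.log(-math.log(s))


survival = [1.0, 0.75, math.exp(-1), 0.0]
failure = [cumulative_failure(s) for s in survival]
loglog = [log_minus_log(s) for s in survival]
```

On the log(-log) scale, curves from groups with proportional hazards appear roughly parallel, which is the usual reason for choosing this axis.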
available_survival_grouping¶
Mock Data¶
mock_survival¶
mock_survival(
n_persons: int = 200,
*,
seed: int = 42,
target_name: str = "target",
outcome_name: str = "outcome",
competing_name: str = "competing",
event_rate: float = 0.3,
competing_rate: float = 0.15,
max_follow_up: int = 3650,
include_strata: bool = True,
) -> CdmReference
Create a mock CDM with cohort tables for survival analysis testing.
Generates synthetic person, observation_period, target cohort, outcome cohort, and competing risk cohort tables.
Parameters¶
n_persons
Number of persons to simulate.
seed
Random seed for reproducibility.
target_name
Name of the target cohort.
outcome_name
Name of the outcome cohort.
competing_name
Name of the competing risk cohort.
event_rate
Proportion of persons who experience the primary event.
competing_rate
Proportion of persons who experience the competing event.
max_follow_up
Maximum follow-up time in days.
include_strata
If True, add sex and age_group columns to the
target cohort for stratified analysis.
Returns¶
CdmReference
CDM with person, observation_period, target cohort,
outcome cohort, and competing cohort tables.
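To show how a seed, an event rate, and a follow-up cap can combine into reproducible synthetic survival data, here is a small stdlib sketch. It does not reproduce how mock_survival builds its tables; the function `simulate_follow_up`, the uniform draw of follow-up time, and the tuple layout are assumptions chosen for brevity.

```python
import random


def simulate_follow_up(n_persons=200, seed=42, event_rate=0.3,
                       max_follow_up=3650):
    """Return [(subject_id, time_in_days, status)] with a seeded generator."""
    rng = random.Random(seed)  # local generator: no global state is touched
    rows = []
    for subject_id in range(1, n_persons + 1):
        has_event = rng.random() < event_rate
        time = rng.randint(1, max_follow_up)
        rows.append((subject_id, time, int(has_event)))
    return rows


rows = simulate_follow_up()
same_rows = simulate_follow_up()
other_rows = simulate_follow_up(seed=7)
```

Calling the function twice with the same seed yields identical rows, which is the property that makes a seeded mock dataset suitable for tests.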