omopy.pregnancy

Pregnancy episode identification using the HIPPS algorithm — identify pregnancy episodes from OMOP CDM data, summarise findings, and visualise results.

This module is the Python equivalent of the R PregnancyIdentifier package. Table rendering delegates to omopy.vis; plot rendering uses plotly.

Constants

OUTCOME_CATEGORIES module-attribute

OUTCOME_CATEGORIES: dict[str, str] = {
    "LB": "Live birth",
    "SB": "Stillbirth",
    "AB": "Abortion",
    "SA": "Spontaneous abortion",
    "DELIV": "Delivery (unspecified)",
    "ECT": "Ectopic pregnancy",
    "PREG": "Pregnancy (ongoing/unspecified)",
}
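
As a small illustration, the mapping can be used to turn outcome codes into display labels. The sketch below copies the dictionary so that it runs without omopy installed; the `label_for` helper and its fallback for unknown codes are hypothetical and not part of the module.

```python
# Copied from the listing above so the example is self-contained.
OUTCOME_CATEGORIES: dict[str, str] = {
    "LB": "Live birth",
    "SB": "Stillbirth",
    "AB": "Abortion",
    "SA": "Spontaneous abortion",
    "DELIV": "Delivery (unspecified)",
    "ECT": "Ectopic pregnancy",
    "PREG": "Pregnancy (ongoing/unspecified)",
}


def label_for(code: str) -> str:
    """Hypothetical helper: return the label for a code, or the code itself if unknown."""
    return OUTCOME_CATEGORIES.get(code, code)


labels = [label_for(code) for code in ["LB", "ECT", "XYZ"]]
print(labels)
```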

Core Types

Pydantic model for storing pregnancy identification results.

PregnancyResult

Bases: BaseModel

Container for pregnancy identification results.

Attributes

  • episodes: Final pregnancy episodes (one row per episode) after ESD refinement.
  • hip_episodes: HIP-only episodes before merging.
  • pps_episodes: PPS-only episodes before merging.
  • merged_episodes: Merged HIP+PPS episodes before ESD refinement.
  • cdm_name: Name of the CDM instance.
  • n_persons_input: Number of distinct persons with any pregnancy-related record.
  • n_episodes: Total number of final episodes.
  • settings: Parameters used for the analysis.

Pregnancy Identification

Run the HIPPS algorithm to identify pregnancy episodes.

identify_pregnancies

identify_pregnancies(
    cdm: CdmReference,
    *,
    start_date: date | None = None,
    end_date: date | None = None,
    age_bounds: tuple[int, int] = (10, 55),
    just_gestation: bool = True,
    min_cell_count: int = 5,
) -> PregnancyResult

Identify pregnancy episodes from an OMOP CDM.

Main entry point for the HIPPS pregnancy identification algorithm. Runs the full pipeline: init → HIP → PPS → merge → ESD.

Parameters

  • cdm: A CdmReference with clinical tables.
  • start_date: Restrict to records on or after this date.
  • end_date: Restrict to records on or before this date.
  • age_bounds: (min_age, max_age) for filtering persons by age at record date.
  • just_gestation: If True, run HIP Pass 2 for gestation-only episodes.
  • min_cell_count: Minimum cell count for suppression.

Returns

PregnancyResult: Container with episodes, intermediate results, and metadata.

Summarise Functions

Aggregate pregnancy episodes into a standardised SummarisedResult.

summarise_pregnancies

summarise_pregnancies(
    result: PregnancyResult,
    *,
    strata: list[str] | None = None,
) -> SummarisedResult

Summarise pregnancy episodes into SummarisedResult format.

Parameters

  • result: A PregnancyResult from identify_pregnancies.
  • strata: Optional list of columns to stratify by (e.g., ["category"]).

Returns

SummarisedResult: Standard OHDSI result format with pregnancy episode statistics.

Table Functions

Format summarised results as publication-ready tables using omopy.vis.vis_omop_table().

table_pregnancies

table_pregnancies(
    result: SummarisedResult,
    *,
    type: Literal["gt", "polars"] | None = None,
    header: list[str] | None = None,
    group_column: list[str] | None = None,
    hide: list[str] | None = None,
    style: Any | None = None,
    **options: Any,
) -> Any

Render pregnancy results as a formatted table.

Parameters

  • result: A SummarisedResult from summarise_pregnancies.
  • type: "gt" for a great_tables table, "polars" for a raw DataFrame.
  • header: Columns to use as header grouping.
  • group_column: Columns for row grouping.
  • hide: Columns to hide.
  • style: A TableStyle for customisation.
  • **options: Additional options forwarded to vis_omop_table.

Returns

great_tables.GT | polars.DataFrame

Plot Functions

Visualise pregnancy outcomes, gestational age, and timelines.

plot_pregnancies

plot_pregnancies(
    result: SummarisedResult,
    *,
    type: str = "outcome",
    facet: str | list[str] | None = None,
    colour: str | None = None,
    style: Any | None = None,
) -> Any

Plot pregnancy results.

Parameters

  • result: A SummarisedResult from summarise_pregnancies.
  • type: "outcome" for an outcome category bar chart, "source" for the source distribution, "duration" for episode duration, "precision" for the precision distribution.
  • facet: Column(s) for faceting.
  • colour: Column for colour grouping.
  • style: A PlotStyle for styling.

Returns

plotly.graph_objects.Figure | polars.DataFrame: A Plotly figure, or a tidy DataFrame if Plotly is unavailable.
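
The fallback behaviour of the return value follows a common optional-dependency pattern. The sketch below shows that pattern in general terms and is not taken from the omopy source: `outcome_chart` is a hypothetical function, and a list of dicts stands in for the tidy DataFrame so that the example needs only the standard library.

```python
import importlib.util


def outcome_chart(counts: dict[str, int]):
    """Hypothetical: return a Plotly bar chart if Plotly is installed, else tidy rows."""
    if importlib.util.find_spec("plotly") is not None:
        # Import lazily so the module still loads when Plotly is missing.
        import plotly.graph_objects as go

        return go.Figure(
            data=[go.Bar(x=list(counts.keys()), y=list(counts.values()))]
        )
    # Fallback: one row per category, ready for tabular display.
    return [{"category": k, "n_episodes": n} for k, n in counts.items()]


chart_or_rows = outcome_chart({"LB": 3, "SB": 1})
print(type(chart_or_rows).__name__)
```

Callers that need to handle both cases can branch on the type of the returned object.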

Validation & Mock Data

validate_episodes

validate_episodes(
    episodes: DataFrame, *, max_days: int = 365
) -> pl.DataFrame

Validate pregnancy episode periods.

Checks that episodes satisfy basic temporal constraints:

  • episode_start_date <= episode_end_date
  • Duration does not exceed max_days
  • No overlapping episodes for the same person
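
The three checks above can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the function's actual implementation: `check_episodes` and the check names in its report are hypothetical, episodes are plain dicts instead of a DataFrame, and two episodes that share a boundary day are assumed to count as overlapping.

```python
from collections import defaultdict
from datetime import date


def check_episodes(episodes: list[dict], max_days: int = 365) -> dict[str, int]:
    """Hypothetical sketch: count violations of the three temporal checks."""
    report = {"start_after_end": 0, "exceeds_max_days": 0, "overlap": 0}
    spans_by_person = defaultdict(list)

    for ep in episodes:
        start, end = ep["episode_start_date"], ep["episode_end_date"]
        if start > end:
            report["start_after_end"] += 1
        if (end - start).days > max_days:
            report["exceeds_max_days"] += 1
        spans_by_person[ep["person_id"]].append((start, end))

    for spans in spans_by_person.values():
        spans.sort()  # order each person's episodes by start date
        latest_end = spans[0][1]
        for start, end in spans[1:]:
            # Assumption: sharing a boundary day counts as an overlap.
            if start <= latest_end:
                report["overlap"] += 1
            latest_end = max(latest_end, end)

    return report


episodes = [
    {"person_id": 1, "episode_start_date": date(2020, 1, 1), "episode_end_date": date(2020, 9, 30)},
    {"person_id": 1, "episode_start_date": date(2020, 9, 1), "episode_end_date": date(2021, 3, 1)},
    {"person_id": 2, "episode_start_date": date(2021, 5, 1), "episode_end_date": date(2021, 4, 1)},
    {"person_id": 3, "episode_start_date": date(2019, 1, 1), "episode_end_date": date(2020, 6, 1)},
]
report = check_episodes(episodes)
print(report)
```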

Parameters

  • episodes: DataFrame with at least: person_id, episode_start_date, episode_end_date.
  • max_days: Maximum allowed episode duration in days.

Returns

pl.DataFrame: Validation report with columns: check, n_violations, details.

mock_pregnancy_cdm

mock_pregnancy_cdm(
    *, seed: int = 42, n_persons: int = 20
) -> CdmReference

Create a mock CDM with pregnancy-related records for testing.

Generates synthetic data including:

  • person table (females aged 15–45)
  • observation_period table
  • condition_occurrence with pregnancy outcome codes
  • procedure_occurrence with delivery procedure codes
  • measurement with gestational age measurements
  • observation with prenatal observation codes

Parameters

  • seed: Random seed for reproducibility.
  • n_persons: Number of persons to generate.

Returns

CdmReference: CDM with clinical tables containing pregnancy-related records.
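
Taken together, the functions on this page suggest an end-to-end workflow along the following lines. This is a sketch based only on the signatures documented above; it assumes every name is importable from omopy.pregnancy, and the wrapper `run_pregnancy_pipeline` is hypothetical. The import is placed inside the function so that the snippet can be loaded even where omopy is not installed.

```python
def run_pregnancy_pipeline(n_persons: int = 20, seed: int = 42):
    """Hypothetical wrapper chaining the documented functions on mock data."""
    # Assumption: all of these names are exported by omopy.pregnancy.
    from omopy.pregnancy import (
        identify_pregnancies,
        mock_pregnancy_cdm,
        plot_pregnancies,
        summarise_pregnancies,
        table_pregnancies,
        validate_episodes,
    )

    # Build a synthetic CDM, then run the full HIPPS pipeline on it.
    cdm = mock_pregnancy_cdm(seed=seed, n_persons=n_persons)
    result = identify_pregnancies(cdm, age_bounds=(10, 55), just_gestation=True)

    # Check the final episodes against the basic temporal constraints.
    validation = validate_episodes(result.episodes, max_days=365)

    # Aggregate, then render as a raw table and an outcome bar chart.
    summary = summarise_pregnancies(result, strata=["category"])
    table = table_pregnancies(summary, type="polars")
    figure = plot_pregnancies(summary, type="outcome")
    return result, validation, table, figure
```

Calling `run_pregnancy_pipeline()` in an environment with omopy installed would return the identification result, the validation report, the table, and the plot in one step.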