omopy.drug_diagnostics¶

Drug exposure diagnostics — run configurable quality checks on drug_exposure records for specified ingredient concepts, summarise findings, and visualise results.

This module is the Python equivalent of the R DrugExposureDiagnostics package. Table rendering delegates to omopy.vis; plot rendering uses plotly.

Constants¶

AVAILABLE_CHECKS `module-attribute` ¶

AVAILABLE_CHECKS: tuple[str, ...] = (
    "missing",
    "exposure_duration",
    "type",
    "route",
    "source_concept",
    "days_supply",
    "verbatim_end_date",
    "dose",
    "sig",
    "quantity",
    "days_between",
    "diagnostics_summary",
)
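Since execute_checks rejects names that are not in AVAILABLE_CHECKS, it can help to validate a user-supplied list up front. The following is a minimal sketch rather than the module's own code: validate_checks is a hypothetical helper, and the tuple is copied locally so the snippet runs without omopy installed.

```python
# Local copy of the documented tuple so the sketch is self-contained.
AVAILABLE_CHECKS: tuple[str, ...] = (
    "missing",
    "exposure_duration",
    "type",
    "route",
    "source_concept",
    "days_supply",
    "verbatim_end_date",
    "dose",
    "sig",
    "quantity",
    "days_between",
    "diagnostics_summary",
)


def validate_checks(checks=None):
    """Hypothetical helper: return the checks to run, rejecting unknown names."""
    if checks is None:
        # Mirrors the documented default of running every available check.
        return AVAILABLE_CHECKS
    unknown = [name for name in checks if name not in AVAILABLE_CHECKS]
    if unknown:
        raise ValueError(f"Unknown check name(s): {unknown}")
    return tuple(checks)


selected = validate_checks(["missing", "route"])
```

Failing early like this keeps a typo in a check name from surfacing only after a database connection has been opened.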

Core Types¶

Pydantic model for storing diagnostic check results.

DiagnosticsResult ¶

Bases: BaseModel

Container for drug exposure diagnostics results.

Holds a named dict of Polars DataFrames (one per check) plus metadata about the execution. Immutable after creation.

Attributes¶

results
    Dict mapping check names to Polars DataFrames with check results.
checks_performed
    Tuple of check names that were actually run.
ingredient_concepts
    Dict mapping ingredient concept IDs to their names.
cdm_name
    Name of the CDM instance.
sample_size
    Number of records sampled per ingredient (or None if no sampling).
min_cell_count
    Minimum cell count threshold used for suppression.
execution_time_seconds
    Total execution time in seconds.

__getitem__ ¶

__getitem__(key: str) -> pl.DataFrame

Allow dict-like access: result['missing'].

__iter__ ¶

__iter__()

Iterate over check names (not Pydantic fields).

keys ¶

keys()

Return check names.

values ¶

values()

Return result DataFrames.

items ¶

items()

Return (check_name, DataFrame) pairs.
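The dict-like surface described above can be illustrated with a small stand-in class. This is a sketch under stated assumptions: ResultSketch is hypothetical, it uses a frozen dataclass instead of Pydantic, and plain lists stand in for Polars DataFrames so that nothing beyond the standard library is needed.

```python
from dataclasses import dataclass
from typing import Any, Iterator


@dataclass(frozen=True)
class ResultSketch:
    """Hypothetical stand-in for DiagnosticsResult; not the omopy class."""

    results: dict[str, Any]
    checks_performed: tuple[str, ...]

    def __getitem__(self, key: str) -> Any:
        # Dict-style access: res["missing"]
        return self.results[key]

    def __iter__(self) -> Iterator[str]:
        # Iterate over check names rather than over the model's fields.
        return iter(self.results)

    def keys(self):
        return self.results.keys()

    def values(self):
        return self.results.values()

    def items(self):
        return self.results.items()


# Made-up rows, for illustration only.
res = ResultSketch(
    results={
        "missing": [{"variable": "quantity", "n_missing": 3}],
        "route": [{"route": "oral", "n_records": 97}],
    },
    checks_performed=("missing", "route"),
)

names = list(res)
as_dict = dict(res)  # works because keys() and __getitem__ are both defined
```

Iterating over check names keeps loops such as `for name, frame in res.items()` focused on results instead of metadata fields.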

Diagnostic Computation¶

Run one or more diagnostic checks on drug exposure records.

execute_checks ¶

execute_checks(
    cdm: CdmReference,
    ingredient_concept_ids: list[int] | int,
    *,
    checks: list[str] | tuple[str, ...] | None = None,
    sample_size: int | None = 10000,
    min_cell_count: int = 5,
) -> DiagnosticsResult

Run drug exposure diagnostic checks for specified ingredients.

This is the main entry point for the drug diagnostics module. For each ingredient concept ID, it resolves descendant drug concepts, fetches (and optionally samples) drug_exposure records, and runs each enabled check.

Parameters¶

cdm
    A CdmReference connected to an OMOP CDM database.
ingredient_concept_ids
    One or more ingredient concept IDs to diagnose.
checks
    Which checks to run. Defaults to all available checks. See AVAILABLE_CHECKS for valid names.
sample_size
    Maximum number of records to sample per ingredient. Set to None to use all records (can be slow for large datasets).
min_cell_count
    Counts below this threshold are replaced with None for privacy protection. Set to 0 to disable.

Returns¶

DiagnosticsResult
    Container with a dict of Polars DataFrames (one per check), plus metadata about the execution.

Raises¶

ValueError
    If any check name is not in AVAILABLE_CHECKS.
TypeError
    If cdm is not a CdmReference.

Examples¶

>>> import omopy
>>> cdm = omopy.connector.cdm_from_con(con, cdm_schema="base")
>>> result = omopy.drug_diagnostics.execute_checks(
...     cdm,
...     ingredient_concept_ids=[1125315, 1503297],
...     checks=["missing", "exposure_duration", "type"],
...     sample_size=5000,
... )
>>> result["missing"]  # Polars DataFrame
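The effect of min_cell_count can be pictured with plain Python. This is a minimal sketch, not the module's implementation: suppress_small_counts is a hypothetical helper, the counts are made up, and the treatment of zero counts is an assumption because the description above does not say whether zeros are kept.

```python
def suppress_small_counts(counts, min_cell_count=5):
    """Hypothetical helper: replace counts below the threshold with None."""
    if min_cell_count <= 0:
        # Documented behaviour: a threshold of 0 disables suppression.
        return dict(counts)
    # Assumption: every count below the threshold is suppressed, zeros included.
    return {
        key: (value if value >= min_cell_count else None)
        for key, value in counts.items()
    }


# Made-up record counts per route, for illustration only.
route_counts = {"oral": 120, "topical": 3, "intravenous": 5}

suppressed = suppress_small_counts(route_counts, min_cell_count=5)
unsuppressed = suppress_small_counts(route_counts, min_cell_count=0)
```

Here only the "topical" count falls below the threshold; a count equal to the threshold is kept.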

Summarise Functions¶

Aggregate diagnostic results into a standardised SummarisedResult.

summarise_drug_diagnostics ¶

summarise_drug_diagnostics(
    result: DiagnosticsResult,
) -> SummarisedResult

Convert drug diagnostics results to SummarisedResult format.

Transforms the dict-of-DataFrames output from execute_checks into the standard 13-column omopy.generics.SummarisedResult format used by table_drug_diagnostics() and plot_drug_diagnostics().

Parameters¶

result
    Output from execute_checks.

Returns¶

SummarisedResult
    Standardised result with one result_id per check type.

Examples¶

>>> diag = omopy.drug_diagnostics.execute_checks(cdm, [1125315])
>>> sr = omopy.drug_diagnostics.summarise_drug_diagnostics(diag)
>>> sr.settings
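To make the 13-column long format more concrete, the sketch below reshapes one set of estimates into rows. It rests on assumptions: the column names are those of the R omopgenerics summarised_result, and whether omopy uses exactly the same names is not confirmed here. The to_long_rows helper, the "overall" placeholders, and the sample values are hypothetical.

```python
# Column names as used by the R omopgenerics summarised_result (assumed to
# carry over to omopy.generics.SummarisedResult).
SUMMARISED_RESULT_COLUMNS = (
    "result_id", "cdm_name", "group_name", "group_level",
    "strata_name", "strata_level", "variable_name", "variable_level",
    "estimate_name", "estimate_type", "estimate_value",
    "additional_name", "additional_level",
)


def to_long_rows(result_id, cdm_name, ingredient, variable, estimates):
    """Hypothetical helper: emit one long-format row per estimate."""
    rows = []
    for name, value in estimates.items():
        rows.append({
            "result_id": result_id,
            "cdm_name": cdm_name,
            "group_name": "ingredient_name",
            "group_level": ingredient,
            "strata_name": "overall",
            "strata_level": "overall",
            "variable_name": variable,
            "variable_level": None,
            "estimate_name": name,
            "estimate_type": "integer" if isinstance(value, int) else "numeric",
            "estimate_value": str(value),  # stored as text in the long format
            "additional_name": "overall",
            "additional_level": "overall",
        })
    return rows


# Made-up values, for illustration only.
rows = to_long_rows(
    result_id=1,
    cdm_name="mock_cdm",
    ingredient="ingredient_a",
    variable="days_supply",
    estimates={"n_records": 100, "median": 30.0},
)
```

One row per estimate is what lets a single table or plot function handle every check type without knowing each check's original column layout.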

Table Functions¶

Format summarised results as publication-ready tables using omopy.vis.vis_omop_table().

table_drug_diagnostics ¶

table_drug_diagnostics(
    result: SummarisedResult,
    *,
    check: str | None = None,
    type: Literal["gt", "polars"] | None = None,
    header: list[str] | None = None,
    group_column: list[str] | None = None,
    hide: list[str] | None = None,
    style: Any | None = None,
) -> Any

Format drug diagnostics results as a display-ready table.

Parameters¶

result
    A SummarisedResult from summarise_drug_diagnostics.
check
    Specific check to display (e.g. "missing", "exposure_duration"). If None, all checks are included.
type
    Output format: "gt" for great_tables.GT, "polars" for a Polars DataFrame. Defaults to "polars" when left as None.
header
    Columns to use as multi-level headers.
group_column
    Columns to use for row grouping.
hide
    Columns to hide from the output.
style
    Optional TableStyle for customisation.

Returns¶

great_tables.GT | polars.DataFrame
    Formatted table.

Plot Functions¶

Visualise diagnostic results as bar charts and box plots.

plot_drug_diagnostics ¶

plot_drug_diagnostics(
    result: SummarisedResult,
    *,
    check: str = "missing",
    facet: str | None = None,
    colour: str | None = None,
    title: str | None = None,
    style: Any | None = None,
) -> Any

Create a plot for drug diagnostics results.

Generates bar charts for categorical checks and box plots for quantile-based checks.

Parameters¶

result
    A SummarisedResult from summarise_drug_diagnostics.
check
    Which check to plot. One of: "missing", "exposure_duration", "type", "route", "source_concept", "sig", "quantity", "days_supply", "days_between".
facet
    Column to facet by (currently unused, reserved for future use).
colour
    Override colour for all bars/boxes.
title
    Chart title. Defaults to a descriptive title based on the check.
style
    Optional plot style configuration (reserved for future use).

Returns¶

plotly.graph_objects.Figure
    Interactive plotly figure.

Raises¶

ValueError
    If check is not a valid plottable check name.
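The choice between a bar chart and a box plot can be sketched as a simple lookup. The list of plottable checks comes from the parameter description above, but the split into categorical and quantile-based checks is an assumption made for illustration, and chart_kind is a hypothetical helper rather than part of the module.

```python
# Plottable checks as listed in the parameter description.
PLOTTABLE_CHECKS = (
    "missing", "exposure_duration", "type", "route", "source_concept",
    "sig", "quantity", "days_supply", "days_between",
)

# Assumed split: these checks are treated as quantile-based (box plots);
# everything else in PLOTTABLE_CHECKS is treated as categorical (bar charts).
QUANTILE_CHECKS = frozenset(
    {"exposure_duration", "quantity", "days_supply", "days_between"}
)


def chart_kind(check="missing"):
    """Hypothetical helper: return "bar" or "box" for a plottable check."""
    if check not in PLOTTABLE_CHECKS:
        raise ValueError(f"{check!r} is not a plottable check")
    return "box" if check in QUANTILE_CHECKS else "bar"


default_kind = chart_kind()
supply_kind = chart_kind("days_supply")
```

Note that "verbatim_end_date", "dose", and "diagnostics_summary" are valid check names but are absent from the plottable list, so they raise ValueError in this sketch.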

Mock Data & Benchmarking¶

mock_drug_exposure ¶

mock_drug_exposure(
    *,
    n_ingredients: int = 2,
    n_records_per_ingredient: int = 100,
    seed: int = 42,
    include_checks: list[str] | None = None,
) -> DiagnosticsResult

Generate a mock DiagnosticsResult for testing.

Creates synthetic data representative of execute_checks output, useful for testing table/plot/summarise functions without requiring a database.

Parameters¶

n_ingredients
    Number of ingredient concepts to simulate.
n_records_per_ingredient
    Number of drug exposure records per ingredient.
seed
    Random seed for reproducibility.
include_checks
    Which checks to include. Defaults to all available checks.

Returns¶

DiagnosticsResult
    Mock results with realistic distributions.
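How a seed makes mock data reproducible can be shown with the standard library alone. This is a sketch of the idea, not the module's generator: mock_records is a hypothetical helper, and the concept IDs, field names, and value ranges are made up.

```python
import random


def mock_records(n_ingredients=2, n_records_per_ingredient=100, seed=42):
    """Hypothetical helper: build seeded synthetic drug exposure records."""
    # A local generator keeps the global random state untouched.
    rng = random.Random(seed)
    records = []
    for ingredient_index in range(n_ingredients):
        concept_id = 1000 + ingredient_index  # made-up concept IDs
        for _ in range(n_records_per_ingredient):
            records.append({
                "ingredient_concept_id": concept_id,
                "days_supply": rng.randint(1, 90),
                # None simulates a missing quantity.
                "quantity": rng.choice([None, 30, 60, 90]),
            })
    return records


first = mock_records(seed=42)
second = mock_records(seed=42)
```

Because both calls use the same seed, `first` and `second` contain identical records, which is what makes tests built on mock data deterministic.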

benchmark_drug_diagnostics ¶

benchmark_drug_diagnostics(
    cdm: Any,
    ingredient_concept_ids: list[int],
    *,
    checks: list[str] | None = None,
    sample_size: int | None = 10000,
    n_runs: int = 3,
) -> pl.DataFrame

Benchmark execute_checks performance.

Runs execute_checks multiple times and reports timing statistics.

Parameters¶

cdm
    A CdmReference connected to an OMOP CDM database.
ingredient_concept_ids
    Ingredient concept IDs to diagnose.
checks
    Which checks to run. Defaults to all.
sample_size
    Maximum records to sample per ingredient.
n_runs
    Number of repetitions for timing.

Returns¶

polars.DataFrame
    DataFrame with columns: run, ingredient_concept_id, n_records, execution_time_seconds.
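The repeated-timing pattern behind a benchmark like this can be sketched with time.perf_counter. The time_runs helper below is hypothetical and times an arbitrary callable instead of execute_checks; it reproduces only the run and execution_time_seconds columns and returns a list of dicts rather than a Polars DataFrame.

```python
import statistics
import time


def time_runs(func, n_runs=3):
    """Hypothetical helper: call func n_runs times and record each duration."""
    rows = []
    for run in range(1, n_runs + 1):
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        rows.append({"run": run, "execution_time_seconds": elapsed})
    return rows


# A cheap stand-in workload; a real benchmark would call execute_checks here.
rows = time_runs(lambda: sum(range(100_000)), n_runs=3)
mean_seconds = statistics.mean(row["execution_time_seconds"] for row in rows)
```

Recording each run separately, rather than only an average, makes it easier to spot a slow first run caused by caching or connection warm-up.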
