WHAT PONTEGRA PROVIDES
Build repeatable cohort and feature pipelines from FHIR on Apache Spark
Pontegra helps life sciences and real-world evidence (RWE) teams turn longitudinal clinical data into analysis-ready datasets quickly, repeatably, and at scale.
With Spark-on-FHIR, you can define cohorts, create sampling timepoints, compute features and endpoints, and deliver curated tables for HEOR, epidemiology, and AI workflows.
Powered by: Spark-on-FHIR, Pontegra’s FHIR-native Spark toolkit.
WHAT YOU CAN ACHIEVE
Accelerate time-to-dataset
Move from data access to analysis-ready tables without hand-built flattening scripts.
Make studies reproducible
Define cohort logic, sampling rules, and feature definitions as versioned pipelines that can be rerun and refreshed.
Scale to real-world volumes
Run the same logic across millions of patients and billions of records with Spark-native execution.
Support iterative study development
Start with a cohort sample for quick validation, then scale out to full production runs.
Why Spark-on-FHIR for RWE
Most RWE pipelines spend disproportionate time on:
- extracting and normalizing longitudinal events,
- aligning timelines to an index date,
- generating covariates/outcomes consistently,
- keeping refresh runs stable as data grows.
Spark-on-FHIR is built specifically to reduce that burden by combining:
- FHIR-native query semantics (FHIR search + FHIRPath),
- Spark scale-out compute, and
- higher-level building blocks for cohorts, sampling, and feature extraction.
HOW IT CAN BE DONE
Core workflow
1. Define the cohort
Model cohorts with reproducible logic:
- index event definition (e.g., first diagnosis, first prescription, procedure date)
- inclusion/exclusion criteria
- entry/exit timepoints (e.g., diagnosis to death/discharge)
- optional sampling mode to test logic on large datasets quickly
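To make the cohort step concrete, here is a minimal sketch in plain Python. It does not use the Spark-on-FHIR API; every name in it (`Diagnosis`, `CohortEntry`, `build_cohort`, `in_sample`) is hypothetical and chosen for illustration only. It assumes the index event is the first diagnosis with a given code, a minimum-age inclusion rule, exit at death or study end, and a deterministic hash-based sampling mode so that reruns select the same patients.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Diagnosis:
    patient_id: str
    code: str
    recorded: date


@dataclass(frozen=True)
class CohortEntry:
    patient_id: str
    index_date: date  # entry timepoint: first qualifying diagnosis
    exit_date: date   # exit timepoint: death if known, else study end


def in_sample(patient_id: str, fraction: float) -> bool:
    """Hypothetical sampling mode: hash the patient id into [0, 1)."""
    digest = hashlib.sha256(patient_id.encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 2**32 < fraction


def build_cohort(
    diagnoses: List[Diagnosis],
    birth_dates: Dict[str, date],
    death_dates: Dict[str, date],
    index_code: str,
    min_age: int,
    study_end: date,
    sample_fraction: float = 1.0,
) -> List[CohortEntry]:
    # Index event: earliest diagnosis with the index code, per patient.
    first_dx: Dict[str, date] = {}
    for dx in diagnoses:
        if dx.code != index_code:
            continue
        current = first_dx.get(dx.patient_id)
        if current is None or dx.recorded < current:
            first_dx[dx.patient_id] = dx.recorded

    cohort = []
    for pid, index_date in sorted(first_dx.items()):
        born = birth_dates.get(pid)
        if born is None:
            continue  # exclusion: birth date unknown
        had_birthday = (index_date.month, index_date.day) >= (born.month, born.day)
        age = index_date.year - born.year - (0 if had_birthday else 1)
        if age < min_age:
            continue  # inclusion: minimum age at index
        if not in_sample(pid, sample_fraction):
            continue  # optional sampling mode for quick validation
        exit_date = min(death_dates.get(pid, study_end), study_end)
        cohort.append(CohortEntry(pid, index_date, exit_date))
    return cohort


# Illustrative data only.
example_diagnoses = [
    Diagnosis("p1", "E11", date(2020, 3, 1)),
    Diagnosis("p1", "E11", date(2019, 6, 15)),
    Diagnosis("p2", "E11", date(2021, 1, 10)),
    Diagnosis("p3", "I10", date(2020, 5, 5)),
]
example_cohort = build_cohort(
    example_diagnoses,
    birth_dates={"p1": date(1960, 1, 1), "p2": date(2010, 7, 1), "p3": date(1970, 1, 1)},
    death_dates={"p1": date(2022, 2, 2)},
    index_code="E11",
    min_age=18,
    study_end=date(2023, 12, 31),
)
print(example_cohort)
```

Lowering `sample_fraction` (for example to 0.01) would validate the same logic on a stable subset before a full run.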
2. Sample the timeline
Create timepoints for feature computation:
- periodic sampling (e.g., monthly/quarterly)
- event-aligned sampling (e.g., pre/post ICU admission, surgery, treatment start)
- “since last timepoint” and “since index event” windows
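The sampling options above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Spark-on-FHIR API: the function names (`add_months`, `periodic_timepoints`, `event_aligned_timepoints`, `windows`) and the mode strings are hypothetical, and month-end dates are assumed to clamp to the last valid day of the month.

```python
import calendar
from datetime import date, timedelta
from typing import List, Tuple


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping to the month's last day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def periodic_timepoints(index_date: date, exit_date: date, step_months: int) -> List[date]:
    """Periodic sampling: one timepoint every step_months from index to exit."""
    points = []
    k = 0
    while True:
        # Always offset from the index date so clamped days do not drift.
        t = add_months(index_date, k * step_months)
        if t > exit_date:
            break
        points.append(t)
        k += 1
    return points


def event_aligned_timepoints(event_date: date, offsets_days: List[int]) -> List[date]:
    """Event-aligned sampling: timepoints at day offsets around an event."""
    return [event_date + timedelta(days=offset) for offset in offsets_days]


def windows(timepoints: List[date], index_date: date, mode: str) -> List[Tuple[date, date]]:
    """Pair each timepoint with its window start.

    mode="since_index": every window starts at the index date.
    mode="since_last":  each window starts at the previous timepoint
                        (the first one starts at the index date).
    """
    if mode not in ("since_index", "since_last"):
        raise ValueError("unknown window mode: " + mode)
    result = []
    previous = index_date
    for t in timepoints:
        start = index_date if mode == "since_index" else previous
        result.append((start, t))
        previous = t
    return result


monthly = periodic_timepoints(date(2023, 1, 31), date(2023, 4, 30), step_months=1)
around_surgery = event_aligned_timepoints(date(2023, 3, 10), [-7, 0, 30])
since_last = windows(monthly, date(2023, 1, 31), mode="since_last")
```

Computing each periodic timepoint from the index date, rather than from the previous timepoint, keeps a month-end index (such as January 31) from collapsing permanently to the 28th after February.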
3. Compute features and endpoints
Generate analysis datasets:
- covariates (labs, vitals, diagnoses, meds, utilization)
- endpoints/outcomes (events, counts, time-to-event derivations)
- feature aggregations (avg/max/min/last/count; windowed variants)
- export tidy tables to Parquet/Delta/CSV for downstream analysis
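As a rough illustration of the feature and endpoint step, the sketch below computes windowed aggregations and a time-to-event derivation in plain Python, then writes a tidy table as CSV. It is not the Spark-on-FHIR API: the helper names (`aggregate_window`, `days_to_event`, `to_csv`) are hypothetical, windows are assumed to be open at the start and closed at the end, and CSV stands in for Parquet/Delta only because it needs no extra libraries.

```python
import csv
import io
from datetime import date
from statistics import mean
from typing import Dict, List, Optional, Tuple


def aggregate_window(
    observations: List[Tuple[date, float]], start: date, end: date
) -> Dict[str, Optional[float]]:
    """avg/max/min/last/count over observations with start < date <= end."""
    in_window = sorted((d, v) for d, v in observations if start < d <= end)
    values = [v for _, v in in_window]
    if not values:
        return {"count": 0, "avg": None, "min": None, "max": None, "last": None}
    return {
        "count": len(values),
        "avg": mean(values),
        "min": min(values),
        "max": max(values),
        "last": values[-1],  # most recent value in the window
    }


def days_to_event(
    index_date: date, event_date: Optional[date], censor_date: date
) -> Tuple[int, bool]:
    """Time-to-event: (duration in days, whether the event was observed)."""
    if event_date is not None and event_date <= censor_date:
        return (event_date - index_date).days, True
    return (censor_date - index_date).days, False  # censored


def to_csv(rows: List[dict], fieldnames: List[str]) -> str:
    """Serialize one-row-per-patient-timepoint records as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative lab values for one patient.
hba1c = [
    (date(2023, 1, 5), 7.0),
    (date(2023, 1, 20), 8.0),
    (date(2023, 2, 10), 6.0),
    (date(2023, 3, 15), 9.0),
]
features = aggregate_window(hba1c, date(2023, 1, 1), date(2023, 2, 28))
duration, observed = days_to_event(date(2023, 1, 1), date(2023, 3, 2), date(2023, 12, 31))
row = {"patient_id": "p1", "timepoint": "2023-02-28", **features,
       "days_to_event": duration, "event_observed": observed}
table = to_csv([row], list(row.keys()))
```

Returning an explicit `count` of 0 with empty aggregates, rather than dropping the row, keeps every patient-timepoint pair present in the output table.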