Design near‑real‑time trial pipelines

- FDA is piloting real-time clinical trials with AstraZeneca and Amgen, pushing trial data toward regulators as it is generated instead of waiting for end-of-study packages.
- The technical bar is higher than “faster ETL”: FDA’s January 2025 AI draft guidance and ICH E6(R3) both center validation, traceability, and fit-for-purpose systems.
- That matters because near-real-time review only works if pipelines, rules, and retrieval layers are auditable enough for regulated clinical decisions.

Clinical trial data plumbing is suddenly a frontline product problem. The FDA is now openly testing real-time clinical trial review, including pilots with AstraZeneca and Amgen, instead of waiting for the usual giant submission bundle at the end. That sounds like a workflow tweak. It isn’t. It changes what a “good” data pipeline has to be. In this world, your pipeline is part transport layer, part evidence chain, part compliance surface. (raps.org)

### What changed?

The immediate shift is on the regulator side. FDA’s existing Real-Time Oncology Review program already let sponsors submit pieces of an application before the full package was done. Now the agency is pushing further into proof-of-concept work where endpoints and safety signals can be gathered and reported in real time, with AstraZeneca and Amgen named in the new effort. (fda.gov)

### Why does this raise the bar for data engineering?

Because batch-era shortcuts stop working. If regulators may look at data while a trial is still running, you need incremental ingestion, stable schemas, lineage, replayability, and quality checks that fire before bad records propagate downstream. A daily or hourly refresh is not enough by itself: the system has to show what changed, when it changed, who touched it […] (fda.gov) […] direction ICH E6(R3) pushes: flexible use of technology, but with quality by design and risk-based oversight baked in. (fda.gov)

### What does “fit for purpose” mean here?

Basically, the pipeline cannot just be modern. It has to be defensible. ICH E6(R3), finalized by ICH in January 2025 and posted by FDA as final guidance in September 2025, says computerized systems used in trials should be fit for purpose and handled with risk-based validation. That is regulator language for a very practical demand: prove the system […] (fda.gov) […] default. (database.ich.org)

### Where does AI fit?

FDA’s January 2025 draft AI guidance is the other half of the story.
It focuses on AI used to generate information for regulatory decision-making and lays out a risk-based credibility assessment tied to context of use. So if you add a clinical decision-support layer on top of the pipeline (risk scoring, signal detection, retrieval over trial documents), the question is […] (database.ich.org) […] with this exact impact on safety, efficacy, or quality decisions. (fda.gov)

### Why do deterministic rules and RAG keep coming up?

Because they solve different trust problems. Rule-based logic gives you stable, inspectable behavior for things like inclusion flags, protocol deviations, or adverse-event triage. Retrieval-augmented generation can help pull the right protocol text, SAP language, or prior g[…] (fda.gov) […] a regulated workflow. Otherwise you get a fast answer with no audit trail. That is the opposite of what FDA and GCP modernization are asking for. This is an inference from the guidance, but it follows directly from the traceability and credibility requirements they emphasize. (fda.gov)

### So what should the pipeline actually do?

In plain English: ingest trial data incrementally, preserve lineage, enforce data quality expectations, separate and govern PII, version transformations, and make every downstream metric reproducible. Databricks and Snowflake can absolutely be part of that. But the architecture only […] (fda.gov) […] clinical-trial platform write-up makes the same point from the vendor side: the hard part is not storing more data, it is managing the lifecycle under regulated constraints. (databricks.com)

### What’s the bottom line?

Near-real-time trials sound like a regulator speed story. They are really a systems-design story. If FDA wants live-ish visibility into trial data, then clinical pipelines have to behave less like analytics back ends and more like regulated production systems: observable, validated, and explainable end to end. (raps.org)
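
To make the pipeline duties described earlier a little more concrete (incremental ingestion, quality checks that fire before bad records propagate, lineage, replayability, versioned transformations), here is a minimal Python sketch. It is an illustration under stated assumptions, not a validated design and not tied to Databricks, Snowflake, or any guidance text: the field names (`subject_id`, `visit`, `systolic_bp`), the plausibility range, the `TRANSFORM_VERSION` tag, and the `Pipeline` and `AuditEntry` classes are all hypothetical. PII separation is left out entirely.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical tag for the current transformation/rule set; bump it on any change.
TRANSFORM_VERSION = "v1"


@dataclass(frozen=True)
class AuditEntry:
    record_id: str
    action: str             # "accepted", "updated", or "rejected"
    actor: str              # who or what submitted the record
    at: str                 # UTC timestamp, ISO 8601
    transform_version: str
    content_hash: str       # SHA-256 of the canonical record
    reason: str = ""


def content_hash(record: dict) -> str:
    """Hash a record independently of key order so replays compare equal."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def quality_errors(record: dict) -> list[str]:
    """Return the failed checks; an empty list means the record passes."""
    errors = [f"missing {key}" for key in ("subject_id", "visit", "systolic_bp")
              if key not in record]
    if "systolic_bp" in record:
        bp = record["systolic_bp"]
        # Illustrative plausibility range, not a clinical standard.
        if not isinstance(bp, (int, float)) or not 50 <= bp <= 300:
            errors.append("systolic_bp out of range")
    return errors


@dataclass
class Pipeline:
    accepted: dict = field(default_factory=dict)   # record_id -> latest good record
    audit_log: list = field(default_factory=list)  # append-only list of AuditEntry

    def _log(self, record_id, action, actor, digest, reason=""):
        self.audit_log.append(AuditEntry(
            record_id, action, actor,
            datetime.now(timezone.utc).isoformat(),
            TRANSFORM_VERSION, digest, reason,
        ))

    def ingest(self, batch: list[dict], actor: str) -> None:
        """Process one incremental batch; bad records never reach `accepted`."""
        for record in batch:
            record_id = f"{record.get('subject_id')}:{record.get('visit')}"
            digest = content_hash(record)
            errors = quality_errors(record)
            if errors:
                self._log(record_id, "rejected", actor, digest, "; ".join(errors))
                continue
            previous = self.accepted.get(record_id)
            if previous is not None and content_hash(previous) == digest:
                continue  # replayed record with identical content: no change
            action = "accepted" if previous is None else "updated"
            self.accepted[record_id] = dict(record)
            self._log(record_id, action, actor, digest)
```

In this sketch, re-sending an unchanged record adds nothing to the log, a corrected value is recorded as `updated` with its own hash and actor, and a record that fails a check is logged as `rejected` without entering the accepted set. That is one way to answer "what changed, when, and who touched it" from the log alone; a real system would also need durable storage, access control, and validation evidence.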
