Data Diversity Stressed for Cell Therapy Models

Industry experts are highlighting the importance of using iPSC-derived cell models from diverse genetic backgrounds to better predict clinical success. An analysis from Cytochroma argues that this approach is critical for building robust biomanufacturing data infrastructure. This push aligns with NIH initiatives encouraging the reuse of genomics and clinical data, suggesting future data systems must be built to handle heterogeneous datasets.

* A significant challenge in scaling up iPSC therapies is the high cost and complexity of defining critical quality attributes and developing reliable potency assays, which are often highly variable and introduced late in development.
* Most genomic data used in genome-wide association studies come from individuals of European ancestry (95.76%), with Asian ancestry at 2.9% and both African and Hispanic/Latin American ancestries below 1%, creating a significant data gap. This lack of diversity can lead to treatments that are less effective for underrepresented populations.
* Digital twins are being used to optimize bioreactor processes by creating dynamic virtual replicas that integrate real-time sensor data with mechanistic and data-driven models. This allows for "what-if" scenario simulations to mitigate production failures and accelerate development by reducing physical experiments.
* The transition from paper to electronic batch records (EBRs) is critical for managing the complexity of cell and gene therapy manufacturing, where a single batch can generate upwards of 3,000 specific data points from donor qualification to final release. Companies like InstantGMP and Sapio Sciences are developing EBR systems to improve data integrity and streamline GMP compliance.
* Machine learning algorithms are being applied to analyze the vast and complex datasets generated during bioprocessing to optimize process parameters like temperature, pH, and nutrient levels, thereby reducing trial-and-error experimentation. These models can simulate different scenarios to identify optimal conditions for cell growth and protein expression.
* The NIH Genomic Data Sharing (GDS) Policy mandates the broad sharing of large-scale human and non-human genomic data from NIH-funded research to accelerate discoveries. This requires investigators to submit data to an appropriate repository like the database of Genotypes and Phenotypes (dbGaP).
* Integrating heterogeneous data from diverse sources, such as electronic health records, claims data, and patient registries, presents significant challenges due to a lack of standardization in data collection, coding systems, and formats. This complexity makes aggregation and reconciliation of data for clinical trials error-prone.
* Genetic variation between individual donors has been shown to have a greater impact on the molecular profile and differentiation capabilities of iPSC lines than the original parental cell type (e.g., fibroblasts vs. blood cells). One study found that donor genetic background was sufficient to cause differences in how iPSC-derived hepatocytes differentiated.
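
To make the data-integration point concrete, here is a minimal Python sketch of mapping records from two differently coded sources onto one shared schema. Everything in it is an assumption made for illustration: the field names (`mrn`, `member_id`, `gender_code`, `weight_lb`), the code table, and the `PatientRecord` type are hypothetical and do not come from any real EHR or claims standard.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup from source-specific sex codes to one shared vocabulary.
SEX_CODES = {
    "m": "male", "male": "male", "1": "male",
    "f": "female", "female": "female", "2": "female",
}
LB_TO_KG = 0.45359237  # exact definition of the international pound


@dataclass(frozen=True)
class PatientRecord:
    """Shared target schema (illustrative only)."""
    patient_id: str
    sex: Optional[str]
    weight_kg: Optional[float]


def normalize_sex(raw) -> Optional[str]:
    """Return the shared code, or None when the source value is unknown."""
    if raw is None:
        return None
    return SEX_CODES.get(str(raw).strip().lower())


def from_ehr(row: dict) -> PatientRecord:
    # Assumed EHR export: sex as "M"/"F", weight already in kilograms.
    kg = row.get("weight_kg")
    return PatientRecord(
        patient_id=str(row["mrn"]),
        sex=normalize_sex(row.get("sex")),
        weight_kg=float(kg) if kg is not None else None,
    )


def from_claims(row: dict) -> PatientRecord:
    # Assumed claims feed: sex coded as 1/2, weight in pounds.
    lb = row.get("weight_lb")
    return PatientRecord(
        patient_id=str(row["member_id"]),
        sex=normalize_sex(row.get("gender_code")),
        weight_kg=round(float(lb) * LB_TO_KG, 1) if lb is not None else None,
    )


records = [
    from_ehr({"mrn": 17, "sex": "F", "weight_kg": 61.5}),
    from_claims({"member_id": "A9", "gender_code": 2, "weight_lb": 150}),
]
```

Unrecognized codes become `None` rather than a guess, so gaps stay visible for later reconciliation instead of being silently filled.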

