AI Models Predict Viral Host Range from Protein Data

A new study demonstrates that protein language models (PLMs) can accurately predict the host range of viral proteins. This AI-driven approach enables more efficient engineering of viral vectors for gene therapy, potentially accelerating development by tailoring vectors to specific therapeutic targets.

- Fine-tuning existing protein language models on specific viral protein datasets significantly improves their predictive performance, overcoming biases against underrepresented viral proteins, often called the "dark matter" of biology. This parameter-efficient approach allows for faster training on standard GPUs, making advanced AI tools more accessible for academic labs.
- A primary manufacturing challenge is the trade-off between adherent cell cultures, which are difficult to scale, and suspension-based systems that historically produce lower cell densities. Innovations in upstream processing, such as high-density perfusion cultures and advanced transfection reagents, are critical for improving viral vector productivity and quality to meet clinical and market demands.
- The lack of standardized assays and data management tools across the cell and gene therapy industry creates significant inefficiencies, especially as production volumes increase. This challenge is compounded by the complexity of the therapies themselves and the need to implement Quality by Design (QbD) principles for improved consistency and regulatory compliance.
- Contract Development and Manufacturing Organizations (CDMOs) are crucial for about 90% of biotech companies in the cell and gene therapy space, providing specialized expertise in process development, scalable GMP manufacturing, and regulatory support. The global cell and gene therapy CDMO market is projected to grow significantly, with a compound annual growth rate of 17.5% expected between 2023 and 2029.
- AI is being applied to optimize various stages of manufacturing, including predicting which cellular structures will thrive in 3D models and improving downstream processing. AI-powered predictive models are also used to enhance both upstream and downstream processing in vector production, which has led to higher yields and reduced costs.
- Integrating multi-omics data (genomics, proteomics, etc.) through AI provides a comprehensive view of disease biology, enabling the design of more tailored and effective therapies. This approach helps in identifying subtle patterns in patient data to prioritize gene therapy targets more effectively.
- While biotech venture capital funding saw a peak in 2021, investment in cell and gene therapy has since seen a significant downturn due to the high capital requirements for manufacturing and long development timelines. Investors are now more selective, favoring de-risked assets with validated targets and clear regulatory paths.
- Adeno-associated virus (AAV) vectors are the leading platform for in vivo gene delivery, with seven AAV-mediated therapies approved by the FDA as of late 2024 and over 300 candidates in clinical trials. Despite their advantages, challenges remain in addressing potential immunogenicity and optimizing the size of the transgene expression cassette.
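
To put the CDMO growth figure above in perspective, the arithmetic behind a compound annual growth rate can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the briefing gives no base market size, and treating 2023 to 2029 as six compounding periods is an assumption. Under that assumption, a 17.5% rate implies the market would end the period at roughly 2.6 times its starting size.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Total growth factor implied by compounding `cagr` once a year for `years` years."""
    return (1.0 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Inverse calculation: the annual rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Assumption: 2023 -> 2029 is counted as six compounding periods.
    years = 2029 - 2023
    multiple = growth_multiple(0.175, years)
    print(f"Growth multiple over {years} years at 17.5%: {multiple:.2f}x")
```

The inverse helper, `implied_cagr`, is included so that a reader with actual start and end market sizes could check whether a quoted rate is consistent with them.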
