Open-Source AI Tackles Genomics

The latest version of the open-source AI model Evo was trained on trillions of DNA bases drawn from all three domains of life. The model can predict the effects of mutations and generate plausible new gene sequences, an advance that could accelerate research in synthetic biology and drug discovery.

The project is a collaboration between the Arc Institute, NVIDIA, and researchers from Stanford University, UC Berkeley, and UC San Francisco. Their goal is to apply large-scale AI to biological sequences much as large language models are applied to human text. The latest version, Evo 2, was trained on a dataset of 9.3 trillion nucleotides from over 128,000 genomes, covering all three domains of life: bacteria, archaea, and eukaryotes (including humans and plants). Its predecessor, Evo 1, was trained exclusively on the genomes of single-celled organisms.

The model's StripedHyena 2 architecture allows it to process DNA sequences of up to 1 million base pairs at once. This long context window lets the model capture relationships between distant parts of a genome. Training took several months and used over 2,000 NVIDIA H100 GPUs.

In a demonstration of its predictive power, Evo 2 achieved over 90% accuracy in distinguishing benign from potentially pathogenic mutations in BRCA1, a gene associated with breast cancer. The model can identify these variants without any training targeted at that gene. Beyond prediction, Arc researchers have used the model to generate functional synthetic bacteriophages, which could have applications in treating antibiotic-resistant bacteria. As a proof of concept, an earlier version of Evo generated a novel, functional CRISPR-Cas system.

The entire project, including the model weights, training data, and code, has been released as open source. It is also integrated into NVIDIA's BioNeMo framework, a platform designed to accelerate AI-driven biological research.
