Landscape: 380 biology AI models
A thread mapping biology AI models counts about 380 active models today, up from fewer than ten in 2015, and reports that roughly 63% of those models are trained on the same databases. The thread also notes roughly $18 billion of funding flowing into proprietary data, autonomous agents, and robotic labs, and highlights pharma deals like GSK’s $50M oncology investment and Novo Nordisk’s partnership with OpenAI. (x.com)
Biology artificial intelligence has gone from a niche research tool to a crowded market: one recent mapping counted more than 1,100 biological artificial intelligence models across nine categories. (epoch.ai) Those systems range from general biological language models, which learn patterns in DNA, RNA, or protein sequences the way chatbots learn patterns in text, to specialized tools for structure prediction, mutation effects, and drug design. Epoch AI said its earlier biology dataset covered 360-plus models in January 2025 before expanding the scope in a February 2026 report. (epoch.ai 1) (epoch.ai 2)
The data behind those models is concentrated. Bessemer Venture Partners, citing Epoch AI, said nearly 63% of biology models are trained on protein sequences and structures from UniProt and the Protein Data Bank, two of the field’s standard public repositories. (bvp.com)
That concentration has pushed companies toward proprietary datasets and automated labs. Bessemer said about $18 billion was raised by roughly 200 companies using artificial intelligence for drug discovery between 2012 and 2022, with newer spending aimed at data generation, autonomous agents, and robotics that can run experiments faster. (bvp.com)
Drugmakers are now buying access to those systems. GSK said in January 2026 that it licensed Noetik’s virtual-cell foundation models for non-small cell lung cancer and colorectal cancer in a deal worth $50 million in upfront capital and near-term milestones, plus annual license fees. (businesswire.com) (fiercebiotech.com)
Novo Nordisk made a broader move on April 14, 2026, announcing a partnership with OpenAI to use artificial intelligence from early drug discovery through manufacturing and commercial operations. Novo Nordisk said the agreement includes strict data governance and human oversight. (biospace.com) (cnbc.com)
The pitch is speed and scale. Novo Nordisk said the partnership will help analyze complex datasets and identify promising drug candidates faster, while OpenAI has separately argued in its healthcare work that shared benchmarks are still needed to measure model performance and safety. (biospace.com) (openai.com)
Researchers are also tracking a second effect: concentration in the science itself. A January 2026 Nature study found that artificial intelligence tools can raise individual scientists’ output while narrowing the range of topics studied and pulling work toward areas with the richest data. (nature.com)
That leaves biology artificial intelligence with a familiar split. More models are being built, more money is chasing proprietary data, and the winners may depend less on who has another model than on who controls better experiments, cleaner datasets, and the labs that can produce them. (epoch.ai) (bvp.com)