600+ immune proteins found
- An MIT tool called DefensePredictor reportedly identified over 600 previously unknown immune‑related proteins in E. coli in minutes. (x.com/i/status/2047072530070933697)
- The social post credited the system with rapid discovery that would have taken far longer by traditional methods. (x.com/i/status/2047072530070933697)
- Fast, AI‑driven protein discovery could speed basic research, but reproducibility and peer review remain crucial next steps. (x.com/i/status/2047072530070933697)
Bacteria fight viruses with proteins, much as animals use immune molecules, and a new MIT model called DefensePredictor flagged 624 likely defense proteins in 69 *Escherichia coli* strains. (science.org) The system uses a protein language model called Evolutionary Scale Model 2, or ESM2, to read amino-acid sequences the way a text model reads words, then adds each protein’s four nearest genomic neighbors to judge whether it is part of an anti-phage defense system. (science.org)

Older searches often looked for “defense islands,” or clusters of known immune genes packed together on a genome. The Science paper says that strategy misses systems scattered elsewhere, including genes carried on plasmids, prophages, and transposons. (science.org)

When the MIT team ran DefensePredictor on those 69 *E. coli* strains, it called 624 proteins defense-related with high confidence, and more than 100 had no detectable homology to known defense proteins. Nearly half were outside the usual genomic hotspots. (science.org)

The researchers then moved from prediction to lab tests. They cloned 94 predicted systems into a susceptible *E. coli* strain and found that 42 protected against at least one of 24 phages, with 15 protein domains not previously validated as defensive. (science.org)

The paper was published in *Science* in April 2026, after first appearing as a bioRxiv preprint in January 2025. The preprint reported 45 validated systems and more than 750 additional high-confidence proteins, numbers that were tightened in the journal version to 42 validated systems and 624 high-confidence proteins. (biorxiv.org, science.org)

That revision matters because the social-media version of the story can blur preprint and peer-reviewed results. The peer-reviewed paper is the current record, and it shows a large screen followed by selective experimental confirmation rather than proof that every predicted protein is defensive. (biorxiv.org, science.org)

The wider search went beyond *E. coli*. Applied to 1,000 diverse prokaryotic genomes, DefensePredictor identified more than 5,000 predicted defense proteins that were not clear homologs of known defense proteins. (science.org)

Researchers care about bacterial virus defenses because earlier finds in the field produced tools such as CRISPR-Cas9 and restriction enzymes. A Nature news article published April 2, 2026 said two Science papers together described hundreds of thousands of candidate antiviral proteins across bacteria. (nature.com)

MIT’s team has also released DefensePredictor as open-source software, with a public GitHub repository and a PyPI package updated on April 20, 2026. The next test is whether other labs can reproduce the hit rate and turn some of these newly flagged proteins into well-characterized systems rather than just promising predictions. (github.com, pypi.org)
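
Since the open question is whether other labs can reproduce the hit rate, it helps to put a number on it. The sketch below is not taken from the paper or from the DefensePredictor software. It uses only the counts reported above (42 protective systems out of 94 cloned). As an assumption made purely for illustration, it treats the 94 tests as independent trials and applies a Wilson score interval to show how much uncertainty a sample of that size carries. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials**2)) / denom
    return center - half, center + half


# Counts reported for the journal version of the study.
validated, tested = 42, 94

hit_rate = validated / tested
# Assumption: each cloned system is treated as an independent trial.
low, high = wilson_interval(validated, tested)

print(f"hit rate: {hit_rate:.1%}")
print(f"approx. 95% interval: {low:.1%} to {high:.1%}")
```

On these counts the hit rate comes to roughly 45 percent, with an interval of about ten percentage points on either side. A replication that landed somewhat above or below that figure would therefore not necessarily contradict the original screen.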