DrugReflector model reports 17x hit-finding improvement

- Cellarity researchers’ DrugReflector model, highlighted again in social posts on May 23, uses transcriptomic signatures to rank compounds for phenotypic screening. (science.org)
- The key benchmark is a 13-fold to 17-fold increase in hit rate versus random compound selection across two hematologic discovery campaigns. (cellarity.com)
- The code repository and model checkpoints are already public on GitHub and Zenodo for researchers who want to reproduce the ranking workflow. (github.com)

Cellarity’s DrugReflector model is drawing renewed attention after social posts on May 23 pointed to benchmark results showing a roughly 17-fold improvement in hit finding during compound screening. The model is not a molecule generator and not a conventional target-based screener. (science.org) It is a ranking system that takes a gene-expression signature (a transcriptomic description of the cell-state change a researcher wants) and scores compounds by how likely they are to induce that shift. (cellarity.com)

The underlying study was published in *Science* on October 23, 2025, by Benjamin DeMeo, Charlotte Nesbitt, Samuel A. Miller and co-authors, including researchers at Cellarity, MIT and affiliated institutions. (github.com) In that paper, the authors described DrugReflector as a deep-learning model trained on Connectivity Map perturbation data covering 9,597 perturbations across 52 cell lines.

### What exactly is the “17x” claim measuring?

The 17x figure refers to hit rate, not to overall drug-development speed or clinical success. In the paper’s reported hematologic discovery campaigns, the authors said DrugReflector produced a 13-fold to 17-fold increase in hit rate compared with random compound selection. (science.org)

MIT’s news clip summarizing the work described the result as “up to 17 times more effective at finding relevant compounds” than brute-force screening based on random picks from a chemical library. That wording tracks with the paper’s benchmark framing and helps explain why the number has circulated in recent social posts. (science.org)

### How does DrugReflector actually work?

DrugReflector starts with transcriptomic input. The *Science* summary says the model ranks molecules by their likelihood of inducing a user-defined change in gene expression, using signatures derived from single-cell atlases to prioritize chemical interventions.
(cellarity.com) The public GitHub repository describes the package as “compound ranking predictions from gene expression signatures using ensemble neural network models.” The repository also says the package includes signature-refinement tools, v-score computation utilities and data-processing functions for virtual drug screening with transcriptional signatures. (news.mit.edu)

Zenodo’s model description says DrugReflector is an ensemble of three multilayer perceptron classifiers trained on Connectivity Map transcriptional signatures to predict compound classes from transcriptional signatures. In practical terms, the system is built to sort large compound libraries so lab teams can test a narrower, higher-priority set first. (science.org)

### Why are researchers pairing it with active learning?

The paper did not stop at one-pass ranking. The authors said they added an active-learning loop that uses paired transcriptional and phenotypic readouts to refine target signatures and improve hit identification. (github.com)

The Shalek Lab summary of the work said that closed-loop refinement in a megakaryocyte screen doubled the hit rate in new screens after researchers fed back data from 12 hit and 8 non-hit compounds. That matters because it frames DrugReflector as a lab-in-the-loop workflow, not just a static model. (zenodo.org)

### Where was it tested beyond the headline benchmark?

The *Science* paper says the team ran two hematopoietic campaigns and also deployed DrugReflector in two oncology indications. In those external cancer datasets, the authors reported recovery of clinical standards of care and modulators of known indication-specific pathways. (science.org)

The same study also introduced a perturbational single-cell RNA-sequencing dataset with 1.2 million cells spanning 88 perturbations across 10 primary and cancer cell lines.
The authors said those data, together with held-out CMap and SciPlex signatures, showed the model could prioritize compounds outside its training context. (shaleklab.com)

### Can outside researchers inspect or run it now?

The code is already public. Cellarity’s GitHub repository provides the software package, and the README links to Zenodo-hosted model checkpoints and transition-signature files used to reproduce the published workflow. (science.org)

A Hugging Face Space labeled DrugReflector API also offers a research interface that accepts either an `.h5ad` matrix or a prepared gene-score signature and returns ranked candidate compounds.

The next concrete step for outside groups is replication: testing whether the reported hit-rate gains hold in their own assay systems, cell contexts and screening libraries. (science.org) (huggingface.co) (github.com)
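
For groups attempting that kind of replication, the arithmetic behind a fold-improvement figure is simple: divide the hit rate among model-prioritized compounds by the hit rate among randomly chosen ones. The sketch below is illustrative only and is not taken from the DrugReflector package; the function names `hit_rate` and `fold_enrichment` are hypothetical, and the screening counts are made up to show the calculation, not drawn from the paper.

```python
def hit_rate(hits: int, tested: int) -> float:
    """Fraction of tested compounds that scored as hits in the assay."""
    if tested <= 0:
        raise ValueError("tested must be a positive number of compounds")
    if not 0 <= hits <= tested:
        raise ValueError("hits must be between 0 and tested")
    return hits / tested


def fold_enrichment(ranked_hits: int, ranked_tested: int,
                    random_hits: int, random_tested: int) -> float:
    """Hit rate of model-prioritized picks divided by the random-pick hit rate."""
    baseline = hit_rate(random_hits, random_tested)
    if baseline == 0:
        raise ValueError("random baseline has no hits; fold change is undefined")
    return hit_rate(ranked_hits, ranked_tested) / baseline


# Hypothetical counts, chosen only to illustrate the arithmetic:
# 17 hits among 100 prioritized compounds vs. 10 hits among 1,000 random ones.
fold = fold_enrichment(ranked_hits=17, ranked_tested=100,
                       random_hits=10, random_tested=1000)
print(f"fold improvement in hit rate: {fold:.1f}x")
```

Because the result is a ratio of two proportions, it is sensitive to a small random-arm baseline: a handful of extra or missing baseline hits can move the fold figure substantially, so a replication would want to report the underlying counts alongside the ratio.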
