Google DeepMind has developed and released Enformer 2, a next-generation AI model that analyses DNA sequences to predict disease risks with far greater precision than previous methods. By learning long-range interactions in the genome, the model identifies subtle regulatory patterns linked to conditions such as cancer, heart disease, diabetes, and rare genetic disorders, potentially enabling earlier risk stratification, personalised prevention strategies, and more targeted research.
Glimpse:
Announced on January 26, 2026, Enformer 2 builds on DeepMind’s original Enformer model by incorporating larger context windows, multimodal training on human and model organism data, and improved interpretability. In benchmark tests, it outperformed state-of-the-art methods in predicting gene expression from DNA sequence and linking non-coding variants to clinical phenotypes. The open-source release includes pre-trained weights, inference code, and tools for researchers to apply the model to their own datasets, accelerating discovery in human genetics and drug target validation.
Google DeepMind has introduced Enformer 2, a significantly enhanced AI model that reads raw DNA sequences to predict how genetic variations influence gene regulation and disease risk. The announcement, made on January 26, 2026, follows years of iterative development since the original Enformer model debuted in 2021. The new model is one of the most powerful publicly available tools for interpreting non-coding DNA, the “dark matter” of the genome that controls when and where genes are expressed.
Enformer 2 dramatically expands the model’s context window, allowing it to consider interactions across much longer genomic distances (up to 1 megabase), and incorporates multimodal training on diverse datasets, including human cell lines, mouse models, and primate genomes. This enables the model to capture long-range regulatory effects, enhancer-promoter loops, and tissue-specific patterns that previous methods often missed. In rigorous benchmarks, Enformer 2 achieved state-of-the-art performance in predicting gene expression levels from sequence alone and in linking non-coding variants to clinical traits, outperforming polygenic risk scores and other sequence-based predictors in many cases.
The model’s ability to “read” DNA has immediate implications for precision medicine. It can prioritise variants of unknown significance (VUS) in rare disease diagnostics, identify causal non-coding mutations in cancer, uncover regulatory drivers of common diseases like type 2 diabetes and coronary artery disease, and support in silico screening of therapeutic targets. By making the model fully open-source, with pre-trained weights, inference code, and detailed documentation, DeepMind has made it possible for researchers worldwide to apply Enformer 2 to their own cohorts and integrate it into clinical pipelines.
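For readers curious how sequence models are typically used to prioritise variants, the general idea can be sketched in a few lines of Python: score the reference sequence, score the same sequence with the variant applied, and compare the two. This is a minimal illustration only. It does not use the Enformer 2 code or API, and the names `one_hot`, `apply_variant`, `variant_effect`, and `toy_predict` are hypothetical; `toy_predict` is a trivial stand-in (GC fraction) where a real model's prediction function would go.

```python
from typing import Callable, List

BASES = "ACGT"
Encoded = List[List[int]]


def one_hot(seq: str) -> Encoded:
    """Encode DNA as rows of four indicators (A, C, G, T); other symbols become all zeros."""
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def apply_variant(seq: str, pos: int, ref: str, alt: str) -> str:
    """Return seq with the single base at 0-based position pos changed from ref to alt."""
    if seq[pos].upper() != ref.upper():
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos]}")
    return seq[:pos] + alt + seq[pos + 1:]


def variant_effect(
    seq: str, pos: int, ref: str, alt: str,
    predict: Callable[[Encoded], float],
) -> float:
    """Score a variant as prediction(alt sequence) minus prediction(ref sequence)."""
    ref_score = predict(one_hot(seq))
    alt_score = predict(one_hot(apply_variant(seq, pos, ref, alt)))
    return alt_score - ref_score


def toy_predict(encoded: Encoded) -> float:
    """Placeholder, NOT Enformer 2: returns the fraction of G/C bases in the input."""
    return sum(row[1] + row[2] for row in encoded) / len(encoded)


if __name__ == "__main__":
    sequence = "ACGTACGTAC"
    # Hypothetical A>G substitution at the first base of the window.
    print(variant_effect(sequence, 0, "A", "G", toy_predict))
```

In practice the placeholder would be replaced by a trained model's prediction for a chosen tissue or assay, and variants with the largest score differences would be reviewed first.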
Demis Hassabis, CEO of Google DeepMind, emphasised the scientific and societal impact: “Most disease risk is hidden in the non-coding genome. Enformer 2 gives us a much clearer lens to see how DNA sequence influences biology and disease. By open-sourcing it, we hope to accelerate discoveries that lead to better prevention, earlier diagnosis, and more effective treatments for millions of people.”
The release has already generated excitement in the global genomics community. Early adopters report improved performance in fine-mapping GWAS loci, prioritising variants in rare disease exome sequencing, and identifying regulatory mechanisms in complex traits. The model’s interpretability features, which highlight the sequence motifs and long-range interactions that drive predictions, also make it valuable for mechanistic research and drug target validation.
While Enformer 2 is a research tool rather than a clinical product, its open availability is expected to influence commercial diagnostics, polygenic risk scoring services, and pharmaceutical R&D pipelines in the coming years. DeepMind has committed to ongoing updates and community-driven fine-tuning to keep the model aligned with emerging genomic datasets.
“The non-coding genome is where most disease risk lives. Enformer 2 lets us read that hidden language at scale, turning DNA sequences into precise predictions about health and disease.”
By HB Team
