Google has introduced MedGemma 1.5, a multimodal medical AI model family built on the Gemma architecture, alongside MedASR, a specialised automatic speech recognition system optimised for clinical conversations. The release significantly broadens access to high-performance, open-weight healthcare AI tools that support tasks ranging from radiology report generation and clinical summarisation to voice-enabled documentation and multilingual medical transcription.
Glimpse:
Announced on January 21, 2026, MedGemma 1.5 comes in 4B and 27B parameter variants (text-only and vision-language versions), achieving state-of-the-art or near state-of-the-art performance on 20+ medical benchmarks while remaining open for research and commercial fine-tuning. MedASR, a 600M-parameter model, delivers high-accuracy transcription of doctor-patient dialogues, including medical terminology, abbreviations, and noisy environments. Both models are now available via Hugging Face and Google Cloud, with MedGemma also offered through Google AI Studio for rapid prototyping.
Google has significantly expanded its portfolio of specialised healthcare AI models with the release of MedGemma 1.5 and MedASR, announced on January 21, 2026. The new models build directly on the lightweight, open-weight Gemma family and are designed to empower developers, researchers, healthcare institutions, and technology partners to build advanced medical applications with greater speed and lower cost than proprietary closed systems.
MedGemma 1.5 is available in two core sizes, 4 billion and 27 billion parameters, and in both text-only and vision-language configurations. The vision-language variant can interpret medical images (X-rays, CT, MRI, pathology slides, dermatology photos) alongside clinical text, enabling tasks such as automated report generation, visual question answering, and multimodal clinical reasoning. Early benchmarks show MedGemma 1.5 matching or surpassing larger closed models on many medical evaluation sets, including MedQA, PubMedQA, SLAKE, PathVQA, and multimodal radiology challenges, while using far fewer parameters and compute resources.
MedASR, a dedicated 600-million-parameter automatic speech recognition model, has been optimised specifically for clinical dialogue. It handles medical terminology, abbreviations, acronyms, drug names, and noisy real-world environments (clinics, emergency rooms, telehealth calls) with significantly higher accuracy than general-purpose ASR models. The model supports English and is being fine-tuned for additional languages commonly used in global healthcare settings.
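For readers who want to check transcription accuracy claims on their own recordings, ASR systems are commonly compared using word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of reference words. The announcement as summarised here does not state which metric or test data Google used for MedASR, so the sketch below is only an illustration of the general measure. The function name and the sample sentences are hypothetical and are not taken from Google's materials.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a misspelled drug name and a dropped word.
reference = "patient was started on metoprolol 25 mg twice daily"
hypothesis = "patient was started on metropolol 25 mg daily"
print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

In this made-up example the hypothesis contains one substitution (the drug name) and one deletion ("twice") against nine reference words. A single figure like this also hides which words were wrong: a mistranscribed drug name or dose matters far more clinically than a dropped filler word, which is why the handling of medical terminology is singled out above.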
Both models are released under permissive open-weight licences that allow commercial use, fine-tuning, and the creation of derivative works. They are immediately available for download on Hugging Face and can be run on Google Cloud, local hardware, or through Google AI Studio for rapid experimentation. Google has also published detailed technical reports and evaluation results, including comparisons against closed-source medical LLMs, to support transparency and independent verification.
The launch reflects Google’s continued investment in open, developer-friendly medical AI tools following earlier releases such as Med-PaLM, Med-Gemini, and MedLM. By offering high-performing, lightweight models under open licences, Google aims to democratise access to advanced healthcare AI capabilities, particularly for academic researchers, startups, and institutions in resource-constrained settings.
“MedGemma 1.5 and MedASR are designed to be practical, performant building blocks that developers and healthcare organisations can adapt quickly and responsibly to real clinical needs.”
By HB Team

