Google Releases MedGemma 1.5, Giving Open-Source AI Native 3D Medical Scan Vision

Google’s open-source suite introduces native 3D vision and high-accuracy medical speech-to-text capabilities.

January 14, 2026

Google has dramatically advanced open-source medical artificial intelligence with the release of MedGemma 1.5, an updated multimodal model that can interpret complex, high-dimensional medical imaging data, including three-dimensional Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) scans. The release marks a pivotal moment for developers and researchers: it is the first publicly available open multimodal large language model with native support for full volumetric data analysis, moving beyond the two-dimensional image interpretation of its predecessor. The announcement also included a companion specialized speech-to-text tool, MedASR, which delivers significant accuracy gains on medical dictation tasks, positioning the two models as a foundational, interoperable suite for the next generation of health AI applications.
The most transformative feature of MedGemma 1.5 is its expanded vision capability, which now covers the complex data formats central to modern clinical practice. While MedGemma 1 was restricted to interpreting 2D images such as chest X-rays and dermatological photos, the new 4B-parameter model can process entire 3D CT and MRI volumes, analyzing multiple slices together with a natural language prompt to build a full-context picture of a patient’s internal anatomy.[1][2][3] This is further augmented by support for whole-slide histopathology imaging, which involves stitching together and analyzing numerous patches from a single high-resolution slide.[4][3] Internal benchmarks underscore the significance of the upgrade: accuracy in classifying disease-related MRI findings rose 14 percentage points, from 51 percent to 65 percent, while accuracy on CT disease findings improved from 58 percent to 61 percent.[4][2][3] Beyond static 3D interpretation, MedGemma 1.5 also supports longitudinal analysis, enabling developers to build applications that track changes over time, such as comparing a current chest X-ray against a historical series of scans.[3]
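As a rough illustration of the slices-plus-prompt idea, the sketch below packages a stack of CT slices alongside a single natural-language question in the interleaved image/text chat-message format used by Hugging Face multimodal models. The field names, the placeholder file paths, and the function itself are assumptions for illustration, not a confirmed MedGemma 1.5 API.

```python
# Hypothetical sketch: bundling multiple CT slices with one text query in a
# chat-style message list, as commonly consumed by multimodal processors.
# Everything here (paths, field names, helper) is illustrative only.

def build_volume_prompt(slice_paths, question):
    """Interleave every axial slice with a single natural-language question."""
    content = [{"type": "image", "url": path} for path in slice_paths]
    content.append({"type": "text", "text": question})
    return [{"role": "user", "content": content}]

# Placeholder paths standing in for slices exported from a CT series.
slices = [f"ct_axial_{i:03d}.png" for i in range(3)]
messages = build_volume_prompt(slices, "Describe any acute findings across these slices.")
```

In a real pipeline, a message list like this would typically be handed to a processor's chat-template step before inference; the point of the structure is that the model sees all slices and the question in one turn.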
Complementing the multimodal vision capabilities is the introduction of MedASR, a specialized speech-to-text model designed exclusively for the medical domain. This tool addresses a critical pain point in healthcare—the time-consuming and error-prone process of clinical dictation and note-taking.[2] Trained on specialized healthcare vocabulary, MedASR is designed to integrate seamlessly with MedGemma 1.5, allowing a clinician to dictate a query about a specific 3D scan and have the AI process the speech and the image data simultaneously.[3] The performance metrics provided by Google indicate a substantial leap over generalist models: benchmarked against OpenAI’s Whisper large-v3, MedASR recorded 58 percent fewer errors on chest X-ray dictations and 82 percent fewer on an internal medical dictation benchmark spanning diverse specialties and speakers.[5][2][3][6] For healthcare developers, this level of accuracy translates directly into more reliable and efficient clinical workflows, significantly reducing the Word Error Rate (WER) that can plague transcription in a field where a single misheard term can have serious consequences.[5]
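Word Error Rate, the metric behind these comparisons, is word-level edit distance (substitutions, insertions, and deletions) divided by the number of words in the reference transcript. A minimal, self-contained implementation is below; the example transcripts are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Single-row dynamic program over substitutions, insertions, deletions.
    d = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev = d[0]          # d[i-1][j-1] from the previous row
        d[0] = i
        for j in range(1, len(hyp) + 1):
            cur = d[j]       # d[i-1][j] before overwriting
            d[j] = min(
                d[j] + 1,                          # deletion
                d[j - 1] + 1,                      # insertion
                prev + (ref[i - 1] != hyp[j - 1])  # substitution (or match)
            )
            prev = cur
    return d[-1] / len(ref)
```

For example, transcribing "no acute cardiopulmonary findings" as "no acute cardio pulmonary findings" costs one substitution plus one insertion against a four-word reference, giving a WER of 0.5.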
The models are released as part of the Health AI Developer Foundations (HAI-DEF) program, reinforcing Google’s commitment to an open-source strategy for specialized AI foundational models.[4][2] This open approach is a strategic move, particularly in the highly regulated and sensitive healthcare sector. By making the models available on platforms like Hugging Face and Google Cloud's Vertex AI, the company allows for both research and commercial use.[2] The key differentiator of an open-source foundation model in medicine is the high degree of flexibility and customization it offers. Developers can download the model weights and run the compact 4B parameter version on their own hardware, or within private cloud environments, addressing critical institutional concerns regarding data privacy and security.[2][7] Furthermore, the open nature allows organizations to fine-tune and adapt the models on their own proprietary, localized datasets, which is often essential for achieving high performance in specific clinical use cases or within distinct patient populations.[8][7]
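To illustrate the on-premises point, the sketch below assembles loading options so that inference uses only a local copy of the weights and never contacts a model hub. The keyword names follow Hugging Face transformers' `from_pretrained` conventions, but the helper function and the weights directory are hypothetical, not a documented MedGemma workflow.

```python
# Hedged sketch: options for running open weights entirely on local
# hardware, so patient data never leaves the institution. The helper and
# the directory path are hypothetical; keyword names mirror transformers'
# from_pretrained conventions.

def local_inference_options(weights_dir: str, use_gpu: bool) -> dict:
    """Build loading options for pre-downloaded weights, fully offline."""
    return {
        "pretrained_model_name_or_path": weights_dir,  # local weights copy
        "local_files_only": True,   # never reach out to a hub at load time
        "device_map": "cuda" if use_gpu else "cpu",
        "torch_dtype": "bfloat16" if use_gpu else "float32",
    }

opts = local_inference_options("/models/medgemma-4b", use_gpu=False)
```

The design point is `local_files_only`: once the weights are inside the institution's perimeter, loading can be made to fail rather than silently fetch anything over the network.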
Crucially, the release is accompanied by strict disclaimers regarding clinical deployment, a standard but vital measure for pre-certification medical AI tools.[1] Google explicitly states that MedGemma 1.5 is intended to be used as a "starting point" for development and is "not yet clinical-grade."[8][7] The outputs generated by the models are not intended to directly inform clinical diagnosis, patient management decisions, treatment recommendations, or any other direct clinical practice applications without "appropriate validation, adaptation and/or making meaningful modification by developers for their specific use case."[7] This cautionary stance highlights the regulatory and ethical pathway ahead, emphasizing that the models are tools for developers to accelerate research and product creation, not finished, deployable clinical solutions. Even so, the open availability of a powerful multimodal foundation model is expected to accelerate AI adoption in healthcare, a sector already integrating artificial intelligence at roughly twice the pace of the broader economy.[1][2][3] The MedGemma 1.5 and MedASR release is a significant signal in the competitive race among major AI labs, raising the bar for what is possible in open-access medical AI and creating a powerful new resource for global health innovation.
