As genomic sequencing becomes routine in oncology, clinicians are increasingly inundated with tumor mutation data—but often without clear guidance on how to use it. A new artificial intelligence foundation model called MutationProjector may help bridge that gap by transforming complex tumor mutation profiles into clinically meaningful representations that can predict treatment response, classify cancer subtypes, and potentially guide therapeutic decisions across tumor types.
The study, led by researchers at the University of California, San Diego (UCSD) and published in Cancer Discovery, describes how the model was trained on more than 30,000 tumor genomes spanning 10 solid tumor types and then applied to a range of downstream clinical tasks, including prediction of immunotherapy and chemotherapy response.
“Right now, increasingly physicians will order gene sequencing panels, and in a couple of years it’ll be the whole genome,” said senior author Trey Ideker, PhD, of UCSD. “What’s interesting is that as prolific as that information is becoming in the cancer clinic, relatively few treatment decisions are actually made based on it.”
According to the paper, sequencing findings currently match only about 8% of patients to an FDA-approved targeted therapy. Yet the average tumor contains roughly 11 distinct genomic alterations identified through clinical sequencing. “The mutations are there,” Ideker said. “People just don’t know what to do with them, or whether they’re informative.”
MutationProjector was designed to address that challenge with architecture inspired by large-scale AI systems used in natural language processing and computer vision. Rather than focusing on single-gene biomarkers, the model attempts to capture broader mutational patterns associated with hallmark cancer pathways.
“We’ve known for years that cancer is really a disease of pathways,” Ideker said. “If you could move beyond individual gene biomarkers and instead understand the overall state of the tumor from its mutations, then you could potentially treat that tumor better.”
The model translates tumor genomic information into a compact coordinate-based representation—similar to a UMAP embedding—that places tumors with similar molecular features near one another. Ideker described the output as an “XY coordinate system for a tumor” that integrates information from thousands of mutations simultaneously.
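To make the idea of a coordinate representation concrete, here is a toy sketch and not the study's method. It assumes two illustrative gene groupings (the `PATHWAYS` table, the `embed` function, and the example tumors are all hypothetical) and maps each tumor to an (x, y) point given by the fraction of each grouping that is altered. Tumors hit in the same pathways land at the same point even when the specific genes differ.

```python
import math

# Hypothetical, illustrative gene groupings; these are assumptions for the
# example and are not taken from the study.
PATHWAYS = {
    "RAS_MAPK": {"KRAS", "NRAS", "BRAF"},
    "CHROMATIN": {"SMARCA4", "ARID1A", "KMT2A"},
}

def embed(mutated_genes):
    """Map a set of mutated genes to (x, y): the fraction of each grouping altered."""
    genes = set(mutated_genes)
    return tuple(len(genes & members) / len(members) for members in PATHWAYS.values())

def distance(tumor_1, tumor_2):
    """Euclidean distance between two tumors in the toy coordinate system."""
    return math.dist(embed(tumor_1), embed(tumor_2))

# Invented example tumors.
tumor_a = {"KRAS", "SMARCA4"}
tumor_b = {"NRAS", "ARID1A"}  # different genes, same groupings affected
tumor_c = {"TP53"}            # no gene from either grouping altered

print(embed(tumor_a), embed(tumor_b), embed(tumor_c))
print(distance(tumor_a, tumor_b), distance(tumor_a, tumor_c))
```

A learned model would derive its coordinates from data rather than from a hand-written table, but the principle that nearby points indicate similar molecular states is the same.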
The approach relies heavily on pretraining, a hallmark of modern foundation models. MutationProjector first learned generalizable representations from large collections of tumor genomic data, even when detailed clinical outcome information was unavailable. Researchers then fine-tuned the model using smaller, highly annotated datasets tied to specific therapeutic responses.
Ideker compared the strategy to language learning. “If I want to learn Italian but I only have one book in Italian, I’m going to learn a lot first from English and Spanish, where I have much more data,” he said. “Then I can transfer that knowledge to Italian much more effectively.”
Similarly, the model learned from tens of thousands of tumor genomes before being tuned to predict outcomes such as immunotherapy response using comparatively small patient cohorts.
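The two-stage recipe can be sketched in miniature. This is a simplified stand-in under stated assumptions, not the published architecture: "pretraining" here only learns a rarity weight per gene from unlabeled profiles, and the supervised step fits a nearest-centroid classifier on top of that frozen encoding instead of fine-tuning a neural network. The function names, gene labels (`G1` to `G4`), and data are all invented for illustration.

```python
import math
from collections import Counter

def pretrain(unlabeled_tumors):
    """Unsupervised step: learn a rarity weight per gene from unlabeled profiles."""
    counts = Counter(gene for tumor in unlabeled_tumors for gene in tumor)
    n = len(unlabeled_tumors)
    weights = {gene: math.log(n / c) + 1.0 for gene, c in counts.items()}
    vocab = sorted(weights)

    def encode(tumor):
        # Genes never seen during pretraining are ignored.
        return [weights[gene] if gene in tumor else 0.0 for gene in vocab]

    return encode

def fit_head(encode, labeled_tumors):
    """Supervised step: one centroid per label in the frozen encoded space."""
    grouped = {}
    for tumor, label in labeled_tumors:
        grouped.setdefault(label, []).append(encode(tumor))
    centroids = {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }

    def predict(tumor):
        vector = encode(tumor)
        return min(centroids, key=lambda label: math.dist(vector, centroids[label]))

    return predict

# Synthetic placeholder data: a larger unlabeled set and a tiny labeled set.
unlabeled = [{"G1"}, {"G1", "G2"}, {"G1", "G3"}, {"G1", "G2", "G3"}, {"G1"}, {"G4"}]
labeled = [({"G1", "G4"}, "responder"), ({"G1", "G2"}, "non_responder")]

encode = pretrain(unlabeled)
predict = fit_head(encode, labeled)
print(predict({"G4"}), predict({"G2", "G3"}))
```

The point of the split is that the encoder can draw on every available profile, while the scarce outcome labels are spent only on the small final step.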
The researchers reported that the pretrained embeddings achieved strong performance across multiple downstream tasks, including immunotherapy response prediction, chemotherapy response prediction, and metastasis classification. In one analysis, the model predicted immunotherapy response after fine-tuning on only 94 samples, yet still outperformed competing approaches across independent cohorts.
Importantly, the same pretrained model was applied across multiple tasks rather than building separate models for each application. “These findings reflect the current movement toward general-purpose AI systems capable of addressing a broad range of challenges,” the authors wrote.
The study also highlighted the model’s interpretability. By examining how tumors clustered within the embedding space, researchers identified biologically meaningful patterns associated with treatment response and tumor subtype.
For example, MutationProjector distinguished HPV-positive from HPV-negative tumors in cervical and head and neck cancers, partly through pathway signatures involving apoptosis regulation and Wnt signaling. The model also differentiated basal and luminal subtypes in bladder and breast cancers.
Perhaps more significantly for clinical oncology, the system identified both individual and combinatorial biomarkers linked to immunotherapy outcomes.
Among the findings were associations involving KMT2A and SMARCA4 alterations, which may influence immune response through chromatin remodeling and DNA methylation pathways. The model also captured co-alteration patterns such as KRAS-STK11 and STK11-KEAP1, combinations previously associated with resistance to immunotherapy.
“Those mutations individually may not tell you very much,” Ideker said. “But together, they can become highly predictive of treatment response.”
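A small synthetic example shows how a pair of alterations can carry signal that neither carries alone. The cohort below is invented purely for illustration (placeholder names `GENE_A` and `GENE_B`, a hypothetical `response_rate` helper) and does not reproduce any result from the study.

```python
def response_rate(cohort, required_genes):
    """Fraction of responders among tumors carrying every gene in required_genes."""
    carriers = [responded for genes, responded in cohort if required_genes <= genes]
    return sum(carriers) / len(carriers) if carriers else None

# Synthetic cohort invented for illustration: (mutated genes, responded?)
cohort = [
    ({"GENE_A"}, True),
    ({"GENE_A"}, True),
    ({"GENE_A", "GENE_B"}, False),
    ({"GENE_A", "GENE_B"}, False),
    ({"GENE_B"}, True),
    ({"GENE_B"}, True),
    (set(), True),
    (set(), False),
]

print("GENE_A carriers:", response_rate(cohort, {"GENE_A"}))
print("GENE_B carriers:", response_rate(cohort, {"GENE_B"}))
print("Co-altered:", response_rate(cohort, {"GENE_A", "GENE_B"}))
```

In this toy cohort each gene taken alone looks uninformative, because half of its carriers respond, while the co-altered tumors never respond. That is the kind of pattern a single-gene biomarker framework would miss.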
The authors argue that this ability to model combinatorial mutation patterns could help address one of precision oncology’s central limitations: the difficulty of interpreting rare or co-occurring mutations that do not fit established single-gene biomarker frameworks.
Although the current study focused on 10 solid tumor types, the researchers see considerable room for expansion. Ideker said the team has already scaled training datasets from 30,000 to roughly 300,000 tumor genomes while also increasing the size of the neural network itself.
“We’re now scaling this idea substantially,” he said. “The original work was really a demonstration that the concept works.”
Future directions include expanding into additional tumor types such as pancreatic cancer, prostate cancer, and sarcoma, integrating multimodal data sources including transcriptomics and radiology, and potentially applying the framework to liquid biopsy analysis.
Ideker also envisions the technology eventually supporting molecular tumor boards, where oncologists currently debate difficult treatment decisions based on limited genomic evidence.
“We want to get models like this into those discussions to assist in decision-making,” he said.
The long-term goal, he added, is not simply prediction, but deeper biological understanding. “Once you can interpret the mutational patterns that place tumors into different regions of the map, you start learning the pathways of drug resistance,” Ideker said. “And understanding those pathways is ultimately what drives the next generation of cancer therapeutics.”
The post AI Foundation Model Maps Tumor Mutations to Predict Cancer Therapy Response appeared first on Inside Precision Medicine.


