

A newly complete database of human protein kinases and their preferred binding sites provides a powerful new platform to investigate cell signaling pathways.
Culminating 25 years of research, MIT, Harvard University, and Yale University scientists and collaborators have unveiled a comprehensive atlas of human tyrosine kinases — enzymes that regulate a wide variety of cellular activities — and their binding sites.
The addition of tyrosine kinases to a previously published dataset from the same group now completes a free, publicly available atlas of all human kinases and their specific binding sites on proteins, which together orchestrate fundamental cell processes such as growth, cell division, and metabolism.
Now, researchers can use data from mass spectrometry, a common laboratory technique, to identify the kinases involved in normal and dysregulated cell signaling in human tissue, such as during inflammation or cancer progression.
“I am most excited about being able to apply this to individual patients’ tumors and learn about the signaling states of cancer and heterogeneity of that signaling,” says Michael Yaffe, who is the David H. Koch Professor of Science at MIT, the director of the MIT Center for Precision Cancer Medicine, a member of MIT’s Koch Institute for Integrative Cancer Research, and a senior author of the new study. “This could reveal new druggable targets or novel combination therapies.”
The study, published in Nature, is the product of a long-standing collaboration with senior authors Lewis Cantley at Harvard Medical School and Dana-Farber Cancer Institute, Benjamin Turk at Yale School of Medicine, and Jared Johnson at Weill Cornell Medical College.
The paper’s lead authors are Tomer Yaron-Barir at Columbia University Irving Medical Center, and MIT’s Brian Joughin, with contributions from Konstantin Krismer, Mina Takegami, and Pau Creixell.
Kinase kingdom
Human cells are governed by a network of diverse protein kinases that alter the properties of other proteins by attaching chemical groups called phosphate groups. Phosphate groups are small but powerful: When attached to proteins, they can turn proteins on or off, or even dramatically change their function. Identifying which of the almost 400 human kinases phosphorylate a specific protein at a particular site on the protein was traditionally a lengthy, laborious process.
Beginning in the mid-1990s, the Cantley laboratory developed a method using a library of small peptides to identify the optimal amino acid sequence (called a motif, similar to a scannable barcode) that a kinase targets on its substrate proteins for the addition of a phosphate group. Over the ensuing years, Yaffe, Turk, and Johnson, all of whom spent time as postdocs in the Cantley lab, made seminal advancements in the technique, increasing its throughput, accuracy, and utility.
Johnson led a massive experimental effort exposing batches of kinases to these peptide libraries and observed which kinases phosphorylated which subsets of peptides. In a corresponding Nature paper published in January 2023, the team mapped more than 300 serine/threonine kinases, the other main type of protein kinase, to their motifs. In the current paper, they complete the human “kinome” by successfully mapping 93 tyrosine kinases to their corresponding motifs.
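To make the "barcode" idea concrete, here is a minimal Python sketch of how a motif could be represented and used to rank candidate kinases for a phosphorylation site. It is an illustration only, not the atlas's actual scoring method: the kinase names, the preference values in `TOY_MOTIFS`, and the functions `score_site` and `rank_kinases` are all hypothetical.

```python
import math

# Hypothetical toy motifs (not real atlas data). For each kinase, the table
# gives how strongly an amino acid is favored at a position relative to the
# phosphorylated tyrosine (position 0). Values above 1 are favored, values
# below 1 are disfavored, and unlisted residues are treated as neutral (1.0).
TOY_MOTIFS = {
    "KINASE_A": {-1: {"E": 3.0, "D": 2.5}, 1: {"I": 2.0}, 3: {"P": 0.3}},
    "KINASE_B": {-1: {"R": 2.0}, 1: {"E": 2.5}, 3: {"P": 3.0}},
}


def score_site(window, motif, center):
    """Sum the log2 preferences of the residues flanking the central tyrosine."""
    if window[center] != "Y":
        raise ValueError("center residue must be tyrosine (Y)")
    total = 0.0
    for offset, preferences in motif.items():
        index = center + offset
        if 0 <= index < len(window):
            total += math.log2(preferences.get(window[index], 1.0))
    return total


def rank_kinases(window, motifs, center):
    """Return (kinase, score) pairs sorted from best to worst match."""
    scores = {name: score_site(window, motif, center) for name, motif in motifs.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# A made-up eight-residue sequence window with the tyrosine at index 3.
ranking = rank_kinases("AAEYIAAA", TOY_MOTIFS, center=3)
for kinase, score in ranking:
    print(f"{kinase}: {score:.2f}")
```

In this toy case the window carries a glutamate just before the tyrosine and an isoleucine just after it, so the first hypothetical kinase scores highest. Summing log-scaled preferences is one common way to combine independent positions; the published atlas may weight or normalize scores differently.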
Next, by creating and using advanced computational tools, Yaron-Barir, Krismer, Joughin, Takegami, and Yaffe tested whether the results were predictive of real proteins, and whether the results might reveal unknown signaling events in normal and cancer cells. By analyzing phosphoproteomic data from mass spectrometry to reveal phosphorylation patterns in cells, their atlas accurately predicted tyrosine kinase activity in previously studied cell signaling pathways.
For example, using recently published phosphoproteomic data of human lung cancer cells treated with two targeted drugs, the atlas identified that treatment with erlotinib, a known inhibitor of the protein EGFR, downregulated sites matching a motif for EGFR. Treatment with afatinib, a known HER2 inhibitor, downregulated sites matching the HER2 motif. Unexpectedly, afatinib treatment also upregulated the motif for the tyrosine kinase MET, a finding that helps explain patient data linking MET activity to afatinib drug resistance.
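The kind of analysis described above can be sketched in a few lines of Python: given phosphorylation sites with measured changes after drug treatment, check whether sites matching a kinase's motif are over-represented among the downregulated or upregulated sites. This is a simplified illustration under stated assumptions, not the statistical procedure used in the study. The site records, the kinase labels, the fold-change threshold, and the function `motif_enrichment` are hypothetical.

```python
import math


def motif_enrichment(sites, kinase, threshold=1.0, pseudocount=0.5):
    """Compare how often `kinase` matches changed sites versus unchanged sites.

    Each site is a dict with a "log2fc" value (treated vs. untreated) and a
    "matches" set naming the kinases whose motif the site fits. The result is
    a log2 ratio: positive means the motif is over-represented in that group.
    """
    def matching_fraction(group):
        hits = sum(1 for site in group if kinase in site["matches"])
        # The pseudocount keeps the ratio finite for small or empty groups.
        return (hits + pseudocount) / (len(group) + 2 * pseudocount)

    down = [s for s in sites if s["log2fc"] <= -threshold]
    up = [s for s in sites if s["log2fc"] >= threshold]
    unchanged = [s for s in sites if abs(s["log2fc"]) < threshold]
    base = matching_fraction(unchanged)
    return {
        "down": math.log2(matching_fraction(down) / base),
        "up": math.log2(matching_fraction(up) / base),
    }


# Made-up measurements: kinase "A" stands in for the drug's target and
# kinase "B" for a kinase that becomes more active after treatment.
example_sites = [
    {"log2fc": -2.0, "matches": {"A"}},
    {"log2fc": -2.0, "matches": {"A"}},
    {"log2fc": -2.0, "matches": {"A", "B"}},
    {"log2fc": 1.5, "matches": {"B"}},
    {"log2fc": 1.5, "matches": {"B"}},
    {"log2fc": 0.1, "matches": {"A"}},
    {"log2fc": -0.2, "matches": {"B"}},
    {"log2fc": 0.0, "matches": set()},
    {"log2fc": 0.3, "matches": set()},
]

result_a = motif_enrichment(example_sites, "A")
result_b = motif_enrichment(example_sites, "B")
print(result_a, result_b)
```

With this toy data, the motif for kinase "A" is enriched among downregulated sites, mirroring the inhibited-target pattern, while the motif for kinase "B" is enriched among upregulated sites. A real analysis would also attach a significance test and correct for multiple comparisons.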
Actionable results
There are two key ways researchers can use the new atlas. First, for a protein of interest that is being phosphorylated, the atlas can be used to narrow down hundreds of kinases to a short list of candidates likely to be involved. “The predictions that come from using this will still need to be validated experimentally, but it’s a huge step forward in making clear predictions that can be tested,” says Yaffe.
Second, the atlas makes phosphoproteomic data more useful and actionable. In the past, researchers might gather phosphoproteomic data from a tissue sample, but it was difficult to know what that data was saying or how to best use it to guide next steps in research. Now, that data can be used to predict which kinases are upregulated or downregulated and therefore which cellular signaling pathways are active or not.
“We now have a new tool to interpret those large datasets, a Rosetta Stone for phosphoproteomics,” says Yaffe. “It is going to be particularly helpful for turning this type of disease data into actionable items.”
In the context of cancer, phosphoproteomic data from a patient’s tumor biopsy could be used to help doctors quickly identify which kinases and cell signaling pathways are involved in cancer expansion or drug resistance, then use that knowledge to target those pathways with appropriate drug therapy or combination therapy.
Yaffe’s lab and their colleagues at the National Institutes of Health are now using the atlas to seek out new insights into difficult cancers, including appendiceal cancer and neuroendocrine tumors. While many cancers have been shown to have a strong genetic component, such as the genes BRCA1 and BRCA2 in breast cancer, other cancers are not associated with any known genetic cause. “We’re using this atlas to interrogate these tumors that don’t seem to have a clear genetic driver to see if we can identify kinases that are driving cancer progression,” he says.
Biological insights
In addition to completing the human kinase atlas, the team made two biological discoveries in their recent study. First, they identified three main classes of phosphorylation motifs, or barcodes, for tyrosine kinases. The first class consists of motifs that map to multiple kinases, suggesting that numerous signaling pathways converge to phosphorylate a protein bearing that motif. The second class consists of motifs with a one-to-one match between motif and kinase, in which only a specific kinase will activate a protein with that motif. This came as a partial surprise, as some in the field had thought tyrosine kinases to have minimal specificity.
The final class includes motifs for which there is no clear match to one of the 78 classical tyrosine kinases. This class includes motifs that match to 15 atypical tyrosine kinases known to also phosphorylate serine or threonine residues. “This means that there’s a subset of kinases that we didn’t recognize that are actually playing an important role,” says Yaffe. It also indicates there may be other mechanisms besides motifs alone that affect how a kinase interacts with a protein.
The team also discovered that tyrosine kinase motifs are tightly conserved between humans and the worm species C. elegans, despite the species being separated by more than 600 million years of evolution. In other words, a worm kinase and its human homologue are phosphorylating essentially the same motif. That sequence preservation suggests that tyrosine kinases are highly critical to signaling pathways in all multicellular organisms, and any small change would be harmful to an organism.
The research was funded by the Charles and Marjorie Holloway Foundation, the MIT Center for Precision Cancer Medicine, the Koch Institute Frontier Research Program via L. Scott Ritterbush, the Leukemia and Lymphoma Society, the National Institutes of Health, Cancer Research UK, the Brain Tumour Charity, and the Koch Institute Support (core) grant from the National Cancer Institute.
Tumors can carry mutations in hundreds of different genes, and each of those genes may be mutated in different ways — some mutations simply replace one DNA nucleotide with another, while others insert or delete larger sections of DNA.
Until now, there has been no way to quickly and easily screen each of those mutations in their natural setting to see what role they may play in the development, progression, and treatment response of a tumor. Using a variant of CRISPR genome-editing known as prime editing, MIT researchers have now come up with a way to screen those mutations much more easily.
The researchers demonstrated their technique by screening cells with more than 1,000 different mutations of the tumor suppressor gene p53, all of which have been seen in cancer patients. This method, which is easier and faster than any existing approach, and edits the genome rather than introducing an artificial version of the mutant gene, revealed that some p53 mutations are more harmful than previously thought.
This technique could also be applied to many other cancer genes, the researchers say, and could eventually be used for precision medicine, to determine how an individual patient’s tumor will respond to a particular treatment.
“In one experiment, you can generate thousands of genotypes that are seen in cancer patients, and immediately test whether one or more of those genotypes are sensitive or resistant to any type of therapy that you’re interested in using,” says Francisco Sanchez-Rivera, an MIT assistant professor of biology, a member of the Koch Institute for Integrative Cancer Research, and the senior author of the study.
MIT graduate student Samuel Gould is the lead author of the paper, which appears today in Nature Biotechnology.
Editing cells
The new technique builds on research that Sanchez-Rivera began 10 years ago as an MIT graduate student. At that time, working with Tyler Jacks, the David H. Koch Professor of Biology, and then-postdoc Thales Papagiannakopoulos, Sanchez-Rivera developed a way to use CRISPR genome-editing to introduce into mice genetic mutations linked to lung cancer.
In that study, the researchers showed that they could delete genes that are often lost in lung tumor cells, and the resulting tumors were similar to naturally arising tumors with those mutations. However, this technique did not allow for the creation of point mutations (substitutions of one nucleotide for another) or insertions.
“While some cancer patients have deletions in certain genes, the vast majority of mutations that cancer patients have in their tumors also include point mutations or small insertions,” Sanchez-Rivera says.
Since then, David Liu, a professor in the Harvard University Department of Chemistry and Chemical Biology and a core institute member of the Broad Institute, has developed new CRISPR-based genome editing technologies that can generate additional types of mutations more easily. With base editing, developed in 2016, researchers can engineer point mutations, but not all possible point mutations. In 2019, Liu, who is also an author of the Nature Biotechnology study, developed a technique called prime editing, which enables any kind of point mutation to be introduced, as well as insertions and deletions.
“Prime editing in theory solves one of the major challenges with earlier forms of CRISPR-based editing, which is that it allows you to engineer virtually any type of mutation,” Sanchez-Rivera says.
When they began working on this project, Sanchez-Rivera and Gould calculated that if performed successfully, prime editing could be used to generate more than 99 percent of all small mutations seen in cancer patients.
However, to achieve that, they needed to find a way to optimize the editing efficiency of the CRISPR-based system. The prime editing guide RNAs (pegRNAs) used to direct CRISPR enzymes to cut the genome in certain spots have varying levels of efficiency, which leads to “noise” in the data from pegRNAs that simply aren’t generating the correct target mutation. The MIT team devised a way to reduce that noise by using synthetic target sites to help them calculate how efficiently each guide RNA that they tested was working.
“We can design multiple prime-editing guide RNAs with different design properties, and then we get an empirical measurement of how efficient each of those pegRNAs is. It tells us what percentage of the time each pegRNA is actually introducing the correct edit,” Gould says.
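The efficiency measurement Gould describes amounts to a simple ratio per pegRNA, which the following Python sketch illustrates. It assumes that sequencing of each synthetic target site yields a count of reads carrying the intended edit and a total read count. The record type `SensorCounts`, the function names, and the 10 percent cutoff are hypothetical choices for illustration, not values from the study.

```python
from dataclasses import dataclass


@dataclass
class SensorCounts:
    """Hypothetical read counts at the synthetic target site for one pegRNA."""
    pegrna_id: str
    correct_edit_reads: int
    total_reads: int


def editing_efficiency(counts):
    """Fraction of reads at the synthetic site that carry the intended edit."""
    if counts.total_reads == 0:
        return 0.0  # no data, so treat the pegRNA as unverified
    return counts.correct_edit_reads / counts.total_reads


def keep_efficient_pegrnas(all_counts, min_efficiency=0.1):
    """Return IDs of pegRNAs whose measured efficiency meets the cutoff."""
    return [c.pegrna_id for c in all_counts if editing_efficiency(c) >= min_efficiency]


# Made-up counts for three pegRNAs.
measurements = [
    SensorCounts("peg1", correct_edit_reads=30, total_reads=100),
    SensorCounts("peg2", correct_edit_reads=2, total_reads=100),
    SensorCounts("peg3", correct_edit_reads=0, total_reads=0),
]

kept = keep_efficient_pegrnas(measurements)
print(kept)
```

Dropping or down-weighting pegRNAs that rarely install the correct edit is one way to reduce the "noise" described above, since cells carrying those guides mostly remain unedited.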
Analyzing mutations
The researchers demonstrated their technique using p53, a gene that is mutated in more than half of all cancer patients. From a dataset that includes sequencing information from more than 40,000 patients, the researchers identified more than 1,000 different mutations that can occur in p53.
“We wanted to focus on p53 because it’s the most commonly mutated gene in human cancers, but only the most frequent variants in p53 have really been deeply studied. There are many variants in p53 that remain understudied,” Gould says.
Using their new method, the researchers introduced p53 mutations in human lung adenocarcinoma cells, then measured the survival rates of these cells, allowing them to determine each mutation’s effect on cell fitness.
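One common way to turn such survival measurements into a fitness estimate in pooled screens is to compare how abundant each variant is at the start and end of the experiment. The Python sketch below shows that calculation under stated assumptions. It is not necessarily the scoring used in the study, and the variant names, counts, and the function `relative_fitness` are hypothetical.

```python
import math


def relative_fitness(start_counts, end_counts, control, pseudocount=1.0):
    """Log2 change in each variant's share of reads, relative to a control variant.

    Positive values mean cells with that variant outgrew cells with the
    control variant; negative values mean they were depleted.
    """
    start_total = sum(start_counts.values())
    end_total = sum(end_counts.values())
    log2_fold_change = {}
    for variant, start in start_counts.items():
        start_share = (start + pseudocount) / start_total
        end_share = (end_counts.get(variant, 0) + pseudocount) / end_total
        log2_fold_change[variant] = math.log2(end_share / start_share)
    baseline = log2_fold_change[control]
    return {variant: value - baseline for variant, value in log2_fold_change.items()}


# Made-up read counts; "silent" stands in for a variant with no expected effect.
start = {"mutA": 99, "mutB": 99, "silent": 99}
end = {"mutA": 399, "mutB": 49, "silent": 99}

fitness = relative_fitness(start, end, control="silent")
print(fitness)
```

In this toy example the first variant expands fourfold relative to the control and the second shrinks by half, giving relative scores of about 2 and -1 on the log2 scale.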
Among their findings, they showed that some p53 mutations promoted cell growth more than had been previously thought. These mutations, which prevent the p53 protein from forming a tetramer — an assembly of four p53 proteins — had been studied before, using a technique that involves inserting artificial copies of a mutated p53 gene into a cell.
Those studies found that these mutations did not confer any survival advantage to cancer cells. However, when the MIT team introduced those same mutations using the new prime editing technique, they found that the mutations prevented the tetramer from forming, allowing the cells to survive. Based on the studies done using overexpression of artificial p53 DNA, those mutations would have been classified as benign, while the new work shows that under more natural circumstances, they are not.
“This is a case where you could only observe these variant-induced phenotypes if you’re engineering the variants in their natural context and not with these more artificial systems,” Gould says. “This is just one example, but it speaks to a broader principle that we’re going to be able to access novel biology using these new genome-editing technologies.”
Because it is difficult to reactivate tumor suppressor genes, there are few drugs that target p53, but the researchers now plan to investigate mutations found in other cancer-linked genes, in hopes of discovering potential cancer therapies that could target those mutations. They also hope that the technique could one day enable personalized approaches to treating tumors.
“With the advent of sequencing technologies in the clinic, we’ll be able to use this genetic information to tailor therapies for patients suffering from tumors that have a defined genetic makeup,” Sanchez-Rivera says. “This approach based on prime editing has the potential to change everything.”
The research was funded, in part, by the National Institute of General Medical Sciences, an MIT School of Science Fellowship in Cancer Research, a Howard Hughes Medical Institute Hanna Gray Fellowship, the V Foundation for Cancer Research, a National Cancer Institute Cancer Center Support Grant, the Ludwig Center at MIT, a Koch Institute Frontier Award, the MIT Research Support Committee, and the Koch Institute Support (core) Grant from the National Cancer Institute.