Locally produced proteins help mitochondria function

One of the ways that cells ensure proteins end up where they’re needed is by creating them at that location, through a process called localized translation. New research from the Weissman Lab has expanded our understanding of localized translation at mitochondria and sheds light on the organizational principles of genes and the proteins they encode.

Greta Friar | Whitehead Institute
August 27, 2025

Now, Weissman, who is also a professor of biology at the Massachusetts Institute of Technology and an HHMI Investigator, and postdoc in his lab Jingchuan Luo have expanded our knowledge of localized translation at mitochondria, structures that generate energy for the cell. In a paper published in Cell on August 27, they share a new tool, LOCL-TL, for studying localized translation in close detail, and describe the discoveries it enabled about two classes of proteins that are locally translated at mitochondria.

The importance of localized translation at mitochondria relates to their unusual origin. Mitochondria were once bacteria that lived within our ancestors’ cells. Over time the bacteria lost their autonomy and became part of the larger cells, which included migrating most of their genes into the larger cell’s genome in the nucleus. Cells evolved processes to ensure that proteins needed by mitochondria that are encoded in genes in the larger cell’s genome get transported to the mitochondria. Mitochondria retain a few genes in their own genome, so production of proteins from the mitochondrial genome and that of the larger cell’s genome must be coordinated to avoid mismatched production of mitochondrial parts. Localized translation may help cells to manage the interplay between mitochondrial and nuclear protein production—among other purposes.

How to detect local protein production

For a protein to be made, genetic code stored in DNA is read into RNA, and then the RNA is read or translated by a ribosome, a cellular machine that builds a protein according to the RNA code. Weissman’s lab previously developed a method to study localized translation by tagging ribosomes near a structure of interest, and then capturing the tagged ribosomes in action and observing the proteins they are making. This approach, called proximity-specific ribosome profiling, allows researchers to see what proteins are being made where in the cell. The challenge that Luo faced was how to tweak this method to capture only ribosomes at work near mitochondria.

Ribosomes work quickly, so a ribosome that gets tagged while making a protein at the mitochondria can move on to making other proteins elsewhere in the cell in a matter of minutes. The only way researchers can guarantee that the ribosomes they capture are still working on proteins made near the mitochondria is if the experiment happens very quickly.

Weissman and colleagues had previously solved this time sensitivity problem in yeast cells with a ribosome-tagging tool called BirA that is activated by the presence of the molecule biotin. BirA is fused to the cellular structure of interest, and tags ribosomes it can touch—but only once activated. Researchers keep the cell depleted of biotin until they are ready to capture the ribosomes, to limit the time when tagging occurs. However, this approach does not work with mitochondria in mammalian cells because they need biotin to function normally, so it cannot be depleted.

Luo and Weissman adapted the existing tool to respond to blue light instead of biotin. The new tool, LOV-BirA, is fused to the mitochondria’s outer membrane. Cells are kept in the dark until the researchers are ready. Then they expose the cells to blue light, activating LOV-BirA to tag ribosomes. They give it a few minutes and then quickly extract the ribosomes. This approach proved very accurate at capturing only ribosomes working at mitochondria.

The researchers then used a method originally developed by the Weissman lab to extract the sections of RNA inside of the ribosomes. This allows them to see exactly how far along in the process of making a protein the ribosome is when captured, which can reveal whether the entire protein is made at the mitochondria, or whether it is partly produced elsewhere and only gets completed at the mitochondria.

“One advantage of our tool is the granularity it provides,” Luo says. “Being able to see what section of the protein is locally translated helps us understand more about how localized translation is regulated, which can then allow us to understand its dysregulation in disease and to control localized translation in future studies.”
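To make the idea of granularity concrete, here is a minimal sketch of how footprint positions along one transcript could indicate where localized translation begins. It is a toy under stated assumptions, not the analysis used in the paper: the function name, the 50-codon bins, the 0.5 cutoff, and all counts are invented for illustration.

```python
def localization_onset(tagged, total, bin_size=50, min_fraction=0.5):
    """Return the codon position of the first bin in which at least
    min_fraction of the ribosome footprints come from tagged ribosomes.

    tagged / total: footprint counts per bin along one transcript
    (hypothetical inputs). Returns None if no bin reaches the cutoff.
    """
    for i, (t, n) in enumerate(zip(tagged, total)):
        if n > 0 and t / n >= min_fraction:
            return i * bin_size
    return None


# Invented counts: few tagged footprints early in the transcript, many later.
total = [100] * 8
tagged = [2, 3, 2, 4, 5, 60, 70, 65]

onset = localization_onset(tagged, total)
print(onset)  # 250 with these invented counts
```

In this toy, a transcript whose tagged fraction never reaches the cutoff returns None, which would correspond to a protein that is not locally translated.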

Two protein groups are made at mitochondria

Using these approaches, the researchers found that about twenty percent of the genes that are needed in mitochondria but located in the main cellular genome encode proteins that are locally translated at mitochondria. These proteins can be divided into two distinct groups with different evolutionary histories and mechanisms for localized translation.

One group consists of relatively long proteins, each containing more than 400 amino acids or protein building blocks. These proteins tend to be of bacterial origin—present in the ancestor of mitochondria—and they are locally translated in both mammalian and yeast cells, suggesting that their localized translation has been maintained through a long evolutionary history.

Like many mitochondrial proteins encoded in the nucleus, these proteins contain a mitochondrial targeting sequence (MTS), a zip code that tells the cell where to bring them. The researchers discovered that most proteins containing an MTS also contain a nearby inhibitory sequence that prevents transportation until they are done being made. This group of locally translated proteins lacks the inhibitory sequence, so they are brought to the mitochondria during their production.

Production of these longer proteins begins anywhere in the cell, and then after approximately the first 250 amino acids are made, they get transported to the mitochondria. While the rest of the protein gets made, it is simultaneously fed into a channel that brings it inside the mitochondria. This ties up the channel for a long time, limiting import of other proteins, so cells can only afford to do this simultaneous production and import for select proteins. The researchers hypothesize that these bacterial-origin proteins are given priority as an ancient mechanism to ensure that they are accurately produced and placed within mitochondria.

The second locally translated group consists of short proteins, each fewer than 200 amino acids long. These proteins are more recently evolved, and correspondingly, the researchers found that the mechanism for their localized translation is not shared by yeast. Their mitochondrial recruitment happens at the RNA level. Two sequences within regulatory sections of each RNA molecule, which do not encode the final protein, instead direct the cell’s machinery to recruit the RNAs to the mitochondria.

The researchers searched for molecules that might be involved in this recruitment, and identified the RNA binding protein AKAP1, which exists at mitochondria. When they eliminated AKAP1, the short proteins were translated indiscriminately around the cell. This provided an opportunity to learn more about the effects of localized translation, by seeing what happens in its absence. When the short proteins were not locally translated, this led to the loss of various mitochondrial proteins, including those involved in oxidative phosphorylation, our cells’ main energy generation pathway.

In future research, Weissman and Luo will delve deeper into how localized translation affects mitochondrial function and dysfunction in disease. The researchers also intend to use LOCL-TL to study localized translation in other cellular processes, including in relation to embryonic development, neural plasticity, and disease.

“This approach should be broadly applicable to different cellular structures and cell types, providing many opportunities to understand how localized translation contributes to biological processes,” Weissman says. “We’re particularly interested in what we can learn about the roles it may play in diseases including neurodegeneration, cardiovascular diseases, and cancers.”

Luo et al. “Proximity-specific ribosome profiling reveals the logic of localized mitochondrial translation.” Cell, August 27, 2025. https://doi.org/10.1016/j.cell.2025.08.002

Mapping cells in time and space: a new tool reveals a detailed history of tumor growth

Weissman and colleagues have developed an advanced lineage tracing tool that not only captures an accurate family tree of cell divisions, but also combines that with spatial information: identifying where each cell ends up within a tissue.

Greta Friar | Whitehead Institute
July 24, 2025

All life is connected in a vast family tree. Every organism exists in relationship to its ancestors, descendants, and cousins, and the path between any two individuals can be traced. The same is true of cells within organisms: each of the trillions of cells in the human body is produced through successive divisions from a fertilized egg, and all of them can be related to one another through a cellular family tree. In simpler organisms such as the worm C. elegans, this cellular family tree has been fully mapped, but the cellular family tree of a human is many times larger and more complex.

In the past, Whitehead Institute Member Jonathan Weissman and other researchers have developed lineage tracing methods to track and reconstruct the family trees of cell divisions in model organisms in order to understand more about the relationships between cells and how they assemble into tissues, organs, and, in some cases, tumors. These methods could help to answer many questions about how organisms develop and how diseases like cancer are initiated and progress.

Now, Weissman and colleagues have developed an advanced lineage tracing tool that not only captures an accurate family tree of cell divisions, but also combines that with spatial information: identifying where each cell ends up within a tissue. The researchers used their tool, PEtracer, to observe the growth of metastatic tumors in mice. Combining lineage tracing and spatial data provided the researchers with a detailed view of how elements intrinsic to the cancer cells and from their environments influenced tumor growth, as Weissman and postdocs in his lab Luke Koblan, Kathryn Yost, and Pu Zheng, and graduate student William Colgan share in a paper published in the journal Science on July 24.

“Developing this tool required combining diverse skillsets through the sort of ambitious interdisciplinary collaboration that’s only possible at a place like Whitehead Institute,” says Weissman, who is also a professor of biology at the Massachusetts Institute of Technology and an HHMI Investigator. “Luke came in with an expertise in genetic engineering, Pu in imaging, Katie in cancer biology, and William in computation but the real key to their success was their ability to work together to build PEtracer.”

“Understanding how cells move in time and space is an important way to look at biology, and here we were able to see both of those things in high resolution. The idea is that by understanding both a cell’s past and where it ends up, you can see how different factors throughout its life influenced its behaviors. In this study we use these approaches to look at tumor growth, though in principle we can now begin to apply these tools to study other biology of interest like embryonic development,” Koblan says.

Designing a tool to track cells in space and time

PEtracer tracks cells’ lineages by repeatedly adding short, predetermined codes to the DNA of cells over time. Each piece of code, called a lineage tracing mark, is made up of five bases, the building blocks of DNA. These marks are inserted using a gene editing technology called prime editing, which directly rewrites stretches of DNA with minimal undesired byproducts. Over time, each cell acquires more lineage tracing marks, while also maintaining the marks of its ancestors. The researchers can then compare cells’ combinations of marks to figure out relationships and reconstruct the family tree.
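The comparison step described above can be illustrated with a small sketch. It assumes that each cell is summarized as a mapping from editable sites to the five-base mark installed there, and it scores relatedness simply by counting identical marks at the same site. The site names, the marks, and this scoring rule are hypothetical simplifications, not PEtracer’s actual reconstruction method.

```python
from itertools import combinations

# Hypothetical data: for each cell, the mark installed at each editable site
# (None means the site has not been edited yet). All values are invented.
cells = {
    "A": {"site1": "ACGTA", "site2": "TTGCA", "site3": None},
    "B": {"site1": "ACGTA", "site2": "TTGCA", "site3": "GGATC"},
    "C": {"site1": "ACGTA", "site2": None, "site3": "CATCG"},
}


def shared_marks(x, y):
    """Count sites where both cells carry the same installed mark."""
    return sum(
        1 for site, mark in x.items()
        if mark is not None and mark == y.get(site)
    )


def closest_pair(cells):
    """Return the pair of cells that share the most marks.

    Marks are inherited, so cells sharing more marks are assumed to have
    diverged more recently in the family tree.
    """
    return max(
        combinations(sorted(cells), 2),
        key=lambda pair: shared_marks(cells[pair[0]], cells[pair[1]]),
    )


pair = closest_pair(cells)
print(pair)
```

Here cells A and B share two marks while C shares only one with either of them, so A and B would be grouped first and C joined to them at an earlier branch point.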

“We used computational modeling to design the tool from first principles, to make sure that it was highly accurate, and compatible with imaging technology. We ran many simulations to land on the optimal parameters for a new lineage tracing tool, and then engineered our system to fit those parameters,” Colgan says.

When the tissue—in this case, a tumor growing in the lung of a mouse—had sufficiently grown, the researchers collected these tissues and used advanced imaging approaches to look at each cell’s lineage relationship to other cells via the lineage tracing marks, along with its spatial position within the imaged tissue and its identity (as determined by the levels of different RNAs expressed in each cell). PEtracer is compatible with both imaging approaches and sequencing methods that capture genetic information from single cells.

“Making it possible to collect and analyze all of this data from the imaging was a large challenge,” Zheng says. “What’s particularly exciting to me is not just that we were able to collect terabytes of data, but that we designed the project to collect data that we knew we could use to answer important questions and drive biological discovery.”

Reconstructing the history of a tumor

Combining the lineage tracing, gene expression, and spatial data let the researchers understand how the tumor grew. They could tell how closely related neighboring cells are and compare their traits. Using this approach, the researchers found that the tumors they were analyzing were made up of four distinct modules, or neighborhoods, of cells.

The tumor cells closest to the lung, the most nutrient-dense region, were the most fit, meaning their lineage history indicated the highest rate of cell division over time. Fitness in cancer cells tends to correlate with how aggressively tumors will grow.

The cells at the “leading edge” of the tumor, the far side from the lung, were more diverse and not as fit. Below the leading edge was a low-oxygen neighborhood of cells that might once have been leading edge cells, now trapped in a less desirable spot. Between these cells and the lung-adjacent cells was the tumor core, a region with both living and dead cells as well as cellular debris.

The researchers found that cancer cells across the family tree were equally likely to end up in most of the regions, with the exception of the lung-adjacent region, where a few branches of the family tree dominated. This suggests that the cancer cells’ differing traits were heavily influenced by their environments, or the conditions in their local neighborhoods, rather than their family history. Further evidence of this point was that expression of certain fitness-related genes, such as Fgf1/Fgfbp1, correlated with a cell’s location rather than its ancestry. However, lung-adjacent cells also had inherited traits that gave them an edge, including expression of the fitness-related gene Cldn4, showing that family history influenced outcomes as well.

These findings demonstrate how cancer growth is influenced both by factors intrinsic to certain lineages of cancer cells and by environmental factors that shape the behavior of cancer cells exposed to them.

“By looking at so many dimensions of the tumor in concert, we could gain insights that would not have been possible with a more limited view,” Yost says. “Being able to characterize different populations of cells within a tumor will enable researchers to develop therapies that target the most aggressive populations more effectively.”

“Now that we’ve done the hard work of designing the tool, we’re excited to apply it to look at all sorts of questions in health and disease, in embryonic development, and across other model species, with an eye toward understanding important problems in human health,” Koblan says. “The data we collect will also be useful for training AI models of cellular behavior. We’re excited to share this technology with other researchers and see what we all can discover.”

Luke W. Koblan, Kathryn E. Yost, Pu Zheng, William N. Colgan, Matthew G. Jones, Dian Yang, Arhan Kumar, Jaspreet Sandhu, Alexandra Schnell, Dawei Sun, Can Ergen, Reuben A. Saunders, Xiaowei Zhuang, William E. Allen, Nir Yosef, Jonathan S. Weissman. “High-resolution spatial mapping of cell state and lineage dynamics in vivo with PEtracer.” Science, online July 24, 2025. https://doi.org/10.1126/science.adx3800

Yunha Hwang

Education 

  • PhD, 2024, Evolutionary and Organismic Biology, Harvard University
  • MS, 2018, Earth Systems, Stanford University
  • B.Sc, 2018, Computer Science, Stanford University

Research Summary

Microbial genomes encode the largest molecular, biochemical, and functional diversity on Earth. We focus on developing machine learning models and experimental approaches to discover and design novel biological functions. We integrate computation with expertise in evolution, ecology, and biochemistry to characterize and harness the functional potential of microbes.

Putting liver cells in context: new method combines imaging and sequencing to study gene function in living tissue

Researchers in the Weissman Lab have developed a powerful approach that simultaneously measures how genetic changes such as turning off individual genes affect both gene expression and cell structure in intact liver tissue, with the goal of discovering how genes control organ function and disease.

Whitehead Institute
June 12, 2025

 

However, capturing both the “visuals and sound” of biological data, such as gene expression and cell structure data, from the same cells requires researchers to develop new approaches. They also have to make sure that the data they capture accurately reflects what happens in living organisms, including how cells interact with each other and their environments.

Whitehead Institute and Harvard University researchers have taken on these challenges and developed Perturb-Multimodal (Perturb-Multi), a powerful new approach that simultaneously measures how genetic changes such as turning off individual genes affect both gene expression and cell structure in intact liver tissue. The method, described in Cell on June 12, aims to accelerate discovery of how genes control organ function and disease.

The research team, led by Whitehead Institute Member Jonathan Weissman and then-graduate student in his lab Reuben Saunders, along with Xiaowei Zhuang, the David B. Arnold Professor of Science at Harvard University, and then-postdoc in her lab Will Allen, created a system that can test hundreds of different genetic modifications within a single mouse liver while capturing multiple types of data from the same cells.

“Understanding how our organs work requires looking at many different aspects of cell biology at once,” Saunders says. “With Perturb-Multi, we can see how turning off specific genes changes not just what other genes are active, but also how proteins are distributed within cells, how cellular structures are organized, and where cells are located in the tissue. It’s like having multiple specialized microscopes all focused on the same experiment.”

“This approach accelerates discovery by both allowing us to test the functions of many different genes at once, and then for each gene, allowing us to measure many different functional outputs or cell properties at once—and we do that in intact tissue from animals,” says Zhuang, who is also an HHMI Investigator.

A more efficient approach to genetic studies

Traditional genetic studies in mice often turn off one gene in an animal, and then observe what changes in that gene’s absence to learn about what the gene does. The researchers designed their approach to turn off hundreds of different genes across a single liver, while still only turning off one gene per cell—using what is known as a mosaic approach. This allowed them to study the roles of hundreds of individual genes at once in a single animal. The researchers then collected diverse types of data from cells across the same liver to get a full picture of the consequences of turning off the genes.

“Each cell serves as its own experiment, and because all the cells are in the same animal, we eliminate the variability that comes from comparing different mice,” Saunders says. “Every cell experiences the same physiological conditions, diet, and environment, making our comparisons much more precise.”
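The logic of a mosaic comparison can be sketched in a few lines. This is a minimal illustration, assuming each cell is recorded as the gene turned off in it plus one measured value, and that unperturbed cells from the same liver serve as the baseline. The gene names, the numbers, and the function name are invented, and the real analysis involves far richer data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-cell records from a single liver: (gene turned off, value).
# "control" marks cells in which no gene was turned off. Numbers are invented.
cells = [
    ("control", 1.0), ("control", 1.2), ("control", 0.8),
    ("geneA", 2.0), ("geneA", 2.4),
    ("geneB", 1.0), ("geneB", 1.0),
]


def effects_vs_control(cells, control="control"):
    """Mean measured value for each perturbed gene minus the control mean.

    All cells come from the same animal, so the control cells share the
    same diet, physiology, and environment as the perturbed cells.
    """
    by_gene = defaultdict(list)
    for gene, value in cells:
        by_gene[gene].append(value)
    baseline = mean(by_gene[control])
    return {
        gene: mean(values) - baseline
        for gene, values in by_gene.items()
        if gene != control
    }


effects = effects_vs_control(cells)
print(effects)
```

With these invented values, turning off geneA raises the measurement relative to control cells while turning off geneB leaves it essentially unchanged.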

“The challenge we faced was that tissues, to perform their functions, rely on thousands of genes, expressed in many different cells, working together. Each gene, in turn, can control many aspects of a cell’s function. Testing these hundreds of genes in mice using current methods would be extremely slow and expensive—near impossible in practice,” Allen says.

Revealing new biology through combined measurements

The team applied Perturb-Multi to study genetic controls of liver physiology and function. Their study led to discoveries in three important aspects of liver biology: fat accumulation in liver cells—a precursor to liver disease; stress responses; and hepatocyte zonation (how liver cells specialize, assuming different traits and functions, based on their location within the liver).

One striking finding emerged from studying genes that, when disrupted, cause fat accumulation in liver cells. The imaging data revealed that four different genes all led to similar fat droplet accumulation, but the sequencing data showed they did so through three completely different mechanisms.

“Without combining imaging and sequencing, we would have missed this complexity entirely,” Saunders says. “The imaging told us which genes affect fat accumulation, while the sequencing revealed whether this was due to increased fat production, cellular stress, or other pathways. This kind of mechanistic insight could be crucial for developing targeted therapies for fatty liver disease.”

The researchers also discovered new regulators of liver cell zonation. Unexpectedly, the newly discovered regulators include genes involved in modifying the extracellular matrix—the scaffolding between cells. “We found that cells can change their specialized functions without physically moving to a different zone,” Saunders says. “This suggests that liver cell identity is more flexible than previously thought.”

Technical innovation enables new science

Developing Perturb-Multi required solving several technical challenges. The team created new methods for preserving the content of interest in cells—RNA and proteins—during tissue processing, for collecting many types of imaging data and single-cell gene expression data from tissue samples that have been fixed with a preservative, and for integrating multiple types of data from the same cells.

“Overcoming the inherent complexity of biology in living animals required developing new tools that bridge multiple disciplines – including, in this case, genomics, imaging, and AI,” Allen says.

The two components of Perturb-Multi—the imaging and sequencing assays—together, applied to the same tissue, provide insights that are unattainable through either assay alone.

“Each component had to work perfectly while not interfering with the others,” says Weissman, who is also a professor of biology at the Massachusetts Institute of Technology and an HHMI Investigator. “The technical development took considerable effort, but the payoff is a system that can reveal biology we simply couldn’t see before.”

Expanding to new organs and other contexts

The researchers plan to expand Perturb-Multi to other organs, including the brain, and to study how genetic changes affect organ function under different conditions like disease states or dietary changes.

“We’re also excited about using the data we generate to train machine learning models,” adds Saunders. “With enough examples of how genetic changes affect cells, we could eventually predict the effects of mutations without having to test them experimentally—a ‘virtual cell’ that could accelerate both research and drug development.”

“Perturbation data are critical for training such AI models and the paucity of existing perturbation data represents a major hindrance in such ‘virtual cell’ efforts,” Zhuang says. “We hope Perturb-Multi will fill this gap by accelerating the collection of perturbation data.”

The approach is designed to be scalable, with the potential for genome-wide studies that test thousands of genes simultaneously. As sequencing and imaging technologies continue to improve, the researchers anticipate that Perturb-Multi will become even more powerful and accessible to the broader research community.

“Our goal is to keep scaling up. We plan to do genome-wide perturbations, study different physiological conditions, and look at different organs,” says Weissman. “That we can now collect so many types of data from so many cells, at speed, is going to be critical for building AI models like virtual cells, and I think it’s going to help us answer previously unsolvable questions about health and disease.”

Notes

Reuben A. Saunders, William E. Allen, Xingjie Pan, Jaspreet Sandhu, Jiaqi Lu, Thomas K. Lau, Karina Smolyar, Zuri A. Sullivan, Catherine Dulac, Jonathan S. Weissman, Xiaowei Zhuang. “Perturb-Multimodal: a Platform for Pooled Genetic Screens with Sequencing and Imaging in Intact Mammalian Tissue.” Cell, June 12, 2025. https://doi.org/10.1016/j.cell.2025.05.022

Taking the pulse of sex differences in the heart

Work led by Talukdar and Page Lab postdoc Lukáš Chmátal shows that there are differences in how healthy male and female heart cells—specifically, cardiomyocytes, the muscle cells responsible for making the heart beat—generate energy.

Greta Friar | Whitehead Institute
February 18, 2025

Heart disease is the number one killer of men and women, but it often presents differently depending on sex. There are sex differences in the incidence, outcomes, and age of onset of different types of heart problems. Some of these differences can be explained by social factors (for example, women experience less well-recognized symptoms when having heart attacks, and so may take longer to be diagnosed and treated), but others are likely influenced by underlying differences in biology. Whitehead Institute Member David Page and colleagues have now identified some of these underlying biological differences in healthy male and female hearts, which may contribute to the observed differences in disease.

“My sense is that clinicians tend to think that sex differences in heart disease are due to differences in behavior,” says Harvard-MIT MD-PhD student Maya Talukdar, a graduate student in Page’s lab. “Behavioral factors do contribute, but even when you control for them, you still see sex differences. This implies that there are more basic physiological differences driving them.”

Page, who is also an HHMI Investigator and a professor of biology at the Massachusetts Institute of Technology, and members of his lab study the underlying biology of sex differences in health and disease, and recently they have turned their attention to the heart. In a paper published on February 17 in the women’s health edition of the journal Circulation, work led by Talukdar and Page lab postdoc Lukáš Chmátal shows that there are differences in how healthy male and female heart cells—specifically, cardiomyocytes, the muscle cells responsible for making the heart beat—generate energy.

“The heart is a hard-working pump, and heart failure often involves an energy crisis in which the heart can’t summon enough energy to pump blood fast enough to meet the body’s needs,” says Page. “What is intriguing about our current findings and their relationship to heart disease is that we’ve discovered sex differences in the generation of energy in cardiomyocytes, and this likely sets up males and females differently for an encounter with heart failure.”

Page and colleagues began their work by looking for sex differences in healthy hearts because they hypothesize that these impact sex differences in heart disease. Differences in baseline biology in the healthy state often affect outcomes when challenged by disease; for example, people with sickle cell trait, who carry one copy of the sickle cell gene variant, are more resistant to malaria, certain versions of the HLA gene are linked to slower progression of HIV, and variants of certain genes may protect against developing dementia.

Identifying baseline traits in the heart and figuring out how they interact with heart disease could not only reveal more about heart disease, but could also lead to new therapeutic strategies. If one group has a trait that naturally protects them against heart disease, then researchers can potentially develop medical therapies that induce or recreate that protective feature in others. In such a manner, Page and colleagues hope that their work to identify baseline sex differences could ultimately contribute to advances in prevention and treatment of heart disease.

The new work takes the first step by identifying relevant baseline sex differences. The researchers combined their expertise in sex differences with heart expertise provided by co-authors Christine Seidman, a Harvard Medical School professor and director of the Cardiovascular Genetics Center at Brigham and Women’s Hospital; Harvard Medical School Professor Jonathan Seidman; and Zoltan Arany, a professor and director of the Cardiovascular Metabolism Program at the University of Pennsylvania.

Along with providing heart expertise, the Seidmans and Arany provided data collected from healthy hearts. Gaining access to healthy heart tissue is difficult, and so the researchers felt fortunate to be able to perform new analyses on existing datasets that had not previously been looked at in the context of sex differences. The researchers also used data from the publicly available Genotype-Tissue Expression Project. Collectively, the datasets provided information on bulk and single cell gene expression, as well as metabolomics, of heart tissue—and in particular, of cardiomyocytes.

The researchers searched these datasets for differences between male and female hearts, and found evidence that female cardiomyocytes have higher activity of the primary pathway for energy generation than male cardiomyocytes. Fatty acid oxidation (FAO) is the pathway that produces most of the energy that powers the heart, in the form of the energy molecule ATP. The researchers found that many genes involved in FAO have higher expression levels in female cardiomyocytes. Metabolomic data reinforced these findings by showing that female hearts had greater flux of free fatty acids, the molecules used in FAO, and that female hearts used more free fatty acids than did males in the generation of ATP.

Altogether, these findings show that there are fundamental differences in how female and male hearts generate energy to pump blood. Further experiments are needed to explore whether these differences contribute to the sex differences seen in heart disease. The researchers suspect that an association is likely, because energy production is essential to heart function and failure.

In the meantime, Page and his lab members continue to investigate the biology underlying sex differences in tissues and organs throughout the body.

“We have a lot to learn about the molecular origins of sex differences in health and disease,” Chmátal says. “What’s exciting to me is that the knowledge that comes from these basic science discoveries could lead to treatments that benefit men and women, as well as to policy changes that take sex differences into account when determining how doctors are trained and patients are diagnosed and treated.”

MIT biologists discover a new type of control over RNA splicing

They identified proteins that influence splicing of about half of all human introns, allowing for more complex types of gene regulation.

Anne Trafton | MIT News
February 20, 2025

RNA splicing is a cellular process that is critical for gene expression. After genes are copied from DNA into messenger RNA, portions of the RNA that don’t code for proteins, called introns, are cut out and the coding portions are spliced back together.

This process is controlled by a large protein-RNA complex called the spliceosome. MIT biologists have now discovered a new layer of regulation that helps to determine which sites on the messenger RNA molecule the spliceosome will target.

The research team discovered that this type of regulation, which appears to influence the expression of about half of all human genes, is found throughout the animal kingdom, as well as in plants. The findings suggest that the control of RNA splicing, a process that is fundamental to gene expression, is more complex than previously known.

“Splicing in more complex organisms, like humans, is more complicated than it is in some model organisms like yeast, even though it’s a very conserved molecular process. There are bells and whistles on the human spliceosome that allow it to process specific introns more efficiently. One of the advantages of a system like this may be that it allows more complex types of gene regulation,” says Connor Kenny, an MIT graduate student and the lead author of the study.

Christopher Burge, the Uncas and Helen Whitaker Professor of Biology at MIT, is the senior author of the study, which appears today in Nature Communications.

Building proteins

RNA splicing, a process discovered in the late 1970s, allows cells to precisely control the content of the mRNA transcripts that carry the instructions for building proteins.

Each mRNA transcript contains coding regions, known as exons, and noncoding regions, known as introns. Transcripts also include sites that act as signals for where splicing should occur, allowing the cell to assemble the correct sequence for a desired protein. This process enables a single gene to produce multiple proteins; over evolutionary timescales, splicing can also change the size and content of genes and proteins, when different exons become included or excluded.

The spliceosome, which forms on introns, is composed of proteins and noncoding RNAs called small nuclear RNAs (snRNAs). In the first step of spliceosome assembly, an snRNA molecule known as U1 snRNA binds to the 5’ splice site at the beginning of the intron. Until now, it had been thought that the binding strength between the 5’ splice site and the U1 snRNA was the most important determinant of whether an intron would be spliced out of the mRNA transcript.

In the new study, the MIT team discovered that a family of proteins called LUC7 also helps to determine whether splicing will occur, but only for a subset of introns — in human cells, up to 50 percent.

Before this study, it was known that LUC7 proteins associate with U1 snRNA, but the exact function wasn’t clear. There are three different LUC7 proteins in human cells, and Kenny’s experiments revealed that two of these proteins interact specifically with one type of 5’ splice site, which the researchers call “right-handed.” A third human LUC7 protein interacts with a different type, which the researchers call “left-handed.”

The researchers found that about half of human introns contain a right- or left-handed site, while the other half do not appear to be controlled by interaction with LUC7 proteins. This type of control appears to add another layer of regulation that helps remove specific introns more efficiently, the researchers say.

“The paper shows that these two different 5’ splice site subclasses exist and can be regulated independently of one another,” Kenny says. “Some of these core splicing processes are actually more complex than we previously appreciated, which warrants more careful examination of what we believe to be true about these highly conserved molecular processes.”

“Complex splicing machinery”

Previous work has shown that mutation or deletion of one of the LUC7 proteins that bind to right-handed splice sites is linked to blood cancers, including about 10 percent of acute myeloid leukemias (AMLs). In this study, the researchers found that AMLs that lost a copy of the LUC7L2 gene have inefficient splicing of right-handed splice sites. These cancers also developed the same type of altered metabolism seen in earlier work.

“Understanding how the loss of this LUC7 protein in some AMLs alters splicing could help in the design of therapies that exploit these splicing differences to treat AML,” Burge says. “There are also small molecule drugs for other diseases such as spinal muscular atrophy that stabilize the interaction between U1 snRNA and specific 5’ splice sites. So the knowledge that particular LUC7 proteins influence these interactions at specific splice sites could aid in improving the specificity of this class of small molecules.”

Working with a lab led by Sascha Laubinger, a professor at Martin Luther University Halle-Wittenberg, the researchers found that introns in plants also have right- and left-handed 5’ splice sites that are regulated by Luc7 proteins.

The researchers’ analysis suggests that this type of splicing arose in a common ancestor of plants, animals, and fungi, but it was lost from fungi soon after they diverged from plants and animals.

“A lot of what we know about how splicing works and what are the core components actually comes from relatively old yeast genetics work,” Kenny says. “What we see is that humans and plants tend to have more complex splicing machinery, with additional components that can regulate different introns independently.”

The researchers now plan to further analyze the structures formed by the interactions of Luc7 proteins with mRNA and the rest of the spliceosome, which could help them figure out in more detail how different forms of Luc7 bind to different 5’ splice sites.

The research was funded by the U.S. National Institutes of Health and the German Research Foundation.

A planarian’s guide to growing a new head

Researchers at the Whitehead Institute have described a pathway by which planarians, freshwater flatworms with spectacular regenerative capabilities, can restore large portions of their nervous system, even regenerating a new head with a fully functional brain.

Shafaq Zia | Whitehead Institute
February 6, 2025

Cut off any part of this worm’s body and it will regrow. This is the spectacular yet mysterious regenerative ability of freshwater flatworms known as planarians. The lab of Whitehead Institute Member Peter Reddien investigates the principles underlying this remarkable feat. In their latest study, published in PLOS Genetics on February 6, first author staff scientist M. Lucila Scimone, Reddien, and colleagues describe how planarians restore large portions of their nervous system—even regenerating a new head with a fully functional brain—by manipulating a signaling pathway.

This pathway, called the Delta-Notch signaling pathway, enables neurons to guide the differentiation of a class of progenitors—immature cells that will differentiate into specialized types—into glia, the non-neuronal cells that support and protect neurons. The mechanism ensures that the spatial pattern and relative numbers of neurons and glia at a given location are precisely restored following injury.

“This process allows planarians to regenerate neural circuits more efficiently because glial cells form only where needed, rather than being produced broadly within the body and later eliminated,” said Reddien, who is also a professor of biology at Massachusetts Institute of Technology and an Investigator with the Howard Hughes Medical Institute.

Coordinating regeneration

Multiple cell types work together to form a functional human brain. These include neurons and a more abundant group of cells called glial cells—astrocytes, microglia, and oligodendrocytes. Although glial cells are not the fundamental units of the nervous system, they perform critical functions in maintaining the connections between neurons, called synapses, clearing away dead cells and other debris, and regulating neurotransmitter levels, effectively holding the nervous system together like glue. A few years ago, Reddien and colleagues discovered cells in planarians that looked like glial cells and performed similar neuro-supportive functions. This led to the first characterization of glial cells in planarians in 2016.

Unlike in mammals, where the same set of neural progenitors gives rise to both neurons and glia, glial cells in planarians originate from a separate, specialized group of progenitors. These progenitors, called phagocytic progenitors, can give rise not only to glial cells but also to pigment cells that determine the worm’s coloration, as well as to other, less well understood cell types.

Why neurons and glia in planarians originate from distinct progenitors—and what factors ultimately determine the differentiation of phagocytic progenitors into glia—are questions that still puzzled Reddien and team members. Then, a study showing that planarian neurons regenerate before glia formation led the researchers to wonder whether a signaling mechanism between neurons and phagocytic progenitors guides the specification of glia in planarians.

The first step to unravel this mystery was to look at the Notch signaling pathway, which is known to play a crucial role in the development of neurons and glia in other organisms, and determine its role in planarian glia regeneration. To do this, the researchers used RNA interference (RNAi)—a technique that decreases or completely silences the expression of genes—to turn off key genes involved in the Notch pathway and amputated the planarian’s head. It turned out Notch signaling is essential for glia regeneration and maintenance in planarians—no glial cells were found in the animal following RNAi, while the differentiation of other types of phagocytic cells was unaffected.

Of the different Notch signaling pathway components the researchers tested, turning off the genes notch-1, delta-2, and suppressor of hairless produced this phenotype. Interestingly, the signaling molecule Delta-2 was found on the surface of neurons, whereas Notch-1 was expressed in phagocytic progenitors.

With these findings in hand, the researchers hypothesized that interaction between Delta-2 on neurons and Notch-1 on phagocytic progenitors could be governing the final fate determination of glial cells in planarians.

To test the hypothesis, the researchers transplanted eyes either from planarians lacking the notch-1 gene or from planarians lacking the delta-2 gene into wild-type animals and assessed the formation of glial cells around the transplant site. They observed that glial cells still formed around the notch-1 deficient eyes, as notch-1 was still active in the glial progenitors of the host wild-type animal. However, no glial cells formed around the delta-2 deficient eyes, even with the Notch signaling pathway intact in phagocytic progenitors, confirming that delta-2 in the photoreceptor neurons is required for the differentiation of phagocytic progenitors into glia near the eye.

“This experiment really showed us that you have two faces of the same coin: one is the phagocytic progenitors expressing Notch-1, and one is the neurons expressing Delta-2, working together to guide the specification of glia in the organism,” said Scimone.

The researchers have named this phenomenon coordinated regeneration, as it allows neurons to influence the pattern and number of glia at specific locations without the need for a separate mechanism to adjust the relative numbers of neurons and glia.

The group is now interested in investigating whether the same phenomenon might also be involved in the regeneration of other tissue types.

A sum of their parts

Researchers in the Department of Biology at MIT use an AI-driven approach to computationally predict short amino acid sequences that can bind to or inhibit a target, with a potential for great impact on fundamental biological research and therapeutic applications.

Lillian Eden | Department of Biology
February 6, 2025

All biological function is dependent on how different proteins interact with each other. Protein-protein interactions facilitate everything from transcribing DNA and controlling cell division to higher-level functions in complex organisms.

Much remains unclear about how these functions are orchestrated on the molecular level, however, and how proteins interact with each other — either with other proteins or with copies of themselves. 

Recent findings have revealed that small protein fragments have a lot of functional potential. Even though they are incomplete pieces, short stretches of amino acids can still bind to interfaces of a target protein, recapitulating native interactions. Through this process, they can alter that protein’s function or disrupt its interactions with other proteins. 

Protein fragments could therefore empower basic research on protein interactions and cellular processes, and could potentially have therapeutic applications.

Recently published in Proceedings of the National Academy of Sciences, a new computational method developed in the Department of Biology at MIT builds on existing AI models to computationally predict protein fragments that can bind to and inhibit full-length proteins in E. coli. Theoretically, this tool could lead to genetically encodable inhibitors against any protein. 

The work was done in the lab of Associate Professor of Biology and HHMI Investigator Gene-Wei Li in collaboration with the lab of Jay A. Stein (1968) Professor of Biology, Professor of Biological Engineering and Department Head Amy Keating.

Leveraging Machine Learning

The program, called FragFold, leverages AlphaFold, an AI model that has led to phenomenal advancements in biology in recent years due to its ability to predict protein folding and protein interactions. 

The goal of the project was to predict fragment inhibitors, which is a novel application of AlphaFold. The researchers on this project confirmed experimentally that more than half of FragFold’s predictions for binding or inhibition were accurate, even when researchers had no previous structural data on the mechanisms of those interactions. 

“Our results suggest that this is a generalizable approach to find binding modes that are likely to inhibit protein function, including for novel protein targets, and you can use these predictions as a starting point for further experiments,” says co-first and corresponding author Andrew Savinov, a postdoc in the Li Lab. “We can really apply this to proteins without known functions, without known interactions, without even known structures, and we can put some credence in these models we’re developing.”

One example is FtsZ, a protein that is key for cell division. It is well-studied but contains a region that is intrinsically disordered and, therefore, especially challenging to study. Disordered proteins are dynamic, and their functional interactions are very likely fleeting — occurring so briefly that current structural biology tools can’t capture a single structure or interaction. 

The researchers leveraged FragFold to explore the activity of fragments of FtsZ, including fragments of the intrinsically disordered region, to identify several new binding interactions with various proteins. This leap in understanding confirms and expands upon previous experiments measuring FtsZ’s biological activity. 

This progress is significant in part because it was made without solving the disordered region’s structure, and because it exhibits the potential power of FragFold.

“This is one example of how AlphaFold is fundamentally changing how we can study molecular and cell biology,” Keating says. “Creative applications of AI methods, such as our work on FragFold, open up unexpected capabilities and new research directions.”

Inhibition, and beyond

The researchers accomplished these predictions by computationally fragmenting each protein and then modeling how those fragments would bind to interaction partners they thought were relevant.

They compared the maps of predicted binding across the entire sequence to the effects of those same fragments in living cells, determined using high-throughput experimental measurements in which millions of cells each produce one type of protein fragment. 

AlphaFold uses co-evolutionary information to predict folding, and typically evaluates the evolutionary history of proteins using multiple sequence alignments (MSAs) for every single prediction run. The MSAs are critical but are a bottleneck for large-scale predictions, because they can take a prohibitive amount of time and computational power.

For FragFold, the researchers instead pre-calculated the MSA for a full-length protein once and used that result to guide the predictions for each fragment of that full-length protein. 
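The fragment-and-reuse idea can be sketched in a few lines of Python. This is an illustration only, not FragFold's own code: the function names, the window and step sizes, and the toy alignment are hypothetical, and the sketch assumes that "using the full-length result" means cutting the matching columns out of the full-length alignment for each fragment.

```python
def tile_fragments(length, window, step=1):
    """Return (start, end) index pairs for every window-length fragment."""
    if window > length:
        return []
    return [(s, s + window) for s in range(0, length - window + 1, step)]


def slice_alignment(alignment, start, end):
    """Cut the same column range out of every aligned sequence."""
    return [row[start:end] for row in alignment]


# Toy alignment (hypothetical data): the first row stands in for the
# full-length query protein, the other rows for homologs. '-' marks a gap.
full_msa = [
    "MKTAYIAKQR",
    "MKSAYLAKQR",
    "MRTA-IAKHR",
]

# The alignment is built once; each fragment reuses a slice of it
# instead of triggering a new alignment search.
fragments = tile_fragments(len(full_msa[0]), window=4, step=3)
fragment_msas = {span: slice_alignment(full_msa, *span) for span in fragments}

for span, rows in fragment_msas.items():
    print(span, rows)
```

The expensive step, building the alignment, happens once per protein, while the per-fragment work reduces to cheap slicing.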

Savinov, together with Keating Lab alum Sebastian Swanson, PhD ’23, predicted inhibitory fragments of a diverse set of proteins in addition to FtsZ. Among the interactions they explored was a complex between lipopolysaccharide transport proteins LptF and LptG. A protein fragment of LptG inhibited this interaction, presumably disrupting the delivery of lipopolysaccharide, which is a crucial component of the E. coli outer cell membrane essential for cellular fitness.

“The big surprise was that we can predict binding with such high accuracy and, in fact, often predict binding that corresponds to inhibition,” Savinov says. “For every protein we’ve looked at, we’ve been able to find inhibitors.”

The researchers initially focused on protein fragments as inhibitors because whether a fragment could block an essential function in cells is a relatively simple outcome to measure systematically. Looking forward, Savinov is also interested in exploring fragment function outside inhibition, such as fragments that can stabilize the protein they bind to, enhance or alter its function, or trigger protein degradation. 

Design, in principle 

This research is a starting point for developing a systemic understanding of cellular design principles, and what elements deep-learning models may be drawing on to make accurate predictions. 

“There’s a broader, further-reaching goal that we’re building towards,” Savinov says. “Now that we can predict them, can we use the data we have from predictions and experiments to pull out the salient features to figure out what AlphaFold has actually learned about what makes a good inhibitor?” 

Savinov and collaborators also delved further into how protein fragments bind, exploring other protein interactions and mutating specific residues to see how those mutations change how the fragment interacts with its target.

Experimentally examining the behavior of thousands of mutated fragments within cells, an approach known as deep mutational scanning, revealed key amino acids that are responsible for inhibition. In some cases, the mutated fragments were even more potent inhibitors than their natural, full-length sequences. 

“Unlike previous methods, we are not limited to identifying fragments in experimental structural data,” says Swanson. “The core strength of this work is the interplay between high-throughput experimental inhibition data and the predicted structural models: the experimental data guides us towards the fragments that are particularly interesting, while the structural models predicted by FragFold provide a specific, testable hypothesis for how the fragments function on a molecular level.”

Savinov is excited about the future of this approach and its myriad applications.

“By creating compact, genetically encodable binders, FragFold opens a wide range of possibilities to manipulate protein function,” Li agrees. “We can imagine delivering functionalized fragments that can modify native proteins, change their subcellular localization, and even reprogram them to create new tools for studying cell biology and treating diseases.” 

A new approach to modeling complex biological systems

MIT engineers’ new model could help researchers glean insights from genomic data and other huge datasets. This is potentially critical to researchers who study any kind of complex biological system, according to senior author Douglas Lauffenburger.

Anne Trafton | MIT News
November 5, 2024

Over the past two decades, new technologies have helped scientists generate a vast amount of biological data. Large-scale experiments in genomics, transcriptomics, proteomics, and cytometry can produce enormous quantities of data from a given cellular or multicellular system.

However, making sense of this information is not always easy. This is especially true when trying to analyze complex systems such as the cascade of interactions that occur when the immune system encounters a foreign pathogen.

MIT biological engineers have now developed a new computational method for extracting useful information from these datasets. Using their new technique, they showed that they could unravel a series of interactions that determine how the immune system responds to tuberculosis vaccination and subsequent infection.

This strategy could be useful to vaccine developers and to researchers who study any kind of complex biological system, says Douglas Lauffenburger, the Ford Professor of Engineering in the departments of Biological Engineering, Biology, and Chemical Engineering.

“We’ve landed on a computational modeling framework that allows prediction of effects of perturbations in a highly complex system, including multiple scales and many different types of components,” says Lauffenburger, the senior author of the new study.

Shu Wang, a former MIT postdoc who is now an assistant professor at the University of Toronto, and Amy Myers, a research manager in the lab of University of Pittsburgh School of Medicine Professor JoAnne Flynn, are the lead authors of a new paper on the work, which appears today in the journal Cell Systems.

Modeling complex systems

When studying complex biological systems such as the immune system, scientists can extract many different types of data. Sequencing cell genomes tells them which gene variants a cell carries, while analyzing messenger RNA transcripts tells them which genes are being expressed in a given cell. Using proteomics, researchers can measure the proteins found in a cell or biological system, and cytometry allows them to quantify a myriad of cell types present.

Using computational approaches such as machine learning, scientists can use this data to train models to predict a specific output based on a given set of inputs — for example, whether a vaccine will generate a robust immune response. However, that type of modeling doesn’t reveal anything about the steps that happen in between the input and the output.

“That AI approach can be really useful for clinical medical purposes, but it’s not very useful for understanding biology, because usually you’re interested in everything that’s happening between the inputs and outputs,” Lauffenburger says. “What are the mechanisms that actually generate outputs from inputs?”

To create models that can identify the inner workings of complex biological systems, the researchers turned to a type of model known as a probabilistic graphical network. These models represent each measured variable as a node, generating maps of how each node is connected to the others.

Probabilistic graphical networks are often used for applications such as speech recognition and computer vision, but they have not been widely used in biology.

Lauffenburger’s lab has previously used this type of model to analyze intracellular signaling pathways, which required analyzing just one kind of data. To adapt this approach to analyze many datasets at once, the researchers applied a mathematical technique that can filter out any correlations between variables that are not directly affecting each other. This technique, known as graphical lasso, is an adaptation of the method often used in machine learning models to strip away results that are likely due to noise.

“With correlation-based network models generally, one of the problems that can arise is that everything seems to be influenced by everything else, so you have to figure out how to strip down to the most essential interactions,” Lauffenburger says. “Using probabilistic graphical network frameworks, one can really boil down to the things that are most likely to be direct and throw out the things that are most likely to be indirect.”
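The distinction between direct and indirect relationships can be illustrated with a small simulation. The sketch below is not the researchers' model and does not run graphical lasso itself; it shows the underlying idea that graphical lasso builds on, namely that entries of the inverse covariance (precision) matrix reflect direct dependencies. Graphical lasso additionally applies a penalty that pushes weak entries to exactly zero. The three variables are simulated and hypothetical: x influences y, and y influences z, so x and z are linked only indirectly.

```python
import math
import random


def covariance_matrix(columns):
    """Sample covariance matrix for a list of equally long data columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    return [
        [
            sum((a - ma) * (b - mb) for a, b in zip(ca, cb)) / (n - 1)
            for cb, mb in zip(columns, means)
        ]
        for ca, ma in zip(columns, means)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    size = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def correlation(cov, i, j):
    """Ordinary (marginal) correlation between variables i and j."""
    return cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])


def partial_correlation(precision, i, j):
    """Correlation between i and j after accounting for all other variables."""
    return -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])


# Simulated chain x -> y -> z (hypothetical data, fixed seed for repeatability).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(5000)]
y = [xi + rng.gauss(0, 1) for xi in x]
z = [yi + rng.gauss(0, 1) for yi in y]

cov = covariance_matrix([x, y, z])
precision = invert(cov)

marginal_xz = correlation(cov, 0, 2)
direct_xz = partial_correlation(precision, 0, 2)
print(f"correlation(x, z)           = {marginal_xz:.2f}")
print(f"partial correlation(x, z|y) = {direct_xz:.2f}")
```

In this toy chain, x and z are clearly correlated, yet their partial correlation is close to zero once y is taken into account, which is the sense in which an indirect link can be "thrown out" while the direct x-y and y-z links remain.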

Mechanism of vaccination

To test their modeling approach, the researchers used data from studies of a tuberculosis vaccine. This vaccine, known as BCG, is an attenuated form of Mycobacterium bovis. It is used in many countries where TB is common but isn’t always effective, and its protection can weaken over time.

In hopes of developing more effective TB protection, researchers have been testing whether delivering the BCG vaccine intravenously or by inhalation might provoke a better immune response than injecting it. Those studies, performed in animals, found that the vaccine did work much better when given intravenously. In the MIT study, Lauffenburger and his colleagues attempted to discover the mechanism behind this success.

The data that the researchers examined in this study included measurements of about 200 variables, including levels of cytokines, antibodies, and different types of immune cells, from about 30 animals.

The measurements were taken before vaccination, after vaccination, and after TB infection. By analyzing the data using their new modeling approach, the MIT team was able to determine the steps needed to generate a strong immune response. They showed that the vaccine stimulates a subset of T cells, which produce a cytokine that activates a set of B cells that generate antibodies targeting the bacterium.

“Almost like a roadmap or a subway map, you could find what were really the most important paths. Even though a lot of other things in the immune system were changing one way or another, they were really off the critical path and didn’t matter so much,” Lauffenburger says.

The researchers then used the model to make predictions for how a specific disruption, such as suppressing a subset of immune cells, would affect the system. The model predicted that if B cells were nearly eliminated, there would be little impact on the vaccine response, and experiments showed that prediction was correct.

This modeling approach could be used by vaccine developers to predict the effect their vaccines may have, and to make tweaks that would improve them before testing them in humans. Lauffenburger’s lab is now using the model to study the mechanism of a malaria vaccine that has been given to children in Kenya, Ghana, and Malawi over the past few years.

“The advantage of this computational approach is that it filters out many biological targets that only indirectly influence the outcome and identifies those that directly regulate the response. Then it’s possible to predict how therapeutically altering those biological targets would change the response. This is significant because it provides the basis for future vaccine and trial designs that are more data driven,” says Kathryn Miller-Jensen, a professor of biomedical engineering at Yale University, who was not involved in the study.

Lauffenburger’s lab is also using this type of modeling to study the tumor microenvironment, which contains many types of immune cells and cancerous cells, in hopes of predicting how tumors might respond to different kinds of treatment.

The research was funded by the National Institute of Allergy and Infectious Diseases.

Sauer & Davis Lab News Brief: structures of molecular woodchippers reveal mechanism for versatility

Rest in pieces: deconstructing polypeptide degradation machinery

Lillian Eden | Department of Biology
November 12, 2024

Research from the Sauer and Davis Labs in the Department of Biology at MIT shows that conformational changes contribute to the specificity of “molecular woodchippers.”

Degradation is a crucial process for maintaining protein homeostasis by culling excess or damaged proteins whose components can then be recycled. It is also a highly regulated process—for good reason. A cell could potentially waste many resources if the degradation machinery destroys proteins it shouldn’t. 

One of the major pathways for protein degradation in bacteria and eukaryotic mitochondria involves a molecular machine called ClpXP. ClpXP is made up of two components: a star-shaped structure made up of six subunits called ClpX that engages and unfolds proteins tagged for degradation, and an associated barrel-shaped enzyme, called ClpP, that chemically breaks up proteins into small pieces called peptides. 

ClpXP is incredibly adaptable and is often compared to a woodchipper — able to take in materials and spit out their broken-down components. Thanks to biochemical experiments, this molecular degradation machine is known to be able to break down hundreds of different proteins in the cell regardless of physical or chemical properties such as size, shape, or charge. ClpX uses energy from ATP hydrolysis to unfold proteins before they are threaded through its central channel, referred to as the axial channel, and into the degradation chamber of ClpP.

In three papers, one in PNAS and two in Nature Communications, researchers from the Department of Biology at MIT have expanded our understanding of how this molecular machinery engages with, unfolds, and degrades proteins — and how that machinery refrains, by design, from unfolding proteins not tagged for degradation. 

Alireza Ghanbarpour, until recently a postdoc in the Sauer Lab and Davis Lab and first author on all three papers, began with a simple question: given the vast repertoire of potential substrates — that is, proteins to be degraded — how is ClpXP so specific?

Ghanbarpour — now an assistant professor in the Department of Biochemistry and Molecular Biology at Washington University School of Medicine in St. Louis — found that the answer to this question lies in conformational changes in the molecular machine as it engages with an ill-fated protein. 

Reverse Engineering using Structural Insights

Ghanbarpour approached the question of ClpXP’s versatility by characterizing conformational changes of the molecular machine using a technique called cryogenic electron microscopy. In cryo-EM, sample particles are frozen in solution, and images are collected; algorithms then create 3D renderings from the 2D images.

“It’s really useful to generate different structures in different conditions and then put them together until you know how a machine works,” he says. “I love structural biology, and these molecular machines make fascinating targets for structural work and biochemistry. Their structural plasticity and precise functions offer exciting opportunities to understand how nature leverages enzyme conformations to generate novel functions and tightly regulate protein degradation within the cell.”

Inside the cell, these proteases do not work alone but instead work together with “adaptor” proteins, which can promote — or inhibit — degradation by ClpXP. One of the adaptor proteins that promotes degradation by ClpXP is SspB. 

In E. coli and most other bacteria, ClpXP and SspB interact with a tag called ssrA that is added to incomplete proteins when their biosynthesis on ribosomes stalls. 

The tagging process frees up the ribosome to make more proteins, but creates a problem: incomplete proteins are prone to aggregation, which could be detrimental to cellular health and can lead to disease. By interacting with the degradation tag, ClpXP and SspB help to ensure the degradation of these incomplete proteins. Understanding this process and how it may go awry may open therapeutic avenues in the future.

“It wasn’t clear how certain adaptors were interacting with the substrate and the molecular machines during substrate delivery,” Ghanbarpour notes. “My recent structure reveals that the adaptor engages with the enzyme, reaching deep into the axial channel to deliver the substrate.”

Ghanbarpour and colleagues showed that ClpX engages with both the SspB adaptor and the ssrA degradation tag of an ill-fated protein at the same time. They also found that this interaction occurs while the upper part of the axial channel through ClpX is closed. In fact, the closed channel allows ClpX to contact both the tag and the adaptor simultaneously.

This result was surprising, according to senior author and Salvador E. Luria Professor of Biology Robert Sauer, whose lab has been working on understanding this molecular machine for more than two decades. Until now, it had been unclear whether the channel through ClpX closes in response to a substrate interaction or is always closed until it opens to pass an unfolded protein down to ClpP to be degraded.

Preventing Rogue Degradation

Throughout this project, Ghanbarpour was co-advised by structural biologist and Associate Professor of Biology Joey Davis and collaborated with members of the Davis Lab to better understand the conformational changes that allow these molecular machines to function. Using a cryo-EM analysis approach developed in the Davis Lab called cryoDRGN, the researchers showed that there is an equilibrium between ClpXP in the open and closed states: the channel is usually closed but is open in about 10% of the particles in their samples.

The closed state is almost identical to the conformation ClpXP assumes when it is engaged with an ssrA-tagged substrate and the SspB adaptor. 

To better understand the biological significance of this equilibrium, Ghanbarpour created a mutant of ClpXP that is always in the open position. Compared to normal ClpXP, the mutant was faster at degrading some proteins that lack obvious degradation tags but slower at degrading ssrA-tagged proteins.

According to Ghanbarpour, these results indicate that the closed channel improves ClpXP’s ability to efficiently engage tagged proteins meant to be degraded, whereas the open channel allows more “promiscuous” degradation. 

Pausing the Process

The next question Ghanbarpour wanted to answer was what this molecular machine looks like while engaged with a protein it is attempting to unfold. To do that, he created a substrate in which a highly stable protein is attached to the degradation tag. The tag is initially pulled into ClpX, but the stable protein then dramatically slows unfolding and degradation.

In the structures where the degradation process stalls, Ghanbarpour found that the degradation tag was pulled far into the molecular machine, through ClpX and into ClpP, and the folded protein part of the substrate was pulled tightly against the axial channel of ClpX.

The opening of the axial channel, called the axial pore, is made up of looping protein structures called RKH loops. These flexible loops were found to play roles both in recognizing the ssrA degradation tag and in how substrates or the SspB adaptor interact with or are pulled against the channel during degradation. 

The flexibility of these RKH loops allows ClpX to interact with a large number of different proteins and adaptors, and these results clarify some previous biochemical and mutational studies of interactions between the substrate and ClpXP.

Although Ghanbarpour’s recent work focused on just one adaptor and degradation tag, he notes that there are many more targets: ClpXP is something akin to a Swiss Army knife for breaking down polypeptide chains.

The way those other substrates interact with ClpXP could differ from the structures solved with the SspB adaptor and ssrA tag. It also stands to reason that the way ClpXP reacts to each substrate may be unique. For example, given that ClpX is occasionally in an open state, some substrates may engage with ClpXP only while it’s in an open conformation. 

In his new position at Washington University, Ghanbarpour intends to continue exploring how ClpXP and other molecular machines locate their target substrates and interact with adaptors, shedding light on how cells regulate protein degradation and maintain protein homeostasis.

The structures Ghanbarpour solved involved free-floating protein degradation machinery, but membrane-bound degradation machinery also exists. The structure and conformational adaptations of the membrane-bound version may differ from the structures Ghanbarpour found in his previous three papers. Indeed, in a recent preprint, Ghanbarpour worked on the cryo-EM structure of a nautilus shell-shaped protein assembly that seems to control membrane-bound degradation machinery. This assembly plays a critical role in regulating protein degradation within the bacterial inner membrane.

“The function of these proteases goes beyond simply degrading damaged proteins. They also target transcription factors, regulatory proteins, and proteins that don’t exist in normal conditions,” he says. “My new lab is particularly interested in understanding how cells use these proteases and their accessory adaptors, both under normal and stress conditions, to reshape the proteome and support recovery from cellular distress.”