Sometimes science takes a village
Greta Friar | Whitehead Institute
February 17, 2022

Alexandra Navarro, a graduate student in Whitehead Institute Member Iain Cheeseman’s lab, was studying the gene for CENPR, a protein related to cell division—the Cheeseman lab’s research focus—when she came across something interesting: another molecule hidden in CENPR’s genetic code. The hidden molecule is a peptide only 37 amino acids long, too small to show up in most surveys of the cell. It gets created only when the genetic code for CENPR is translated from an offset start and stopping place—essentially, when a cell reads the instructions for making CENPR in a different way. The Cheeseman lab has become very interested in these sorts of hidden molecules, which they have found lurking in a number of other molecules’ genetic codes. Navarro began studying the peptide as a side project during slow periods in her main research on cell division proteins. However, as her research on the peptide progressed, Navarro eventually found herself unsure of how to proceed. CENPR belongs in the centromere, a part of the cell necessary for cell division, but the alternative peptide ends up in the Golgi, a structure that helps to modify molecules and prep them for delivery to different destinations. In other words, the peptide had nothing to do with the part of the cell that Navarro and Cheeseman typically study.

Usually when Navarro comes across something outside of her area of expertise, she will consult with her lab mates, others in Whitehead Institute, or nearby collaborators. However, none of her usual collaborators’ research focuses on the Golgi, so this time Cheeseman suggested that Navarro share what they had found and ask for input from as wide a circle of researchers as possible—on the internet. Often, researchers guard their work in progress carefully, reluctant to share it lest they be scooped, which means someone else publishes a paper on the same topic first. In the competitive world of academic research, where publishing papers is a key part of getting jobs, tenure, and future funding, the specter of scooping can loom large. But science is also an inherently collaborative practice, with scientists contributing droplets of discovery to a shared pool of knowledge, so that new findings can be built upon what came before. Cheeseman is a board member of ASAPbio (Accelerating Science and Publication in biology), a nonprofit that promotes open communication, the use of preprints, and transparent peer review in the life sciences. Researchers like Cheeseman believe that if science adopts more transparent and collaborative practices, such as more frequently and widely sharing research in progress, this will benefit both the people involved and the quality of the science, and will speed up the search for discoveries with the potential to positively impact humankind. But how helpful are such “open science” practices in reality? Navarro and Cheeseman had the perfect opportunity to find out.

The power of preprints

Navarro and Cheeseman wrote up what they knew so far (they had found a hidden peptide that localizes exclusively to the Golgi, and it stays there throughout the cell cycle) as a “preprint in progress,” an incomplete draft of a paper that acknowledges there is more to come. In December 2020, they posted the preprint in progress to bioRxiv, a website that serves as a repository for biology preprints, or papers that have not yet been published. The site was inspired by arXiv, a similar repository launched in 1991 to provide free and easy access to research in math, physics, computer science, and similar fields. arXiv has become a central hub for research in these fields, with an average of 10,000 to 15,000 submissions and 30 million downloads per month. The biology fields were slower to create such a hub: bioRxiv launched in 2013. In December 2021, it received around 3,000 submissions and 2.3 million downloads.

Navarro and Cheeseman’s decision to post a preprint in progress to bioRxiv is not common practice, but a lot of researchers have started posting preprints that resemble the final paper closer to publication. Some journals even require it. This type of early sharing has many benefits: contrary to the fear that sharing research before publication will lead to scooping, it allows researchers to stake a claim sooner by making their work public record pre-publication. Preprints enable researchers to show off their most current work during the narrow windows of the academic job cycle. This can be particularly crucial for early career researchers whose biggest project to date (such as graduate thesis work) is still in publication limbo. Preprints also allow new ideas and knowledge to get out into the world sooner, the better to inspire other researchers. Throughout biology’s history, findings that seemed minor at first have provided the key insight for someone else’s major discovery. The sooner research is shared, the sooner it can be built upon to develop important advances, like new medicines or a better model of how a disease spreads.

Navarro and Cheeseman weren’t expecting their discovery to have that kind of major impact, but they knew the peptide could be useful to researchers studying the Golgi. The peptide is small and doesn’t disrupt any functions in the cell. Researchers can attach fluorescent proteins to it that make the Golgi glow in imaging. These traits make the peptide a useful potential tool. Since Navarro and Cheeseman posted the preprint, multiple researchers have reached out about using the peptide.

However, the main goal of posting a preprint in progress, as opposed to a polished preprint, is to ask for input to further the research. The morning after the researchers put their preprint on bioRxiv, Cheeseman shared it on Twitter and asked for feedback. Other researchers soon shared the tweet further, and responses started flooding in. Some researchers simply commented that they found the project interesting, which was reassuring for Cheeseman and Navarro.

“It was nice to see that we weren’t the only ones who thought this thing we found was really cool,” Navarro says. “It gave me a lot of motivation to keep moving on with this project.”

Then, some researchers had specific questions and ideas. The topic that seemed of greatest interest was how the peptide ends up at the Golgi, followed by where exactly in the Golgi it ends up. Researchers suggested online tools that might help predict answers to these questions. They proposed different mechanisms that might be involved.

Navarro used these suggestions to design a new series of experiments, in order to better characterize how the peptide associates with the Golgi. She found out that the peptide attaches to the Golgi’s outer-facing membrane. She started developing an understanding of which of the peptide’s 37 amino acids were necessary for Golgi localization, and so was able to narrow in on a 14-amino acid sequence within the peptide that was sufficient for this localization.

Her next question was what specific mechanisms were driving the peptide’s Golgi localization. Navarro had a good lead for one mechanism: the evidence and outside input suggested that after the peptide was created, it likely underwent a modification that gave it a sticky tag to anchor it to the Golgi. What would be the best experiments to confirm this mechanism and determine the other mechanisms involved? Navarro and Cheeseman decided it was time to check back in with the crowd online.

Narrowing in on answers

Navarro and Cheeseman updated their preprint with their new findings, and invited further feedback. This time, they had a specific ask: how to test whether the peptide has the modification they suspected. They received suggestions: a probe, an inhibitor. They also received some unexpected feedback that took them in a new direction. Harmit Malik, professor and associate director of the basic sciences division at Fred Hutch, studies the evolutionary changes that occur in genes. Malik found the peptide interesting enough to dig into its evolutionary history across primates. He emailed Cheeseman and Navarro his findings. Versions of the peptide existed in many primates, and some of the variations between species affected where the peptide ended up. This was a rich new vein of inquiry for Navarro to follow in order to pinpoint exactly which parts of the sequence were necessary for Golgi localization, and the researchers might never have come across it if they had not sought input online.

Guided by the latest set of suggestions, Navarro resumed work on the project. She found evidence that the peptide does undergo the suspected modification. She winnowed down to a 10-amino acid sequence within the peptide that appears to be the minimal sequence necessary for this type of Golgi localization. Navarro and Cheeseman rewrote the paper, adding the discovery of a minimal Golgi targeting sequence, which is basically a postal code that marks a molecule’s destination as the Golgi. They posted a third version of the preprint in September 2021. This time, Cheeseman did not ask Twitter for feedback: the paper may undergo more changes, but it now contains a complete research story.

The changing face of science

Based on their experience, would Cheeseman and Navarro recommend sharing preprints in progress? The answer is a resounding yes—if the project is a good fit. Both agree that for projects like this, where the subject is outside the expertise of a researcher’s usual circle of collaborators, asking the wider scientific community for help can be extremely valuable.

“I often share my research with other people at Whitehead Institute, and other cell division researchers at conferences, but this process allowed me to share it with people who work in different scientific areas, with whom I would not normally engage,” Navarro says.

Cheeseman hopes that sharing hubs like bioRxiv will develop ways for even larger and more diverse groups of scientists to connect.

If researchers are hesitant to use an open science approach, Cheeseman and Navarro recommend testing the waters by starting with a lower stakes project. In this case, Navarro’s Golgi paper was a side project, something of personal interest but not integral to her career. Having had a positive experience using an open approach on this project, Cheeseman and Navarro agree they would be comfortable using such an approach again in the future.

“I wouldn’t suggest sharing a preprint in progress for every paper, but I think constructive opportunities are more plentiful than researchers may realize,” Cheeseman says.

In general, Cheeseman thinks, the biology field needs to re-envision how its science gets shared.

“The idea that one size fits all, that everything needs to be a multi-figure paper in a high impact journal, is just not compatible with the way that people do research,” Cheeseman says. “We need to get flexible and explore and value scholarship in every form.”

As for the peptide paper? Regardless of where it ends up, Cheeseman and Navarro consider their open science experiment a success. By sharing their research and asking for input, they gained insights, research tools, and points of view that took the project from a curious finding to a rich understanding of the mechanisms behind Golgi localization. Their early realization that the peptide functions outside of their region of expertise could have been a dead end. But by being open about what they were working on and what sort of guidance they needed, the researchers were able to overcome that hurdle and decode their mystery peptide, with a little help from the wider scientific community.

Blending machine learning and biology to predict cell fates and other changes
Greta Friar | Whitehead Institute
February 1, 2022

Imagine a ball thrown in the air: it curves up, then down, tracing an arc to a point on the ground some distance away. The path of the ball can be described with a simple mathematical equation, and if you know the equation, you can figure out where the ball is going to land. Biological systems tend to be harder to forecast, but Whitehead Institute Member Jonathan Weissman, postdoc in his lab Xiaojie Qiu, and collaborators at the University of Pittsburgh School of Medicine are working on making the path taken by cells as predictable as the arc of a ball. Rather than looking at how cells move through space, they are considering how cells change with time.

Weissman, Qiu, and collaborators Jianhua Xing, professor of computational and systems biology at the University of Pittsburgh School of Medicine, and Xing lab graduate student Yan Zhang have built a machine learning framework that can define the mathematical equations describing a cell’s trajectory from one state to another, such as its development from a stem cell into one of several different types of mature cell. The framework, called dynamo, can also be used to figure out the underlying mechanisms—the specific cocktail of gene activity—driving changes in the cell. Researchers could potentially use these insights to manipulate cells into taking one path instead of another, a common goal in biomedical research and regenerative medicine.  

The researchers describe dynamo in a paper published in the journal Cell on February 1. They explain the framework’s many analytical capabilities and use it to help understand mechanisms of human blood cell production, such as why one type of blood cell appears earlier than the others.

“Our goal is to move towards a more quantitative version of single cell biology,” Qiu says. “We want to be able to map how a cell changes in relation to the interplay of regulatory genes as accurately as an astronomer can chart a planet’s movement in relation to gravity, and then we want to understand and be able to control those changes.”

How to map a cell’s future journey

Dynamo uses data from many individual cells to come up with its equations. The main information that it requires is how the expression of different genes in a cell changes from moment to moment. The researchers estimate this by looking at changes in the amount of RNA over time, because RNA is a measurable product of gene expression. In the same way that knowing the starting position and velocity of a ball is necessary to understand the arc it will follow, researchers use the starting levels of RNAs and how those RNA levels are changing to predict the path of the cell. However, calculating changes in the amount of RNA from single cell sequencing data is challenging, because sequencing only measures RNA once. Researchers must then use clues, such as the RNA that was still being made at the time of sequencing, together with equations for RNA turnover to estimate how RNA levels were changing. Qiu and colleagues had to improve on previous methods in several ways in order to get clean enough measurements for dynamo to work. In particular, they used a recently developed experimental method that tags new RNA to distinguish it from old RNA, and combined this with sophisticated mathematical modeling, to overcome limitations of older estimation approaches.
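To make the idea of estimating RNA change from a single snapshot more concrete, here is a minimal sketch of one widely used simplification: a steady-state estimate in which newly made (unspliced) RNA is compared with mature (spliced) RNA. It is an illustration only and is not dynamo's estimator, which also draws on tagged new RNA. The rate equation, the function names, and the counts are all assumptions made for this example.

```python
# Toy sketch of a steady-state RNA velocity estimate.
# Assumption: ds/dt = u - gamma * s, with the splicing rate scaled to 1.
# This is NOT dynamo's actual estimator; names and numbers are hypothetical.

def fit_gamma(unspliced, spliced):
    """Least-squares slope through the origin of unspliced vs. spliced counts."""
    num = sum(u * s for u, s in zip(unspliced, spliced))
    den = sum(s * s for s in spliced)
    return num / den

def velocities(unspliced, spliced, gamma):
    """Per-cell estimate of how fast mature RNA is changing for one gene."""
    return [u - gamma * s for u, s in zip(unspliced, spliced)]

# Made-up counts for one gene in five cells.
unspliced = [1.0, 2.0, 3.0, 5.0, 2.0]
spliced = [2.0, 4.0, 6.0, 6.0, 8.0]

gamma = fit_gamma(unspliced, spliced)
v = velocities(unspliced, spliced, gamma)
```

In this toy data, a cell with more unspliced RNA than the fitted ratio predicts gets a positive velocity (expression rising), and a cell with less gets a negative one (expression falling).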

The researchers’ next challenge was to move from observing cells at discrete points in time to a continuous picture of how cells change. The difference is like switching from a map showing only landmarks to a map that shows the uninterrupted landscape, making it possible to trace the paths between landmarks. Led by Qiu and Zhang, the group used machine learning to reveal continuous functions that define these spaces. 

“There have been tremendous advances in methods for broadly profiling transcriptomes and other ‘omic’ information with single-cell resolution. The analytical tools for exploring these data, however, to date have been descriptive instead of predictive. With a continuous function, you can start to do things that weren’t possible with just accurately sampled cells at different states. For example, you can ask: if I changed one transcription factor, how is it going to change the expression of the other genes?” says Weissman, who is also a professor of biology at the Massachusetts Institute of Technology (MIT), a member of the Koch Institute for Integrative Cancer Research at MIT, and an investigator of the Howard Hughes Medical Institute.

Dynamo can visualize these functions by turning them into math-based maps. The terrain of each map is determined by factors like the relative expression of key genes. A cell’s starting place on the map is determined by its current gene expression dynamics. Once you know where the cell starts, you can trace the path from that spot to find out where the cell will end up.
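As a rough illustration of tracing a path from a starting point, the sketch below follows a toy two-gene system in which each gene represses the other. The equations, parameter values, and function names are assumptions chosen for the example. They do not come from the paper, and dynamo learns its functions from data rather than from a hand-written model like this one.

```python
# Toy two-gene "toggle switch" (an illustrative assumption, not a model from the paper).
# x and y are expression levels; each gene represses the other.

def vector_field(x, y, a=4.0, n=2):
    """Rate of change of each gene's expression at state (x, y)."""
    dx = a / (1.0 + y ** n) - x
    dy = a / (1.0 + x ** n) - y
    return dx, dy

def trace_path(x, y, dt=0.01, steps=5000):
    """Follow the vector field from a starting state using simple Euler steps."""
    path = [(x, y)]
    for _ in range(steps):
        dx, dy = vector_field(x, y)
        x, y = x + dt * dx, y + dt * dy
        path.append((x, y))
    return path

# Two starting states that differ only in which gene is slightly ahead.
end_a = trace_path(1.5, 1.0)[-1]
end_b = trace_path(1.0, 1.5)[-1]
```

The two starting points sit close together, yet they settle into opposite end states (one with x high and y low, the other the reverse). This is the sense in which a cell's starting place on the map determines where it ends up.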

The researchers confirmed dynamo’s cell fate predictions by testing it against cloned cells–cells that share the same genetics and ancestry. One of two nearly-identical clones would be sequenced while the other clone went on to differentiate. Dynamo’s predictions for what would have happened to each sequenced cell matched what happened to its clone.

Moving from math to biological insight and non-trivial predictions

With a continuous function for a cell’s path over time determined, dynamo can then gain insights into the underlying biological mechanisms. Calculating derivatives of the function provides a wealth of information, for example by allowing researchers to determine the functional relationships between genes—whether and how they regulate each other. Calculating acceleration can show that a gene’s expression is growing or shrinking quickly even when its current level is low, and can be used to reveal which genes play key roles in determining a cell’s fate very early in the cell’s trajectory. The researchers tested their tools on blood cells, which have a large and branching differentiation tree. Together with blood cell expert Vijay Sankaran of Boston Children’s Hospital, the Dana-Farber Cancer Institute, Harvard Medical School, and Broad Institute of MIT and Harvard, and Eric Lander of Broad Institute, they found that dynamo accurately mapped blood cell differentiation and confirmed a recent finding that one type of blood cell, megakaryocytes, forms earlier than others. Dynamo also discovered the mechanism behind this early differentiation: the gene that drives megakaryocyte differentiation, FLI1, can self-activate, and because of this is present at relatively high levels early on in progenitor cells. This predisposes the progenitors to differentiate into megakaryocytes first.
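The derivative and acceleration calculations mentioned above can be sketched on a toy example. The code below takes a hand-written two-gene function (an assumption for illustration, not a function learned by dynamo), approximates its Jacobian with finite differences, and computes acceleration as the Jacobian applied to the velocity. All names and parameters are hypothetical.

```python
# Toy two-gene system in which each gene represses the other (illustrative assumption).

def vector_field(state, a=4.0, n=2):
    """Velocity (rate of change of expression) at a given state [x, y]."""
    x, y = state
    return [a / (1.0 + y ** n) - x, a / (1.0 + x ** n) - y]

def jacobian(f, state, eps=1e-6):
    """Central-difference Jacobian: J[i][j] = d f_i / d state_j."""
    dim = len(state)
    J = [[0.0] * dim for _ in range(dim)]
    for j in range(dim):
        up = list(state)
        up[j] += eps
        down = list(state)
        down[j] -= eps
        f_up, f_down = f(up), f(down)
        for i in range(dim):
            J[i][j] = (f_up[i] - f_down[i]) / (2 * eps)
    return J

def acceleration(f, state):
    """Acceleration along the trajectory: Jacobian times velocity."""
    J = jacobian(f, state)
    v = f(state)
    return [sum(J[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]

state = [1.5, 1.0]
J = jacobian(vector_field, state)
acc = acceleration(vector_field, state)
```

Here a negative off-diagonal entry such as J[0][1] reads as "gene y represses gene x". Under the same convention, a positive diagonal entry would indicate a gene that activates itself, the kind of relationship the article describes for FLI1.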

The researchers hope that dynamo could not only help them understand how cells transition from one state to another, but also guide researchers in controlling this. To this end, dynamo includes tools to simulate how cells will change based on different manipulations, and a method to find the most efficient path from one cell state to another. These tools provide a powerful framework for researchers to predict how to optimally reprogram any cell type to another, a fundamental challenge in stem cell biology and regenerative medicine, as well as to generate hypotheses of how other genetic changes will alter cells’ fate. There are a variety of possible applications.
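A simulated manipulation of this kind can be sketched with a toy model. The example below reuses a hand-written two-gene system in which each gene represses the other, then reduces the production of one gene to mimic a knockdown. The model, the `x_production` parameter, and the numbers are assumptions for illustration and are not dynamo's own simulation tools.

```python
# Toy in-silico perturbation on a two-gene toggle switch (illustrative assumption).

def step(x, y, dt, x_production=1.0, a=4.0, n=2):
    """One Euler step; x_production < 1 mimics a knockdown of gene x."""
    dx = x_production * a / (1.0 + y ** n) - x
    dy = a / (1.0 + x ** n) - y
    return x + dt * dx, y + dt * dy

def simulate(x, y, x_production=1.0, dt=0.01, steps=5000):
    """Run the system forward from a starting state and return the end state."""
    for _ in range(steps):
        x, y = step(x, y, dt, x_production)
    return x, y

# Same starting state, with and without the hypothetical knockdown.
control = simulate(1.5, 1.0)
knockdown = simulate(1.5, 1.0, x_production=0.2)
```

In this toy, the unperturbed cell settles into the state where x is high, while the same cell with x knocked down ends up in the state where y is high. The manipulation has redirected its fate.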

“If we devise a set of equations that can describe how genes within a cell regulate each other, we can computationally describe how to transform terminally differentiated cells into stem cells, or predict how a cancer cell may respond to various combinations of drugs that would be impractical to test experimentally,” Xing says.

Dynamo’s computational modeling can be used to predict the most likely path that a cell will follow when reprogramming one cell type to another, as well as the path that a cell will take after specific genetic manipulations. 

Dynamo moves beyond merely descriptive and statistical analyses of single cell sequencing data to derive a predictive theory of cell fate transitions. The dynamo toolset can provide deep insights into how cells change over time, hopefully making cells’ trajectories as predictable for researchers as the arc of a ball, and therefore also as easy to change as switching up a pitch.

3 Questions: Kristin Knouse on the liver’s regenerative capabilities

The clinically-trained cell biologist exploits the liver’s unique capacities in search of new medical applications.

Grace van Deelen | Department of Biology
December 15, 2021

Why is the liver the only human organ that can regenerate? How does it know when it’s been injured? What can our understanding of the liver contribute to regenerative medicine? These are just some of the questions that new assistant professor of biology Kristin Knouse and her lab members are asking in their research at the Koch Institute for Integrative Cancer Research. Knouse sat down to discuss why the liver is so unique, what lessons we might learn from the organ, and what its regeneration might teach us about cancer.

Q: Your lab is interested in questions about how body tissues sense and respond to damage. What is it about the liver that makes it a good tool to model those questions?

A: I’ve always felt that we, as scientists, have so much to gain from treasuring nature’s exceptions, because those exceptions can shine light onto a completely unknown area of biology and provide building blocks to confer such novelty to other systems. When it comes to organ regeneration in mammals, the liver is that exception. It is the only solid organ that can completely regenerate itself. You can damage or remove over 75 percent of the liver and the organ will completely regenerate in a matter of weeks. The liver therefore contains the instructions for how to regenerate a solid organ; however, we have yet to access and interpret those instructions. If we could fully understand how the liver is able to regenerate itself, perhaps one day we could coax other solid organs to do the same.

There are some things we already know about liver regeneration, such as when it begins, what genes are expressed, and how long it takes. However, we still don’t understand why the liver can regenerate but other organs cannot. Why is it that these fully differentiated liver cells — cells that have already assumed specialized roles in the liver — can re-enter the cell cycle and regenerate the organ? We don’t have a molecular explanation for this. Our lab is working to answer this fundamental question of cell and organ biology and apply our discoveries to unlock new approaches for regenerative medicine. In this regard, I don’t necessarily consider myself exclusively a liver biologist, but rather someone who is leveraging the liver to address this much broader biological problem.

Q: As an MD/PhD student, you conducted your graduate research in the lab of the late Professor Angelika Amon here at MIT. How did your work in her lab lead to an interest in studying the liver’s regenerative capacities?

A: What was incredible about being in Angelika’s lab was that she had an interest in almost everything and gave me tremendous independence in what I pursued. I began my graduate research in her lab with an interest in cell division, and I was doing experiments to observe how cells from different mammalian tissues divide. I was isolating cells from different mouse tissues and then studying them in culture. In doing that, I found that when the cells were isolated and grown in a dish they could not segregate their chromosomes properly, suggesting that the tissue environment was essential for accurate cell division. In order to further study and compare these two different contexts — cells in a tissue versus cells in culture — I was keen to study a tissue in which I could observe a lot of cells undergoing cell division at the same time.

So I thought back to my time in medical school, and I remembered that the liver has the ability to completely regenerate itself. With a single surgery to remove part of the liver, I could stimulate millions of cells to divide. I therefore began exploiting liver regeneration as a means of studying chromosome segregation in tissue. But as I continued to perform surgeries on mice and watch the liver rapidly regenerate itself, I couldn’t help but become absolutely fascinated by this exceptional biological process. It was that fascination with this incredibly unique but poorly understood phenomenon — alongside the realization that there was a huge, unmet medical need in the area of regeneration — that convinced me to dedicate my career to studying this.

Q: What kinds of clinical applications might a better understanding of organ regeneration lead to, and what role do you see your lab playing in that research?

A: The most proximal medical application for our work is to confer regenerative capacity to organs that are currently non-regenerative. As we begin to achieve a molecular understanding of how and why the liver can regenerate, we put ourselves in a powerful position to identify and surmount the barriers to regeneration in non-regenerative tissues, such as the heart and nervous system. By answering these complementary questions, we bring ourselves closer to the possibility that, one day, if someone has a heart attack or a spinal cord injury, we could deliver a therapy that stimulates the tissue to regenerate itself. I realize that may sound like a moonshot now, but I don’t think any problem is insurmountable so long as it can be broken down into a series of tractable questions.

Beyond regenerative medicine, I believe our work studying liver regeneration also has implications for cancer. At first glance this may seem counterintuitive, as rapid regrowth is the exact opposite of what we want cancer cells to do. However, the reality is that the majority of cancer-related deaths are attributable not to the rapidly proliferating cells that constitute primary tumors, but rather to the cells that disperse from the primary tumor and lie dormant for years before manifesting as metastatic disease and creating another tumor. These dormant cells evade most of the cancer therapies designed to target rapidly proliferating cells. If you think about it, these dormant cells are not unlike the liver: they are quiet for months, maybe years, and then suddenly awaken. I hope that as we start to understand more about the liver, we might learn how to target these dormant cancer cells, prevent metastatic disease, and thereby offer lasting cancer cures.

How some tissues can “breathe” without oxygen
Eva Frederick | Whitehead Institute
December 2, 2021

Humans need oxygen molecules for a process called cellular respiration, which takes place in our cells’ mitochondria. Through a series of reactions called the electron transport chain, electrons are passed along in a sort of cellular relay race, allowing the cell to create ATP, the molecule that gives our cells energy to complete their vital functions.

At the end of this chain, two electrons remain, which are typically passed off to oxygen, the “terminal electron acceptor.” This completes the reaction and allows the process to continue with more electrons entering the electron transport chain.

In the past, however, scientists have noticed that cells are able to maintain some functions of the electron transport chain, even in the absence of oxygen. “This indicated that mitochondria could actually have partial function, even when oxygen is not the electron acceptor,” said Whitehead Institute postdoctoral researcher Jessica Spinelli. “We wanted to understand, how does this work? How are mitochondria capable of maintaining these electron inputs when oxygen is not the terminal electron acceptor?”

In a paper published December 2 in the journal Science, Whitehead Institute scientists and collaborators led by Spinelli have found the answer to these questions. Their research shows that when cells are deprived of oxygen, another molecule called fumarate can step in and serve as a terminal electron acceptor to enable mitochondrial function in this environment. The research, which was completed in the laboratory of former Whitehead Member David Sabatini, answers a long-standing mystery in the field of cellular metabolism, and could potentially inform research into diseases that cause low oxygen levels in tissues, including ischemia, diabetes and cancer.

A new runner in the cellular relay

The researchers began their investigation into how cells can maintain mitochondrial function without oxygen by using mass spectrometry to measure the quantities of molecules called metabolites that are produced through cellular respiration in both normal and low-oxygen conditions. When cells were deprived of oxygen, researchers noticed a high level of a molecule called succinate.

When you add electrons to oxygen at the end of the electron transport chain, it picks up two protons and becomes water. When you add electrons to fumarate, it becomes succinate. “This led us to think that maybe this accumulation of succinate that’s occurring could actually be caused by fumarate being used as an electron acceptor, and that this reaction could explain the maintenance of mitochondrial functions in hypoxia,” Spinelli said.

Usually, the fumarate-succinate reaction runs the other direction in cells — a protein complex called the SDH complex takes away electrons from succinate, leaving fumarate. For the opposite to happen, the SDH complex would need to be running in reverse. “Although the SDH complex is known to catalyze some fumarate reduction, it was thought that it was thermodynamically impossible for this SDH complex to undergo a net reversal,” Spinelli said. “Fumarate is used as an electron acceptor in lower eukaryotes, but they use a totally different enzyme and electron carrier, and mammals are not known to possess either of these.”

Through a series of assays, however, the researchers were able to ascertain that this complex was indeed running in reverse in cultured cells, largely due to accumulation of a molecule called ubiquinol, which the researchers observed to build up under low-oxygen conditions.
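For readers who want the chemistry spelled out, the reactions discussed above can be summarized as follows. The stoichiometry is standard textbook biochemistry and is not taken from the paper itself; Q and QH2 are used here as shorthand for ubiquinone and ubiquinol.

```latex
\begin{align*}
\text{Oxygen as acceptor:}\quad
  & \tfrac{1}{2}\,\mathrm{O_2} + 2\,\mathrm{H^+} + 2\,e^- \rightarrow \mathrm{H_2O} \\
\text{Fumarate as acceptor:}\quad
  & \text{fumarate} + 2\,\mathrm{H^+} + 2\,e^- \rightarrow \text{succinate} \\
\text{SDH complex, usual direction:}\quad
  & \text{succinate} + \mathrm{Q} \rightarrow \text{fumarate} + \mathrm{QH_2} \\
\text{SDH complex, net reversal:}\quad
  & \text{fumarate} + \mathrm{QH_2} \rightarrow \text{succinate} + \mathrm{Q}
\end{align*}
```

Written this way, ubiquinol (QH2) appears as a reactant in the reverse direction, which is consistent with the observation that its buildup under low oxygen pushes the complex to run backwards.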

With their hypothesis confirmed, “We wanted to get back to our initial question and ask, does that net reversal of the SDH complex maintain mitochondrial functions which are happening when cells are exposed to hypoxia?” said Spinelli. “So, we knocked out SDH complex and then we demonstrated through a number of means that loss of both oxygen and fumarate as electron acceptors was sufficient to [bring the electron transport chain to a halt].”

All this work was done in cultured cells, so the next step for Spinelli and collaborators was to study whether fumarate could serve as a terminal electron acceptor in mouse models.

When they tried this, the team uncovered something interesting: some, but not all, of the mice’s tissues were able to successfully reverse the activity of the SDH complex and perform mitochondrial functions using fumarate as a terminal electron acceptor.

“What was really cool to see is that there were three tissues  — the kidney, the liver, and the brain — which on a bulk tissue scale, are operating the SDH complex in a backwards direction, even at physiological oxygen levels,” said Spinelli. In other words, even in normal conditions, these tissues were reducing both fumarate and oxygen to maintain their functions, and when deprived of oxygen, fumarate could take over as a terminal electron acceptor.

In contrast, tissues such as the heart and the skeletal muscle are able to perform minimal fumarate reduction without reversing the SDH complex, but not to the extent that they could effectively retain mitochondrial function when deprived of oxygen.

“We think there’s a lot of exciting work downstream of this to figure out how exactly this process is regulated differently in different tissues — and understanding in disease settings whether the SDH complex is operating in the net reverse direction,” Spinelli said.

In particular, Spinelli is interested in studying the behavior of the SDH complex in cancer cells.

“Certain regions of solid tumors have very low levels of oxygen, and certain regions have high levels of oxygen,” Spinelli said. “It’s interesting to think about how those cells are surviving in that microenvironment: are they using fumarate as an electron acceptor to enable cell survival?”