Lab-grown fat cells help scientists understand type 2 diabetes
Eva Frederick | Whitehead Institute
June 16, 2022

In research published June 17 in the journal Science Advances, researchers in the lab of Whitehead Institute Founding Member Rudolf Jaenisch present a way to create fat cells that can be modified to display different levels of insulin sensitivity.

The cells accurately model healthy insulin metabolism, as well as insulin resistance, one of the key hallmarks of type 2 diabetes. “This system, I think, will be really useful for studying the mechanisms of this disease,” said Jaenisch, who is also a professor of biology at the Massachusetts Institute of Technology (MIT).

“It’s really exciting,” said Max Friesen, a postdoctoral researcher in Jaenisch’s lab and a first author of the study. “This is the first time that you can actually use a human stem cell-derived [fat cell] to show a real insulin response.”

Body fat — also known as adipose tissue — is essential for regulating your body’s metabolism and plays an important role in the storage and release of energy. When fat cells called adipocytes encounter the hormone insulin, they suck up sugar from the blood and store it for future use.

But over many years, factors such as genetics, stress, certain diets, or polluted air or water can cause this process to go awry, leading to type 2 diabetes. In this disease, adipocytes, as well as cells in the muscles and liver, become resistant to insulin and therefore unable to regulate the levels of sugar in the blood.

Tools to model diabetes in the lab generally rely on mice or on cells in a petri dish or test tube. Both these systems have their own problems. Mice, although they are comparable with humans in some respects, have a completely different metabolism and do not experience human diabetes comorbidities like heart attacks. And cell culture has, in the past, failed to replicate key markers of diabetes in a way that is comparable to human tissues.

That’s why Friesen and Andrew Khalil, another postdoc in Jaenisch’s lab, set out to create a new model. The researchers started with human pluripotent stem cells. These cells are the shapeshifters of the body — given the right conditions, they can assume the specific characteristics of almost any human cell type. The Jaenisch Lab has used them in the past to replicate liver cells, brain cells, and even cancerous tumors.

They decided to try to optimize an existing method for differentiating pluripotent cells into fat cells. The protocol created cells that looked like adipocytes, but these cells did not recreate the conditions of healthy insulin signaling or insulin resistance seen in the human body in type 2 diabetes. When healthy adipocytes encounter insulin in the human body, they respond by taking up glucose out of the bloodstream. These lab-made fat cells weren’t doing that, unless the researchers cranked up insulin levels to a thousand times higher than levels ever seen in humans. “Taking up glucose [in response to normal levels of insulin] is really the main function of an adipocyte, so if the model fails to do that, anything downstream in terms of disease research is not going to work either,” Friesen said.

Friesen and Khalil wondered if the lab-grown adipocytes’ low sensitivity to insulin could be a product of the conditions in which they grew. “We thought that maybe this happens because we’re feeding them an artificial culture medium, with all kinds of extra supplements that might be inhibiting their metabolic response,” Friesen said.

Friesen and Khalil decided to use a method called the Design of Experiments approach, which allows researchers to tease out the contributions of different factors to a specific outcome. Informed by this approach, they created nearly 30 different media compositions, each with slightly different levels of key ingredients such as glucose, insulin, the growth factor IGF-1, and albumin, a protein found in blood serum.
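As a rough illustration of how a Design of Experiments layout can cover several ingredients with a limited number of media, the sketch below builds a three-level fractional factorial design. It is only a sketch under stated assumptions: the concentration values are hypothetical (the article names the ingredients but not the levels tested), and the article does not say which design the researchers actually used.

```python
from itertools import product

# Hypothetical low/mid/high levels for each ingredient named in the article.
# These numbers are placeholders for illustration, not values from the study.
FACTORS = {
    "glucose_mM": (5.0, 10.0, 25.0),
    "insulin_nM": (0.1, 10.0, 1000.0),
    "igf1_ng_per_mL": (0.0, 10.0, 100.0),
    "albumin_percent": (0.0, 0.5, 2.0),
}


def fractional_factorial_3_level(factors):
    """Build a 3^(k-1) fractional factorial design.

    The first k-1 factors take every combination of their three levels;
    the level index of the last factor is the sum of the other indices
    modulo 3, so each of its levels still appears equally often.
    """
    names = list(factors)
    free, last_name = names[:-1], names[-1]
    runs = []
    for idx in product(range(3), repeat=len(free)):
        run = {name: factors[name][i] for name, i in zip(free, idx)}
        run[last_name] = factors[last_name][sum(idx) % 3]
        runs.append(run)
    return runs


media = fractional_factorial_3_level(FACTORS)
print(len(media), "media compositions instead of", 3 ** len(FACTORS))
```

A full grid over four ingredients at three levels would need 81 media; the fractional design tests a balanced third of them, which is the kind of saving that makes this approach practical at the bench.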

The medium that worked best had concentrations of insulin and glucose that were similar to the levels in the human body. When grown in this new medium, the cells responded to much lower concentrations of insulin, just like cells in the body. “So this is our healthy adipocyte,” Friesen said. “Next we wanted to see if we could make a disease model out of this — to make it an insulin-resistant adipocyte like you would see in the progression to type 2 diabetes.”

To desensitize the cells, they flooded the media with insulin for a short period of time. This caused the cells to become less sensitive to the hormone, and respond similarly to diabetic or pre-diabetic fat cells in a living person.

The researchers could then study how the cells responded to the change — such as what genes the insulin resistant cells expressed that healthy cells did not — in order to tease out the underlying genetics of insulin resistance. “We saw small changes in a lot of genes that are metabolism regulated, so that seems to be pointing to a deficiency of the metabolism or mitochondria of the insulin-resistant cells,” Friesen said. “That’s one thing we want to pursue in the future — figure out what is wrong with their metabolism, and then hopefully how to fix it.”

Now that they have created this new model for studying insulin resistance in fat cells, the researchers hope to develop similar procedures for other cells affected in diabetes. “It seems that with some modifications, we can apply this method to other tissues as well,” Friesen said. “In the future, this will hopefully lead to a unified system for all stem cell-derived tissues, including liver, skeletal muscle, and other cell types, to get a really robust insulin response.”

Ankur Jain Named as Pew Scholar in Biomedical Sciences
Merrill Meadow | Whitehead Institute
June 13, 2022

The Pew Charitable Trusts has selected Whitehead Institute Member Ankur Jain to be a 2022 Pew Scholar in the Biomedical Sciences. The Pew program provides funding to young investigators of outstanding promise who work in areas of science relevant to the advancement of human health.

Jain, who joined the Whitehead Institute faculty in 2019, is one of 22 scientists selected to receive this year’s honor, chosen from among 197 nominations submitted by leading U.S. academic and research institutions. “I am grateful to the Pew Trusts for funding our work, and thrilled to be a part of the Pew community,” says Jain, who is also an assistant professor of biology and the Thomas D. and Virginia W. Cabot Career Development Professor at the Massachusetts Institute of Technology.

The Pew award will provide research support for the next four years, enabling him to study the role of evolutionarily ancient metabolites called polyamines, which are essential for cell growth and survival.

“Polyamine concentrations within cells are carefully regulated, and disruptions in polyamine production are known to be associated with conditions ranging from cancer and aging to neurological disorders such as Parkinson’s disease,” Jain explains. “But, despite being studied for more than a century, the specific role polyamines play in both healthy and diseased cells remains obscure. This is due, in part, to a lack of technologies effective in probing polyamines.”

Jain’s lab will harness the cell’s own polyamine detection machinery to build new tools to inspect polyamines. Those tools will allow his team to measure and track polyamines in individual cells, study how cells maintain their polyamine content, and explore how changing polyamine levels affect cellular functions. “Ultimately, this work could provide the basis for novel strategies for treating cancer or promoting healthy aging,” Jain observes.

Previously, Jain received a 2017 NIH Pathway to Independence Award and was named a 2019 Packard Fellow for Science and Engineering. He is the third current Whitehead Member to be named a Pew Scholar, following in the steps of Mary Gehring (2010) and Jing-Ke Weng (2014). Former Whitehead Fellow Fernando Camargo, now professor of stem cell and regenerative biology at Harvard University, also became a Pew Scholar in 2010.

Launched in 1985, the Pew Scholars in the Biomedical Sciences program supports top U.S. scientists at the assistant professor level and has, since inception, provided nearly 1,000 young investigators with funding for research projects that, though seemingly risky, have the potential to benefit human health. Pew Scholars are selected by a national advisory committee of eminent scientists, who evaluate candidates on the basis of proven creativity.


New CRISPR-based map ties every human gene to its function

Jonathan Weissman and collaborators used their single-cell sequencing tool Perturb-seq on every expressed gene in the human genome, linking each to its job in the cell.

Eva Frederick | Whitehead Institute
June 9, 2022

The Human Genome Project was an ambitious initiative to sequence every piece of human DNA. The project drew together collaborators from research institutions around the world, including MIT’s Whitehead Institute for Biomedical Research, and was finally completed in 2003. Now, nearly two decades later, MIT Professor Jonathan Weissman and colleagues have gone beyond the sequence to present the first comprehensive functional map of genes that are expressed in human cells. The data from this project, published online June 9 in Cell, ties each gene to its job in the cell, and is the culmination of years of collaboration on the single-cell sequencing method Perturb-seq.

The data are available for other scientists to use. “It’s a big resource in the way the human genome is a big resource, in that you can go in and do discovery-based research,” says Weissman, who is also a member of the Whitehead Institute and an investigator with the Howard Hughes Medical Institute. “Rather than defining ahead of time what biology you’re going to be looking at, you have this map of the genotype-phenotype relationships and you can go in and screen the database without having to do any experiments.”

The screen allowed the researchers to delve into diverse biological questions. They used it to explore the cellular effects of genes with unknown functions, to investigate the response of mitochondria to stress, and to screen for genes that cause chromosomes to be lost or gained, a phenotype that has proved difficult to study in the past. “I think this dataset is going to enable all sorts of analyses that we haven’t even thought up yet by people who come from other parts of biology, and suddenly they just have this available to draw on,” says former Weissman Lab postdoc Tom Norman, a co-senior author of the paper.

Pioneering Perturb-seq

The project takes advantage of the Perturb-seq approach, which makes it possible to follow the impact of turning genes on or off with unprecedented depth. This method was first published in 2016 by a group of researchers including Weissman and fellow MIT professor Aviv Regev, but could only be used on small sets of genes and at great expense.

The massive Perturb-seq map was made possible by foundational work from Joseph Replogle, an MD-PhD student in Weissman’s lab and co-first author of the present paper. Replogle, in collaboration with Norman, who now leads a lab at Memorial Sloan Kettering Cancer Center; Britt Adamson, an assistant professor in the Department of Molecular Biology at Princeton University; and a group at 10x Genomics, set out to create a new version of Perturb-seq that could be scaled up. The researchers published a proof-of-concept paper in Nature Biotechnology in 2020.

The Perturb-seq method uses CRISPR-Cas9 genome editing to introduce genetic changes into cells, and then uses single-cell RNA sequencing to capture information about the RNAs that are expressed resulting from a given genetic change. Because RNAs control all aspects of how cells behave, this method can help decode the many cellular effects of genetic changes.

Since their initial proof-of-concept paper, Weissman, Regev, and others have used this sequencing method on smaller scales. For example, the researchers used Perturb-seq in 2021 to explore how human and viral genes interact over the course of an infection with HCMV, a common herpesvirus.

In the new study, Replogle and collaborators including Reuben Saunders, a graduate student in Weissman’s lab and co-first author of the paper, scaled up the method to the entire genome. Using human blood cancer cell lines as well as noncancerous cells derived from the retina, he performed Perturb-seq across more than 2.5 million cells, and used the data to build a comprehensive map tying genotypes to phenotypes.

Delving into the data

Upon completing the screen, the researchers decided to put their new dataset to use and examine a few biological questions. “The advantage of Perturb-seq is it lets you get a big dataset in an unbiased way,” says Tom Norman. “No one knows entirely what the limits are of what you can get out of that kind of dataset. Now, the question is, what do you actually do with it?”

The first, most obvious application was to look into genes with unknown functions. Because the screen also read out phenotypes of many known genes, the researchers could use the data to compare unknown genes to known ones and look for similar transcriptional outcomes, which could suggest the gene products worked together as part of a larger complex.
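The comparison described above can be sketched as a similarity search over perturbation signatures. In this minimal example, each signature is a list of expression changes for a handful of readout genes after one gene is knocked down, and an uncharacterized gene is ranked against known ones by Pearson correlation. All gene labels and numbers are made up for illustration, and the use of Pearson correlation is an assumption; the article does not specify the similarity measure the researchers used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical signatures: expression change of five readout genes
# after each listed gene is perturbed (illustrative values only).
signatures = {
    "known_complex_member_1": [2.0, -1.0, 0.5, 0.0, -2.0],
    "known_complex_member_2": [1.8, -0.9, 0.4, 0.1, -2.1],
    "unrelated_gene": [-0.5, 1.5, -1.0, 2.0, 0.3],
}
uncharacterized = [1.9, -1.1, 0.6, -0.1, -1.8]

# Rank known genes from most to least similar to the uncharacterized one.
ranked = sorted(
    signatures,
    key=lambda gene: pearson(signatures[gene], uncharacterized),
    reverse=True,
)
for gene in ranked:
    print(gene, round(pearson(signatures[gene], uncharacterized), 3))
```

Genes that land at the top of such a ranking are candidates for working in the same pathway or complex as the uncharacterized gene, which is a hypothesis to test rather than a conclusion.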

The mutation of one gene called C7orf26 in particular stood out. Researchers noticed that genes whose removal led to a similar phenotype were part of a protein complex called Integrator that played a role in creating small nuclear RNAs. The Integrator complex is made up of many smaller subunits — previous studies had suggested 14 individual proteins — and the researchers were able to confirm that C7orf26 made up a 15th component of the complex.

They also discovered that the 15 subunits worked together in smaller modules to perform specific functions within the Integrator complex. “Absent this thousand-foot-high view of the situation, it was not so clear that these different modules were so functionally distinct,” says Saunders.

Another perk of Perturb-seq is that because the assay focuses on single cells, the researchers could use the data to look at more complex phenotypes that become muddied when they are studied together with data from other cells. “We often take all the cells where ‘gene X’ is knocked down and average them together to look at how they changed,” Weissman says. “But sometimes when you knock down a gene, different cells that are losing that same gene behave differently, and that behavior may be missed by the average.”
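Weissman’s point about averages can be shown with a toy calculation. The two lists below hold made-up expression changes for one readout gene in eight cells that all lost the same hypothetical “gene X”. Both populations have the same mean, but only the per-cell spread reveals that the second population has split into two groups.

```python
from statistics import mean, pstdev

# Illustrative values only: every cell responds about the same way.
uniform_response = [1.0, 1.1, 0.9, 1.0, 1.0, 0.9, 1.1, 1.0]

# Illustrative values only: half the cells respond strongly upward,
# the other half respond in the opposite direction.
split_response = [3.0, 3.1, 2.9, 3.0, -1.0, -1.1, -0.9, -1.0]

for label, cells in (("uniform", uniform_response), ("split", split_response)):
    # The mean mimics a bulk measurement; the standard deviation is
    # only available when each cell is measured individually.
    print(label, "mean:", round(mean(cells), 2), "spread:", round(pstdev(cells), 2))
```

A bulk measurement would report the same average for both populations, so the divergent behavior in the second one would go unnoticed.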

The researchers found that a subset of genes whose removal led to different outcomes from cell to cell were responsible for chromosome segregation. Their removal was causing cells to lose a chromosome or pick up an extra one, a condition known as aneuploidy. “You couldn’t predict what the transcriptional response to losing this gene was because it depended on the secondary effect of what chromosome you gained or lost,” Weissman says. “We realized we could then turn this around and create this composite phenotype looking for signatures of chromosomes being gained and lost. In this way, we’ve done the first genome-wide screen for factors that are required for the correct segregation of DNA.”

“I think the aneuploidy study is the most interesting application of this data so far,” Norman says. “It captures a phenotype that you can only get using a single-cell readout. You can’t go after it any other way.”

The researchers also used their dataset to study how mitochondria responded to stress. Mitochondria, which evolved from free-living bacteria, carry 13 genes in their genomes. Within the nuclear DNA, around 1,000 genes are somehow related to mitochondrial function. “People have been interested for a long time in how nuclear and mitochondrial DNA are coordinated and regulated in different cellular conditions, especially when a cell is stressed,” Replogle says.

The researchers found that when they perturbed different mitochondria-related genes, the nuclear genome responded similarly to many different genetic changes. However, the mitochondrial genome responses were much more variable.

“There’s still an open question of why mitochondria still have their own DNA,” said Replogle. “A big-picture takeaway from our work is that one benefit of having a separate mitochondrial genome might be having localized or very specific genetic regulation in response to different stressors.”

“If you have one mitochondria that’s broken, and another one that is broken in a different way, those mitochondria could be responding differentially,” Weissman says.

In the future, the researchers hope to use Perturb-seq on different types of cells besides the cancer cell line they started in. They also hope to continue to explore their map of gene functions, and hope others will do the same. “This really is the culmination of many years of work by the authors and other collaborators, and I’m really pleased to see it continue to succeed and expand,” says Norman.

Tracing a cancer’s family tree to its roots reveals how tumors grow

Family trees of lung cancer cells reveal how cancer evolves from its earliest stages to an aggressive form capable of spreading throughout the body.

Greta Friar | Whitehead Institute
May 5, 2022

Over time, cancer cells can evolve to become resistant to treatment, more aggressive, and metastatic — capable of spreading to additional sites in the body and forming new tumors. The more of these traits that a cancer evolves, the more deadly it becomes. Researchers want to understand how cancers evolve these traits in order to prevent and treat deadly cancers, but by the time cancer is discovered in a patient, it has typically existed for years or even decades. The key evolutionary moments have come and gone unobserved.

MIT Professor Jonathan Weissman and collaborators have developed an approach to track cancer cells through the generations, allowing researchers to follow their evolutionary history. This lineage-tracing approach uses CRISPR technology to embed each cell with an inheritable and evolvable DNA barcode. Each time a cell divides, its barcode gets slightly modified. When the researchers eventually harvest the descendants of the original cells, they can compare the cells’ barcodes to reconstruct a family tree of every individual cell, just like an evolutionary tree of related species. Then researchers can use the cells’ relationships to reconstruct how and when the cells evolved important traits. Researchers have used similar approaches to follow the evolution of the virus that causes Covid-19, in order to track the origins of variants of concern.
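The barcode idea can be illustrated with a toy simulation. In this sketch, every division gives each daughter cell its parent’s edits plus one new, unique edit, so the number of edits two cells share reflects how recently they had a common ancestor. This is a deliberate simplification: the function names are hypothetical, real barcode edits occur at random and can coincide, and the researchers reconstruct trees with a dedicated computational approach that this example does not reproduce.

```python
from itertools import count


def grow(generations):
    """Simulate cell divisions and return {cell_name: set of edit ids}.

    Each daughter inherits every edit carried by its parent and gains
    one new edit of its own, so edits are heritable and cumulative.
    """
    new_edit = count(1)
    cells = {"root": frozenset()}
    for _ in range(generations):
        daughters = {}
        for name, barcode in cells.items():
            for side in "LR":
                daughters[name + "/" + side] = barcode | {next(new_edit)}
        cells = daughters
    return cells


def shared_edits(barcode_a, barcode_b):
    """Count edits two barcodes have in common (inherited before they split)."""
    return len(barcode_a & barcode_b)


def closest_relative(name, cells):
    """Return the other cell sharing the most edits with the named cell."""
    others = (other for other in cells if other != name)
    return max(others, key=lambda other: shared_edits(cells[name], cells[other]))


cells = grow(3)  # three rounds of division give eight cells
print(closest_relative("root/L/L/L", cells))
```

Here the cell names encode the true family tree only so the result can be checked; the reconstruction itself uses nothing but the barcodes, as it must for harvested cells whose history was never observed.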

Weissman and collaborators have used their lineage-tracing approach before to study how metastatic cancer spreads throughout the body. In their latest work, Weissman; Tyler Jacks, the Daniel K. Ludwig Scholar and David H. Koch Professor of Biology at MIT; and computer scientist Nir Yosef, associate professor at the University of California at Berkeley and the Weizmann Institute of Science, record their most comprehensive cancer cell history to date. The research, published today in Cell, tracks lung cancer cells from the very first activation of cancer-causing mutations. This detailed tumor history reveals new insights into how lung cancer progresses and metastasizes, demonstrating the wealth of understanding that lineage tracing can provide.

“This is a new way of looking at cancer evolution with much higher resolution,” says Weissman, who is a professor of biology at MIT, a member of the Whitehead Institute for Biomedical Research, and an investigator with the Howard Hughes Medical Institute. “Previously, the critical events that cause a tumor to become life-threatening have been opaque because they are lost in a tumor’s distant past, but this gives us a window into that history.”

In order to track cancer from its very beginning, the researchers developed an approach to simultaneously trigger cancer-causing mutations in cells and start recording the cells’ history. They engineered mice such that when their lung cells were exposed to a tailor-made virus, that exposure activated a cancer-causing mutation in the Kras gene and deactivated the tumor-suppressing gene Trp53 in the cells, as well as activating the lineage-tracing technology. The mouse model, developed in Jacks’ lab, was also engineered so that lung cancer would develop in it very similarly to how it would in humans.

“In this model, cancer cells develop from normal cells and tumor progression occurs over an extended time in its native environment. This closely replicates what occurs in patients,” Jacks says. Indeed, the researchers’ findings closely align with data about disease progression in lung cancer patients.

The researchers let the cancer cells evolve for several months before harvesting them. They then used a computational approach developed in their previous work to reconstruct the cells’ family trees from their modified DNA barcodes. They also measured gene expression in the cells using RNA sequencing to characterize each individual cell’s state. With this information, they began to piece together how this type of lung cancer becomes aggressive and metastatic.

“Revealing the relationships between cells in a tumor is key to making sense of their gene expression profiles and gaining insight into the emergence of aggressive states,” says Yosef, who is a co-corresponding author on both the current work and the previous lineage tracing paper.

The results showed significant diversity between subpopulations of cells within the same tumor. In this model, cancer cells evolved primarily through inheritable changes to their gene expression, rather than through genetic mutations. Certain subpopulations had evolved to become more fit — better at growth and survival — and more aggressive, and over time they dominated the tumor. Genes that the researchers identified as commonly expressed in the fittest cells could be good candidates for possible therapeutic targets in future research. The researchers also discovered that metastases originated only from these groups of dominant cells, and only late in their evolution. This is different from what has been proposed for some other cancers, in which cells may gain the ability to metastasize early in their evolution. This insight could be important for cancer treatment; metastasis is often when cancers become deadly, and if researchers know which types of cancer develop the ability to metastasize in this stepwise manner, they can design interventions to stop the progression.

“In order to develop better therapies, it’s important to understand the fundamental principles that tumors adopt to develop,” says co-first author Dian Yang, a Damon Runyon Postdoctoral Fellow in Weissman’s lab. “In the future, we want to be able to look at the state of the cancer cells when a patient comes in, and be able to predict how that cancer’s going to evolve, what the risks are, and what is the best treatment to stop that evolution.”

The researchers also figured out important details of the evolutionary paths that cancer subpopulations take to become fit and aggressive. Cells evolve through different states, defined by key characteristics that the cell has at that point in time. In this cancer model the researchers found that early on, cells in a tumor quickly diversified, switching between many different states. However, once a subpopulation landed in a particularly fit and aggressive state, it stayed there, dominating the tumor from that stable state. Furthermore, the ultimately dominant cells seemed to follow one of two distinct paths through different cell states. Either of those paths could then lead to further progression that enabled cancers to enter aggressive “mesenchymal” cell states, which are linked to metastasis.

After the researchers thoroughly mapped the cancer cells’ evolutionary paths, they wondered how those paths would be affected if the cells experienced additional cancer-linked mutations, so they deactivated one of two additional tumor suppressors. One of these affected which state cells stabilized in, while the other led cells to follow a completely new evolutionary pathway to fitness.

The researchers hope that others will use their approach to study all kinds of questions about cancer evolution, and they already have a number of questions in mind for themselves. One goal is to study the evolution of therapeutic resistance, by seeing how cancers evolve in response to different treatments. Another is to study how cancer cells’ local environments shape their evolution.

“The strength of this approach is that it lets us study the evolution of cancers with fine-grained detail,” says co-first author Matthew Jones, a graduate student in the Weissman and Yosef labs. “Every time there is a shift from bulk to single-cell analysis in a technology or approach, it dramatically widens the scope of the biological insights we can attain, and I think we are seeing something like that here.”

Helping drugs play nice in the human body
Whitehead Institute
March 25, 2022

For the hundreds of thousands of people diagnosed with breast cancer each year, surgery to remove the cancerous tissue is often the best option — but this relatively simple procedure comes with some drawbacks. In more than a few cases, the surgical removal of a tumor can lead to an increased risk of the cancer reemerging in other locations in the body.

In a 2018 study, a postdoc in the lab of Whitehead Institute Member Bob Weinberg discovered that, at least in mice, this phenomenon was due to a bodily butterfly effect: the creation of a wound site in one place in the body, which necessitated subsequent wound healing, caused immune system changes affecting distant parts of the body.

These changes occurred as bone marrow cells responded to the wounding with a flood of inflammatory cells that entered into the wound site and, at the same time, scattered throughout the body. These dispersed inflammatory cells weakened the ability of the immune system to control the outgrowth of a distantly located metastatic tumor. Without this immune control, which otherwise could keep the metastasis at a very small size, the metastasis would grow out aggressively.

Hence, wounding in one part of the body provoked metastasis outgrowth at a distant site. This suggested, among other things, that the outgrowth of metastatic tumors, which is often seen in women who have recently undergone a mastectomy, might be actively provoked by the post-surgical wound-healing process.

Weinberg’s work also presented a way to potentially avoid this effect, using a preventative measure that’s probably sitting in your bathroom cabinet right now: the cheap and common class of drugs known as NSAIDs, which includes ibuprofen and aspirin. When mice were given NSAIDs before and after tumor removal surgeries, they experienced a fivefold lower rate of cancer recurrence at the site of metastasis than a control group given opioids. These NSAIDs could therefore be used in place of the opioids, which are often used to treat post-surgical pain.

The human body is full of undiscovered connections like this one, and adding in foreign substances further complicates matters. While a treatment might work well in a Petri dish, researchers describe whole-body metabolism as “a whole different kettle of fish.”

The way drugs move through the body and interact with internal systems is called pharmacokinetics. When a person is given a medicine — either orally, through a chemotherapy method, or via injection — that drug must be able to find its way to its target in a high enough concentration to have an effect, and then when its purpose is served, it must be able to leave the body safely and not build up to a harmful amount.
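The two requirements in the paragraph above, reaching a high enough concentration and then clearing, can be illustrated with the simplest textbook pharmacokinetic model: a single compartment with first-order elimination, where concentration falls by half every half-life. This model and all numbers below are assumptions chosen for illustration; the article does not describe any specific model, and real drugs often need more elaborate ones.

```python
from math import exp, log


def concentration(c0, half_life_h, t_h):
    """Concentration after t_h hours, assuming first-order elimination.

    c0 is the starting concentration; half_life_h is the elimination
    half-life in hours. Uses C(t) = c0 * exp(-k * t) with k = ln(2) / half-life.
    """
    k = log(2) / half_life_h
    return c0 * exp(-k * t_h)


def hours_above(c0, half_life_h, threshold):
    """Hours until the concentration decays to the threshold.

    Returns 0.0 if the starting concentration never exceeds the threshold.
    """
    if c0 <= threshold:
        return 0.0
    return half_life_h * log(c0 / threshold) / log(2)


# Hypothetical drug: starts at 8 units, half-life of 2 hours,
# and needs at least 1 unit at its target to have an effect.
print(round(concentration(8.0, 2.0, 4.0), 3))  # level after two half-lives
print(round(hours_above(8.0, 2.0, 1.0), 3))    # duration of effective exposure
```

In this simplified picture, a longer half-life extends the effective window but also slows clearance, which is the trade-off between efficacy and harmful buildup that the paragraph describes.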

Much like Weinberg’s work on NSAIDs in breast cancer, Whitehead Institute’s basic research has led to other surprising discoveries about drug activities in the human body. Read on to learn about research that is changing the way new drugs are designed, making existing treatments less toxic, and more.

Concentration is key

When it comes to the action of drugs in the human body, concentration is key. Just ask Rick Young, a Whitehead Institute Member and professor of biology at MIT. In 2018, Young’s lab, which had previously studied the regulatory circuitry involved in transcription (the copying of DNA into RNA), shifted its focus after discovering tiny droplets within cells that concentrate the molecular materials needed to transcribe the DNA.

The droplets, called transcriptional condensates, were the newest in a slew of recent discoveries of other such groupings of cellular components. Some of these aggregations facilitate RNA splicing while others help to form ribosomes.

For Young, the discovery of transcription-related condensates sparked an interest in how these droplets were affecting the action of drugs. Previous theories held that transcription was able to take place in cells because there was a sufficient concentration of necessary proteins, such as RNA polymerase and other accessory proteins. As the Young lab showed, these collaborating cellular players were actually being concentrated in the condensates.

In 2020, Young and Ann Boija and Isaac Klein, two postdocs in his lab, took their investigation a step further, analyzing the mechanism by which several cancer drugs are concentrated in cellular condensates, and how that concentration could affect their action in individual cells and thus in the body. They found that cancer drugs sort themselves into specific types of condensates, independently of their targets, which can allow them to build up into high concentrations in these localized areas within cells.

“This could have enormous implications for the way we discover and develop drugs,” said Rick Young. “If drugs had properties that had them partitioning into a condensate where their target lives, then they would enjoy two properties of condensates: they would be compartmentalized, and they would be at much higher concentrations than if they diffuse through the cell.”

Young’s work on condensates led him to co-create a pharmaceutical company called Dewpoint Therapeutics, with the goal of reformulating treatments for cancer or neurological conditions such as amyotrophic lateral sclerosis by targeting biomolecular condensates. Whitehead Institute Founding Member Rudolf Jaenisch serves as a scientific advisor.

Trouble in parasites

While researchers in Young’s lab investigate how drugs could be more efficiently targeted, Sebastian Lourido’s lab is taking a different tack — why do some drugs stop working as time progresses?

The malaria drug artemisinin was developed in China in the 1970s, and completely changed the way the world treated malaria. In the following decades, however, the parasites that cause malaria, several species within the genus Plasmodium, have slowly grown less susceptible to the drug.

In a paper published in September of 2020, Whitehead Institute Member Lourido and collaborators identified two parasite genes that were negatively impacting the actions of the drug in the parasite’s cells.

Researchers liken artemisinin to a “ticking time bomb,” which needs another molecule, called heme, to light its fuse. Heme, a small molecule that is one component of hemoglobin, helps transport electrons and deliver oxygen to tissues. When heme encounters artemisinin, it activates the drug, allowing the creation of small, toxic chemical radicals. These proceed to react with the parasite’s proteins, fats, and metabolites, eventually leading to its death.

In order to understand how some parasites were becoming less vulnerable to the drug, Lourido, along with researchers Clare Harding, Boryana Petrova and Saima Sidik, ran a genetic screen on a related parasite, Toxoplasma gondii. The screen allowed them to assess which mutations in the parasites’ genomes were beneficial for their survival and which ones were harmful.

The screen revealed two genes that affected how susceptible the parasites were to treatment with artemisinin. One, called Tmem14c, seemed to be protecting the parasites. The gene is analogous to a gene that transports heme out of mitochondria, where it is generated. Lourido hypothesized that when the Tmem14c protein is working properly, it helps the cells shuttle heme and its building blocks and get them where they need to go in the cell. When this gene is knocked out or mutated, heme can build up in the parasite cells, making them more likely to activate the artemisinin “bomb.”

Another gene, when mutated, made the parasites less sensitive to artemisinin. The gene, called DegP2, encodes a protein that plays a role in heme metabolism, so when it was mutated, less heme was available in the cells to activate the drug.

This knowledge provides useful insights for treatment methods, said Lourido. For example, healthcare providers should take into consideration the fact that heme is key in artemisinin’s action, and avoid combining the drug with other treatments that might lower the amount of heme in parasite cells. “Understanding how different pathways within the cell participate to render parasites susceptible to these antiparasitic drugs helps us better pair them with other compounds that are going to be synergistic and not work against our own goal of defeating parasites,” Lourido said.

Taking the edge off toxic treatments 

Another application of fundamental pharmacokinetics research involves mitigating the harmful effects of drugs. Consider the chemotherapy drug methotrexate. Methotrexate was the very first targeted drug ever made. Developed more than 60 years ago by Dr. Sidney Farber, the drug acts by inhibiting a key molecule in the metabolic process that builds DNA and RNA, thereby interfering with basic functions of the cell and with DNA synthesis, repair and replication, helping to destroy cancerous cells in the body.

Methotrexate is still a widely used component of chemotherapy cocktails, especially for pediatric leukemia. In the human body, though, methotrexate is like a bull in a china shop. It is very effective at knocking back cancer, but the drug’s life-threatening side effects, including gut, liver, kidney and brain damage, often lead doctors to terminate their patients’ treatment early, or seriously compromise the survivors’ quality of life.

The drug was much studied in the 1970s, but research trailed off in the subsequent decades due to the limits of existing technologies. Nearly fifty years later, Naama Kanarek, then a postdoctoral researcher in the lab of former Whitehead Institute Member David Sabatini, decided to take a fundamental research approach to studying the effects of methotrexate, in hopes that she might discover some way to make the drug less toxic.

“We now have access to genetic tools that allow us to address long-standing questions in a way that was not possible before,” said Kanarek, who now runs her own lab at Boston Children’s Hospital. “We can use a CRISPR screen, and instead of focusing on what is known, we can ask what is unknown about the drug. We can find new genes that are involved in the response of cells to the drug that were not found before simply because the tools were not there.”

The screen revealed one gene in particular that seemed to be playing a role in how sensitive cancer cells were to methotrexate, the researchers reported in Nature in July of 2018. The cells’ sensitivity is important, Kanarek said, because if the cells can be made more vulnerable to methotrexate, the duration of treatment or required dose could be reduced. “If we can reduce dose because we can improve efficacy, then we can reduce toxicity without compromising on the cure rates and that is good news to the patients,” Kanarek said.

The gene in question, called FTCD, encodes an enzyme involved in the breakdown of the amino acid histidine. When the gene was knocked out, cancer cells were less sensitive to methotrexate. When the pathway was boosted with the addition of extra histidine, cells became more sensitive.

Former Whitehead Institute Director Susan Lindquist, who passed away in 2016, performed similar work on the natural product amphotericin B, a drug that is used to treat some fungal infections. The drug is especially useful because fungi have not yet developed a resistance to it, as they have with so many other treatments. But amphotericin B also has some serious drawbacks; it can cause kidney damage, heart failure, and other serious and even fatal side effects.

These side effects mostly happen because amphotericin B works by binding to a class of molecules called sterols. In fungi, it binds to molecules called ergosterols in the cell membrane, destabilizing the cells. Unfortunately, humans also have a prevalent sterol: cholesterol. When amphotericin B binds to cholesterol in human cell membranes, it can damage human cells.

Using chemical synthesis methods, Lindquist and colleagues at Whitehead Institute and elsewhere were able to tweak the structure of the drug to bind only to ergosterol molecules and not cholesterol, bypassing most of the harmful side effects.

Why fundamental research 

Drug development is often an extremely targeted pursuit, but the advances made by Whitehead Institute scientists have mostly come from simple curiosity about cellular mechanisms. For example, Rick Young didn’t set out to study condensates, but an inquiry into the fundamentals of transcription led him in this entirely new direction.

Such fundamental research has the potential to branch in any number of different ways. “Fundamental science can lead in directions that you would not foresee,” said the Institute’s Associate Director of Intellectual Property Shoji Takahashi. Basic research into drug behavior is essential and can contribute to life-changing therapies down the line.

Pioneering a deeper understanding of metabolism
Merrill Meadow | Whitehead Institute
March 23, 2022

Metabolism is the sum of life-sustaining chemical reactions occurring in cells and across whole organisms. The human genome codes for thousands of metabolic enzymes, and specific metabolic pathways play significant roles in many biological processes—from breaking down food to release energy, to normal proliferation and differentiation of cells, to pathologies underlying diabetes, cancer, and other diseases.

For decades, Whitehead Institute researchers have helped both to clarify how metabolism works in healthy states and to identify how metabolic processes gone awry contribute to diseases. Among Whitehead Founding Member Harvey Lodish’s wide-ranging accomplishments, for example, are the identification of genes and proteins involved in development of insulin resistance and stress responses in fat cells. His lab explored the hormones controlling fatty acid and glucose metabolism, broadening understanding of obesity and type 2 diabetes. In 1995, the lab cloned adiponectin, a hormone made exclusively by fat cells. A long series of studies has shown that adiponectin causes muscle to burn fatty acids faster – so they are not stored as fat – and increases the metabolism of the sugar glucose. More recently the lab identified and characterized types of RNAs that are specifically expressed in fat cells – including a microRNA unique to brown fat, which burns rather than stores fatty acids. In addition, former Member David Sabatini’s discovery of the mTOR protein and his subsequent work elaborating the many ways in which the mTOR pathway affects cell function have proven to be fundamental to understanding the relationship between metabolism and an array of diseases.

Today, Institute researchers continue to pioneer a deeper understanding of how metabolic processes contribute to health and disease – with long-term implications that could range from new treatments for obesity and type 2 diabetes to methods for slowing the aging process. Here are a few examples of Whitehead Institute scientists’ creative and pioneering work in the field of metabolism.

Understanding hibernation and torpor 

Research inspiration comes in many forms. For example, Whitehead Institute Member Siniša Hrvatin – who joined the faculty in January 2022 from Harvard Medical School (HMS) – was inspired to pursue his current research by science-fiction tales about suspended animation for long-term space travel. And during graduate school, he realized that the ability of some mammals to enter a state of greatly reduced metabolism – such as occurs in hibernation – was a mild but real-world form of suspended animation.

Hrvatin’s doctoral research in Doug Melton’s lab at Harvard University focused primarily on stem cell biology. But his subsequent postdoctoral research positions at the Massachusetts Institute of Technology (MIT) and HMS enabled him to begin exploring the mechanisms and impact of reduced metabolic states in mammals. The timing was serendipitous, too, because he was able to use the growing array of genetic tools that were becoming available – and create some new tools of his own as well.

“To survive extreme environments, many animals have evolved the ability to profoundly decrease metabolic rate and body temperature and enter states of dormancy, such as hibernation and torpor,” Hrvatin says. Hibernating animals enter repeated states of significantly reduced metabolic activity, each lasting days to weeks. By comparison, daily torpor is shorter, with animals entering repeated periods of lower-than-normal metabolic activity lasting several hours.

Hrvatin’s lab studies the mysteries of how animals and their cells initiate, regulate, and survive these adaptations. “Our long-term goal is to determine if these adaptations can be harnessed to create therapeutic applications for humans.” He and his team are focusing on three broad questions regarding the mechanisms underlying torpor in mice and hibernation in hamsters.

First: How do the animals’ brains initiate and regulate the metabolic processes involved in torpor and hibernation? During his postdoctoral research, Hrvatin published details of his discovery of neurons involved in the regulation of mouse torpor. “Now we are investigating how these torpor-regulating neurons receive information about the body’s energy-state,” he explains, “and studying how these neurons then drive a decrease of metabolic rate and body temperature throughout the body.”

Second: How do individual cells – and their genomes – adapt to extreme or changing environments; and how do these adaptations differ between types of organisms?

“Cells from hibernating organisms ranging from rodents to bears have evolved the ability to survive extreme cold temperatures for many weeks to months,” Hrvatin notes. “We are using genetic screens to identify species-specific mechanisms of tolerance to extreme cold. Then we will explore whether these mechanisms can be induced in non-hibernating organisms – potentially to provide health benefits.”

Third: Can we deliberately and specifically slow down tissue damage, disease progression, and/or aging in cells and whole organisms by inducing torpor or hibernation – or facets of those states? It has long been known that hibernating animals live longer than closely related non-hibernators; that cancer cells do not replicate during hibernation; and that cold can help protect neurons from the effects of loss of oxygen. However, the cellular mechanisms underlying these phenomena remain largely unknown. Hrvatin’s lab will induce a long-term hibernation-like state in mice and natural hibernation in hamsters, and study how those states affect processes such as tissue repair, cancer progression, and aging.

“In the lab, if you take many human cell types and put them in a cold environment they die, but cells from hibernators survive,” Hrvatin notes. “We’re fascinated by the cellular processes underlying those survival capacities. As a starting place, we are using novel CRISPR screening approaches to help us identify the genomic mechanisms involved.”

And then? “Ultimately, we hope to take on the biggest question: Is it possible to transfer some of those survival abilities to humans?”

Solving a mitochondrial conundrum

When Whitehead Institute postdoctoral researcher Jessica Spinelli was studying cancer metabolism in graduate school, she became interested in what seemed to be a scientific paradox regarding mitochondria, the cell’s energy-producing organelles: Mitochondria are believed to be important for tumor growth; but they generally need oxygen to function, and substantial portions of tumors have very low oxygen levels. Pursuing research in the lab of former Whitehead Institute Member David Sabatini, Spinelli sought to understand how those facts fit together and whether mitochondria could somehow adapt to function with limited oxygen levels.

Recently, Spinelli and colleagues published an answer to the conundrum – one that could inform research into medical conditions including ischemia, diabetes and cancer. In a Science paper for which Spinelli was first author, the team demonstrated that when cells are deprived of oxygen, a molecule called fumarate can serve as a substitute and enable mitochondria to continue functioning.

As Spinelli explains, humans need oxygen molecules for the cellular respiration process that takes place in our cells’ mitochondria. In this process – called the electron transport chain – electrons are passed along in a sort of cellular relay race that, ultimately, allows the cell to create the energy needed to perform its vital functions. Usually, oxygen is necessary to keep that process operating.

Using mass spectrometry to measure the quantities of molecules produced through cellular respiration in varied conditions, Spinelli and the team found that cells deprived of oxygen had a high level of succinate molecules, which form when electrons are added to a molecule called fumarate. “From this, we hypothesized that the accumulation of succinate in low-oxygen environments is caused by mitochondria using fumarate as a substitute for oxygen’s role in the electron transport chain,” Spinelli explains. “That could explain how mitochondria function with relatively little oxygen.” The next step was to test that hypothesis in mice, and those studies provided several interesting findings: Only mitochondria in kidney, liver, and brain tissues could use fumarate in the electron transport chain. And even in normal conditions, mitochondria in these tissues used both fumarate and oxygen to function – shifting to rely more heavily on fumarate when oxygen was reduced. In contrast, heart and skeletal muscle mitochondria made minimal use of fumarate and did not function well with limited oxygen.

“We foresee some exciting work ahead, learning exactly how this process is regulated in different tissues,” Spinelli says, “and, especially, in solid tumor cancers, where oxygen levels vary between regions.”

Seeking a more accurate model of diabetes

Max Friesen, a postdoctoral researcher in the lab of Whitehead Institute Founding Member Rudolf Jaenisch, studies the role of cell metabolism in type 2 diabetes (T2D). An increasingly prevalent disease that affects millions of people around the world, T2D is hard to study in the lab. This has made it very challenging for scientists to detail the cellular mechanisms through which it develops – and therefore to create effective therapeutics.

“It has always been very hard to model T2D, because metabolism differs greatly between species,” Friesen says. “That fact leads to complications when we use animal models to study this disease. Mice, for example, have much higher metabolism and faster heart rates than humans. As a result, researchers have developed many approaches that cure diabetes in mice but that fail in humans.” Nor do most in vitro culture systems—cells in a dish—effectively recapitulate the disease.

But, building on Jaenisch’s pioneering success in developing disease models derived from human stem cells, Friesen is working to create a much more accurate in vitro system for studying diabetes. His goal is to make human stem cell-derived tissues that function as they would in the human body, closely recapitulating what happens when an individual develops diabetes. Currently, Friesen is differentiating human stem cells into metabolic tissues such as liver and adipocytes (fat). He has improved current differentiation protocols by adapting these cells to a culture medium that is much closer to the environment they see in the human body. Serendipitously, the process also makes the cells responsive to insulin at levels that are present in the human bloodstream. “This serves as a great model of a healthy cell that we can then turn into a disease model by exposing the cell to diabetic hyperinsulinemia,” Friesen says.

These advances should enable him to gain a better understanding of how metabolic pathways – such as the insulin signaling pathway – function in a diabetic model versus a healthy control model. “My hope is that our new models will enable us to figure out how dietary insulin resistance develops, and then identify a therapeutic intervention that blocks that disease-causing process,” he explains. “It would be fantastic to help alleviate this growing global health burden.”

Yukiko Yamashita, unraveler of stem cells’ secrets

The MIT biologist’s research has shed light on the immortality of germline cells and the function of “junk DNA.”

Anne Trafton | MIT News Office
March 22, 2022

When cells divide, they usually generate two identical daughter cells. However, there are some important exceptions to this rule: When stem cells divide, they often produce one differentiated cell along with another stem cell, to maintain the pool of stem cells.

Yukiko Yamashita has spent much of her career exploring how these “asymmetrical” cell divisions occur. These processes are critically important not only for cells to develop into different types of tissue, but also for germline cells such as eggs and sperm to maintain their viability from generation to generation.

“We came from our parents’ germ cells, who used to be also single cells who came from the germ cells of their parents, who used to be single cells that came from their parents, and so on. That means our existence can be tracked through the history of multicellular life,” Yamashita says. “How germ cells manage to not go extinct, while our somatic cells cannot last that long, is a fascinating question.”

Yamashita, who began her faculty career at the University of Michigan, joined MIT and the Whitehead Institute in 2020, as the inaugural holder of the Susan Lindquist Chair for Women in Science and a professor in the Department of Biology. She was drawn to MIT, she says, by the eagerness to explore new ideas that she found among other scientists.

“When I visited MIT, I really enjoyed talking to people here,” she says. “They are very curious, and they are very open to unconventional ideas. I realized I would have a lot of fun if I came here.”

Exploring paradoxes

Before she even knew what a scientist was, Yamashita knew that she wanted to be one.

“My father was an admirer of Albert Einstein, so because of that, I grew up thinking that the pursuit of the truth is the best thing you could do with your life,” she recalls. “At the age of 2 or 3, I didn’t know there was such a thing as a professor, or such a thing as a scientist, but I thought doing science was probably the coolest thing I could do.”

Yamashita majored in biology at Kyoto University and then stayed to pursue her PhD, studying how cells make exact copies of themselves when they divide. As a postdoc at Stanford University, she became interested in the exceptions to that carefully orchestrated process, and began to study how cells undergo divisions that produce daughter cells that are not identical. This kind of asymmetric division is critical for multicellular organisms, which begin life as a single cell that eventually differentiates into many types of tissue.

Those studies led to a discovery that helped to overturn previous theories about the role of so-called junk DNA. These sequences, which make up most of the genome, were thought to be essentially useless because they don’t code for any proteins. To Yamashita, it seemed paradoxical that cells would carry so much DNA that wasn’t serving any purpose.

“I couldn’t really believe that huge amount of our DNA is junk, because every time a cell divides, it still has the burden of replicating that junk,” she says. “So, my lab started studying the function of that junk, and then we realized it is a really important part of the chromosome.”

In human cells, the genome is stored on 23 pairs of chromosomes. Keeping all of those chromosomes together is critical to cells’ ability to copy genes when they are needed. Over several years, Yamashita and her colleagues at the University of Michigan, and then at MIT, discovered that stretches of junk DNA act like barcodes, labeling each chromosome and helping them bind to proteins that bundle chromosomes together within the cell nucleus.

Without those barcodes, chromosomes scatter and start to leak out of the cell’s nucleus. Another intriguing observation regarding these stretches of junk DNA was that they have much greater variability between different species than protein-coding regions of DNA. By crossing two different species of fruit flies, Yamashita showed that in cells of the hybrid offspring flies, chromosomes leak out just as they would if they lost their barcodes, suggesting that the codes are specific to each species.

“We think that might be one of the big reasons why different species become incompatible, because they don’t have the right information to bundle all of their chromosomes together into one place,” Yamashita says.

Stem cell longevity

Yamashita’s interest in stem cells also led her to study how germline cells (the cells that give rise to eggs and sperm cells) maintain their viability across generations, far longer than regular body cells do. In typical animal cells, one factor that contributes to age-related decline is loss of genetic sequences that encode genes that cells use continuously, such as genes for ribosomal RNAs.

A typical human cell may have hundreds of copies of these critical genes, but as cells age, they lose some of them. For germline cells, this can be detrimental because if the numbers get too low, the cells can no longer form viable daughter cells.

Yamashita and her colleagues found that germline cells overcome this by tearing sections of DNA out of one daughter cell during cell division and transferring them to the other daughter cell. That way, one daughter cell has the full complement of those genes restored, while the other cell is sacrificed.

That wasteful strategy would likely be too extravagant to work for all cells in the body, but for the small population of germline cells, the tradeoff is worthwhile, Yamashita says.

“If skin cells did that kind of thing, where every time you make one cell, you are essentially trashing the other one, you couldn’t afford it. You would be wasting too many resources,” she says. “Germ cells are not critical for viability of an organism. You have the luxury to put many resources into them but then let only half of the cells recover.”

An ‘oracle’ for predicting the evolution of gene regulation

Researchers created a mathematical framework to examine the genome and detect signatures of natural selection, deciphering the evolutionary past and future of non-coding DNA.

Raleigh McElvery
March 9, 2022

Despite the sheer number of genes that each human cell contains, these so-called “coding” DNA sequences comprise just 1% of our entire genome. The remaining 99% is made up of “non-coding” DNA — which, unlike coding DNA, does not carry the instructions to build proteins.

One vital function of this non-coding DNA, also called “regulatory” DNA, is to help turn genes on and off, controlling how much (if any) of a protein is made. Over time, as cells replicate their DNA to grow and divide, mutations often crop up in these non-coding regions — sometimes tweaking their function and changing the way they control gene expression. Many of these mutations are trivial, and some are even beneficial. Occasionally, though, they can be associated with increased risk of common diseases, such as type 2 diabetes, or more life-threatening ones, including cancer.

To better understand the repercussions of such mutations, researchers have been hard at work on mathematical maps that allow them to look at an organism’s genome, predict which genes will be expressed, and determine how that expression will affect the organism’s observable traits. These maps, called fitness landscapes, were conceptualized roughly a century ago to understand how genetic makeup influences one common measure of organismal fitness in particular: reproductive success. Early fitness landscapes were very simple, often focusing on a limited number of mutations. Much richer data sets are now available, but researchers still require additional tools to characterize and visualize such complex data. This ability would not only facilitate a better understanding of how individual genes have evolved over time, but would also help to predict what sequence and expression changes might occur in the future.

In a new study published on March 9 in Nature, a team of scientists has developed a framework for studying the fitness landscapes of regulatory DNA. They created a neural network model that, when trained on hundreds of millions of experimental measurements, was capable of predicting how changes to these non-coding sequences in yeast affected gene expression. They also devised a unique way of representing the landscapes in two dimensions, making it easy to understand the past and forecast the future evolution of non-coding sequences in organisms beyond yeast — and even design custom gene expression patterns for gene therapies and industrial applications.

“We now have an ‘oracle’ that can be queried to ask: What if we tried all possible mutations of this sequence? Or, what new sequence should we design to give us a desired expression?” says Aviv Regev, a professor of biology at MIT (on leave), core member of the Broad Institute of Harvard and MIT (on leave), head of Genentech Research and Early Development, and the study’s senior author. “Scientists can now use the model for their own evolutionary question or scenario, and for other problems like making sequences that control gene expression in desired ways. I am also excited about the possibilities for machine learning researchers interested in interpretability; they can ask their questions in reverse, to better understand the underlying biology.”

Prior to this study, many researchers had simply trained their models on known mutations (or slight variations thereof) that exist in nature. However, Regev’s team wanted to go a step further by creating their own unbiased models capable of predicting an organism’s fitness and gene expression based on any possible DNA sequence — even sequences they’d never seen before. This would also enable researchers to use such models to engineer cells for pharmaceutical purposes, including new treatments for cancer and autoimmune disorders.

To accomplish this goal, Eeshit Dhaval Vaishnav, a graduate student at MIT and co-first author, Carl de Boer, now an assistant professor at the University of British Columbia, and their colleagues created a neural network model to predict gene expression. They trained it on a dataset generated by inserting millions of totally random non-coding DNA sequences into yeast, and observing how each random sequence affected gene expression. They focused on a particular subset of non-coding DNA sequences called promoters, which serve as binding sites for proteins that can switch nearby genes on or off.
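For readers who want a concrete picture of what it means to query a model about every possible mutation of a sequence, here is a minimal Python sketch. It is an illustration only: `toy_expression` is a hypothetical stand-in that simply scores G/C content, not the neural network from the study, and the function names are invented for this example.

```python
from typing import Callable, Iterator

BASES = "ACGT"


def single_mutants(seq: str) -> Iterator[str]:
    """Yield every sequence that differs from `seq` at exactly one position."""
    for i, original in enumerate(seq):
        for base in BASES:
            if base != original:
                yield seq[:i] + base + seq[i + 1:]


def toy_expression(seq: str) -> float:
    """Hypothetical placeholder 'oracle': the fraction of G/C bases.

    This is NOT the study's model; a trained predictor would go here.
    """
    return sum(base in "GC" for base in seq) / len(seq)


def best_single_mutant(seq: str, predict: Callable[[str], float]) -> str:
    """Return the one-base variant of `seq` with the highest predicted score."""
    return max(single_mutants(seq), key=predict)


if __name__ == "__main__":
    promoter = "AAAA"  # toy sequence, far shorter than a real promoter
    print(len(list(single_mutants(promoter))))  # 3 alternatives x 4 positions
    print(best_single_mutant(promoter, toy_expression))
```

A sequence of length n has 3n single-base variants, so an exhaustive scan of this kind is cheap for a computer model even when testing each variant at the lab bench would not be.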

“This work highlights what possibilities open up when we design new kinds of experiments to generate the right data to train models,” Regev says. “In the broader sense, I believe these kinds of approaches will be important for many problems — like understanding genetic variants in regulatory regions that confer disease risk in the human genome, but also for predicting the impact of combinations of mutations, or designing new molecules.”

Regev, Vaishnav, de Boer, and their coauthors went on to test their model’s predictive abilities in a variety of ways, in order to show how it could help demystify the evolutionary past — and possible future — of certain promoters. “Creating an accurate model was certainly an accomplishment, but, to me, it was really just a starting point,” Vaishnav explains.

First, to determine whether their model could help with synthetic biology applications like producing antibiotics, enzymes, and food, the researchers practiced using it to design promoters that could generate desired expression levels for any gene of interest. They then scoured other scientific papers to identify fundamental evolutionary questions, in order to see if their model could help answer them. The team even went so far as to feed their model a real-world population data set from one existing study, which contained genetic information from yeast strains around the world. In doing so, they were able to delineate thousands of years of past selection pressures that sculpted the genomes of today’s yeast.

But, in order to create a powerful tool that could probe any genome, the researchers knew they’d need to find a way to forecast the evolution of non-coding sequences even without such a comprehensive population data set. To address this goal, Vaishnav and his colleagues devised a computational technique that allowed them to plot the predictions from their framework onto a two-dimensional graph. This helped them show, in a remarkably simple manner, how any non-coding DNA sequence would affect gene expression and fitness, without needing to conduct any time-consuming experiments at the lab bench.

“One of the unsolved problems in fitness landscapes was that we didn’t have an approach for visualizing them in a way that meaningfully captured the evolutionary properties of sequences,” Vaishnav explains. “I really wanted to find a way to fill that gap, and contribute to the longstanding vision of creating a complete fitness landscape.”

Martin Taylor, a professor of genetics at the University of Edinburgh’s Medical Research Council Human Genetics Unit who was not involved in the research, says the study shows that artificial intelligence can not only predict the effect of regulatory DNA changes, but also reveal the underlying principles that govern millions of years of evolution.

Despite the fact that the model was trained on just a fraction of yeast regulatory DNA in a few growth conditions, he’s impressed that it’s capable of making such useful predictions about the evolution of gene regulation in mammals.

“There are obvious near-term applications, such as the custom design of regulatory DNA for yeast in brewing, baking, and biotechnology,” he explains. “But extensions of this work could also help identify disease mutations in human regulatory DNA that are currently difficult to find and largely overlooked in the clinic. This work suggests there is a bright future for AI models of gene regulation trained on richer, more complex, and more diverse data sets.”

Even before the study was formally published, Vaishnav began receiving queries from other researchers hoping to use the model to devise non-coding DNA sequences for use in gene therapies.

“People have been studying regulatory evolution and fitness landscapes for decades now,” Vaishnav says. “I think our framework will go a long way in answering fundamental, open questions about the evolution and evolvability of gene regulatory DNA — and even help us design biological sequences for exciting new applications.”

Sometimes science takes a village
Greta Friar | Whitehead Institute
February 17, 2022

Alexandra Navarro, a graduate student in Whitehead Institute Member Iain Cheeseman’s lab, was studying the gene for CENPR, a protein related to cell division—the Cheeseman lab’s research focus—when she came across something interesting: another molecule hidden in CENPR’s genetic code. The hidden molecule is a peptide only 37 amino acids long, too small to show up in most surveys of the cell. It gets created only when the genetic code for CENPR is translated from an offset start and stopping place—essentially, when a cell reads the instructions for making CENPR in a different way. The Cheeseman lab has become very interested in these sorts of hidden molecules, which they have found lurking in a number of other molecules’ genetic codes. Navarro began studying the peptide as a side project during slow periods in her main research on cell division proteins. However, as her research on the peptide progressed, Navarro eventually found herself unsure of how to proceed. CENPR belongs in the centromere, a part of the cell necessary for cell division, but the alternative peptide ends up in the Golgi, a structure that helps to modify molecules and prep them for delivery to different destinations. In other words, the peptide had nothing to do with the part of the cell that Navarro and Cheeseman typically study.

Usually when Navarro comes across something outside of her area of expertise, she will consult with her lab mates, others in Whitehead Institute, or nearby collaborators. However, none of her usual collaborators’ research focuses on the Golgi, so this time Cheeseman suggested that Navarro share what they had found and ask for input from as wide a circle of researchers as possible—on the internet. Often, researchers guard their work in progress carefully, reluctant to share it lest they be scooped, which means someone else publishes a paper on the same topic first. In the competitive world of academic research, where publishing papers is a key part of getting jobs, tenure, and future funding, the specter of scooping can loom large. But science is also an inherently collaborative practice, with scientists contributing droplets of discovery to a shared pool of knowledge, so that new findings can be built upon what came before. Cheeseman is a board member of ASAPbio (Accelerating Science and Publication in biology), a nonprofit that promotes open communication, the use of preprints, and transparent peer review in the life sciences. Researchers like Cheeseman believe that if science adopts more transparent and collaborative practices, such as more frequently and widely sharing research in progress, this will benefit both the people involved and the quality of the science, and will speed up the search for discoveries with the potential to positively impact humankind. But how helpful are such “open science” practices in reality? Navarro and Cheeseman had the perfect opportunity to find out.

The power of preprints

Navarro and Cheeseman wrote up what they knew so far (they had found a hidden peptide that localizes exclusively to the Golgi and stays there throughout the cell cycle) as a “preprint in progress,” an incomplete draft of a paper that acknowledges there is more to come. In December 2020, they posted the preprint in progress to bioRxiv, a website that serves as a repository for biology preprints, or papers that have not yet been published. The site was inspired by arXiv, a similar repository launched in 1991 to provide free and easy access to research in math, physics, computer science, and similar fields. arXiv has become a central hub for research in these fields, with an average of 10,000 to 15,000 submissions and 30 million downloads per month. The biology fields were slower to create such a hub: bioRxiv launched in 2013. In December 2021, it received around 3,000 submissions and 2.3 million downloads.

Navarro and Cheeseman’s decision to post a preprint in progress to bioRxiv is not common practice, but many researchers have started posting preprints that resemble the final paper closer to publication. Some journals even require it. This type of early sharing has many benefits: contrary to the fear that sharing research before publication will lead to scooping, it allows researchers to stake a claim sooner by making their work public record pre-publication. Preprints enable researchers to show off their most current work during the narrow windows of the academic job cycle. This can be particularly crucial for early career researchers whose biggest project to date, such as graduate thesis work, is still in publication limbo. Preprints also allow new ideas and knowledge to get out into the world sooner, the better to inspire other researchers. Throughout biology’s history, findings that seemed minor at first have provided the key insight for someone else’s major discovery. The sooner research is shared, the sooner it can be built upon to develop important advances, like new medicines or a better model of how a disease spreads.

Navarro and Cheeseman weren’t expecting their discovery to have that kind of major impact, but they knew the peptide could be useful to researchers studying the Golgi. The peptide is small and doesn’t disrupt any functions in the cell. Researchers can attach fluorescent proteins to it that make the Golgi glow in imaging. These traits make the peptide a useful potential tool. Since Navarro and Cheeseman posted the preprint, multiple researchers have reached out about using the peptide.

However, the main goal of posting a preprint in progress, as opposed to a polished preprint, is to ask for input to further the research. The morning after the researchers put their preprint on bioRxiv, Cheeseman shared it on Twitter and asked for feedback. Other researchers soon shared the tweet further, and responses started flooding in. Some researchers simply commented that they found the project interesting, which was reassuring for Cheeseman and Navarro.

“It was nice to see that we weren’t the only ones who thought this thing we found was really cool,” Navarro says. “It gave me a lot of motivation to keep moving on with this project.”

Then, some researchers had specific questions and ideas. The topic that seemed of greatest interest was how the peptide ends up at the Golgi, followed by where exactly in the Golgi it ends up. Researchers suggested online tools that might help predict answers to these questions. They proposed different mechanisms that might be involved.

Navarro used these suggestions to design a new series of experiments to better characterize how the peptide associates with the Golgi. She found that the peptide attaches to the Golgi’s outer-facing membrane. She started developing an understanding of which of the peptide’s 37 amino acids were necessary for Golgi localization, and so was able to narrow in on a 14-amino acid sequence within the peptide that was sufficient for this localization.

Her next question was what specific mechanisms were driving the peptide’s Golgi localization. Navarro had a good lead for one mechanism: the evidence and outside input suggested that after the peptide was created, it likely underwent a modification that gave it a sticky tag to anchor it to the Golgi. What would be the best experiments to confirm this mechanism and determine the other mechanisms involved? Navarro and Cheeseman decided it was time to check back in with the crowd online.

Narrowing in on answers

Navarro and Cheeseman updated their preprint with their new findings, and invited further feedback. This time, they had a specific ask: how to test whether the peptide has the modification they suspected. They received suggestions: a probe, an inhibitor. They also received some unexpected feedback that took them in a new direction. Harmit Malik, professor and associate director of the basic sciences division at Fred Hutch, studies the evolutionary changes that occur in genes. Malik found the peptide interesting enough to dig into its evolutionary history across primates. He emailed Cheeseman and Navarro his findings. Versions of the peptide existed in many primates, and some of the variations between species affected where the peptide ended up. This was a rich new vein of inquiry for Navarro to follow in order to pinpoint exactly which parts of the sequence were necessary for Golgi localization, and the researchers might never have come across it if they had not sought input online.

Guided by the latest set of suggestions, Navarro resumed work on the project. She found evidence that the peptide does undergo the suspected modification. She winnowed the peptide down to a 10-amino acid sequence that appears to be the minimal sequence necessary for this type of Golgi localization. Navarro and Cheeseman rewrote the paper, adding the discovery of a minimal Golgi targeting sequence, basically a postal code that marks a molecule’s destination as the Golgi. They posted a third version of the preprint in September 2021. This time, Cheeseman did not ask Twitter for feedback: the paper may undergo more changes, but it now contains a complete research story.

The changing face of science

Based on their experience, would Cheeseman and Navarro recommend sharing preprints in progress? The answer is a resounding yes—if the project is a good fit. Both agree that for projects like this, where the subject is outside the expertise of a researcher’s usual circle of collaborators, asking the wider scientific community for help can be extremely valuable.

“I often share my research with other people at Whitehead Institute, and other cell division researchers at conferences, but this process allowed me to share it with people who work in different scientific areas, with whom I would not normally engage,” Navarro says.

Cheeseman hopes that sharing hubs like bioRxiv will develop ways for even larger and more diverse groups of scientists to connect.

If researchers are hesitant to use an open science approach, Cheeseman and Navarro recommend testing the waters by starting with a lower stakes project. In this case, Navarro’s Golgi paper was a side project, something of personal interest but not integral to her career. Having had a positive experience using an open approach on this project, Cheeseman and Navarro agree they would be comfortable using such an approach again in the future.

“I wouldn’t suggest sharing a preprint in progress for every paper, but I think constructive opportunities are more plentiful than researchers may realize,” Cheeseman says.

In general, Cheeseman thinks, the biology field needs to re-envision how its science gets shared.

“The idea that one size fits all, that everything needs to be a multi-figure paper in a high impact journal, is just not compatible with the way that people do research,” Cheeseman says. “We need to get flexible and explore and value scholarship in every form.”

As for the peptide paper? Regardless of where it ends up, Cheeseman and Navarro consider their open science experiment a success. By sharing their research and asking for input, they gained insights, research tools, and points of view that took the project from a curious finding to a rich understanding of the mechanisms behind Golgi localization. Their early realization that the peptide functions outside of their region of expertise could have been a dead end. But by being open about what they were working on and what sort of guidance they needed, the researchers were able to overcome that hurdle and decode their mystery peptide, with a little help from the wider scientific community.