Francisco J. Sánchez-Rivera

Education

  • PhD, 2016, Biology, MIT
  • BS, 2008, Microbiology, University of Puerto Rico at Mayagüez

Research Summary

The overarching goal of the Sánchez-Rivera laboratory is to elucidate the cellular and molecular mechanisms by which genetic variation shapes normal physiology and disease, particularly in the context of cancer. To do so, we develop and apply genome engineering technologies, genetically engineered mouse models (GEMMs), and single-cell lineage tracing and omics approaches to obtain comprehensive biological pictures of disease evolution at single-cell resolution. Through this work, we hope to produce actionable discoveries that could pave the way for better therapeutic strategies to treat cancer and other diseases.

Awards

  • V Foundation Award, 2022
  • Hanna H. Gray Fellowship, Howard Hughes Medical Institute, 2018-2026
  • GMTEC Postdoctoral Researcher Innovation Grant, Memorial Sloan Kettering Cancer Center, 2020-2022
  • 100 inspiring Hispanic/Latinx scientists in America, Cell Mentor/Cell Press, 2020

Olivia Corradin

Education

  • PhD, 2015, Case Western Reserve University
  • BS, 2010, Biochemistry, Marquette University

Research Summary

Our lab studies genetic and epigenetic variation that contributes to human disease by disrupting gene expression programs. We utilize biological insights into the mechanisms of gene regulation in order to determine the impact of disease-associated variants on cellular function. We aim to identify actionable insights into disease pathogenesis by studying the confluence of genetic and epigenetic risk factors of human diseases, including multiple sclerosis and opioid use disorder.

Awards

  • NIH Director’s Pioneer Award Program Avenir Award, 2017

The Davis and Berger labs combined cryo-electron microscopy and machine learning to visualize molecules in 3D.

February 4, 2021
Machine-learning model helps determine protein structures

New technique reveals many possible conformations that a protein may take.

Anne Trafton | MIT News Office

Cryo-electron microscopy (cryo-EM) allows scientists to produce high-resolution, three-dimensional images of tiny molecules such as proteins. This technique works best for imaging proteins that exist in only one conformation, but MIT researchers have now developed a machine-learning algorithm that helps them identify multiple possible structures that a protein can take.

Some AI techniques aim to predict protein structure from sequence data alone, but structure can also be determined experimentally using cryo-EM, which produces hundreds of thousands, or even millions, of two-dimensional images of protein samples frozen in a thin layer of ice. Computer algorithms then piece together these images, taken from different angles, into a three-dimensional representation of the protein in a process termed reconstruction.

In a Nature Methods paper, the MIT researchers report a new AI-based software for reconstructing multiple structures and motions of the imaged protein — a major goal in the protein science community. Instead of using the traditional representation of protein structure as electron-scattering intensities on a 3D lattice, which is impractical for modeling multiple structures, the researchers introduced a new neural network architecture that can efficiently generate the full ensemble of structures in a single model.

“With the broad representation power of neural networks, we can extract structural information from noisy images and visualize detailed movements of macromolecular machines,” says Ellen Zhong, an MIT graduate student and the lead author of the paper.

With their software, they discovered protein motions from imaging datasets where only a single static 3D structure was originally identified. They also visualized large-scale flexible motions of the spliceosome — a protein complex that coordinates the splicing of the protein coding sequences of transcribed RNA.

“Our idea was to try to use machine-learning techniques to better capture the underlying structural heterogeneity, and to allow us to inspect the variety of structural states that are present in a sample,” says Joseph Davis, the Whitehead Career Development Assistant Professor in MIT’s Department of Biology.

Davis and Bonnie Berger, the Simons Professor of Mathematics at MIT and head of the Computation and Biology group at the Computer Science and Artificial Intelligence Laboratory, are the senior authors of the study, which appears today in Nature Methods. MIT postdoc Tristan Bepler is also an author of the paper.

Visualizing a multistep process

The researchers demonstrated the utility of their new approach by analyzing structures that form during the process of assembling ribosomes — the cell organelles responsible for reading messenger RNA and translating it into proteins. Davis began studying the structure of ribosomes while a postdoc at the Scripps Research Institute. Ribosomes have two major subunits, each of which contains many individual proteins that are assembled in a multistep process.

To study the steps of ribosome assembly in detail, Davis stalled the process at different points and then took electron microscope images of the resulting structures. At some points, blocking assembly resulted in accumulation of just a single structure, suggesting that there is only one way for that step to occur. However, blocking other points resulted in many different structures, suggesting that the assembly could occur in a variety of ways.

Because some of these experiments generated so many different protein structures, traditional cryo-EM reconstruction tools did not work well to determine what those structures were.

“In general, it’s an extremely challenging problem to try to figure out how many states you have when you have a mixture of particles,” Davis says.

After starting his lab at MIT in 2017, he teamed up with Berger to use machine learning to develop a model that can use the two-dimensional images produced by cryo-EM to generate all of the three-dimensional structures found in the original sample.

In the new Nature Methods study, the researchers demonstrated the power of the technique by using it to identify a new ribosomal state that hadn’t been seen before. Previous studies had suggested that as a ribosome is assembled, large structural elements, which are akin to the foundation for a building, form first. Only after this foundation is formed are the “active sites” of the ribosome, which read messenger RNA and synthesize proteins, added to the structure.

In the new study, however, the researchers found that in a very small subset of ribosomes, about 1 percent, a structure that is normally added at the end actually appears before assembly of the foundation. To account for that, Davis hypothesizes that it might be too energetically expensive for cells to ensure that every single ribosome is assembled in the correct order.

“The cells are likely evolved to find a balance between what they can tolerate, which is maybe a small percentage of these types of potentially deleterious structures, and what it would cost to completely remove them from the assembly pathway,” he says.

Viral proteins

The researchers are now using this technique to study the coronavirus spike protein, which is the viral protein that binds to receptors on human cells and allows the virus to enter them. The receptor binding domain (RBD) of the spike protein has three subunits, each of which can point either up or down.

“For me, watching the pandemic unfold over the past year has emphasized how important front-line antiviral drugs will be in battling similar viruses, which are likely to emerge in the future. As we start to think about how one might develop small molecule compounds to force all of the RBDs into the ‘down’ state so that they can’t interact with human cells, understanding exactly what the ‘up’ state looks like and how much conformational flexibility there is will be informative for drug design. We hope our new technique can reveal these sorts of structural details,” Davis says.

The research was funded by the National Science Foundation Graduate Research Fellowship Program, the National Institutes of Health, and the MIT Jameel Clinic for Machine Learning and Health. This work was supported by the MIT Satori computation cluster hosted at the MGHPCC.

New gene regulation model provides insight into brain development

A well-known protein family binds to many more RNA sequences than previously thought to help neurons grow.

Raleigh McElvery
August 17, 2020

In every cell, RNA-binding proteins (RBPs) help tune gene expression and control biological processes by binding to RNA sequences. Researchers often assume that individual RBPs latch tightly to just one RNA sequence. For instance, an essential family of RBPs, the Rbfox family, was thought to bind one particular RNA sequence alone. However, it’s becoming increasingly clear that this idea greatly oversimplifies Rbfox’s vital role in development.

Members of the Rbfox family are among the best-studied RBPs and have been implicated in mammalian brain, heart, and muscle development since their discovery 25 years ago. They influence how RNA transcripts are “spliced” together to form a final RNA product, and have been associated with disorders like autism and epilepsy. But this family of RBPs is compelling for another reason as well: until recently, it was considered a classic example of predictable binding.

More often than not, it seemed, Rbfox proteins bound to a very specific sequence, or motif, of nucleotide bases, “GCAUG.” Occasionally, binding analyses hinted that Rbfox proteins might attach to other RNA sequences as well, but these findings were usually discarded. Now, a team of biologists from MIT has found that Rbfox proteins actually bind less tightly — but no less frequently — to a handful of other RNA nucleotide sequences besides GCAUG. These so-called “secondary motifs” could be key to normal brain development, and help neurons grow and assume specific roles.

“Previously, possible binding of Rbfox proteins to atypical sites had been largely ignored,” says Christopher Burge, professor of biology and the study’s senior author. “But we’ve helped demonstrate that these secondary motifs form their own separate class of binding sites with important physiological functions.”

Graduate student Bridget Begg is the first author of the study, published on August 17 in Nature Structural & Molecular Biology.

“Two-wave” regulation

After the discovery that GCAUG was the primary RNA binding site for mammalian Rbfox proteins, researchers characterized its binding in living cells using a technique called CLIP (crosslinking-immunoprecipitation). However, CLIP has several limitations. For example, it can indicate where a protein is bound, but not how much protein is bound there. It’s also hampered by some technical biases, including substantial false-negative and false-positive results.

To address these shortcomings, the Burge lab developed two complementary techniques to better quantify protein binding, this time in a test tube: RBNS (RNA Bind-n-Seq), and later, nsRBNS (RNA Bind-n-Seq with natural sequences), both of which incubate an RBP of interest with a synthetic RNA library. First author Begg performed nsRBNS with naturally occurring mammalian RNA sequences, and identified a variety of intermediate-affinity secondary motifs that were bound in the absence of GCAUG. She then compared her own data with publicly available CLIP results to examine the “aberrant” binding that had often been discarded, demonstrating that signals for these motifs existed across many CLIP datasets.
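One generic way to summarize a bound-versus-input sequencing experiment of this kind is a k-mer enrichment ratio: how often a short sequence appears among protein-bound reads relative to the starting library. The sketch below is an illustration under that assumption, not the Burge lab's analysis pipeline; the function names and the toy reads are hypothetical.

```python
from collections import Counter

def kmer_frequencies(reads, k):
    """Return the relative frequency of every k-mer across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

def enrichment(bound_reads, input_reads, k=5):
    """Ratio of k-mer frequency in bound reads to frequency in input reads."""
    bound = kmer_frequencies(bound_reads, k)
    background = kmer_frequencies(input_reads, k)
    # Only k-mers seen in both libraries get a score (avoids division by zero).
    return {kmer: bound[kmer] / background[kmer]
            for kmer in bound if kmer in background}

# Made-up reads, for illustration only.
input_reads = ["AUGCAUGCCA", "CCAGUACGUA", "GGAUUCAGCA", "UUGCAUGAAC"]
bound_reads = ["AUGCAUGCCA", "UUGCAUGAAC", "UUGCAUGAAC", "AUGCAUGCCA"]

scores = enrichment(bound_reads, input_reads)
for kmer, score in sorted(scores.items(), key=lambda item: -item[1])[:3]:
    print(kmer, round(score, 2))
```

A ratio above 1 means the k-mer is over-represented among bound reads; real analyses would also need to account for sequencing depth and noise, which this sketch ignores.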

To probe the biological role of these motifs, Begg performed reporter assays to show that the motifs could regulate Rbfox’s RNA splicing behavior. Subsequently, computational analyses by Begg and co-author Marvin Jens using mouse neuronal data established a handful of secondary motifs that appeared to be involved in neuronal differentiation and cellular diversification.

Based on analyses of these key secondary motifs, Begg and colleagues devised a “two-wave” model. Early in development, they believe, Rbfox proteins bind predominantly to high-affinity RNA sequences like GCAUG, in order to tune gene expression. Later on, as the Rbfox concentration increases, those primary motifs become fully occupied and Rbfox additionally binds to the secondary motifs. This results in a second wave of Rbfox-regulated RNA splicing with a different set of genes.
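The concentration dependence behind the two-wave idea can be illustrated with the standard equilibrium expression for simple one-site binding, in which the fraction of sites occupied is [P] / ([P] + Kd). This is a minimal sketch, assuming non-cooperative binding and free protein in excess; the dissociation constants below are hypothetical values chosen for illustration, not measurements from the study.

```python
def fraction_bound(protein_conc, kd):
    """Equilibrium fraction of sites occupied for simple one-site binding."""
    return protein_conc / (protein_conc + kd)

# Hypothetical dissociation constants in nM (illustrative, not from the paper).
KD_PRIMARY = 1.0     # high-affinity site such as GCAUG
KD_SECONDARY = 50.0  # intermediate-affinity secondary motif

for conc in (1.0, 10.0, 100.0, 1000.0):
    primary = fraction_bound(conc, KD_PRIMARY)
    secondary = fraction_bound(conc, KD_SECONDARY)
    print(f"[Rbfox] = {conc:7.1f} nM  primary = {primary:.2f}  secondary = {secondary:.2f}")
```

With these assumed values, the primary site is mostly occupied at concentrations where the secondary site is still largely empty, and the secondary site fills only as the concentration rises further.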

Begg theorizes that the first wave of Rbfox proteins binds GCAUG sequences early in development, and she showed that they regulate genes involved in nerve growth, like cytoskeleton and membrane organization. The second wave appears to help neurons establish electrical and chemical signaling. In other cases, secondary motifs might help neurons specialize into different subtypes with different jobs.

John Conboy, a molecular biologist at Lawrence Berkeley National Lab and an expert in Rbfox binding, says the Burge lab’s two-wave model clearly shows how a single RBP can bind different RNA sequences — regulating splicing of distinct gene sets and influencing key processes during brain development. “This quantitative analysis of RNA-protein interactions, in a field that is often semi-quantitative at best, contributes fascinating new insights into the role of RNA splicing in cell type specification,” he says.

A binding spectrum

The researchers suspect that this two-wave model is not unique to Rbfox. “This is probably happening with many different RBPs that regulate development and other dynamic processes,” Burge says. “In the future, considering secondary motifs will help us to better understand developmental disorders and diseases, which can occur when RBPs are over- or under-expressed.”

Begg adds that secondary motifs should be incorporated into computer models that predict gene expression, in order to probe cellular behavior. “I think it’s very exciting that these more finely-tuned developmental processes, like neuronal differentiation, could be regulated by secondary motifs,” she says.

Both Begg and Burge agree it’s time to consider the entire spectrum of Rbfox binding, which is highly influenced by factors like protein concentration, binding strength, and timing. According to Begg, “Rbfox regulation is actually more complex than we sometimes give it credit for.”

Citation:
“Concentration-dependent splicing is enabled by Rbfox motifs of intermediate affinity”
Nature Structural & Molecular Biology, online August 17, 2020, DOI: 10.1038/s41594-020-0475-8
Bridget E. Begg, Marvin Jens, Peter Y. Wang, Christine M. Minor, and Christopher B. Burge

Top illustration: Some RNA-binding proteins like Rbfox (gold ellipses) help tune gene expression and control biological processes by latching onto more RNA sequences (black and gold lines) as their concentration increases (teal shading). Credit: Bridget Begg
Posted: 8.17.20

Bringing RNA into genomics

ENCODE consortium identifies RNA sequences that are involved in regulating gene expression.

Anne Trafton | MIT News Office
July 29, 2020

The human genome contains about 20,000 protein-coding genes, but the coding parts of our genes account for only about 2 percent of the entire genome. For the past two decades, scientists have been trying to find out what the other 98 percent is doing.

A research consortium known as ENCODE (Encyclopedia of DNA Elements) has made significant progress toward that goal, identifying many genome locations that bind to regulatory proteins, helping to control which genes get turned on or off. In a new study that is also part of ENCODE, researchers have now identified many additional sites that code for RNA molecules that are likely to influence gene expression.

These RNA sequences do not get translated into proteins, but act in a variety of ways to control how much protein is made from protein-coding genes. The research team, which includes scientists from MIT and several other institutions, made use of RNA-binding proteins to help them locate and assign possible functions to tens of thousands of sequences of the genome.

“This is the first large-scale functional genomic analysis of RNA-binding proteins with multiple different techniques,” says Christopher Burge, an MIT professor of biology. “With the technologies for studying RNA-binding proteins now approaching the level of those that have been available for studying DNA-binding proteins, we hope to bring RNA function more fully into the genomic world.”

Burge is one of the senior authors of the study, along with Xiang-Dong Fu and Gene Yeo of the University of California at San Diego, Eric Lecuyer of the University of Montreal, and Brenton Graveley of UConn Health.

The lead authors of the study, which appears today in Nature, are Peter Freese, a recent MIT PhD recipient in Computational and Systems Biology; Eric Van Nostrand, Gabriel Pratt, and Rui Xiao of UCSD; Xiaofeng Wang of the University of Montreal; and Xintao Wei of UConn Health.

RNA regulation

Much of the ENCODE project has thus far relied on detecting regulatory sequences of DNA using a technique called ChIP-seq. This technique allows researchers to identify DNA sites that are bound to DNA-binding proteins such as transcription factors, helping to determine the functions of those DNA sequences.

However, Burge points out, this technique won’t detect genomic elements that must be copied into RNA before getting involved in gene regulation. Instead, the RNA team relied on a technique known as eCLIP, which uses ultraviolet light to cross-link RNA molecules with RNA-binding proteins (RBPs) inside cells. Researchers then isolate specific RBPs using antibodies and sequence the RNAs they were bound to.

RBPs have many different functions — some are splicing factors, which help to cut out sections of protein-coding messenger RNA, while others terminate transcription, enhance protein translation, break down RNA after translation, or guide RNA to a specific location in the cell. Determining the RNA sequences that are bound to RBPs can help to reveal information about the function of those RNA molecules.

“RBP binding sites are candidate functional elements in the transcriptome,” Burge says. “However, not all sites of binding have a function, so then you need to complement that with other types of assays to assess function.”

The researchers performed eCLIP on about 150 RBPs and integrated those results with data from another set of experiments in which they knocked down the expression of about 260 RBPs, one at a time, in human cells. They then measured the effects of this knockdown on the RNA molecules that interact with the protein.

Using a technique developed by Burge’s lab, the researchers were also able to narrow down more precisely where the RBPs bind to RNA. This technique, known as RNA Bind-N-Seq, reveals very short sequences, sometimes containing structural motifs such as bulges or hairpins, that RBPs bind to.

Overall, the researchers were able to study about 350 of the 1,500 known human RBPs, using one or more of these techniques per protein. RNA splicing factors often have different activity depending on where they bind in a transcript, for example activating splicing when they bind at one end of an intron and repressing it when they bind the other end. Combining the data from these techniques allowed the researchers to produce an “atlas” of maps describing how each RBP’s activity depends on its binding location.

“Why they activate in one location and repress when they bind to another location is a longstanding puzzle,” Burge says. “But having this set of maps may help researchers to figure out what protein features are associated with each pattern of activity.”

Additionally, Lecuyer’s group at the University of Montreal used green fluorescent protein to tag more than 300 RBPs and pinpoint their locations within cells, such as the nucleus, the cytoplasm, or the mitochondria. This location information can also help scientists to learn more about the functions of each RBP and the RNA it binds to.

“The strength of this manuscript is in the generation of a comprehensive and multilayered dataset that can be used by the biomedical community to develop therapies targeted to specific sites on the genome using genome-editing strategies, or on the transcriptome using antisense oligonucleotides or agents that mediate RNA interference,” says Gil Ast, a professor of human molecular genetics and biochemistry at Tel Aviv University, who was not involved in the research.

Linking RNA and disease

Many research labs around the world are now using these data in an effort to uncover links between some of the RNA sequences identified and human diseases. For many diseases, researchers have identified genetic variants called single nucleotide polymorphisms (SNPs) that are more common in people with a particular disease.

“If those occur in a protein-coding region, you can predict the effects on protein structure and function, which is done all the time. But if they occur in a noncoding region, it’s harder to figure out what they may be doing,” Burge says. “If they hit a noncoding region that we identified as binding to an RBP, and disrupt the RBP’s motif, then we could predict that the SNP may alter the splicing or stability of the gene.”
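The kind of check Burge describes, asking whether a single-base change removes an RBP's motif, can be sketched in a few lines. This is a simplified illustration that treats a motif as an exact string match; the function names, the example sequence, and the choice of GCAUG as the motif are assumptions for demonstration, and real binding preferences are more graded than exact matching implies.

```python
def motif_positions(rna, motif):
    """Return every start index at which the motif occurs exactly in the RNA."""
    width = len(motif)
    return [i for i in range(len(rna) - width + 1) if rna[i:i + width] == motif]

def snp_disrupts_motif(rna, position, alt_base, motif):
    """True if substituting alt_base at position removes an existing motif match."""
    variant = rna[:position] + alt_base + rna[position + 1:]
    before = set(motif_positions(rna, motif))
    after = set(motif_positions(variant, motif))
    return bool(before - after)

# Hypothetical sequence containing one GCAUG site starting at index 3.
rna = "AAUGCAUGCC"
print(snp_disrupts_motif(rna, 5, "G", "GCAUG"))  # change inside the motif
print(snp_disrupts_motif(rna, 0, "C", "GCAUG"))  # change outside the motif
```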

Burge and his colleagues now plan to use their RNA-based techniques to generate data on additional RNA-binding proteins.

“This work provides a resource that the human genetics community can use to help identify genetic variants that function at the RNA level,” he says.

The research was funded by the National Human Genome Research Institute ENCODE Project, as well as a grant from the Fonds de Recherche de Québec-Santé.

These muscle cells are guideposts to help regenerative flatworms grow back their eyes
Eva Frederick | Whitehead Institute
June 25, 2020

If anything happens to the eyes of the tiny, freshwater-dwelling planarian Schmidtea mediterranea, the worm can grow them back within just a few days. How it does this is a scientific conundrum, one that Peter Reddien’s lab at Whitehead Institute has been studying for years.

The lab’s latest project offers some insight: in a paper published in Science June 25, researchers in Reddien’s lab have identified a new type of cell that likely serves as a guidepost to help route axons from the eyes to the brain as the worms complete the difficult task of regrowing their neural circuitry.

Schmidtea mediterranea’s eyes are composed of light-capturing photoreceptor neurons connected to the brain with long, spindly processes called axons. They use their eyes to respond to light to help navigate their environment.

The worms, which are popular models for research into regeneration, can regrow pretty much any part of their body; eyes are an interesting part to study because regenerating the visual system requires the worms to rewire their neurons to connect them to the brain.

When neural systems develop in embryos, the first nerve fibers, called pioneer axons, snake their way through tissue to form the circuitry needed to perceive and interpret external stimuli. The axons are helped along their way by specialized cells called guidepost cells. These special cells are positioned at choice points — places where the axon’s path could fork in different directions.

In many organisms, these guidepost cells aren’t a priority anymore once development is finished, and typically are not renewed through adulthood. That’s one reason why, when humans experience brain or nerve damage, the injury is usually permanent.

“This is a fundamental mystery of regeneration that we hadn’t even been thinking about,” says Reddien, the senior author of the paper who is also a professor of biology at Massachusetts Institute of Technology and an investigator with the Howard Hughes Medical Institute. “How can an adult animal regenerate a functional nervous system when the original development of the nervous system typically involves a number of cues that are thought to be transient?”

Then, in 2018, Reddien Lab scientist Lucila Scimone found something surprising in adult planarians: groups of mysterious cells that looked like they might play a role in guiding growing axons. She’d noticed this group of cells because they co-expressed two genes not often seen together and some were conspicuously close to the eyes.

“I was captivated by these cells,” she says. They appeared in very small numbers (a normal worm might have around 5; a large one might have up to 10) in every planarian she examined. They were divided into two distinct groups: some around the flatworms’ eyes, and others spaced out along the path to the brain center. When she traced the path of existing axons leading from the planarians’ eyes to their brain, they coincided with the positions of these cells without exception.

When the researchers characterized the cells, they found that they did not express any of the genes that are hallmarks of photoreceptor neurons; instead, they had markers often found in muscle tissue. “That was very striking, because muscle cells — that’s not what they do in most animals,” Scimone says.

In other organisms, guidepost cells are often neurons or glia. It would be unusual for muscle cells to serve as guideposts; but past work in the Reddien Lab had shown that planarian muscle cells played other special roles, such as secreting the extracellular matrix. The researchers now wondered whether they could add the role of guidepost to the long list of planarian muscle cell functions.

To test their hypothesis, the researchers designed a series of experiments. “We developed an eye transplantation method where you can take an eye from an animal and transplant it into another animal,” says Reddien Lab postdoc Kutay Deniz Atabay. “When you do this, the axonal projections from that eye will basically, if positioned appropriately, correctly wire themselves into the brain, producing a functional state.”

The researchers also created genetically engineered planarians that had the muscle cells, but no eyes, and then transplanted eyes onto their eyeless heads. Sure enough, the neurons grew as normal, snaking towards the cells and then adjusting their trajectories after encountering them.

Without the cells, it was a different story. When the researchers transplanted eyes to distant parts of planarians’ bodies without a population of these muscle cells, the photoreceptor neurons did not connect to the brain center. Likewise, when they transplanted eyes into planarians that had been modified to not have these muscle cells, their photoreceptor neurons still grew — but they did not wire properly to reach the brain.

These findings combined suggested that the cells were fully independent of the visual system — they did not form because of eyes or photoreceptor neurons, but likely established themselves before the neurons grew — which provided more evidence for the guidepost role.

The guidepost-like activity of these cells then raised the question: how do the cells themselves know where to be? “We found that there’s a pattern of signaling molecules in muscle that is setting where these cells should be,” Reddien says. “If we perturb the global positional information of the system, these cells get placed in the wrong positions, and then axons go to the wrong positions, so we think there’s a positional information framework that places the cells during regeneration, and that allows them to work as guideposts in the correct locations.”

At this point, the researchers don’t know exactly how the cells are able to communicate with growing axons to serve as guideposts. They could be releasing some sort of signaling molecule that attracts the axons, or they could be communicating by using trans-membrane proteins.

“That will be an exciting direction for the future,” Reddien says. “We have now identified the transcriptome for the cells, which means we know all the genes that these cells express. That provides us with an intriguing list of genes that can be probed functionally, to try to see which ones are mediating the functions of these cells.”

This study is a step forward in a body of work that aims to expand the capabilities of regenerative medicine. “Imagine a scenario where someone experiences a spinal cord injury or an eye injury or stroke that leads to the loss of a neural circuit,” says Atabay. “The reason we can’t fully cure these cases today is that we lack fundamental information regarding how these systems can regenerate. Looking at regenerative organisms provides a lot of insights. From this case, we see that regenerating the lost system may not be enough; you may also need to regenerate systems that are properly patterning that system.”

Scimone, M. L. et al. “Muscle and neuronal guidepost-like cells facilitate planarian visual system regeneration.” Science, June 25, 2020.

Jonathan Weissman

Education

  • PhD, 1993, MIT
  • AB, 1988, Physics, Harvard

Research Summary

We study how cells ensure that proteins fold into their correct shape, as well as the role of protein misfolding in disease and normal physiology. We also build innovative tools for broadly exploring organizational principles of biological systems. These include ribosome profiling, which globally monitors protein translation; CRISPRi/a, for controlling the expression of human genes and rewiring the epigenome; and lineage tracing tools, to record the history of cells.

Awards

  • Ira Herskowitz Award, Genetics Society of America, 2020
  • European Molecular Biology Organization, Member, 2017
  • National Academy of Sciences Award for Scientific Discovery, 2015
  • American Academy of Microbiology, Fellow, 2010
  • National Academy of Sciences, Member, 2009
  • Raymond and Beverly Sackler International Prize in Biophysics, Tel Aviv University, 2008
  • Protein Society Irving Sigal Young Investigator’s Award, 2004
  • Howard Hughes Medical Institute, Assistant Investigator, 2000
  • Searle Scholars Program Fellowship, 1997
  • David and Lucile Packard Fellowship, 1996

Genetic study takes research on sex differences to new heights

Differences in male and female gene expression, including those contributing to height differences, found throughout the body in humans and other mammals.

Greta Friar | Whitehead Institute
July 19, 2019

Throughout the animal kingdom, males and females frequently exhibit sexual dimorphism: differences in characteristic traits that often make it easy to tell them apart. In mammals, one of the most common sex-biased traits is size, with males typically being larger than females. This is true in humans: Men are, on average, taller than women. However, biological differences between males and females aren’t limited to physical traits like height. They’re also common in disease. For example, women are much more likely to develop autoimmune diseases, while men are more likely to develop cardiovascular diseases.

In spite of the widespread nature of these sex biases, and their significant implications for medical research and treatment, little is known about the underlying biology that causes sex differences in characteristic traits or disease. In order to address this gap in understanding, Whitehead Institute Director David Page has transformed the focus of his lab in recent years from studying the X and Y sex chromosomes to working to understand the broader biology of sex differences throughout the body. In a paper published in Science, Page, a professor of biology at MIT and a Howard Hughes Medical Institute investigator; Sahin Naqvi, first author and former MIT graduate student (now a postdoc at Stanford University); and colleagues present the results of a wide-ranging investigation into sex biases in gene expression, revealing differences in the levels at which particular genes are expressed in males versus females.

The researchers’ findings span 12 tissue types in five species of mammals, including humans, and led to the discovery that a combination of sex-biased genes accounts for approximately 12 percent of the average height difference between men and women. This finding demonstrates a functional role for sex-biased gene expression in contributing to sex differences. The researchers also found that the majority of sex biases in gene expression are not shared between mammalian species, suggesting that — in some cases — sex-biased gene expression that can contribute to disease may differ between humans and the animals used as models in medical research.

Having the same gene expressed at different levels in each sex is one way to perpetuate sex differences in traits in spite of the genetic similarity of males and females within a species — since with the exception of the 46th chromosome (the Y in males or the second X in females), the sexes share the same pool of genes. For example, if a tall parent passes on a gene associated with an increase in height to both a son and a daughter, but the gene has male-biased expression, then that gene will be more highly expressed in the son, and so may contribute more height to the son than the daughter.

The researchers searched for sex-biased genes in tissues across the body in humans, macaques, mice, rats, and dogs, and they found hundreds of examples in every tissue. They used height for their first demonstration of the contribution of sex-biased gene expression to sex differences in traits because height is an easy-to-measure and heavily studied trait in quantitative genetics.

“Discovering contributions of sex-biased gene expression to height is exciting because identifying the determinants of height is a classic, century-old problem, and yet by looking at sex differences in this new way we were able to provide new insights,” Page says. “My hope is that we and other researchers can repeat this model to similarly gain new insights into diseases that show sex bias.”

Because height is so well studied, the researchers had access to public data on the identity of hundreds of genes that affect height. Naqvi decided to see how many of those height genes appeared in the researchers’ new dataset of sex-biased genes, and whether the genes’ sex biases corresponded to the expected effects on height. He found that sex-biased gene expression contributed approximately 1.6 centimeters to the average height difference between men and women, or 12 percent of the overall observed difference.

The scope of the researchers’ findings goes beyond height, however. Their database contains thousands of sex-biased genes. Slightly less than a quarter of the sex-biased genes that they catalogued appear to have evolved that sex bias in an early mammalian ancestor, and to have maintained that sex bias today in at least four of the five species studied. The majority of the genes appear to have evolved their sex biases more recently, and are specific to either one species or a certain lineage, such as rodents or primates.

Whether or not a sex-biased gene is shared across species is a particularly important consideration for medical and pharmaceutical research using animal models. For example, previous research identified certain genetic variants that increase the risk of Type 2 diabetes specifically in women; however, the same variants increase the risk of Type 2 diabetes indiscriminately in male and female mice. Therefore, mice would not be a good model to study the genetic basis of this sex difference in humans. Even when the animal appears to have the same sex difference in disease as humans, the specific sex-biased genes involved might be different. Based on their finding that most sex bias is not shared between species, Page and colleagues urge researchers to use caution when picking an animal model to study sex differences at the level of gene expression.

“We’re not saying to avoid animal models in sex-differences research, only not to take for granted that the sex-biased gene expression behind a trait or disease observed in an animal will be the same as that in humans. Now that researchers have species- and tissue-specific data available to them, we hope they will use it to inform their interpretation of results from animal models,” Naqvi says.

The researchers have also begun to explore what exactly causes sex-biased expression of genes not found on the sex chromosomes. Naqvi discovered a mechanism by which sex-biased expression may be enabled: through sex-biased transcription factors, proteins that help to regulate gene expression. Transcription factors bind to specific DNA sequences called motifs, and he found that certain sex-biased genes had the motif for a sex-biased transcription factor in their promoter regions, the sections of DNA that turn on gene expression. This means that, for example, a male-biased transcription factor was selectively binding to the promoter regions of male-biased genes, and so increasing their expression, and likewise for female-biased transcription factors and female-biased genes. The question of what regulates the transcription factors remains for further study, but all sex differences are ultimately controlled by either the sex chromosomes or sex hormones.

The researchers see the collective findings of this paper as a foundation for future sex-differences research.

“We’re beginning to build the infrastructure for a systematic understanding of sex biases throughout the body,” Page says. “We hope these datasets are used for further research, and we hope this work gives people a greater appreciation of the need for, and value of, research into the molecular differences in male and female biology.”

This work was supported by Biogen, Whitehead Institute, National Institutes of Health, Howard Hughes Medical Institute, and generous gifts from Brit and Alexander d’Arbeloff and Arthur W. and Carol Tobin Brill.