A day in the life — graduate student and genomics researcher Neha Bokil

Neha Bokil is studying mechanisms that regulate expression of genes located on the X and Y chromosomes in order to better understand sex-biased conditions that predominantly affect one sex.

Shafaq Zia | Whitehead Institute
June 25, 2024

Graduate student Neha Bokil moves around the Page lab with urgency. Today, she’s running an experiment using white blood cells from patients with varying numbers of X and Y chromosomes.

The lab of Whitehead Institute Member David Page investigates the role of the X and Y chromosomes beyond determining sex. While most females have two X chromosomes (XX) and most males have one X and one Y chromosome (XY), there are individuals whose sex chromosome constitution varies from this, having instead, for example, XXY, XXX, or XXXXY. With the goal of understanding why certain conditions are more prevalent in one sex than the other, Bokil is using this experiment to explore if and how cellular processes, such as gene regulation, vary among individuals with these atypical combinations of sex chromosomes.

Bokil finally locates what she’s been searching for, partially hidden in the cell culture hood: a pipette for dispensing 99 microliters of the cell suspension she meticulously prepared this afternoon. A cell suspension is a type of culture in which cells float in nutrient-rich liquid, free to function and grow.

Bokil carefully extracts this volume and transfers it to a flat plate — also called a 96-well plate — with tiny holes for growing small cell samples. Now, it’s a waiting game until she can find out how these cells are growing, and whether their proliferation rate depends on the number of sex chromosomes in a cell.

Bokil dives into the intricacies of human genetics every day, hoping her work will eventually help reshape how sex differences are understood in medicine and improve treatment outcomes. The dynamic research Bokil is conducting at Whitehead Institute is her calling, but she has other passions as well. Here’s what a typical day in her life as a graduate student looks like, both in and outside the lab.

An inherited love of numbers

When she isn’t rushing out the door, Bokil loves brewing and savoring the perfect cup of morning chai, a traditional South Asian loose-leaf tea with milk. Every family has their own recipe, and Bokil makes hers with ginger, a touch of cardamom, and some sugar.

“Chai is comforting at any time, but I’ve noticed my mood vastly improves when I’m able to have a cup in the morning,” she says.

On her walk to the Whitehead Institute, she often listens to Bollywood songs. But these predilections — chai and Indian cinema — are more than just rituals for her. They symbolize tradition and cherished connections with family and friends.

In fact, family bonds have greatly influenced Bokil’s career path. As a child, she loved mathematics. It wasn’t a trait passed on genetically, but one that flourished through moments of connection with her grandmother, a math teacher in India. During summer visits to Bokil’s family in the U.S., she’d enthusiastically impart her passion for numbers onto her granddaughter. By the time Bokil went to high school and later college, she had become fluent in the language of logic and patterns.

“My time with her made me realize just how beautiful and fun math is, and I could see its practical applications in everyday life, all around me,” Bokil says.

For her PhD, she sought to combine her undergraduate training in mathematics and molecular biology to tackle a real-world problem. With genetics at the crossroads of these disciplines, and the Page Lab leading the way in transforming scientific understanding of X and Y chromosomes beyond reproduction, Bokil knew she had to get involved.

This morning, as she sits at her desk, poring over a research paper before an afternoon lab meeting, she ponders how insights from the study could enhance her manuscript writing process. Bokil’s graduate project uses a collection of cell lines derived from patients with atypical numbers of X and Y chromosomes to investigate mechanisms that regulate — or dial up and down the expression of — genes located on one of the X chromosomes in females called the “inactive” X chromosome.

Although the X and Y sex chromosomes in mammals began as a pair with similar structures, over time, the Y chromosome underwent degeneration, leading to the loss of numerous active genes. In contrast, the X chromosome preserved its original genes and even gained new ones. To maintain balance in gene expression across the two sexes — XX and XY — an evolutionary mechanism called X chromosome inactivation emerged.

This process is known to randomly silence one X chromosome in each XX pair, ensuring that both sexes have an equal dosage of genes from the X chromosome. However, in recent years, the Page lab has discovered that there are powerful distinctions within females’ pair of X chromosomes, and the so-called “inactive” X chromosome is far from passive. Instead, it plays a crucial role in regulating gene expression on the active X chromosome.

“That’s not all,” adds Bokil. “There are still genes expressed from that ‘inactive’ X chromosome. Cracking how these genes are regulated could answer longstanding questions about sex differences in health.”

Bokil is unraveling this genetic mystery with the help of chemical tags called histone marks. These tags cling to a family of proteins that function like spools, allowing long strands of DNA to coil around them — like thread around a bobbin — so genetic information remains neatly packaged within the cell’s nucleus.

This complex of DNA, RNA, and proteins is called chromatin, the genetic material that eventually forms chromosomes. Chromatin also lays the groundwork for gene regulation by keeping some genes tightly wound around the histones, rendering them inaccessible, and unwinding others for active use.

Certain histone marks are associated with open chromatin structure and active gene expression, while others indicate closed chromatin structure and gene silencing. By examining the specific histone marks on proteins near genes on the “inactive” X chromosome, Bokil aims to decipher if and how these genes are turned on and off.

She’s particularly interested in a group of genes that have counterparts on the Y chromosome. These genes, known as homologous X-Y gene pairs, are typically dosage-sensitive and play a crucial role in regulating essential processes throughout the body like the transcription of DNA into RNA and the translation of RNA into proteins.

Celebrating small triumphs

Graduate school can feel like a marathon — progress is slow but every small step counts towards a breakthrough. For Bokil, stumbling upon a captivating scientific puzzle has been a stroke of luck she deeply appreciates. In fact, the mystery of how genes are controlled on the “inactive” X chromosome has not only shaped her scientific pursuits but also her artwork — on one quiet evening at home, she found herself inspired to capture an experiment, called CUT&RUN, in her painting.

During the early days of her PhD, Bokil spent hundreds of hours using this technique to identify the precise locations of histone protein and DNA interactions. Right as she was prepared to expand these experiments across multiple cell lines, the COVID-19 pandemic hit, throwing her plans, and her progress, off course.

During these challenging times, Bokil found solace in her cultural roots and the warmth of community. She began teaching virtual BollyX classes (a dance similar to Zumba, but set to Bollywood tunes) every Tuesday evening as a means to stay connected, a commitment she’s upheld throughout her time in graduate school.

Beyond nurturing a sense of togetherness through dance, Bokil is committed to mentoring in science and celebrating improbable victories along a tedious research journey.

“I had a former lab mate who used to do what she called a data dance every time she had a graph she felt happy with,” Bokil recalls. “I think that should catch on a little bit more because it’s always a really good feeling to see how these experiments that have taken up so much of your time and effort are leading somewhere.”

Sara Prescott named Pew Scholar in the Biomedical Sciences

Assistant Professor Sara Prescott and her lab plan to test whether and how neurons have a role in airway remodeling, which goes awry in many diseases.

David Orenstein | The Picower Institute for Learning and Memory
June 17, 2024

Whitehead Institute Member Siniša Hrvatin named a 2024 McKnight Scholar

The McKnight Endowment Fund for Neuroscience has selected Whitehead Institute Member Siniša Hrvatin as one of ten early career scientists to receive a 2024 McKnight Scholar Award, supporting his research on mechanisms underlying certain animals’ capacity to enter states of torpor and hibernation.

Merrill Meadow | Whitehead Institute
June 20, 2024

Rudolf Jaenisch receives the ISTT Prize for contributions to transgenic technologies

The International Society for Transgenic Technologies recognized Whitehead Institute Founding Member Rudolf Jaenisch for his exceptional contribution to the field of animal transgenesis over the past five decades.

Merrill Meadow | Whitehead Institute
June 11, 2024

With programmable pixels, novel sensor improves imaging of neural activity

A new camera chip design allows each pixel’s timing to be optimized to maximize the signal-to-noise ratio when tracking a real-time visual indicator of neural voltage, as described in a new paper from a team in the Wilson Lab published in Nature Communications.

David Orenstein | The Picower Institute for Learning and Memory
June 7, 2024

New technique reveals how gene transcription is coordinated in cells

By capturing short-lived RNA molecules, scientists can map relationships between genes and the regulatory elements that control them.

Anne Trafton | MIT News
June 5, 2024

The human genome contains about 23,000 genes, but only a fraction of those genes are turned on inside a cell at any given time. The complex network of regulatory elements that controls gene expression includes regions of the genome called enhancers, which are often located far from the genes that they regulate.

This distance can make it difficult to map the complex interactions between genes and enhancers. To overcome that, MIT researchers have invented a new technique that allows them to observe the timing of gene and enhancer activation in a cell. When a gene is turned on around the same time as a particular enhancer, it strongly suggests the enhancer is controlling that gene.

Learning more about which enhancers control which genes, in different types of cells, could help researchers identify potential drug targets for genetic disorders. Genomic studies have identified mutations in many non-protein-coding regions that are linked to a variety of diseases. Could these be unknown enhancers?

“When people start using genetic technology to identify regions of chromosomes that have disease information, most of those sites don’t correspond to genes. We suspect they correspond to these enhancers, which can be quite distant from a promoter, so it’s very important to be able to identify these enhancers,” says Phillip Sharp, an MIT Institute Professor Emeritus and member of MIT’s Koch Institute for Integrative Cancer Research.

Sharp is the senior author of the new study, which appears today in Nature. MIT Research Assistant D.B. Jay Mahat is the lead author of the paper.

Hunting for eRNA

Less than 2 percent of the human genome consists of protein-coding genes. The rest of the genome includes many elements that control when and how those genes are expressed. Enhancers, which are thought to turn genes on by transiently forming a complex that brings them into physical contact with gene promoter regions, were discovered about 45 years ago.

More recently, in 2010, researchers discovered that these enhancers are transcribed into RNA molecules, known as enhancer RNA or eRNA. Scientists suspect that this transcription occurs when the enhancers are actively interacting with their target genes. This raised the possibility that measuring eRNA transcription levels could help researchers determine when an enhancer is active, as well as which genes it’s targeting.

“That information is extraordinarily important in understanding how development occurs, and in understanding how cancers change their regulatory programs and activate processes that lead to de-differentiation and metastatic growth,” Mahat says.

However, this kind of mapping has proven difficult to perform because eRNA is produced in very small quantities and does not last long in the cell. Additionally, eRNA lacks a modification known as a poly-A tail, which is the “hook” that most techniques use to pull RNA out of a cell.

One way to capture eRNA is to add a nucleotide to cells that halts transcription when incorporated into RNA. These nucleotides also contain a tag called biotin that can be used to fish the RNA out of a cell. However, this current technique only works on large pools of cells and doesn’t give information about individual cells.

While brainstorming ideas for new ways to capture eRNA, Mahat and Sharp considered using click chemistry, a technique that can be used to join two molecules together if they are each tagged with “click handles” that can react together.

The researchers designed nucleotides labeled with one click handle, and once these nucleotides are incorporated into growing eRNA strands, the strands can be fished out with a tag containing the complementary handle. This allowed the researchers to capture eRNA and then purify, amplify, and sequence it. Some RNA is lost at each step, but Mahat estimates that they can successfully pull out about 10 percent of the eRNA from a given cell.

Using this technique, the researchers obtained a snapshot of the enhancers and genes that are being actively transcribed at a given time in a cell.

“You want to be able to determine, in every cell, the activation of transcription from regulatory elements and from their corresponding gene. And this has to be done in a single cell because that’s where you can detect synchrony or asynchrony between regulatory elements and genes,” Mahat says.

Timing of gene expression

Demonstrating their technique in mouse embryonic stem cells, the researchers found that they could calculate approximately when a particular region starts to be transcribed, based on the length of the RNA strand and the speed of the polymerase (the enzyme responsible for transcription) — that is, how far the polymerase transcribes per second. This allowed them to determine which genes and enhancers were being transcribed around the same time.
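The timing estimate described above comes down to dividing the length of a nascent RNA strand by the polymerase's speed. The short sketch below illustrates that arithmetic only; it is not the researchers' code, and the function names, the 40-nucleotides-per-second rate, the strand lengths, and the 30-second window are all illustrative assumptions rather than values from the study.

```python
def seconds_since_initiation(nascent_length_nt: float, rate_nt_per_s: float) -> float:
    """Estimate how long ago transcription began, given strand length and polymerase speed."""
    if rate_nt_per_s <= 0:
        raise ValueError("polymerase rate must be positive")
    if nascent_length_nt < 0:
        raise ValueError("strand length cannot be negative")
    return nascent_length_nt / rate_nt_per_s


def started_together(t_a_s: float, t_b_s: float, window_s: float) -> bool:
    """True if two regions began transcription within window_s seconds of each other."""
    return abs(t_a_s - t_b_s) <= window_s


# Assumed values, for illustration only.
ASSUMED_RATE_NT_PER_S = 40.0
gene_elapsed = seconds_since_initiation(6000, ASSUMED_RATE_NT_PER_S)      # 150.0 seconds
enhancer_elapsed = seconds_since_initiation(5200, ASSUMED_RATE_NT_PER_S)  # 130.0 seconds
together = started_together(gene_elapsed, enhancer_elapsed, window_s=30.0)
```

With these made-up numbers, the gene and the enhancer started transcribing 20 seconds apart, which falls inside the assumed window.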

The researchers used this approach to determine the timing of the expression of cell cycle genes in more detail than has previously been possible. They were also able to confirm several sets of known gene-enhancer pairs and generated a list of about 50,000 possible enhancer-gene pairs that they can now try to verify.
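One way to picture how a list of candidate enhancer-gene pairs could be assembled is to count, across single cells, how often an enhancer and a gene begin transcription at about the same time. The sketch below is a simplified illustration under that assumption, not the method used in the paper; the cell records, region names, start times, and the 30-second window are hypothetical.

```python
from collections import Counter
from itertools import product


def count_coactive_pairs(cells, window_s):
    """Count, per (enhancer, gene) pair, the cells where both started within window_s seconds."""
    counts = Counter()
    for cell in cells:
        for (gene, t_gene), (enh, t_enh) in product(
            cell["genes"].items(), cell["enhancers"].items()
        ):
            if abs(t_gene - t_enh) <= window_s:
                counts[(enh, gene)] += 1
    return counts


# Hypothetical per-cell estimates of seconds since transcription began.
cells = [
    {"genes": {"geneA": 150.0, "geneB": 20.0}, "enhancers": {"enh1": 130.0}},
    {"genes": {"geneA": 90.0}, "enhancers": {"enh1": 100.0, "enh2": 400.0}},
    {"genes": {"geneB": 60.0}, "enhancers": {"enh1": 300.0}},
]

counts = count_coactive_pairs(cells, window_s=30.0)
candidates = counts.most_common()  # pairs seen together most often come first
```

Pairs that recur across many cells would be the ones worth verifying experimentally; a count alone does not show that an enhancer controls a gene.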

Learning which enhancers control which genes would prove valuable in developing new treatments for diseases with a genetic basis. Last year, the U.S. Food and Drug Administration approved the first gene therapy treatment for sickle cell anemia, which works by interfering with an enhancer that results in activation of a fetal globin gene, reducing the production of sickled blood cells.

The MIT team is now applying this approach to other types of cells, with a focus on autoimmune diseases. Working with researchers at Boston Children’s Hospital, they are exploring immune cell mutations that have been linked to lupus, many of which are found in non-coding regions of the genome.

“It’s not clear which genes are affected by these mutations, so we are beginning to tease apart the genes these putative enhancers might be regulating, and in what cell types these enhancers are active,” Mahat says. “This is a tool for creating gene-to-enhancer maps, which are fundamental in understanding the biology, and also a foundation for understanding disease.”

The findings of this study also offer evidence for a theory that Sharp has recently developed, along with MIT professors Richard Young and Arup Chakraborty, that gene transcription is controlled by membraneless droplets known as condensates. These condensates are made of large clusters of enzymes and RNA, which Sharp suggests may include eRNA produced at enhancer sites.

“We picture that the communication between an enhancer and a promoter is a condensate-type, transient structure, and RNA is part of that. This is an important piece of work in building the understanding of how RNAs from enhancers could be active,” he says.

The research was funded by the National Cancer Institute, the National Institutes of Health, and the Emerald Foundation Postdoctoral Transition Award.

“Rosetta Stone” of cell signaling could expedite precision cancer medicine

An atlas of human protein kinases enables scientists to map cell signaling pathways with unprecedented speed and detail. Michael Yaffe, the David H. Koch Professor of Science at MIT, the director of the MIT Center for Precision Cancer Medicine, a member of MIT’s Koch Institute for Integrative Cancer Research, and a senior author of the new study published in Nature, is hoping to apply the comprehensive atlas of enzymes that regulate a wide variety of cellular activities to individual patients’ tumors to learn more about how the signaling states differ in cancer, which could reveal new

Megan Scudellari | Koch Institute
June 3, 2024

A newly complete database of human protein kinases and their preferred binding sites provides a powerful new platform to investigate cell signaling pathways.

Culminating 25 years of research, MIT, Harvard University, and Yale University scientists and collaborators have unveiled a comprehensive atlas of human tyrosine kinases — enzymes that regulate a wide variety of cellular activities — and their binding sites.

The addition of tyrosine kinases to a previously published dataset from the same group now completes a free, publicly available atlas of all human kinases and their specific binding sites on proteins, which together orchestrate fundamental cell processes such as growth, cell division, and metabolism.

Now, researchers can use data from mass spectrometry, a common laboratory technique, to identify the kinases involved in normal and dysregulated cell signaling in human tissue, such as during inflammation or cancer progression.

“I am most excited about being able to apply this to individual patients’ tumors and learn about the signaling states of cancer and heterogeneity of that signaling,” says Michael Yaffe, who is the David H. Koch Professor of Science at MIT, the director of the MIT Center for Precision Cancer Medicine, a member of MIT’s Koch Institute for Integrative Cancer Research, and a senior author of the new study. “This could reveal new druggable targets or novel combination therapies.”

The study, published in Nature, is the product of a long-standing collaboration with senior authors Lewis Cantley at Harvard Medical School and Dana-Farber Cancer Institute, Benjamin Turk at Yale School of Medicine, and Jared Johnson at Weill Cornell Medical College.

The paper’s lead authors are Tomer Yaron-Barir at Columbia University Irving Medical Center, and MIT’s Brian Joughin, with contributions from Konstantin Krismer, Mina Takegami, and Pau Creixell.

Kinase kingdom

Human cells are governed by a network of diverse protein kinases that alter the properties of other proteins by adding or removing chemical compounds called phosphate groups. Phosphate groups are small but powerful: When attached to proteins, they can turn proteins on or off, or even dramatically change their function. Identifying which of the almost 400 human kinases phosphorylate a specific protein at a particular site on the protein was traditionally a lengthy, laborious process.

Beginning in the mid 1990s, the Cantley laboratory developed a method using a library of small peptides to identify the optimal amino acid sequence — called a motif, similar to a scannable barcode — that a kinase targets on its substrate proteins for the addition of a phosphate group. Over the ensuing years, Yaffe, Turk, and Johnson, all of whom spent time as postdocs in the Cantley lab, made seminal advancements in the technique, increasing its throughput, accuracy, and utility.

Johnson led a massive experimental effort exposing batches of kinases to these peptide libraries and observed which kinases phosphorylated which subsets of peptides. In a corresponding Nature paper published in January 2023, the team mapped more than 300 serine/threonine kinases, the other main type of protein kinase, to their motifs. In the current paper, they complete the human “kinome” by successfully mapping 93 tyrosine kinases to their corresponding motifs.

Next, by creating and using advanced computational tools, Yaron-Barir, Krismer, Joughin, Takegami, and Yaffe tested whether the results were predictive of real proteins, and whether the results might reveal unknown signaling events in normal and cancer cells. By analyzing phosphoproteomic data from mass spectrometry to reveal phosphorylation patterns in cells, their atlas accurately predicted tyrosine kinase activity in previously studied cell signaling pathways.

For example, using recently published phosphoproteomic data of human lung cancer cells treated with two targeted drugs, the atlas identified that treatment with erlotinib, a known inhibitor of the protein EGFR, downregulated sites matching a motif for EGFR. Treatment with afatinib, a known HER2 inhibitor, downregulated sites matching the HER2 motif. Unexpectedly, afatinib treatment also upregulated the motif for the tyrosine kinase MET, a finding that helps explain patient data linking MET activity to afatinib drug resistance.

Actionable results

There are two key ways researchers can use the new atlas. First, for a protein of interest that is being phosphorylated, the atlas can be used to narrow down hundreds of kinases to a short list of candidates likely to be involved. “The predictions that come from using this will still need to be validated experimentally, but it’s a huge step forward in making clear predictions that can be tested,” says Yaffe.
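As a rough illustration of how a motif can narrow a field of kinases to a short list, the sketch below scores one phosphorylation site against a small table of per-position amino acid preferences and ranks the kinases by score. Everything in it is hypothetical: the kinase names, the preference values, and the peptide are invented for the example, and the scoring rule (summing log2 preferences) is a common simplification rather than the atlas's published procedure.

```python
import math

# Hypothetical motifs: offset from the phosphorylated tyrosine -> {amino acid: relative preference}.
MOTIFS = {
    "KINASE_A": {-1: {"E": 4.0, "D": 2.0}, 1: {"I": 3.0}, 3: {"P": 2.0}},
    "KINASE_B": {-3: {"R": 4.0}, 1: {"E": 2.0}},
}


def score_site(peptide, center, motif, default=1.0):
    """Sum log2 preferences over motif positions; residues not listed count as neutral."""
    total = 0.0
    for offset, prefs in motif.items():
        idx = center + offset
        if 0 <= idx < len(peptide):
            total += math.log2(prefs.get(peptide[idx], default))
    return total


def rank_kinases(peptide, center, motifs):
    """Return (kinase, score) pairs sorted from best to worst match."""
    scores = {name: score_site(peptide, center, m) for name, m in motifs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


peptide = "AAEYIAPAA"  # made-up sequence; the tyrosine (Y) is at index 3
ranked = rank_kinases(peptide, center=3, motifs=MOTIFS)
```

Here the invented site matches every preferred residue for KINASE_A and none for KINASE_B, so KINASE_A ranks first. As Yaffe notes, a ranking like this is a prediction that still needs experimental validation.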

Second, the atlas makes phosphoproteomic data more useful and actionable. In the past, researchers might gather phosphoproteomic data from a tissue sample, but it was difficult to know what that data was saying or how to best use it to guide next steps in research. Now, that data can be used to predict which kinases are upregulated or downregulated and therefore which cellular signaling pathways are active or not.
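To make the idea concrete, one simple way to read kinase activity out of phosphoproteomic data is to compare how sites matching a kinase's motif changed with how all other sites changed. The sketch below assumes each site has already been assigned to the kinases whose motifs it matches; the data, the kinase names, and the mean-difference statistic are illustrative assumptions and not the statistical method used in the study.

```python
from statistics import mean


def activity_shift(sites, kinase):
    """Mean log2 fold change of sites matching the kinase's motif, minus the mean of all other sites."""
    matching = [fc for kinases, fc in sites if kinase in kinases]
    others = [fc for kinases, fc in sites if kinase not in kinases]
    if not matching or not others:
        raise ValueError("need at least one matching and one non-matching site")
    return mean(matching) - mean(others)


# Hypothetical sites: (kinases whose motif the site matches, log2 fold change after treatment).
sites = [
    ({"KINASE_A"}, -2.0),
    ({"KINASE_A"}, -1.0),
    ({"KINASE_B"}, 0.5),
    ({"KINASE_B"}, 1.5),
    (set(), 0.0),
]

shift_a = activity_shift(sites, "KINASE_A")  # negative: motif sites went down
shift_b = activity_shift(sites, "KINASE_B")  # positive: motif sites went up
```

A negative shift suggests the kinase was less active after treatment and a positive shift suggests it was more active. A real analysis would also need a significance test and many more sites.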

“We now have a new tool to interpret those large datasets, a Rosetta Stone for phosphoproteomics,” says Yaffe. “It is going to be particularly helpful for turning this type of disease data into actionable items.”

In the context of cancer, phosphoproteomic data from a patient’s tumor biopsy could be used to help doctors quickly identify which kinases and cell signaling pathways are involved in cancer expansion or drug resistance, then use that knowledge to target those pathways with appropriate drug therapy or combination therapy.

Yaffe’s lab and their colleagues at the National Institutes of Health are now using the atlas to seek out new insights into difficult cancers, including appendiceal cancer and neuroendocrine tumors. While many cancers have been shown to have a strong genetic component, such as the genes BRCA1 and BRCA2 in breast cancer, other cancers are not associated with any known genetic cause. “We’re using this atlas to interrogate these tumors that don’t seem to have a clear genetic driver to see if we can identify kinases that are driving cancer progression,” he says.

Biological insights

In addition to completing the human kinase atlas, the team made two biological discoveries in their recent study. First, they identified three main classes of phosphorylation motifs, or barcodes, for tyrosine kinases. The first class is motifs that map to multiple kinases, suggesting that numerous signaling pathways converge to phosphorylate a protein boasting that motif. The second class is motifs with a one-to-one match between motif and kinase, in which only a specific kinase will activate a protein with that motif. This came as a partial surprise, as some in the field have thought tyrosine kinases to have minimal specificity.

The final class includes motifs for which there is no clear match to one of the 78 classical tyrosine kinases. This class includes motifs that match to 15 atypical tyrosine kinases known to also phosphorylate serine or threonine residues. “This means that there’s a subset of kinases that we didn’t recognize that are actually playing an important role,” says Yaffe. It also indicates there may be other mechanisms besides motifs alone that affect how a kinase interacts with a protein.

The team also discovered that tyrosine kinase motifs are tightly conserved between humans and the worm species C. elegans, despite the species being separated by more than 600 million years of evolution. In other words, a worm kinase and its human homologue are phosphorylating essentially the same motif. That sequence preservation suggests that tyrosine kinases are highly critical to signaling pathways in all multicellular organisms, and any small change would be harmful to an organism.

The research was funded by the Charles and Marjorie Holloway Foundation, the MIT Center for Precision Cancer Medicine, the Koch Institute Frontier Research Program via L. Scott Ritterbush, the Leukemia and Lymphoma Society, the National Institutes of Health, Cancer Research UK, the Brain Tumour Charity, and the Koch Institute Support (core) grant from the National Cancer Institute.

Whitehead Institute Director Ruth Lehmann elected as a Fellow of the Royal Society

Whitehead Institute Director and President Ruth Lehmann has been named a Foreign Member of the Royal Society. The election recognizes her “pioneering studies of the mechanisms underlying the embryonic development and reproduction of the fruit fly Drosophila.” It honors her work establishing the role of messenger RNA localization in specifying the antero-posterior body axis and germ line development and additionally notes her discoveries that revealed the role of lipid-based signaling pathways in the migration of germ cells to the developing gonads.

Lisa Girard | Whitehead Institute
May 22, 2024

Q&A: Pulin Li on recreating development in the lab

In the whirlwind of activity that occurs simultaneously in a developing embryo, it can be difficult for scientists to pinpoint critical moments of a particular trait. In this Q&A, Pulin Li discusses how her lab ventures beyond mere observation to actually engineer developmental events in a petri dish, and why this approach is vital for understanding health and disease more broadly.

Shafaq Zia | Whitehead Institute
May 1, 2024

New findings activate a better understanding of Rett syndrome’s causes

Rett syndrome is caused by mutations to the gene MECP2, which is highly expressed in the brain and appears to play important roles in maintaining healthy neurons. Researchers led by Rudolf Jaenisch have used cutting-edge techniques to create an epigenome map of MECP2, which may help guide future research on the disease.

Greta Friar | Whitehead Institute
April 25, 2024