Considering biological puzzles, one line of code at a time
A fascination with the applications for machine learning led to a summer of programming at MIT
Lillian Eden
It was a stroke of serendipity that ultimately led Fareeda Abu-Juam to the Bernard S. and Sophie G. Gould Summer Research Program in Biology (BSG-MSRP-Bio) at MIT this summer. After an MIT alumna visited her high school in Ghana, Abu-Juam became fascinated by the application of machine learning to biological problems and decided to study computer science and biology in college.
For ten weeks as part of the BSG-MSRP-Bio program, Abu-Juam worked in the lab of Associate Professor of Biology Joseph “Joey” Davis, developing computer code to parse out useful information from grayscale cryo-electron microscopy (cryo-EM) images.
“I feel like I really got to experience what it was like to be doing research and got a glimpse of what it would be like to be a graduate student here,” she says. “It has really changed my perspective on what the research environment is like. Everyone is really friendly, collaborative, and easy to talk to. I know I can go to anyone in my lab or the building and ask for help.”
Davis is widely known for studying how cells create and degrade massive molecular machines such as the ribosome. The lab also develops tools at the interface of machine learning, biochemistry, biology, and structural biology, including machine learning applications for cryo-EM data.
Researchers use hundreds or thousands of 2D cryo-EM images of particles like ribosomes, proteasomes, and other protein complexes to generate 3D structures. But first, those particles must be identified within the images. This is challenging because cryo-EM results are visually noisy, like static on a television. Current approaches to picking out useful particles have some limitations. For example, programs typically require that the particles within a given image are all of a similar size.
A computer program that can identify objects like a desk, a chair, a computer, and a person in a picture of an office could, theoretically, be adapted to annotate structures for researchers trying to solve fundamental biological questions. Abu-Juam took the source code for the former and was reworking it for the latter type of data; the project is still a work in progress. It was a novel project both for the Davis lab and for Abu-Juam, who embraced every exciting aspect and challenge of the work.
Although Abu-Juam is a budding computational biologist who loves computing and programming all day, some of the skills she gained had nothing to do with programming. Her mentor, second-year graduate student Maria Carreira, applauded Abu-Juam the first time she decided to tackle a problem by taking a break.
“If I was really stuck and had no idea what to do, I would take the time to not think about any of that—just listen to music,” Abu-Juam says. “I eventually found a balance for when I need to walk around for a bit or when I need to just push through.”
Going outside or walking is “one of the most useful debugging tools,” according to Carreira.
The programming work Abu-Juam undertook with Carreira’s guidance required fluency in machine learning and biology. That meant that, as a mentor, Carreira had to be able to explain the biological relevance of the project as well as concepts in machine learning and generative modeling.
“People say this a lot, but you learn a lot from your mentee,” Carreira says. “Every day as graduate students, we’re learning a lot of new things. Feeling that someone’s learning from what I know is very rewarding.”
Davis says the MSRP-Bio program benefits both the students and those who guide them, often postdocs or graduate students like Carreira.
“I think it’s really valuable for their development as mentors—thinking about how to structure a ten-week research project, how to motivate students, and how to set up an environment where the mentee is most likely to succeed,” Davis says. “There’s some art to that, and I think the students that mentor MSRP students start to develop those skills.”
Carreira wasn’t solely providing guidance for Abu-Juam’s project, however. She also directed Abu-Juam to resources for programs Abu-Juam was interested in and suggested how to network as a student. They also connected on a personal level over a mutual interest in solving biological problems using programming and machine learning.
“It’s important to surround yourself with supportive people,” Carreira says. “I’m excited to see Fareeda succeeding in her career, and I look forward to staying in touch as she prepares her graduate school applications.”
Carreira also tried to pass on the wisdom that being able to discuss one’s work is critical. Abu-Juam, as one of 47 students participating in the program this year, had that opportunity right from the start by sharing her work with her MSRP-Bio cohort.
“That’s not only a strength of the program but also something I think all young scientists have to go through,” Carreira says. “One of the most important things, when we’re talking about science, is to be able to clearly articulate and explain what we’re doing to people who may have never heard of—or know why they should care about—our work.”
The BSG-MSRP-Bio program also brings in faculty to speak to the cohort, and Abu-Juam was given much to consider about when and why to apply to graduate school.
She recalls asking one speaker, Emeritus Faculty and Koch Institute Intramural Faculty Phillip Sharp, how to know if you are ready to apply.
“He said ‘The question isn’t are you ready. You’re not going to be ready—but is the school ready to help you get there?’” she says. “It’s been very nice having these programs where having this breadth of people to talk to has helped me think about where I want to be later on. It really reinforced going to grad school, even if I have doubts about being ready. Everyone says they don’t know what they’re doing—we’re all in the same boat.”
She also spoke with Assistant Professor of Biology and Koch Institute Intramural Faculty Yadira Soto-Feliciano, who shared her own experience about taking time to be a lab tech before starting graduate school.
“She was so nice, and she helped me look at her experience and weigh the pros and cons of it—and talked to me about making the decision for myself,” Abu-Juam recalls.
Abu-Juam was also selected to be a Gould fellow. She had a chance to meet Mike Gould and Sara Moss and was impressed by how engaged they were with the MSRP-Bio students. Established in 2016, the Bernard S. and Sophie G. Gould Fund supports students participating in the program.
“They came in with a stack of the essays we wrote as part of the application process, and you could tell they read them because once they matched faces with names, they’d ask questions, and you know they’d read and really thought about us,” she says.
Abu-Juam, who returns to The College of Wooster in Wooster, Ohio, this fall, is already reconsidering which classes to enroll in at her home institution to better prepare for a future in computational biology and machine learning.
Ultimately, Abu-Juam says that, as a result of her time at MIT, she’s learned not to be discouraged when things aren’t going well, a valuable but humbling lesson for any aspiring scientist.
“Research is definitely not easy. You don’t just sit there and figure everything out,” she says. “It’s a very cumulative process of small successes that keep compounding, combined with long periods of confusion—but confusion that ultimately leads you somewhere.”