HOUSTON – (Aug. 1, 2018) – The National Science Foundation has awarded two grants for a combined $1.5 million to Rice University computational biologist Luay Nakhleh to expand big data techniques in the fight against cancer and to scale up methods that infer connections between evolutionary pathways.
Although the projects use similar strategies to track evolutionary pathways, one focuses on species-level analysis and the other drops down to the level of single cells.
Nakhleh’s research group specializes in computational research related to evolution and develops big-data tools that use genetic data to find previously unknown connections between species. Using a statistical technique called inference, the team can estimate the probability that genes in one species are related to genes in another.
A four-year grant will allow Nakhleh’s lab to expand the capabilities of PhyloNet, an open-source software package he and his team developed to determine aspects of evolution that wouldn’t show up on a standard evolutionary — or phylogenetic — tree but would appear as part of a network.
Phylogenetic networks are branching diagrams of evolutionary progression based on similarities and differences in species’ genetic characteristics, but they are limited in the amount of genetic data they can handle at once.
“We have developed the computation model so we know how to do these comparisons,” said Nakhleh, a professor of computer science and biosciences and chair of Rice’s Department of Computer Science. “Now we want to do them fast and for large data sets. Now that we know how to do them for five genomes, how do we do it for 300?”
The second grant will fund a three-year project in collaboration with the University of Texas MD Anderson Cancer Center to better understand why some cancer cells spread and mutate differently than others, a phenomenon that can complicate cancer diagnosis and treatment.
“In the last four or five years, the medical community has started to get interested in evolutionary questions related to cancer, and now they’re gathering data we can use,” Nakhleh said. “The computational community has been developing tools for this kind of analysis for a long time, but we have focused on how species and genes evolve.”
Tools that test for variations between the genetic code of species can be adapted for cancer research, he said.
“Cancer cells are heterogeneous,” Nakhleh said. “If you sequence multiple cells from a patient biopsy, the average won’t reflect what’s going on in the individual cells.”
Nakhleh’s MD Anderson collaborator, computational biologist Ken Chen, made Nakhleh aware of work by geneticist and colleague Nicholas Navin to extract sequence data from individual cancer cells.
“Cells divide, DNA replicates and cancers start to evolve as mutations accumulate,” he said. “That’s why single-cell data is ideal for doing phylogenetic or evolutionary analysis.”
Nakhleh said cell data will help researchers understand when and how cancers evolved in individual patients. “We have a lot of questions: Why do cells become resistant to chemotherapy or targeted drugs? Why do certain cancers become metastatic? Why does one patient with lung cancer survive for 50 years and another dies after six months? Not all the answers are in the genome, but we think we can start to go after basic genetic underpinnings of these kinds of phenomena.”
He said his group will draw upon mathematics, computer science and statistics as it develops algorithms to infer the evolutionary histories of tumor cells. “Usually when you say evolutionary biology to medical people, they think that’s an academic question,” Nakhleh said. “But they are missing so much of what’s happening because at the end of the day, cancer is about evolution and mutation. It’s not an academic question; it’s about getting at the root cause of what’s happening.”