Our research focuses on developing computational frameworks that place genomic data into the context of gene regulatory networks and on exploring how these networks influence cancer.
- The regulation of gene expression involves a complicated network of interacting elements (Fig 1). Transcription factor (TF) complexes, promoters, inhibitors, and enhancers all play a role in regulating transcription of genes into mRNA, while microRNAs (miRNAs) fine-tune levels of mRNA expression. What emerges is not a single set of interactions, or even a single pathway, but a complex network of interacting genes and gene products.
- Our driving hypothesis is that the complex clinical phenotypes we observe in cancer cannot be adequately defined by individual genes. Instead, we must consider the underlying network of regulatory interactions between multiple different biological components. Our research goals emphasize the importance of correctly modeling alterations in gene regulation, relating those changes to cellular function, and ultimately, identifying the underlying biological mechanisms that drive cancer development, progression, and clinical phenotypes (Fig 2):
1. Integration of multiple sources of 'omics data using networks
- Many methods have been developed to capture the intricate interactions among the complex array of molecules that comprise the cell. However, few include multiple types of interactions. To address this issue, we incorporated miRNA target predictions into an algorithm that uses "message passing" between regulators and their target genes to infer gene regulatory processes (Fig 3). We used this method to identify target tumor suppressor genes of prognostic miRNAs. We are currently refining the prior transcription factor and miRNA regulatory networks that serve as input to these models. In future work, we plan to further expand our algorithms to model other types of post-transcriptional regulatory processes.
2. Understanding cancer data using pathways and network modules
- One challenge, both in biological data analysis in general and in regulatory network modeling in particular, is how to analyze complex information in a meaningful way. For example, to interpret regulatory networks we often start by determining which genes in the network are most likely to be affected by transcription factors and/or miRNAs. This involves applying a summary statistic to quantify information about each of the genes in the network so that pathway enrichment analysis can be performed. Other network metrics, however, could potentially highlight information regarding biological processes important for mediating a particular disease state. We are therefore expanding the way we analyze and interpret regulatory networks by taking into account the network's higher-order structure. In cancer, we hope to integrate network topology with mutational patterns to uncover recurrent mutations in regulatory pathways. This will allow us to translate the results of network analysis into actionable biological hypotheses.
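As a rough illustration of the per-gene summary statistic mentioned above, the sketch below sums the incoming edge weights for each gene in a regulator-by-gene network and ranks genes by that score. The choice of statistic (a weighted in-degree), the function names, and the toy gene labels are assumptions made for this example and are not taken from a specific published pipeline.

```python
import numpy as np


def gene_targeting_scores(edge_weights):
    """Sum the incoming edge weights for each gene.

    edge_weights: 2-D array of shape (n_regulators, n_genes), where entry
    [r, g] is the estimated strength of the edge from regulator r to gene g.
    Using the column sum as the summary statistic is an assumption.
    """
    return np.asarray(edge_weights, dtype=float).sum(axis=0)


def rank_genes(edge_weights, gene_names):
    """Return (gene, score) pairs sorted from most to least targeted."""
    scores = gene_targeting_scores(edge_weights)
    order = np.argsort(-scores, kind="stable")
    return [(gene_names[i], float(scores[i])) for i in order]


# Toy network: two regulators (rows), three hypothetical genes (columns).
toy_weights = np.array([
    [0.9, 0.1, 0.4],
    [0.8, 0.2, -0.3],
])
toy_genes = ["geneA", "geneB", "geneC"]
ranked = rank_genes(toy_weights, toy_genes)
```

A ranked list of this kind could then be passed to a pre-ranked pathway enrichment tool. Other statistics, such as the difference in scores between two conditions, would fit the same pattern.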
3. Enabling precision medicine through single-sample network modeling
- Network reconstruction algorithms often draw on large numbers of measured expression samples to tease out subtle signals and infer connections between genes or gene products. The result is an aggregate network model representing a single estimate for edge likelihoods. While informative, aggregate models fail to capture the heterogeneity represented in a population. We recognized that edges in an aggregate network can be modeled as a linear combination of edges from the networks representing each individual sample. Based on this, we can solve for single-sample networks by comparing two network models, one including and one excluding the sample of interest. We are applying this method to large-scale cancer datasets to better understand the effects of gene regulation on specific cancer types, and to understand pan-cancer gene regulation. Importantly, this development has potential applications in genomically informed precision medicine, because it allows us to statistically associate individual networks and network properties with clinical endpoints, such as patient survival (Fig 4) and response to therapy. It will also allow us to integrate individual patient networks with other 'omics data types to uncover new dependencies between gene regulation, genotype, and phenotype.
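The comparison of two network models described above can be sketched in a few lines. This minimal example assumes that every sample contributes equally to the aggregate network and, purely for illustration, uses Pearson correlation as the aggregate network method. The function names and the random toy data are hypothetical; any other reconstruction method could be passed in through the `aggregate` argument.

```python
import numpy as np


def pearson_network(expr):
    """Aggregate network: gene-by-gene Pearson correlation.

    expr: 2-D array of shape (n_genes, n_samples).
    Pearson correlation is an illustrative stand-in for the aggregate method.
    """
    return np.corrcoef(expr)


def single_sample_network(expr, q, aggregate=pearson_network):
    """Estimate the network for sample q from two aggregate models.

    Assuming equal sample weights, the aggregate over all N samples is
        e_all = (1/N) * sum_s e_s
    and the aggregate without sample q is
        e_minus_q = (1/(N-1)) * sum_{s != q} e_s.
    Solving for e_q gives
        e_q = N * (e_all - e_minus_q) + e_minus_q.
    """
    expr = np.asarray(expr, dtype=float)
    n_samples = expr.shape[1]
    net_all = aggregate(expr)                             # includes sample q
    net_minus_q = aggregate(np.delete(expr, q, axis=1))   # excludes sample q
    return n_samples * (net_all - net_minus_q) + net_minus_q


# Toy data: 5 genes measured in 20 samples (random numbers, not real data).
rng = np.random.default_rng(0)
toy_expr = rng.normal(size=(5, 20))
toy_network = single_sample_network(toy_expr, q=3)
```

When the aggregate method is exactly linear in the per-sample networks, this calculation recovers each sample's network exactly. For methods such as Pearson correlation, the linear combination is only an approximation.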