Kuijjer Lab

Research

Our research focuses on developing computational frameworks that place genomic data into the context of gene regulatory networks and on exploring how these networks influence complex disease, with a primary focus on cancer.

    The regulation of gene expression involves a complicated network of interacting elements (Fig 1). Transcription factor (TF) complexes, promoters, inhibitors, and enhancers all play a role in regulating transcription of genes into mRNA, while microRNAs (miRNAs) fine-tune levels of mRNA expression. What emerges is not a single set of interactions, or even a single pathway, but a complex network of interacting genes and gene products.
    Fig 1. Genes are regulated by transcription factors, microRNAs, and other regulatory factors.
    Our driving hypothesis is that the complex clinical phenotypes we observe in cancer cannot be adequately explained by individual genes. Instead, we must consider the underlying network of regulatory interactions among multiple biological components. Our research goals emphasize the importance of correctly modeling alterations in gene regulation, relating those changes to cellular function, and, ultimately, identifying the biological mechanisms that drive disease development, progression, and clinical phenotypes (Fig 2):
    Fig 2. Our research focus areas.

1. Integration of multiple sources of 'omics data using networks

    Many methods attempt to capture the intricate interactions among the complex array of molecules that make up the cell. However, few include multiple types of interactions. To address this issue, we incorporated miRNA target predictions into an algorithm that uses "message passing" between regulators and their target genes to infer gene regulatory processes (Fig 3). We used this method to identify tumor suppressor genes targeted by prognostic miRNAs. We are currently expanding our algorithms to model other types of gene and protein regulatory processes.
    Fig 3. We integrate multiple 'omics data types using a message passing approach (PANDA/PUMA), which allows us to estimate interactions between regulators and their target genes. For more information, please see the Tools section.

2. Understanding disease using pathways and network modules

    One challenge, both in biological data analysis in general and in regulatory network modeling in particular, is how to analyze complex information in a meaningful way. For example, to interpret regulatory networks we often start by determining which genes in the network are most likely to be affected by transcription factors and/or miRNAs. This involves applying a summary statistic to quantify information about each gene in the network so that pathway enrichment analysis can be performed. Other network metrics, however, could highlight biological processes that are important for mediating a particular disease state. We are therefore expanding the way we analyze and interpret regulatory networks by taking into account the network's higher-order structure. We are also integrating network topology with mutational patterns to uncover recurrent mutations in regulatory pathways. This will allow us to translate the results of network analysis into actionable biological hypotheses.
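    As an illustration of the per-gene summary statistic mentioned above, the sketch below computes a weighted in-degree for each gene (the sum of all regulator-to-gene edge weights) and ranks genes by it. This is a minimal example under stated assumptions, not necessarily the statistic used in our tools: the choice of weighted in-degree, the function names, and the toy edge weights are all hypothetical.

```python
import numpy as np


def gene_targeting_scores(edge_weights):
    """Return the weighted in-degree of each gene.

    edge_weights: 2-D array with one row per regulator (TF or miRNA)
    and one column per target gene. This layout is an assumption.
    """
    weights = np.asarray(edge_weights, dtype=float)
    # Summing over regulators (axis 0) gives one score per gene.
    return weights.sum(axis=0)


def rank_genes(edge_weights, gene_names):
    """Rank genes from most to least targeted."""
    scores = gene_targeting_scores(edge_weights)
    # Negate the scores so that argsort yields descending order.
    order = np.argsort(-scores, kind="stable")
    return [(gene_names[i], float(scores[i])) for i in order]


# Hypothetical toy network: 3 regulators x 4 genes.
toy_weights = np.array([
    [0.9, 0.1, 0.0, 0.4],
    [0.8, 0.2, 0.1, 0.3],
    [0.7, 0.0, 0.1, 0.5],
])
toy_genes = ["geneA", "geneB", "geneC", "geneD"]

ranking = rank_genes(toy_weights, toy_genes)
for gene, score in ranking:
    print(f"{gene}\t{score:.2f}")
```

    A ranked gene list of this kind could then serve as input to a pre-ranked pathway enrichment analysis. Other node-level or module-level metrics can be swapped in by replacing gene_targeting_scores.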

3. Enabling precision medicine through single-sample network modeling

    Network reconstruction algorithms often draw on large numbers of measured expression samples to tease out subtle signals and infer connections between genes or gene products. The result is an aggregate network model representing a single estimate for edge likelihoods. While informative, aggregate models fail to capture the heterogeneity represented in a population. We recognized that edges in an aggregate network can be modeled as a linear combination of edges from the networks representing each individual sample. Based on this, we can solve for single-sample networks by comparing two network models, one including and one excluding the sample of interest. We are applying this method to large-scale cancer datasets to better understand the effects of gene regulation on specific cancer types, and to understand pan-cancer gene regulation. Importantly, this development has potential applications in genomically informed precision medicine, because it allows us to statistically associate individual networks and network properties with clinical endpoints, such as patient survival (Fig 4) and response to therapy. It will also allow us to integrate individual patient networks with other 'omics data types to uncover new dependencies between gene regulation, genotype, and phenotype.
    Fig 4. We can model individual patient networks using patient-specific 'omics data and thereby identify network properties important for clinical features, such as patient survival. For an overview of our tools, please see the Tools section.
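    The comparison of two network models described in this section can be sketched as follows. If the aggregate network over N samples is assumed to be an equally weighted average of the individual sample networks, then the network for sample q follows from e_q = N * (e_all - e_without_q) + e_without_q, where e_all is built from all samples and e_without_q from all samples except q. The example below is a minimal sketch under that assumption. It uses Pearson correlation as a stand-in aggregate network, which is our choice for illustration only; the function names and the random toy data are hypothetical.

```python
import numpy as np


def aggregate_network(expression):
    """Build an aggregate network as a gene-by-gene Pearson correlation matrix.

    expression: 2-D array with genes in rows and samples in columns.
    Pearson correlation is an illustrative stand-in; any aggregate
    network reconstruction method could be used here instead.
    """
    return np.corrcoef(expression)


def single_sample_network(expression, sample_index):
    """Estimate the network of one sample from two aggregate networks.

    Assumes the aggregate network is an equally weighted linear
    combination of the individual sample networks.
    """
    n_samples = expression.shape[1]
    # Network including the sample of interest.
    net_all = aggregate_network(expression)
    # Network excluding the sample of interest.
    net_without = aggregate_network(np.delete(expression, sample_index, axis=1))
    # Solve the linear combination for the left-out sample.
    return n_samples * (net_all - net_without) + net_without


# Hypothetical toy data: 5 genes measured in 20 samples.
rng = np.random.default_rng(0)
toy_expression = rng.normal(size=(5, 20))

sample_networks = [
    single_sample_network(toy_expression, q)
    for q in range(toy_expression.shape[1])
]
print(len(sample_networks), sample_networks[0].shape)
```

    Each element of sample_networks is a gene-by-gene matrix for one sample, so edge weights or derived network properties can be tested for association with clinical endpoints across samples.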