• Mostafavi, Sara


    Investigator, BC Children's Hospital
    Assistant Professor, Department of Medical Genetics, Faculty of Medicine and Department of Statistics, Faculty of Science, University of British Columbia

    Primary Area of Research
    Healthy Starts
    Lab Phone
    Heidi Cheung
    Mailing Address

    BC Children's Hospital Research Institute
    Room 3110A
    950 West 28th Avenue
    Vancouver, BC V5Z 4H4

    Research Areas
    • Computational Biology
    • Regulatory Networks
    • Genetics of Complex Traits
    • Psychiatric Genetics
    • Machine Learning in Computational Biology
    • Genomics
    The production of diverse types of high-dimensional, high-throughput biological data has increased tremendously in the last decade, presenting novel opportunities to develop and apply computational and machine learning approaches to understand the genetics of human diseases. However, the high dimensionality of these data, whereby up to millions of diverse and heterogeneous “features” are measured in a single experiment, coupled with the prevalence of systematic confounding factors, presents significant challenges in disentangling bona fide associations that are informative of causal molecular events in disease.

    The research program in my lab focuses on designing tailored computational models and algorithms for integrating multiple types of high-dimensional “omics” data, with the ultimate goal of disentangling meaningful molecular correlations for common diseases such as psychiatric disorders and cancers.

    Current Projects
    The research program in the Mostafavi lab focuses on developing and applying computational and machine learning approaches for integrating and interpreting high-dimensional genomics data. Three ongoing research projects are summarized below.

    (a) Integrating multiple data types for understanding the genetics of complex traits: Common diseases are multifactorial, with contributions from multiple genetic and environmental factors. The availability of multiple types of genomics data (e.g., genome, methylome, transcriptome, and proteome) now allows us to build a comprehensive understanding of the varied types of risk factors that underlie complex diseases. For example, the combination of genotyping and epigenomic data can summarize the effect of genetic factors, environmental factors, and interactions between the two at the cellular level. To this end, we are developing computational models that integrate multiple types of genomics data in the context of complex diseases, with the goal of distinguishing meaningful, and likely causal, factors from merely correlated or downstream ones.

    (b) Understanding the impact of genetic variation on cellular traits: In order to understand how genetic variation results in disease, we must first understand the impact of such variation on cellular traits. We are interested in developing predictive computational models for linking genetic variation to multiple types of cellular traits, including histone modifications and gene and protein expression levels. In particular, we are working to develop approaches that take into account tissue- and cell-specificity when making such predictions.

    (c) Predicting gene function from heterogeneous data sources: A major goal in molecular biology is to determine the functional role of all genes and proteins in a cell. Our current knowledge of gene function is limited: the majority of human genes (or proteins) have not yet been associated with an informative function. With the availability of large and diverse types of genomic data, we can now make rapid progress in this domain. We are developing and applying computational approaches for integrating multiple types of genomics data in order to predict the function of uncharacterized genes in a genome-wide and context-specific manner.
    Selected Publications
    Mostafavi S, Ortiz-Lopez A, Bogue M, Hattori K, Pop C, Koller D, Mathis D, Benoist C, and the ImmGen Consortium. Variation and genetic control of gene expression in primary immunocytes across inbred mouse strains. Journal of Immunology. 2014. PMID: 25267973

    Mostafavi S, Battle A, Zhu X, Potash JB, Weissman MM, Shi J, Beckman K, Haudenschild C, McCormick C, Mei R, Gameroff MJ, Gindes H, Adams P, Goes FS, Mondimore FM, Mackinnon DF, Notes L, Schweizer B, Furman D, Montgomery SB, Urban AE, Koller D, Levinson DF.  Type I interferon signaling genes in recurrent major depression: increased expression detected by whole-blood RNA sequencing. Molecular Psychiatry. 2014. PMID: 24296977

    Battle A, Mostafavi S, Zhu X, Potash JB, Weissman MM, McCormick C, Haudenschild CD, Beckman KB, Shi J, Mei R, Urban AE, Montgomery SB, Levinson DF, Koller D. Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Research. 2014. PMID: 24092820

    Raj T, Rothamel K, Mostafavi S, Ye C, Lee MN, Replogle JM, Feng T, Lee M, Asinovski N, Frohlich I, Imboywa S, Von Korff A, Okada Y, Patsopoulos NA, Davis S, McCabe C, Paik HI, Srivastava GP, Raychaudhuri S, Hafler DA, Koller D, Regev A, Hacohen N, Mathis D, Benoist C, Stranger BE, De Jager PL. Polarization of the effects of autoimmune and neurodegenerative risk alleles in leukocytes. Science. 2014. PMID: 24786080 

    Mostafavi S, Battle A, Zhu X, Urban AE, Levinson D, Montgomery SB, Koller D. Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge. PLoS One. 2013. PMID: 23874524

    Mostafavi S, Goldenberg A, Morris Q. Labeling nodes using three degrees of propagation. PLoS One. 2012. PMID: 23284828

    Mostafavi S, Morris Q. Combining many interaction networks to predict gene function and analyze gene lists. Proteomics. 2012 (Review article). PMID: 22589215

    Mostafavi S, Morris Q. Fast integration of heterogeneous data sources for predicting gene function with limited annotation. Bioinformatics. 2010. PMID: 20507895