Overview

Innovative technologies now allow us to probe the genome in more dimensions and at higher resolution than ever before, providing a wealth of information for studying the genomic basis of complex traits. However, meaningful biological insights are often masked by technical artifacts, systematic biases, or a low signal-to-noise ratio (the “needle in a haystack” problem). These challenges demand tailored statistical methodology to unlock the full potential of emerging assays.

My research group focuses on developing novel frameworks and rigorous inferential procedures that exploit the increased scope and scale of high-throughput sequencing data, with the ultimate goal of uncovering new molecular signals in cancer, child health, and development.

Publications

Complete Reference Genome and Pangenome Expand Biologically Relevant Information for Genome-Wide DNA Methylation Analysis Using Short-Read Sequencing and Array Data
bioRxiv
10/2024

Discrepancies in readouts between Infinium MethylationEPIC v2.0 and v1.0 reflected in DNA methylation-based tools: implications and considerations for human population epigenetic studies
Beryl Zhuang and Marcia Smiti Jude and Chaini Konwar and Natan Yusupov and Calen Patrick Ryan and Hannah-Ruth Engelbrecht and Joanne Whitehead and Alexandra A. Halberstam and Julia MacIsaac and Kristy Dever and Toan Khanh Tran and Kim Korinek and Zachary Zimmer and Nanette R. Lee and Thomas W. McDade and Christopher W. Kuzawa and Kim M. Huffman and Daniel W Belsky and Elisabeth Binder and Darina Czamara and Keegan Korthauer and Michael Kobor
DOI: 10.1101/2024.07.02.600461
09/2024

vmrseq: Probabilistic Modeling of Single-cell Methylation Heterogeneity
Ning Shen and Keegan Korthauer
DOI: 10.1101/2023.11.20.567911
11/2023

Bayesian Decision Curve Analysis with bayesDCA
arXiv
DOI: 10.48550/arXiv.2308.02067
2023

FTD-associated behavioural and transcriptomic abnormalities in 'humanized' progranulin-deficient mice: A novel model for progranulin-associated FTD
Neurobiology of Disease
DOI: 10.1016/J.NBD.2023.106138
2023

Conservation and divergence of canonical and non-canonical imprinting in murids
Genome Biology
DOI: 10.1186/S13059-023-02869-1
2023

Reversal of viral and epigenetic HLA class I repression in Merkel cell carcinoma
Journal of Clinical Investigation
DOI: 10.1172/JCI151666
2022

Detecting Neuroendocrine Prostate Cancer Through Tissue-Informed Cell-Free DNA Methylation Analysis
Clinical Cancer Research
DOI: 10.1158/1078-0432.CCR-21-3762
2022

Differential substrate use in EGF- and oncogenic KRAS-stimulated human mammary epithelial cells
The FEBS Journal
Keibler MA and Dong W and Korthauer KD and Hosios AM and Moon SJ and Sullivan LB and Liu N and Abbott KL and Arevalo OD and Ho K and Lee J and Phanse AS and Kelleher JK and Iliopoulos O and Stephanopoulos G
DOI: 10.1111/febs.15858
PubMed: 33811729
04/2021

A compositional model to assess expression changes from single-cell RNA-seq data
Annals of Applied Statistics
DOI: 10.1214/20-AOAS1423
2021

Reprogramming of the FOXA1 cistrome in treatment-emergent neuroendocrine prostate cancer
Nature Communications
DOI: 10.1038/s41467-021-22139-7
2021

Androgen receptor and MYC equilibration centralizes on developmental super-enhancer
Nature Communications
DOI: 10.1038/S41467-021-27077-Y
2021

CDK4/6 inhibition reprograms the breast cancer enhancer landscape by stimulating AP-1 transcriptional activity
Nature Cancer
April C. Watt and Paloma Cejas and Molly J. DeCristo and Otto Metzger-Filho and Enid Y. N. Lam and Xintao Qiu and Haley BrinJones and Nikolas Kesten and Rhiannon Coulson and Alba Font-Tello and Klothilda Lim and Raga Vadhi and Veerle W. Daniels and Joan Montero and Len Taing and Clifford A. Meyer and Omer Gilan and Charles C. Bell and Keegan D. Korthauer and Claudia Giambartolomei and Bogdan Pasaniuc and Ji-Heui Seo and Matthew L. Freedman and Cynthia Ma and Matthew J. Ellis and Ian Krop and Eric Winer and Anthony Letai and Myles Brown and Mark A. Dawson and Henry W. Long and Jean J. Zhao and Shom Goel
DOI: 10.1038/s43018-020-00135-y
11/2020

Transparency and reproducibility in artificial intelligence
Nature
Haibe-Kains B and Adam GA and Hosny A and Khodakarami F and Massive Analysis Quality Control (MAQC) Society Board of Directors and Waldron L and Wang B and McIntosh C and Aerts HJWL
DOI: 10.1038/s41586-020-2766-y
PubMed: 33057217
10/2020

Detection of urothelial carcinoma using plasma cell-free methylated DNA
Journal of Clinical Oncology
DOI: 10.1200/jco.2020.38.15_suppl.5046
05/2020

Detection of renal cell carcinoma using plasma and urine cell-free DNA methylomes
Nature Medicine
Nuzzo, P.V. and Berchuck, J.E. and Korthauer, K. and Spisak, S. and Nassar, A.H. and Abou Alaiwi, S. and Chakravarthy, A. and Shen, S.Y. and Bakouny, Z. and Boccardo, F. and Steinharter, J. and Bouchard, G. and Curran, C.R. and Pan, W. and Baca, S.C. and Seo, J.-H. and Lee, G.-S.M. and Michaelson, M.D. and Chang, S.L. and Waikar, S.S. and Sonpavde, G. and Irizarry, R.A. and Pomerantz, M. and De Carvalho, D.D. and Choueiri, T.K. and Freedman, M.L.
DOI: 10.1038/s41591-020-0933-1
2020

Prostate cancer reactivates developmental epigenomic programs during metastatic progression
Nature Genetics
Pomerantz, M.M. and Qiu, X. and Zhu, Y. and Takeda, D.Y. and Pan, W. and Baca, S.C. and Gusev, A. and Korthauer, K.D. and Severson, T.M. and Ha, G. and Viswanathan, S.R. and Seo, J.-H. and Nguyen, H.M. and Zhang, B. and Pasaniuc, B. and Giambartolomei, C. and Alaiwi, S.A. and Bell, C.A. and O'Connor, E.P. and Chabot, M.S. and Stillman, D.R. and Lis, R. and Font-Tello, A. and Li, L. and Cejas, P. and Bergman, A.M. and Sanders, J. and van der Poel, H.G. and Gayther, S.A. and Lawrenson, K. and Fonseca, M.A.S. and Reddy, J. and Corona, R.I. and Martovetsky, G. and Egan, B. and Choueiri, T. and Ellis, L. and Garraway, I.P. and Lee, G.-S.M. and Corey, E. and Long, H.W. and Zwart, W. and Freedman, M.L.
DOI: 10.1038/s41588-020-0664-8
2020

Plasma cell-free DNA variant analysis compared with methylated DNA analysis in renal cell carcinoma
Genetics in Medicine
Lasseter, K. and Nassar, A.H. and Hamieh, L. and Berchuck, J.E. and Nuzzo, P.V. and Korthauer, K. and Shinagare, A.B. and Ogorek, B. and McKay, R. and Thorner, A.R. and Lee, G.-S.M. and Braun, D.A. and Bhatt, R.S. and Freedman, M. and Choueiri, T.K. and Kwiatkowski, D.J.
DOI: 10.1038/s41436-020-0801-x
2020

Detection and accurate false discovery rate control of differentially methylated regions from whole genome bisulfite sequencing
Biostatistics (Oxford, England)
Korthauer K and Chakraborty S and Benjamini Y and Irizarry RA
DOI: 10.1093/biostatistics/kxy007
PubMed: 29481604
07/2019

A practical guide to methods controlling false discoveries in computational biology
Genome Biology
Korthauer, K. and Kimes, P.K. and Duvallet, C. and Reyes, A. and Subramanian, A. and Teng, M. and Shukla, C. and Alm, E.J. and Hicks, S.C.
DOI: 10.1186/s13059-019-1716-1
PubMed: 31164141
2019

Genome-wide repressive capacity of promoter DNA methylation is revealed through epigenomic manipulation
Keegan Korthauer and Rafael A. Irizarry
DOI: 10.1101/381145
08/2018

High-throughput identification of RNA nuclear enrichment sequences
The EMBO Journal
Chinmay J Shukla and Alexandra L McCorkindale and Chiara Gerhardinger and Keegan D Korthauer and Moran N Cabili and David M Shechner and Rafael A Irizarry and Philipp G Maass and John L Rinn
DOI: 10.15252/embj.201798452
03/2018

A Somatically Acquired Enhancer of the Androgen Receptor Is a Noncoding Driver in Advanced Prostate Cancer
Cell
Takeda, D.Y. and Spisák, S. and Seo, J.-H. and Bell, C. and O'Connor, E. and Korthauer, K. and Ribli, D. and Csabai, I. and Solymosi, N. and Szállási, Z. and Stillman, D.R. and Cejas, P. and Qiu, X. and Long, H.W. and Tisza, V. and Nuzzo, P.V. and Rohanizadegan, M. and Pomerantz, M.M. and Hahn, W.C. and Freedman, M.L.
DOI: 10.1016/j.cell.2018.05.037
2018

Detection and accurate False Discovery Rate control of differentially methylated regions from Whole Genome Bisulfite Sequencing
Keegan D. Korthauer and Sutirtha Chakraborty and Yuval Benjamini and Rafael A. Irizarry
DOI: 10.1101/183210
08/2017

IPI59: An Actionable Biomarker to Improve Treatment Response in Serous Ovarian Carcinoma Patients
Statistics in Biosciences
Choi, J. and Ye, S. and Eng, K.H. and Korthauer, K. and Bradley, W.H. and Rader, J.S. and Kendziorski, C.
DOI: 10.1007/s12561-016-9144-1
2017

A statistical approach for identifying differential distributions in single-cell RNA-seq experiments
Genome Biology
Korthauer, K.D. and Chu, L.-F. and Newton, M.A. and Li, Y. and Thomson, J. and Stewart, R. and Kendziorski, C.
DOI: 10.1186/s13059-016-1077-y
2016

scDD: A statistical approach for identifying differential distributions in single-cell RNA-seq experiments
Korthauer KD and Chu L and Newton MA and Li Y and Thomson J and Stewart R and Kendziorski C
DOI: 10.1101/035501
12/2015

MADGiC: A model-based approach for identifying driver genes in cancer
Bioinformatics
Korthauer, K.D. and Kendziorski, C.
DOI: 10.1093/bioinformatics/btu858
2015

Chromosomal copy number alterations and HPV integration in cervical precancer and invasive cancer
Carcinogenesis
Bodelon, C. and Vinokurova, S. and Sampson, J.N. and den Boon, J.A. and Walker, J.L. and Horswill, M.A. and Korthauer, K. and Schiffman, M. and Sherman, M.E. and Zuna, R.E. and Mitchell, J. and Zhang, X. and Boland, J.F. and Chaturvedi, A.K. and Dunn, S.T. and Newton, M.A. and Ahlquist, P. and Wang, S.S. and Wentzensen, N.
DOI: 10.1093/carcin/bgv171
2015

Methods for collapsing multiple rare variants in whole-genome sequence data
Genetic Epidemiology
Sung, Y.J. and Korthauer, K.D. and Swartz, M.D. and Engelman, C.D.
DOI: 10.1002/gepi.21820
2014

Limited model antigen expression by transgenic fungi induces disparate fates during differentiation of adoptively transferred T cell receptor transgenic CD4+ T cells: Robust activation and proliferation with weak effector function during recall
Infection and Immunity
Wüthrich, M. and Ersland, K. and Pick-Jacobs, J.C. and Gern, B.H. and Frye, C.A. and Sullivan, T.D. and Brennan, M.B. and Filutowicz, H.I. and O'Brien, K. and Korthauer, K.D. and Schultz-Cherry, S. and Klein, B.S.
DOI: 10.1128/IAI.05326-11
2012

The Genetic Network Controlling the Arabidopsis Transcriptional Response to Pseudomonas syringae pv. maculicola: Roles of Major Regulators and the Phytotoxin Coronatine
Molecular Plant-Microbe Interactions
DOI: 10.1094/MPMI-21-11-1408
2008

Predicting Cancer Subtypes Using Survival-Supervised Latent Dirichlet Allocation Models
Advances in Statistical Bioinformatics
Keegan Korthauer and John Dawson and Christina Kendziorski
DOI: 10.1017/cbo9781139226448.019

Research

Unraveling the spatial landscape of epigenomic signals
A common task in the interpretation of epigenomic data, which holds information about the genome not encoded in the DNA sequence itself, is the detection and inference of regions of interest. For example, we often want to detect segments of the genome that show significantly higher or lower DNA methylation levels with respect to disease state or developmental stage, as this particular modification to the DNA is known to influence gene regulation. However, the number of candidate segments, counted over all possible start positions and lengths, is enormous, leading to a massive multiple testing problem. Our group develops tailored statistical and computational approaches for powerful detection and inference of region-based epigenomic signals, while paying particular attention to spatial patterns. We are interested in designing and applying these techniques for the analysis of DNA methylation, histone modification, and chromatin accessibility assay data.
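
As a rough illustration of the scale of this multiple testing problem, the sketch below counts the candidate segments formed by runs of consecutive measured sites. The function name, the optional length cap, and the figure of 28 million sites (roughly the order of magnitude often quoted for CpG sites in the human genome) are illustrative assumptions, not a description of our methods.

```python
from typing import Optional


def count_contiguous_segments(n_sites: int, max_len: Optional[int] = None) -> int:
    """Count segments made of consecutive sites.

    n_sites: number of measured sites along the genome (hypothetical input).
    max_len: optional cap on the number of sites per segment (an assumption
             added here to show how restricting segment size shrinks the count).
    """
    if n_sites < 0 or (max_len is not None and max_len < 0):
        raise ValueError("n_sites and max_len must be non-negative")
    if max_len is None or max_len >= n_sites:
        # Every (start, end) pair with start <= end defines one segment.
        return n_sites * (n_sites + 1) // 2
    # A segment of length k has n_sites - k + 1 possible start positions;
    # summing over k = 1..max_len gives the closed form below.
    return max_len * n_sites - max_len * (max_len - 1) // 2


if __name__ == "__main__":
    n = 28_000_000  # assumed, order-of-magnitude number of sites
    print(f"all segments:        {count_contiguous_segments(n):,}")
    print(f"segments <= 50 sites: {count_contiguous_segments(n, 50):,}")
```

Even with a cap on segment length, the number of candidate regions far exceeds the number of individual sites, which is why testing every segment separately is impractical.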

Predicting gene expression from epigenomic signals
It is widely known that epigenetic information, such as DNA methylation and histone modifications, plays a role in gene regulation. However, predicting gene expression from epigenomic signals is challenging because of interactions between different epigenomic marks as well as interactions between different regions of the genome. We are developing predictive models that account for these challenges and assessing the predictive capacity of various epigenomic signals.

Understanding the genomic basis of complex traits
Our group develops computational approaches to study the genomic basis of a variety of complex traits. Our main focus areas currently include modeling the mutation spectrum of cancer genomes, revealing heterogeneity in single-cell gene expression during development, and characterizing the epigenomic landscape of prostate cancer. To maximize the impact of our work, we also provide open-source computational tools that enable other scientists to gain meaningful biological insights.

Research Group Members

Giuliano Cruz, Graduate Research Assistant
Erick Navarro
Ning Shen, Graduate Research Assistant