Research Overview

We are broadly interested in the discovery, functional analysis, and therapeutic targeting of genes that are altered in human disease.  We seek to understand how different types of genetic changes affect the function of human genes and influence the molecular phenotype within a single cell.  By understanding the underlying function of a genetic mutation at its fundamental level, we can identify potential targeted therapies for a wide variety of clinical diseases.

Our research lab leverages genomic tools to understand how both rare and common human genetic variation contribute to and cause human disease. As sequencing technology has matured, one of the major challenges in genomics has become interpreting the majority of the DNA base pairs that are sequenced within an individual, an important next step in integrating research findings into the clinical setting.

The lab takes a bidirectional approach. First, we are interested in rare genetic syndromes caused by changes in genes that organize DNA through chromatin modification. The lab leverages functional genomic approaches (RNA-seq, ChIP-seq, methylation-seq, CLIP-seq, and Hi-C) to 1) understand how rare deleterious mutations in chromatin modifiers affect downstream pathways and human development in a cellular model system, 2) identify modifiers of disease severity, and 3) prioritize putative drug targets. Second, we are interested in the shared genetic basis of monogenic and complex diseases. We are using existing large-scale GWAS data sets to better identify and interpret findings by leveraging the extremes of the phenotypic spectrum (monogenic/Mendelian disease).

Our current focus is on genetic syndromes caused by rare pathogenic mutations in chromatin modifiers, genes that are important for chromatin conformation.

Developing a multi-omics approach to rapidly identify critical pathways and mechanisms in patients with pathogenic mutations in the gene KAT6A.

Our work on the role of chromatin modifiers in human development and disease started several years ago, when I led a study first describing a novel genetic syndrome with global developmental delay and syndromic features caused by rare de novo mutations in the gene KAT6A. The KAT6A gene's major role is to acetylate histones and other proteins within a cell, allowing for the proper expression of RNA and proteins during human development. Within 4 years of our initial description in 2015, over 200 individuals worldwide had been identified with KAT6A syndrome, making it a common cause of syndromic developmental delay. We recently published the largest study of KAT6A syndrome to date (Kennedy J, et al 2019 Genetics in Medicine), which described the mutational spectrum (Figure 1) and clinical phenotypes in 72 patients from around the world.

Figure 1. Mutations identified in 72 patients with KAT6A syndrome (Kennedy J, ...Arboleda VA* & Newbury-Ecob R* co-corresponding author, Genetics in Medicine 2018).


We have an active research program around KAT6A and related genes, as mutations in chromatin modifier genes typically produce similar features, suggesting a related etiology. In the lab, we work to develop patient-specific model systems to understand how mutations directly affect genetic regulation during the differentiation of brain and muscle cells. To do this, we have created induced pluripotent stem cell lines from patients with known pathogenic mutations in KAT6A, ASXL1, and other related chromatin modifier genes to better understand how differentiation into brain-specific and heart-specific cells is affected by the patient mutations (Figure 2).

In addition to deriving patient-specific cell lines, we also use CRISPR-Cas9 systems to create specific patient mutations in human cell lines to study the effect of these genes on chromatin structure and gene regulation.

[Figure: NPC workflow]

For patients, a great resource is the KAT6A Foundation, with whom we work closely to understand the clinical phenotype of KAT6A syndrome and develop clinical biomarkers for improved clinical variant interpretation.

If you are interested in contacting us to hear about the lab’s ongoing research and how you might become involved, please fill out the form below and we will get back to you within 1-2 weeks.

Correlation Between Mendelian and Complex Disease

Disrupted genes in Mendelian syndromes have large effects on downstream targets, contributing to the multitude of syndromic features. One or more of these gene targets may affect the risk of common disease. Recent studies have demonstrated a combinatorial effect of Mendelian syndromes on risk of common disease, but until the advent of high-throughput sequencing, we did not have the ability to systematically detect these interactions within an individual patient. While the relationship between Mendelian syndromes and common diseases acts through multiple layers of cellular regulation, we take a focused approach to study interactions at the level of chromatin structure (Hi-C, ChIP-seq and ATAC-seq) and transcriptional regulation. Understanding the joint influence of monogenic mutations and common disease variants on disease phenotypes allows us to unravel the underlying biological processes contributing to human disease.

The SNP effect size in a GWAS study is larger when the SNP is localized near a phenotype-matched Mendelian gene.


Our more recent work has focused on quantifying the contribution of phenotype-specific Mendelian genes to SNP effect sizes from genome-wide association studies (Freund MK, et al AJHG 2019). We found that when we explored the effect of phenotype-matched Mendelian disorder genes, the nearby GWAS signal had a larger effect. This suggested to us that non-coding variation that can modulate expression of Mendelian disease genes affects the same clinical phenotypes as those associated with the Mendelian disease, but with a more subtle effect.
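
The comparison described above can be pictured with a small sketch. It is not the analysis from Freund MK, et al; the SNPs, effect sizes, gene interval, and the 50 kb window below are all made-up assumptions chosen only to show the idea of splitting GWAS effect sizes by proximity to a phenotype-matched Mendelian gene.

```python
from statistics import mean

# Hypothetical toy data: (chromosome, position, absolute GWAS effect size).
snps = [
    ("chr1", 1_000, 0.30),
    ("chr1", 60_000, 0.25),
    ("chr1", 900_000, 0.05),
    ("chr2", 5_000, 0.04),
    ("chr2", 500_000, 0.06),
]

# Hypothetical phenotype-matched Mendelian gene intervals: (chromosome, start, end).
mendelian_genes = [("chr1", 10_000, 50_000)]

WINDOW = 50_000  # assumed flanking distance in base pairs, not a published value


def near_gene(chrom, pos, genes, window=WINDOW):
    """Return True if the SNP lies within `window` bp of any gene interval."""
    return any(
        chrom == g_chrom and g_start - window <= pos <= g_end + window
        for g_chrom, g_start, g_end in genes
    )


def split_effects(snps, genes, window=WINDOW):
    """Split effect sizes into SNPs near a listed gene and all other SNPs."""
    near, far = [], []
    for chrom, pos, beta in snps:
        (near if near_gene(chrom, pos, genes, window) else far).append(beta)
    return near, far


near, far = split_effects(snps, mendelian_genes)
mean_near, mean_far = mean(near), mean(far)
print(f"near: n={len(near)}, mean |beta|={mean_near:.3f}")
print(f"far:  n={len(far)}, mean |beta|={mean_far:.3f}")
```

A real analysis would also need to account for linkage disequilibrium, gene density, and multiple testing, none of which this toy example attempts.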

To fully explore these questions, we are interested in leveraging CRISPR-based screens to test the effects of non-coding interactions on cell-type-specific long-range interactions (see below).

Long-range interactions in pre-adipocyte promoter Hi-C data demonstrate that specific GWAS variants interact with phenotype-matched genes for BMI.


Interrogating the molecular mechanisms of chromatin gene mutations.

Monogenic Mendelian syndromes, although individually rare, constitute a large burden on families and the health care system. Such disorders are caused by rare genetic variants that disrupt protein-coding genes to cause disease. In contrast, common diseases (affecting more than 5% of the population) are polygenic and likely caused by non-coding variants, most of which do not alter the protein and therefore likely regulate gene expression.

Emerging precision medicine initiatives focus on individualized diagnosis, prognosis and treatment based on the integration of clinical, genomic, epigenetic, and other biomarkers. Our lab seeks to advance these goals in the setting of rare Mendelian syndromes. While precision medicine has been wildly successful in providing genetic diagnoses through clinical whole exome sequencing, it has left in its wake a gap between our expanded diagnostic capability and our ability to provide therapies based on the genetic diagnosis.

The majority of patients now have a genetic diagnosis that ends the “diagnostic odyssey” but leaves clinicians with vexing questions regarding prognosis and treatments based on the genetic diagnosis. Our lab seeks to bridge this gap, leveraging both publicly available data for a wide variety of complex diseases and functional genomic data generated from samples from patients with rare monogenic disease.

Software Resources

In our recent work, From Chemoproteomic-Detected Amino Acids to Genomic Coordinates: Insights into Precise Multi-omic Data Integration, we investigated potential sources of mapping errors in large-scale data integration pipelines and developed useful guidelines for future chemoproteomic-genetic studies. With an optimized workflow in hand, we then mapped publicly available Chemoproteomic Detected Amino Acids (CpDAA) and equivalent undetected amino acids in 3,840 proteins to genome-based predictions of missense deleteriousness and known disease-associated mutations from the ClinVar database. Our analysis revealed detected lysines to be enriched for harmful mutations compared to undetected lysines, and the opposite to be true for cysteine residues. Interestingly, higher cysteine reactivity was found to be associated with higher deleteriousness scores compared to less reactive CpD cysteines. Lastly, functional validation with the cysteine protease caspase-8 showcases how chemoproteomic measurements can complement genetic-based annotations to accurately identify functional amino acids in the human proteome.

Final Manuscript available here

https://www.biorxiv.org/content/10.1101/2020.07.03.186007v2

To explore the data:

http://mfpalafox.shinyapps.io/CpDAA

GitHub: https://mfpfox.github.io/MAPPING/
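
As a rough illustration of what mapping an amino acid to genomic coordinates involves, the sketch below converts a 1-based residue number into the three genomic positions of its codon. It is not the workflow from the manuscript; the function name `residue_to_genomic`, the exon intervals, and the coordinate convention (1-based, inclusive) are assumptions made for this example. Strand orientation and codons that span an exon boundary are handled explicitly, since both are easy places for a mapping like this to go wrong.

```python
def residue_to_genomic(residue, cds_exons, strand):
    """Map a 1-based amino acid position to the genomic positions of its codon.

    cds_exons: list of (start, end) coding intervals in genomic coordinates,
               1-based and inclusive (an assumed convention for this sketch).
    strand:    "+" or "-".
    Returns the three genomic positions in transcript (5' to 3') order.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if residue < 1:
        raise ValueError("residue positions are 1-based")

    # Order coding exons 5' to 3' along the transcript.
    exons = sorted(cds_exons, reverse=(strand == "-"))

    positions = []
    for i in range(3):
        remaining = 3 * (residue - 1) + i  # 0-based offset into the CDS
        for start, end in exons:
            length = end - start + 1
            if remaining < length:
                # Walk forward on the plus strand, backward on the minus strand.
                positions.append(start + remaining if strand == "+" else end - remaining)
                break
            remaining -= length
        else:
            raise ValueError("residue lies beyond the end of the CDS")
    return positions


# Hypothetical two-exon coding sequence.
toy_exons = [(100, 104), (200, 210)]
print(residue_to_genomic(2, toy_exons, "+"))  # codon spans the exon boundary
print(residue_to_genomic(1, toy_exons, "-"))  # minus strand counts down from the last exon
```

In practice the transcript and protein isoform identifiers also have to match between the proteomic and genomic resources before a positional mapping like this is meaningful.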


We developed the method DMRscaler to identify regions of differential methylation (DMRs) across the full range of genomic scale. Biologically relevant epigenetic features occur at all levels of genomic scale, from modifications of single basepairs that affect transcription factor binding to genome-wide epigenetic effects in gametogenesis and early development, as well as anywhere in between these extremes. We study rare genetic diseases caused by mutations in chromatin modifier genes, where the size of the downstream genomic region of effect is potentially hypervariable. This makes methods that accurately identify the scale of differential epigenetic features critical to developing our understanding of these diseases. In our recent work, DMRscaler: A Scale-Aware Method to Identify Regions of Differential DNA Methylation Spanning Basepair to Multi-Megabase Features, we demonstrate in simulation and real data how DMRscaler outperforms existing methods in the task of identifying the scale of DMRs ranging from < 100 bp to > 100 Mb in length, and show in analyses of KAT6A, Sotos, and Weaver syndromes how this can be leveraged to identify higher-level features of genome organization, such as gene clusters, that act as the unit of altered epigenetic state.

PrePrint Paper: https://www.biorxiv.org/content/10.1101/2021.02.03.428187v2

Code: https://leroybondhus.github.io/DMRscaler/
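
To give a feel for why scale matters when calling DMRs, here is a deliberately simple sketch. It is not the DMRscaler algorithm; it only merges hypothetical significant CpG positions using a distance threshold, and shows that the same sites resolve into different regions depending on the scale chosen. The positions and thresholds are invented for illustration.

```python
def merge_sites(positions, max_gap):
    """Merge positions into (start, end) regions when neighbours are within max_gap bp."""
    regions = []
    for pos in sorted(positions):
        if regions and pos - regions[-1][1] <= max_gap:
            # Extend the current region to include this site.
            regions[-1] = (regions[-1][0], pos)
        else:
            # Start a new region at this site.
            regions.append((pos, pos))
    return regions


# Hypothetical positions of differentially methylated CpGs on one chromosome.
sig_cpgs = [100, 150, 180, 5_000, 5_050, 1_000_000, 1_000_040]

# Call regions at three assumed scales: fine, intermediate, and coarse.
regions_by_scale = {gap: merge_sites(sig_cpgs, gap) for gap in (100, 10_000, 2_000_000)}

for gap, regions in regions_by_scale.items():
    print(f"max_gap={gap}: {regions}")
```

A single fixed threshold would report only one of these views, which is the limitation a scale-aware method is meant to address.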