Study zeroes in on genes involved in Crohn’s disease

Authors: Sazonovs, A., Stevens, C.R., Venkataraman, G.R. et al. 

Media Release: Study zeroes in on genes involved in Crohn’s disease – Wellcome Sanger Institute

Link to research paper: Large-scale sequencing identifies multiple genes and rare variants associated with Crohn’s disease susceptibility | Nature Genetics


Genome-wide association studies (GWASs) in Crohn’s disease (CD), and inflammatory bowel disease (IBD) more generally, have successfully identified more than 200 loci contributing to risk of disease (refs. 1–4). While most GWAS hits do not immediately implicate an obvious functional variant or gene, a subset have been directly mapped to coding variants (for example, NOD2, IL23R, ATG16L1, SLC39A8, FUT2, TYK2, IFIH1, SLAMF8, PLCG2) (ref. 5), providing more direct clues to pathogenesis. Further, targeted and genome-wide sequencing approaches have revealed additional, lower-frequency, disease-associated coding variants (for example, CARD9, RNF186, ADCY7, INAVA/C1orf106, SLC39A8, NOD2) (refs. 6–9) originally undetected by GWASs. Such coding variants, common and rare, have led to functional follow-up experiments demonstrating causal mechanisms for at least ten genes and have provided the most direct biological insights to emerge from genetic studies of IBD (refs. 10–13).


Here, we demonstrate that large-scale exome sequencing can complement GWASs by pinpointing specific genes, both those indirectly implicated by GWASs and those not yet observed in GWASs. With high sensitivity to directly test individual variants down to a minor allele frequency (MAF) of 0.01%, as well as to assess the burden of ultra-rare mutations, we begin to fill in the low-frequency and rare-variant component of the genetic architecture of CD. This component was not observable by earlier generations of CD GWAS meta-analyses, which have had more limited coverage of low-frequency and rare variation.

Past findings in IBD (ref. 5), and most other complex diseases, suggest that while coding variants are vastly outnumbered by noncoding variation, they are highly enriched for associations with common and rare diseases. Furthermore, associated coding variants tend to have stronger effects than their noncoding counterparts, and natural selection therefore often keeps them at lower frequency. While this alone validates the use of exome sequencing for efficiency’s sake, the primary advantage of targeting coding regions for discovery is that coding variants uniquely pinpoint genes, and often pathogenetic mechanisms, in a fashion that is at present far more challenging to achieve routinely for noncoding associations.

In the case of several of the new findings (for example, RELA, TAGAP), the coding variation here provides concrete evidence for genes previously implicated only indirectly by independent noncoding GWAS associations. These findings identify the likely gene underlying these associations and build allelic series of natural perturbations at these genes. Moreover, IL10RA and RELA are known to harbor mutations causing rare, Mendelian, inflammatory gastrointestinal disorders, and this study extends the phenotypic spectrum resulting from perturbing genetic variation to more complex forms of CD.

From a functional perspective, the novel genes identified in the current study reiterate the central roles of innate and adaptive immune cells as well as autophagy in CD pathogenesis. Moreover, the involvement of PDLIM5, SDF2L1, HGFAC, PAF-R and CCR7 pathways, in addition to the previously reported causal variant in SMAD3 (ref. 5), highlights the emerging role of mesenchymal cells (MCs) in the development and maintenance of intestinal inflammation (Fig. 2; ref. 18).

Also, while previous studies have demonstrated the disruption of MC biology in IBD, the current findings of coding variants in these genes demonstrate that these cells and functions causally contribute to disease susceptibility. Furthermore, the association of these pathways with CD pathogenesis provides an additional rationale for developing therapeutic modalities that can restore balance to the mesenchymal niche, as it is believed that genetic evidence for a drug target has a measurable impact on drug development (refs. 43, 44).

We expect that, in the next year, expanded sequencing efforts underway in ulcerative colitis will come to completion, enabling a more comprehensive survey of low-frequency and rare variation in ulcerative colitis, and IBD in general. Integrated with a much larger GWAS spearheaded in parallel by the International IBD Genetics Consortium (IIBDGC), we expect a substantial number of conclusively linked genes and informative allelic series to emerge.