ORCID Profile
0000-0002-5771-2290
Current Organisations
Garvan Institute of Medical Research, Murdoch Children's Research Institute
Publisher: Cold Spring Harbor Laboratory
Date: 09-09-2016
DOI: 10.1101/074450
Abstract: Expression quantitative trait locus (eQTL) mapping provides a powerful means to identify functional variants influencing gene expression and disease pathogenesis. We report the identification of cis-eQTLs from 7,051 post-mortem samples representing 44 tissues and 449 individuals as part of the Genotype-Tissue Expression (GTEx) project. We find a cis-eQTL for 88% of all annotated protein-coding genes, with one-third having multiple independent effects. We identify numerous tissue-specific cis-eQTLs, highlighting the unique functional impact of regulatory variation in diverse tissues. By integrating large-scale functional genomics data and state-of-the-art fine-mapping algorithms, we identify multiple features predictive of tissue-specific and shared regulatory effects. We improve estimates of cis-eQTL sharing and effect sizes using allele specific expression across tissues. Finally, we demonstrate the utility of this large compendium of cis-eQTLs for understanding the tissue-specific etiology of complex traits, including coronary artery disease. The GTEx project provides an exceptional resource that has improved our understanding of gene regulation across tissues and the role of regulatory variation in human genetic diseases.
Publisher: American Association for the Advancement of Science (AAAS)
Date: 08-05-2015
Abstract: Human genomes show extensive genetic variation across individuals, but we have only just started documenting the effects of this variation on the regulation of gene expression. Furthermore, only a few tissues have been examined per genetic variant. In order to examine how genetic expression varies among tissues within individuals, the Genotype-Tissue Expression (GTEx) Consortium collected 1641 postmortem samples covering 54 body sites from 175 individuals. They identified quantitative genetic traits that affect gene expression and determined which of these exhibit tissue-specific expression patterns. Melé et al. measured how transcription varies among tissues, and Rivas et al. looked at how truncated protein variants affect expression across tissues. Science, this issue p. 648, p. 660, p. 666; see also p. 640
Publisher: Elsevier BV
Date: 2018
DOI: 10.1016/J.NMD.2017.09.017
Abstract: Recessive mutations in MEGF10 (multiple epidermal growth factor 10) have been reported in a severe early onset disorder named Early Myopathy, Areflexia, Respiratory Distress and Dysphagia, and a milder form with cores in the muscle biopsy and a possible genotype-phenotype correlation determining the clinical presentation has been suggested. We undertook exome sequencing in a 66 year old male with a 20 year history of progressive proximal and distal weakness of upper and lower limbs, facial weakness and dysphagia, who developed respiratory failure requiring ventilation while still ambulant in his 50s. Muscle biopsy demonstrated myopathic changes with aggregation of myofibrillar proteins. Mutations in MEGF10 were identified: a novel essential splice site (c.1426+1G>T) and a novel missense variant (c.352T>C, p.(Cys118Arg)). We performed a detailed review of all reported MEGF10 cases (n = 20), and confirmed the presence of a genotype-phenotype correlation, namely that with ≥1 null mutation onset of respiratory dysfunction occurs in the first year of life, whereas with 2 missense mutations, respiratory dysfunction occurs at 10 years old or much later, as in the patient reported here. Our findings expand the phenotype of MEGF10 mutations to include onset in the 5th decade, and discuss the spectrum of MEGF10 related disease.
Publisher: Springer Science and Business Media LLC
Date: 31-10-2012
DOI: 10.1038/NATURE11632
Publisher: Springer Science and Business Media LLC
Date: 12-10-2017
DOI: 10.1038/NATURE24277
Abstract: Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.
Publisher: Springer Science and Business Media LLC
Date: 29-06-2017
DOI: 10.1038/S41598-017-03054-8
Abstract: Obesity is a genetically heterogeneous disorder. Using targeted and whole-exome sequencing, we studied 32 human and 87 rodent obesity genes in 2,548 severely obese children and 1,117 controls. We identified 52 variants contributing to obesity in 2% of cases including multiple novel variants in GNAS, which were sometimes found with accelerated growth rather than short stature as described previously. Nominally significant associations were found for rare functional variants in BBS1, BBS9, GNAS, MKKS, CLOCK and ANGPTL6. The p.S284X variant in ANGPTL6 drives the association signal (rs201622589, MAF~0.1%, odds ratio = 10.13, p-value = 0.042) and results in complete loss of secretion in cells. Further analysis including additional case-control studies and population controls (N = 260,642) did not support association of this variant with obesity (odds ratio = 2.34, p-value = 2.59 × 10⁻³), highlighting the challenges of testing rare variant associations and the need for very large sample sizes. Further validation in cohorts with severe obesity and engineering the variants in model organisms will be needed to explore whether human variants in ANGPTL6 and other genes that lead to obesity when deleted in mice, do contribute to obesity. Such studies may yield druggable targets for weight loss therapies.
Publisher: Public Library of Science (PLoS)
Date: 24-05-2018
Publisher: Springer Science and Business Media LLC
Date: 19-10-2015
DOI: 10.1038/SREP15145
Abstract: Aging is one of the most important biological processes and is a known risk factor for many age-related diseases in humans. Studying age-related transcriptomic changes in tissues across the whole body can provide valuable information for a holistic understanding of this fundamental process. In this work, we catalogue age-related gene expression changes in nine tissues from nearly two hundred individuals collected by the Genotype-Tissue Expression (GTEx) project. In general, we find the aging gene expression signatures are very tissue specific. However, enrichment for some well-known aging components such as mitochondria biology is observed in many tissues. Different levels of cross-tissue synchronization of age-related gene expression changes are observed and some essential tissues (e.g., heart and lung) show much stronger “co-aging” than other tissues based on a principal component analysis. The aging gene signatures and complex disease genes show a complex overlapping pattern, and only in some cases do we see that they significantly overlap in the tissues affected by the corresponding diseases. In summary, our analyses provide novel insights into the co-regulation of age-related gene expression in multiple tissues and also present a tissue-specific view of the link between aging and age-related diseases.
Publisher: Springer Science and Business Media LLC
Date: 14-09-2016
DOI: 10.1038/NATURE19356
Publisher: Springer Science and Business Media LLC
Date: 09-2014
DOI: 10.1038/NG.3050
Publisher: Cold Spring Harbor Laboratory
Date: 06-2018
Abstract: Variation in RNA splicing (i.e., alternative splicing) plays an important role in many diseases. Variants near 5′ and 3′ splice sites often affect splicing, but the effects of these variants on splicing and disease have not been fully characterized beyond the two “essential” splice nucleotides flanking each exon. Here we provide quantitative measurements of tolerance to mutational disruptions by position and reference allele–alternative allele combinations. We show that certain reference alleles are particularly sensitive to mutations, regardless of the alternative alleles into which they are mutated. Using public RNA-seq data, we demonstrate that individuals carrying such variants have significantly lower levels of the correctly spliced transcript, compared to individuals without them, and confirm that these specific substitutions are highly enriched for known Mendelian mutations. Our results propose a more refined definition of the “splice region” and offer a new way to prioritize and provide functional interpretation of variants identified in diagnostic sequencing and association studies.
Publisher: Elsevier BV
Date: 07-2013
Publisher: Springer Science and Business Media LLC
Date: 16-05-2018
DOI: 10.1038/S41467-018-04332-3
Abstract: Neuromyelitis optica (NMO) is a rare autoimmune disease that affects the optic nerve and spinal cord. Most NMO patients ( 70%) are seropositive for circulating autoantibodies against aquaporin 4 (NMO-IgG+). Here, we meta-analyze whole-genome sequences from 86 NMO cases and 460 controls with genome-wide SNP array from 129 NMO cases and 784 controls to test for association with SNPs and copy number variation (total N = 215 NMO cases, 1244 controls). We identify two independent signals in the major histocompatibility complex (MHC) region associated with NMO-IgG+, one of which may be explained by structural variation in the complement component 4 genes. Mendelian Randomization analysis reveals a significant causal effect of known systemic lupus erythematosus (SLE), but not multiple sclerosis (MS), risk variants in NMO-IgG+. Our results suggest that genetic variants in the MHC region contribute to the etiology of NMO-IgG+ and that NMO-IgG+ is genetically more similar to SLE than MS.
Publisher: Elsevier BV
Date: 04-2019
Publisher: Oxford University Press (OUP)
Date: 06-11-2009
Publisher: American Association for the Advancement of Science (AAAS)
Date: 11-09-2020
Abstract: Telomere length within an individual varies in a correlated manner across most tissues.
Publisher: Springer Science and Business Media LLC
Date: 06-10-2016
DOI: 10.1038/JHG.2016.116
Publisher: Elsevier BV
Date: 09-2003
DOI: 10.1086/377590
Publisher: Springer Science and Business Media LLC
Date: 15-03-2017
DOI: 10.1038/IJO.2017.72
Publisher: Cold Spring Harbor Laboratory
Date: 27-03-2022
DOI: 10.1101/2022.03.24.485707
Abstract: Large-scale next-generation sequencing datasets have been transformative for informing clinical variant interpretation and as reference panels for statistical and population genetic efforts. While such resources are often treated as ground truth, we find that in widely used reference datasets such as the Genome Aggregation Database (gnomAD), some variants pass gold standard filters yet are systematically different in their genotype calls across genotype discovery approaches. The inclusion of such discordant sites in study designs involving multiple genotype discovery strategies could bias results and lead to false-positive hits in association studies due to technological artifacts rather than a true relationship to the phenotype. Here, we describe this phenomenon of discordant genotype calls across genotype discovery approaches, characterize the error mode of wrong calls, provide a blacklist of discordant sites identified in gnomAD that should be treated with caution in analyses, and present a metric and machine learning classifier trained on gnomAD data to identify likely discordant variants in other datasets. We find that different genotype discovery approaches have different sets of variants at which this problem occurs but that there are characteristic variant features that can be used to predict discordant behavior. Discordant sites are largely shared across ancestry groups, though different populations are powered for discovery of different variants. We find that the most common error mode is that of a variant being heterozygous for one approach and homozygous for the other, with heterozygous in the genomes and homozygous reference in the exomes making up the majority of miscalls.
Publisher: American Diabetes Association
Date: 24-08-2017
DOI: 10.2337/DB17-0187
Abstract: Type 2 diabetes (T2D) affects more than 415 million people worldwide, and its costs to the health care system continue to rise. To identify common or rare genetic variation with potential therapeutic implications for T2D, we analyzed and replicated genome-wide protein coding variation in a total of 8,227 individuals with T2D and 12,966 individuals without T2D of Latino descent. We identified a novel genetic variant in the IGF2 gene associated with ∼20% reduced risk for T2D. This variant, which has an allele frequency of 17% in the Mexican population but is rare in Europe, prevents splicing between IGF2 exons 1 and 2. We show in vitro and in human liver and adipose tissue that the variant is associated with a specific, allele-dosage–dependent reduction in the expression of IGF2 isoform 2. In individuals who do not carry the protective allele, expression of IGF2 isoform 2 in adipose is positively correlated with both incidence of T2D and increased plasma glycated hemoglobin in individuals without T2D, providing support that the protective effects are mediated by reductions in IGF2 isoform 2. Broad phenotypic examination of carriers of the protective variant revealed no association with other disease states or impaired reproductive health. These findings suggest that reducing IGF2 isoform 2 expression in relevant tissues has potential as a new therapeutic strategy for T2D, even beyond the Latin American population, with no major adverse effects on health or reproduction.
Publisher: Hindawi Limited
Date: 21-03-2022
DOI: 10.1002/HUMU.24366
Abstract: Exome and genome sequencing have become the tools of choice for rare disease diagnosis, leading to large amounts of data available for analyses. To identify causal variants in these datasets, powerful filtering and decision support tools that can be efficiently used by clinicians and researchers are required. To address this need, we developed seqr - an open-source, web-based tool for family-based monogenic disease analysis that allows researchers to work collaboratively to search and annotate genomic callsets. To date, seqr is being used in several research pipelines and one clinical diagnostic lab. In our own experience through the Broad Institute Center for Mendelian Genomics, seqr has enabled analyses of over 10,000 families, supporting the diagnosis of more than 3,800 individuals with rare disease and discovery of over 300 novel disease genes. Here, we describe a framework for genomic analysis in rare disease that leverages seqr's capabilities for variant filtration, annotation, and causal variant identification, as well as support for research collaboration and data sharing. The seqr platform is available as open source software, allowing low-cost participation in rare disease research, and a community effort to support diagnosis and gene discovery in rare disease.
Publisher: Oxford University Press (OUP)
Date: 28-11-2016
DOI: 10.1093/NAR/GKW971
Publisher: Springer Science and Business Media LLC
Date: 10-06-2016
Publisher: Ovid Technologies (Wolters Kluwer Health)
Date: 11-2015
Publisher: Springer Science and Business Media LLC
Date: 12-06-2011
DOI: 10.1038/NG.862
Publisher: American Society for Clinical Investigation
Date: 21-03-2019
Publisher: Springer Science and Business Media LLC
Date: 12-10-2017
DOI: 10.1038/NATURE24265
Abstract: X chromosome inactivation (XCI) silences transcription from one of the two X chromosomes in female mammalian cells to balance expression dosage between XX females and XY males. XCI is, however, incomplete in humans: up to one-third of X-chromosomal genes are expressed from both the active and inactive X chromosomes (Xa and Xi, respectively) in female cells, with the degree of ‘escape’ from inactivation varying between genes and individuals [1,2]. The extent to which XCI is shared between cells and tissues remains poorly characterized [3,4], as does the degree to which incomplete XCI manifests as detectable sex differences in gene expression [5] and phenotypic traits [6]. Here we describe a systematic survey of XCI, integrating over 5,500 transcriptomes from 449 individuals spanning 29 tissues from GTEx (v6p release) and 940 single-cell transcriptomes, combined with genomic sequence data. We show that XCI at 683 X-chromosomal genes is generally uniform across human tissues, but identify examples of heterogeneity between tissues, individuals and cells. We show that incomplete XCI affects at least 23% of X-chromosomal genes, identify seven genes that escape XCI with support from multiple lines of evidence and demonstrate that escape from XCI results in sex biases in gene expression, establishing incomplete XCI as a mechanism that is likely to introduce phenotypic diversity [6,7]. Overall, this updated catalogue of XCI across human tissues helps to increase our understanding of the extent and impact of the incompleteness in the maintenance of XCI.
Publisher: Springer Science and Business Media LLC
Date: 08-01-2020
Publisher: Springer Science and Business Media LLC
Date: 12-10-2017
DOI: 10.1038/NATURE24267
Abstract: Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk [1,2,3,4]. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants [1,5]. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles [1,6,7], but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues [8,9,10,11], but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release [12]. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.
Publisher: Oxford University Press (OUP)
Date: 04-01-2008
DOI: 10.1093/HMG/DDM380
Abstract: A common nonsense polymorphism (R577X) in the ACTN3 gene results in complete deficiency of the fast skeletal muscle fiber protein alpha-actinin-3 in an estimated one billion humans worldwide. The XX null genotype is under-represented in elite sprint athletes, associated with reduced muscle strength and sprint performance in non-athletes, and is over-represented in endurance athletes, suggesting that alpha-actinin-3 deficiency increases muscle endurance at the cost of power generation. Here we report that muscle from Actn3 knockout mice displays reduced force generation, consistent with results from human association studies. Detailed analysis of knockout mouse muscle reveals reduced fast fiber diameter, increased activity of multiple enzymes in the aerobic metabolic pathway, altered contractile properties, and enhanced recovery from fatigue, suggesting a shift in the properties of fast fibers towards those characteristic of slow fibers. These findings provide the first mechanistic explanation for the reported associations between R577X and human athletic performance and muscle function.
Publisher: Springer Science and Business Media LLC
Date: 03-02-2021
DOI: 10.1038/S41586-020-03174-8
Abstract: A Correction to this paper has been published: 10.1038/s41586-020-03174-8.
Publisher: Cold Spring Harbor Laboratory
Date: 04-07-2017
DOI: 10.1101/159228
Abstract: Short tandem repeat (STR) expansions have been identified as the causal DNA mutation in dozens of Mendelian diseases. Historically, pathogenic STR expansions could only be detected by single locus techniques, such as PCR and electrophoresis. The ability to use short read sequencing data to screen for STR expansions has the potential to reduce both the time and cost to reaching diagnosis and enable the discovery of new causal STR loci. Most existing tools detect STR variation within the read length, and so are unable to detect the majority of pathogenic expansions. Those tools that can detect large expansions are limited to a set of known disease loci and as yet no new disease causing STR expansions have been identified with high-throughput sequencing technologies. Here we address this by presenting STRetch, a new genome-wide method to detect STR expansions at all loci across the human genome. We demonstrate the use of STRetch for detecting pathogenic STR expansions in short-read whole genome sequencing data with a very low false discovery rate. We further demonstrate the application of STRetch to solve cases of patients with undiagnosed disease and apply STRetch to the analysis of 97 whole genomes to reveal variation at STR loci. STRetch assesses expansions at all STR loci in the genome and allows screening for novel disease-causing STRs. STRetch is open source software, available from github.com/Oshlack/STRetch.
Publisher: Cold Spring Harbor Laboratory
Date: 25-09-2016
DOI: 10.1101/077180
Abstract: As part of a broader collaborative network of exome sequencing studies, we developed a jointly called data set of 5,685 Ashkenazi Jewish exomes. We make publicly available a resource of site and allele frequencies, which should serve as a reference for medical genetics in the Ashkenazim. We estimate that 30% of protein-coding alleles present in the Ashkenazi Jewish population at frequencies greater than 0.2% are significantly more frequent (mean 7.6-fold) than their maximum frequency observed in other reference populations. Arising via a well-described founder effect, this catalog of enriched alleles can contribute to differences in genetic risk and overall prevalence of diseases between populations. As validation we document 151 AJ enriched protein-altering alleles that overlap with “pathogenic” ClinVar alleles, including those that account for 10-100 fold differences in prevalence between AJ and non-AJ populations of some rare diseases including Gaucher disease (GBA, p.Asn409Ser, 8-fold enrichment), Canavan disease (ASPA, p.Glu285Ala, 12-fold enrichment) and Tay-Sachs disease (HEXA, c.1421+1G>C, 27-fold enrichment; p.Tyr427IlefsTer5, 12-fold enrichment). We next sought to use this catalog, of well-established relevance to Mendelian disease, to explore Crohn’s disease (CD), a common disease with an estimated two to four-fold excess prevalence in AJ. We specifically evaluate whether strong acting rare alleles, enriched by the same founder-effect, contribute excess genetic risk to Crohn’s disease in AJ, and find that ten rare genetic risk factors in NOD2 and LRRK2 are strongly enriched in AJ and, including several novel contributing alleles, show evidence of association to CD. Independently, we find that genomewide common variant risk defined by GWAS shows a strong difference between AJ and non-AJ European control population samples (0.97 s.d. higher, p −16 ). Taken together, the results suggest coordinated selection in AJ population for higher CD risk alleles in general. The results and approach illustrate the value of exome sequencing data in case-control studies along with reference data sets like ExAC to pinpoint genetic variation that contributes to variable disease predisposition across populations.
Publisher: Springer Science and Business Media LLC
Date: 17-11-2017
Publisher: Cold Spring Harbor Laboratory
Date: 04-08-2020
DOI: 10.1101/2020.08.03.235358
Abstract: Two intriguing forms of genome structural variation (SV) – dispersed duplications, and de novo rearrangements of complex, multi-allelic loci – have long escaped genomic analysis. We describe a new way to find and characterize such variation by utilizing identity-by-descent (IBD) relationships between siblings together with high-precision measurements of segmental copy number. Analyzing whole-genome sequence data from 706 families, we find hundreds of “IBD-discordant” (IBDD) CNVs: loci at which siblings’ CNV measurements and IBD states are mathematically inconsistent. We found that commonly-IBDD CNVs identify dispersed duplications; we mapped 95 of these common dispersed duplications to their true genomic locations through family-based linkage and population linkage disequilibrium (LD), and found several to be in strong LD with genome-wide association (GWAS) signals for common diseases or gene expression variation at their revealed genomic locations. Other CNVs that were IBDD in a single family appear to involve de novo mutations in complex and multi-allelic loci; we identified 26 de novo structural mutations that had not been previously detected in earlier analyses of the same families by diverse SV analysis methods. These included a de novo mutation of the amylase gene locus and multiple de novo mutations at chromosome 15q14. Combining these complex mutations with more-conventional CNVs, we estimate that segmental mutations larger than 1kb arise in about one per 22 human meioses. These methods are complementary to previous techniques in that they interrogate genomic regions that are home to segmental duplication, high CNV allele frequencies, and multi-allelic CNVs.
Copy number variation is an important form of genetic variation in which individuals differ in the number of copies of segments of their genomes. Certain aspects of copy number variation have traditionally been difficult to study using short-read sequencing data. For example, standard analyses often cannot tell whether the duplicated copies of a segment are located near the original copy or are dispersed to other regions of the genome. Another aspect of copy number variation that has been difficult to study is the detection of mutations in the copy number of DNA segments passed down from parents to their children, particularly when the mutations affect genome segments which already display common copy number variation in the population. We develop an analytical approach to solving these problems when sequencing data is available for all members of families with at least two children. This method is based on determining the number of parental haplotypes the two siblings share at each location in their genome, and using that information to determine the possible inheritance patterns that might explain the copy numbers we observe in each family member. We show that dispersed duplications and mutations can be identified by looking for copy number variants that do not follow these expected inheritance patterns. We use this approach to determine the location of 95 common duplications which are dispersed to distant regions of the genome, and demonstrate that these duplications are linked to genetic variants that affect disease risk or gene expression levels. We also identify a set of copy number mutations not detected by previous analyses of sequencing data from a large cohort of families, and show that repetitive and complex regions of the genome undergo frequent mutations in copy number.
Publisher: Springer Science and Business Media LLC
Date: 07-06-2021
DOI: 10.1038/S41467-021-23134-8
Abstract: The large majority of variants identified by GWAS are non-coding, motivating detailed characterization of the function of non-coding variants. Experimental methods to assess variants’ effect on gene expressions in native chromatin context via direct perturbation are low-throughput. Existing high-throughput computational predictors thus have lacked large gold standard sets of regulatory variants for training and validation. Here, we leverage a set of 14,807 putative causal eQTLs in humans obtained through statistical fine-mapping, and we use 6121 features to directly train a predictor of whether a variant modifies nearby gene expression. We call the resulting prediction the expression modifier score (EMS). We validate EMS by comparing its ability to prioritize functional variants with other major scores. We then use EMS as a prior for statistical fine-mapping of eQTLs to identify an additional 20,913 putatively causal eQTLs, and we incorporate EMS into co-localization analysis to identify 310 additional candidate genes across UK Biobank phenotypes.
Publisher: Springer Science and Business Media LLC
Date: 03-2021
Publisher: Cold Spring Harbor Laboratory
Date: 12-05-2016
DOI: 10.1101/052886
Abstract: Recent research has uncovered an important role for de novo variation in neurodevelopmental disorders. Using aggregated data from 9246 families with autism spectrum disorder, intellectual disability, or developmental delay, we show ~1/3 of de novo variants are independently observed as standing variation in the Exome Aggregation Consortium’s cohort of 60,706 adults, and these de novo variants do not contribute to neurodevelopmental risk. We further use a loss-of-function (LoF)-intolerance metric, pLI, to identify a subset of LoF-intolerant genes that contain the observed signal of associated de novo protein truncating variants (PTVs) in neurodevelopmental disorders. LoF-intolerant genes also carry a modest excess of inherited PTVs, though the strongest de novo impacted genes contribute little to this, suggesting the excess of inherited risk resides in lower-penetrant genes. These findings illustrate the importance of population-based reference cohorts for the interpretation of candidate pathogenic variants, even for analyses of complex diseases and de novo variation.
Publisher: Springer Science and Business Media LLC
Date: 25-12-2013
DOI: 10.1038/NATURE12828
Publisher: American Medical Association (AMA)
Date: 11-06-2014
Publisher: Oxford University Press (OUP)
Date: 02-05-2011
DOI: 10.1093/HMG/DDR196
Abstract: Sarcomeric α-actinins (α-actinin-2 and -3) are a major component of the Z-disk in skeletal muscle, where they crosslink actin and other structural proteins to maintain an ordered myofibrillar array. Homozygosity for the common null polymorphism (R577X) in ACTN3 results in the absence of fast fiber-specific α-actinin-3 in ∼20% of the general population. α-Actinin-3 deficiency is associated with decreased force generation and is detrimental to sprint and power performance in elite athletes, suggesting that α-actinin-3 is necessary for optimal forceful repetitive muscle contractions. Since Z-disks are the structures most vulnerable to eccentric damage, we sought to examine the effects of α-actinin-3 deficiency on sarcomeric integrity. Actn3 knockout mouse muscle showed significantly increased force deficits following eccentric contraction at 30% stretch, suggesting that α-actinin-3 deficiency results in an increased susceptibility to muscle damage at the extremes of muscle performance. Microarray analyses demonstrated an increase in muscle remodeling genes, which we confirmed at the protein level. The loss of α-actinin-3 and up-regulation of α-actinin-2 resulted in no significant changes to the total pool of sarcomeric α-actinins, suggesting that alterations in fast fiber Z-disk properties may be related to differences in functional protein interactions between α-actinin-2 and α-actinin-3. In support of this, we demonstrated that the Z-disk proteins, ZASP, titin and vinculin preferentially bind to α-actinin-2. Thus, the loss of α-actinin-3 changes the overall protein composition of fast fiber Z-disks and alters their elastic properties, providing a mechanistic explanation for the loss of force generation and increased susceptibility to eccentric damage in α-actinin-3-deficient individuals.
Publisher: BMJ
Date: 19-06-2018
Publisher: Springer Science and Business Media LLC
Date: 22-01-2021
Publisher: Cold Spring Harbor Laboratory
Date: 12-06-2017
DOI: 10.1101/148353
Abstract: Given increasing numbers of patients who are undergoing exome or genome sequencing, it is critical to establish tools and methods to interpret the impact of genetic variation. While the ability to predict deleteriousness for any given variant is limited, missense variants remain a particularly challenging class of variation to interpret, since they can have drastically different effects depending on both the precise location and specific amino acid substitution of the variant. In order to better evaluate missense variation, we leveraged the exome sequencing data of 60,706 individuals from the Exome Aggregation Consortium (ExAC) dataset to identify sub-genic regions that are depleted of missense variation. We further used this depletion as part of a novel missense deleteriousness metric named MPC. We applied MPC to de novo missense variants and identified a category of de novo missense variants with the same impact on neurodevelopmental disorders as truncating mutations in intolerant genes, supporting the value of incorporating regional missense constraint in variant interpretation.
Publisher: Elsevier BV
Date: 05-2019
Publisher: Cold Spring Harbor Laboratory
Date: 18-01-2019
DOI: 10.1101/524256
Abstract: Primary Hyperoxaluria Type 1 (PH1) is a rare autosomal recessive metabolic disorder of oxalate metabolism leading to kidney failure as well as multi-organ damage. Overproduction of oxalate occurs in the liver due to an inherited genetic defect in the enzyme alanine-glyoxylate aminotransferase (AGXT), causing pathology due to the insolubility of calcium oxalate crystals in body fluids. The main current therapy is dual liver-kidney transplant, which incurs high morbidity and has poor availability in some health systems where PH1 is more prevalent. One approach currently in active clinical investigation targets HAO1 (hydroxyacid oxidase 1), encoding glycolate oxidase, to reduce substrate levels for oxalate production. To inform drug development, we sought individuals with reduced HAO1 function due to naturally occurring genetic variation. Analysis of loss of function variants in 141,456 sequenced individuals suggested individuals with complete HAO1 knockout would only be observed in 1 in 30 million outbred people. However, in a large sequencing and health records program (Genes & Health), in populations with substantial autozygosity, we identified a healthy adult individual predicted to have complete knockout of HAO1 due to an ultra rare homozygous frameshift variant (rs1186715161, ENSP00000368066.3:p.Leu333SerfsTer4). Primary care and hospital health records confirmed no apparently related clinical phenotype. At recall, urine and plasma oxalate levels were normal, however plasma glycolate levels (171 nmol/mL) were 12 times the upper limit of normal in healthy, reference individuals (mean+2sd=14 nmol/mL, n=67) while her urinary glycolate levels were 6 times the upper limit of normal. Comparison with preclinical and phase 1 clinical trial data of an RNAi therapeutic targeting HAO1 (lumasiran) suggests the individual likely retains % residual glycolate oxidase activity.
These results provide important data to support the safety of HAO1 inhibition as a potential chronic therapy for a devastating metabolic disease (PH1). We also suggest that the effect of glycolate oxidase suppression in any potential other roles in humans beyond glycolate oxidation does not lead to clinical phenotypes, at least in this specific individual. This demonstrates the value of studying the lifelong complete knockdown of a target protein in a living human to aid development of a potential therapeutic, both in de-risking the approach and providing potential hypotheses to optimize its development. Furthermore, therapy for PH1 is likely to be required lifelong, in contrast to data from chronicity studies in non-human species or relatively short-term therapeutic studies in people. Our approach demonstrates the potential for improved drug discovery through unlocking relevant evidence hiding in the diversity of human genetic variation.
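The contrast the abstract draws between outbred populations and populations with substantial autozygosity can be illustrated with a standard population-genetics approximation. The sketch below is not the authors' method: it assumes Hardy-Weinberg proportions, treats all loss-of-function alleles in the gene as a single allele class, and uses a hypothetical inbreeding coefficient of 1/16 (the textbook value for offspring of first cousins). Only the "1 in 30 million" figure and the two glycolate values come from the abstract.

```python
import math


def knockout_frequency(q: float, f: float = 0.0) -> float:
    """Expected frequency of biallelic loss-of-function genotypes.

    q: cumulative loss-of-function allele frequency for the gene
    f: inbreeding coefficient (0.0 for an outbred population)
    """
    return q * q + f * q * (1.0 - q)


# Cumulative allele frequency implied by "1 in 30 million" if f = 0 (assumption).
q_implied = math.sqrt(1.0 / 30_000_000)

# Expected knockout frequency without and with autozygosity.
outbred = knockout_frequency(q_implied)
autozygous = knockout_frequency(q_implied, f=1 / 16)  # hypothetical f

# Fold elevation of plasma glycolate over the stated upper limit of normal.
fold_glycolate = 171 / 14

print(f"implied cumulative allele frequency: {q_implied:.2e}")
print(f"enrichment of knockouts with f = 1/16: {autozygous / outbred:.0f}x")
print(f"plasma glycolate fold over upper limit: {fold_glycolate:.1f}")
```

Under these assumptions, even a modest inbreeding coefficient raises the expected frequency of complete knockouts by more than two orders of magnitude, which is consistent with the abstract's rationale for searching in autozygous populations.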
Publisher: Springer Science and Business Media LLC
Date: 14-12-2016
DOI: 10.1038/NATURE16068
Abstract: Thousands of transiting exoplanets have been discovered, but spectral analysis of their atmospheres has so far been dominated by a small number of exoplanets and data spanning relatively narrow wavelength ranges (such as 1.1-1.7 micrometres). Recent studies show that some hot-Jupiter exoplanets have much weaker water absorption features in their near-infrared spectra than predicted. The low amplitude of water signatures could be explained by very low water abundances, which may be a sign that water was depleted in the protoplanetary disk at the planet's formation location, but it is unclear whether this level of depletion can actually occur. Alternatively, these weak signals could be the result of obscuration by clouds or hazes, as found in some optical spectra. Here we report results from a comparative study of ten hot Jupiters covering the wavelength range 0.3-5 micrometres, which allows us to resolve both the optical scattering and infrared molecular absorption spectroscopically. Our results reveal a diverse group of hot Jupiters that exhibit a continuum from clear to cloudy atmospheres. We find that the difference between the planetary radius measured at optical and infrared wavelengths is an effective metric for distinguishing different atmosphere types. The difference correlates with the spectral strength of water, so that strong water absorption lines are seen in clear-atmosphere planets and the weakest features are associated with clouds and hazes. This result strongly suggests that primordial water depletion during formation is unlikely and that clouds and hazes are the cause of weaker spectral signatures.
Publisher: Springer Science and Business Media LLC
Date: 03-04-2017
DOI: 10.1038/NG.3831
Publisher: Springer Science and Business Media LLC
Date: 16-10-2018
DOI: 10.1038/S41467-018-06540-3
Abstract: Phenome-wide association studies (PheWAS) have been proposed as a possible aid in drug development through elucidating mechanisms of action, identifying alternative indications, or predicting adverse drug events (ADEs). Here, we select 25 single nucleotide polymorphisms (SNPs) linked through genome-wide association studies (GWAS) to 19 candidate drug targets for common disease indications. We interrogate these SNPs by PheWAS in four large cohorts with extensive health information (23andMe, UK Biobank, FINRISK, CHOP) for association with 1683 binary endpoints in up to 697,815 individuals and conduct meta-analyses for 145 mapped disease endpoints. Our analyses replicate 75% of known GWAS associations (P < 0.05) and identify nine study-wide significant novel associations (of 71 with FDR < 0.1). We describe associations that may predict ADEs, e.g., acne, high cholesterol, gout, and gallstones with rs738409 (p.I148M) in PNPLA3 and asthma with rs1990760 (p.T946A) in IFIH1. Our results demonstrate PheWAS as a powerful addition to the toolkit for drug discovery.
Publisher: Springer Science and Business Media LLC
Date: 04-2014
DOI: 10.1038/NATURE13127
Publisher: Hindawi Limited
Date: 26-03-2015
DOI: 10.1002/HUMU.22768
Publisher: Cold Spring Harbor Laboratory
Date: 19-09-2016
DOI: 10.1101/073957
Abstract: X chromosome inactivation (XCI) silences the transcription from one of the two X chromosomes in mammalian female cells to balance expression dosage between XX females and XY males. XCI is, however, characteristically incomplete in humans: up to one third of X-chromosomal genes are expressed from both the active and inactive X chromosomes (Xa and Xi, respectively) in female cells, with the degree of “escape” from inactivation varying between genes and individuals 1,2 (Fig. 1). However, the extent to which XCI is shared between cells and tissues remains poorly characterized 3,4, as does the degree to which incomplete XCI manifests as detectable sex differences in gene expression 5 and phenotypic traits 6. Here we report a systematic survey of XCI using a combination of over 5,500 transcriptomes from 449 individuals spanning 29 tissues, and 940 single-cell transcriptomes, integrated with genomic sequence data (Fig. 1). By combining information across these data types we show that XCI at the 683 X-chromosomal genes assessed is generally uniform across human tissues, but identify examples of heterogeneity between tissues, individuals and cells. We show that incomplete XCI affects at least 23% of X-chromosomal genes, identify seven new escape genes supported by multiple lines of evidence, and demonstrate that escape from XCI results in sex biases in gene expression, thus establishing incomplete XCI as a likely mechanism introducing phenotypic diversity 6,7. Overall, this updated catalogue of XCI across human tissues informs our understanding of the extent and impact of the incompleteness in the maintenance of XCI.
Publisher: Cold Spring Harbor Laboratory
Date: 09-06-2017
DOI: 10.1101/148247
Abstract: There is a limited understanding about the impact of rare protein truncating variants across multiple phenotypes. We explore the impact of this class of variants on 13 quantitative traits and 10 diseases using whole-exome sequencing data from 100,296 individuals. Protein truncating variants in genes intolerant to this class of mutations increased risk of autism, schizophrenia, bipolar disorder, intellectual disability and ADHD. In individuals without these disorders, there was an association with shorter height, lower education, increased hospitalization and reduced age. Gene sets implicated from GWAS did not show a significant protein truncating variant burden beyond what is captured by established Mendelian genes. In conclusion, we provide the most thorough investigation to date of the impact of rare deleterious coding variants on complex traits, suggesting widespread pleiotropic risk. PTV = Protein Truncating Variants; PI = Protein Truncating Intolerant; PI-PTV = Protein Truncating Variant in genes that are Intolerant to Protein Truncating Variants.
Publisher: Elsevier BV
Date: 07-2012
DOI: 10.1016/J.TIG.2012.05.001
Abstract: Cheap, high-throughput approaches to generating biological data are transforming biology into a data-driven science and promise to similarly transform medicine. However, the road to genomic medicine is paved with challenges and uncertainty.
Publisher: Elsevier BV
Date: 10-2011
DOI: 10.1016/J.BONE.2011.07.009
Abstract: Bone mineral density (BMD) is a complex trait that is the single best predictor of the risk of osteoporotic fractures. Candidate gene and genome-wide association studies have identified genetic variations in approximately 30 genetic loci associated with BMD variation in humans. α-Actinin-3 (ACTN3) is highly expressed in fast skeletal muscle fibres. There is a common null-polymorphism R577X in human ACTN3 that results in complete deficiency of the α-actinin-3 protein in approximately 20% of Eurasians. Absence of α-actinin-3 does not cause any disease phenotypes in muscle because of compensation by α-actinin-2. However, α-actinin-3 deficiency has been shown to be detrimental to athletic sprint/power performance. In this report we reveal additional functions for α-actinin-3 in bone. α-Actinin-3 but not α-actinin-2 is expressed in osteoblasts. The Actn3(-/-) mouse displays significantly reduced bone mass, with reduced cortical bone volume (-14%) and trabecular number (-61%) seen by microCT. Dynamic histomorphometry indicated this was due to a reduction in bone formation. In a cohort of postmenopausal Australian women, ACTN3 577XX genotype was associated with lower BMD in an additive genetic model, with the R577X genotype contributing 1.1% of the variance in BMD. Microarray analysis of cultured osteoprogenitors from Actn3(-/-) mice showed alterations in expression of several genes regulating bone mass and osteoblast/osteoclast activity, including Enpp1, Opg and Wnt7b. Our studies suggest that ACTN3 likely contributes to the regulation of bone mass through alterations in bone turnover. Given the high frequency of R577X in the general population, the potential role of ACTN3 R577X as a factor influencing variations in BMD in elderly humans warrants further study.
Publisher: Springer Science and Business Media LLC
Date: 25-04-2019
Publisher: Cold Spring Harbor Laboratory
Date: 07-12-2016
DOI: 10.1101/090720
Abstract: The interpretation of genetic variants identified during clinical sequencing has come to rely heavily on reference population databases such as the Exome Aggregation Consortium (ExAC). Genuinely pathogenic variants, particularly in genes associated with severe autosomal dominant conditions, are assumed to be absent or extremely rare in these databases. Clinical exome sequencing of a six-year-old female patient with seizures, global developmental delay, dysmorphic features and failure to thrive identified an ASXL1 variant that was previously reported as causative of Bohring-Opitz syndrome (BOS). Surprisingly, the variant was observed seven times in the ExAC database, presumably in individuals without BOS. Although the BOS phenotype matched the presentation of the patient, the presence of the variant in reference population databases introduced ambiguity in result interpretation. Interrogation of the literature revealed that acquired somatic mosaicism of ASXL1 variants (including known pathogenic variants) during hematopoietic clonal expansion may be concomitant with aging in healthy individuals. We examined all high quality ASXL1 predicted truncating variant calls in the ExAC database and determined the majority could be attributed to this phenomenon. Failure to consider somatic mosaicism may lead to the inaccurate assumption that conditions like Bohring-Opitz syndrome have reduced penetrance, or the misclassification of potentially pathogenic variants.
Publisher: Elsevier BV
Date: 12-2018
Publisher: Wiley
Date: 25-05-2016
DOI: 10.1002/ANA.24687
Abstract: To evaluate the diagnostic outcomes in a large cohort of congenital muscular dystrophy (CMD) patients using traditional and next generation sequencing (NGS) technologies. A total of 123 CMD patients were investigated using the traditional approaches of histology, immunohistochemical analysis of muscle biopsy, and candidate gene sequencing. Undiagnosed patients available for further testing were investigated using NGS. Muscle biopsy and immunohistochemical analysis found deficiencies of laminin α2, α-dystroglycan, or collagen VI in 50% of patients. Candidate gene sequencing and chromosomal microarray established a genetic diagnosis in 32% (39 of 123). Of 85 patients presenting in the past 20 years, 28 of 51 who lacked a confirmed genetic diagnosis (55%) consented to NGS studies, leading to confirmed diagnoses in a further 11 patients. Using the combination of approaches, a confirmed genetic diagnosis was achieved in 51% (43 of 85). The diagnoses within the cohort were heterogeneous. Forty-five of 59 probands with confirmed or probable diagnoses had variants in genes known to cause CMD (76%), and 11 of 59 (19%) had variants in genes associated with congenital myopathies, reflecting overlapping features of these conditions. One patient had a congenital myasthenic syndrome, and 2 had microdeletions. Within the cohort, 5 patients had variants in novel (PIGY and GMPPB) or recently published genes (GFPT1 and MICU1), and 7 had variants in TTN or RYR1, large genes that are technically difficult to Sanger sequence. These data support NGS as a first-line tool for genetic evaluation of patients with a clinical phenotype suggestive of CMD, with muscle biopsy reserved as a second-tier investigation. Ann Neurol 2016:101-111.
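The diagnostic yields quoted in this abstract can be checked directly from the stated counts. The short sketch below simply recomputes each percentage; the dictionary labels are paraphrases chosen for illustration, and only the numerators and denominators are taken from the abstract.

```python
def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# (numerator, denominator) pairs as reported in the abstract.
reported_counts = {
    "genetic diagnosis by candidate gene sequencing and microarray": (39, 123),
    "undiagnosed patients consenting to NGS": (28, 51),
    "confirmed diagnosis using combined approaches": (43, 85),
    "probands with variants in known CMD genes": (45, 59),
    "probands with variants in congenital myopathy genes": (11, 59),
}

yields = {label: pct(n, d) for label, (n, d) in reported_counts.items()}

for label, value in yields.items():
    n, d = reported_counts[label]
    print(f"{label}: {n}/{d} = {value}%")
```

Each recomputed value agrees with the percentage given in the text (32%, 55%, 51%, 76% and 19%).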
Publisher: Springer Science and Business Media LLC
Date: 03-02-2021
DOI: 10.1038/S41586-020-03175-7
Abstract: A Correction to this paper has been published: 10.1038/s41586-020-03175-7
Publisher: Elsevier BV
Date: 04-2011
DOI: 10.1016/J.EXGER.2010.11.006
Abstract: Deficiency of the fast-twitch muscle protein α-actinin-3 due to homozygosity for a nonsense polymorphism (R577X) in the ACTN3 gene is common in humans. α-Actinin-3 deficiency (XX) is associated with reduced muscle strength/power and enhanced endurance performance in elite athletes and in the general population. The association between R577X and loss in muscle mass and function (sarcopenia) has previously been investigated in a number of studies in elderly humans. The majority of studies report loss of ACTN3 genotype association with muscle traits in the elderly, however, there is some indication that the XX genotype may be associated with faster muscle function decline. To further explore these potential age-related effects and the underlying mechanisms, we examined the effect of α-actinin-3 deficiency in aging male and female Actn3 knockout (KO) mice (2, 6, 12, and 18 months). Our findings support previous reports of a diminished influence of ACTN3 genotype on muscle performance in the elderly: genotype differences in intrinsic exercise performance, fast muscle force generation and male muscle mass were lost in aged mice, but were maintained for other muscle function traits such as grip strength. The loss of genotype difference in exercise performance occurred despite the maintenance of some "slower" muscle characteristics in KO muscles, such as increased oxidative metabolism and greater force recovery after fatigue. Interestingly, muscle mass decline in aged 18 month old male KO mice was greater compared to wild-type controls (WT) (-12.2% in KO vs. -6.5% in WT). These results provide further support that α-actinin-3 deficient individuals may experience faster decline in muscle function with increasing age.
Publisher: Springer Science and Business Media LLC
Date: 21-08-2018
Publisher: American Association for the Advancement of Science (AAAS)
Date: 11-09-2020
Abstract: Sex differences in the human transcriptome are widespread and tissue specific, and they contribute to complex traits.
Publisher: American Association for the Advancement of Science (AAAS)
Date: 17-02-2012
Abstract: Identifying genes that give rise to diseases is one of the major goals of sequencing human genomes. However, putative loss-of-function genes, which are often some of the first identified targets of genome and exome sequencing, have often turned out to be sequencing errors rather than true genetic variants. In order to identify the true scope of loss-of-function genes within the human genome, MacArthur et al. (p. 823; see the Perspective by Quintana-Murci) extensively validated the genomes from the 1000 Genomes Project, as well as an additional European individual, and found that the average person has about 100 true loss-of-function alleles of which approximately 20 have two copies within an individual. Because many known disease-causing genes were identified in “normal” individuals, the process of clinical sequencing needs to reassess how to identify likely causative alleles.
Publisher: Oxford University Press (OUP)
Date: 20-08-2015
DOI: 10.1093/HMG/DDV331
Publisher: American Association for the Advancement of Science (AAAS)
Date: 11-09-2020
Abstract: Outliers in the human transcriptome reveal the functional effects of rare genetic variants.
Publisher: Springer Science and Business Media LLC
Date: 26-11-2018
Publisher: Oxford University Press (OUP)
Date: 12-02-2015
DOI: 10.1093/BRAIN/AWV013
Abstract: Dystroglycanopathies are a heterogeneous group of diseases with a broad phenotypic spectrum ranging from severe disorders with congenital muscle weakness, eye and brain structural abnormalities and intellectual delay to adult-onset limb-girdle muscular dystrophies without mental retardation. Most frequently the disease onset is congenital or during childhood. The exception is FKRP mutations, in which adult onset is a common presentation. Here we report eight patients from five non-consanguineous families where next generation sequencing identified mutations in the GMPPB gene. Six patients presented as an adult or adolescent-onset limb-girdle muscular dystrophy, one presented with isolated episodes of rhabdomyolysis, and one as a congenital muscular dystrophy. This report expands the phenotypic spectrum of GMPPB mutations to include limb-girdle muscular dystrophies with adult onset with or without intellectual disability, or isolated rhabdomyolysis.
Publisher: Cold Spring Harbor Laboratory
Date: 20-03-2019
Abstract: Large-scale population analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole-genome sequencing. However, these short-read approaches fail to give a complete picture of a genome. They struggle to identify structural events, cannot access repetitive regions, and fail to resolve the human genome into haplotypes. Here, we describe an approach that retains long range information while maintaining the advantages of short reads. Starting from ∼1 ng of high molecular weight DNA, we produce barcoded short-read libraries. Novel informatic approaches allow for the barcoded short reads to be associated with their original long molecules producing a novel data type known as “Linked-Reads”. This approach allows for simultaneous detection of small and large variants from a single library. In this manuscript, we show the advantages of Linked-Reads over standard short-read approaches for reference-based analysis. Linked-Reads allow mapping to 38 Mb of sequence not accessible to short reads, adding sequence in 423 difficult-to-sequence genes including disease-relevant genes STRC, SMN1, and SMN2. Both Linked-Read whole-genome and whole-exome sequencing identify complex structural variations, including balanced events and single exon deletions and duplications. Further, Linked-Reads extend the region of high-confidence calls by 68.9 Mb. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.
Publisher: Springer Science and Business Media LLC
Date: 26-09-2019
DOI: 10.1038/S41431-019-0519-X
Abstract: A distinct neurodevelopmental phenotype characterised mainly by mild motor and language delay and facial dysmorphism, caused by heterozygous de novo or dominant variants in the TLK2 gene, has recently been described. All cases reported carried either truncating variants located throughout the gene, or missense changes principally located at the C-terminal end of the protein, mostly resulting in haploinsufficiency of TLK2. Through whole exome sequencing, we identified a homozygous missense variant in TLK2 in a patient showing more severe symptoms than those previously described, including cerebellar vermis hypoplasia and West syndrome. Both parents are heterozygous for the variant and clinically unaffected, highlighting that recessive variants in TLK2 can also be disease causing and may act through a different pathomechanism.
Publisher: American Physiological Society
Date: 11-2018
DOI: 10.1152/PHYSIOLGENOMICS.00036.2018
Abstract: Next-generation sequencing is commonly used to screen for pathogenic mutations in families with Mendelian disorders, but due to the pace of discoveries, gaps have widened for some diseases between genetic and pathophysiological knowledge. We recruited and analyzed 16 families with limb-girdle muscular dystrophy (LGMD) of Arab descent from Saudi Arabia and Sudan who did not have confirmed genetic diagnoses. The analysis included both traditional and next-generation sequencing approaches. Cellular and metabolic studies were performed on Pyroxd1 siRNA C2C12 myoblasts and controls. Pathogenic mutations were identified in eight of the 16 families. One Sudanese family of Arab descent residing in Saudi Arabia harbored a homozygous c.464A>G, p.Asn155Ser mutation in PYROXD1, a gene recently reported in association with myofibrillar myopathy and whose protein product reduces thiol residues. Pyroxd1 deficiency in murine C2C12 myoblasts yielded evidence for impairments of cellular proliferation, migration, and differentiation, while CG10721 (Pyroxd1 fly homolog) knockdown in Drosophila yielded a lethal phenotype. Further investigations indicated that Pyroxd1 does not localize to mitochondria, yet Pyroxd1 deficiency is associated with decreased cellular respiration. This study identified pathogenic mutations in half of the LGMD families from the cohort, including one in PYROXD1. Developmental impairments were demonstrated in vitro for Pyroxd1 deficiency and in vivo for CG10721 deficiency, with reduced metabolic activity in vitro for Pyroxd1 deficiency.
Publisher: Public Library of Science (PLoS)
Date: 13-05-2015
Publisher: Public Library of Science (PLoS)
Date: 31-07-2014
Publisher: Cold Spring Harbor Laboratory
Date: 28-01-2019
DOI: 10.1101/530881
Abstract: Human genetics has informed the clinical development of new drugs, and is beginning to influence the selection of new drug targets. Large-scale DNA sequencing studies have created a catalogue of naturally occurring genetic variants predicted to cause loss of function in human genes, which in principle should provide powerful in vivo models of human genetic “knockouts” to complement model organism knockout studies and inform drug development. Here, we consider the use of predicted loss-of-function (pLoF) variation catalogued in the Genome Aggregation Database (gnomAD) for the evaluation of genes as potential drug targets. Many drug targets, including the targets of highly successful inhibitors such as aspirin and statins, are under natural selection at least as extreme as known haploinsufficient genes, with pLoF variants almost completely depleted from the population. Thus, metrics of gene essentiality should not be used to eliminate genes from consideration as potential targets. The identification of individual humans harboring “knockouts” (biallelic gene inactivation), followed by individual recall and deep phenotyping, is highly valuable to study gene function. In most genes, pLoF alleles are sufficiently rare that ascertainment will be largely limited to heterozygous individuals in outbred populations. Sampling of diverse bottlenecked populations and consanguineous individuals will aid in identification of total “knockouts”. Careful filtering and curation of pLoF variants in a gene of interest is necessary in order to identify true LoF individuals for follow-up, and the positional distribution or frequency of true LoF variants may reveal important disease biology. Our analysis suggests that the value of pLoF variant data for drug discovery lies in deep curation informed by the nature of the drug and its indication, as well as the biology of the gene, followed by recall-by-genotype studies in targeted populations.
Publisher: Proceedings of the National Academy of Sciences
Date: 21-01-2020
Abstract: De novo mutations (DNMs), or mutations that appear in an individual despite not being seen in their parents, are an important source of genetic variation whose impact is relevant to studies of human evolution, genetics, and disease. Utilizing high-coverage whole-genome sequencing data as part of the Trans-Omics for Precision Medicine (TOPMed) Program, we called 93,325 single-nucleotide DNMs across 1,465 trios from an array of diverse human populations, and used them to directly estimate and analyze DNM counts, rates, and spectra. We find a significant positive correlation between local recombination rate and local DNM rate, and that DNM rate explains a substantial portion (8.98 to 34.92%, depending on the model) of the genome-wide variation in population-level genetic variation from 41K unrelated TOPMed samples. Genome-wide heterozygosity does correlate with DNM rate, but only explains % of variation. While we are underpowered to see small differences, we do not find significant differences in DNM rate between individuals of European, African, and Latino ancestry, nor across ancestrally distinct segments within admixed individuals. However, we did find significantly fewer DNMs in Amish individuals, even when compared with other Europeans, and even after accounting for parental age and sequencing center. Specifically, we found significant reductions in the number of C→A and T→C mutations in the Amish, which seem to underpin their overall reduction in DNMs. Finally, we calculated near-zero estimates of narrow sense heritability (h²), which suggest that variation in DNM rate is significantly shaped by nonadditive genetic effects and the environment.
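As a quick illustration of the scale of these counts, the two totals in the abstract give the average number of single-nucleotide DNMs called per trio. This is a back-of-the-envelope figure only: it is not the per-base mutation rate the authors estimate, and it ignores parental age, callable genome size and other adjustments they would have applied. The variable names are chosen for illustration.

```python
# Totals reported in the abstract.
dnm_calls = 93_325  # single-nucleotide de novo mutations called
trios = 1_465       # parent-offspring trios

# Unadjusted mean number of DNMs per trio.
mean_dnms_per_trio = dnm_calls / trios

print(f"mean single-nucleotide DNMs per trio: {mean_dnms_per_trio:.1f}")
```

The result is roughly 64 DNMs per trio.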
Publisher: Springer Science and Business Media LLC
Date: 11-09-2020
DOI: 10.1186/S13059-020-02122-Z
Abstract: Allele expression (AE) analysis robustly measures cis-regulatory effects. Here, we present and demonstrate the utility of a vast AE resource generated from the GTEx v8 release, containing 15,253 samples spanning 54 human tissues for a total of 431 million measurements of AE at the SNP level and 153 million measurements at the haplotype level. In addition, we develop an extension of our tool phASER that allows effect sizes of cis-regulatory variants to be estimated using haplotype-level AE data. This AE resource is the largest to date, and we are able to make haplotype-level data publicly available. We anticipate that the availability of this resource will enable future studies of regulatory variation across human tissues.
Publisher: Cold Spring Harbor Laboratory
Date: 09-05-2019
DOI: 10.1101/632794
Abstract: Transcriptome data holds substantial promise for better interpretation of rare genetic variants in basic research and clinical settings. Here, we introduce ANalysis of Expression VAriation (ANEVA) to quantify genetic variation in gene dosage from allelic expression (AE) data in a population. Application to GTEx data showed that this variance estimate is robust across datasets and is correlated with selective constraint in a gene. We next used ANEVA variance estimates in a Dosage Outlier Test (ANEVA-DOT) to identify genes in an individual that are affected by a rare regulatory variant with an unusually strong effect. Applying ANEVA-DOT to AE data from 70 Mendelian muscular disease patients showed high accuracy in detecting genes with pathogenic variants in previously resolved cases, and led to one confirmed and several potential new diagnoses in cases previously unresolved. Using our reference estimates from GTEx data, ANEVA-DOT can be readily incorporated in rare disease diagnostic pipelines to better utilize RNA-seq data. New statistical framework for modelling allelic expression characterizes genetic regulatory variation in populations and informs diagnosis in rare disease patients.
Publisher: Springer Science and Business Media LLC
Date: 03-02-2021
DOI: 10.1038/S41586-020-03176-6
Abstract: A Correction to this paper has been published: 10.1038/s41586-020-03176-6.
Publisher: Elsevier BV
Date: 03-2017
Publisher: Springer Science and Business Media LLC
Date: 27-05-2020
DOI: 10.1038/S41467-019-10717-9
Abstract: Upstream open reading frames (uORFs) are tissue-specific cis-regulators of protein translation. Isolated reports have shown that variants that create or disrupt uORFs can cause disease. Here, in a systematic genome-wide study using 15,708 whole genome sequences, we show that variants that create new upstream start codons, and variants disrupting stop sites of existing uORFs, are under strong negative selection. This selection signal is significantly stronger for variants arising upstream of genes intolerant to loss-of-function variants. Furthermore, variants creating uORFs that overlap the coding sequence show signals of selection equivalent to coding missense variants. Finally, we identify specific genes where modification of uORFs likely represents an important disease mechanism, and report a novel uORF frameshift variant upstream of NF2 in neurofibromatosis. Our results highlight uORF-perturbing variants as an under-recognised functional class that contribute to penetrant human disease, and demonstrate the power of large-scale population sequencing data in studying non-coding variant classes.
Publisher: Wiley
Date: 24-08-2016
DOI: 10.1002/MUS.25094
Publisher: American Association for the Advancement of Science (AAAS)
Date: 11-09-2020
Abstract: The Genotype-Tissue Expression (GTEx) project dissects how genetic variation affects gene expression and splicing.
Publisher: Hindawi Limited
Date: 13-01-2018
DOI: 10.1002/HUMU.23385
Publisher: Cold Spring Harbor Laboratory
Date: 08-09-2016
DOI: 10.1101/074153
Abstract: Exome and whole-genome sequencing are becoming increasingly routine approaches in Mendelian disease diagnosis. Despite their success, the current diagnostic rate for genomic analyses across a variety of rare diseases is approximately 25-50%. Here, we explore the utility of transcriptome sequencing (RNA-seq) as a complementary diagnostic tool in a cohort of 50 patients with genetically undiagnosed rare muscle disorders. We describe an integrated approach to analyze patient muscle RNA-seq, leveraging an analysis framework focused on the detection of transcript-level changes that are unique to the patient compared to over 180 control skeletal muscle samples. We demonstrate the power of RNA-seq to validate candidate splice-disrupting mutations and to identify splice-altering variants in both exonic and deep intronic regions, yielding an overall diagnosis rate of 35%. We also report the discovery of a highly recurrent de novo intronic mutation in COL6A1 that results in a dominantly acting splice-gain event, disrupting the critical glycine repeat motif of the triple helical domain. We identify this pathogenic variant in a total of 27 genetically unsolved patients in an external collagen VI-like dystrophy cohort, thus explaining approximately 25% of patients clinically suggestive of collagen VI dystrophy in whom prior genetic analysis is negative. Overall, this study represents a large systematic application of transcriptome sequencing to rare disease diagnosis and highlights its utility for the detection and interpretation of variants missed by current standard diagnostic approaches. Transcriptome sequencing improves the diagnostic rate for Mendelian disease in patients for whom genetic analysis has not returned a diagnosis.
Publisher: Cold Spring Harbor Laboratory
Date: 04-12-2019
DOI: 10.1101/19013599
Abstract: Refractive error is caused by a disparity between the axial length and focusing power of the eye. Nanophthalmos is a rare ocular abnormality in which both eyes are abnormally small, typically causing extreme hyperopic refractive error, and associated with an increased risk of angle-closure glaucoma. A cohort of 40 individuals from 13 unrelated nanophthalmos kindreds were recruited, with 11 probands subjected to exome sequencing. Nine probands (69.2%) were assigned a genetic diagnosis, with variants in PRSS56 (4), MFRP (3), and previously reported variants in TMEM98 (1) and MYRF (1). Two of the four PRSS56 probands harboured the previously described c.1066dupC frameshift variant implicated in over half of all reported PRSS56 kindreds, with surrounding haplotypes distinct from each other, and from a previously reported Tunisian c.1066dupC haplotype. Individuals with a genetic diagnosis had shorter mean axial lengths (P = 7.22 × 10⁻⁹) and more extreme hyperopia (P = 0.0005) than those without a genetic diagnosis, with recessive forms associated with the shortest axial lengths and highest hyperopia. All individuals with an axial length below 18 mm in their smaller eye (17/17) were assigned a genetic diagnosis. These findings detail the genetic architecture of nanophthalmos in an Australian cohort of predominantly European ancestry, their relative clinical phenotypes, and highlight the shared genetic architecture of rare and common disorders of refractive error.
Publisher: Cold Spring Harbor Laboratory
Date: 27-02-2019
DOI: 10.1101/561472
Abstract: Human genetic variants causing loss-of-function (LoF) of protein-coding genes provide natural in vivo models of gene inactivation, which are powerful indicators of gene function and the potential toxicity of therapeutic inhibitors targeting these genes 1,2. Gain-of-kinase-function variants in LRRK2 are known to significantly increase the risk of Parkinson’s disease 3,4, suggesting that inhibition of LRRK2 kinase activity is a promising therapeutic strategy. Whilst preclinical studies in model organisms have raised some on-target toxicity concerns 5–8, the biological consequences of LRRK2 inhibition have not been well characterized in humans. Here we systematically analyse LoF variants in LRRK2 observed across 141,456 individuals sequenced in the Genome Aggregation Database (gnomAD) 9 and over 4 million participants in the 23andMe genotyped dataset, to assess their impact at a molecular and phenotypic level. After thorough variant curation, we identify 1,358 individuals with high-confidence predicted LoF variants in LRRK2, several with experimental validation. We show that heterozygous LoF of LRRK2 reduces LRRK2 protein level by ~50% but is not associated with reduced life expectancy, or with any specific phenotype or disease state. These data suggest that therapeutics that downregulate LRRK2 levels or kinase activity by up to 50% are unlikely to have major on-target safety liabilities. Our results demonstrate the value of large scale genomic databases and phenotyping of human LoF carriers for target validation in drug discovery.
Publisher: Springer Science and Business Media LLC
Date: 27-01-2016
Publisher: Springer Science and Business Media LLC
Date: 04-2017
DOI: 10.1038/NATURE22034
Publisher: Elsevier BV
Date: 05-2014
Publisher: Elsevier BV
Date: 08-2014
Publisher: Cold Spring Harbor Laboratory
Date: 23-07-2021
DOI: 10.1101/2021.07.23.453510
Abstract: Databases of allele frequency are extremely helpful for evaluating clinical variants of unknown significance; however, until now, genetic databases such as the Genome Aggregation Database (gnomAD) have ignored the mitochondrial genome (mtDNA). Here we present a pipeline to call mtDNA variants that addresses three technical challenges: (i) detecting homoplasmic and heteroplasmic variants, present respectively in all or a fraction of mtDNA molecules, (ii) circular mtDNA genome, and (iii) misalignment of nuclear sequences of mitochondrial origin (NUMTs). We observed that mtDNA copy number per cell varied across gnomAD cohorts and influenced the fraction of NUMT-derived false-positive variant calls, which can account for the majority of putative heteroplasmies. To avoid false positives, we excluded samples prone to NUMT misalignment (few mtDNA copies per cell), cell line artifacts (many mtDNA copies per cell), or with contamination and we reported variants with heteroplasmy greater than 10%. We applied this pipeline to 56,434 whole genome sequences in the gnomAD v3.1 database that includes individuals of European (58%), African (25%), Latino (10%), and Asian (5%) ancestry. Our gnomAD v3.1 release contains population frequencies for 10,850 unique mtDNA variants at more than half of all mtDNA bases. Importantly, we report frequencies within each nuclear ancestral population and mitochondrial haplogroup. Homoplasmic variants account for most variant calls (98%) and unique variants (85%). We observed that 1/250 individuals carry a pathogenic mtDNA variant with heteroplasmy above 10%. These mitochondrial population allele frequencies are publicly available at gnomad.broadinstitute.org and will aid in diagnostic interpretation and research studies.
Publisher: Elsevier BV
Date: 05-2020
Publisher: Springer Science and Business Media LLC
Date: 09-08-2016
DOI: 10.1038/NCOMMS12342
Abstract: Protein-truncating variants protective against human disease provide in vivo validation of therapeutic targets. Here we used targeted sequencing to conduct a search for protein-truncating variants conferring protection against inflammatory bowel disease exploiting knowledge of common variants associated with the same disease. Through replication genotyping and imputation we found that a predicted protein-truncating variant (rs36095412, p.R179X, genotyped in 11,148 ulcerative colitis patients and 295,446 controls, MAF=up to 0.78%) in RNF186, a single-exon ring finger E3 ligase with strong colonic expression, protects against ulcerative colitis (overall P = 6.89 × 10⁻⁷, odds ratio = 0.30). We further demonstrate that the truncated protein exhibits reduced expression and altered subcellular localization, suggesting the protective mechanism may reside in the loss of an interaction or function via mislocalization and/or loss of an essential transmembrane domain.
Publisher: Cold Spring Harbor Laboratory
Date: 14-03-2019
DOI: 10.1101/578674
Abstract: Structural variants (SVs) rearrange large segments of the genome and can have profound consequences for evolution and human diseases. As national biobanks, disease association studies, and clinical genetic testing grow increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) have become integral for interpreting genetic variation. To date, no large-scale reference maps of SVs exist from high-coverage sequencing comparable to those available for point mutations in protein-coding genes. Here, we constructed a reference atlas of SVs across 14,891 genomes from diverse global populations (54% non-European) as a component of gnomAD. We discovered a rich landscape of 433,371 distinct SVs, including 5,295 multi-breakpoint complex SVs across 11 mutational subclasses, and examples of localized chromosome shattering, as in chromothripsis. The average individual harbored 7,439 SVs, which accounted for 25-29% of all rare protein-truncating events per genome. We found strong correlations between constraint against damaging point mutations and rare SVs that both disrupt and duplicate protein-coding sequence, suggesting intolerance to reciprocal dosage alterations for a subset of tightly regulated genes. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than any effect on noncoding SVs. Finally, we benchmarked carrier rates for medically relevant SVs, finding very large (≥1Mb) rare SVs in 3.8% of genomes (~1:26 individuals) and clinically reportable incidental SVs in 0.18% of genomes (~1:556 individuals). These data have been integrated directly into the gnomAD browser (gnomad.broadinstitute.org) and will have broad utility for population genetics, disease association, and diagnostic screening.
Publisher: Hindawi Limited
Date: 03-12-2019
DOI: 10.1002/HUMU.23938
Publisher: Cold Spring Harbor Laboratory
Date: 24-09-2020
DOI: 10.1101/2020.09.22.20195529
Abstract: Hundreds of thousands of genetic variants have been reported to cause severe monogenic diseases, but the probability that a variant carrier will develop the disease (termed penetrance) is unknown for virtually all of them. Additionally, the clinical utility of common polygenetic variation remains uncertain. Using exome sequencing from 77,184 adult individuals (38,618 multi-ancestral individuals from a type 2 diabetes case-control study and 38,566 participants from the UK Biobank, for whom genotype array data were also available), we applied clinical standard-of-care gene variant curation for eight monogenic metabolic conditions. Rare variants causing monogenic diabetes and dyslipidemias displayed effect sizes significantly larger than the top 1% of the corresponding polygenic scores. Nevertheless, penetrance estimates for monogenic variant carriers averaged below 60% in both studies for all conditions except monogenic diabetes. We assessed additional epidemiologic and genetic factors contributing to risk prediction, demonstrating that inclusion of common polygenic variation significantly improved biomarker estimation for two monogenic dyslipidemias.
Publisher: Cold Spring Harbor Laboratory
Date: 11-03-2013
Abstract: Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional effects lags behind that of other types of variants. Using population-scale sequencing, we have identified a high-quality set of 1.6 million indels from 179 individuals representing three diverse human populations. We show that rates of indel mutagenesis are highly heterogeneous, with 43%–48% of indels occurring in 4.03% of the genome, whereas in the remaining 96% their prevalence is 16 times lower than SNPs. Polymerase slippage can explain upwards of three-fourths of all indels, with the remainder being mostly simple deletions in complex sequence. However, insertions do occur and are significantly associated with pseudo-palindromic sequence features compatible with the fork stalling and template switching (FoSTeS) mechanism more commonly associated with large structural variations. We introduce a quantitative model of polymerase slippage, which enables us to identify indel-hypermutagenic protein-coding genes, some of which are associated with recurrent mutations leading to disease. Accounting for mutational rate heterogeneity due to sequence context, we find that indels across functional sequence are generally subject to stronger purifying selection than SNPs. We find that indel length modulates selection strength, and that indels affecting multiple functionally constrained nucleotides undergo stronger purifying selection. We further find that indels are enriched in associations with gene expression and find evidence for a contribution of nonsense-mediated decay. Finally, we show that indels can be integrated in existing genome-wide association studies (GWAS); although we do not find direct evidence that potentially causal protein-coding indels are enriched with associations to known disease-associated SNPs, our findings suggest that the causal variant underlying some of these associations may be indels.
Publisher: Springer Science and Business Media LLC
Date: 07-2012
DOI: 10.1038/487427A
Publisher: Elsevier BV
Date: 09-2009
Publisher: Cold Spring Harbor Laboratory
Date: 27-10-2010
Abstract: Small insertions and deletions (indels) are a common and functionally important type of sequence polymorphism. Most of the focus of studies of sequence variation is on single nucleotide variants (SNVs) and large structural variants. In principle, high-throughput sequencing studies should allow identification of indels just as SNVs. However, inference of indels from next-generation sequence data is challenging, and so far methods for identifying indels lag behind methods for calling SNVs in terms of sensitivity and specificity. We propose a Bayesian method to call indels from short-read sequence data in individuals and populations by realigning reads to candidate haplotypes that represent alternative sequence to the reference. The candidate haplotypes are formed by combining candidate indels and SNVs identified by the read mapper, while allowing for known sequence variants or candidates from other methods to be included. In our probabilistic realignment model we account for base-calling errors, mapping errors, and also, importantly, for increased sequencing error indel rates in long homopolymer runs. We show that our method is sensitive and achieves low false discovery rates on simulated and real data sets, although challenges remain. The algorithm is implemented in the program Dindel, which has been used in the 1000 Genomes Project call sets.
Publisher: Wiley
Date: 05-03-2020
DOI: 10.1111/CGE.13722
Publisher: Springer Science and Business Media LLC
Date: 03-03-2017
DOI: 10.1038/S41525-017-0006-7
Abstract: Childhood-onset muscle disorders are genetically heterogeneous. Diagnostic workup has traditionally included muscle biopsy, protein-based studies of muscle specimens, and candidate gene sequencing. High throughput or massively parallel sequencing is transforming the approach to diagnosis of rare diseases; however, evidence for cost-effectiveness is lacking. Patients presenting with suspected congenital muscular dystrophy or nemaline myopathy were ascertained over a 15-year period. Patients were investigated using traditional diagnostic approaches. Undiagnosed patients were investigated using either massively parallel sequencing of a panel of neuromuscular disease genes, or whole exome sequencing. Cost data were collected for all diagnostic investigations. The diagnostic yield and cost effectiveness of a molecular approach to diagnosis, prior to muscle biopsy, were compared with the traditional approach. Fifty-six patients were analysed. Compared with the traditional invasive muscle biopsy approach, both the neuromuscular disease panel and whole exome sequencing had significantly increased diagnostic yields (from 46 to 75% for the neuromuscular disease panel, and 79% for whole exome sequencing), and reduced the cost per diagnosis from USD$16,495 (95% CI: $12,413–$22,994) to USD$3706 (95% CI: $3086–$4453) for the neuromuscular disease panel and USD$5646 (95% CI: $4501–$7078) for whole exome sequencing. The neuromuscular disease panel was the most cost-effective, saving USD$17,075 (95% CI: $10,654–$30,064) per additional diagnosis, over the traditional diagnostic pathway. Whole exome sequencing saved USD$10,024 (95% CI: $5795–$17,135) per additional diagnosis. This study demonstrates the cost-effectiveness of investigation using massively parallel sequencing technologies in paediatric muscle disease.
The findings emphasise the value of implementing these technologies in clinical practice, with particular application for diagnosis of Mendelian diseases, and provide evidence crucial for government subsidy and equitable access.
Publisher: Elsevier BV
Date: 09-2020
Publisher: American Physiological Society
Date: 10-2008
DOI: 10.1152/AJPCELL.00179.2008
Abstract: The actin-binding protein α-actinin-3 is one of the two isoforms of α-actinin that are found in the Z-discs of skeletal muscle. α-Actinin-3 is exclusively expressed in fast glycolytic muscle fibers. Homozygosity for a common polymorphism in the ACTN3 gene results in complete deficiency of α-actinin-3 in about 1 billion individuals worldwide. Recent genetic studies suggest that the absence of α-actinin-3 is detrimental to sprint and power performance in elite athletes and in the general population. In contrast, α-actinin-3 deficiency appears to be beneficial for endurance athletes. To determine the effect of α-actinin-3 deficiency on the contractile properties of skeletal muscle, we studied isolated extensor digitorum longus (fast-twitch) muscles from a specially developed α-actinin-3 knockout (KO) mouse. α-Actinin-3-deficient muscles showed similar levels of damage to wild-type (WT) muscles following lengthening contractions of 20% strain, suggesting that the presence or absence of α-actinin-3 does not significantly influence the mechanical stability of the sarcomere in the mouse. α-Actinin-3 deficiency does not result in any change in myosin heavy chain expression. However, compared with α-actinin-3-positive muscles, α-actinin-3-deficient muscles displayed longer twitch half-relaxation times, better recovery from fatigue, smaller cross-sectional areas, and lower twitch-to-tetanus ratios. We conclude that α-actinin-3 deficiency results in fast-twitch, glycolytic fibers developing slower-twitch, more oxidative properties. These changes in the contractile properties of fast-twitch skeletal muscle from α-actinin-3-deficient individuals would be detrimental to optimal sprint and power performance, but beneficial for endurance performance.
Publisher: Springer Science and Business Media LLC
Date: 10-2015
Publisher: Cold Spring Harbor Laboratory
Date: 13-08-2020
DOI: 10.1101/2020.08.12.248526
Abstract: Current clinical guidelines recommend three genetic tests for the assessment of fetal structural anomalies: karyotype to detect microscopically-visible balanced and unbalanced chromosomal rearrangements, chromosomal microarray (CMA) to detect sub-microscopic copy number variants (CNVs), and exome sequencing (ES) to identify individual nucleotide changes in coding sequence. Advances in genome sequencing (GS) analysis suggest that it is poised to displace the sequential application of all three conventional tests to become a single diagnostic approach for the assessment of fetal structural anomalies. However, systematic benchmarking is required to assure that GS can capture the full mutational spectrum associated with fetal structural anomalies and to accurately quantify the added diagnostic yield of GS. We applied a novel GS analytic framework that included the discovery, filtration, and interpretation of nine classes of genomic variation to 7,195 individuals. We assessed the sensitivity of GS to detect diagnostic variants (pathogenic or likely pathogenic) from three standard-of-care tests using 1,612 autism spectrum disorder quartet families (ASD n=6,448) with matched GS, ES, and CMA data, and validated these findings in 46 fetuses with a clinically reportable variant originally identified by karyotype, CMA, or ES. We then assessed the added diagnostic yield of GS in 249 trios (n=747) comprising a fetus with a structural anomaly detected by ultrasound and two unaffected parents that were pre-screened with a combination of all three standard-of-care tests. Across both cohorts, our GS analytic framework identified 98.2% of all diagnostic variants detected by standard-of-care tests, including 100% of those originally detected by CMA (n=88) and ES (n=61), as well as 78.6% (n=11/14) of the chromosomal rearrangements identified by karyotype.
The diagnostic yield from GS was 7.8% across all 1,612 ASD probands, almost two-fold more than CMA (4.4%) and three-fold more than ES (3.0%). We also demonstrated that the yield of ES can approach that of GS when CNVs are captured with high sensitivity from exome data (7.4% vs. 7.8%, respectively). In 249 pre-screened fetuses with structural anomalies, GS provided an additional diagnostic yield of 0.4% beyond the combination of all three tests (karyotype, CMA, and ES). Applying our benchmarking results to existing data indicates that GS can achieve an overall diagnostic yield of 46.1% in unselected fetuses with fetal structural anomalies, providing an estimated 17.2% increase in diagnostic yield over karyotype, 14.1% over CMA, and 36.1% over ES when sequence variants are assessed, and 4.1% when CNVs are also identified from exome data. In this study we demonstrate that GS is sensitive to the detection of almost all pathogenic variation captured by karyotype, CMA, and ES, provides a diagnostic yield superior to that of any individual test by a wide margin, and contributes a modest increase in diagnostic yield beyond the combination of all three tests. We also outline several strategies to aid the interpretation of GS variants that are cryptic to conventional technologies, which we anticipate will be increasingly encountered as comprehensive variant identification from GS is performed. Taken together, these data suggest GS warrants consideration as a first-tier diagnostic approach for fetal structural anomalies.
Publisher: Springer Science and Business Media LLC
Date: 08-05-2018
DOI: 10.1038/S41467-018-03621-1
Abstract: Scalable, integrative methods to understand mechanisms that link genetic variants with phenotypes are needed. Here we derive a mathematical expression to compute PrediXcan (a gene mapping approach) results using summary data (S-PrediXcan) and show its accuracy and general robustness to misspecified reference sets. We apply this framework to 44 GTEx tissues and 100+ phenotypes from GWAS and meta-analysis studies, creating a growing public catalog of associations that seeks to capture the effects of gene expression variation on human phenotypes. Replication in an independent cohort is shown. Most of the associations are tissue specific, suggesting context specificity of the trait etiology. Colocalized significant associations in unexpected tissues underscore the need for an agnostic scanning of multiple contexts to improve our ability to detect causal regulatory mechanisms. Monogenic disease genes are enriched among significant associations for related traits, suggesting that smaller alterations of these genes may cause a spectrum of milder phenotypes.
Publisher: Springer Science and Business Media LLC
Date: 29-08-2020
Publisher: Springer Science and Business Media LLC
Date: 12-07-2017
DOI: 10.1038/NCOMMS16015
Abstract: Hand grip strength is a widely used proxy of muscular fitness, a marker of frailty, and predictor of a range of morbidities and all-cause mortality. To investigate the genetic determinants of variation in grip strength, we perform a large-scale genetic discovery analysis in a combined sample of 195,180 individuals and identify 16 loci associated with grip strength (P < 5 × 10⁻⁸) in combined analyses. A number of these loci contain genes implicated in structure and function of skeletal muscle fibres (ACTG1), neuronal maintenance and signal transduction (PEX14, TGFA, SYT1), or monogenic syndromes with involvement of psychomotor impairment (PEX14, LRPPRC and KANSL1). Mendelian randomization analyses are consistent with a causal effect of higher genetically predicted grip strength on lower fracture risk. In conclusion, our findings provide new biological insight into the mechanistic underpinnings of grip strength and the causal role of muscular strength in age-related morbidities and mortality.
Publisher: Springer Science and Business Media LLC
Date: 03-02-2021
DOI: 10.1038/S41586-020-03177-5
Abstract: A Correction to this paper has been published: 10.1038/s41586-020-03177-5.
Publisher: Springer Science and Business Media LLC
Date: 18-03-2014
DOI: 10.1038/NG.3469
Publisher: Springer Science and Business Media LLC
Date: 11-10-2006
Abstract: The functional allele (577R) of ACTN3, which encodes human alpha-actinin-3, has been reported to be associated with elite athletic status and with response to resistance training, while the nonfunctional allele (577X) has been proposed as a candidate metabolically thrifty allele. In a study of 992 adolescent Greeks, we show that there is a significant association (P=0.003) between the ACTN3 R577X polymorphism and 40 m sprint time in males that accounts for 2.3% of phenotypic variance, with the 577R allele contributing to faster times in an additive manner. The R577X polymorphism is not associated with other power phenotypes related to 40 m sprint, nor with an endurance phenotype. Furthermore, the polymorphism is not associated with obesity-related phenotypes in our population, suggesting that the 577X allele is not a thrifty allele, and thus the persistence of this null allele must be explained in other terms.
Publisher: Springer Science and Business Media LLC
Date: 27-05-2020
DOI: 10.1038/S41586-020-2308-7
Abstract: Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes 1. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.
Publisher: Elsevier BV
Date: 11-2017
DOI: 10.1016/J.NMD.2017.07.006
Abstract: Defects of O-linked glycosylation of alpha-dystroglycan cause a wide spectrum of muscular dystrophies ranging from severe congenital muscular dystrophy associated with abnormal brain and eye development to mild limb girdle muscular dystrophy. We report a female patient who developed isolated pelvic girdle muscle weakness and wasting, which became symptomatic at age 42. Exome sequencing uncovered a homozygous c.131T > G (p.Leu44Pro) substitution in DPM3, encoding dolichol-P-mannose (DPM) synthase subunit 3, leading to a 50% reduction of enzymatic activity. Decreased availability of DPM as an essential donor substrate for protein O-mannosyltransferase (POMT) 1 and 2 explains defective skeletal muscle alpha-dystroglycan O-glycosylation. Our findings show that DPM3 mutations may lead to an isolated and mild limb girdle muscular dystrophy phenotype without cardiomyopathy.
Publisher: Cold Spring Harbor Laboratory
Date: 30-10-2017
DOI: 10.1101/211292
Abstract: With the advent of gene therapies for inherited retinal degenerations (IRDs), genetic diagnostics will have an increasing role in clinical decision-making. Yet the genetic cause of disease cannot be identified using exon-based sequencing for a significant portion of patients. We hypothesized that non-coding mutations contribute significantly to the genetic causality of IRDs and evaluated patients with single coding mutations in RPGRIP1 to test this hypothesis. IRD families underwent targeted panel sequencing. Unsolved cases were explored by whole exome and genome sequencing looking for additional mutations. Candidate mutations were then validated by Sanger sequencing, quantitative PCR, and in vitro splicing assays in two cell lines analyzed through amplicon sequencing. Among 1722 families, three had biallelic loss of function mutations in RPGRIP1 while seven had a single disruptive coding mutation. Whole exome and genome sequencing revealed potential non-coding mutations in these seven families. In six, the non-coding mutations were shown to lead to loss of function in vitro. Non-coding mutations were identified in 6 of 7 families with single coding mutations in RPGRIP1. The results suggest that non-coding mutations contribute significantly to the genetic causality of IRDs and RPGRIP1-mediated IRDs are more common than previously thought.
Publisher: Hindawi Limited
Date: 21-03-2017
DOI: 10.1002/HUMU.23203
Publisher: Springer Science and Business Media LLC
Date: 29-05-2020
DOI: 10.1186/S13073-020-00744-3
Abstract: Distinct prevalence of inherited genetic predisposition may partially explain the difference of cancer risks across ancestries. Ancestry-specific analyses of germline genomes are required to inform cancer genetic risk and prognosis of diverse populations. We conducted analyses using germline and somatic sequencing data generated by The Cancer Genome Atlas. Collapsing pathogenic and likely pathogenic variants to cancer predisposition genes (CPG), we analyzed the association between CPGs and cancer types within ancestral groups. We also identified the predisposition-associated two-hit events and gene expression effects in tumors. Genetic ancestry analysis classified the cohort of 9899 cancer cases into individuals of primarily European (N = 8184, 82.7%), African (N = 966, 9.8%), East Asian (N = 649, 6.6%), South Asian (N = 48, 0.5%), Native/Latin American (N = 41, 0.4%), and admixed (N = 11, 0.1%) ancestries. In the African ancestry, we discovered a potentially novel association of BRCA2 in lung squamous cell carcinoma (OR = 41.4 [95% CI, 6.1–275.6] FDR = 0.002) previously identified in Europeans, along with a known association of BRCA2 in ovarian serous cystadenocarcinoma (OR = 8.5 [95% CI, 1.5–47.4] FDR = 0.045). In the East Asian ancestry, we discovered one previously known association of BRIP1 in stomach adenocarcinoma (OR = 12.8 [95% CI, 1.8–90.8] FDR = 0.038). Rare variant burden analysis further identified 7 suggestive associations in African ancestry individuals previously described in European ancestry, including SDHB in pheochromocytoma and paraganglioma, ATM in prostate adenocarcinoma, VHL in kidney renal clear cell carcinoma, FH in kidney renal papillary cell carcinoma, and PTEN in uterine corpus endometrial carcinoma. Most predisposing variants were found exclusively in one ancestry in the TCGA and gnomAD datasets. Loss of heterozygosity was identified for 7 out of the 15 African ancestry carriers of predisposing variants.
Further, tumors from the SDHB or BRCA2 carriers showed simultaneous allelic-specific expression and low gene expression of their respective affected genes, and FH splice-site variant carriers showed mis-splicing of FH . While several CPGs are shared across patients, many pathogenic variants are found to be ancestry-specific and trigger somatic effects. Studies using larger cohorts of erse ancestries are required to pinpoint ancestry-specific genetic predisposition and inform genetic screening strategies.
Publisher: American Medical Association (AMA)
Date: 12-2015
DOI: 10.1001/JAMANEUROL.2015.2274
Abstract: To our knowledge, the efficacy of transferring next-generation sequencing from a research setting to neuromuscular clinics has never been evaluated. Our objective was to translate whole-exome sequencing (WES) to clinical practice for the genetic diagnosis of a large cohort of patients with limb-girdle muscular dystrophy (LGMD) for whom protein-based analyses and targeted Sanger sequencing failed to identify the genetic cause of their disorder. We performed WES on 60 families with LGMDs (100 exomes). Data analysis was performed between January 6 and December 19, 2014, using the xBrowse bioinformatics interface (Broad Institute). Patients with LGMD were ascertained retrospectively through the Institute for Neuroscience and Muscle Research Biospecimen Bank between 2006 and 2014. Enrolled patients had been extensively investigated via protein studies and candidate gene sequencing and remained undiagnosed. Patients presented with more than 2 years of muscle weakness and with dystrophic or myopathic changes present in muscle biopsy specimens. The main outcome measures were the diagnostic rate of LGMD in Australia and the relative frequencies of the different LGMD subtypes. Our central goals were to improve the genetic diagnosis of LGMD, investigate whether the WES platform provides adequate coverage of known LGMD-related genes, and identify new LGMD-related genes. With WES, we identified likely pathogenic mutations in known myopathy genes for 27 of 60 families. Twelve families had mutations in known LGMD-related genes. However, 15 families had variants in disease-related genes not typically associated with LGMD, highlighting the clinical overlap between LGMD and other myopathies. Common causes of phenotypic overlap were mutations in congenital muscular dystrophy-related genes (4 families) and collagen myopathy-related genes (4 families).
Less common myopathies included metabolic myopathy (2 families), congenital myasthenic syndrome (DOK7), congenital myopathy (ACTA1), tubular aggregate myopathy (STIM1), myofibrillar myopathy (FLNC), and mutation of CHD7, usually associated with the CHARGE syndrome. Inclusion of family members increased the diagnostic efficacy of WES, with a diagnostic rate of 60% for "trios" (an affected proband with both parents) vs 40% for single probands. A follow-up screening of patients whose conditions were undiagnosed on a targeted neuromuscular disease-related gene panel did not improve our diagnostic yield. With WES, we achieved a diagnostic success rate of 45.0% in our difficult-to-diagnose cohort of patients with LGMD. We expand the clinical phenotypes associated with known myopathy genes, and we stress the importance of accurate clinical examination and histopathological results for interpretation of WES, with many diagnoses requiring follow-up review and ancillary investigations of biopsy specimens or serum samples.
Publisher: Springer Science and Business Media LLC
Date: 11-12-1994
DOI: 10.1038/NATURE15393
Publisher: Oxford University Press (OUP)
Date: 30-08-2010
DOI: 10.1093/HMG/DDQ365
Publisher: Springer Science and Business Media LLC
Date: 21-01-2019
DOI: 10.1038/S41467-018-07863-X
Abstract: Cranial growth and development is a complex process which affects the closely related traits of head circumference (HC) and intracranial volume (ICV). The underlying genetic influences shaping these traits during the transition from childhood to adulthood are little understood, but might include both age-specific genetic factors and low-frequency genetic variation. Here, we model the developmental genetic architecture of HC, showing this is genetically stable and correlated with genetic determinants of ICV. Investigating up to 46,000 children and adults of European descent, we identify association with final HC and/or final ICV + HC at 9 novel common and low-frequency loci, illustrating that genetic variation from a wide allele frequency spectrum contributes to cranial growth. The largest effects are reported for low-frequency variants within TP53, with 0.5 cm wider heads in increaser-allele carriers versus non-carriers during mid-childhood, suggesting a previously unrecognized role of TP53 transcripts in human cranial development.
Publisher: Oxford University Press (OUP)
Date: 28-06-2023
Publisher: Elsevier BV
Date: 10-2016
Publisher: Springer Science and Business Media LLC
Date: 07-10-2009
DOI: 10.1038/NATURE08516
Publisher: Cold Spring Harbor Laboratory
Date: 03-10-2019
DOI: 10.1101/787903
Abstract: The Genotype-Tissue Expression (GTEx) project was established to characterize genetic effects on the transcriptome across human tissues, and to link these regulatory mechanisms to trait and disease associations. Here, we present analyses of the v8 data, based on 17,382 RNA-sequencing samples from 54 tissues of 948 post-mortem donors. We comprehensively characterize genetic associations for gene expression and splicing in cis and trans, showing that regulatory associations are found for almost all genes, and describe the underlying molecular mechanisms and their contribution to allelic heterogeneity and pleiotropy of complex traits. Leveraging the large diversity of tissues, we provide insights into the tissue-specificity of genetic effects, and show that cell type composition is a key factor in understanding gene regulatory mechanisms in human tissues.
Publisher: Wiley
Date: 08-12-2016
DOI: 10.1002/ACN3.267
Publisher: Cold Spring Harbor Laboratory
Date: 2011
DOI: 10.1101/GAD.1968411
Abstract: The first wave of personal genomes documents how no single individual genome contains the full complement of functional genes. Here, we describe the extent of variation in gene and pseudogene numbers between individuals arising from inactivation events such as premature termination or aberrant splicing due to single-nucleotide polymorphisms. This highlights the inadequacy of the current reference sequence and gene set. We present a proposal to define a reference gene set that will remain stable as more individuals are sequenced. In particular, we recommend that the ancestral allele be used to define the reference sequence from which a core human reference gene annotation set can be derived. In addition, we call for the development of an expanded gene set to include human-specific genes that have arisen recently and are absent from the ancestral set.
Publisher: Elsevier BV
Date: 02-2017
Publisher: Ovid Technologies (Wolters Kluwer Health)
Date: 30-12-2016
Publisher: Elsevier BV
Date: 10-2020
Publisher: Cold Spring Harbor Laboratory
Date: 10-07-2018
DOI: 10.1101/365890
Abstract: Diamond-Blackfan anemia (DBA) is a rare bone marrow failure disorder that affects 1 in 100,000 to 200,000 live births and has been associated with mutations in components of the ribosome. In order to characterize the genetic landscape of this genetically heterogeneous disorder, we recruited a cohort of 472 individuals with a clinical diagnosis of DBA and performed whole exome sequencing (WES). Overall, we identified rare and predicted damaging mutations in likely causal genes for 78% of individuals. The majority of mutations were singletons, absent from population databases, predicted to cause loss of function, and in one of 19 previously reported genes encoding for a diverse set of ribosomal proteins (RPs). Using WES exon coverage estimates, we were able to identify and validate 31 deletions in DBA associated genes. We also observed an enrichment for extended splice site mutations and validated the diverse effects of these mutations using RNA sequencing in patient-derived cell lines. Leveraging the size of our cohort, we observed several robust genotype-phenotype associations with congenital abnormalities and treatment outcomes. In addition to comprehensively identifying mutations in known genes, we further identified rare mutations in 7 previously unreported RP genes that may cause DBA. We also identified several distinct disorders that appear to phenocopy DBA, including 9 individuals with biallelic CECR1 mutations that result in deficiency of ADA2. However, no new genes were identified at exome-wide significance, suggesting that there are no unidentified genes containing mutations readily identified by WES that explain >5% of DBA cases. Overall, this comprehensive report should not only inform clinical practice for DBA patients, but also the design and analysis of future rare variant studies for heterogeneous Mendelian disorders.
Publisher: Cold Spring Harbor Laboratory
Date: 23-03-2023
DOI: 10.1101/2023.03.19.533370
Abstract: Recessive diseases arise when both the maternal and the paternal copies of a gene are impacted by a damaging genetic variant in the affected individual. When a patient carries two different potentially causal variants in a gene for a given disorder, accurate diagnosis requires determining that these two variants occur on different copies of the chromosome (i.e., are in trans) rather than on the same copy (i.e., in cis). However, current approaches for determining phase, beyond parental testing, are limited in clinical settings. We developed a strategy for inferring phase for rare variant pairs within genes, leveraging genotypes observed in exome sequencing data from the Genome Aggregation Database (gnomAD v2, n = 125,748). When applied to trio data where phase can be determined by transmission, our approach estimates phase with 95.7% accuracy and remains accurate even for very rare variants (allele frequency <1×10⁻⁴). We also correctly phase 95.9% of variant pairs in a set of 293 patients with Mendelian conditions carrying presumed causal compound heterozygous variants. We provide a public resource of phasing estimates from gnomAD, including phasing estimates for coding variants across the genome and counts per gene of rare variants in trans, that can aid interpretation of rare co-occurring variants in the context of recessive disease.
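As a rough illustration of how phase might be inferred from population genotype counts, the sketch below estimates haplotype frequencies for a pair of variants with a standard expectation-maximisation (EM) update and converts them into a probability that a double heterozygote carries the two variants in trans. The abstract does not spell out the algorithm, so the EM formulation, the function names (`estimate_haplotype_freqs`, `prob_trans`), and any genotype tables passed in are assumptions made for illustration, not the gnomAD implementation.

```python
# Hypothetical sketch: EM estimate of two-variant haplotype frequencies
# from a 3x3 table of genotype counts, then P(trans) for a double het.
# counts[i][j] = number of people with i alt alleles at variant 1
# and j alt alleles at variant 2 (illustrative inputs, not gnomAD data).

# Haplotype index = 2 * (alt at variant 1) + (alt at variant 2):
# 0 = ref/ref, 1 = ref/alt, 2 = alt/ref, 3 = alt/alt.
# Every genotype except the double heterozygote (1, 1) has a known pair.
KNOWN_PAIRS = {
    (0, 0): (0, 0), (0, 1): (0, 1), (0, 2): (1, 1),
    (1, 0): (0, 2), (1, 2): (1, 3),
    (2, 0): (2, 2), (2, 1): (2, 3), (2, 2): (3, 3),
}


def estimate_haplotype_freqs(counts, iterations=50):
    """Return [ref/ref, ref/alt, alt/ref, alt/alt] haplotype frequencies."""
    total_haps = 2 * sum(sum(row) for row in counts)
    if total_haps == 0:
        raise ValueError("genotype table is empty")
    # Haplotypes that can be counted directly from unambiguous genotypes.
    base = [0.0] * 4
    for (i, j), pair in KNOWN_PAIRS.items():
        for hap in pair:
            base[hap] += counts[i][j]
    double_hets = counts[1][1]
    freqs = [0.25] * 4
    for _ in range(iterations):
        # E-step: split double heterozygotes between cis and trans.
        cis = freqs[0] * freqs[3]
        trans = freqs[1] * freqs[2]
        w_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: recount haplotypes using the expected split.
        expected = list(base)
        expected[0] += double_hets * w_cis
        expected[3] += double_hets * w_cis
        expected[1] += double_hets * (1 - w_cis)
        expected[2] += double_hets * (1 - w_cis)
        freqs = [e / total_haps for e in expected]
    return freqs


def prob_trans(freqs):
    """Probability that a double heterozygote is in trans, or None if undefined."""
    cis = freqs[0] * freqs[3]
    trans = freqs[1] * freqs[2]
    if cis + trans == 0:
        return None
    return trans / (cis + trans)
```

Under this sketch, two variants that are only ever seen in separate carriers give a trans probability close to 1, whereas two variants that always appear together in the same carriers give a value close to 0.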
Publisher: Springer Science and Business Media LLC
Date: 30-07-2018
Publisher: Springer Science and Business Media LLC
Date: 21-04-2023
Publisher: Cold Spring Harbor Laboratory
Date: 06-08-2018
DOI: 10.1101/384271
Publisher: Cold Spring Harbor Laboratory
Date: 21-10-2020
DOI: 10.1101/2020.10.20.347294
Abstract: The large majority of variants identified by GWAS are non-coding, motivating detailed characterization of the function of non-coding variants. Experimental methods to assess variants’ effect on gene expressions in native chromatin context via direct perturbation are low-throughput. Existing high-throughput computational predictors thus have lacked large gold standard sets of regulatory variants for training and validation. Here, we leverage a set of 14,807 putative causal eQTLs in humans obtained through statistical fine-mapping, and we use 6,121 features to directly train a predictor of whether a variant modifies nearby gene expression. We call the resulting prediction the expression modifier score (EMS). We validate EMS by comparing its ability to prioritize functional variants with other major scores. We then use EMS as a prior for statistical fine-mapping of eQTLs to identify an additional 20,913 putatively causal eQTLs, and we incorporate EMS into co-localization analysis to identify 310 additional candidate genes across UK Biobank phenotypes.
Publisher: Oxford University Press (OUP)
Date: 20-01-2010
DOI: 10.1093/HMG/DDQ010
Abstract: Approximately one billion people worldwide are homozygous for a stop codon polymorphism in the ACTN3 gene (R577X) which results in complete deficiency of the fast fibre muscle protein alpha-actinin-3. ACTN3 genotype is associated with human athletic performance and alpha-actinin-3 deficient mice [Actn3 knockout (KO) mice] have a shift in the properties of fast muscle fibres towards slower fibre properties, with increased activity of multiple enzymes in the aerobic metabolic pathway and slower contractile properties. alpha-Actinins have been shown to interact with a number of muscle proteins including the key metabolic regulator glycogen phosphorylase (GPh). In this study, we demonstrated a link between alpha-actinin-3 and glycogen metabolism which may underlie the metabolic changes seen in the KO mouse. Actn3 KO mice have higher muscle glycogen content and a 50% reduction in the activity of GPh. The reduction in enzyme activity is accompanied by altered post-translational modification of GPh, suggesting that alpha-actinin-3 regulates GPh activity by altering its level of phosphorylation. We propose that the changes in glycogen metabolism underlie the downstream metabolic consequences of alpha-actinin-3 deficiency. Finally, as GPh has been shown to regulate calcium handling, we examined calcium handling in KO mouse primary mouse myoblasts and find changes that may explain the slower contractile properties previously observed in these mice. We propose that the alteration in GPh activity in the absence of alpha-actinin-3 is a fundamental mechanistic link in the association between ACTN3 genotype and human performance.
Publisher: American Association for the Advancement of Science (AAAS)
Date: 20-01-2016
DOI: 10.1126/SCITRANSLMED.AAD5169
Abstract: Large genomic reference data sets reveal a spectrum of pathogenicity in the prion protein gene and provide genetic validation for a therapeutic strategy in prion disease.
Publisher: American Association for the Advancement of Science (AAAS)
Date: 11-09-2020
Abstract: Cell type composition, estimated from bulk tissue, maps the cellular specificity of genetic variants.
Publisher: Public Library of Science (PLoS)
Date: 18-08-2014
Publisher: Oxford University Press (OUP)
Date: 11-2009
DOI: 10.1534/GENETICS.109.107722
Abstract: We have evaluated the extent to which SNPs identified by genomewide surveys as showing unusually high levels of population differentiation in humans have experienced recent positive selection, starting from a set of 32 nonsynonymous SNPs in 27 genes highlighted by the HapMap1 project. These SNPs were genotyped again in the HapMap samples and in the Human Genome Diversity Project–Centre d'Etude du Polymorphisme Humain (HGDP–CEPH) panel of 52 populations representing worldwide diversity; extended haplotype homozygosity was investigated around all of them, and full resequence data were examined for 9 genes (5 from public sources and 4 from new data sets). For 7 of the genes, genotyping errors were responsible for an artifactual signal of high population differentiation, and for 2, the population differentiation did not exceed our significance threshold. For the 18 genes with confirmed high population differentiation, 3 showed evidence of positive selection as measured by unusually extended haplotypes within a population, and 7 more did in between-population analyses. The 9 genes with resequence data included 7 with high population differentiation, and 5 showed evidence of positive selection on the haplotype carrying the nonsynonymous SNP from skewed allele frequency spectra; in addition, 2 showed evidence of positive selection on unrelated haplotypes. Thus, in humans, high population differentiation is (apart from technical artifacts) an effective way of enriching for recently selected genes, but is not an infallible pointer to recent positive selection supported by other lines of evidence.
Publisher: American Society for Clinical Investigation
Date: 16-09-2013
DOI: 10.1172/JCI67691
Publisher: Elsevier BV
Date: 10-2020
Publisher: Springer Science and Business Media LLC
Date: 27-05-2020
DOI: 10.1038/S41467-019-12438-5
Abstract: Multi-nucleotide variants (MNVs), defined as two or more nearby variants existing on the same haplotype in an individual, are a clinically and biologically important class of genetic variation. However, existing tools typically do not accurately classify MNVs, and understanding of their mutational origins remains limited. Here, we systematically survey MNVs in 125,748 whole exomes and 15,708 whole genomes from the Genome Aggregation Database (gnomAD). We identify 1,792,248 MNVs across the genome with constituent variants falling within 2 bp distance of one another, including 18,756 variants with a novel combined effect on protein sequence. Finally, we estimate the relative impact of known mutational mechanisms - CpG deamination, replication error by polymerase zeta, and polymerase slippage at repeat junctions - on the generation of MNVs. Our results demonstrate the value of haplotype-aware variant annotation, and refine our understanding of genome-wide mutational mechanisms of MNVs.
Publisher: Springer Science and Business Media LLC
Date: 15-03-2017
DOI: 10.1038/EJHG.2017.16
Publisher: Ovid Technologies (Wolters Kluwer Health)
Date: 29-06-2018
Abstract: Steroid-resistant nephrotic syndrome (SRNS) is a frequent cause of CKD. The discovery of monogenic causes of SRNS has revealed specific pathogenetic pathways, but these monogenic causes do not explain all cases of SRNS. To identify novel monogenic causes of SRNS, we screened 665 patients by whole-exome sequencing. We then evaluated the in vitro functional significance of two genes and the mutations therein that we discovered through this sequencing and conducted complementary studies in podocyte-like Drosophila nephrocytes. We identified conserved, homozygous missense mutations of GAPVD1 in two families with early-onset NS and a homozygous missense mutation of ANKFY1 in two siblings with SRNS. GAPVD1 and ANKFY1 interact with the endosomal regulator RAB5. Coimmunoprecipitation assays indicated interaction between GAPVD1 and ANKFY1 proteins, which also colocalized when expressed in HEK293T cells. Silencing either protein diminished the podocyte migration rate. Compared with wild-type GAPVD1 and ANKFY1, the mutated proteins produced upon ectopic expression of GAPVD1 or ANKFY1 bearing the patient-derived mutations exhibited altered binding affinity for active RAB5 and reduced ability to rescue the knockout-induced defect in podocyte migration. Coimmunoprecipitation assays further demonstrated a physical interaction between nephrin and GAPVD1, and immunofluorescence revealed partial colocalization of these proteins in rat glomeruli. The patient-derived GAPVD1 mutations reduced nephrin-GAPVD1 binding affinity. In Drosophila, silencing Gapvd1 impaired endocytosis and caused mistrafficking of the nephrin ortholog. Mutations in GAPVD1 and probably in ANKFY1 are novel monogenic causes of NS. The discovery of these genes implicates RAB5 regulation in the pathogenesis of human NS.
Publisher: Springer Science and Business Media LLC
Date: 27-05-2020
DOI: 10.1038/S41586-020-2287-8
Abstract: Structural variants (SVs) rearrange large segments of DNA [1] and can have profound consequences in evolution and human disease [2,3]. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) [4] have become integral in the interpretation of single-nucleotide variants (SNVs) [5]. However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25–29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage [6]. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings [7]. This SV resource is freely distributed via the gnomAD browser [8] and will have broad utility in population genetics, disease-association studies, and diagnostic screening.
Publisher: Mary Ann Liebert Inc
Date: 10-2015
Publisher: Informa UK Limited
Date: 20-01-2015
Publisher: Elsevier BV
Date: 2013
Publisher: Springer Science and Business Media LLC
Date: 06-03-2015
DOI: 10.1038/NCOMMS6681
Abstract: Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF ≥ 1%) associated with TSH and FT4 (N = 16,335). For TSH, we identify a novel variant in SYN2 (MAF = 23.5%, P = 6.15 × 10⁻⁹) and a new independent variant in PDE8B (MAF = 10.4%, P = 5.94 × 10⁻¹⁴). For FT4, we report a low-frequency variant near B4GALT6/SLC25A52 (MAF = 3.2%, P = 1.27 × 10⁻⁹) tagging a rare TTR variant (MAF = 0.4%, P = 2.14 × 10⁻¹¹). All common variants explain ≥20% of the variance in TSH and FT4. Analysis of rare variants (MAF < 1%) using sequence kernel association testing reveals a novel association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function.
Publisher: Hindawi Limited
Date: 16-12-2021
DOI: 10.1002/HUMU.24309
Publisher: Springer Science and Business Media LLC
Date: 14-09-2015
DOI: 10.1038/NATURE14962
Publisher: Cold Spring Harbor Laboratory
Date: 19-08-2016
DOI: 10.1101/070581
Abstract: Worldwide, hundreds of thousands of humans have had their genomes or exomes sequenced, and access to the resulting data sets can provide valuable information for variant interpretation and understanding gene function. Here, we present a lightweight, flexible browser framework to display large population datasets of genetic variation. We demonstrate its use for exome sequence data from 60,706 individuals in the Exome Aggregation Consortium (ExAC). The ExAC browser provides gene- and transcript-centric displays of variation, a critical view for clinical applications. Additionally, we provide a variant display, which includes population frequency and functional annotation data as well as short read support for the called variant. This browser is open-source, freely available, and has already been used extensively by clinical laboratories worldwide.
Publisher: Cold Spring Harbor Laboratory
Date: 08-12-2017
DOI: 10.1101/230946
Abstract: Large-scale population based analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short read whole genome sequencing. However, standard short-read approaches, used primarily due to accuracy, throughput and costs, fail to give a complete picture of a genome. They struggle to identify large, balanced structural events, cannot access repetitive regions of the genome and fail to resolve the human genome into its two haplotypes. Here we describe an approach that retains long range information while harnessing the advantages of short reads. Starting from only ∼1 ng of DNA, we produce barcoded short read libraries. The use of novel informatic approaches allows for the barcoded short reads to be associated with the long molecules of origin producing a novel datatype known as ‘Linked-Reads’. This approach allows for simultaneous detection of small and large variants from a single Linked-Read library. We have previously demonstrated the utility of whole genome Linked-Reads (lrWGS) for performing diploid, de novo assembly of individual genomes (Weisenfeld et al. 2017). In this manuscript, we show the advantages of Linked-Reads over standard short read approaches for reference based analysis. We demonstrate the ability of Linked-Reads to reconstruct megabase scale haplotypes and to recover parts of the genome that are typically inaccessible to short reads, including phenotypically important genes such as STRC, SMN1 and SMN2. We demonstrate the ability of both lrWGS and Linked-Read Whole Exome Sequencing (lrWES) to identify complex structural variations, including balanced events, single exon deletions, and single exon duplications. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.
Publisher: American Association for the Advancement of Science (AAAS)
Date: 22-04-2016
Abstract: On average, most people's genomes contain approximately 100 completely nonfunctional genes. These loss-of-function (LOF) mutations tend to be rare and/or occur only as a single copy within individuals. Narasimhan et al. investigated LOF in a Pakistani population with high levels of consanguinity. Examining LOF alleles that were identical by descent, they found, as expected, an absence of homozygote LOF for certain protein-coding genes. However, they also identified many homozygote LOF alleles with no apparent deleterious phenotype, including some that were expected to confer genetic disease. Indeed, one family had lost the recombination-associated gene PRDM9. Science, this issue p. 474
Publisher: Springer Science and Business Media LLC
Date: 08-2016
DOI: 10.1038/NATURE19057
Publisher: Cold Spring Harbor Laboratory
Date: 11-10-2017
Abstract: Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues.
Publisher: Ovid Technologies (Wolters Kluwer Health)
Date: 11-2007
Publisher: Wiley
Date: 2004
DOI: 10.1002/BIES.20061
Abstract: The alpha-actinins are an ancient family of actin-binding proteins that play structural and regulatory roles in cytoskeletal organisation and muscle contraction. alpha-actinin-3 is the most highly specialised of the four mammalian alpha-actinins, with its expression restricted largely to fast glycolytic fibres in skeletal muscle. Intriguingly, a significant proportion (approximately 18%) of the human population is totally deficient in alpha-actinin-3 due to homozygosity for a premature stop codon polymorphism (R577X) in the ACTN3 gene. Recent work in our laboratory has revealed a strong association between R577X genotype and performance in a variety of athletic endeavours. We are currently exploring the function and evolutionary history of the ACTN3 gene and other alpha-actinin family members. The alpha-actinin family provides a fascinating case study in molecular evolution, illustrating phenomena such as functional redundancy in duplicate genes, the evolution of protein function, and the action of natural selection during recent human evolution.
Publisher: Walter de Gruyter GmbH
Date: 12-02-2022
Abstract: Hillslope hydrology in agricultural landscapes is complex due to a variety of hydropedological processes and field management possibilities. The aim was to test whether soil properties and water regime differ along the hillslope and to compare vineyard rows (vine) with inter-row (grass) areas for those properties. The study determined that there are significant differences in the contents of soil particle fractions, pH, and humus content along the slope (P < 0.0001), with a lower confidence level for bulk density (P < 0.05). Differences between row and inter-row space were significant for pH, humus, and silt content, but no differences were found for sand content, clay content, or bulk density. The study determined differences in soil water content among five slope positions (P < 0.0001) and between row and inter-row vineyard space (all with P < 0.05). Soil water content was higher at the upper slope positions (e.g., P1) than at lower slope positions, a pattern associated with clay content. However, it can be concluded that the retention of moisture on the slope is more influenced by local-scale soil properties (primarily soil texture) and variability of the crop (row/inter-row) than the position on the slope.
Publisher: Springer Science and Business Media LLC
Date: 06-09-2017
Publisher: Elsevier BV
Date: 06-2013
Publisher: Springer Science and Business Media LLC
Date: 03-04-2012
DOI: 10.1038/NRG3218
Publisher: F1000 Research Ltd
Date: 23-05-2017
DOI: 10.12688/WELLCOMEOPENRES.11640.1
Abstract: This software repository provides a pipeline for converting raw ClinVar data files into analysis-friendly tab-delimited tables, and also provides these tables for the most recent ClinVar release. Separate tables are generated for genome builds GRCh37 and GRCh38 as well as for mono-allelic variants and complex multi-allelic variants. Additionally, the tables are augmented with allele frequencies from the ExAC and gnomAD datasets as these are often consulted when analyzing ClinVar variants. Overall, this work provides ClinVar data in a format that is easier to work with and can be directly loaded into a variety of popular analysis tools such as R, python pandas, and SQL databases.
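Because the tables are plain tab-delimited text, they can be read with standard tooling. The sketch below uses Python's built-in `csv` module on a small in-memory example; the column names (`chrom`, `pos`, `ref`, `alt`, `clinical_significance`) and the rows are hypothetical placeholders, not the repository's actual schema or real ClinVar records.

```python
import csv
import io

# Hypothetical stand-in for one of the tab-delimited tables; a real file
# would be opened with open(path, newline="") instead of io.StringIO.
EXAMPLE_TABLE = (
    "chrom\tpos\tref\talt\tclinical_significance\n"
    "1\t12345\tA\tG\tPathogenic\n"
    "2\t67890\tC\tT\tBenign\n"
    "X\t13579\tG\tA\tPathogenic\n"
)


def load_variants(handle):
    """Parse a tab-delimited table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


def filter_by_significance(rows, label):
    """Keep rows whose (hypothetical) clinical_significance column equals label."""
    return [row for row in rows if row["clinical_significance"] == label]


rows = load_variants(io.StringIO(EXAMPLE_TABLE))
pathogenic = filter_by_significance(rows, "Pathogenic")
```

All values are read as strings, so numeric columns such as a position would need an explicit conversion (for example `int(row["pos"])`) before arithmetic or sorting.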
Publisher: Elsevier BV
Date: 11-2016
Publisher: Springer Science and Business Media LLC
Date: 05-06-2015
DOI: 10.1038/NCOMMS8074
Abstract: The analysis of individuals with ciliary chondrodysplasias can shed light on sensitive mechanisms controlling ciliogenesis and cell signalling that are essential to embryonic development and survival. Here we identify TCTEX1D2 mutations causing Jeune asphyxiating thoracic dystrophy with partially penetrant inheritance. Loss of TCTEX1D2 impairs retrograde intraflagellar transport (IFT) in humans and the protist Chlamydomonas, accompanied by destabilization of the retrograde IFT dynein motor. We thus define TCTEX1D2 as an integral component of the evolutionarily conserved retrograde IFT machinery. In complex with several IFT dynein light chains, it is required for correct vertebrate skeletal formation but may be functionally redundant under certain conditions.
Publisher: Cold Spring Harbor Laboratory
Date: 26-10-2020
DOI: 10.1101/2020.10.26.337352
Abstract: There has not yet been a systematic analysis of hESC whole genomes at a single nucleotide resolution. We therefore performed whole genome sequencing (WGS) of 143 hESC lines and annotated their single nucleotide and structural genetic variants. We found that while a substantial fraction of hESC lines contained large deleterious structural variants, finer scale structural and single nucleotide variants (SNVs) that are ascertainable only through WGS analyses were present in hESCs genomes and human blood-derived genomes at similar frequencies. However, WGS did identify SNVs associated with cancer or other diseases that will likely alter cellular phenotypes and may compromise the safety of hESC-derived cellular products transplanted into humans. As a resource to enable reproducible hESC research and safer translation, we provide a user-friendly WGS data portal and a data-driven scheme for cell line maintenance and selection. Merkle and Ghosh et al. describe insights from the whole genome sequences of commonly used human embryonic stem cell (hESC) lines. Analyses of these sequences show that while hESC genomes had more large structural variants than humans do from genetic inheritance, hESCs did not have an observable excess of finer-scale variants. However, many hESC lines contained rare loss-of-function variants and combinations of common variants that may profoundly shape their biological phenotypes. Thus, genome sequencing data can be valuable to those selecting cell lines for a given biological or clinical application, and the sequences and analysis reported here should facilitate such choices. 
One third of hESCs we analysed are siblings, and almost all are of European ancestry. Large structural variants are common in hESCs, but finer-scale variation is similar to that of human populations. Many strong-effect loss-of-function mutations and cancer-associated mutations are present in specific hESC lines. We provide user-friendly resources for rational hESC line selection based on genome sequence.
Publisher: Cold Spring Harbor Laboratory
Date: 24-02-2016
DOI: 10.1101/041111
Abstract: The accurate interpretation of variation in Mendelian disease genes has lagged behind data generation as sequencing has become increasingly accessible. Ongoing large sequencing efforts present huge interpretive challenges, but also provide an invaluable opportunity to characterize the spectrum and importance of rare variation. Here we analyze sequence data from 7,855 clinical cardiomyopathy cases and 60,706 ExAC reference samples to better understand genetic variation in a representative autosomal dominant disorder. We show that in some genes previously reported as important causes of a given cardiomyopathy, rare variation is not clinically informative and there is a high likelihood of false positive interpretation. By contrast, in other genes, we find that diagnostic laboratories may be overly conservative when assessing variant pathogenicity. We outline improved interpretation approaches for specific genes and variant classes and propose that these will increase the clinical utility of testing across a range of Mendelian diseases.
Publisher: Elsevier BV
Date: 10-2017
DOI: 10.1038/GIM.2017.26
Publisher: The Endocrine Society
Date: 09-2013
DOI: 10.1210/JC.2013-1102
Publisher: American Association for the Advancement of Science (AAAS)
Date: 08-05-2015
Abstract: Human genomes show extensive genetic variation across individuals, but we have only just started documenting the effects of this variation on the regulation of gene expression. Furthermore, only a few tissues have been examined per genetic variant. In order to examine how genetic expression varies among tissues within individuals, the Genotype-Tissue Expression (GTEx) Consortium collected 1641 postmortem samples covering 54 body sites from 175 individuals. They identified quantitative genetic traits that affect gene expression and determined which of these exhibit tissue-specific expression patterns. Melé et al. measured how transcription varies among tissues, and Rivas et al. looked at how truncated protein variants affect expression across tissues. Science, this issue p. 648, p. 660, p. 666; see also p. 640
Publisher: Elsevier BV
Date: 05-2021
Publisher: Cold Spring Harbor Laboratory
Date: 15-07-2022
DOI: 10.1101/2022.07.13.499964
Abstract: Synonymous mutations change the DNA sequence of a gene without affecting the amino acid sequence of the encoded protein. Although emerging evidence suggests that synonymous mutations can impact RNA splicing, translational efficiency, and mRNA stability 1 , studies in human genetics, mutagenesis screens, and other experiments and evolutionary analyses have repeatedly shown that most synonymous variants are neutral or only weakly deleterious, with some notable exceptions. In their recent article, Shen et al. claim to have disproved these well-established findings. They perform mutagenesis experiments in yeast and conclude that synonymous mutations frequently reduce fitness to the same extent as nonsynonymous mutations 2 . Based on their findings, the authors state that their results “imply that synonymous mutations are nearly as important as nonsynonymous mutations in causing disease.” An accompanying News and Views argues that “revising our expectations about synonymous mutations should expand our view of the genetic underpinnings of human health” 3 . Considering potential technical concerns with these experiments 4 and a large, coherent body of knowledge establishing the predominant neutrality of synonymous variants, we caution against interpreting this study in the context of human disease.
Publisher: Cold Spring Harbor Laboratory
Date: 22-11-2017
DOI: 10.1101/223297
Abstract: Constructed from the consensus of multiple variant callers based on short-read data, existing benchmark datasets for evaluating variant calling accuracy are biased toward easy regions accessible by known algorithms. We derived a new benchmark dataset from the de novo PacBio assemblies of two human cell lines that are homozygous across the whole genome. This benchmark provides a more accurate and less biased estimate of the error rate of small variant calls in a realistic context.
Publisher: Springer Science and Business Media LLC
Date: 27-05-2020
DOI: 10.1038/S41586-020-2329-2
Abstract: The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD) 1 , we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the ‘proportion expressed across transcripts’, which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project 2 and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases.
Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.
Publisher: Springer Science and Business Media LLC
Date: 13-02-2017
DOI: 10.1038/NG.3789
Publisher: Cold Spring Harbor Laboratory
Date: 12-03-2017
DOI: 10.1101/115964
Publisher: Cold Spring Harbor Laboratory
Date: 14-11-2015
DOI: 10.1101/031641
Abstract: Complete gene knockouts are highly informative about gene function. We exome sequenced 3,222 British Pakistani-heritage adults with high parental relatedness, discovering 1,111 rare-variant homozygous likely loss of function (rhLOF) genotypes predicted to disrupt (knockout) 781 genes. Based on depletion of rhLOF genotypes, we estimate that 13.6% of knockouts are incompatible with adult life, finding on average 1.6 heterozygous recessive lethal LOF variants per adult. Linking to lifelong health records, we observed no association of rhLOF genotypes with prescription- or doctor-consultation rate, and no disease-related phenotypes in 33 of 42 individuals with rhLOF genotypes in recessive Mendelian disease genes. Phased genome sequencing of a healthy PRDM9 knockout mother, her child and controls, showed meiotic recombination sites localised away from PRDM9-dependent hotspots, demonstrating PRDM9 redundancy in humans.
Publisher: Springer Science and Business Media LLC
Date: 22-10-2013
DOI: 10.1038/MP.2013.125
Publisher: Proceedings of the National Academy of Sciences
Date: 11-12-2017
Abstract: CRISPR-Cas9 holds enormous potential for therapeutic genome editing. Effective therapy requires treatment to be efficient and safe with minimal toxicity. The sequence-based targeting for CRISPR systems necessitates consideration of the unique genomes for each patient targeted for therapy. We show using 7,444 whole-genome sequences that SNPs and indels can reduce on-target CRISPR activity and increase off-target potential when targeting therapeutically implicated loci; however, these occurrences are relatively rare. We further identify that differential allele frequencies among populations may result in population-specific alterations in CRISPR targeting specificity. Our findings suggest that human genetic variation should be considered in the design and evaluation of CRISPR-based therapy to minimize risk of treatment failure and/or adverse outcomes.
Publisher: Springer Science and Business Media LLC
Date: 11-04-2016
DOI: 10.1038/NBT.3555
Publisher: BMJ
Date: 22-04-2017
Publisher: Springer Science and Business Media LLC
Date: 27-05-2020
DOI: 10.1038/S41591-020-0893-5
Abstract: Human genetic variants predicted to cause loss-of-function of protein-coding genes (pLoF variants) provide natural in vivo models of human gene inactivation and can be valuable indicators of gene function and the potential toxicity of therapeutic inhibitors targeting these genes 1,2 . Gain-of-kinase-function variants in LRRK2 are known to significantly increase the risk of Parkinson’s disease 3,4 , suggesting that inhibition of LRRK2 kinase activity is a promising therapeutic strategy. While preclinical studies in model organisms have raised some on-target toxicity concerns 5–8 , the biological consequences of LRRK2 inhibition have not been well characterized in humans. Here, we systematically analyze pLoF variants in LRRK2 observed across 141,456 individuals sequenced in the Genome Aggregation Database (gnomAD) 9 , 49,960 exome-sequenced individuals from the UK Biobank and over 4 million participants in the 23andMe genotyped dataset. After stringent variant curation, we identify 1,455 individuals with high-confidence pLoF variants in LRRK2. Experimental validation of three variants, combined with previous work 10 , confirmed reduced protein levels in 82.5% of our cohort. We show that heterozygous pLoF variants in LRRK2 reduce LRRK2 protein levels but that these are not strongly associated with any specific phenotype or disease state. Our results demonstrate the value of large-scale genomic databases and phenotyping of human loss-of-function carriers for target validation in drug discovery.
Publisher: Cold Spring Harbor Laboratory
Date: 19-02-2019
DOI: 10.1101/554444
Abstract: The acceleration of DNA sequencing in patients and population samples has resulted in unprecedented catalogues of human genetic variation, but the interpretation of rare genetic variants discovered using such technologies remains extremely challenging. A striking example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Through manual curation of putative loss of function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD) (1), we show that one explanation for this paradox involves alternative mRNA splicing, which allows exons of a gene to be expressed at varying levels across cell types. Currently, no existing annotation tool systematically incorporates this exon expression information into variant interpretation. Here, we develop a transcript-level annotation metric, the proportion expressed across transcripts (pext), which summarizes isoform quantifications for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression project (2) (GTEx) and show that it clearly differentiates between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder (ASD) and developmental disorders and intellectual disability (DD/ID) to show that pLoF variants in weakly expressed regions have effect sizes similar to those of synonymous variants, while pLoF variants in highly expressed exons are most strongly enriched among cases versus controls.
Our annotation is fast, flexible, and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for rare disease diagnosis, rare variant burden analyses in complex disorders, and curation and prioritization of variants in recall-by-genotype studies.
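As a rough illustration of what a "proportion expressed across transcripts" style metric could look like, the sketch below computes, per tissue, the share of a gene's transcript expression that comes from transcripts on which a variant carries a given annotation, then averages across tissues. The abstract does not spell out the normalization, so the formula, the function name `pext_sketch`, and the input layout are all assumptions made for illustration, not the authors' implementation.

```python
def pext_sketch(expression, annotated_transcripts):
    """Illustrative expression-weighted annotation score (assumed formula).

    expression: {tissue: {transcript_id: expression_value}} for one gene.
    annotated_transcripts: set of transcript IDs on which the variant has
        the annotation of interest (for example, pLoF).
    Returns (per_tissue_scores, mean_score_across_tissues).
    """
    per_tissue = {}
    for tissue, by_transcript in expression.items():
        gene_total = sum(by_transcript.values())
        if gene_total == 0:
            # Gene not expressed in this tissue: skip it rather than divide by zero.
            continue
        annotated_total = sum(
            value
            for transcript, value in by_transcript.items()
            if transcript in annotated_transcripts
        )
        per_tissue[tissue] = annotated_total / gene_total

    if not per_tissue:
        return per_tissue, float("nan")
    mean_score = sum(per_tissue.values()) / len(per_tissue)
    return per_tissue, mean_score


# Hypothetical gene with two transcripts; the variant is pLoF only on T2.
example_expression = {
    "muscle": {"T1": 9.0, "T2": 1.0},
    "brain": {"T1": 2.0, "T2": 2.0},
}
scores, mean_score = pext_sketch(example_expression, {"T2"})
```

In this toy input the variant sits on a transcript that contributes little of the gene's expression in muscle, so its score there is low. That is the kind of signal the abstract describes using to filter pLoF variants in weakly expressed regions.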
Publisher: Oxford University Press (OUP)
Date: 17-12-2016
DOI: 10.1093/HMG/DDV613
Publisher: Cold Spring Harbor Laboratory
Date: 14-11-2017
DOI: 10.1101/201178
Abstract: Comprehensive disease gene discovery in both common and rare diseases will require the efficient and accurate detection of all classes of genetic variation across tens to hundreds of thousands of human samples. We describe here a novel assembly-based approach to variant calling, the GATK HaplotypeCaller (HC) and Reference Confidence Model (RCM), that determines genotype likelihoods independently per-sample but performs joint calling across all samples within a project simultaneously. We show by calling over 90,000 samples from the Exome Aggregation Consortium (ExAC) that, in contrast to other algorithms, the HC-RCM scales efficiently to very large sample sizes without loss in accuracy and that the accuracy of indel variant calling is superior in comparison to other algorithms. More importantly, the HC-RCM produces a fully squared-off matrix of genotypes across all samples at every genomic position being investigated. The HC-RCM is a novel, scalable, assembly-based algorithm with abundant applications for population genetics and clinical studies.
Publisher: Elsevier BV
Date: 12-2012
Publisher: Elsevier BV
Date: 04-2020
Publisher: Springer Science and Business Media LLC
Date: 31-10-2016
DOI: 10.1038/NCOMMS13293
Abstract: As new proposals aim to sequence ever larger collections of humans, it is critical to have a quantitative framework to evaluate the statistical power of these projects. We developed a new algorithm, UnseenEst, and applied it to the exomes of 60,706 individuals to estimate the frequency distribution of all protein-coding variants, including rare variants that have not been observed yet in the current cohorts. Our results quantified the number of new variants that we expect to identify as sequencing cohorts reach hundreds of thousands of individuals. With 500K individuals, we find that we expect to capture 7.5% of all possible loss-of-function variants and 12% of all possible missense variants. We also estimate that 2,900 genes have loss-of-function frequency of <0.00001 in healthy humans, consistent with very strong intolerance to gene inactivation.
Publisher: Springer Science and Business Media LLC
Date: 22-02-2005
DOI: 10.1007/S00439-005-1261-8
Abstract: Physical fitness is a complex phenotype influenced by a myriad of environmental and genetic factors, and variation in human physical performance and athletic ability has long been recognised as having a strong heritable component. Recently, the development of technology for rapid DNA sequencing and genotyping has allowed the identification of some of the individual genetic variations that contribute to athletic performance. This review will examine the evidence that has accumulated over the last three decades for a strong genetic influence on human physical performance, with an emphasis on two sets of physical traits, viz. cardiorespiratory and skeletal muscle function, which are particularly important for performance in a variety of sports. We will then review recent studies that have identified individual genetic variants associated with variation in these traits and the polymorphisms that have been directly associated with elite athlete status. Finally, we explore the scientific implications of our rapidly growing understanding of the genetic basis of variation in performance.
Publisher: Ovid Technologies (Wolters Kluwer Health)
Date: 2007
Publisher: BMJ
Date: 24-11-2017
Abstract: Mutations in the gene coding for protein O-mannosyl-transferase 2 (POMT2) are known to cause severe congenital muscular dystrophy, and recently, mutations in POMT2 have also been linked to a milder limb-girdle muscular dystrophy (LGMD) phenotype, named LGMD type 2N (LGMD2N). Only four cases have been reported so far. We report 12 new cases of LGMD2N, aged 18–63 years (ClinicalTrials.gov ID: NCT02759302). Muscle involvement was assessed by MRI, muscle strength testing and muscle biopsy analysis. Other clinical features were also recorded. Presenting symptoms were difficulties in walking, pain during exercise, delayed motor milestones and learning disabilities at school. All had some degree of cognitive impairment. Brain MRIs were abnormal in 3 of 10 patients, showing ventricular enlargement in one, periventricular hyperintensities in another and frontal atrophy of the left hemisphere in a third patient. Most affected muscle groups were hip and knee flexors and extensors on strength testing. On MRI, most affected muscles were hamstrings followed by paraspinal and gluteal muscles. The 12 patients in our cohort carried 11 alleles with known mutations, whereas 11 novel mutations accounted for the remaining 13 alleles. We describe the first cohort of patients with LGMD2N and show that unlike other LGMD types, all patients had cognitive impairment. Primary muscle involvement was found in hamstring, paraspinal and gluteal muscles on MRI, which correlated well with reduced muscle strength in hip and knee flexors and extensors. The study expands the mutational spectrum for LGMD2N, with the description of 11 novel POMT2 mutations in association with LGMD2N.
Publisher: Springer Science and Business Media LLC
Date: 17-08-2016
DOI: 10.1038/NG.3638
Publisher: Elsevier BV
Date: 09-2019
Publisher: Ovid Technologies (Wolters Kluwer Health)
Date: 02-09-2016
Publisher: Elsevier BV
Date: 06-2018
Publisher: American Association for the Advancement of Science (AAAS)
Date: 19-04-2017
DOI: 10.1126/SCITRANSLMED.AAL5209
Abstract: Transcriptome sequencing improves the diagnostic rate for Mendelian disease in patients for whom genetic analysis has not returned a diagnosis.
Publisher: Ovid Technologies (Wolters Kluwer Health)
Date: 24-08-2018
Abstract: Congenital anomalies of the kidney and urinary tract (CAKUT) are the most prevalent cause of kidney disease in the first three decades of life. Previous gene panel studies showed monogenic causation in up to 12% of patients with CAKUT. We applied whole-exome sequencing to analyze the genotypes of individuals from 232 families with CAKUT, evaluating for mutations in single genes known to cause human CAKUT and genes known to cause CAKUT in mice. In consanguineous or multiplex families, we additionally performed a search for novel monogenic causes of CAKUT. In 29 families (13%), we detected a causative mutation in a known gene for isolated or syndromic CAKUT that sufficiently explained the patient’s CAKUT phenotype. In three families (1%), we detected a mutation in a gene reported to cause a phenocopy of CAKUT. In 15 of 155 families with isolated CAKUT, we detected deleterious mutations in syndromic CAKUT genes. Our additional search for novel monogenic causes of CAKUT in consanguineous and multiplex families revealed a potential single, novel monogenic CAKUT gene in 19 of 232 families (8%). We identified monogenic mutations in a known human CAKUT gene or CAKUT phenocopy gene as the cause of disease in 14% of the CAKUT families in this study. Whole-exome sequencing provides an etiologic diagnosis in a high fraction of patients with CAKUT and will provide a new basis for the mechanistic understanding of CAKUT.
Publisher: MDPI AG
Date: 20-01-2016
DOI: 10.3390/JPM6010005
Publisher: Cold Spring Harbor Laboratory
Date: 07-02-2017
DOI: 10.1101/106468
Abstract: Variants predicted to result in the loss of function (LoF) of human genes have attracted interest because of their clinical impact and surprising prevalence in healthy individuals. Here, we present ALoFT (Annotation of Loss-of-Function Transcripts), a method to annotate and predict the disease-causing potential of LoF variants. Using data from Mendelian disease-gene discovery projects, we show that ALoFT can distinguish between LoF variants deleterious as heterozygotes and those causing disease only in the homozygous state. Investigation of variants discovered in healthy populations suggests that each individual carries at least two heterozygous premature stop alleles that could potentially lead to disease if present as homozygotes. When applied to de novo pLoF variants in autism-affected families, ALoFT distinguishes between deleterious variants in patients and benign variants in unaffected siblings. Finally, analysis of somatic variants in 6,500 cancer exomes shows that pLoF variants predicted to be deleterious by ALoFT are enriched in known driver genes.
Publisher: Elsevier BV
Date: 11-2017
DOI: 10.1016/J.NMD.2017.06.013
Abstract: Mutations in the gene encoding the giant skeletal muscle protein titin are associated with a variety of muscle disorders, including recessive congenital myopathies ± cardiomyopathy, limb girdle muscular dystrophy (LGMD) and late onset dominant distal myopathy. Heterozygous truncating mutations have also been linked to dilated cardiomyopathy. The phenotypic spectrum of titinopathies is emerging and expanding, as next generation sequencing techniques make this large gene amenable to sequencing. We undertook whole exome sequencing in four individuals with LGMD. An essential splice site mutation, previously reported in dilated cardiomyopathy, was identified in all families in combination with a second truncating mutation. Affected individuals presented with childhood onset proximal weakness associated with joint contractures and elevated CK. Cardiac dysfunction was present in two individuals. Muscle biopsy showed increased internal nuclei and immunoblotting identified reduction or absence of calpain-3 and demonstrated a marked reduction of C-terminal titin fragments. We confirm the co-occurrence of cardiac and skeletal myopathies associated with recessive truncating titin mutations. Compound heterozygosity of a truncating mutation previously associated with dilated cardiomyopathy and a 'second truncation' in TTN was identified as causative in our skeletal myopathy patients. These findings add to the complexity of interpretation and genetic counselling for titin mutations.
Publisher: American Association for the Advancement of Science (AAAS)
Date: 18-10-2019
Abstract: Genetic variation is high among individuals, which makes it difficult to identify any one specific pathogenetic variant in patients with idiopathic disease, especially those that are in noncoding regions of the genome. Examining tissue-specific and population-level RNA sequencing data, Mohammadi et al. developed a statistical test, analysis of expression variation (ANEVA), that can quantify how one individual's gene expression fits in the context of the variation within the general population. By applying ANEVA to a dosage outlier test, the authors identified pathogenic gene transcripts in patients with Mendelian muscle dystrophy. Science, this issue p. 351
Publisher: Cold Spring Harbor Laboratory
Date: 28-01-2019
DOI: 10.1101/531210
Abstract: Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes critical for an organism’s function will be depleted for such variants in natural populations, while non-essential genes will tolerate their accumulation. However, predicted loss-of-function (pLoF) variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes 1 . Here, we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence pLoF variants in this cohort after filtering for sequencing and annotation artifacts. Using an improved human mutation rate model, we classify human protein-coding genes along a spectrum representing tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve gene discovery power for both common and rare diseases.
Publisher: American Geophysical Union (AGU)
Date: 04-2017
DOI: 10.1002/2017GC006854
Publisher: Springer Science and Business Media LLC
Date: 30-01-2017
Publisher: Springer Science and Business Media LLC
Date: 09-2013
DOI: 10.1038/NATURE12531
Publisher: Springer Science and Business Media LLC
Date: 03-08-2022
DOI: 10.1038/S41586-022-05035-Y
Abstract: Regulation of transcript structure generates transcript diversity and plays an important role in human disease.
Publisher: Oxford University Press (OUP)
Date: 07-2009
DOI: 10.1093/BIOINFORMATICS/BTP412
Abstract: Summary: We present a program to improve haplotype reconstruction by incorporating information from paired-end reads, and demonstrate its utility on simulated data. We find that given a fixed coverage, longer reads (implying fewer of them) are preferable. Availability: The executable and user manual can be freely downloaded from ftp://ftp.sanger.ac.uk ub/zn1/HI. Contact: ql2@sanger.ac.uk
Publisher: Cold Spring Harbor Laboratory
Date: 07-05-2015
Abstract: Genomic imprinting is an important regulatory mechanism that silences one of the parental copies of a gene. To systematically characterize this phenomenon, we analyze tissue specificity of imprinting from allelic expression data in 1582 primary tissue samples from 178 individuals from the Genotype-Tissue Expression (GTEx) project. We characterize imprinting in 42 genes, including both novel and previously identified genes. Tissue specificity of imprinting is widespread, and gender-specific effects are revealed in a small number of genes in muscle with stronger imprinting in males. IGF2 shows maternal expression in the brain instead of the canonical paternal expression elsewhere. Imprinting appears to have only a subtle impact on tissue-specific expression levels, with genes lacking a systematic expression difference between tissues with imprinted and biallelic expression. In summary, our systematic characterization of imprinting in adult tissues highlights variation in imprinting between genes, individuals, and tissues.
Publisher: Springer Science and Business Media LLC
Date: 29-08-2017
DOI: 10.1038/S41467-017-00443-5
Abstract: Variants predicted to result in the loss of function of human genes have attracted interest because of their clinical impact and surprising prevalence in healthy individuals. Here, we present ALoFT (annotation of loss-of-function transcripts), a method to annotate and predict the disease-causing potential of loss-of-function variants. Using data from Mendelian disease-gene discovery projects, we show that ALoFT can distinguish between loss-of-function variants that are deleterious as heterozygotes and those causing disease only in the homozygous state. Investigation of variants discovered in healthy populations suggests that each individual carries at least two heterozygous premature stop alleles that could potentially lead to disease if present as homozygotes. When applied to de novo putative loss-of-function variants in autism-affected families, ALoFT distinguishes between deleterious variants in patients and benign variants in unaffected siblings. Finally, analysis of somatic variants in cancer exomes shows that putative loss-of-function variants predicted to be deleterious by ALoFT are enriched in known driver genes.
Publisher: Cold Spring Harbor Laboratory
Date: 10-03-2019
DOI: 10.1101/573378
Abstract: Multi-nucleotide variants (MNVs), defined as two or more nearby variants existing on the same haplotype in an individual, are a clinically and biologically important class of genetic variation. However, existing tools for variant interpretation typically do not accurately classify MNVs, and understanding of their mutational origins remains limited. Here, we systematically survey MNVs in 125,748 whole exomes and 15,708 whole genomes from the Genome Aggregation Database (gnomAD). We identify 1,996,125 MNVs across the genome with constituent variants falling within 2 bp distance of one another, of which 31,510 exist within the same codon, including 405 predicted to result in gain of a nonsense mutation, 1,818 predicted to rescue a nonsense mutation event that would otherwise be caused by one of the constituent variants, and 16,481 additional variants predicted to alter protein sequences. We show that the distribution of MNVs is highly non-uniform across the genome, and that this non-uniformity can be largely explained by a variety of known mutational mechanisms, such as CpG deamination, replication error by polymerase zeta, or polymerase slippage at repeat junctions. We also provide an estimate of the dinucleotide mutation rate caused by polymerase zeta. Finally, we show that differential CpG methylation drives MNV differences across functional categories. Our results demonstrate the importance of incorporating haplotype-aware annotation for accurate functional interpretation of genetic variation, and refine our understanding of genome-wide mutational mechanisms of MNVs.
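To make the "gained nonsense" and "rescued nonsense" categories concrete, the following minimal sketch translates a codon with each of two single-nucleotide changes applied separately and then together on the same haplotype, using the standard genetic code. The helper names (`mutate`, `annotate_pair`) and the example codons are hypothetical and chosen only for illustration; this is not the annotation tooling used in the study.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def mutate(codon, snvs):
    """Apply (position, alt_base) substitutions to a three-base codon."""
    bases = list(codon)
    for position, alt_base in snvs:
        bases[position] = alt_base
    return "".join(bases)


def annotate_pair(ref_codon, snv_a, snv_b):
    """Translate the codon with each SNV alone and with both on one haplotype."""
    return {
        "ref": CODON_TABLE[ref_codon],
        "a_alone": CODON_TABLE[mutate(ref_codon, [snv_a])],
        "b_alone": CODON_TABLE[mutate(ref_codon, [snv_b])],
        "together": CODON_TABLE[mutate(ref_codon, [snv_a, snv_b])],
    }


# Gained nonsense: each change alone is missense, both together give a stop.
gained = annotate_pair("CAT", (0, "T"), (2, "A"))

# Rescued nonsense: the first change alone gives a stop, but the pair does not.
rescued = annotate_pair("CAA", (0, "T"), (1, "C"))
```

Here `"*"` denotes a stop codon. Annotating each variant on its own would miss the stop in the first case and wrongly report one in the second, which is the motivation for the haplotype-aware annotation the abstract calls for.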
Publisher: Elsevier BV
Date: 07-2018
DOI: 10.1016/J.NMD.2018.04.012
Abstract: We describe two Finnish siblings in whom an incidentally detected elevated creatine kinase activity eventually led to a diagnosis of limb-girdle muscular dystrophy-dystroglycanopathy (Type C12 MDDGC12). When diagnosed at age 10 and 13 years, they were mildly affected with a slow or non-progressive disease course. The main symptoms comprised infrequent hip cramps triggered by flexion, neck cramps triggered by yawning, transient growing pains, calf hypertrophy and mild proximal muscle weakness. Their cognitive and motor developments were unremarkable and they were physically active. Whole-exome sequencing revealed compound heterozygous mutations, both of which were novel, in the protein O-mannosyl kinase (POMK) gene in both siblings: a missense mutation, p.Pro322Leu (c.965C > T), and a nonsense mutation, p.Arg46Ter (c.136C > T). The results were confirmed by Sanger sequencing, showing that the parents were heterozygous carriers of one mutation each. This report adds to the literature by providing phenotype and genotype data on this ultra-rare POMK-related dystroglycanopathy.
Publisher: American Association for the Advancement of Science (AAAS)
Date: 24-09-2021
Abstract: Over the next decade, the primary challenge in human genetics will be to understand the biological mechanisms by which genetic variants influence phenotypes, including disease risk. Although the scale of this challenge is daunting, better methods for functional variant interpretation will have transformative consequences for disease diagnosis, risk prediction, and the development of new therapies. An array of new methods for characterizing variant impact at scale, using patient tissue samples as well as in vitro models, are already being applied to dissect variant mechanisms across a range of human cell types and environments. These approaches are also increasingly being deployed in clinical settings. We discuss the rationale, approaches, applications, and future outlook for characterizing the molecular and cellular effects of genetic variants.
Publisher: Cold Spring Harbor Laboratory
Date: 24-01-2022
Abstract: Genomic databases of allele frequency are extremely helpful for evaluating clinical variants of unknown significance; however, until now, databases such as the Genome Aggregation Database (gnomAD) have focused on nuclear DNA and have ignored the mitochondrial genome (mtDNA). Here, we present a pipeline to call mtDNA variants that addresses three technical challenges: (1) detecting homoplasmic and heteroplasmic variants, present, respectively, in all or a fraction of mtDNA molecules; (2) the circular mtDNA genome; and (3) misalignment of nuclear sequences of mitochondrial origin (NUMTs). We observed that mtDNA copy number per cell varied across gnomAD cohorts and influenced the fraction of NUMT-derived false-positive variant calls, which can account for the majority of putative heteroplasmies. To avoid false positives, we excluded contaminated samples, cell lines, and samples prone to NUMT misalignment due to few mtDNA copies. Furthermore, we report variants with heteroplasmy ≥10%. We applied this pipeline to 56,434 whole-genome sequences in the gnomAD v3.1 database that includes individuals of European (58%), African (25%), Latino (10%), and Asian (5%) ancestry. Our gnomAD v3.1 release contains population frequencies for 10,850 unique mtDNA variants at more than half of all mtDNA bases. Importantly, we report frequencies within each nuclear ancestral population and mitochondrial haplogroup. Homoplasmic variants account for most variant calls (98%) and unique variants (85%). We observed that 1/250 individuals carry a pathogenic mtDNA variant with heteroplasmy above 10%. These mtDNA population allele frequencies are freely accessible and will aid in diagnostic interpretation and research studies.
Publisher: Elsevier BV
Date: 2008
DOI: 10.1016/J.NMD.2007.08.009
Abstract: We characterized the frequency of limb-girdle muscular dystrophy (LGMD) subtypes in a cohort of 76 Australian muscular dystrophy patients using protein and DNA sequence analysis. Calpainopathies (8%) and dysferlinopathies (5%) are the most common causes of LGMD in Australia. In contrast to European populations, cases of LGMD2I (due to mutations in FKRP) are rare in Australasia (3%). We have identified a cohort of patients in whom all common disease candidates have been excluded, providing a valuable resource for identification of new disease genes. Cytoplasmic localization of dysferlin correlates with fiber regeneration in a subset of muscular dystrophy patients. In addition, we have identified a group of patients with unidentified forms of LGMD and with markedly abnormal dysferlin localization that does not correlate with fiber regeneration. This pattern is mimicked in primary caveolinopathy, suggesting a subset of these patients may also possess mutations within proteins required for membrane targeting of dysferlin.
Publisher: Cold Spring Harbor Laboratory
Date: 07-02-2017
DOI: 10.1101/106427
Abstract: Family trees have vast applications in multiple fields from genetics to anthropology and economics. However, the collection of extended family trees is tedious and usually relies on resources with limited geographical scope and complex data usage restrictions. Here, we collected 86 million profiles from publicly available online data from genealogy enthusiasts. After extensive cleaning and validation, we obtained population-scale family trees, including a single pedigree of 13 million individuals. We leveraged the data to partition the genetic architecture of longevity by inspecting millions of relative pairs and to provide insights to population genetics theories on the dispersion of families. We also report a simple digital procedure to overlay other datasets with our resource in order to empower studies with population-scale genealogical data. Using massive crowd-sourced genealogy data, we created a population-scale family tree resource for scientific studies.
Publisher: American Association for the Advancement of Science (AAAS)
Date: 13-04-2018
Abstract: Human relationships, as documented by family trees, can elucidate the heritability of a host of medical and biological parameters. Kaplanis et al. collected 86 million publicly available profiles from a crowd-sourced genealogy website and used them to examine the genetic architecture of human longevity and migration patterns (see the Perspective by Lussier and Keinan). Various models of inheritance suggested that life span is predominantly attributable to additive genetic effects, with a smaller component from dominant genetic inheritance. The data also suggested that relatedness between individuals is less attributable to advances in human transportation than to cultural changes. Science, this issue p. 171; see also p. 153
Publisher: eLife Sciences Publications, Ltd
Date: 24-03-2020
DOI: 10.7554/ELIFE.54363
Abstract: By sequencing autozygous human populations, we identified a healthy adult woman with lifelong complete knockout of HAO1 (expected ~1 in 30 million outbred people). HAO1 (glycolate oxidase) silencing is the mechanism of lumasiran, an investigational RNA interference therapeutic for primary hyperoxaluria type 1. Her plasma glycolate levels were 12 times, and urinary glycolate 6 times, the upper limit of normal observed in healthy reference individuals (n = 67). Plasma metabolomics and lipidomics (1871 biochemicals) revealed 18 markedly elevated biochemicals ( sd outliers versus n = 25 controls) suggesting additional HAO1 effects. Comparison with lumasiran preclinical and clinical trial data suggested she has % residual glycolate oxidase activity. Cell line p.Leu333SerfsTer4 expression showed markedly reduced HAO1 protein levels and cellular protein mis-localisation. In this woman, lifelong HAO1 knockout is safe and without clinical phenotype, de-risking a therapeutic approach and informing therapeutic mechanisms. Unlocking evidence from the diversity of human genetic variation can facilitate drug development.
Publisher: Elsevier BV
Date: 08-2016
DOI: 10.1016/J.NMD.2016.05.013
Abstract: TorsinA-interacting protein 1 (TOR1AIP1) gene is a novel gene that has recently been described to cause limb-girdle muscular dystrophy (LGMD) with mild dilated cardiomyopathy. We report a family with mutations in TOR1AIP1 where the striking clinical feature is severe cardiac failure requiring cardiac transplant in two siblings, in addition to musculoskeletal weakness and muscular dystrophy. We demonstrate an absence of TOR1AIP1 protein expression in cardiac and skeletal muscles of affected siblings. We expand the phenotype of this gene to demonstrate the cardiac involvement and the importance of cardiac surveillance in patients with mutations in TOR1AIP1.
Publisher: Public Library of Science (PLoS)
Date: 18-06-2008
Publisher: Springer Science and Business Media LLC
Date: 27-05-2020
DOI: 10.1038/S41586-020-2267-Z
Abstract: Naturally occurring human genetic variants that are predicted to inactivate protein-coding genes provide an in vivo model of human gene inactivation that complements knockout studies in cells and model organisms. Here we report three key findings regarding the assessment of candidate drug targets using human loss-of-function variants. First, even essential genes, in which loss-of-function variants are not tolerated, can be highly successful as targets of inhibitory drugs. Second, in most genes, loss-of-function variants are sufficiently rare that genotype-based ascertainment of homozygous or compound heterozygous ‘knockout’ humans will await sample sizes that are approximately 1,000 times those presently available, unless recruitment focuses on consanguineous individuals. Third, automated variant annotation and filtering are powerful, but manual curation remains crucial for removing artefacts, and is a prerequisite for recall-by-genotype efforts. Our results provide a roadmap for human knockout studies and should guide the interpretation of loss-of-function variants in drug development.
Publisher: Proceedings of the National Academy of Sciences
Date: 05-10-2015
Abstract: Both the mitochondrial respiratory chain and reactive oxygen species (ROS) control numerous physiological and pathological cellular responses. ROS such as hydrogen peroxide (H2O2) are thought to initiate signaling by broadly and nonspecifically redox-modifying signaling molecules, suggesting that H2O2 signaling may be distinct from other signal transduction pathways. Here, we provide evidence suggesting that H2O2 signaling is under control of what appears to be a typical signal transduction cascade that connects the respiratory chain to the mitochondrial intermembrane space-localized conserved Syk pathway and results in a focused signaling response in diverse cell types. The results thus reveal a mechanism that allows the respiratory chain to communicate with the remainder of the cell in response to ROS.
Publisher: Springer Science and Business Media LLC
Date: 2009
Publisher: Ovid Technologies (Wolters Kluwer Health)
Date: 15-11-2019
Abstract: The discovery of monogenic causes of nephrotic syndrome led to insights about the role of podocytes and the slit diaphragm in the pathogenesis of the disease. The authors describe novel mutations in TBC1D8B in five families with steroid-resistant nephrotic syndrome. TBC1D8B binds to active RAB11A and RAB11B. Silencing TBC1D8B leads to upregulation of RAB11-dependent processes, suggesting TBC1D8B inhibits RAB11. TBC1D8B also interacts and colocalizes with the slit diaphragm protein nephrin. Silencing TBC1D8B in podocyte-like Drosophila nephrocytes causes mistrafficking of fly nephrin. Nephrin trafficking in Drosophila requires Rab11, whereas overexpression of Rab11 causes a similar phenotype as TBC1D8B silencing. These findings implicate regulation of RAB11-dependent vesicular trafficking by TBC1D8B as a novel pathogenetic pathway in nephrotic syndrome. Mutations in about 50 genes have been identified as monogenic causes of nephrotic syndrome, a frequent cause of CKD. These genes delineated the pathogenetic pathways and rendered significant insight into podocyte biology. We used whole-exome sequencing to identify novel monogenic causes of steroid-resistant nephrotic syndrome (SRNS). We analyzed the functional significance of an SRNS-associated gene in vitro and in podocyte-like Drosophila nephrocytes. We identified hemizygous missense mutations in the gene TBC1D8B in five families with nephrotic syndrome. Coimmunoprecipitation assays indicated interactions between TBC1D8B and active forms of RAB11. Silencing TBC1D8B in HEK293T cells increased basal autophagy and exocytosis, two cellular functions that are independently regulated by RAB11. This suggests that TBC1D8B plays a regulatory role by inhibiting endogenous RAB11. Coimmunoprecipitation assays showed TBC1D8B also interacts with the slit diaphragm protein nephrin, and colocalizes with it in immortalized cell lines.
Overexpressed murine Tbc1d8b with patient-derived mutations had lower affinity for endogenous RAB11 and nephrin compared with wild-type Tbc1d8b protein. Knockdown of Tbc1d8b in Drosophila impaired function of the podocyte-like nephrocytes, and caused mistrafficking of Sns, the Drosophila ortholog of nephrin. Expression of Rab11 RNAi in nephrocytes entailed defective delivery of slit diaphragm protein to the membrane, whereas RAB11 overexpression revealed a partial phenotypic overlap to Tbc1d8b loss of function. Novel mutations in TBC1D8B are monogenic causes of SRNS. This gene inhibits RAB11. Our findings suggest that RAB11-dependent vesicular nephrin trafficking plays a role in the pathogenesis of nephrotic syndrome.
Publisher: Springer Science and Business Media LLC
Date: 16-07-2018
Publisher: Cold Spring Harbor Laboratory
Date: 02-09-2016
DOI: 10.1101/073114
Abstract: Whole exome and genome sequencing have transformed the discovery of genetic variants that cause human Mendelian disease, but discriminating pathogenic from benign variants remains a daunting challenge. Rarity is recognised as a necessary, although not sufficient, criterion for pathogenicity, but frequency cutoffs used in Mendelian analysis are often arbitrary and overly lenient. Recent very large reference datasets, such as the Exome Aggregation Consortium (ExAC), provide an unprecedented opportunity to obtain robust frequency estimates even for very rare variants. Here we present a statistical framework for the frequency-based filtering of candidate disease-causing variants, accounting for disease prevalence, genetic and allelic heterogeneity, inheritance mode, penetrance, and sampling variance in reference datasets. Using the example of cardiomyopathy, we show that our approach reduces by two-thirds the number of candidate variants under consideration in the average exome, and identifies 43 variants previously reported as pathogenic that can now be reclassified. We present precomputed allele frequency cutoffs for all variants in the ExAC dataset.
Publisher: Public Library of Science (PLoS)
Date: 05-2014
Publisher: Elsevier BV
Date: 2019
Publisher: Cold Spring Harbor Laboratory
Date: 12-11-2015
DOI: 10.1101/031518
Abstract: A major goal of biomedicine is to understand the function of every gene in the human genome. Null mutations can disrupt both copies of a given gene in humans, and phenotypic analysis of such 'human knockouts' can provide insight into gene function. To date, comprehensive analysis of genes knocked out in humans has been limited by the fact that null mutations are infrequent in the general population, so observing an individual homozygous null for a given gene is exceedingly rare. However, consanguineous unions are more likely to result in offspring who carry homozygous null mutations. In Pakistan, consanguinity rates are notably high. Here, we sequenced the protein-coding regions of 7,078 adult participants living in Pakistan and performed phenotypic analysis to identify homozygous null individuals and to understand consequences of complete gene disruption in humans. We enumerated 36,850 rare ( % minor allele frequency) null mutations. These homozygous null mutations led to complete inactivation of 961 genes in at least one participant. Homozygosity for null mutations at APOC3 was associated with absent plasma apolipoprotein C-III levels; at PLA2G7, with absent enzymatic activity of soluble lipoprotein-associated phospholipase A2; at CYP2F1, with higher plasma interleukin-8 concentrations; and at either A3GALT2 or NRG4, with markedly reduced plasma insulin C-peptide concentrations. After physiologic challenge with oral fat, APOC3 knockouts displayed marked blunting of the usual post-prandial rise in plasma triglycerides compared to wild-type family members. These observations provide a roadmap to understand the consequences of complete disruption of a large fraction of genes in the human genome.
Publisher: Cold Spring Harbor Laboratory
Date: 23-01-2021
DOI: 10.1101/2021.01.22.427687
Abstract: Regulation of transcript structure generates transcript diversity and plays an important role in human disease. The advent of long-read sequencing technologies offers the opportunity to study the role of genetic variation in transcript structure. In this paper, we present a large human long-read RNA-seq dataset using the Oxford Nanopore Technologies platform from 88 samples from GTEx tissues and cell lines, complementing the GTEx resource. We identified just under 100,000 new transcripts for annotated genes, and validated the protein expression of a similar proportion of novel and annotated transcripts. We developed a new computational package, LORALS, to analyze genetic effects of rare and common variants on the transcriptome via allele-specific analysis of long reads. We called allele-specific expression and transcript structure events, providing novel insights into the specific transcript alterations caused by common and rare genetic variants and highlighting the resolution gained from long-read data. We were able to perturb transcript structure upon knockdown of PTBP1, an RNA binding protein that mediates splicing, thereby finding genetic regulatory effects that are modified by the cellular environment. Finally, we use this dataset to enhance variant interpretation and study rare variants leading to aberrant splicing patterns.
Publisher: Springer Science and Business Media LLC
Date: 09-09-2007
DOI: 10.1038/NG2122
Abstract: More than a billion humans worldwide are predicted to be completely deficient in the fast skeletal muscle fiber protein alpha-actinin-3 owing to homozygosity for a premature stop codon polymorphism, R577X, in the ACTN3 gene. The R577X polymorphism is associated with elite athlete status and human muscle performance, suggesting that alpha-actinin-3 deficiency influences the function of fast muscle fibers. Here we show that loss of alpha-actinin-3 expression in a knockout mouse model results in a shift in muscle metabolism toward the more efficient aerobic pathway and an increase in intrinsic endurance performance. In addition, we demonstrate that the genomic region surrounding the 577X null allele shows low levels of genetic variation and recombination in individuals of European and East Asian descent, consistent with strong, recent positive selection. We propose that the 577X allele has been positively selected in some human populations owing to its effect on skeletal muscle metabolism.
Publisher: Cold Spring Harbor Laboratory
Date: 13-11-2017
DOI: 10.1101/218875
Abstract: Phenome-wide association studies (PheWAS), which assess whether a genetic variant is associated with multiple phenotypes across a phenotypic spectrum, have been proposed as a possible aid to drug development through elucidating mechanisms of action, identifying alternative indications, or predicting adverse drug events (ADEs). Here, we evaluate whether PheWAS can inform target validation during drug development. We selected 25 single nucleotide polymorphisms (SNPs) linked through genome-wide association studies (GWAS) to 19 candidate drug targets for common disease therapeutic indications. We independently interrogated these SNPs through PheWAS in four large “real-world data” cohorts (23andMe, UK Biobank, FINRISK, CHOP) for association with a total of 1,892 binary endpoints. We then conducted meta-analyses for 145 harmonized disease endpoints in up to 697,815 individuals and joined results with summary statistics from 57 published GWAS. Our analyses replicate 70% of known GWAS associations and identify 10 novel associations with study-wide significance after multiple test correction (P .8x10 -6 out of 72 novel associations with FDR .1). By leveraging directionality and point estimate of the effect sizes, we describe new associations that may predict ADEs, e.g., acne, high cholesterol, gout and gallstones for rs738409 (p.I148M) in PNPLA3 or asthma for rs1990760 (p.T946A) in IFIH1. We further propose how quantitative estimates of genetic safety/efficacy profiles can be used to help prioritize candidate targets for a specific indication. Our results demonstrate PheWAS as a powerful addition to the toolkit for drug discovery. Matching genetics with phenotypes in 800,000 individuals predicts efficacy and on-target safety of future drugs.
Publisher: Springer Science and Business Media LLC
Date: 12-10-2017
DOI: 10.1038/NATURE24041
Publisher: Hindawi Limited
Date: 03-10-2018
DOI: 10.1002/HUMU.23655
Publisher: Elsevier BV
Date: 2021
Publisher: Elsevier BV
Date: 07-2013
Publisher: Cold Spring Harbor Laboratory
Date: 11-10-2017
Abstract: The impact of inherited genetic variation on gene expression in humans is well-established. The majority of known expression quantitative trait loci (eQTLs) impact expression of local genes (cis-eQTLs). More research is needed to identify effects of genetic variation on distant genes (trans-eQTLs) and understand their biological mechanisms. One common trans-eQTL mechanism is “mediation” by a local (cis) transcript. Thus, mediation analysis can be applied to genome-wide SNP and expression data in order to identify transcripts that are “cis-mediators” of trans-eQTLs, including those “cis-hubs” involved in regulation of many trans-genes. Identifying such mediators helps us understand regulatory networks and suggests biological mechanisms underlying trans-eQTLs, both of which are relevant for understanding susceptibility to complex diseases. The multitissue expression data from the Genotype-Tissue Expression (GTEx) program provides a unique opportunity to study cis-mediation across human tissue types. However, the presence of complex hidden confounding effects in biological systems can make mediation analyses challenging and prone to confounding bias, particularly when conducted among diverse samples. To address this problem, we propose a new method: Genomic Mediation analysis with Adaptive Confounding adjustment (GMAC). It enables the search of a very large pool of variables, and adaptively selects potential confounding variables for each mediation test. Analyses of simulated data and GTEx data demonstrate that the adaptive selection of confounders by GMAC improves the power and precision of mediation analysis. Application of GMAC to GTEx data provides new insights into the observed patterns of cis-hubs and trans-eQTL regulation across tissue types.
Publisher: Springer Science and Business Media LLC
Date: 29-05-2013
DOI: 10.1038/NG.2653
Publisher: Proceedings of the National Academy of Sciences
Date: 12-04-2012
Publisher: Elsevier BV
Date: 02-2017
DOI: 10.1038/GIM.2016.90
Publisher: Springer Science and Business Media LLC
Date: 09-01-2017
DOI: 10.1038/NG.3743
No related grants have been discovered for Daniel MacArthur.