ARDC Research Link Australia

Publication

Sex-specific survival bias and interaction modeling in coronary artery disease risk prediction

Publisher: Cold Spring Harbor Laboratory

Date: 28-06-2021

DOI: 10.1101/2021.06.23.21259247

Abstract: The 10-year Atherosclerotic Cardiovascular Disease (ASCVD) risk score is the standard approach to predicting the risk of incident cardiovascular events, and recently the addition of CAD polygenic scores (PGS_CAD) has been evaluated. Although age and sex strongly predict the risk of CAD, their interaction with genetic risk prediction has not been systematically examined. This study performed an in-depth evaluation of age and sex effects in genetic CAD risk prediction. The population-based Norwegian HUNT2 cohort of 51,036 individuals was used as the primary dataset. Findings were replicated in the UK Biobank (372,410 individuals). Models for 10-year CAD risk were fitted using Cox proportional hazards, and Harrell’s concordance index, sensitivity, and specificity were compared. Inclusion of age and sex interactions of PGS_CAD in the prediction models increased the C-index and sensitivity, likely countering the observed survival bias in the baseline. The sensitivity for females was lower than for males in all models including genetic information. The two-step approach identified a total of 82.6% of incident CAD cases (74.1% by ASCVD risk score and an additional 8.5% by the PGS_CAD interaction model). These findings highlight the importance and complexity of genetic risk in predicting CAD. There is a need for modeling age and sex interaction terms with polygenic scores to optimize detection of individuals at high risk, those who warrant preventive interventions. Sex-specific studies are needed to understand and estimate CAD risk with genetic information. This study used two large population-based longitudinal datasets to evaluate genetic prediction of CAD including age and sex interactions. The model fit and sensitivity of the prediction models increased when including age and sex interactions of PGS_CAD in the prediction models, likely countering the observed survival bias in the baseline.
The sensitivity for females was lower than for males in all models including genetic information. Our results highlight the importance and complexity of genetic risk and suggest including age and sex interactions with polygenic scores to identify more high-risk individuals for preventive interventions.
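
To illustrate the concordance index (C-index) that the abstract above uses to compare models, here is a minimal Python sketch of Harrell's estimator for right-censored data. The function name, the toy data, and the simplified handling of tied follow-up times are assumptions made for illustration; this is not the study's implementation.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times:       follow-up time per individual
    events:      1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification (assumption): pairs with tied times are skipped.
        # A pair is comparable only if the shorter time ends in an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four individuals, the third one censored.
toy_c = harrell_c_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.4, 0.5, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of event times by predicted risk, which is why small increases in the C-index are reported as improvements in discrimination.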

Publication

The Biomarker GlycA Is Associated with Chronic Inflammation and Predicts Long-Term Risk of Severe Infection

Publisher: Elsevier BV

Date: 10-2015

DOI: 10.1016/J.CELS.2015.09.007

Abstract: The biomarker glycoprotein acetylation (GlycA) has been shown to predict risk of cardiovascular disease and all-cause mortality. Here, we characterize biological processes associated with GlycA by leveraging population-based omics data and health records from >10,000 individuals. Our analyses show that GlycA levels are chronic within individuals for up to a decade. In apparently healthy individuals, elevated GlycA corresponded to elevation of myriad inflammatory cytokines, as well as a gene coexpression network indicative of increased neutrophil activity, suggesting that individuals with high GlycA may be in a state of chronic inflammatory response. Accordingly, analysis of infection-related hospitalization and death records showed that increased GlycA increased long-term risk of severe non-localized and respiratory infections, particularly septicaemia and pneumonia. In total, our work demonstrates that GlycA is a biomarker for chronic inflammation, neutrophil activity, and risk of future severe infection. It also illustrates the utility of leveraging multi-layered omics data and health records to elucidate the molecular and cellular processes associated with biomarkers.

Publication

Genome-wide association and Mendelian randomization analysis prioritizes bioactive metabolites with putative causal effects on common diseases

Publisher: Cold Spring Harbor Laboratory

Date: 04-08-2020

DOI: 10.1101/2020.08.01.20166413

Abstract: Bioactive metabolites are central to numerous pathways and disease pathophysiology, yet many bioactive metabolites are still uncharacterized. Here, we quantified bioactive metabolites using untargeted LC-MS plasma metabolomics in two large cohorts (combined N≈9,300) and utilized genome-wide association analysis and Mendelian randomization to uncover genetic loci with roles in bioactive metabolism and prioritize metabolite features for more in-depth characterization. We identified 118 loci associated with levels of 2,319 distinct metabolite features which replicated across cohorts and reached study-wide significance in meta-analysis. Of these loci, 39 were previously not known to be associated with blood metabolites. Loci harboring SLCO1B1 and UGT1A were highly pleiotropic, accounting for % of all associations. Two-sample Mendelian randomization found 46 causal effects of 31 metabolite features on at least one of five common diseases. Of these, 15, including leukotriene D4, had protective effects on both coronary heart disease and primary sclerosing cholangitis. We further assessed the association between baseline metabolite features and incident coronary heart disease using 16 years of follow-up health records. This study characterizes the genetic landscape of bioactive metabolite features and their putative causal effects on disease.

Publication

Power, false discovery rate and Winner’s Curse in eQTL studies

Publisher: Cold Spring Harbor Laboratory

Date: 25-10-2017

DOI: 10.1101/209171

Abstract: Investigation of the genetic architecture of gene expression traits has aided interpretation of disease and trait-associated genetic variants; however, key aspects of expression quantitative trait locus (eQTL) study design and analysis remain understudied. We used extensive, empirically-driven simulations to explore eQTL study design and the performance of various analysis strategies. Across multiple testing correction methods, false discoveries of genes with eQTLs (eGenes) were substantially inflated when false discovery rate (FDR) control was applied to all tests, and only appropriately controlled using hierarchical procedures. All multiple testing correction procedures had low power and inflated FDR for eGenes whose causal SNPs had small allele frequencies using small sample sizes (e.g. frequency % in 100 samples), indicating that even moderately low frequency eQTL SNPs (eSNPs) in these studies are enriched for false discoveries. In scenarios with ≥80% power, the top eSNP was the true simulated eSNP 90% of the time, but substantially less frequently for very common eSNPs (minor allele frequencies %). Overestimation of eQTL effect sizes, so-called “Winner’s Curse”, was common in low and moderate power settings. To address this, we developed a bootstrap method (BootstrapQTL) which led to more accurate effect size estimation. These insights provide a foundation for future eQTL studies, especially those with sampling constraints and subtly different conditions.

Publication

Using Polygenic Risk Scores for Prioritizing Individuals at Greatest Need of a Cardiovascular Disease Risk Assessment

Publisher: Ovid Technologies (Wolters Kluwer Health)

Date: 08-2023

DOI: 10.1161/JAHA.122.029296

Abstract: The aim of this study was to provide quantitative evidence of the use of polygenic risk scores for systematically identifying individuals for invitation for full formal cardiovascular disease (CVD) risk assessment. A total of 108 685 participants aged 40 to 69 years, with measured biomarkers, linked primary care records, and genetic data in UK Biobank were used for model derivation and population health modeling. Prioritization tools using age, polygenic risk scores for coronary artery disease and stroke, and conventional risk factors for CVD available within longitudinal primary care records were derived using sex‐specific Cox models. We modeled the implications of initiating guideline‐recommended statin therapy after prioritizing individuals for invitation to a formal CVD risk assessment. If primary care records were used to prioritize individuals for formal risk assessment using age‐ and sex‐specific thresholds corresponding to 5% false‐negative rates, then the numbers of men and women needed to be screened to prevent 1 CVD event are 149 and 280, respectively. In contrast, adding polygenic risk scores to both prioritization and formal assessments, and selecting thresholds to capture the same number of events, resulted in a number needed to screen of 116 for men and 180 for women. Using both polygenic risk scores and primary care records to prioritize individuals at highest risk of a CVD event for a formal CVD risk assessment can more efficiently prioritize those who need interventions the most than using primary care records alone. This could lead to better allocation of resources by reducing the number of risk assessments in primary care while still preventing the same number of CVD events.

Publication

Power, false discovery rate and Winner’s Curse in eQTL studies

Publisher: Oxford University Press (OUP)

Date: 05-09-2018

DOI: 10.1093/NAR/GKY780

Publication

Elevated alpha-1 antitrypsin is a major component of GlycA-associated risk for future morbidity and mortality

Publisher: Cold Spring Harbor Laboratory

Date: 26-04-2018

DOI: 10.1101/309138

Abstract: Integration of electronic health records with systems-level biomolecular data has led to the discovery that GlycA, a complex nuclear magnetic resonance (NMR) spectroscopy biomarker, predicts long-term risk of disease onset and death from myriad causes. To determine the molecular underpinnings of the disease risk of the heterogeneous GlycA signal, we used machine learning to build imputation models for GlycA’s constituent glycoproteins, then estimated glycoprotein levels in 11,861 adults across two population-based cohorts with long-term follow-up. While alpha-1-acid glycoprotein had the strongest correlation with GlycA, our analysis revealed that alpha-1 antitrypsin (AAT) was the most predictive of morbidity and mortality for the widest range of diseases, including heart failure (HR=1.60 per s.d., P=1×10^−10), influenza and pneumonia (HR=1.37, P=6×10^−10), and liver diseases (HR=1.81, P=1×10^−6). Despite emerging evidence of AAT's role in suppressing inflammation, transcriptional analyses revealed elevated expression of diverse inflammatory immune pathways with elevated AAT levels, suggesting AAT is elevating to compensate for low-grade chronic inflammation. This study clarifies the molecular underpinnings of the GlycA biomarker and its associated disease risk, and indicates a previously unrecognised association between elevated AAT and severe disease onset and mortality.

Publication

Integrative analysis of the plasma proteome and polygenic risk of cardiometabolic diseases

Publisher: Cold Spring Harbor Laboratory

Date: 19-12-2019

DOI: 10.1101/2019.12.14.876474

Abstract: Common human diseases are frequently polygenic in architecture, comprising a large number of risk alleles with small effects spread across the genome [1–3]. Polygenic scores (PGSs) aggregate these alleles into a metric which represents an individual’s genetic predisposition to a specific disease. PGSs have shown promise for early risk prediction [4–7], and there is potential to use PGSs to understand disease biology in parallel [8]. Here, we investigate the role plasma protein levels play in cardiometabolic disease risk in a cohort of 3,087 healthy individuals using PGSs. We found PGSs for coronary artery disease (CAD), type 2 diabetes (T2D), chronic kidney disease (CKD), and ischaemic stroke (IS) were associated with levels of 49 plasma proteins. These associations were polygenic in architecture, largely independent of cis protein QTLs, and robust to environmental variation. Over a median 7.7 years follow-up, 28 of these plasma proteins were associated with future myocardial infarction (MI) or T2D events, 16 of which were causal mediators between polygenic risk and incident disease. These protein mediators of polygenic disease risk included targets of approved therapies which may have repurposing potential. Our results demonstrate that PGSs can identify proteins with causal roles in disease, and may have utility in drug development.

Publication

Machine learning optimized polygenic scores for blood cell traits identify sex-specific trajectories and genetic correlations with disease

Publisher: Elsevier BV

Date: 2022

DOI: 10.1016/J.XGEN.2021.100086

Publication

Depression and genetic susceptibility to cardiometabolic diseases

Publisher: Springer Science and Business Media LLC

Date: 14-02-2022

DOI: 10.1038/S44161-021-00012-6

Publication

Combined effects of host genetics and diet on human gut microbiota and incident disease in a single population cohort

Publisher: Cold Spring Harbor Laboratory

Date: 13-09-2020

DOI: 10.1101/2020.09.12.20193045

Abstract: Co-evolution between humans and the microbial communities colonizing them has resulted in an intimate assembly of thousands of microbial species mutualistically living on and in their body and impacting multiple aspects of host physiology and health. Several studies examining whether human genetic variation can affect gut microbiota suggest a complex combination of environmental and host factors. Here, we leverage a single large-scale population-based cohort of 5,959 genotyped individuals with matched gut microbial shotgun metagenomes, dietary information and health records up to 16 years post-sampling, to characterize human genetic variations associated with microbial abundances, and predict possible causal links with various diseases using Mendelian randomization (MR). Genome-wide association study (GWAS) identified 583 independent SNP-taxon associations at genome-wide significance (p .0×10^-8), which included notable strong associations with LCT (p=5.02×10^-35), ABO (p=1.1×10^-12), and MED13L (p=1.84×10^-12). A combination of genetics and dietary habits was shown to strongly shape the abundances of certain key bacterial members of the gut microbiota, and explain their genetic association. Genetic effects from the LCT locus on Bifidobacterium and three other associated taxa significantly differed according to dairy intake. Variation in mucin-degrading Faecalicatena lactaris abundances was associated with ABO, highlighting a preferential utilization of secreted A/B/AB-antigens as energy source in the gut, irrespective of fibre intake. Enterococcus faecalis levels showed a robust association with a variant in MED13L, with putative links to colorectal cancer. Finally, we identified putative causal relationships between gut microbes and complex diseases using MR, with a predicted effect of Morganella on major depressive disorder that was consistent with observational incident disease analysis.
Overall, we present striking examples of the intricate relationship between humans and their gut microbial communities, and highlight important health implications.

Publication

Quality control and removal of technical variation of NMR metabolic biomarker data in ∼120,000 UK Biobank participants

Publisher: Cold Spring Harbor Laboratory

Date: 27-09-2021

DOI: 10.1101/2021.09.24.21264079

Abstract: Metabolic biomarker data quantified by nuclear magnetic resonance (NMR) spectroscopy has recently become available in UK Biobank. Here, we describe procedures for quality control and removal of technical variation for this biomarker data, comprising 249 circulating metabolites, lipids, and lipoprotein sub-fractions on approximately 121,000 participants. We identify and characterise technical and biological factors associated with individual biomarkers and find that linear effects on individual biomarkers can combine in a non-linear fashion for 61 composite biomarkers and 81 biomarker ratios. We create an R package, ukbnmr, for extracting and normalising the metabolic biomarker data, then use ukbnmr to remove unwanted variation from the UK Biobank data. We make available code for re-deriving the 61 composite biomarkers and 81 ratios, and for further derivation of 76 additional biomarker ratios of potential biological significance. Finally, we demonstrate that removal of technical variation leads to increased signal for genetic and epidemiological studies of the NMR metabolic biomarkers in UK Biobank.

Publication

Integrative analysis of the plasma proteome and polygenic risk of cardiometabolic diseases

Publisher: Springer Science and Business Media LLC

Date: 08-11-2021

DOI: 10.1038/S42255-021-00478-5

Publication

Elevated serum alpha-1 antitrypsin is a major component of GlycA-associated risk for future morbidity and mortality

Publisher: Public Library of Science (PLoS)

Date: 23-10-2019

DOI: 10.1371/JOURNAL.PONE.0223692

Publication

The Polygenic and Monogenic Basis of Blood Traits and Diseases

Publisher: Cold Spring Harbor Laboratory

Date: 03-02-2020

DOI: 10.1101/2020.02.02.20020065

Abstract: Blood cells play essential roles in human health, underpinning physiological processes such as immunity, oxygen transport, and clotting, which when perturbed cause a significant health burden. Here we integrate data from UK Biobank and a large-scale international collaborative effort, including 563,946 European ancestry participants, and discover 5,106 new genetic variants independently associated with 29 blood cell phenotypes covering the full allele frequency spectrum of variation impacting hematopoiesis. We holistically characterize the genetic architecture of hematopoiesis, assess the relevance of the omnigenic model to blood cell phenotypes, delineate relevant hematopoietic cell states influenced by regulatory genetic variants and gene networks, identify novel splice-altering variants mediating the associations, and assess the polygenic prediction potential for blood cell traits and clinical disorders at the interface of complex and Mendelian genetics. These results show the power of large-scale blood cell GWAS to interrogate clinically meaningful variants across the full allelic spectrum of human variation.

Publication

Using polygenic risk scores for prioritising individuals at greatest need of a CVD risk assessment

Publisher: Cold Spring Harbor Laboratory

Date: 22-10-2022

DOI: 10.1101/2022.10.20.22281120

Abstract: To provide quantitative evidence of the use of polygenic risk scores (PRS) for systematically identifying individuals for invitation for full formal cardiovascular disease (CVD) risk assessment. 108,685 participants aged 40-69, with measured biomarkers, linked primary care records and genetic data in UK Biobank were used for model derivation and population health modelling. Prioritisation tools using age, PRS for coronary artery disease and stroke, and conventional risk factors for CVD available within longitudinal primary care records were derived using sex-specific Cox models. Rescaling to account for the healthy cohort effect, we modelled the implications of initiating guideline-recommended statin therapy after prioritising individuals for invitation to a formal CVD risk assessment. 1,838 CVD events were observed over median follow up of 8.2 years. If primary care records were used to prioritise individuals for formal risk assessment using age- and sex-specific thresholds corresponding to 5% false negative rates, then we would capture 65% and 43% of events amongst men and women respectively. The numbers of men and women needed to be screened to prevent one CVD event (NNS) are 74 and 140 respectively. In contrast, adding PRS to both prioritisation and formal assessments, and selecting thresholds to capture the same number of events, resulted in a NNS of 60 for men and 90 for women. The use of PRS together with primary care records to prioritise individuals at highest risk of a CVD event for a formal CVD risk assessment can more efficiently prioritise those who need interventions the most than using primary care records alone. This could lead to better allocation of resources by reducing the number of formal risk assessments in primary care while still preventing the same number of CVD events.

Publication

FastSpar: Rapid and scalable correlation estimation for compositional data

Publisher: Oxford University Press (OUP)

Date: 29-08-2019

DOI: 10.1093/BIOINFORMATICS/BTY734

Abstract: A common goal of microbiome studies is the elucidation of community composition and member interactions using counts of taxonomic units extracted from sequence data. Inference of interaction networks from sparse and compositional data requires specialized statistical approaches. A popular solution is SparCC; however, its performance limits the calculation of interaction networks for very high-dimensional datasets. Here we introduce FastSpar, an efficient and parallelizable implementation of the SparCC algorithm which rapidly infers correlation networks and calculates P-values using an unbiased estimator. We further demonstrate that FastSpar reduces network inference wall time by 2–3 orders of magnitude compared to SparCC. FastSpar source code, precompiled binaries and platform packages are freely available on GitHub: cwatts/FastSpar. Supplementary data are available at Bioinformatics online.

Publication

Integration of polygenic and gut metagenomic risk prediction for common diseases

Publisher: Cold Spring Harbor Laboratory

Date: 05-08-2023

DOI: 10.1101/2023.07.30.23293396

Abstract: Multi-omics has opened new avenues for non-invasive risk profiling and early detection of complex diseases. Both polygenic risk scores (PRSs) and the human microbiome have shown promise in improving risk assessment of various common diseases. Here, in a prospective population-based cohort (FINRISK 2002; n=5,676) with ∼18 years of e-health record follow-up, we assess the incremental and combined value of PRSs and gut metagenomic sequencing as compared to conventional risk factors for predicting incident coronary artery disease (CAD), type 2 diabetes (T2D), Alzheimer’s disease (AD) and prostate cancer. We found that PRSs improved predictive capacity over conventional risk factors for all diseases (ΔC-indices between 0.010 and 0.027). In sex-stratified analyses, gut metagenomics improved predictive capacity over baseline age for CAD, T2D and prostate cancer; however, improvement over all conventional risk factors was only observed for T2D (ΔC-index 0.004) and prostate cancer (ΔC-index 0.005). Integrated risk models of PRSs, gut metagenomic scores and conventional risk factors achieved the highest predictive performance for all diseases studied as compared to models based on conventional risk factors alone. We make our integrated risk models available for the wider research community. This study demonstrates that integrated PRS and gut metagenomic risk models improve the predictive value over conventional risk factors for common chronic diseases.

Publication

Quality control and removal of technical variation of NMR metabolic biomarker data in ~120,000 UK Biobank participants

Publisher: Springer Science and Business Media LLC

Date: 31-01-2023

DOI: 10.1038/S41597-023-01949-Y

Abstract: Metabolic biomarker data quantified by nuclear magnetic resonance (NMR) spectroscopy in approximately 121,000 UK Biobank participants has recently been released as a community resource, comprising absolute concentrations and ratios of 249 circulating metabolites, lipids, and lipoprotein sub-fractions. Here we identify and characterise additional sources of unwanted technical variation influencing individual biomarkers in the data available to download from UK Biobank. These included sample preparation time, shipping plate well, spectrometer batch effects, drift over time within spectrometer, and outlier shipping plates. We developed a procedure for removing this unwanted technical variation, and demonstrate that it increases signal for genetic and epidemiological studies of the NMR metabolic biomarker data in UK Biobank. We subsequently developed an R package, ukbnmr, which we make available to the wider research community to enhance the utility of the UK Biobank NMR metabolic biomarker data and to facilitate rapid analysis.

Publication

Combined effects of host genetics and diet on human gut microbiota and incident disease in a single population cohort

Publisher: Springer Science and Business Media LLC

Date: 02-2022

DOI: 10.1038/S41588-021-00991-Z

Abstract: Human genetic variation affects the gut microbiota through a complex combination of environmental and host factors. Here we characterize genetic variations associated with microbial abundances in a single large-scale population-based cohort of 5,959 genotyped individuals with matched gut microbial metagenomes, and dietary and health records (prevalent and follow-up). We identified 567 independent SNP-taxon associations. Variants at the LCT locus associated with Bifidobacterium and other taxa, but they differed according to dairy intake. Furthermore, levels of Faecalicatena lactaris associated with ABO, and suggested preferential utilization of secreted blood antigens as energy source in the gut. Enterococcus faecalis levels associated with variants in the MED13L locus, which has been linked to colorectal cancer. Mendelian randomization analysis indicated a potential causal effect of Morganella on major depressive disorder, consistent with observational incident disease analysis. Overall, we identify and characterize the intricate nature of host-microbiota interactions and their association with disease.

Publication

A Scalable Permutation Approach Reveals Replication and Preservation Patterns of Network Modules in Large Datasets

Publisher: Elsevier BV

Date: 07-2016

DOI: 10.1016/J.CELS.2016.06.012

Abstract: Network modules (topologically distinct groups of edges and nodes) that are preserved across datasets can reveal common features of organisms, tissues, cell types, and molecules. Many statistics to identify such modules have been developed, but testing their significance requires heuristics. Here, we demonstrate that current methods for assessing module preservation are systematically biased and produce skewed p values. We introduce NetRep, a rapid and computationally efficient method that uses a permutation approach to score module preservation without assuming data are normally distributed. NetRep produces unbiased p values and can distinguish between true and false positives during multiple hypothesis testing. We use NetRep to quantify preservation of gene coexpression modules across murine brain, liver, adipose, and muscle tissues. Complex patterns of multi-tissue preservation were revealed, including a liver-derived housekeeping module that displayed adipose- and muscle-specific association with body weight. Finally, we demonstrate the broader applicability of NetRep by quantifying preservation of bacterial networks in gut microbiota between men and women.
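
The permutation idea described in the abstract above can be sketched generically in Python as follows. This is not NetRep's API or its actual preservation statistics: the statistic (mean edge weight within a module), the function names, the toy network, and the "+1" form of the empirical p value are all assumptions chosen for illustration.

```python
import random
from itertools import combinations


def mean_edge_weight(weights, nodes):
    """Average pairwise edge weight among `nodes` (hypothetical statistic)."""
    pairs = list(combinations(sorted(nodes), 2))
    return sum(weights[a][b] for a, b in pairs) / len(pairs)


def permutation_p_value(observed, null_stats):
    """Empirical p value; the +1 terms keep it strictly above zero."""
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)


def module_preservation_test(weights, module, n_perm=1000, seed=0):
    """Compare a module's statistic with random node sets of the same size."""
    rng = random.Random(seed)
    all_nodes = list(range(len(weights)))
    observed = mean_edge_weight(weights, module)
    null_stats = [
        mean_edge_weight(weights, rng.sample(all_nodes, len(module)))
        for _ in range(n_perm)
    ]
    return observed, permutation_p_value(observed, null_stats)


# Toy symmetric network (hypothetical): nodes 0-2 are tightly connected.
n = 6
toy_weights = [[0.1] * n for _ in range(n)]
for a, b in combinations(range(3), 2):
    toy_weights[a][b] = toy_weights[b][a] = 0.9

toy_observed, toy_p = module_preservation_test(toy_weights, [0, 1, 2])
```

Because the null distribution is built from the data themselves, no normality assumption is needed; the cost is that the smallest attainable p value is bounded by the number of permutations.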

Publication

Experimental and Human Evidence for Lipocalin‐2 (Neutrophil Gelatinase‐Associated Lipocalin [NGAL]) in the Development of Cardiac Hypertrophy and Heart Failure

Publisher: Ovid Technologies (Wolters Kluwer Health)

Date: 06-11-2017

DOI: 10.1161/JAHA.117.005971

Abstract: Cardiac hypertrophy increases the risk of developing heart failure and cardiovascular death. The neutrophil inflammatory protein, lipocalin‐2 (LCN2/NGAL), is elevated in certain forms of cardiac hypertrophy and acute heart failure. However, a specific role for LCN2 in predisposition and etiology of hypertrophy and the relevant genetic determinants are unclear. Here, we defined the role of LCN2 in concentric cardiac hypertrophy in terms of pathophysiology, inflammatory expression networks, and genomic determinants. We used 3 experimental models: a polygenic model of cardiac hypertrophy and heart failure, a model of intrauterine growth restriction, and the Lcn2‐knockout mouse; cultured cardiomyocytes; and 2 human cohorts: 114 type 2 diabetes mellitus patients and 2064 healthy subjects of the YFS (Young Finns Study). In hypertrophic heart rats, cardiac and circulating Lcn2 was significantly overexpressed before, during, and after development of cardiac hypertrophy and heart failure. Lcn2 expression was increased in hypertrophic hearts in a model of intrauterine growth restriction, whereas Lcn2‐knockout mice had smaller hearts. In cultured cardiomyocytes, Lcn2 activated molecular hypertrophic pathways and increased cell size, but reduced proliferation and cell numbers. Increased LCN2 was associated with cardiac hypertrophy and diastolic dysfunction in diabetes mellitus. In the YFS, LCN2 expression was associated with body mass index and cardiac mass and with levels of inflammatory markers. The single‐nucleotide polymorphism, rs13297295, located near LCN2, defined a significant cis‐eQTL for LCN2 expression. Direct effects of LCN2 on cardiomyocyte size and number and the consistent associations in experimental and human analyses reveal a central role for LCN2 in the ontogeny of cardiac hypertrophy and heart failure.

Publication

An atlas of genetic scores to predict multi-omic traits

Publisher: Cold Spring Harbor Laboratory

Date: 17-04-2022

DOI: 10.1101/2022.04.17.488593

Abstract: Genetically predicted levels of multi-omic traits can uncover the molecular underpinnings of common phenotypes in a highly efficient manner. Here, we utilised a large cohort (INTERVAL; N=50,000 participants) with extensive multi-omic data for plasma proteomics (SomaScan, N=3,175; Olink, N=4,822), plasma metabolomics (Metabolon HD4, N=8,153), serum metabolomics (Nightingale, N=37,359), and whole blood Illumina RNA sequencing (N=4,136). We used machine learning to train genetic scores for 17,227 molecular traits, including 10,521 which reached Bonferroni-adjusted significance. We evaluated genetic score performances in external validation across European, Asian and African American ancestries, and assessed their longitudinal stability within diverse individuals. We demonstrated the utility of these multi-omic genetic scores by quantifying the genetic control of biological pathways and by generating a synthetic multi-omic dataset of UK Biobank to identify disease associations using a phenome-wide scan. Finally, we developed a portal (OmicsPred.org) to facilitate public access to all genetic scores and validation results as well as to serve as a platform for future extensions and enhancements of multi-omic genetic scores.

Publication

The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation

Publisher: Springer Science and Business Media LLC

Date: 10-03-2021

DOI: 10.1038/S41588-021-00783-5

Publication

Polygenic risk scores in cardiovascular risk prediction: A cohort study and modelling analyses

Publisher: Public Library of Science (PLoS)

Date: 14-01-2021

DOI: 10.1371/JOURNAL.PMED.1003498

Abstract: Polygenic risk scores (PRSs) can stratify populations into cardiovascular disease (CVD) risk groups. We aimed to quantify the potential advantage of adding information on PRSs to conventional risk factors in the primary prevention of CVD. Using data from UK Biobank on 306,654 individuals without a history of CVD and not on lipid-lowering treatments (mean age [SD]: 56.0 [8.0] years; females: 57%; median follow-up: 8.1 years), we calculated measures of risk discrimination and reclassification upon addition of PRSs to risk factors in a conventional risk prediction model (i.e., age, sex, systolic blood pressure, smoking status, history of diabetes, and total and high-density lipoprotein cholesterol). We then modelled the implications of initiating guideline-recommended statin therapy in a primary care setting using incidence rates from 2.1 million individuals from the Clinical Practice Research Datalink. The C-index, a measure of risk discrimination, was 0.710 (95% CI 0.703–0.717) for a CVD prediction model containing conventional risk predictors alone. Addition of information on PRSs increased the C-index by 0.012 (95% CI 0.009–0.015), and resulted in continuous net reclassification improvements of about 10% and 12% in cases and non-cases, respectively. If a PRS were assessed in the entire UK primary care population aged 40–75 years, assuming that statin therapy would be initiated in accordance with the UK National Institute for Health and Care Excellence guidelines (i.e., for persons with a predicted risk of ≥10% and for those with certain other risk factors, such as diabetes, irrespective of their 10-year predicted risk), then it could help prevent 1 additional CVD event for approximately every 5,750 individuals screened. By contrast, targeted assessment only among people at intermediate (i.e., 5% to %) 10-year CVD risk could help prevent 1 additional CVD event for approximately every 340 individuals screened.
Such a targeted strategy could help prevent 7% more CVD events than conventional risk prediction alone. Potential gains afforded by assessment of PRSs on top of conventional risk factors would be about 1.5-fold greater than those provided by assessment of C-reactive protein, a plasma biomarker included in some risk prediction guidelines. Potential limitations of this study include its restriction to European ancestry participants and a lack of health economic evaluation. Our results suggest that addition of PRSs to conventional risk factors can modestly enhance prediction of first-onset CVD and could translate into population health benefits if used at scale.

Publication

Genetically personalised organ-specific metabolic models in health and disease

Publisher: Springer Science and Business Media LLC

Date: 29-11-2022

DOI: 10.1038/S41467-022-35017-7

Abstract: Understanding how genetic variants influence disease risk and complex traits (variant-to-function) is one of the major challenges in human genetics. Here we present a model-driven framework to leverage human genome-scale metabolic networks to define how genetic variants affect biochemical reaction fluxes across major human tissues, including skeletal muscle, adipose, liver, brain and heart. As proof of concept, we build personalised organ-specific metabolic flux models for 524,615 individuals of the INTERVAL and UK Biobank cohorts and perform a fluxome-wide association study (FWAS) to identify 4312 associations between personalised flux values and the concentration of metabolites in blood. Furthermore, we apply FWAS to identify 92 metabolic fluxes associated with the risk of developing coronary artery disease, many of which are linked to processes previously described to play a role in the disease. Our work demonstrates that genetically personalised metabolic models can elucidate the downstream effects of genetic variants on biochemical reactions involved in common human diseases.

Publication

Neurocognitive trajectory and proteomic signature of inherited risk for Alzheimer’s disease

Publisher: Public Library of Science (PLoS)

Date: 09-2022

DOI: 10.1371/JOURNAL.PGEN.1010294

Abstract: For Alzheimer’s disease, a leading cause of dementia and global morbidity, improved identification of presymptomatic high-risk individuals and identification of new circulating biomarkers are key public health needs. Here, we tested the hypothesis that a polygenic predictor of risk for Alzheimer’s disease would identify a subset of the population with increased risk of clinically diagnosed dementia, subclinical neurocognitive dysfunction, and a differing circulating proteomic profile. Using summary association statistics from a recent genome-wide association study, we first developed a polygenic predictor of Alzheimer’s disease comprised of 7.1 million common DNA variants. We noted a 7.3-fold (95% CI 4.8 to 11.0 p 0.001) gradient in risk across deciles of the score among 288,289 middle-aged participants of the UK Biobank study. In cross-sectional analyses stratified by age, minimal differences in risk of Alzheimer’s disease and performance on a digit recall test were present according to polygenic score decile at age 50 years, but significant gradients emerged by age 65. Similarly, among 30,541 participants of the Mass General Brigham Biobank, we again noted no significant differences in Alzheimer’s disease diagnosis at younger ages across deciles of the score, but for those over 65 years we noted an odds ratio of 2.0 (95% CI 1.3 to 3.2; p = 0.002) in the top versus bottom decile of the polygenic score. To understand the proteomic signature of inherited risk, we performed aptamer-based profiling in 636 blood donors (mean age 43 years) with very high or low polygenic scores. In addition to the well-known apolipoprotein E biomarker, this analysis identified 27 additional proteins, several of which have known roles related to disease pathogenesis. Differences in protein concentrations were consistent even among the youngest subset of blood donors (mean age 33 years). Of these 28 proteins, 7 of the 8 proteins with concentrations available were similarly associated with the polygenic score in participants of the Multi-Ethnic Study of Atherosclerosis. These data highlight the potential for a DNA-based score to identify high-risk individuals during the prolonged presymptomatic phase of Alzheimer’s disease and to enable biomarker discovery based on profiling of young individuals in the extremes of the score distribution.

Publication

Neonatal genetics of gene expression reveal potential origins of autoimmune and allergic disease risk

Publisher: Springer Science and Business Media LLC

Date: 28-07-2020

DOI: 10.1038/S41467-020-17477-X

Abstract: Chronic immune-mediated diseases of adulthood often originate in early childhood. To investigate genetic associations between neonatal immunity and disease, we map expression quantitative trait loci (eQTLs) in resting myeloid cells and CD4+ T cells from cord blood samples, as well as in response to lipopolysaccharide (LPS) or phytohemagglutinin (PHA) stimulation, respectively. Cis-eQTLs are largely specific to cell type or stimulation, and 31% and 52% of genes with cis-eQTLs have response eQTLs (reQTLs) in myeloid cells and T cells, respectively. We identified cis regulatory factors acting as mediators of trans effects. There is extensive colocalisation between condition-specific neonatal cis-eQTLs and variants associated with immune-mediated diseases, in particular CTSH had widespread colocalisation across diseases. Mendelian randomisation shows causal neonatal gene expression effects on disease risk for BTN3A2, HLA-C and others. Our study elucidates the genetics of gene expression in neonatal immune cells, and aetiological origins of autoimmune and allergic diseases.

Publication

Genetically personalised organ-specific metabolic models in health and disease

Publisher: Cold Spring Harbor Laboratory

Date: 31-03-2022

DOI: 10.1101/2022.03.25.22272958

Abstract: Understanding how genetic variants influence disease risk and complex traits (variant-to-function) is one of the major challenges in human genetics. Here we present a model-driven framework to leverage human genome-scale metabolic networks to define how genetic variants affect biochemical reaction fluxes across major human tissues, including skeletal muscle, adipose, liver, brain and heart. As proof of concept, we build personalised organ-specific metabolic flux models for 524,615 individuals of the INTERVAL and UK Biobank cohorts and perform a fluxome-wide association study (FWAS) to identify 4,411 associations between personalised flux values and the concentration of metabolites in blood. Furthermore, we apply FWAS to identify 97 metabolic fluxes associated with the risk of developing coronary artery disease, many of which are linked to processes previously described to play a role in the disease. Our work demonstrates that genetically personalised metabolic models can elucidate the downstream effects of genetic variants on biochemical reactions involved in common human diseases.

Publication

The Polygenic Score Catalog: an open database for reproducibility and systematic evaluation

Publisher: Cold Spring Harbor Laboratory

Date: 23-05-2020

DOI: 10.1101/2020.05.20.20108217

Abstract: Polygenic [risk] scores (PGS) can enhance prediction and understanding of common diseases and traits. However, the reproducibility of PGS and their subsequent applications in biological and clinical research have been hindered by several factors, including: inadequate and incomplete reporting of PGS development, heterogeneity in evaluation techniques, and inconsistent access to, and distribution of, the information necessary to calculate the scores themselves. To address this we present the PGS Catalog (www.PGSCatalog.org), an open resource for polygenic scores. The PGS Catalog currently contains 192 published PGS from 78 publications for 86 diverse traits, including diabetes, cardiovascular diseases, neurological disorders, cancers, as well as traits like BMI and blood lipids. Each PGS is annotated with metadata required for reproducibility as well as accurate application in independent studies. Using the PGS Catalog, we demonstrate that multiple PGS can be systematically evaluated to generate comparable performance metrics. The PGS Catalog has capabilities for user deposition, expert curation and programmatic access, thus providing the community with an open platform for polygenic score research and translation.

Publication

Neonatal genetics of gene expression reveal the origins of autoimmune and allergic disease risk

Publisher: Cold Spring Harbor Laboratory

Date: 27-06-2019

DOI: 10.1101/683086

Abstract: Chronic immune-mediated diseases of adulthood often originate in early childhood. To investigate genetic associations between neonatal immunity and disease, we collected cord blood samples from a birth cohort and mapped expression quantitative trait loci (eQTLs) in resting monocytes and CD4+ T cells as well as in response to lipopolysaccharide (LPS) or phytohemagglutinin (PHA) stimulation, respectively. Cis-eQTLs were largely specific to cell type or stimulation, and response eQTLs were identified for 31% of genes with cis-eQTLs (eGenes) in monocytes and 52% of eGenes in CD4+ T cells. We identified trans-eQTLs and mapped cis regulatory factors which act as mediators of trans effects. There was extensive colocalisation of causal variants for cell type- and stimulation-specific neonatal cis-eQTLs and those of autoimmune and allergic diseases, in particular CTSH (Cathepsin H) which showed widespread colocalisation across diseases. Mendelian randomisation showed causal neonatal gene transcription effects on disease risk for BTN3A2, HLA-C and many other genes. Our study elucidates the genetics of gene expression in neonatal conditions and cell types as well as the aetiological origins of autoimmune and allergic diseases.

Publication

Biomarker Glycoprotein Acetyls Is Associated With the Risk of a Wide Spectrum of Incident Diseases and Stratifies Mortality Risk in Angiography Patients

Publisher: Ovid Technologies (Wolters Kluwer Health)

Date: 11-2018

DOI: 10.1161/CIRCGEN.118.002234

Abstract: Integration of systems-level biomolecular information with electronic health records has led to recent interest in the glycoprotein acetyls (GlycA) biomarker, a serum- or plasma-derived nuclear magnetic resonance spectroscopy signal that represents the abundance of circulating glycated proteins. GlycA predicts risk of diverse outcomes, including cardiovascular disease, type 2 diabetes mellitus, and all-cause mortality; however, the underlying detailed associations of GlycA’s morbidity and mortality risk are currently unknown. We used 2 population-based cohorts totaling 11 861 adults from the Finnish general population to test for an association with 468 common incident hospitalization and mortality outcomes during an 8-year follow-up. Further, we utilized 900 angiography patients to test for GlycA association with mortality risk and potential utility for mortality risk discrimination during 12-year follow-up. New associations with GlycA and incident alcoholic liver disease, chronic renal failure, glomerular diseases, chronic obstructive pulmonary disease, inflammatory polyarthropathies, and hypertension were uncovered, and known incident disease associations were replicated. GlycA associations for incident disease outcomes were in general not attenuated when adjusting for hsCRP (high-sensitivity C-reactive protein). Among 900 patients referred to angiography, GlycA had hazard ratios of 4.87 (95% CI, 2.45–9.65) and 5.00 (95% CI, 2.38–10.48) for 12-year risk of mortality in the fourth and fifth quintiles by GlycA levels, demonstrating its prognostic potential for identification of high-risk individuals. When modeled together, both hsCRP and GlycA were attenuated but remained significant. GlycA was predictive of myriad incident diseases across many major internal organs and stratified mortality risk in angiography patients. Both GlycA and hsCRP had shared and independent contributions to mortality risk, suggesting chronic inflammation as an etiological factor. GlycA may be useful in improving risk prediction in specific disease settings.

Publication

An interaction map of circulating metabolites, immune gene networks, and their genetic regulation

Publisher: Springer Science and Business Media LLC

Date: 08-2017

DOI: 10.1186/S13059-017-1279-Y

Publication

An atlas of genetic scores to predict multi-omic traits

Publisher: Springer Science and Business Media LLC

Date: 29-03-2023

DOI: 10.1038/S41586-023-05844-9

Publication

Sex-Specific Survival Bias and Interaction Modeling in Coronary Artery Disease Risk Prediction

Publisher: Ovid Technologies (Wolters Kluwer Health)

Date: 02-2023

DOI: 10.1161/CIRCGEN.121.003542

Abstract: The 10-year Atherosclerotic Cardiovascular Disease risk score is the standard approach to predict risk of incident cardiovascular events, and recently, addition of coronary artery disease (CAD) polygenic scores has been evaluated. Although age and sex strongly predict the risk of CAD, their interaction with genetic risk prediction has not been systematically examined. This study performed an extensive evaluation of age and sex effects in genetic CAD risk prediction. The population-based Norwegian HUNT2 (Trøndelag Health Study 2) cohort of 51 036 individuals was used as the primary dataset. Findings were replicated in the UK Biobank (372 410 individuals). Models for 10-year CAD risk were fitted using Cox proportional hazards, and Harrell concordance index, sensitivity, and specificity were compared. Inclusion of age and sex interactions of CAD polygenic score to the prediction models increased the C-index and sensitivity by accounting for nonadditive effects of CAD polygenic score and likely countering the observed survival bias in the baseline. The sensitivity for females was lower than males in all models including genetic information. We identified a total of 82.6% of incident CAD cases by using a 2-step approach: (1) Atherosclerotic Cardiovascular Disease risk score (74.1%) and (2) the CAD polygenic score interaction model for those in low clinical risk (additional 8.5%). These findings highlight the importance and complexity of genetic risk in predicting CAD. There is a need for modeling age- and sex-interaction terms with polygenic scores to optimize detection of individuals at high risk, those who warrant preventive interventions. Sex-specific studies are needed to understand and estimate CAD risk with genetic information.

Publication

Comparative analysis reveals a role for TGF-β in shaping the residency-related transcriptional signature in tissue-resident memory CD8+ T cells

Publisher: Public Library of Science (PLoS)

Date: 11-02-2019

DOI: 10.1371/JOURNAL.PONE.0210495

Scott Ritchie

Researcher

Related Links

Publications

Sex-specific survival bias and interaction modeling in coronary artery disease risk prediction

The Biomarker GlycA Is Associated with Chronic Inflammation and Predicts Long-Term Risk of Severe Infection

Genome-wide association and Mendelian randomization analysis prioritizes bioactive metabolites with putative causal effects on common diseases

Power, false discovery rate and Winner’s Curse in eQTL studies

Using Polygenic Risk Scores for Prioritizing Individuals at Greatest Need of a Cardiovascular Disease Risk Assessment

Power, false discovery rate and Winner’s Curse in eQTL studies

Elevated alpha-1 antitrypsin is a major component of GlycA-associated risk for future morbidity and mortality

Integrative analysis of the plasma proteome and polygenic risk of cardiometabolic diseases

Machine learning optimized polygenic scores for blood cell traits identify sex-specific trajectories and genetic correlations with disease

Depression and genetic susceptibility to cardiometabolic diseases

Combined effects of host genetics and diet on human gut microbiota and incident disease in a single population cohort

Quality control and removal of technical variation of NMR metabolic biomarker data in ∼120,000 UK Biobank participants

Integrative analysis of the plasma proteome and polygenic risk of cardiometabolic diseases

Elevated serum alpha-1 antitrypsin is a major component of GlycA-associated risk for future morbidity and mortality

The Polygenic and Monogenic Basis of Blood Traits and Diseases

Using polygenic risk scores for prioritising individuals at greatest need of a CVD risk assessment

FastSpar: Rapid and scalable correlation estimation for compositional data

Integration of polygenic and gut metagenomic risk prediction for common diseases

Quality control and removal of technical variation of NMR metabolic biomarker data in ~120,000 UK Biobank participants

Combined effects of host genetics and diet on human gut microbiota and incident disease in a single population cohort

A Scalable Permutation Approach Reveals Replication and Preservation Patterns of Network Modules in Large Datasets

Experimental and Human Evidence for Lipocalin‐2 (Neutrophil Gelatinase‐Associated Lipocalin [NGAL]) in the Development of Cardiac Hypertrophy and Heart Failure

An atlas of genetic scores to predict multi-omic traits

The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation

Polygenic risk scores in cardiovascular risk prediction: A cohort study and modelling analyses

Genetically personalised organ-specific metabolic models in health and disease

Neurocognitive trajectory and proteomic signature of inherited risk for Alzheimer’s disease

Neonatal genetics of gene expression reveal potential origins of autoimmune and allergic disease risk

Genetically personalised organ-specific metabolic models in health and disease

The Polygenic Score Catalog: an open database for reproducibility and systematic evaluation

Neonatal genetics of gene expression reveal the origins of autoimmune and allergic disease risk

Biomarker Glycoprotein Acetyls Is Associated With the Risk of a Wide Spectrum of Incident Diseases and Stratifies Mortality Risk in Angiography Patients

An interaction map of circulating metabolites, immune gene networks, and their genetic regulation

An atlas of genetic scores to predict multi-omic traits

Sex-Specific Survival Bias and Interaction Modeling in Coronary Artery Disease Risk Prediction

Comparative analysis reveals a role for TGF-β in shaping the residency-related transcriptional signature in tissue-resident memory CD8+ T cells

Related Organisations

Baker Heart And Diabetes Institute

University Of Melbourne

University Of Cambridge

1-University Of Barcelona, 2-Network Centre Of Biomedical Research Of Neurodegenerative Diseases (CIBERNED), 3-Molecular And Cellular Neurobiotechnology, Institute For Bioengineering Of Catalonia (IBEC)

Related Funding Activities

Discovery Early Career Researcher Award - Grant ID: DE150100278
