ARDC Research Link Australia

Publication

The NHGRI-EBI GWAS Catalog: knowledgebase and deposition resource

Publisher: Oxford University Press (OUP)

Date: 09-11-2022

Abstract: The NHGRI-EBI GWAS Catalog (www.ebi.ac.uk/gwas) is a FAIR knowledgebase providing detailed, structured, standardised and interoperable genome-wide association study (GWAS) data to & 000 users per year from academic research, healthcare and industry. The Catalog contains variant-trait associations and supporting metadata for & 000 published GWAS across & human traits, and & 000 full P-value summary statistics datasets. Content is curated from publications or acquired via author submission of prepublication summary statistics through a new submission portal and validation tool. GWAS data volume has vastly increased in recent years. We have updated our software to meet this scaling challenge and to enable rapid release of submitted summary statistics. The scope of the repository has expanded to include additional data types of high interest to the community, including sequencing-based GWAS, gene-based analyses and copy number variation analyses. Community outreach has increased the number of shared datasets from under-represented traits, e.g. cancer, and we continue to contribute to awareness of the lack of population diversity in GWAS. Interoperability of the Catalog has been enhanced through links to other resources including the Polygenic Score Catalog and the International Mouse Phenotyping Consortium, refinements to GWAS trait annotation, and the development of a standard format for GWAS data.

Publication

The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation

Publisher: Springer Science and Business Media LLC

Date: 10-03-2021

DOI: 10.1038/S41588-021-00783-5

Publication

An atlas of genetic scores to predict multi-omic traits

Publisher: Cold Spring Harbor Laboratory

Date: 17-04-2022

DOI: 10.1101/2022.04.17.488593

Abstract: Genetically predicted levels of multi-omic traits can uncover the molecular underpinnings of common phenotypes in a highly efficient manner. Here, we utilised a large cohort (INTERVAL; N=50,000 participants) with extensive multi-omic data for plasma proteomics (SomaScan, N=3,175; Olink, N=4,822), plasma metabolomics (Metabolon HD4, N=8,153), serum metabolomics (Nightingale, N=37,359), and whole blood Illumina RNA sequencing (N=4,136). We used machine learning to train genetic scores for 17,227 molecular traits, including 10,521 which reached Bonferroni-adjusted significance. We evaluated genetic score performances in external validation across European, Asian and African American ancestries, and assessed their longitudinal stability within diverse individuals. We demonstrated the utility of these multi-omic genetic scores by quantifying the genetic control of biological pathways and by generating a synthetic multi-omic dataset of UK Biobank to identify disease associations using a phenome-wide scan. Finally, we developed a portal (OmicsPred.org) to facilitate public access to all genetic scores and validation results as well as to serve as a platform for future extensions and enhancements of multi-omic genetic scores.

Publication

Determination and Inference of Eukaryotic Transcription Factor Sequence Specificity

Publisher: Elsevier BV

Date: 09-2014

DOI: 10.1016/J.CELL.2014.08.009

Publication

Genetically personalised organ-specific metabolic models in health and disease

Publisher: Springer Science and Business Media LLC

Date: 29-11-2022

DOI: 10.1038/S41467-022-35017-7

Abstract: Understanding how genetic variants influence disease risk and complex traits (variant-to-function) is one of the major challenges in human genetics. Here we present a model-driven framework to leverage human genome-scale metabolic networks to define how genetic variants affect biochemical reaction fluxes across major human tissues, including skeletal muscle, adipose, liver, brain and heart. As proof of concept, we build personalised organ-specific metabolic flux models for 524,615 individuals of the INTERVAL and UK Biobank cohorts and perform a fluxome-wide association study (FWAS) to identify 4312 associations between personalised flux values and the concentration of metabolites in blood. Furthermore, we apply FWAS to identify 92 metabolic fluxes associated with the risk of developing coronary artery disease, many of which are linked to processes previously described to play a role in the disease. Our work demonstrates that genetically personalised metabolic models can elucidate the downstream effects of genetic variants on biochemical reactions involved in common human diseases.

Publication

Integrative analysis of the plasma proteome and polygenic risk of cardiometabolic diseases

Publisher: Cold Spring Harbor Laboratory

Date: 19-12-2019

DOI: 10.1101/2019.12.14.876474

Abstract: Common human diseases are frequently polygenic in architecture, comprising a large number of risk alleles with small effects spread across the genome 1–3. Polygenic scores (PGSs) aggregate these alleles into a metric which represents an individual’s genetic predisposition to a specific disease. PGSs have shown promise for early risk prediction 4–7, and there is potential to use PGSs to understand disease biology in parallel 8. Here, we investigate the role plasma protein levels play in cardiometabolic disease risk in a cohort of 3,087 healthy individuals using PGSs. We found PGSs for coronary artery disease (CAD), type 2 diabetes (T2D), chronic kidney disease (CKD), and ischaemic stroke (IS) were associated with levels of 49 plasma proteins. These associations were polygenic in architecture, largely independent of cis protein QTLs, and robust to environmental variation. Over a median 7.7 years follow-up, 28 of these plasma proteins were associated with future myocardial infarction (MI) or T2D events, 16 of which were causal mediators between polygenic risk and incident disease. These protein mediators of polygenic disease risk included targets of approved therapies which may have repurposing potential. Our results demonstrate that PGSs can identify proteins with causal roles in disease, and may have utility in drug development.

Publication

Improving reporting standards for polygenic scores in risk prediction studies

Publisher: Springer Science and Business Media LLC

Date: 10-03-2021

DOI: 10.1038/S41586-021-03243-6

Publication

Quality control and removal of technical variation of NMR metabolic biomarker data in ∼120,000 UK Biobank participants

Publisher: Cold Spring Harbor Laboratory

Date: 27-09-2021

DOI: 10.1101/2021.09.24.21264079

Abstract: Metabolic biomarker data quantified by nuclear magnetic resonance (NMR) spectroscopy has recently become available in UK Biobank. Here, we describe procedures for quality control and removal of technical variation for this biomarker data, comprising 249 circulating metabolites, lipids, and lipoprotein sub-fractions on approximately 121,000 participants. We identify and characterise technical and biological factors associated with individual biomarkers and find that linear effects on individual biomarkers can combine in a non-linear fashion for 61 composite biomarkers and 81 biomarker ratios. We create an R package, ukbnmr, for extracting and normalising the metabolic biomarker data, then use ukbnmr to remove unwanted variation from the UK Biobank data. We make available code for re-deriving the 61 composite biomarkers and 81 ratios, and for further derivation of 76 additional biomarker ratios of potential biological significance. Finally, we demonstrate that removal of technical variation leads to increased signal for genetic and epidemiological studies of the NMR metabolic biomarkers in UK Biobank.

Publication

Integrative analysis of the plasma proteome and polygenic risk of cardiometabolic diseases

Publisher: Springer Science and Business Media LLC

Date: 08-11-2021

DOI: 10.1038/S42255-021-00478-5

Publication

Genetically personalised organ-specific metabolic models in health and disease

Publisher: Cold Spring Harbor Laboratory

Date: 31-03-2022

DOI: 10.1101/2022.03.25.22272958

Abstract: Understanding how genetic variants influence disease risk and complex traits (variant-to-function) is one of the major challenges in human genetics. Here we present a model-driven framework to leverage human genome-scale metabolic networks to define how genetic variants affect biochemical reaction fluxes across major human tissues, including skeletal muscle, adipose, liver, brain and heart. As proof of concept, we build personalised organ-specific metabolic flux models for 524,615 individuals of the INTERVAL and UK Biobank cohorts and perform a fluxome-wide association study (FWAS) to identify 4,411 associations between personalised flux values and the concentration of metabolites in blood. Furthermore, we apply FWAS to identify 97 metabolic fluxes associated with the risk of developing coronary artery disease, many of which are linked to processes previously described to play a role in the disease. Our work demonstrates that genetically personalised metabolic models can elucidate the downstream effects of genetic variants on biochemical reactions involved in common human diseases.

Publication

The Polygenic Score Catalog: an open database for reproducibility and systematic evaluation

Publisher: Cold Spring Harbor Laboratory

Date: 23-05-2020

DOI: 10.1101/2020.05.20.20108217

Abstract: Polygenic [risk] scores (PGS) can enhance prediction and understanding of common diseases and traits. However, the reproducibility of PGS and their subsequent applications in biological and clinical research have been hindered by several factors, including: inadequate and incomplete reporting of PGS development, heterogeneity in evaluation techniques, and inconsistent access to, and distribution of, the information necessary to calculate the scores themselves. To address this we present the PGS Catalog (www.PGSCatalog.org), an open resource for polygenic scores. The PGS Catalog currently contains 192 published PGS from 78 publications for 86 diverse traits, including diabetes, cardiovascular diseases, neurological disorders, cancers, as well as traits like BMI and blood lipids. Each PGS is annotated with metadata required for reproducibility as well as accurate application in independent studies. Using the PGS Catalog, we demonstrate that multiple PGS can be systematically evaluated to generate comparable performance metrics. The PGS Catalog has capabilities for user deposition, expert curation and programmatic access, thus providing the community with an open platform for polygenic score research and translation.

Publication

Quality control and removal of technical variation of NMR metabolic biomarker data in ~120,000 UK Biobank participants

Publisher: Springer Science and Business Media LLC

Date: 31-01-2023

DOI: 10.1038/S41597-023-01949-Y

Abstract: Metabolic biomarker data quantified by nuclear magnetic resonance (NMR) spectroscopy in approximately 121,000 UK Biobank participants has recently been released as a community resource, comprising absolute concentrations and ratios of 249 circulating metabolites, lipids, and lipoprotein sub-fractions. Here we identify and characterise additional sources of unwanted technical variation influencing individual biomarkers in the data available to download from UK Biobank. These included sample preparation time, shipping plate well, spectrometer batch effects, drift over time within spectrometer, and outlier shipping plates. We developed a procedure for removing this unwanted technical variation, and demonstrate that it increases signal for genetic and epidemiological studies of the NMR metabolic biomarker data in UK Biobank. We subsequently developed an R package, ukbnmr, which we make available to the wider research community to enhance the utility of the UK Biobank NMR metabolic biomarker data and to facilitate rapid analysis.

Publication

Transferability of genetic loci and polygenic scores for cardiometabolic traits in British Pakistani and Bangladeshi individuals

Publisher: Springer Science and Business Media LLC

Date: 09-08-2022

DOI: 10.1038/S41467-022-32095-5

Abstract: Individuals with South Asian ancestry have a higher risk of heart disease than other groups but have been largely excluded from genetic research. Using data from 22,000 British Pakistani and Bangladeshi individuals with linked electronic health records from the Genes & Health cohort, we conducted genome-wide association studies of coronary artery disease and its key risk factors. Using power-adjusted transferability ratios, we found evidence for transferability for the majority of cardiometabolic loci powered to replicate. The performance of polygenic scores was high for lipids and blood pressure, but lower for BMI and coronary artery disease. Adding a polygenic score for coronary artery disease to clinical risk factors showed significant improvement in reclassification. In Mendelian randomisation using transferable loci as instruments, our findings were consistent with results in European-ancestry individuals. Taken together, trait-specific transferability of trait loci between populations is an important consideration with implications for risk prediction and causal inference.

Publication

Improving reporting standards for polygenic scores in risk prediction studies

Publisher: Cold Spring Harbor Laboratory

Date: 08-05-2020

DOI: 10.1101/2020.04.23.20077099

Abstract: Polygenic risk scores (PRS), often aggregating the results from genome-wide association studies, can bridge the gap between the initial variant discovery efforts and disease risk estimation for clinical applications. However, there is remarkable heterogeneity in the reporting of these risk scores due to a lack of adherence to reporting standards and no accepted standards suited for the current state of PRS development and application. This lack of adherence and best practices hinders the translation of PRS into clinical care. The ClinGen Complex Disease Working Group, in a collaboration with the Polygenic Score (PGS) Catalog, have developed a novel PRS Reporting Statement (PRS-RS), updating previous standards to the current state of the field and to enable downstream utility. Drawing upon experts in epidemiology, statistics, disease-specific applications, implementation, and policy, this 23-item reporting framework defines the minimal information needed to interpret and evaluate a PRS, especially with respect to any downstream clinical applications. Items span detailed descriptions of the study population (recruitment method, key demographics, inclusion/exclusion criteria, and phenotype definition), statistical methods for both PRS development and validation, and considerations for potential limitations of the published risk score and downstream clinical utility. Additionally, emphasis has been placed on data availability and transparency to facilitate reproducibility and benchmarking against other PRS, such as deposition in the publicly available PGS Catalog (www.PGScatalog.org). By providing these criteria in a structured format that builds upon existing standards and ontologies, the use of this framework in publishing PRS will facilitate translation of PRS into clinical care and progress towards defining best practices.
In recent years, polygenic risk scores (PRS) have become an increasingly studied tool to capture the genome-wide liability underlying many human traits and diseases, hoping to better inform an individual’s genetic risk. However, a lack of tailored reporting standards has hindered the translation of this important tool into clinical and public health practice with the heterogeneous underreporting of details necessary for benchmarking and reproducibility. To address this gap, the ClinGen Complex Disease Working Group and Polygenic Score (PGS) Catalog have collaborated to develop the 23-item Polygenic Risk Score Reporting Statement (PRS-RS). This framework provides the minimal information expected of authors to promote the validity, transparency, and reproducibility of PRS by requiring authors to detail the study population, statistical methods, and potential clinical utility of a published score. The widespread adoption of this framework will encourage rigorous methodological consideration and facilitate benchmarking to ensure high quality scores are translated into the clinic.

Publication

An atlas of genetic scores to predict multi-omic traits

Publisher: Springer Science and Business Media LLC

Date: 29-03-2023

DOI: 10.1038/S41586-023-05844-9

Publication

Transferability of genetic loci and polygenic scores for cardiometabolic traits in British Pakistanis and Bangladeshis

Publisher: Cold Spring Harbor Laboratory

Date: 24-06-2021

DOI: 10.1101/2021.06.22.21259323

Abstract: Individuals with South Asian ancestry have higher risk of heart disease than other groups in Western countries; however, most genetic research has focused on European-ancestry (EUR) individuals. It is unknown whether reported genetic loci and polygenic scores (PGSs) for cardiometabolic traits are transferable to South Asians, and whether PGSs have utility in clinical settings. Using data from 22,000 British Pakistani and Bangladeshi individuals with linked electronic health records from the Genes & Health cohort (G&H), we conducted genome-wide association studies (GWAS) and characterised the genetic architecture of coronary artery disease (CAD), body mass index (BMI), lipid biomarkers and blood pressure. We applied a new technique to assess the extent to which loci from GWAS in EUR samples were transferable. We tested how well existing findings from EUR studies performed in genetic risk prediction and Mendelian randomisation in G&H. Trans-ancestry genetic correlations between G&H and EUR samples for the tested traits were not significantly lower than 1, except for BMI (r_g=0.85, p=0.02). We found evidence for transferability for the vast majority of loci from EUR discovery studies that were sufficiently powered to replicate in G&H. PGSs showed variable transferability in G&H, with the relative accuracy compared to EUR (ratio of incremental r²/AUC) ≥0.95 for HDL-C, triglycerides, and blood pressure, but lower for BMI (0.78) and CAD (0.42). We observed significant improvement in categorical net reclassification in G&H (NRI=3.9%; 95% CI 0.9–7.0) when adding a previously developed CAD PGS to clinical risk factors (QRISK3). We used transferable loci as genetic instruments in trans-ancestry Mendelian randomisation and found evidence of an increased CAD risk for higher LDL-C and BMI, and for lower HDL-C in G&H, consistent with our findings for EUR samples.
The genetic loci for CAD and its risk factors are largely transferable from EUR studies to British Pakistanis and Bangladeshis, whereas the transferability of PGSs varies greatly between traits. Our analyses suggest clinical utility for addition of PGS to existing clinical risk prediction tools for this population. This is the first study to explore the transferability of GWAS findings and PGSs for CAD and related cardiometabolic traits in British Pakistani and Bangladeshi individuals from a cohort with real-world electronic clinical data. We propose a new approach to assessing transferability of GWAS loci between populations, which can serve as a new methodological standard in this developing field. We find evidence of overall high transferability of GWAS loci in British Pakistanis and Bangladeshis. BMI, lipids and blood pressure show the highest transferability of loci, and CAD the lowest. The transferability of PGSs varied between traits, being high for HDL-C, triglycerides and blood pressure but more modest for CAD, BMI and LDL-C. Our results suggest that, for some traits, the use of transferable GWAS loci improves the robustness of Mendelian randomisation estimates in non-Europeans. The polygenic score for CAD derived from genetic studies of European individuals improves reclassification on top of clinical risk factors in British Pakistanis and Bangladeshis. The improvement was driven by identification of more cases in younger individuals (25–54 years old), and of controls in older individuals (55–84 years old). Incorporation of the polygenic score for CAD into risk prediction models is likely to prevent cardiovascular events and deaths in this population.

Samuel Lambert

Researcher

Publications

The NHGRI-EBI GWAS Catalog: knowledgebase and deposition resource

The Polygenic Score Catalog as an open database for reproducibility and systematic evaluation

An atlas of genetic scores to predict multi-omic traits

Determination and Inference of Eukaryotic Transcription Factor Sequence Specificity

Genetically personalised organ-specific metabolic models in health and disease

Integrative analysis of the plasma proteome and polygenic risk of cardiometabolic diseases

Improving reporting standards for polygenic scores in risk prediction studies

Quality control and removal of technical variation of NMR metabolic biomarker data in ∼120,000 UK Biobank participants

Integrative analysis of the plasma proteome and polygenic risk of cardiometabolic diseases

Genetically personalised organ-specific metabolic models in health and disease

The Polygenic Score Catalog: an open database for reproducibility and systematic evaluation

Quality control and removal of technical variation of NMR metabolic biomarker data in ~120,000 UK Biobank participants

Transferability of genetic loci and polygenic scores for cardiometabolic traits in British Pakistani and Bangladeshi individuals

Improving reporting standards for polygenic scores in risk prediction studies

An atlas of genetic scores to predict multi-omic traits

Transferability of genetic loci and polygenic scores for cardiometabolic traits in British Pakistanis and Bangladeshis

Related Organisations

University Of Cambridge

University Of Toronto

Related Funding Activities
