ARDC Research Link Australia

Publication

MELODI Presto: a fast and agile tool to explore semantic triples derived from biomedical literature

Publisher: Oxford University Press (OUP)

Date: 18-08-2020

Abstract: The field of literature-based discovery is growing in step with the volume of literature being produced. From modern natural language processing algorithms to high quality entity tagging, the methods and their impact are developing rapidly. One annotation object that arises from these approaches, the subject–predicate–object triple, is proving to be very useful in representing knowledge. We have implemented efficient search methods and an application programming interface to create fast and convenient functions to utilize triples extracted from the biomedical literature by SemMedDB. By refining these data, we have identified a set of triples that focus on the mechanistic aspects of the literature, and provide simple methods to explore both enriched triples from single queries, and overlapping triples across two query lists. Availability: melodi-presto.mrcieu.ac.uk/. Supplementary data are available at Bioinformatics online.

Publication

MR-Base: a platform for systematic causal inference across the phenome using billions of genetic associations

Publisher: Cold Spring Harbor Laboratory

Date: 16-12-2016

DOI: 10.1101/078972

Abstract: Published genetic associations can be used to infer causal relationships between phenotypes, bypassing the need for individual-level genotype or phenotype data. We have curated complete summary data from 1094 genome-wide association studies (GWAS) on diseases and other complex traits into a centralised database, and developed an analytical platform that uses these data to perform Mendelian randomization (MR) tests and sensitivity analyses (MR-Base, www.mrbase.org). Combined with curated data of published GWAS hits for phenomic measures, the MR-Base platform enables millions of potential causal relationships to be evaluated. We use the platform to predict the impact of lipid lowering on human health. While our analysis provides evidence that reducing LDL-cholesterol, lipoprotein(a) or triglyceride levels reduces coronary disease risk, it also suggests causal effects on a number of other non-vascular outcomes, indicating potential for adverse effects or drug repositioning of lipid-lowering therapies.

Publication

Genomics and transcriptomics across the diversity of the Nematoda

Publisher: Wiley

Date: 02-2012

DOI: 10.1111/J.1365-3024.2011.01342.X

Abstract: The diversity of biology in nematodes is reflected in the diversity of their genomes. Parasitic species in particular have evolved mechanisms to invade and outwit their hosts, and these offer opportunities for the development of control measures. Genomic analyses can reveal the molecular underpinnings of phenotypes such as parasitism and thus initiate and support research programmes that explore the manipulation of host and parasite physiologies to achieve favourable outcomes. Wide sampling across nematode diversity allows phylogenetically informed formulation of research hypotheses, identification of core features shared by all species or important evolutionary novelties present in isolated clades. Many nematode species have been investigated through the use of the expressed sequence tag approach, which samples from the transcribed genome. Gene catalogues generated in this way can be explored to reveal the patterns of expression associated with parasitism and candidates for testing as drug targets or vaccine components. Analysis environments, such as NEMBASE, facilitate exploitation of these data. The development of new high‐throughput DNA‐sequencing technologies has facilitated transcriptomic and genomic approaches to parasite biology. Whole genome sequencing offers more complete catalogues of genes and assists a systems approach to phenotype dissection. These efforts are being coordinated through the 959 Nematode Genomes initiative.

Publication

Mendelian Randomization analysis reveals a causal influence of circulating sclerostin levels on bone mineral density and fractures

Publisher: Cold Spring Harbor Laboratory

Date: 29-10-2018

DOI: 10.1101/455386

Abstract: In bone, sclerostin is mainly osteocyte-derived and plays an important local role in adaptive responses to mechanical loading. Whether circulating levels of sclerostin also play a functional role is currently unclear, which we aimed to examine by two-sample Mendelian Randomisation (MR). A genetic instrument for circulating sclerostin, derived from a genome wide association study (GWAS) meta-analysis of serum sclerostin in 10,584 European-descent individuals, was examined in relation to femoral neck bone mineral density (BMD, n=32,744) in GEFOS, and estimated BMD by heel ultrasound (eBMD, n=426,824), and fracture risk (n=426,795), in UK Biobank. Our GWAS identified two novel serum sclerostin loci, B4GALNT3 (standard deviation (SD) change in sclerostin per A allele: β=0.20, P=4.6×10⁻⁴⁹), and GALNT1 (β=0.11 per G allele, P=4.4×10⁻¹¹). B4GALNT3 is an N-acetyl-galactosaminyltransferase, adding a terminal LacdiNAc disaccharide to target glycoproteins, found to be predominantly expressed in kidney, whereas GALNT1 is an enzyme causing mucin-type O-linked glycosylation. Using these two SNPs as genetic instruments, MR revealed an inverse causal relationship between serum sclerostin and femoral neck BMD (β=−0.12, 95% CI=−0.20 to −0.05) and eBMD (β=−0.12, 95% CI=−0.14 to −0.10), and a positive relationship with fracture risk (β=0.11, 95% CI=0.01 to 0.21). Colocalization analysis demonstrated common genetic signals within the B4GALNT3 locus for higher sclerostin, lower eBMD, and greater B4GALNT3 expression in arterial tissue (Probability %). Our findings suggest that higher sclerostin levels are causally related to lower BMD and greater fracture risk. Hence, strategies for reducing circulating sclerostin, for example by targeting glycosylation enzymes as suggested by our GWAS results, may prove valuable in treating osteoporosis.

Publication

Genome wide analysis for mouth ulcers identifies associations at immune regulatory loci

Publisher: Springer Science and Business Media LLC

Date: 05-03-2019

DOI: 10.1038/S41467-019-08923-6

Abstract: Mouth ulcers are the most common ulcerative condition and encompass several clinical diagnoses, including recurrent aphthous stomatitis (RAS). Despite previous evidence for heritability, it is not clear which specific genetic loci are implicated in RAS. In this genome-wide association study (n = 461,106), heritability is estimated at 8.2% (95% CI: 6.4%, 9.9%). This study finds 97 variants which alter the odds of developing non-specific mouth ulcers and replicates these in an independent cohort (n = 355,744) (lead variant after meta-analysis: rs76830965, near IL12A, OR 0.72 (95% CI: 0.71, 0.73), P = 4.4e−483). Additional effect estimates from three independent cohorts with more specific phenotyping and specific study characteristics support many of these findings. In silico functional analyses provide evidence for a role of T cell regulation in the aetiology of mouth ulcers. These results provide novel insight into the pathogenesis of a common, important condition.

Publication

The MR-Base platform supports systematic causal inference across the human phenome

Publisher: eLife Sciences Publications, Ltd

Date: 30-05-2018

DOI: 10.7554/ELIFE.34408

Abstract: Results from genome-wide association studies (GWAS) can be used to infer causal relationships between phenotypes, using a strategy known as 2-sample Mendelian randomization (2SMR) and bypassing the need for individual-level data. However, 2SMR methods are evolving rapidly and GWAS results are often insufficiently curated, undermining efficient implementation of the approach. We therefore developed MR-Base (www.mrbase.org): a platform that integrates a curated database of complete GWAS results (no restrictions according to statistical significance) with an application programming interface, web app and R packages that automate 2SMR. The software includes several sensitivity analyses for assessing the impact of horizontal pleiotropy and other violations of assumptions. The database currently comprises 11 billion single nucleotide polymorphism-trait associations from 1673 GWAS and is updated on a regular basis. Integrating data with software ensures more rigorous application of hypothesis-driven analyses and allows millions of potential causal relationships to be efficiently evaluated in phenome-wide association studies.

Publication

DNA taxonomy of a neglected animal phylum: an unexpected diversity of tardigrades

Publisher: The Royal Society

Date: 07-05-2004

DOI: 10.1098/RSBL.2003.0130

Publication

MicroRNAs as potential therapeutics to enhance chemosensitivity in advanced prostate cancer

Publisher: Cold Spring Harbor Laboratory

Date: 28-02-2018

DOI: 10.1101/273284

Abstract: Docetaxel and cabazitaxel are taxane chemotherapy treatments for metastatic castration-resistant prostate cancer (CRPC). However, therapeutic resistance remains a major issue. MicroRNAs are short non-coding RNAs that can silence multiple genes, regulating several signalling pathways simultaneously. Therefore, synthetic microRNAs may have therapeutic potential in CRPC by regulating genes involved in taxane response and minimising compensatory mechanisms that cause taxane resistance. To identify microRNAs that can improve the efficacy of taxanes in CRPC, we performed a genome-wide screen of 1280 microRNAs in the CRPC cell lines PC3 and DU145 in combination with docetaxel or cabazitaxel treatment. Mimics of miR-217 and miR-181b-5p enhanced apoptosis significantly in PC3 cells in the presence of these taxanes. These mimics downregulated at least a thousand different transcripts, which were enriched for genes with cell proliferation and focal adhesion functions. Individual knockdown of a selection of 46 genes representing these transcripts resulted in toxic or taxane sensitisation effects, indicating that these genes may be mediating the effects of the microRNA mimics. A range of these genes are expressed in CRPC metastases, suggesting that these microRNA mimics may be functional in CRPC. With further development, these microRNA mimics may have therapeutic potential to improve taxane response in CRPC patients.

Publication

Trans-ethnic Mendelian-randomization study reveals causal relationships between cardiometabolic factors and chronic kidney disease

Publisher: Oxford University Press (OUP)

Date: 20-10-2021

DOI: 10.1093/IJE/DYAB203

Abstract: This study aimed to systematically test whether previously reported risk factors for chronic kidney disease (CKD) are causally related to CKD in European and East Asian ancestries using Mendelian randomization. A total of 45 risk factors with genetic data in European ancestry and 17 risk factors in East Asian participants were identified as exposures from PubMed. We defined CKD by clinical diagnosis or by estimated glomerular filtration rate of & ml/min/1.73 m2. Ultimately, 51 672 CKD cases and 958 102 controls of European ancestry from CKDGen, UK Biobank and HUNT, and 13 093 CKD cases and 238 118 controls of East Asian ancestry from Biobank Japan, China Kadoorie Biobank and Japan-Kidney-Biobank/ToMMo were included. Eight risk factors showed reliable evidence of causal effects on CKD in Europeans, including genetically predicted body mass index (BMI), hypertension, systolic blood pressure, high-density lipoprotein cholesterol, apolipoprotein A-I, lipoprotein(a), type 2 diabetes (T2D) and nephrolithiasis. In East Asians, BMI, T2D and nephrolithiasis showed evidence of causality on CKD. In two independent replication analyses, we observed that increased hypertension risk showed reliable evidence of a causal effect on increasing CKD risk in Europeans but in contrast showed a null effect in East Asians. Although liability to T2D showed consistent effects on CKD, the effects of glycaemic phenotypes on CKD were weak. Non-linear Mendelian randomization indicated a threshold relationship between genetically predicted BMI and CKD, with increased risk at BMI of & kg/m2. Eight cardiometabolic risk factors showed causal effects on CKD in Europeans and three of them showed causality in East Asians, providing insights into the design of future interventions to reduce the burden of CKD.

Publication

MicroRNA profiling of the pubertal mouse mammary gland identifies miR-184 as a candidate breast tumour suppressor gene

Publisher: Springer Science and Business Media LLC

Date: 13-06-2015

DOI: 10.1186/S13058-015-0593-0

Publication

The variant call format provides efficient and robust storage of GWAS summary statistics

Publisher: Springer Science and Business Media LLC

Date: 13-01-2021

DOI: 10.1186/S13059-020-02248-0

Abstract: GWAS summary statistics are fundamental for a variety of research applications, yet no common storage format has been widely adopted. Existing tabular formats ambiguously or incompletely store information about genetic variants and associations, lack essential metadata and are typically not indexed, yielding poor query performance and increasing the possibility of errors in data interpretation and post-GWAS analyses. To address these issues, we adapted the variant call format to store GWAS summary statistics (GWAS-VCF) and developed open-source tools to use this format in downstream analyses. We provide open access to over 10,000 complete GWAS summary datasets converted to this format (gwas.mrcieu.ac.uk).

Publication

Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases

Publisher: Springer Science and Business Media LLC

Date: 07-09-2020

DOI: 10.1038/S41588-020-0682-6

Publication

MELODI: Mining enriched literature objects to derive intermediates

Publisher: Oxford University Press (OUP)

Date: 12-01-2018

DOI: 10.1093/IJE/DYX251

Publication

EpiGraphDB: a database and data mining platform for health data science

Publisher: Oxford University Press (OUP)

Date: 24-11-2020

DOI: 10.1093/BIOINFORMATICS/BTAA961

Abstract: The wealth of data resources on human phenotypes, risk factors, molecular traits and therapeutic interventions presents new opportunities for population health sciences. These opportunities are paralleled by a growing need for data integration, curation and mining to increase research efficiency, reduce mis-inference and ensure reproducible research. We developed EpiGraphDB (epigraphdb.org/), a graph database containing an array of different biomedical and epidemiological relationships and an analytical platform to support their use in human population health data science. In addition, we present three case studies that illustrate the value of this platform. The first uses EpiGraphDB to evaluate potential pleiotropic relationships, addressing mis-inference in systematic causal analysis. In the second case study, we illustrate how protein–protein interaction data offer opportunities to identify new drug targets. The final case study integrates causal inference using Mendelian randomization with relationships mined from the biomedical literature to ‘triangulate’ evidence from different sources. The EpiGraphDB platform is openly available at epigraphdb.org. Code for replicating case study results is available at github.com/MRCIEU/epigraphdb as Jupyter notebooks using the API, and mrcieu.github.io/epigraphdb-r using the R package. Supplementary data are available at Bioinformatics online.

Publication

MicroRNAs as potential therapeutics to enhance chemosensitivity in advanced prostate cancer

Publisher: Springer Science and Business Media LLC

Date: 18-05-2018

DOI: 10.1038/S41598-018-26050-Y

Abstract: Docetaxel and cabazitaxel are taxane chemotherapy treatments for metastatic castration-resistant prostate cancer (CRPC). However, therapeutic resistance remains a major issue. MicroRNAs are short non-coding RNAs that can silence multiple genes, regulating several signalling pathways simultaneously. Therefore, synthetic microRNAs may have therapeutic potential in CRPC by regulating genes involved in taxane response and minimising compensatory mechanisms that cause taxane resistance. To identify microRNAs that can improve the efficacy of taxanes in CRPC, we performed a genome-wide screen of 1280 microRNAs in the CRPC cell lines PC3 and DU145 in combination with docetaxel or cabazitaxel treatment. Mimics of miR-217 and miR-181b-5p enhanced apoptosis significantly in PC3 cells in the presence of these taxanes. These mimics downregulated at least a thousand different transcripts, which were enriched for genes with cell proliferation and focal adhesion functions. Individual knockdown of a selection of 46 genes representing these transcripts resulted in toxic or taxane sensitisation effects, indicating that these genes may be mediating the effects of the microRNA mimics. A range of these genes are expressed in CRPC metastases, suggesting that these microRNA mimics may be functional in CRPC. With further development, these microRNA mimics may have therapeutic potential to improve taxane response in CRPC patients.

Publication

Badger—an accessible genome exploration environment

Publisher: Oxford University Press (OUP)

Date: 11-08-2013

DOI: 10.1093/BIOINFORMATICS/BTT466

Abstract: Summary: High-quality draft genomes are now easy to generate, as sequencing and assembly costs have dropped dramatically. However, building a user-friendly searchable Web site and database for a newly annotated genome is not straightforward. Here we present Badger, a lightweight and easy-to-install genome exploration environment designed for next generation non-model organism genomes. Availability: Badger is released under the GPL and is available at badger.bio.ed.ac.uk/. We show two working examples: (i) a test dataset included with the source code, and (ii) a collection of four filarial nematode genomes. Contact: mark.blaxter@ed.ac.uk

Publication

Can the impact of childhood adiposity on disease risk be reversed? A Mendelian randomization study

Publisher: Cold Spring Harbor Laboratory

Date: 05-10-2019

DOI: 10.1101/19008011

Abstract: Objective: To evaluate whether early life adiposity has an independent effect on later life disease risk or whether its influence is mediated by adulthood body mass index (BMI). Design: Two-sample univariable and multivariable Mendelian randomization. Setting: The UK Biobank (UKB) prospective cohort study and four large-scale genome-wide association study (GWAS) consortia. Participants: 453,169 participants enrolled in the UKB and a combined total of over 700,000 individuals from different GWAS consortia. Exposures: Measured BMI during adulthood (mean age: 56.5) and self-reported adiposity at age 10. Main outcome measures: Coronary artery disease (CAD), type 2 diabetes (T2D), breast cancer and prostate cancer. Results: Individuals with genetically predicted higher BMI in early life had increased odds of CAD (OR: 1.49, 95% CI: 1.33-1.68) and T2D (OR: 2.32, 95% CI: 1.76-3.05) based on univariable MR (UVMR) analyses. However, there was little evidence of a direct effect (i.e. not via adult BMI) based on multivariable MR (MVMR) estimates (CAD OR: 1.02, 95% CI: 0.86-1.22; T2D OR: 1.16, 95% CI: 0.74-1.82). In the MVMR analysis of breast cancer risk, there was strong evidence of a protective direct effect for early BMI (OR: 0.59, 95% CI: 0.50-0.71), although adult BMI did not appear to have a direct effect on this outcome (OR: 1.08, 95% CI: 0.93-1.27). Adding age of menarche as an additional exposure provided weak evidence of a total causal effect (UVMR OR: 0.98, 95% CI: 0.91-1.06) but strong evidence of a direct causal effect, independent of early and adult BMI (MVMR OR: 0.90, 95% CI: 0.85-0.95). Weak evidence of a causal effect was observed in the MVMR analysis of prostate cancer (early life BMI OR: 1.06, 95% CI: 0.81-1.40; adult BMI OR: 0.87, 95% CI: 0.70-1.08). Conclusions: Our findings suggest that increased CAD and T2D risk attributed to early life adiposity can be mitigated if individuals reduce their weight in later life. However, having a low BMI during childhood may increase risk of breast cancer regardless of changes to weight in later life, with timing of puberty also putatively playing an important role.

Publication

LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis

Publisher: Oxford University Press (OUP)

Date: 22-09-2016

DOI: 10.1093/BIOINFORMATICS/BTW613

Abstract: LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously. In this manuscript, we describe LD Hub - a centralized database of summary-level GWAS results for 173 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies. The web interface and instructions for using LD Hub are available at ldsc.broadinstitute.org/. Supplementary data are available at Bioinformatics online.

Publication

Phylogenomics of Nematoda

Publisher: Cambridge University Press

Date: 2016

DOI: 10.1017/CBO9781139236355.004

Publication

The variant call format provides efficient and robust storage of GWAS summary statistics

Publisher: Cold Spring Harbor Laboratory

Date: 30-05-2020

DOI: 10.1101/2020.05.29.115824

Abstract: Genome-wide association study (GWAS) summary statistics are a fundamental resource for a variety of research applications [1–6]. Yet despite their widespread utility, no common storage format has been widely adopted, hindering tool development and data sharing, analysis and integration. Existing tabular formats [7,8] often ambiguously or incompletely store information about genetic variants and their associations, and also lack essential metadata, increasing the possibility of errors in data interpretation and post-GWAS analyses. Additionally, data in these formats are typically not indexed, requiring the whole file to be read, which is computationally inefficient. To address these issues, we propose an adaptation of the variant call format [9] (GWAS-VCF) and have produced a suite of open-source tools for using this format in downstream analyses. Simulation studies determine GWAS-VCF is 9-46x faster than tabular alternatives when extracting variant(s) by genomic position. Our results demonstrate the GWAS-VCF provides a robust and performant solution for sharing, analysis and integration of GWAS data. We provide open access to over 10,000 complete GWAS summary datasets converted to this format (available from: gwas.mrcieu.ac.uk).

Publication

A molecular analysis of desiccation tolerance mechanisms in the anhydrobiotic nematode Panagrolaimus superbus using expressed sequenced tags

Publisher: Springer Science and Business Media LLC

Date: 26-01-2012

DOI: 10.1186/1756-0500-5-68

Publication

Comment on Protein Sequences from Mastodon and Tyrannosaurus rex Revealed by Mass Spectrometry

Publisher: American Association for the Advancement of Science (AAAS)

Date: 04-01-2008

DOI: 10.1126/SCIENCE.1147046

Abstract: We used authentication tests developed for ancient DNA to evaluate claims by Asara et al. (Reports, 13 April 2007, p. 280) of collagen peptide sequences recovered from mastodon and Tyrannosaurus rex fossils. Although the mastodon samples pass these tests, absence of amino acid composition data, lack of evidence for peptide deamidation, and association of α1(I) collagen sequences with amphibians rather than birds suggest that T. rex does not.

Publication

Dihydroartemisinin inhibits the human erythroid cell differentiation by altering the cell cycle

Publisher: Elsevier BV

Date: 10-2012

DOI: 10.1016/J.TOX.2012.05.024

Abstract: Artemisinin derivatives such as dihydroartemisinin (DHA) induce significant depletion of early embryonic erythroblasts in animal models. We have reported previously that DHA specifically targets pro-erythroblasts and basophilic erythroblasts when human CD34+ stem cells are differentiated toward the erythroid lineage, indicating that a window of susceptibility to artemisinins may also exist in human developmental erythropoiesis during pregnancy. To better investigate the toxicity of artemisinin derivatives, the structure-activity relationship was evaluated against the K562 leukaemia cell line, used as a model for differentiating early human erythroblasts. All artemisinin derivatives, except deoxyartemisinin, inhibited both spontaneous and induced erythroid differentiation, confirming that the peroxide bridge is responsible for the erythro-toxicity. In contrast, cell growth was markedly reduced by DHA, artemisone and artesunate but not by artemisinin, 10-deoxoartemisinin or deoxyartemisinin. The substituent at position C-10 is responsible only for the anti-proliferative effect, since 10-deoxoartemisinin did not reduce cell growth but arrested the differentiation of K562 cells. In particular, the results showed that DHA was the most potent and rapidly acting compound of the drug family, causing (i) the decreased expression of GpA surface receptors and the downregulation of the γ-globin gene, (ii) the alteration of the S phase of the cell cycle and (iii) the induction of programmed cell death of early erythroblasts in a dose-dependent manner within 24 h. In conclusion, these findings confirm that the active metabolite DHA is responsible for the erythro-toxicity of most of the artemisinins used in therapy. Thus, as long as no further clinical data are available, current WHO recommendations of avoiding malaria treatment with artemisinins during the first trimester of pregnancy remain valid.

Publication

A phenome-wide approach to identify causal risk factors for deep vein thrombosis

Publisher: Cold Spring Harbor Laboratory

Date: 22-11-2018

DOI: 10.1101/476135

Abstract: Deep vein thrombosis (DVT) is the formation of a blood clot in a deep vein. DVT can lead to a venous thromboembolism (VTE), the combined term for DVT and pulmonary embolism, a leading cause of death and disability worldwide. Despite the prevalence and associated morbidity of DVT, the underlying causes are not well understood. We aimed to leverage publicly available genetic summary association statistics to identify causal risk factors for DVT. We conducted a Mendelian randomization phenome-wide association study (MR-PheWAS) using genetic summary association statistics for 973 exposures and DVT (6,767 cases and 330,392 controls in UK Biobank). There was evidence for a causal effect of 57 exposures on DVT risk, including previously reported risk factors (e.g. body mass index (BMI) and height) and novel risk factors (e.g. hyperthyroidism, chronic obstructive pulmonary disease (COPD) and varicose veins). As the majority of identified risk factors were adiposity-related, we explored the molecular link with DVT by undertaking a two-sample MR mediation analysis of BMI-associated circulating proteins on DVT risk. Our results indicate that circulating neurogenic locus notch homolog protein 1 (NOTCH1), inhibin beta C chain (INHBC) and plasminogen activator inhibitor 1 (PAI-1) influence DVT risk, with PAI-1 mediating the BMI-DVT relationship. Using a phenome-wide approach, we provide putative causal evidence that hyperthyroidism, varicose veins, COPD and BMI enhance the risk of DVT. Furthermore, the circulating protein PAI-1 has a causal role in DVT aetiology and is involved in mediating the BMI-DVT relationship.

Publication

Use of genetic variation to separate the effects of early and later life adiposity on disease risk: mendelian randomisation study

Publisher: BMJ

Date: 06-05-2020

DOI: 10.1136/BMJ.M1203

Abstract: Objective: To evaluate whether body size in early life has an independent effect on risk of disease in later life or whether its influence is mediated by body size in adulthood. Design: Two sample univariable and multivariable mendelian randomisation. Setting: The UK Biobank prospective cohort study and four large scale genome-wide association studies (GWAS) consortiums. Participants: 453 169 participants enrolled in UK Biobank and a combined total of more than 700 000 people from different GWAS consortiums. Exposures: Measured body mass index during adulthood (mean age 56.5) and self-reported perceived body size at age 10. Main outcome measures: Coronary artery disease, type 2 diabetes, breast cancer, and prostate cancer. Results: Having a larger genetically predicted body size in early life was associated with an increased odds of coronary artery disease (odds ratio 1.49 for each change in body size category unless stated otherwise, 95% confidence interval 1.33 to 1.68) and type 2 diabetes (2.32, 1.76 to 3.05) based on univariable mendelian randomisation analyses. However, little evidence was found of a direct effect (ie, not through adult body size) based on multivariable mendelian randomisation estimates (coronary artery disease: 1.02, 0.86 to 1.22; type 2 diabetes: 1.16, 0.74 to 1.82). In the multivariable mendelian randomisation analysis of breast cancer risk, strong evidence was found of a protective direct effect for larger body size in early life (0.59, 0.50 to 0.71), with less evidence of a direct effect of adult body size on this outcome (1.08, 0.93 to 1.27). Including age at menarche as an additional exposure provided weak evidence of a total causal effect (univariable mendelian randomisation odds ratio 0.98, 95% confidence interval 0.91 to 1.06) but strong evidence of a direct causal effect, independent of early life and adult body size (multivariable mendelian randomisation odds ratio 0.90, 0.85 to 0.95). No strong evidence was found of a causal effect of either early or later life measures on prostate cancer (early life body size odds ratio 1.06, 95% confidence interval 0.81 to 1.40; adult body size 0.87, 0.70 to 1.08). Conclusions: The findings suggest that the positive association between body size in childhood and risk of coronary artery disease and type 2 diabetes in adulthood can be attributed to individuals remaining large into later life. However, having a smaller body size during childhood might increase the risk of breast cancer regardless of body size in adulthood, with timing of puberty also putatively playing a role.

Publication

ID4 controls mammary stem cells and marks breast cancers with a stem cell-like phenotype

Publisher: Springer Science and Business Media LLC

Date: 27-03-2015

DOI: 10.1038/NCOMMS7548

Abstract: Basal-like breast cancer (BLBC) is a heterogeneous disease with poor prognosis; however, its cellular origins and aetiology are poorly understood. In this study, we show that inhibitor of differentiation 4 (ID4) is a key regulator of mammary stem cell self-renewal and marks a subset of BLBC with a putative mammary basal cell of origin. Using an ID4GFP knock-in reporter mouse and single-cell transcriptomics, we show that ID4 marks a stem cell-enriched subset of the mammary basal cell population. ID4 maintains the mammary stem cell pool by suppressing key factors required for luminal differentiation. Furthermore, ID4 is specifically expressed by a subset of human BLBC that possess a very poor prognosis and a transcriptional signature similar to a mammary stem cell. These studies identify ID4 as a mammary stem cell regulator, deconvolute the heterogeneity of BLBC and link a subset of mammary stem cells to the aetiology of BLBC.

Publication

Integrating Mendelian randomization and literature-mined evidence for breast cancer risk factors

Publisher: Cold Spring Harbor Laboratory

Date: 22-07-2022

DOI: 10.1101/2022.07.19.22277795

Abstract: An increasing challenge in population health research is efficiently utilising the wealth of data available from multiple sources to investigate the mechanisms of disease and identify potential intervention targets. The use of biomedical data integration platforms can facilitate evidence triangulation from these different sources, improving confidence in causal relationships of interest. In this work, we aimed to integrate Mendelian randomization (MR) and literature-mined evidence from the EpiGraphDB knowledge graph to build a comprehensive overview of risk factors for developing breast cancer. We utilised MR-EvE (“Everything-vs-Everything”) data to generate a list of causal risk factors for breast cancer, integrated this data with literature-mined relationships and identified potential mediators. We used multivariable MR to evaluate mediation and estimate the direct effects of these traits. We identified 213 novel and established lifestyle and molecular traits with evidence of an effect on breast cancer. We present the results of this evidence integration for four case studies (insulin-like growth factor I, cardiotrophin-1, childhood body size and age at menopause). We demonstrate that using MR-EvE to identify disease risk factors is an efficient hypothesis-generating approach. Moreover, we show that integrating MR evidence with literature-mined data may identify causal intermediates and uncover the mechanisms behind disease.

Publication

Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases

Publisher: Cold Spring Harbor Laboratory

Date: 05-05-2019

DOI: 10.1101/627398

Abstract: The human proteome is a major source of therapeutic targets. Recent genetic association analyses of the plasma proteome enable systematic evaluation of the causal consequences of variation in plasma protein levels. Here, we estimated the effects of 1002 proteins on 225 phenotypes using two-sample Mendelian randomization (MR) and colocalization. Of 413 associations supported by evidence from MR, 130 (31.5%) were not supported by results of colocalization analyses, suggesting that genetic confounding due to linkage disequilibrium (LD) is widespread in naive phenome-wide association studies of proteins. Combining MR and colocalization evidence in cis-only analyses, we identified 111 putatively causal effects between 65 proteins and 52 disease-related phenotypes ( qtl/ ). Evaluation of data from historic drug development programmes showed that target-indication pairs with MR and colocalization support were more likely to be approved, evidencing the value of our approach in identifying and prioritising potential therapeutic targets.

Publication

Coffee consumption and risk of breast cancer: A Mendelian randomization study.

Publisher: Public Library of Science (PLoS)

Date: 19-01-2021

DOI: 10.1371/JOURNAL.PONE.0236904

Abstract: Observational studies have reported either null or weak protective associations for coffee consumption and risk of breast cancer. We conducted a two-sample Mendelian randomization (MR) analysis to evaluate the relationship between coffee consumption and breast cancer risk using 33 single-nucleotide polymorphisms (SNPs) associated with coffee consumption from a genome-wide association (GWA) study on 212,119 female UK Biobank participants of White British ancestry. Risk estimates for breast cancer were retrieved from publicly available GWA summary statistics from the Breast Cancer Association Consortium (BCAC) on 122,977 cases (of which 69,501 were estrogen receptor (ER)-positive, 21,468 ER-negative) and 105,974 controls of European ancestry. Random-effects inverse variance weighted (IVW) MR analyses were performed along with several sensitivity analyses to assess the impact of potential MR assumption violations. One cup per day increase in genetically predicted coffee consumption in women was not associated with risk of total (IVW random-effects odds ratio (OR): 0.91, 95% confidence intervals (CI): 0.80–1.02, P: 0.12, P for instrument heterogeneity: 7.17e-13), ER-positive (OR = 0.90, 95% CI: 0.79–1.02, P: 0.09) and ER-negative breast cancer (OR: 0.88, 95% CI: 0.75–1.03, P: 0.12). Null associations were also found in the sensitivity analyses using MR-Egger (total breast cancer OR: 1.00, 95% CI: 0.80–1.25), weighted median (OR: 0.97, 95% CI: 0.89–1.05) and weighted mode (OR: 1.00, CI: 0.93–1.07). The results of this large MR study do not support an association of genetically predicted coffee consumption with breast cancer risk, but we cannot rule out the existence of a weak association.

Publication

Cancer cell CCL5 mediates bone marrow independent angiogenesis in breast cancer

Publisher: Impact Journals, LLC

Date: 16-11-2016

DOI: 10.18632/ONCOTARGET.13387

Publication

Single cell transcriptomics reveals molecular subtype and functional heterogeneity in models of breast cancer

Publisher: Cold Spring Harbor Laboratory

Date: 14-03-2018

DOI: 10.1101/282079

Abstract: Breast cancer has long been classified into a number of molecular subtypes that predict prognosis and therefore influence clinical treatment decisions. Cellular heterogeneity is also evident in breast cancers and plays a key role in the development, evolution and metastatic progression of many cancers. How clinical heterogeneity relates to cellular heterogeneity is poorly understood, so we approached this question using single cell gene expression analysis of well established in vitro and in vivo models of disease. To explore the cellular heterogeneity in breast cancer we first examined a panel of genes that define the PAM50 classifier of molecular subtype. Five breast cancer cell line models (MCF7, BT474, SKBR3, MDA-MB-231, and MDA-MB-468) were selected as representatives of the intrinsic molecular subtypes (luminal A and B, basal-like, and Her2-enriched). Single cell multiplex RT-PCR was used to isolate and quantify the gene expression of single cells from each of these models, and the PAM50 classifier applied. Using this approach, we identified heterogeneity of intrinsic subtypes at single-cell level, indicating that cells with different subtypes exist within a cell line. Using the Chromium 10X system, this study was extended into thousands of cells from the MCF7 cell-line and an ER+ patient derived xenograft (PDX) model and again identified significant intra-tumour heterogeneity of molecular subtype. Estrogen Receptor (ER) is an important driver and therapeutic target in many breast cancers. It is heterogeneously expressed in a proportion of clinical cases but the significance of this to ER activity is unknown. Significant heterogeneity in the transcriptional activation of ER regulated genes was observed within tumours. This differential activation of the ER cistrome aligned with expression of two known transcriptional co-regulatory factors of ER (FOXA1 and PGR). 
To examine the degree of heterogeneity for other important phenotypic traits, we used an unsupervised clustering approach to identify cellular sub-populations with diverse cancer associated transcriptional properties, such as proliferation, hypoxia and treatment resistance. In particular, we show that we can identify two distinct sub-populations of cells that may have de novo resistance to endocrine therapies in a treatment naïve PDX model of ER+ breast cancer. One of these consists of cells with a non-proliferative transcriptional phenotype that is enriched for transcriptional properties of ERBB2 tumours. The other is heavily enriched for components of the primary cilia. Gene regulatory networks were used to identify transcription factor regulons that are active in each cell, leading us to identify potential transcriptional drivers (such as E2F7, MYB and RFX3) of the cilia associated endocrine resistant cells. This rare subpopulation of cells also has a highly heterogeneous mix of intrinsic subtypes, highlighting a potential role of intra-tumour subtype heterogeneity in endocrine resistance and metastatic potential. Overall, these results suggest a high degree of cellular heterogeneity within breast cancer models, even cell lines, that can be functionally dissected into sub-populations of cells with transcriptional phenotypes of potential clinical relevance.

Publication

Identifying drug targets for neurological and psychiatric disease via genetics and the brain transcriptome

Publisher: Public Library of Science (PLoS)

Date: 08-01-2021

DOI: 10.1371/JOURNAL.PGEN.1009224

Abstract: Discovering drugs that efficiently treat brain diseases has been challenging. Genetic variants that modulate the expression of potential drug targets can be utilized to assess the efficacy of therapeutic interventions. We therefore employed Mendelian Randomization (MR) on gene expression measured in brain tissue to identify drug targets involved in neurological and psychiatric diseases. We conducted a two-sample MR using cis-acting brain-derived expression quantitative trait loci (eQTLs) from the Accelerating Medicines Partnership for Alzheimer’s Disease consortium (AMP-AD) and the CommonMind Consortium (CMC) meta-analysis study (n = 1,286) as genetic instruments to predict the effects of 7,137 genes on 12 neurological and psychiatric disorders. We conducted Bayesian colocalization analysis on the top MR findings (using P x10 -7 as evidence threshold, Bonferroni-corrected for 80,557 MR tests) to confirm sharing of the same causal variants between gene expression and trait in each genomic region. We then intersected the colocalized genes with known monogenic disease genes recorded in Online Mendelian Inheritance in Man (OMIM) and with genes annotated as drug targets in the Open Targets platform to identify promising drug targets. 80 eQTLs showed MR evidence of a causal effect, from which we prioritised 47 genes based on colocalization with the trait. We causally linked the expression of 23 genes with schizophrenia and a single gene each with anorexia, bipolar disorder and major depressive disorder within the psychiatric diseases and 9 genes with Alzheimer’s disease, 6 genes with Parkinson’s disease, 4 genes with multiple sclerosis and two genes with amyotrophic lateral sclerosis within the neurological diseases we tested.
From these we identified five genes (ACE, GPNMB, KCNQ5, RERE and SUOX) as attractive drug targets that may warrant follow-up in functional studies and clinical trials, demonstrating the value of this study design for discovering drug targets in neuropsychiatric diseases.

Publication

NEMBASE4: The nematode transcriptome resource

Publisher: Elsevier BV

Date: 07-2011

DOI: 10.1016/J.IJPARA.2011.03.009

Abstract: Nematode parasites are of major importance in human health and agriculture, and free-living species deliver essential ecosystem services. The genomics revolution has resulted in the production of many datasets of expressed sequence tags (ESTs) from a phylogenetically wide range of nematode species, but these are not easily compared. NEMBASE4 presents a single portal into extensively functionally annotated, EST-derived transcriptomes from over 60 species of nematodes, including plant and animal parasites and free-living taxa. Using the PartiGene suite of tools, we have assembled the publicly available ESTs for each species into a high-quality set of putative transcripts. These transcripts have been translated to produce a protein sequence resource and each is annotated with functional information derived from comparison with well-studied nematode species such as Caenorhabditis elegans and other non-nematode resources. By cross-comparing the sequences within NEMBASE4, we have also generated a protein family assignment for each translation. The data are presented in an openly accessible, interactive database. To demonstrate the utility of NEMBASE4, we have used the database to examine the uniqueness of the transcriptomes of major clades of parasitic nematodes, identifying lineage-restricted genes that may underpin particular parasitic phenotypes, possible viral pathogens of nematodes, and nematode-unique protein families that may be developed as drug targets.

Publication

A high-coverage draft genome of the mycalesine butterfly Bicyclus anynana

Publisher: Oxford University Press (OUP)

Date: 09-05-2017

DOI: 10.1093/GIGASCIENCE/GIX035

Publication

Discovering cancer vulnerabilities using high-throughput micro-RNA screening

Publisher: Oxford University Press (OUP)

Date: 16-11-2017

DOI: 10.1093/NAR/GKX1072

Publication

Targeting stromal remodeling and cancer stem cell plasticity to overcome chemoresistance in triple negative breast cancer

Publisher: Cold Spring Harbor Laboratory

Date: 08-11-2017

DOI: 10.1101/215954

Abstract: The cellular and molecular basis of stromal cell recruitment, activation and crosstalk in carcinomas is poorly understood, limiting the development of targeted anti-stromal therapies. In mouse models of triple negative breast cancer (TNBC), Hh ligand produced by neoplastic cells reprogrammed cancer-associated fibroblast (CAF) gene expression, driving tumor growth and metastasis. Hh-activated CAFs upregulated expression of FGF5 and production of fibrillar collagen, leading to FGFR and FAK activation in adjacent neoplastic cells, which then acquired a stem-like, drug-resistant phenotype. Treatment with smoothened inhibitors (SMOi) reversed these phenotypes. Stromal treatment of TNBC patient-derived xenograft (PDX) models with SMOi downregulated the expression of cancer stem cell markers and sensitized tumors to docetaxel, leading to markedly improved survival and reduced metastatic burden. In the phase I clinical trial EDALINE, 3 of 12 patients with metastatic TNBC derived clinical benefit from combination therapy with the SMOi Sonidegib and docetaxel chemotherapy, with one patient experiencing a complete response. Markers of pathway activity correlated with response. These studies identify Hh signaling to CAFs as a novel mediator of cancer stem cell plasticity and an exciting new therapeutic target in TNBC. Compared to other breast cancer subtypes, TNBCs are associated with significantly worse patient outcomes. Standard of care systemic treatment for patients with non-BRCA1/2 positive TNBC is cytotoxic chemotherapy. However, the failure of 70% of treated TNBCs to attain complete pathological response reflects the relative chemoresistance of these tumors. New therapeutic strategies are needed to improve patient survival and quality of life. Here, we provide new insights into the dynamic interactions between heterotypic cells within a tumor. 
Specifically, we establish the mechanisms by which CAFs define cancer cell phenotype and demonstrate that the bidirectional CAF-cancer cell crosstalk can be successfully targeted in mice and humans using anti-stromal therapy.

Publication

MELODI - Mining Enriched Literature Objects to Derive Intermediates

Publisher: Cold Spring Harbor Laboratory

Date: 20-03-2017

DOI: 10.1101/118513

Abstract: The scientific literature contains a wealth of information from different fields on potential disease mechanisms. However, prioritising mechanisms for further analytical evaluation presents enormous challenges in terms of the quantity and diversity of published research. The application of data mining approaches to the literature offers the potential to identify and prioritise mechanisms for more focused and detailed analysis. Here we present MELODI, a literature mining platform that can identify mechanistic pathways between any two biomedical concepts. Two case studies demonstrate the potential uses of MELODI and how it can generate hypotheses for further investigation. Firstly, an analysis of ERG and prostate cancer derives the intermediate transcription factor SP1, recently confirmed to be physically interacting with ERG. Secondly, examining the relationship between a new potential risk factor for pancreatic cancer identifies possible mechanistic insights which can be studied in vitro. MELODI has been implemented as a Python/Django web application, and is freely available to use at www.melodi.biocompute.org.uk. Contact: melodi@biocompute.org.uk

Publication

Mendelian Randomization Analysis Reveals a Causal Influence of Circulating Sclerostin Levels on Bone Mineral Density and Fractures

Publisher: Wiley

Date: 02-08-2019

DOI: 10.1002/JBMR.3803

Publication

LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis

Publisher: Cold Spring Harbor Laboratory

Date: 03-05-2016

DOI: 10.1101/051094

Abstract: LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously. In this manuscript, we describe LD Hub – a centralized database of summary-level GWAS results for 177 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies. The web interface and instructions for using LD Hub are available at ldsc.broadinstitute.org/

Publication

elswob/AXON: Initial release

Publisher: Zenodo

Date: 2019

DOI: 10.5281/ZENODO.3442494

Publication

Targeting stromal remodeling and cancer stem cell plasticity overcomes chemoresistance in triple negative breast cancer

Publisher: Springer Science and Business Media LLC

Date: 24-07-2018

DOI: 10.1038/S41467-018-05220-6

Abstract: The cellular and molecular basis of stromal cell recruitment, activation and crosstalk in carcinomas is poorly understood, limiting the development of targeted anti-stromal therapies. In mouse models of triple negative breast cancer (TNBC), Hedgehog ligand produced by neoplastic cells reprograms cancer-associated fibroblasts (CAFs) to provide a supportive niche for the acquisition of a chemo-resistant, cancer stem cell (CSC) phenotype via FGF5 expression and production of fibrillar collagen. Stromal treatment of patient-derived xenografts with smoothened inhibitors (SMOi) downregulates CSC markers expression and sensitizes tumors to docetaxel, leading to markedly improved survival and reduced metastatic burden. In the phase I clinical trial EDALINE, 3 of 12 patients with metastatic TNBC derived clinical benefit from combination therapy with the SMOi Sonidegib and docetaxel chemotherapy, with one patient experiencing a complete response. These studies identify Hedgehog signaling to CAFs as a novel mediator of CSC plasticity and an exciting new therapeutic target in TNBC.

Benjamin Elsworth

Researcher

Related Links

Publications

MELODI Presto: a fast and agile tool to explore semantic triples derived from biomedical literature

MR-Base: a platform for systematic causal inference across the phenome using billions of genetic associations

Genomics and transcriptomics across the diversity of the Nematoda

Mendelian Randomization analysis reveals a causal influence of circulating sclerostin levels on bone mineral density and fractures

Genome wide analysis for mouth ulcers identifies associations at immune regulatory loci

The MR-Base platform supports systematic causal inference across the human phenome

DNA taxonomy of a neglected animal phylum: an unexpected diversity of tardigrades

MicroRNAs as potential therapeutics to enhance chemosensitivity in advanced prostate cancer

Trans-ethnic Mendelian-randomization study reveals causal relationships between cardiometabolic factors and chronic kidney disease

MicroRNA profiling of the pubertal mouse mammary gland identifies miR-184 as a candidate breast tumour suppressor gene

The variant call format provides efficient and robust storage of GWAS summary statistics

Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases

MELODI: Mining enriched literature objects to derive intermediates

EpiGraphDB: a database and data mining platform for health data science

Badger—an accessible genome exploration environment

Can the impact of childhood adiposity on disease risk be reversed? A Mendelian randomization study

LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis

Phylogenomics of Nematoda

A molecular analysis of desiccation tolerance mechanisms in the anhydrobiotic nematode Panagrolaimus superbus using expressed sequenced tags

Comment on Protein Sequences from Mastodon and Tyrannosaurus rex Revealed by Mass Spectrometry

Dihydroartemisinin inhibits the human erythroid cell differentiation by altering the cell cycle

A phenome-wide approach to identify causal risk factors for deep vein thrombosis

Use of genetic variation to separate the effects of early and later life adiposity on disease risk: mendelian randomisation study

ID4 controls mammary stem cells and marks breast cancers with a stem cell-like phenotype

Integrating Mendelian randomization and literature-mined evidence for breast cancer risk factors

Coffee consumption and risk of breast cancer: A Mendelian randomization study.

Cancer cell CCL5 mediates bone marrow independent angiogenesis in breast cancer

Single cell transcriptomics reveals molecular subtype and functional heterogeneity in models of breast cancer

Identifying drug targets for neurological and psychiatric disease via genetics and the brain transcriptome

NEMBASE4: The nematode transcriptome resource

A high-coverage draft genome of the mycalesine butterfly Bicyclus anynana

Discovering cancer vulnerabilities using high-throughput micro-RNA screening

Targeting stromal remodeling and cancer stem cell plasticity to overcome chemoresistance in triple negative breast cancer

MELODI - Mining Enriched Literature Objects to Derive Intermediates

Mendelian Randomization Analysis Reveals a Causal Influence of Circulating Sclerostin Levels on Bone Mineral Density and Fractures

elswob/AXON: Initial release

Targeting stromal remodeling and cancer stem cell plasticity overcomes chemoresistance in triple negative breast cancer

Related Organisations

The University Of Edinburgh

University Of York

Garvan Institute Of Medical Research

Our Future Health

University Of Bristol
