ARDC Research Link Australia

ORCID Profile
0000-0002-4922-8415

Current Organisation
Olink Proteomics

Publications

Publication

Combined burden and functional impact tests for cancer driver discovery using DriverPower

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41467-019-13929-1

Abstract: The discovery of driver mutations is one of the key motivations for cancer genome sequencing. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumour types, we describe DriverPower, a software package that uses mutational burden and functional impact evidence to identify driver mutations in coding and non-coding sites within cancer whole genomes. Using a total of 1373 genomic features derived from public sources, DriverPower’s background mutation model explains up to 93% of the regional variance in the mutation rate across multiple tumour types. By incorporating functional impact scores, we are able to further increase the accuracy of driver discovery. Testing across a collection of 2583 cancer genomes from the PCAWG project, DriverPower identifies 217 coding and 95 non-coding driver candidates. Comparing to six published methods used by the PCAWG Drivers and Functional Interpretation Working Group, DriverPower has the highest F1 score for both coding and non-coding driver discovery. This demonstrates that DriverPower is an effective framework for computational driver discovery.

Publication

Interpretable machine learning identifies paediatric Systemic Lupus Erythematosus subtypes based on gene expression data

Publisher: Cold Spring Harbor Laboratory

Date: 05-06-2021

DOI: 10.1101/2021.06.03.446884

Abstract: Transcriptomic analyses are commonly used to identify differentially expressed genes between patients and controls, or within individuals across disease courses. These methods, whilst effective, cannot encompass the combinatorial effects of genes driving disease. We applied rule-based machine learning (RBML) models and rule networks (RN) to an existing paediatric Systemic Lupus Erythematosus (SLE) blood expression dataset, with the goal of developing gene networks to separate low and high disease activity (DA1 and DA3). The resultant model had an 81% accuracy to distinguish between DA1 and DA3, with unsupervised hierarchical clustering revealing additional subgroups indicative of the immune axis involved or state of disease flare. These subgroups correlated with clinical variables, suggesting that the gene sets identified may further the understanding of gene networks that act in concert to drive disease progression. This included roles for genes i) induced by interferons (IFI35 and OTOF), ii) key to SLE cell types (KLRB1 encoding CD161), or iii) with roles in autophagy and NF-κB pathway responses (CKAP4). As demonstrated here, RBML approaches have the potential to reveal novel gene patterns from within a heterogeneous disease, facilitating patient clinical and therapeutic stratification.

Publication

Interpretable machine learning identifies paediatric Systemic Lupus Erythematosus subtypes based on gene expression data

Publisher: Springer Science and Business Media LLC

Date: 06-05-2022

DOI: 10.1038/S41598-022-10853-1

Abstract: Transcriptomic analyses are commonly used to identify differentially expressed genes between patients and controls, or within individuals across disease courses. These methods, whilst effective, cannot encompass the combinatorial effects of genes driving disease. We applied rule-based machine learning (RBML) models and rule networks (RN) to an existing paediatric Systemic Lupus Erythematosus (SLE) blood expression dataset, with the goal of developing gene networks to separate low and high disease activity (DA1 and DA3). The resultant model had an 81% accuracy to distinguish between DA1 and DA3, with unsupervised hierarchical clustering revealing additional subgroups indicative of the immune axis involved or state of disease flare. These subgroups correlated with clinical variables, suggesting that the gene sets identified may further the understanding of gene networks that act in concert to drive disease progression. This included roles for genes (i) induced by interferons (IFI35 and OTOF), (ii) key to SLE cell types (KLRB1 encoding CD161), or (iii) with roles in autophagy and NF-κB pathway responses (CKAP4). As demonstrated here, RBML approaches have the potential to reveal novel gene patterns from within a heterogeneous disease, facilitating patient clinical and therapeutic stratification.

Publication

Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S42003-019-0741-7

Abstract: Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires strong functional or genetic evidence. CLC genes are enriched amongst driver genes predicted from somatic mutations, and display characteristic genomic features. Strikingly, CLC genes are enriched for driver mutations from unbiased, genome-wide transposon-mutagenesis screens in mice. We identified 10 tumour-causing mutations in orthologues of 8 lncRNAs, including LINC-PINT and NEAT1, but not MALAT1. Thus CLC represents a dataset of high-confidence cancer lncRNAs. Mutagenesis maps are a novel means for identifying deeply-conserved roles of lncRNAs in tumorigenesis.

Publication

Pan-cancer analysis of whole genomes

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41586-020-1969-6

Abstract: Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale [1–3]. Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4–5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter [4]; identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation [5,6]; analyses timings and patterns of tumour evolution [7]; describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity [8,9]; and evaluates a range of more-specialized features of cancer genomes [8,10–18].

Publication

Analyses of non-coding somatic drivers in 2,658 cancer whole genomes

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41586-020-1965-X

Abstract: The discovery of drivers of cancer has traditionally focused on protein-coding genes [1–4]. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [5] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers [6,7], raise doubts about others and identify novel candidates, including point mutations in the 5′ region of TP53, in the 3′ untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available.

Publication

Sex differences in oncogenic mutational processes

Publisher: Springer Science and Business Media LLC

Date: 28-08-2020

DOI: 10.1038/S41467-020-17359-2

Abstract: Sex differences have been observed in multiple facets of cancer epidemiology, treatment and biology, and in most cancers outside the sex organs. Efforts to link these clinical differences to specific molecular features have focused on somatic mutations within the coding regions of the genome. Here we report a pan-cancer analysis of sex differences in whole genomes of 1983 tumours of 28 subtypes as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. We both confirm the results of exome studies, and also uncover previously undescribed sex differences. These include sex-biases in coding and non-coding cancer drivers, mutation prevalence and strikingly, in mutational signatures related to underlying mutational processes. These results underline the pervasiveness of molecular sex differences and strengthen the call for increased consideration of sex in molecular cancer research.

Publication

Pathway and network analysis of more than 2500 whole cancer genomes

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41467-020-14367-0

Abstract: The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well-characterized and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we perform multi-faceted pathway and network analyses of non-coding mutations across 2583 whole cancer genomes from 27 tumor types compiled by the ICGC/TCGA PCAWG project that was motivated by the success of pathway and network analyses in prioritizing rare mutations in protein-coding genes. While few non-coding genomic elements are recurrently mutated in this cohort, we identify 93 genes harboring non-coding mutations that cluster into several modules of interacting proteins. Among these are promoter mutations associated with reduced mRNA expression in TP53, TLE4, and TCF4. We find that biological processes had variable proportions of coding and non-coding mutations, with chromatin remodeling and proliferation pathways altered primarily by coding mutations, while developmental pathways, including Wnt and Notch, altered by both coding and non-coding mutations. RNA splicing is primarily altered by non-coding mutations in this cohort, and samples containing non-coding mutations in well-known RNA splicing factors exhibit similar gene expression signatures as samples with coding mutations in these genes. These analyses contribute a new repertoire of possible cancer genes and mechanisms that are altered by non-coding mutations and offer insights into additional cancer vulnerabilities that can be investigated for potential therapeutic treatments.

Publication

Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

Publisher: Springer Science and Business Media LLC

Date: 21-09-2020

DOI: 10.1038/S41467-020-18151-Y

Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.

Publication

Integrative pathway enrichment analysis of multivariate omics data

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41467-019-13983-9

Abstract: Multi-omics datasets represent distinct aspects of the central dogma of molecular biology. Such high-dimensional molecular profiles pose challenges to data interpretation and hypothesis generation. ActivePathways is an integrative method that discovers significantly enriched pathways across multiple datasets using statistical data fusion, rationalizes contributing evidence and highlights associated genes. As part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we integrated genes with coding and non-coding mutations and revealed frequently mutated pathways and additional cancer genes with infrequent mutations. We also analyzed prognostic molecular pathways by integrating genomic and transcriptomic features of 1780 breast cancers and highlighted associations with immune response and anti-apoptotic signaling. Integration of ChIP-seq and RNA-seq data for master regulators of the Hippo pathway across normal human tissues identified processes of tissue regeneration and stem cell regulation. ActivePathways is a versatile method that improves systems-level understanding of cellular organization in health and disease through integration of multiple molecular datasets and pathway annotations.

Related Organisations

Organisation

Uppsala University

Location: Sweden

Organisation

University of Patras

Location: Greece

Organisation

Central Queensland University School of Human Health and Social Sciences

Location: Australia

Organisation

Uppsala Universitet

Location: Sweden

Organisation

Olink Proteomics

Location: Sweden

Organisation

Olink Bioscience (Sweden)

Location: Sweden

Related Funding Activities

No related grants have been discovered for Klev Diamanti.

Klev Diamanti

Researcher
