ARDC Research Link Australia

Publication

PhosphoregDB: The tissue and sub-cellular distribution of mammalian protein kinases and phosphatases

Publisher: Springer Science and Business Media LLC

Date: 20-02-2006

Abstract: Protein kinases and protein phosphatases are the fundamental components of phosphorylation-dependent protein regulatory systems. We have created a database for the protein kinase-like and phosphatase-like loci of mouse (phosphoreg.imb.uq.edu.au) that integrates protein sequence, interaction, classification and pathway information with the results of a systematic screen of their sub-cellular localization and tissue-specific expression data mined from the GNF tissue atlas of mouse. The database lets users query where a specific kinase or phosphatase is expressed at both the tissue and sub-cellular levels. Similarly, the interface allows the user to query by tissue, pathway or sub-cellular localization, to reveal which components are co-expressed or co-localized. A review of their expression reveals 30% of these components are detected in all tissues tested while 70% show some level of tissue restriction. Hierarchical clustering of the expression data reveals that expression of these genes can be used to separate the samples into tissues of related lineage, including 3 larger clusters of nervous tissue, developing embryo and cells of the immune system. By overlaying the expression, sub-cellular localization and classification data we examine correlations between class, specificity and tissue restriction and show that tyrosine kinases are generally expressed in fewer tissues than serine/threonine kinases. Together these data demonstrate that cell-type-specific systems exist to regulate protein phosphorylation and that for accurate modelling and for determination of enzyme substrate relationships the co-location of components needs to be considered.

Publication

Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S42003-019-0741-7

Abstract: Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires strong functional or genetic evidence. CLC genes are enriched amongst driver genes predicted from somatic mutations, and display characteristic genomic features. Strikingly, CLC genes are enriched for driver mutations from unbiased, genome-wide transposon-mutagenesis screens in mice. We identified 10 tumour-causing mutations in orthologues of 8 lncRNAs, including LINC-PINT and NEAT1, but not MALAT1. Thus CLC represents a dataset of high-confidence cancer lncRNAs. Mutagenesis maps are a novel means for identifying deeply-conserved roles of lncRNAs in tumorigenesis.

Publication

Genomic analyses identify molecular subtypes of pancreatic cancer

Publisher: Springer Science and Business Media LLC

Date: 24-02-2016

DOI: 10.1038/NATURE16965

Abstract: Integrated genomic analysis of 456 pancreatic ductal adenocarcinomas identified 32 recurrently mutated genes that aggregate into 10 pathways: KRAS, TGF-β, WNT, NOTCH, ROBO/SLIT signalling, G1/S transition, SWI-SNF, chromatin modification, DNA repair and RNA processing. Expression analysis defined 4 subtypes: (1) squamous; (2) pancreatic progenitor; (3) immunogenic; and (4) aberrantly differentiated endocrine exocrine (ADEX) that correlate with histopathological characteristics. Squamous tumours are enriched for TP53 and KDM6A mutations, upregulation of the TP63∆N transcriptional network, hypermethylation of pancreatic endodermal cell-fate determining genes and have a poor prognosis. Pancreatic progenitor tumours preferentially express genes involved in early pancreatic development (FOXA2/3, PDX1 and MNX1). ADEX tumours displayed upregulation of genes that regulate networks involved in KRAS activation, exocrine (NR5A2 and RBPJL), and endocrine differentiation (NEUROD1 and NKX2-2). Immunogenic tumours contained upregulated immune networks including pathways involved in acquired immune suppression. These data infer differences in the molecular evolution of pancreatic cancer subtypes and identify opportunities for therapeutic development.

Publication

Integrative pathway enrichment analysis of multivariate omics data

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41467-019-13983-9

Abstract: Multi-omics datasets represent distinct aspects of the central dogma of molecular biology. Such high-dimensional molecular profiles pose challenges to data interpretation and hypothesis generation. ActivePathways is an integrative method that discovers significantly enriched pathways across multiple datasets using statistical data fusion, rationalizes contributing evidence and highlights associated genes. As part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we integrated genes with coding and non-coding mutations and revealed frequently mutated pathways and additional cancer genes with infrequent mutations. We also analyzed prognostic molecular pathways by integrating genomic and transcriptomic features of 1780 breast cancers and highlighted associations with immune response and anti-apoptotic signaling. Integration of ChIP-seq and RNA-seq data for master regulators of the Hippo pathway across normal human tissues identified processes of tissue regeneration and stem cell regulation. ActivePathways is a versatile method that improves systems-level understanding of cellular organization in health and disease through integration of multiple molecular datasets and pathway annotations.

Publication

LOCATE: A mammalian protein subcellular localization database

Publisher: Oxford University Press (OUP)

Date: 23-12-2008

DOI: 10.1093/NAR/GKM950

Publication

Patterns of somatic structural variation in human cancer genomes

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41586-019-1913-9

Abstract: A key mutational process in cancer is structural variation, in which rearrangements delete, amplify or reorder genomic segments that range in size from kilobases to whole chromosomes [1–7]. Here we develop methods to group, classify and describe somatic structural variants, using data from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumour types [8]. Sixteen signatures of structural variation emerged. Deletions have a multimodal size distribution, assort unevenly across tumour types and patients, are enriched in late-replicating regions and correlate with inversions. Tandem duplications also have a multimodal size distribution, but are enriched in early-replicating regions—as are unbalanced translocations. Replication-based mechanisms of rearrangement generate varied chromosomal structures with low-level copy-number gains and frequent inverted rearrangements. One prominent structure consists of 2–7 templates copied from distinct regions of the genome strung together within one locus. Such cycles of templated insertions correlate with tandem duplications, and—in liver cancer—frequently activate the telomerase gene TERT. A wide variety of rearrangement processes are active in cancer, which generate complex configurations of the genome upon which selection can act.

Publication

Sleeping Beauty mutagenesis reveals cooperating mutations and pathways in pancreatic adenocarcinoma

Publisher: Proceedings of the National Academy of Sciences

Date: 15-03-2012

DOI: 10.1073/PNAS.1202490109

Abstract: Pancreatic cancer is one of the most deadly cancers affecting the Western world. Because the disease is highly metastatic and difficult to diagnose until late stages, the 5-y survival rate is around 5%. The identification of molecular cancer drivers is critical for furthering our understanding of the disease and development of improved diagnostic tools and therapeutics. We have conducted a mutagenic screen using Sleeping Beauty (SB) in mice to identify new candidate cancer genes in pancreatic cancer. By combining SB with an oncogenic Kras allele, we observed highly metastatic pancreatic adenocarcinomas. Using two independent statistical methods to identify loci commonly mutated by SB in these tumors, we identified 681 loci that comprise 543 candidate cancer genes (CCGs); 75 of these CCGs, including Mll3 and Ptk2, have known mutations in human pancreatic cancer. We identified point mutations in human pancreatic patient samples for another 11 CCGs, including Acvr2a and Map2k4. Importantly, 10% of the CCGs are involved in chromatin remodeling, including Arid4b, Kdm6a, and Nsd3, and all SB tumors have at least one mutated gene involved in this process; 20 CCGs, including Ctnnd1, Fbxo11, and Vgll4, are also significantly associated with poor patient survival. SB mutagenesis provides a rich resource of mutations in potential cancer drivers for cross-comparative analyses with ongoing sequencing efforts in human pancreatic adenocarcinoma.

Publication

Marked mitochondrial genetic variation in individuals and populations of the carcinogenic liver fluke Clonorchis sinensis

Publisher: Public Library of Science (PLoS)

Date: 19-08-2020

DOI: 10.1371/JOURNAL.PNTD.0008480

Publication

Towards defining the nuclear proteome

Publisher: Springer Science and Business Media LLC

Date: 2008

DOI: 10.1186/GB-2008-9-1-R15

Publication

Computational approaches to identify functional genetic variants in cancer genomes

Publisher: Springer Science and Business Media LLC

Date: 30-07-2013

DOI: 10.1038/NMETH.2562

Publication

Differential Use of Signal Peptides and Membrane Domains Is a Common Occurrence in the Protein Output of Transcriptional Units

Publisher: Public Library of Science (PLoS)

Date: 28-04-2006

DOI: 10.1371/JOURNAL.PGEN.0020046

Publication

Minimizing Sample Failure Rates for Challenging Clinical Tumor Samples

Publisher: Elsevier BV

Date: 05-2023

DOI: 10.1016/J.JMOLDX.2023.01.008

Publication

Open access: Taking full advantage of the content

Publisher: Public Library of Science (PLoS)

Date: 28-03-2008

DOI: 10.1371/JOURNAL.PCBI.1000037

Publication

A deep learning system accurately classifies primary and metastatic cancers using passenger mutation patterns

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41467-019-13825-8

Abstract: In cancer, the primary tumour’s organ of origin and histopathology are the strongest determinants of its clinical behaviour, but in 3% of cases a patient presents with a metastatic tumour and no obvious primary. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we train a deep learning classifier to predict cancer type based on patterns of somatic passenger mutations detected in whole genome sequencing (WGS) of 2606 tumours representing 24 common cancer types produced by the PCAWG Consortium. Our classifier achieves an accuracy of 91% on held-out tumor samples and 88% and 83% respectively on independent primary and metastatic samples, roughly double the accuracy of trained pathologists when presented with a metastatic tumour without knowledge of the primary. Surprisingly, adding information on driver mutations reduced accuracy. Our results have clinical applicability, underscore how patterns of somatic passenger mutations encode the state of the cell of origin, and can inform future strategies to detect the source of circulating tumour DNA.

Publication

2HAPI: a microarray data analysis system

Publisher: Oxford University Press (OUP)

Date: 22-07-2003

DOI: 10.1093/BIOINFORMATICS/BTG169

Abstract: Summary: 2HAPI (version 2 of High density Array Pattern Interpreter) is a web-based, publicly-available analytical tool designed to aid researchers in microarray data analysis. 2HAPI includes tools for searching, manipulating, visualizing, and clustering the large sets of data generated by microarray experiments. Other features include association of genes with NCBI information and linkage to external data resources. Unique to 2HAPI is the ability to retrieve upstream sequences of co-regulated genes for promoter analysis using MEME (Multiple Expectation-maximization for Motif Elicitation). Availability: 2HAPI is freely available at array.sdsc.edu. Users can try 2HAPI anonymously with pre-loaded data or they can register as a 2HAPI user and upload their data. Contact: gribskov@sdsc.edu

Publication

Targeted Next-Gen Sequencing for Detecting MLL Gene Fusions in Leukemia

Publisher: American Association for Cancer Research (AACR)

Date: 02-2017

DOI: 10.1158/1541-7786.MCR-17-0569

Abstract: Mixed lineage leukemia (MLL) gene rearrangements characterize approximately 70% of infant and 10% of adult and therapy-related leukemia. Conventional clinical diagnostics, including cytogenetics and fluorescence in situ hybridization (FISH), fail to detect MLL translocation partner genes (TPG) in many patients. Long-distance inverse (LDI)-PCR, the “gold standard” technique that is used to characterize MLL breakpoints, is laborious and requires a large input of genomic DNA (gDNA). To overcome the limitations of current techniques, a targeted next-generation sequencing (NGS) approach that requires low RNA input was tested. Anchored multiplex PCR-based enrichment (AMP-E) was used to rapidly identify a broad range of MLL fusions in patient specimens. Libraries generated using Archer FusionPlex Heme and Myeloid panels were sequenced using the Illumina platform. Diagnostic specimens (n = 39) from pediatric leukemia patients were tested with AMP-E and validated by LDI-PCR. In concordance with LDI-PCR, the AMP-E method successfully identified TPGs without prior knowledge. AMP-E identified 10 different MLL fusions in the 39 samples. Only two specimens were discordant: AMP-E successfully identified a MLL-MLLT1 fusion where LDI-PCR had failed to determine the breakpoint, whereas a MLL-MLLT3 fusion was not detected by AMP-E due to low expression of the fusion transcript. Sensitivity assays demonstrated that AMP-E can detect MLL-AFF1 in MV4-11 cell dilutions of 10⁻⁷ and transcripts down to 0.005 copies/ng. Implications: This study demonstrates a NGS methodology with improved sensitivity compared with current diagnostic methods for MLL-rearranged leukemia. Furthermore, this assay rapidly and reliably identifies MLL partner genes and patient-specific fusion sequences that could be used for monitoring minimal residual disease. Mol Cancer Res; 16(2); 279–85. ©2017 AACR.

Publication

Subtype-Specific Analyses Reveal Infiltrative Basal Cell Carcinomas Are Highly Interactive with their Environment

Publisher: Elsevier BV

Date: 10-2021

DOI: 10.1016/J.JID.2021.02.760

Abstract: Little is known regarding the molecular differences between basal cell carcinoma (BCC) subtypes, despite clearly distinct phenotypes and clinical outcomes. In particular, infiltrative BCCs have poorer clinical outcomes in terms of response to therapy and propensity for dissemination. In this project, we aimed to use exome sequencing and RNA sequencing to identify somatic mutations and molecular pathways leading to infiltrative BCCs. Using whole-exome sequencing of 36 BCC samples (eight infiltrative) combined with previously reported exome data (58 samples), we determine that infiltrative BCCs do not contain a distinct somatic variant profile and carry classical UV-induced mutational signatures. RNA sequencing on both datasets revealed key differentially expressed genes, such as POSTN and WISP1, suggesting increased integrin and Wnt signaling. Immunostaining for periostin and WISP1 clearly distinguished infiltrative BCCs, and nuclear β-catenin staining patterns further validated the resulting increase in Wnt signaling in infiltrative BCCs. Of significant interest, in BCCs with mixed morphology, infiltrative areas expressed WISP1, whereas nodular areas did not, supporting a continuum between subtypes. In conclusion, infiltrative BCCs do not differ in their genomic alteration in terms of initiating mutations. They display a specific type of interaction with the extracellular matrix environment regulating Wnt signaling.

Publication

Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41588-019-0576-7

Abstract: Chromothripsis is a mutational phenomenon characterized by massive, clustered genomic rearrangements that occurs in cancer and other diseases. Recent studies in selected cancer types have suggested that chromothripsis may be more common than initially inferred from low-resolution copy-number data. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we analyze patterns of chromothripsis across 2,658 tumors from 38 cancer types using whole-genome sequencing data. We find that chromothripsis events are pervasive across cancers, with a frequency of more than 50% in several cancer types. Whereas canonical chromothripsis profiles display oscillations between two copy-number states, a considerable fraction of events involve multiple chromosomes and additional structural alterations. In addition to non-homologous end joining, we detect signatures of replication-associated processes and templated insertions. Chromothripsis contributes to oncogene amplification and to inactivation of genes such as mismatch-repair-related genes. These findings show that chromothripsis is a major process that drives genome evolution in human cancer.

Publication

Cutting edge genomics reveal new insights into tumour development, disease progression and therapeutic impacts in multiple myeloma

Publisher: Wiley

Date: 03-05-2017

DOI: 10.1111/BJH.14649

Abstract: Multiple Myeloma (MM) is a haematological malignancy characterised by the clonal expansion of plasma cells (PCs) within the bone marrow. Despite advances in therapy, MM remains a largely incurable disease with a median survival of 6 years. In almost all cases, the development of MM is preceded by the benign PC condition Monoclonal Gammopathy of Undetermined Significance (MGUS). Recent studies show that the transformation of MGUS to MM is associated with complex genetic changes. Understanding how these changes contribute to evolution will present targets for clinical intervention. We discuss three models of MM evolution: the linear, the expansionist and the intraclonal heterogeneity models. Of particular interest is the intraclonal heterogeneity model. Here, distinct populations of MM PCs carry differing combinations of genetic mutations. Acquisition of additional mutations can contribute to subclonal lineages where "driver" mutations may influence selective pressure and dominance, and "passenger" mutations are neutral in their effects. Furthermore, studies show that clinical intervention introduces additional selective pressure on tumour cells and can influence subclone survival, leading to therapy resistance. This review discusses how Next Generation Sequencing approaches are revealing critical insights into the genetics of MM development, disease progression and treatment. MM disease progression will illuminate possible mechanisms underlying the tumour.

Publication

Combined burden and functional impact tests for cancer driver discovery using DriverPower

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41467-019-13929-1

Abstract: The discovery of driver mutations is one of the key motivations for cancer genome sequencing. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumour types, we describe DriverPower, a software package that uses mutational burden and functional impact evidence to identify driver mutations in coding and non-coding sites within cancer whole genomes. Using a total of 1373 genomic features derived from public sources, DriverPower’s background mutation model explains up to 93% of the regional variance in the mutation rate across multiple tumour types. By incorporating functional impact scores, we are able to further increase the accuracy of driver discovery. Testing across a collection of 2583 cancer genomes from the PCAWG project, DriverPower identifies 217 coding and 95 non-coding driver candidates. Comparing to six published methods used by the PCAWG Drivers and Functional Interpretation Working Group, DriverPower has the highest F1 score for both coding and non-coding driver discovery. This demonstrates that DriverPower is an effective framework for computational driver discovery.

Publication

Rival penalized competitive learning (RPCL): a topology-determining algorithm for analyzing gene expression data

Publisher: Elsevier BV

Date: 12-2003

DOI: 10.1016/J.COMPBIOLCHEM.2003.09.006

Abstract: DNA arrays have become the immediate choice in the analysis of large-scale expression measurements. Understanding the expression pattern of genes provides functional information on newly identified genes by computational approaches. Gene expression pattern is an indicator of the state of the cell, and abnormal cellular states can be inferred by comparing expression profiles. Since co-regulated genes, and genes involved in a particular pathway, tend to show similar expression patterns, clustering expression patterns has become the natural method of choice to differentiate groups. However, most methods based on cluster analysis suffer from the usual problems: (i) dead units, and (ii) the problem of determining the correct number of clusters (k) needed to classify the data. Selecting k has been an open problem of pattern recognition and statistics for decades. Since clustering reveals similar patterns present in the data, fixing this number strongly influences the quality of the result. While there is no theoretical solution to this problem, the number of clusters can be decided by a heuristic clustering algorithm called rival penalized competitive learning (RPCL). We present a novel implementation of RPCL that transforms the correct number of clusters problem to the tractable problem of clustering based on the degree of similarity. This is biologically significant since our implementation clusters functionally co-regulated genes and genes that present similar patterns of expression. This new approach reveals potential genes that are co-involved in a biological process. This implementation of the RPCL algorithm is useful in differentiating groups involved in concerted functional regulation and helps to progressively home in on patterns that are closely similar.

Publication

BioLit: integrating biological literature with databases

Publisher: Oxford University Press (OUP)

Date: 19-05-2008

DOI: 10.1093/NAR/GKN317

Publication

The repertoire of mutational signatures in human cancer

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41586-020-1943-3

Abstract: Somatic mutations in cancer genomes are caused by multiple mutational processes, each of which generates a characteristic mutational signature [1]. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [2] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we characterized mutational signatures using 84,729,690 somatic mutations from 4,645 whole-genome and 19,184 exome sequences that encompass most types of cancer. We identified 49 single-base-substitution, 11 doublet-base-substitution, 4 clustered-base-substitution and 17 small insertion-and-deletion signatures. The substantial size of our dataset, compared with previous analyses [3–15], enabled the discovery of new signatures, the separation of overlapping signatures and the decomposition of signatures into components that may represent associated—but distinct—DNA damage, repair and/or replication mechanisms. By estimating the contribution of each signature to the mutational catalogues of individual cancer genomes, we revealed associations of signatures to exogenous or endogenous exposures, as well as to defective DNA-maintenance processes. However, many signatures are of unknown cause. This analysis provides a systematic perspective on the repertoire of mutational processes that contribute to the development of human cancer.

Publication

Deconvolution of single-cell multi-omics layers reveals regulatory heterogeneity

Publisher: Springer Science and Business Media LLC

Date: 28-01-2019

DOI: 10.1038/S41467-018-08205-7

Abstract: Integrative analysis of multi-omics layers at single cell level is critical for accurate dissection of cell-to-cell variation within certain cell populations. Here we report scCAT-seq, a technique for simultaneously assaying chromatin accessibility and the transcriptome within the same single cell. We show that the combined single cell signatures enable accurate construction of regulatory relationships between cis-regulatory elements and the target genes at single-cell resolution, providing a new dimension of features that helps direct discovery of regulatory patterns specific to distinct cell identities. Moreover, we generate the first single cell integrated map of chromatin accessibility and transcriptome in early embryos and demonstrate the robustness of scCAT-seq in the precise dissection of master transcription factors in cells of distinct states. The ability to obtain these two layers of omics data will help provide more accurate definitions of “single cell state” and enable the deconvolution of regulatory heterogeneity from complex cell populations.

Publication

Inferring structural variant cancer cell fraction

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41467-020-14351-8

Abstract: We present SVclone, a computational method for inferring the cancer cell fraction of structural variant (SV) breakpoints from whole-genome sequencing data. SVclone accurately determines the variant allele frequencies of both SV breakends, then simultaneously estimates the cancer cell fraction and SV copy number. We assess performance using in silico mixtures of real samples, at known proportions, created from two clonal metastases from the same patient. We find that SVclone’s performance is comparable to single-nucleotide variant-based methods, despite having an order of magnitude fewer data points. As part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) consortium, which aggregated whole-genome sequencing data from 2658 cancers across 38 tumour types, we use SVclone to reveal a subset of liver, ovarian and pancreatic cancers with subclonally enriched copy-number neutral rearrangements that show decreased overall survival. SVclone enables improved characterisation of SV intra-tumour heterogeneity.

Publication

Exquisitely Platinum-Sensitive Triple-Negative Breast Cancer, Time for BRCA Methylation Testing?

Publisher: American Society of Clinical Oncology (ASCO)

Date: 11-2022

DOI: 10.1200/PO.22.00309

Publication

Chromosome arm aneuploidies shape tumour evolution and drug response

Publisher: Springer Science and Business Media LLC

Date: 23-01-2020

DOI: 10.1038/S41467-020-14286-0

Abstract: Chromosome arm aneuploidies (CAAs) are pervasive in cancers. However, how they affect cancer development, prognosis and treatment remains largely unknown. Here, we analyse CAA profiles of 23,427 tumours, identifying aspects of tumour evolution including probable orders in which CAAs occur and CAAs predicting tissue-specific metastasis. Both haematological and solid cancers initially gain chromosome arms, while only solid cancers subsequently preferentially lose multiple arms. 72 CAAs and 88 synergistically co-occurring CAA pairs multivariately predict good or poor survival for 58% of 6977 patients, with negligible impact of whole-genome doubling. Additionally, machine learning identifies 31 CAAs that robustly alter response to 56 chemotherapeutic drugs across cell lines representing 17 cancer types. We also uncover 1024 potential synthetic lethal pharmacogenomic interactions. Notably, in predicting drug response, CAAs substantially outperform mutations and focal deletions/amplifications combined. Thus, CAAs predict cancer prognosis, shape tumour evolution, metastasis and drug response, and may advance precision oncology.

Publication

Evaluation and comparison of mammalian subcellular localization prediction methods

Publisher: Springer Science and Business Media LLC

Date: 12-2006

DOI: 10.1186/1471-2105-7-S5-S3

Abstract: Determination of the subcellular location of a protein is essential to understanding its biochemical function. This information can provide insight into the function of hypothetical or novel proteins. These data are difficult to obtain experimentally but have become especially important since many whole genome sequencing projects have been finished and many resulting protein sequences are still lacking detailed functional information. In order to address this paucity of data, many computational prediction methods have been developed. However, these methods have varying levels of accuracy and perform differently based on the sequences that are presented to the underlying algorithm. It is therefore useful to compare these methods and monitor their performance. In order to perform a comprehensive survey of prediction methods, we selected only methods that accepted large batches of protein sequences, were publicly available, and were able to predict localization to at least nine of the major subcellular locations (nucleus, cytosol, mitochondrion, extracellular region, plasma membrane, Golgi apparatus, endoplasmic reticulum (ER), peroxisome, and lysosome). The selected methods were CELLO, MultiLoc, Proteome Analyst, pTarget and WoLF PSORT. These methods were evaluated using 3763 mouse proteins from SwissProt that represent the source of the training sets used in development of the individual methods. In addition, an independent evaluation set of 2145 mouse proteins from LOCATE with a bias towards the subcellular localization underrepresented in SwissProt was used. The sensitivity and specificity were calculated for each method and compared to a theoretical value based on what might be observed by random chance. No individual method had a sufficient level of sensitivity across both evaluation sets that would enable reliable application to hypothetical proteins. All methods showed lower performance on the LOCATE dataset and variable performance on individual subcellular localizations was observed. Proteins localized to the secretory pathway were the most difficult to predict, while nuclear and extracellular proteins were predicted with the highest sensitivity.

Publication

Pan-cancer analysis of whole genomes identifies driver rearrangements promoted by LINE-1 retrotransposition

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41588-019-0562-0

Abstract: About half of all cancers have somatic integrations of retrotransposons. Here, to characterize their role in oncogenesis, we analyzed the patterns and mechanisms of somatic retrotransposition in 2,954 cancer genomes from 38 histological cancer subtypes within the framework of the Pan-Cancer Analysis of Whole Genomes (PCAWG) project. We identified 19,166 somatically acquired retrotransposition events, which affected 35% of samples and spanned a range of event types. Long interspersed nuclear element (LINE-1; L1 hereafter) insertions emerged as the most frequent type of somatic structural variation in esophageal adenocarcinoma, and the second most frequent in head-and-neck and colorectal cancers. Aberrant L1 integrations can delete megabase-scale regions of a chromosome, which sometimes leads to the removal of tumor-suppressor genes, and can induce complex translocations and large-scale duplications. Somatic retrotranspositions can also initiate breakage–fusion–bridge cycles, leading to high-level amplification of oncogenes. These observations illuminate a relevant role of L1 retrotransposition in remodeling the cancer genome, with potential implications for the development of human tumors.

Publication

Pan-cancer analysis of whole genomes

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41586-020-1969-6

Abstract: Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale 1–3. Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4–5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter 4; identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation 5,6; analyses timings and patterns of tumour evolution 7; describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity 8,9; and evaluates a range of more-specialized features of cancer genomes 8,10–18.

Publication

Integration of open access literature into the RCSB Protein Data Bank using BioLit

Publisher: Springer Science and Business Media LLC

Date: 29-04-2010

DOI: 10.1186/1471-2105-11-220

Publication

Progression of Disease Within 24 Months in Follicular Lymphoma Is Associated With Reduced Intratumoral Immune Infiltration.

Publisher: American Society of Clinical Oncology (ASCO)

Date: 12-2019

DOI: 10.1200/JCO.18.02365

Abstract: Understanding the immunobiology of the 15% to 30% of patients with follicular lymphoma (FL) who experience progression of disease within 24 months (POD24) remains a priority. Solid tumors with low levels of intratumoral immune infiltration have inferior outcomes. It is unknown whether a similar relationship exists between POD24 in FL. Digital gene expression using a custom code set (five immune effector, six immune checkpoint, one macrophage molecules) was applied to a discovery cohort of patients with early- and advanced-stage FL (n = 132). T-cell receptor repertoire analysis, flow cytometry, multispectral immunofluorescence, and next-generation sequencing were performed. The immune infiltration profile was validated in two independent cohorts of patients with advanced-stage FL requiring systemic treatment (n = 138, rituximab plus cyclophosphamide, vincristine, prednisone; n = 45, rituximab plus cyclophosphamide, doxorubicin, vincristine, and prednisone), with the latter selected to permit comparison of patients experiencing a POD24 event with those having no progression at 5 years or more. Immune molecules showed distinct clustering, characterized by either high or low expression regardless of categorization as an immune effector, immune checkpoint, or macrophage molecule. Low programmed death-ligand 2 (PD-L2) was the most sensitive/specific marker to segregate patients with adverse outcomes; therefore, PD-L2 expression was chosen to distinguish immune infiltration HI (ie, high PD-L2) FL biopsies from immune infiltration LO (ie, low PD-L2) tumors. Immune infiltration HI tissues were highly infiltrated with macrophages and expanded populations of T-cell clones.
Of note, the immune infiltration LO subset of patients with FL was enriched for POD24 events (odds ratio [OR], 4.32; c-statistic, 0.81; P = .001), validated in the independent cohorts (rituximab plus cyclophosphamide, vincristine, prednisone: OR, 2.95; c-statistic, 0.75; P = .011; and rituximab plus cyclophosphamide, doxorubicin, vincristine, and prednisone: OR, 7.09; c-statistic, 0.88; P = .011). Mutations were equally proportioned across tissues, which indicated that degree of immune infiltration is capturing aspects of FL biology distinct from its mutational profile. Assessment of immune-infiltration by PD-L2 expression is a promising tool with which to help identify patients who are at risk for POD24.

Publication

Whole-genome landscape of pancreatic neuroendocrine tumours

Publisher: Springer Science and Business Media LLC

Date: 15-02-2017

DOI: 10.1038/NATURE21063

Abstract: The diagnosis of pancreatic neuroendocrine tumours (PanNETs) is increasing owing to more sensitive detection methods, and this increase is creating challenges for clinical management. We performed whole-genome sequencing of 102 primary PanNETs and defined the genomic events that characterize their pathogenesis. Here we describe the mutational signatures they harbour, including a deficiency in G:C > T:A base excision repair due to inactivation of MUTYH, which encodes a DNA glycosylase. Clinically sporadic PanNETs contain a larger-than-expected proportion of germline mutations, including previously unreported mutations in the DNA repair genes MUTYH, CHEK2 and BRCA2. Together with mutations in MEN1 and VHL, these mutations occur in 17% of patients. Somatic mutations, including point mutations and gene fusions, were commonly found in genes involved in four main pathways: chromatin remodelling, DNA damage repair, activation of mTOR signalling (including previously undescribed EWSR1 gene fusions), and telomere maintenance. In addition, our gene expression analyses identified a subgroup of tumours associated with hypoxia and HIF signalling.

Publication

Using genomics to better define high-risk MGUS/SMM patients

Publisher: Impact Journals, LLC

Date: 27-11-2018

DOI: 10.18632/ONCOTARGET.26390

Publication

Deconvolution of single-cell multi-omics layers reveals regulatory heterogeneity

Publisher: Cold Spring Harbor Laboratory

Date: 19-05-2018

DOI: 10.1101/316208

Abstract: Integrative analysis of multi-omics layers at single cell level is critical for accurate dissection of cell-to-cell variation within certain cell populations. Here we report scCAT-seq, a technique for simultaneously assaying chromatin accessibility and the transcriptome within the same single cell. We show that the combined single cell signatures enable accurate construction of regulatory relationships between cis-regulatory elements and the target genes at single-cell resolution, providing a new dimension of features that helps direct discovery of regulatory patterns specific to distinct cell identities. Moreover, we generated the first single cell integrated maps of chromatin accessibility and transcriptome in human pre-implantation embryos and demonstrated the robustness of scCAT-seq in the precise dissection of master transcription factors in cells of distinct states during embryo development. The ability to obtain these two layers of omics data will help provide more accurate definitions of “single cell state” and enable the deconvolution of regulatory heterogeneity from complex cell populations.

Publication

LOCATE: a mouse protein subcellular localization database

Publisher: Oxford University Press (OUP)

Date: 2006

DOI: 10.1093/NAR/GKJ069

Publication

Butler enables rapid cloud-based analysis of thousands of human genomes

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41587-019-0360-3

Abstract: We present Butler, a computational tool that facilitates large-scale genomic analyses on public and academic clouds. Butler includes innovative anomaly detection and self-healing functions that improve the efficiency of data processing and analysis by 43% compared with current approaches. Butler enabled processing of a 725-terabyte cancer genome dataset from the Pan-Cancer Analysis of Whole Genomes (PCAWG) project in a time-efficient and uniform manner.

Publication

Comprehensive molecular characterization of mitochondrial genomes in human cancers

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41588-019-0557-X

Abstract: Mitochondria are essential cellular organelles that play critical roles in cancer. Here, as part of the International Cancer Genome Consortium/The Cancer Genome Atlas Pan-Cancer Analysis of Whole Genomes Consortium, which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumor types, we performed a multidimensional, integrated characterization of mitochondrial genomes and related RNA sequencing data. Our analysis presents the most definitive mutational landscape of mitochondrial genomes and identifies several hypermutated cases. Truncating mutations are markedly enriched in kidney, colorectal and thyroid cancers, suggesting oncogenic effects with the activation of signaling pathways. We find frequent somatic nuclear transfers of mitochondrial DNA, some of which disrupt therapeutic target genes. Mitochondrial copy number varies greatly within and across cancers and correlates with clinical variables. Co-expression analysis highlights the function of mitochondrial genes in oxidative phosphorylation, DNA repair and the cell cycle, and shows their connections with clinically actionable genes. Our study lays a foundation for translating mitochondrial biology into clinical applications.

Publication

qpure: A Tool to Estimate Tumor Cellularity from Genome-Wide Single-Nucleotide Polymorphism Profiles

Publisher: Public Library of Science (PLoS)

Date: 25-09-2012

DOI: 10.1371/JOURNAL.PONE.0045835

Publication

Genomic basis for RNA alterations in cancer

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41586-020-1970-0

Abstract: Transcript alterations often result from somatic changes in cancer genomes 1. Various forms of RNA alterations have been described in cancer, including overexpression 2, altered splicing 3 and gene fusions 4; however, it is difficult to attribute these to underlying genomic changes owing to heterogeneity among patients and tumour types, and the relatively small cohorts of patients for whom samples have been analysed by both transcriptome and whole-genome sequencing. Here we present, to our knowledge, the most comprehensive catalogue of cancer-associated gene alterations to date, obtained by characterizing tumour transcriptomes from 1,188 donors of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) 5. Using matched whole-genome sequencing data, we associated several categories of RNA alterations with germline and somatic DNA alterations, and identified probable genetic mechanisms. Somatic copy-number alterations were the major drivers of variations in total gene and allele-specific expression. We identified 649 associations of somatic single-nucleotide variants with gene expression in cis, of which 68.4% involved associations with flanking non-coding regions of the gene. We found 1,900 splicing alterations associated with somatic mutations, including the formation of exons within introns in proximity to Alu elements. In addition, 82% of gene fusions were associated with structural variants, including 75 of a new class, termed ‘bridged’ fusions, in which a third genomic location bridges two genes. We observed transcriptomic alteration signatures that differ between cancer types and have associations with variations in DNA mutational signatures. This compendium of RNA alterations in the genomic context provides a rich resource for identifying genes and mechanisms that are functionally implicated in cancer.

Publication

Analyses of non-coding somatic drivers in 2,658 cancer whole genomes

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41586-020-1965-X

Abstract: The discovery of drivers of cancer has traditionally focused on protein-coding genes 1–4. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium 5 of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers 6,7, raise doubts about others and identify novel candidates, including point mutations in the 5′ region of TP53, in the 3′ untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available.

Publication

Subclonal evolution in disease progression from MGUS/SMM to multiple myeloma is characterised by clonal stability.

Publisher: Springer Science and Business Media LLC

Date: 25-07-2018

DOI: 10.1038/S41375-018-0206-X

Publication

Subcellular Localization of Mammalian Type II Membrane Proteins

Publisher: Wiley

Date: 16-03-2006

DOI: 10.1111/J.1600-0854.2006.00407.X

Abstract: Application of a computational membrane organization prediction pipeline, MemO, identified putative type II membrane proteins as proteins predicted to encode a single alpha-helical transmembrane domain (TMD) and no signal peptides. MemO was applied to RIKEN's mouse isoform protein set to identify 1436 non-overlapping genomic regions or transcriptional units (TUs), which encode exclusively type II membrane proteins. Proteins with overlapping predicted InterPro and TMDs were reviewed to discard false positive predictions resulting in a dataset comprised of 1831 transcripts in 1408 TUs. This dataset was used to develop a systematic protocol to document subcellular localization of type II membrane proteins. This approach combines mining of published literature to identify subcellular localization data and a high-throughput, polymerase chain reaction (PCR)-based approach to experimentally characterize subcellular localization. These approaches have provided localization data for 244 and 169 proteins. Type II membrane proteins are localized to all major organelle compartments; however, some biases were observed towards the early secretory pathway and punctate structures. Collectively, this study reports the subcellular localization of 26% of the defined dataset. All reported localization data are presented in the LOCATE database (www.locate.imb.uq.edu.au).

Publication

Whole genomes redefine the mutational landscape of pancreatic cancer

Publisher: Springer Science and Business Media LLC

Date: 25-02-2015

DOI: 10.1038/NATURE14169

Publication

Pathway and network analysis of more than 2500 whole cancer genomes

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41467-020-14367-0

Abstract: The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well-characterized and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we perform multi-faceted pathway and network analyses of non-coding mutations across 2583 whole cancer genomes from 27 tumor types compiled by the ICGC/TCGA PCAWG project that was motivated by the success of pathway and network analyses in prioritizing rare mutations in protein-coding genes. While few non-coding genomic elements are recurrently mutated in this cohort, we identify 93 genes harboring non-coding mutations that cluster into several modules of interacting proteins. Among these are promoter mutations associated with reduced mRNA expression in TP53, TLE4, and TCF4. We find that biological processes had variable proportions of coding and non-coding mutations, with chromatin remodeling and proliferation pathways altered primarily by coding mutations, while developmental pathways, including Wnt and Notch, altered by both coding and non-coding mutations. RNA splicing is primarily altered by non-coding mutations in this cohort, and samples containing non-coding mutations in well-known RNA splicing factors exhibit similar gene expression signatures as samples with coding mutations in these genes. These analyses contribute a new repertoire of possible cancer genes and mechanisms that are altered by non-coding mutations and offer insights into additional cancer vulnerabilities that can be investigated for potential therapeutic treatments.

Publication

PTEN deletion drives acute myeloid leukemia resistance to MEK inhibitors.

Publisher: Impact Journals, LLC

Date: 08-10-2019

DOI: 10.18632/ONCOTARGET.27206

Publication

Computational Biology Resources Lack Persistence and Usability

Publisher: Public Library of Science (PLoS)

Date: 18-07-2008

DOI: 10.1371/JOURNAL.PCBI.1000136

Publication

Hypermutation In Pancreatic Cancer

Publisher: Elsevier BV

Date: 2017

DOI: 10.1053/J.GASTRO.2016.09.060

Abstract: Pancreatic cancer is molecularly diverse, with few effective therapies. Increased mutation burden and defective DNA repair are associated with response to immune checkpoint inhibitors in several other cancer types. We interrogated 385 pancreatic cancer genomes to define hypermutation and its causes. Mutational signatures inferring defects in DNA repair were enriched in those with the highest mutation burdens. Mismatch repair deficiency was identified in 1% of tumors harboring different mechanisms of somatic inactivation of MLH1 and MSH2. Defining mutation load in individual pancreatic cancers and the optimal assay for patient selection may inform clinical trial design for immunotherapy in pancreatic cancer.

Publication

Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes

Publisher: Springer Science and Business Media LLC

Date: 24-10-2012

DOI: 10.1038/NATURE11547

Publication

Genomic footprints of activated telomere maintenance mechanisms in cancer

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41467-019-13824-9

Abstract: Cancers require telomere maintenance mechanisms for unlimited replicative potential. They achieve this through TERT activation or alternative telomere lengthening associated with ATRX or DAXX loss. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we dissect whole-genome sequencing data of over 2500 matched tumor-control samples from 36 different tumor types aggregated within the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium to characterize the genomic footprints of these mechanisms. While the telomere content of tumors with ATRX or DAXX mutations (ATRX/DAXX trunc ) is increased, tumors with TERT modifications show a moderate decrease of telomere content. One quarter of all tumor samples contain somatic integrations of telomeric sequences into non-telomeric DNA. This fraction is increased to 80% prevalence in ATRX/DAXX trunc tumors, which carry an aberrant telomere variant repeat (TVR) distribution as another genomic marker. The latter feature includes enrichment or depletion of the previously undescribed singleton TVRs TTCGGG and TTTGGG, respectively. Our systematic analysis provides new insight into the recurrent genomic alterations associated with telomere maintenance mechanisms in cancer.

Publication

EBV-associated primary CNS lymphoma occurring after immunosuppression is a distinct immunobiological entity.

Publisher: American Society of Hematology

Date: 18-03-2021

DOI: 10.1182/BLOOD.2020008520

Abstract: Primary central nervous system lymphoma (PCNSL) is confined to the brain, eyes, and cerebrospinal fluid without evidence of systemic spread. Rarely, PCNSL occurs in the context of immunosuppression (eg, posttransplant lymphoproliferative disorders or HIV [AIDS-related PCNSL]). These cases are poorly characterized, have dismal outcome, and are typically Epstein-Barr virus (EBV)-associated (ie, tissue-positive). We used targeted sequencing and digital multiplex gene expression to compare the genetic landscape and tumor microenvironment (TME) of 91 PCNSL tissues all with diffuse large B-cell lymphoma histology. Forty-seven were EBV tissue-negative: 45 EBV− HIV− PCNSL and 2 EBV− HIV+ PCNSL; and 44 were EBV tissue-positive: 23 EBV+ HIV+ PCNSL and 21 EBV+ HIV− PCNSL. As with prior studies, EBV− HIV− PCNSL had frequent MYD88, CD79B, and PIM1 mutations, and enrichment for the activated B-cell (ABC) cell-of-origin subtype. In contrast, these mutations were absent in all EBV tissue-positive cases and ABC frequency was low. Furthermore, copy number loss in HLA class I/II and antigen-presenting/processing genes were rarely observed, indicating retained antigen presentation. To counter this, EBV+ HIV− PCNSL had a tolerogenic TME with elevated macrophage and immune-checkpoint gene expression, whereas AIDS-related PCNSL had low CD4 gene counts. EBV-associated PCNSL in the immunosuppressed is immunobiologically distinct from EBV− HIV− PCNSL, and, despite expressing an immunogenic virus, retains the ability to present EBV antigens. Results provide a framework for targeted treatment.

Publication

ARED 2.0: an update of AU-rich element mRNA database

Publisher: Oxford University Press (OUP)

Date: 2003

DOI: 10.1093/NAR/GKG023

Abstract: The Adenylate Uridylate (AU)-Rich Element Database, ARED-mRNA version 2.0, contains information not present in the previous ARED. This includes additional data entries, new information and links to Unigene, LocusLink, RefSeq records and mouse homologue data. An ARE consensus sequence specific to the 3'UTR is the basis of ARED that demonstrated two important findings: (i) AREs are present in a large, previously unrecognized set of human mRNAs and (ii) ARE-mRNAs encode proteins of diverse functions which are largely involved in early and transient biological responses. In this update, we have modified the strategy for identifying ARE-mRNA in order to systematically deal with inconsistencies of molecule type and mRNA region in GenBank records. Potential uses for the ARED in functional genomics are also given. The database is accessible via the web, rc.kfshrc.edu.sa/ared, with a new querying system that allows searching ARE-mRNAs by any public database identifier or name. The ARED website also contains relevant links to uses for the ARED.

Publication

Disruption of chromatin folding domains by somatic genomic rearrangements in human cancer

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41588-019-0564-Y

Abstract: Chromatin is folded into successive layers to organize linear DNA. Genes within the same topologically associating domains (TADs) demonstrate similar expression and histone-modification profiles, and boundaries separating different domains have important roles in reinforcing the stability of these features. Indeed, domain disruptions in human cancers can lead to misregulation of gene expression. However, the frequency of domain disruptions in human cancers remains unclear. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumor types, we analyzed 288,457 somatic structural variations (SVs) to understand the distributions and effects of SVs across TADs. Notably, SVs can lead to the fusion of discrete TADs, and complex rearrangements markedly change chromatin folding maps in the cancer genomes. Notably, only 14% of the boundary deletions resulted in a change in expression in nearby genes of more than twofold.

Publication

The landscape of viral associations in human cancers

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41588-019-0558-9

Abstract: Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, for which whole-genome and, for a subset, whole-transcriptome sequencing data from 2,658 cancers across 38 tumor types was aggregated, we systematically investigated potential viral pathogens using a consensus approach that integrated three independent pipelines. Viruses were detected in 382 genome and 68 transcriptome datasets. We found a high prevalence of known tumor-associated viruses such as Epstein–Barr virus (EBV), hepatitis B virus (HBV) and human papilloma virus (HPV; for example, HPV16 or HPV18). The study revealed significant exclusivity of HPV and driver mutations in head-and-neck cancer and the association of HPV with APOBEC mutational signatures, which suggests that impaired antiviral defense is a driving force in cervical, bladder and head-and-neck carcinoma. For HBV, HPV16, HPV18 and adeno-associated virus-2 (AAV2), viral integration was associated with local variations in genomic copy numbers. Integrations at the TERT promoter were associated with high telomerase expression evidently activating this tumor-driving process. High levels of endogenous retrovirus (ERV1) expression were linked to a worse survival outcome in patients with kidney cancer.

Publication

High-coverage whole-genome analysis of 1220 cancers reveals hundreds of genes deregulated by rearrangement-mediated cis-regulatory alterations

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41467-019-13885-W

Abstract: The impact of somatic structural variants (SVs) on gene expression in cancer is largely unknown. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole-genome sequencing data and RNA sequencing from a common set of 1220 cancer cases, we report hundreds of genes for which the presence within 100 kb of an SV breakpoint associates with altered expression. For the majority of these genes, expression increases rather than decreases with corresponding breakpoint events. Up-regulated cancer-associated genes impacted by this phenomenon include TERT, MDM2, CDK4, ERBB2, CD274, PDCD1LG2, and IGF2. TERT-associated breakpoints involve ~3% of cases, most frequently in liver biliary, melanoma, sarcoma, stomach, and kidney cancers. SVs associated with up-regulation of PD1 and PDL1 genes involve ~1% of non-amplified cases. For many genes, SVs are significantly associated with increased numbers or greater proximity of enhancer regulatory elements near the gene. DNA methylation near the promoter is often increased with nearby SV breakpoint, which may involve inactivation of repressor elements.

Publication

Whole-genome characterization of chemoresistant ovarian cancer

Publisher: Springer Science and Business Media LLC

Date: 27-05-2015

DOI: 10.1038/NATURE14410

Abstract: Patients with high-grade serous ovarian cancer (HGSC) have experienced little improvement in overall survival, and standard treatment has not advanced beyond platinum-based combination chemotherapy, during the past 30 years. To understand the drivers of clinical phenotypes better, here we use whole-genome sequencing of tumour and germline DNA samples from 92 patients with primary refractory, resistant, sensitive and matched acquired resistant disease. We show that gene breakage commonly inactivates the tumour suppressors RB1, NF1, RAD51B and PTEN in HGSC, and contributes to acquired chemotherapy resistance. CCNE1 amplification was common in primary resistant and refractory disease. We observed several molecular events associated with acquired resistance, including multiple independent reversions of germline BRCA1 or BRCA2 mutations in individual patients, loss of BRCA1 promoter methylation, an alteration in molecular subtype, and recurrent promoter fusion associated with overexpression of the drug efflux pump MDR1.

Publication

Word add-in for ontology recognition: semantic enrichment of scientific literature

Publisher: Springer Science and Business Media LLC

Date: 24-02-2010

DOI: 10.1186/1471-2105-11-103

Publication

Translocation breakpoints preferentially occur in euchromatin and acrocentric chromosomes

Publisher: MDPI AG

Date: 08-01-2018

DOI: 10.3390/CANCERS10010013

Publication

Reconstructing evolutionary trajectories of mutation signature activities in cancer using TrackSig

Publisher: Springer Science and Business Media LLC

Date: 05-02-2020

DOI: 10.1038/S41467-020-14352-7

Abstract: The type and genomic context of cancer mutations depend on their causes. These causes have been characterized using signatures that represent mutation types that co-occur in the same tumours. However, it remains unclear how mutation processes change during cancer evolution due to the lack of reliable methods to reconstruct evolutionary trajectories of mutational signature activity. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole-genome sequencing data from 2658 cancers across 38 tumour types, we present TrackSig, a new method that reconstructs these trajectories using optimal, joint segmentation and deconvolution of mutation type and allele frequencies from a single tumour sample. In simulations, we find TrackSig has a 3–5% activity reconstruction error, and 12% false detection rate. It outperforms an aggressive baseline in situations with branching evolution, CNA gain, and neutral mutations. Applied to data from 2658 tumours and 38 cancer types, TrackSig permits pan-cancer insight into evolutionary changes in mutational processes.

Publication

I Am Not a Scientist, I Am a Number

Publisher: Public Library of Science (PLoS)

Date: 26-12-2008

DOI: 10.1371/JOURNAL.PCBI.1000247

Publication

Genome‐wide DNA methylation patterns in pancreatic ductal adenocarcinoma reveal epigenetic deregulation of SLIT‐ROBO, ITGA2 and MET signaling

Publisher: Wiley

Date: 09-05-2014

DOI: 10.1002/IJC.28765

Abstract: The importance of epigenetic modifications such as DNA methylation in tumorigenesis is increasingly being appreciated. To define the genome-wide pattern of DNA methylation in pancreatic ductal adenocarcinomas (PDAC), we captured the methylation profiles of 167 untreated resected PDACs and compared them to a panel of 29 adjacent nontransformed pancreata using high-density arrays. A total of 11,634 CpG sites associated with 3,522 genes were significantly differentially methylated (DM) in PDAC and were capable of segregating PDAC from non-malignant pancreas, regardless of tumor cellularity. As expected, PDAC hypermethylation was most prevalent in the 5' region of genes (including the proximal promoter, 5'UTR and CpG islands). Approximately 33% DM genes showed significant inverse correlation with mRNA expression levels. Pathway analysis revealed an enrichment of aberrantly methylated genes involved in key molecular mechanisms important to PDAC: TGF-β, WNT, integrin signaling, cell adhesion, stellate cell activation and axon guidance. Given the recent discovery that SLIT-ROBO mutations play a clinically important role in PDAC, the role of epigenetic perturbation of axon guidance was pursued in more detail. Bisulfite amplicon deep sequencing and qRT-PCR expression analyses confirmed recurrent perturbation of axon guidance pathway genes SLIT2, SLIT3, ROBO1, ROBO3, ITGA2 and MET and suggests epigenetic suppression of SLIT-ROBO signaling and up-regulation of MET and ITGA2 expression. Hypomethylation of MET and ITGA2 correlated with high gene expression, which was associated with poor survival. These data suggest that aberrant methylation plays an important role in pancreatic carcinogenesis affecting core signaling pathways with potential implications for the disease pathophysiology and therapy.

Lynn Fink

Researcher

Related Links

Publications

PhosphoregDB: The tissue and sub-cellular distribution of mammalian protein kinases and phosphatases

Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis

Genomic analyses identify molecular subtypes of pancreatic cancer

Integrative pathway enrichment analysis of multivariate omics data

LOCATE: A mammalian protein subcellular localization database

Patterns of somatic structural variation in human cancer genomes

Sleeping Beauty mutagenesis reveals cooperating mutations and pathways in pancreatic adenocarcinoma

Marked mitochondrial genetic variation in individuals and populations of the carcinogenic liver fluke Clonorchis sinensis

Towards defining the nuclear proteome

Computational approaches to identify functional genetic variants in cancer genomes

Differential Use of Signal Peptides and Membrane Domains Is a Common Occurrence in the Protein Output of Transcriptional Units

Minimizing Sample Failure Rates for Challenging Clinical Tumor Samples

Open access: Taking full advantage of the content

A deep learning system accurately classifies primary and metastatic cancers using passenger mutation patterns

2HAPI: a microarray data analysis system

Targeted Next-Gen Sequencing for Detecting MLL Gene Fusions in Leukemia

Subtype-Specific Analyses Reveal Infiltrative Basal Cell Carcinomas Are Highly Interactive with their Environment

Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing

Cutting edge genomics reveal new insights into tumour development, disease progression and therapeutic impacts in multiple myeloma

Combined burden and functional impact tests for cancer driver discovery using DriverPower

Rival penalized competitive learning (RPCL): a topology-determining algorithm for analyzing gene expression data

BioLit: integrating biological literature with databases

The repertoire of mutational signatures in human cancer

Deconvolution of single-cell multi-omics layers reveals regulatory heterogeneity

Inferring structural variant cancer cell fraction

Exquisitely Platinum-Sensitive Triple-Negative Breast Cancer, Time for BRCA Methylation Testing?

Chromosome arm aneuploidies shape tumour evolution and drug response

Evaluation and comparison of mammalian subcellular localization prediction methods

Pan-cancer analysis of whole genomes identifies driver rearrangements promoted by LINE-1 retrotransposition

Pan-cancer analysis of whole genomes

Integration of open access literature into the RCSB Protein Data Bank using BioLit

Progression of Disease Within 24 Months in Follicular Lymphoma Is Associated With Reduced Intratumoral Immune Infiltration

Whole-genome landscape of pancreatic neuroendocrine tumours

Using genomics to better define high-risk MGUS/SMM patients

LOCATE: a mouse protein subcellular localization database

Butler enables rapid cloud-based analysis of thousands of human genomes

Comprehensive molecular characterization of mitochondrial genomes in human cancers

qpure: A Tool to Estimate Tumor Cellularity from Genome-Wide Single-Nucleotide Polymorphism Profiles

Genomic basis for RNA alterations in cancer

Analyses of non-coding somatic drivers in 2,658 cancer whole genomes

Subclonal evolution in disease progression from MGUS/SMM to multiple myeloma is characterised by clonal stability

Subcellular Localization of Mammalian Type II Membrane Proteins

Whole genomes redefine the mutational landscape of pancreatic cancer

Pathway and network analysis of more than 2500 whole cancer genomes

PTEN deletion drives acute myeloid leukemia resistance to MEK inhibitors

Computational Biology Resources Lack Persistence and Usability

Hypermutation In Pancreatic Cancer

Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes

Genomic footprints of activated telomere maintenance mechanisms in cancer

EBV-associated primary CNS lymphoma occurring after immunosuppression is a distinct immunobiological entity

ARED 2.0: an update of AU-rich element mRNA database

Disruption of chromatin folding domains by somatic genomic rearrangements in human cancer

The landscape of viral associations in human cancers

High-coverage whole-genome analysis of 1220 cancers reveals hundreds of genes deregulated by rearrangement-mediated cis-regulatory alterations

Whole-genome characterization of chemoresistant ovarian cancer

Word add-in for ontology recognition: semantic enrichment of scientific literature

Translocation breakpoints preferentially occur in euchromatin and acrocentric chromosomes

Reconstructing evolutionary trajectories of mutation signature activities in cancer using TrackSig

I Am Not a Scientist, I Am a Number

Genome‐wide DNA methylation patterns in pancreatic ductal adenocarcinoma reveal epigenetic deregulation of SLIT‐ROBO, ITGA2 and MET signaling

Related Organisations

XING Genomic Services

University Of California, San Diego

University Of Arizona

University Of Queensland

SELF

Queensland University Of Technology Australian Translational Genomics Centre

Bond University

BGI Group

Related Funding Activities

Semantic Mark-up, XML Formatting, And Submission Of Scholarly Articles

SciVee – New Modes Of Scientific Dissemination

BioLit: Open Access Tools For Integration Of The Biological Literature And Databases

Integrating Immunity And Genetics In Follicular Lymphoma To Establish A Prognostic Score Fit For The Modern Era

ARC Training Centre For Innovation In Biomedical Imaging Technology