ORCID Profile
0000-0002-0138-2691
Current Organisations
University of Southampton, University of Western Australia, Telethon Kids Institute
Publisher: Elsevier BV
Date: 12-2022
Publisher: Springer Science and Business Media LLC
Date: 03-12-2010
DOI: 10.1038/LEU.2009.246
Abstract: Acute myeloid leukemia (AML) involves a block in terminal differentiation of the myeloid lineage and uncontrolled proliferation of a progenitor state. Using phorbol myristate acetate (PMA), it is possible to overcome this block in THP-1 cells (an M5-AML containing the MLL-MLLT3 fusion), resulting in differentiation to an adherent monocytic phenotype. As part of FANTOM4, we used microarrays to identify 23 microRNAs that are regulated by PMA. We identify four PMA-induced microRNAs (mir-155, mir-222, mir-424 and mir-503) that when overexpressed cause cell-cycle arrest and partial differentiation and when used in combination induce additional changes not seen by any individual microRNA. We further characterize these pro-differentiative microRNAs and show that mir-155 and mir-222 induce G2 arrest and apoptosis, respectively. We find mir-424 and mir-503 are derived from a polycistronic precursor mir-424-503 that is under repression by the MLL-MLLT3 leukemogenic fusion. Both of these microRNAs directly target cell-cycle regulators and induce G1 cell-cycle arrest when overexpressed in THP-1. We also find that the pro-differentiative mir-424 and mir-503 downregulate the anti-differentiative mir-9 by targeting a site in its primary transcript. Our study highlights the combinatorial effects of multiple microRNAs within cellular systems.
Publisher: Springer Science and Business Media LLC
Date: 25-04-2014
Publisher: Elsevier BV
Date: 03-2010
Publisher: Oxford University Press (OUP)
Date: 07-2006
DOI: 10.1093/NAR/GKL191
Publisher: Springer Science and Business Media LLC
Date: 15-03-2007
Abstract: Mutations in the PTEN induced putative kinase 1 (PINK1) are implicated in early-onset Parkinson's disease. PINK1 is expressed abundantly in mitochondria rich tissues, such as skeletal muscle, where it plays a critical role determining mitochondrial structural integrity in Drosophila. Herein we characterize a novel splice variant of PINK1 (svPINK1) that is homologous to the C-terminus regulatory domain of the protein kinase. Naturally occurring non-coding antisense provides sophisticated mechanisms for diversifying genomes and we describe a human specific non-coding antisense expressed at the PINK1 locus (naPINK1). We further demonstrate that PINK1 varies in vivo when human skeletal muscle mitochondrial content is enhanced, supporting the idea that PINK1 has a physiological role in mitochondrion. The observation of concordant regulation of svPINK1 and naPINK1 during in vivo mitochondrial biogenesis was confirmed using RNAi, where selective targeting of naPINK1 results in loss of the PINK1 splice variant in neuronal cell lines. Our data presents the first direct observation that a mammalian non-coding antisense molecule can positively influence the abundance of a cis-transcribed mRNA under physiological abundance conditions. While our analysis implies a possible human specific and dsRNA-mediated mechanism for stabilizing the expression of svPINK1, it also points to a broader genomic strategy for regulating a human disease locus and increases the complexity through which alterations in the regulation of the PINK1 locus could occur.
Publisher: Cold Spring Harbor Laboratory
Date: 07-2020
Abstract: Gene expression profiles in homologous tissues have been observed to be different between species, which may be due to differences between species in the gene expression program in each cell type, but may also reflect differences in cell type composition of each tissue in different species. Here, we compare expression profiles in matching primary cells in human, mouse, rat, dog, and chicken using Cap Analysis Gene Expression (CAGE) and short RNA (sRNA) sequencing data from FANTOM5. While we find that expression profiles of orthologous genes in different species are highly correlated across cell types, in each cell type many genes were differentially expressed between species. Expression of genes with products involved in transcription, RNA processing, and transcriptional regulation was more likely to be conserved, while expression of genes encoding proteins involved in intercellular communication was more likely to have diverged during evolution. Conservation of expression correlated positively with the evolutionary age of genes, suggesting that divergence in expression levels of genes critical for cell function was restricted during evolution. Motif activity analysis showed that both promoters and enhancers are activated by the same transcription factors in different species. An analysis of expression levels of mature miRNAs and of primary miRNAs identified by CAGE revealed that evolutionarily old miRNAs are more likely to have conserved expression patterns than young miRNAs. We conclude that key aspects of the regulatory network are conserved, while differential expression of genes involved in cell-to-cell communication may contribute greatly to phenotypic differences between species.
Publisher: Springer Science and Business Media LLC
Date: 28-01-2015
Publisher: Public Library of Science (PLoS)
Date: 06-03-2017
Publisher: Public Library of Science (PLoS)
Date: 14-12-2015
Publisher: The Company of Biologists
Date: 2016
DOI: 10.1242/JCS.186767
Abstract: Lymphangiogenesis plays a crucial role during development, in cancer metastasis and in inflammation. Activation of VEGFR-3 by VEGF-C is one of the main drivers of lymphangiogenesis, but the transcriptional events downstream of VEGFR-3 activation are largely unknown. Recently, we identified a wave of immediate early transcription factors (TF) upregulated in human lymphatic endothelial cells (LEC) within the first 30 to 80 min after VEGFR-3 activation. Expression of these TFs must be regulated by additional, pre-existing TFs, which are rapidly activated by VEGFR-3 signaling. Using TF activity analysis, we identified the homeobox TF HOXD10 to be specifically activated at early time points after VEGFR-3 stimulation, and to regulate expression of immediate early TFs, including NR4A1. Gain- and loss of function studies revealed that HOXD10 is involved in LEC migration and formation of cord-like structures. Furthermore, HOXD10 regulates expression of VE-cadherin, claudin-5 and e-NOS, and promotes lymphatic endothelial permeability. Taken together, these results reveal an important and unanticipated role of HOXD10 in the regulation of VEGFR-3 signaling in lymphatic endothelial cells and in the control of lymphangiogenesis and permeability.
Publisher: American Society of Hematology
Date: 24-04-2014
DOI: 10.1182/BLOOD-2013-02-483537
Abstract: Expression analysis of novel potential regulatory epigenetic factors in hematopoiesis.
Publisher: Cold Spring Harbor Laboratory
Date: 05-01-2010
Abstract: MicroRNAs (miRNAs) are short (20–23 nt) RNAs that are sequence-specific mediators of transcriptional and post-transcriptional regulation of gene expression. Modern high-throughput technologies enable deep sequencing of such RNA species on an unprecedented scale. We find that the analysis of small RNA deep-sequencing libraries can be affected by cross-mapping, in which RNA sequences originating from one locus are inadvertently mapped to another. Similar to cross-hybridization on microarrays, cross-mapping is prevalent among miRNAs, as they tend to occur in families, are similar or derived from repeat or structural RNAs, or are post-transcriptionally modified. Here, we develop a strategy to correct for cross-mapping, and apply it to the analysis of RNA editing in mature miRNAs. In contrast to previous reports, our analysis suggests that RNA editing in mature miRNAs is rare in animals.
Publisher: American Association for Cancer Research (AACR)
Date: 10-2017
DOI: 10.1158/1541-7786.MCR-17-0191
Abstract: Lung cancer is the leading cause of cancer-related deaths worldwide. The majority of cancer driver mutations have been identified; however, relevant epigenetic regulation involved in tumorigenesis has only been fragmentarily analyzed. Epigenetically regulated genes have a great theranostic potential, especially in tumors with no apparent driver mutations. Here, epigenetically regulated genes were identified in lung cancer by an integrative analysis of promoter-level expression profiles from Cap Analysis of Gene Expression (CAGE) of 16 non–small cell lung cancer (NSCLC) cell lines and 16 normal lung primary cell specimens with DNA methylation data of 69 NSCLC cell lines and 6 normal lung epithelial cells. A core set of 49 coding genes and 10 long noncoding RNAs (lncRNA), which are upregulated in NSCLC cell lines due to promoter hypomethylation, was uncovered. Twenty-two epigenetically regulated genes were validated (upregulated genes with hypomethylated promoters) in the adenocarcinoma and squamous cell cancer subtypes of lung cancer using The Cancer Genome Atlas data. Furthermore, it was demonstrated that multiple copies of the REP522 DNA repeat family are prominently upregulated due to hypomethylation in NSCLC cell lines, which leads to cancer-specific expression of lncRNAs, such as RP1-90G24.10, AL022344.4, and PCAT7. Finally, Myeloma Overexpressed (MYEOV) was identified as the most promising candidate. Functional studies demonstrated that MYEOV promotes cell proliferation, survival, and invasion. Moreover, high MYEOV expression levels were associated with poor prognosis. Implications: This report identifies a robust list of 22 candidate driver genes that are epigenetically regulated in lung cancer; such genes may complement the known mutational drivers. Visual Overview: ontent/molcanres/15/10/1354/F1.large.jpg. Mol Cancer Res 15(10) 1354–65. ©2017 AACR.
Publisher: Springer Science and Business Media LLC
Date: 13-06-2017
DOI: 10.1038/LEU.2016.165
Publisher: Springer Science and Business Media LLC
Date: 09-2012
DOI: 10.1038/NATURE11233
Publisher: Frontiers Media SA
Date: 16-07-2020
Publisher: American Association for Cancer Research (AACR)
Date: 15-07-2016
DOI: 10.1158/1538-7445.AM2016-2897
Abstract: Genes that are frequently deregulated in cancer are clinically attractive as both potential pan-cancer diagnostic markers and therapeutic targets. Here we compared Cap Analysis of Gene Expression (CAGE) profiles from 225 cancer cell lines and 339 corresponding primary cell samples to identify transcripts that are recurrently deregulated in a broad range of cancer types. CAGE is a 5’ sequence tag technology that globally determines transcription start sites (TSS) in the genome and their expression levels. This allowed us to assess novel aspects of the cancer transcriptome. First, we identified (at promoter resolution) hundreds of protein coding and long non-coding transcripts that are commonly de-regulated in cancer cell lines. Next, we showed that promoters that overlap repetitive elements (especially SINE/Alu and LTR/ERV1 elements) are often upregulated in cancer. In particular, a specific repeat family, REP522 (largely palindromic, unclassified interspersed repeat of ∼1.8Kb in size), was strongly enriched for the most up-regulated promoters. To our knowledge this is the first report implicating REP522 activation in cancer. Then, taking advantage of the fact that CAGE data can be used to estimate the activity of enhancers from balanced bidirectional transcription, we identified 90 enhancer RNA producing regions that are recurrently activated in cancer cell lines. With ENCODE ChIA-PET data, we linked 16 of the cancer-activated enhancers to promoters of known cancer related genes. Finally, to confirm that our results are relevant to clinical tumors, we performed complementary analysis in RNA-seq data from 4,055 tumors and 563 normal tissues profiled by The Cancer Genome Atlas (TCGA) and we identified a core set of pan-cancer biomarkers (of both coding and non-coding transcripts) that are recurrently perturbed in both the FANTOM5 and TCGA datasets.
In summary, our extensive transcriptome analysis identified a comprehensive set of candidate biomarkers with pan-cancer potential, and extended the perspective of enhancers and repetitive elements that are recurrently activated during carcinogenesis. Citation Format: Bogumil Kaczkowski, Yuji Tanaka, Hideya Kawaji, Albin Sandelin, Robin Andersson, Masayoshi Itoh, Timo Lassmann, Yoshihide Hayashizaki, Piero Carninci, Alistair R.R. Forrest, FANTOM5 Consortium. Recurrent transcriptome alterations across multiple cancer types. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research 2016 Apr 16-20 New Orleans, LA. Philadelphia (PA): AACR Cancer Res 2016 (14 Suppl):Abstract nr 2897.
Publisher: Proceedings of the National Academy of Sciences
Date: 27-03-2014
Abstract: Naturally occurring regulatory T (Treg) cells, which specifically express the transcription factor forkhead box P3 (Foxp3), are engaged in the maintenance of immunological self-tolerance and homeostasis. By transcriptional start site cluster analysis, we assessed here how genome-wide patterns of DNA methylation or Foxp3 binding sites were associated with Treg-specific gene expression. We found that Treg-specific DNA hypomethylated regions were closely associated with Treg up-regulated transcriptional start site clusters, whereas Foxp3 binding regions had no significant correlation with either up- or down-regulated clusters in nonactivated Treg cells. However, in activated Treg cells, Foxp3 binding regions showed a strong correlation with down-regulated clusters. In accordance with these findings, the above two features of activation-dependent gene regulation in Treg cells tend to occur at different locations in the genome. The results collectively indicate that Treg-specific DNA hypomethylation is instrumental in gene up-regulation in steady state Treg cells, whereas Foxp3 down-regulates the expression of its target genes in activated Treg cells. Thus, the two events seem to play distinct but complementary roles in Treg-specific gene expression.
Publisher: Springer Science and Business Media LLC
Date: 03-10-2017
Abstract: The FANTOM5 expression atlas is a quantitative measurement of the activity of nearly 200,000 promoter regions across nearly 2,000 different human primary cells, tissue types and cell lines. Generation of this atlas was made possible by the use of CAGE, an experimental approach to localise transcription start sites at single-nucleotide resolution by sequencing the 5′ ends of capped RNAs after their conversion to cDNAs. While 50% of CAGE-defined promoter regions could be confidently associated to adjacent transcriptional units, nearly 100,000 promoter regions remained gene-orphan. To address this, we used the CAGEscan method, in which random-primed 5′-cDNAs are paired-end sequenced. Pairs starting in the same region are assembled in transcript models called CAGEscan clusters. Here, we present the production and quality control of CAGEscan libraries from 56 FANTOM5 RNA sources, which enhances the FANTOM5 expression atlas by providing experimental evidence associating core promoter regions with their cognate transcripts.
Publisher: Springer Science and Business Media LLC
Date: 11-12-2018
DOI: 10.1038/S41597-018-0003-4
Abstract: The authors regret that Luba M. Pardo was omitted in error from the author list of the original version of this Data Descriptor. This omission has now been corrected in the HTML and PDF versions. The authors also regret that Anemieke Rozemuller was omitted in error from the Acknowledgements of the original version of this Data Descriptor. This omission has now been corrected in the HTML and PDF versions.
Publisher: Oxford University Press (OUP)
Date: 09-03-2021
DOI: 10.1093/CID/CIAB216
Abstract: Our goal was to identify genetic risk factors for severe otitis media (OM) in Aboriginal Australians. Illumina® Omni2.5 BeadChip and imputed data were compared between 21 children with severe OM (multiple episodes chronic suppurative OM and/or perforations or tympanic sclerosis) and 370 individuals without this phenotype, followed by FUnctional Mapping and Annotation (FUMA). Exome data filtered for common (ExAC_all ≥ 0.1) putative deleterious variants influencing protein coding (CADD-scaled scores ≥ 15) were used to compare 15 severe OM cases with 9 mild cases (single episode of acute OM recorded over ≥3 consecutive years). Rare (ExAC_all ≤ 0.01) such variants were filtered for those present only in severe OM. Enrichr was used to determine enrichment of genes contributing to pathways/processes relevant to OM. FUMA analysis identified 2 plausible genetic risk loci for severe OM: NR3C1 (Pimputed_1000G = 3.62 × 10⁻⁶) encoding the glucocorticoid receptor, and NREP (Pimputed_1000G = 3.67 × 10⁻⁶) encoding neuronal regeneration-related protein. Exome analysis showed: (i) association of severe OM with variants influencing protein coding (CADD-scaled ≥ 15) in a gene-set (GRXCR1, CDH23, LRP2, FAT4, ARSA, EYA4) enriched for Mammalian Phenotype Level 4 abnormal hair cell stereociliary bundle morphology and related phenotypes; (ii) rare variants influencing protein coding only seen in severe OM provided gene-sets enriched for “abnormal ear” (LMNA, CDH23, LRP2, MYO7A, FGFR1), integrin interactions, transforming growth factor signaling, and cell projection phenotypes including hair cell stereociliary bundles and cilium assembly. This study highlights interacting genes and pathways related to cilium structure and function that may contribute to extreme susceptibility to OM in Aboriginal Australian children.
Publisher: Springer Science and Business Media LLC
Date: 05-2007
Publisher: Hindawi Limited
Date: 09-03-2022
DOI: 10.1002/HUMU.24362
Abstract: Identifying the causal variant for diagnosis of genetic diseases is challenging when using next-generation sequencing approaches, and variant prioritization tools can assist in this task. These tools provide in silico predictions of variant pathogenicity; however, they are agnostic to the disease under study. We previously performed a disease-specific benchmark of 24 such tools to assess how they perform in different disease contexts. We found that the tools themselves show large differences in performance, but more importantly that the best tools for variant prioritization are dependent on the disease phenotypes being considered. Here we expand the assessment to 37 tools and refine our assessment by separating performance for nonsynonymous single nucleotide variants (nsSNVs) and missense variants (i.e., excluding nonsense variants). We found differences in performance for missense variants compared to nsSNVs and recommend three tools that stand out in terms of their performance (BayesDel, CADD, and ClinPred).
Publisher: Springer Science and Business Media LLC
Date: 09-2012
DOI: 10.1038/NATURE11247
Publisher: Springer Science and Business Media LLC
Date: 02-06-2020
DOI: 10.1186/S13059-020-02048-6
Abstract: Single-cell RNA sequencing has been widely adopted to estimate the cellular composition of heterogeneous tissues and obtain transcriptional profiles of individual cells. Multiple approaches for optimal sample dissociation and storage of single cells have been proposed as have single-nuclei profiling methods. What has been lacking is a systematic comparison of their relative biases and benefits. Here, we compare gene expression and cellular composition of single-cell suspensions prepared from adult mouse kidney using two tissue dissociation protocols. For each sample, we also compare fresh cells to cryopreserved and methanol-fixed cells. Lastly, we compare this single-cell data to that generated using three single-nucleus RNA sequencing workflows. Our data confirms prior reports that digestion on ice avoids the stress response observed with 37 °C dissociation. It also reveals cell types more abundant either in the cold or warm dissociations that may represent populations that require gentler or harsher conditions to be released intact. For cell storage, cryopreservation of dissociated cells results in a major loss of epithelial cell types; in contrast, methanol fixation maintains the cellular composition but suffers from ambient RNA leakage. Finally, cell type composition differences are observed between single-cell and single-nucleus RNA sequencing libraries. In particular, we note an underrepresentation of T, B, and NK lymphocytes in the single-nucleus libraries. Systematic comparison of recovered cell types and their transcriptional profiles across the workflows has highlighted protocol-specific biases and thus enables researchers starting single-cell experiments to make an informed choice.
Publisher: Oxford University Press (OUP)
Date: 27-06-2015
DOI: 10.1093/NAR/GKV646
Publisher: Proceedings of the National Academy of Sciences
Date: 13-03-2007
Abstract: Attainment of a brown adipocyte cell phenotype in white adipocytes, with their abundant mitochondria and increased energy expenditure potential, is a legitimate strategy for combating obesity. The unique transcriptional regulators of the primary brown adipocyte phenotype are unknown, limiting our ability to promote brown adipogenesis over white. In the present work, we used microarray analysis strategies to study primary preadipocytes, and we made the striking discovery that brown preadipocytes demonstrate a myogenic transcriptional signature, whereas both brown and white primary preadipocytes demonstrate signatures distinct from those found in immortalized adipogenic models. We found a plausible SIRT1-related transcriptional signature during brown adipocyte differentiation that may contribute to silencing the myogenic signature. In contrast to brown preadipocytes or skeletal muscle cells, white preadipocytes express Tcf21, a transcription factor that has been shown to suppress myogenesis and nuclear receptor activity. In addition, we identified a number of developmental genes that are differentially expressed between brown and white preadipocytes and that have recently been implicated in human obesity. The interlinkage between the myocyte and the brown preadipocyte confirms the distinct origin for brown versus white adipose tissue and also represents a plausible explanation as to why brown adipocytes ultimately specialize in lipid catabolism rather than storage, much like oxidative skeletal muscle tissue.
Publisher: American Society of Hematology
Date: 24-04-2014
DOI: 10.1182/BLOOD-2013-02-484188
Abstract: In-depth regulome analysis of human monocyte subsets, including transcription and enhancer profiling. Description of metabolomic differences in human monocyte subsets.
Publisher: Springer Science and Business Media LLC
Date: 05-01-2015
Publisher: Cold Spring Harbor Laboratory
Date: 15-12-2016
DOI: 10.1101/088500
Abstract: We used a transgenic HeLa cell line that reports cell cycle phases through fluorescent, ubiquitination-based cell cycle indicators (Fucci), to produce a reference dataset of more than 270 curated single cells. Microscopic images were taken from each cell followed by RNA-sequencing, so that single-cell expression data is associated to the fluorescence intensity of the Fucci probes in the same cell. We developed an open data management and quality control workflow that enables users to replicate the processing of the sequence and microscopic image data that we deposited in public repositories. The workflow outputs a table with metadata that is the starting point for further studies on these data. Beyond its use for cell cycle studies, we also expect that our workflow can be adapted to other single-cell projects using a similar combination of sequencing data and fluorescence measurements.
Publisher: Oxford University Press (OUP)
Date: 02-07-2020
Abstract: The recent increase in babies born with brain and eye malformations in Brazil is associated with Zika virus (ZIKV) infection in utero. ZIKV alters host DNA methylation in vitro. Using genome-wide DNA methylation profiling we compared 18 babies born with congenital ZIKV microcephaly with 20 controls. We found ZIKV-associated alteration of host methylation patterns, notably at RABGAP1L which is important in brain development, at viral host immunity genes MX1 and ISG15, and in an epigenetic module containing the causal microcephaly gene MCPH1. Our data support the hypothesis that clinical signs of congenital ZIKV are associated with changes in DNA methylation.
Publisher: Springer Science and Business Media LLC
Date: 05-02-2018
DOI: 10.1038/S41525-018-0044-9
Abstract: Next generation sequencing is a standard tool used in clinical diagnostics. In Mendelian diseases the challenge is to discover the single etiological variant among thousands of benign or functionally unrelated variants. After calling variants from aligned sequencing reads, variant prioritisation tools are used to examine the conservation or potential functional consequences of variants. We hypothesised that the performance of variant prioritisation tools may vary by disease phenotype. To test this we created benchmark data sets for variants associated with different disease phenotypes. We found that performance of 24 tested tools is highly variable and differs by disease phenotype. The task of identifying a causative variant amongst a large number of benign variants is challenging for all tools, highlighting the need for further development in the field. Based on our observations, we recommend use of five top performers found in this study (FATHMM, M-CAP, MetaLR, MetaSVM and VEST3). In addition we provide tables indicating which analytical approach works best in which disease context. Variant prioritisation tools are best suited to investigate variants associated with well-studied genetic diseases, as these variants are more readily available during algorithm development than variants associated with rare diseases. We anticipate that further development into disease focussed tools will lead to significant improvements.
Publisher: Cold Spring Harbor Laboratory
Date: 05-06-2014
Abstract: Underlying the complexity of the mammalian brain is its network of neuronal connections, but also the molecular networks of signaling pathways, protein interactions, and regulated gene expression within each individual neuron. The diversity and complexity of the spatially intermingled neurons pose a serious challenge to the identification and quantification of single neuron components. To address this challenge, we present a novel approach for the study of the ribosome-associated transcriptome—the translatome—from selected subcellular domains of specific neurons, and apply it to the Purkinje cells (PCs) in the rat cerebellum. We combined microdissection, translating ribosome affinity purification (TRAP) in nontransgenic animals, and quantitative nanoCAGE sequencing to obtain a snapshot of RNAs bound to cytoplasmic or rough endoplasmic reticulum (rER)–associated ribosomes in the PC and its dendrites. This allowed us to discover novel markers of PCs, to determine structural aspects of genes, to find hitherto uncharacterized transcripts, and to quantify biophysically relevant genes of membrane proteins controlling ion homeostasis and neuronal electrical activities.
Publisher: Public Library of Science (PLoS)
Date: 27-03-2014
Publisher: Elsevier BV
Date: 05-2019
Publisher: Oxford University Press (OUP)
Date: 26-10-2020
DOI: 10.1093/BIOINFORMATICS/BTZ795
Abstract: Kalign is an efficient multiple sequence alignment (MSA) program capable of aligning thousands of protein or nucleotide sequences. However, current alignment problems involving large numbers of sequences are exceeding Kalign’s original design specifications. Here we present a completely re-written and updated version to meet current and future alignment challenges. Kalign now uses a SIMD (single instruction, multiple data) accelerated version of the bit-parallel Gene Myers algorithm to estimate pairwise distances, adopts a sequence embedding strategy and the bi-secting K-means algorithm to rapidly construct guide trees for thousands of sequences. The new version maintains high alignment accuracy on both protein and nucleotide alignments and scales better than other MSA tools. The source code of Kalign and code to reproduce the results are found here: imolassmann/kalign.
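The bit-parallel distance estimation named in the abstract above can be illustrated with a minimal sketch. This is not Kalign's code: it is a plain Python rendering of the textbook Myers bit-vector edit-distance recurrence, using Python's arbitrary-precision integers in place of SIMD registers. The function name `myers_edit_distance` is hypothetical and chosen only for this illustration.

```python
def myers_edit_distance(pattern: str, text: str) -> int:
    """Levenshtein distance via Myers' bit-vector recurrence (illustrative sketch)."""
    m = len(pattern)
    if m == 0:
        return len(text)

    # One bitmask per character: bit i is set where pattern[i] equals that character.
    peq = {}
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)

    mask = (1 << m) - 1      # keeps every vector m bits wide
    high = 1 << (m - 1)      # bit for the last row of the DP column
    pv, mv = mask, 0         # vertical +1 / -1 deltas of the current column
    score = m                # distance between pattern and the empty prefix of text

    for ch in text:
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = (((eq & pv) + pv) ^ pv) | eq
        ph = mv | ~(xh | pv)     # horizontal +1 deltas
        mh = pv & xh             # horizontal -1 deltas

        # The last row's horizontal delta updates the running score.
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1

        # Shift in a +1 at row 0 (global alignment boundary), then derive the next column.
        ph = ((ph << 1) | 1) & mask
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv

    return score
```

Each text character is processed with a constant number of word operations on m-bit vectors, which is what makes the approach attractive for estimating many pairwise distances; a production implementation would use fixed-width machine words or SIMD lanes rather than Python integers.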
Publisher: Springer Science and Business Media LLC
Date: 23-02-2012
Publisher: Springer Science and Business Media LLC
Date: 02-06-2021
DOI: 10.1038/S41467-021-23143-7
Abstract: Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism.
Publisher: Springer Science and Business Media LLC
Date: 03-2014
DOI: 10.1038/NATURE12787
Publisher: Royal Society of Chemistry (RSC)
Date: 2019
DOI: 10.1039/C9ME00029A
Abstract: A hitchhiker's guide to biomarker discovery in immune checkpoint blockade.
Publisher: Springer Science and Business Media LLC
Date: 09-02-2022
DOI: 10.1186/S13287-022-02740-3
Abstract: Over 400 million people worldwide are living with a rare disease. Next Generation Sequencing (NGS) identifies potential disease causative genetic variants. However, many are identified as variants of uncertain significance (VUS) and require functional laboratory validation to determine pathogenicity, and this creates major diagnostic delays. In this study we test a rapid genetic variant assessment pipeline using CRISPR homology directed repair to introduce single nucleotide variants into inducible pluripotent stem cells (iPSCs), followed by neuronal disease modelling, and functional genomics on amplicon and RNA sequencing, to determine cellular changes to support patient diagnosis and identify disease mechanism. As proof-of-principle, we investigated an EHMT1 (Euchromatin histone methyltransferase 1; EHMT1 c.3430C>T; p.Gln1144*) genetic variant pathogenic for Kleefstra syndrome and determined changes in gene expression during neuronal progenitor cell differentiation. This pipeline rapidly identified Kleefstra syndrome in genetic variant cells compared to healthy cells, and revealed novel findings potentially implicating the key transcription factors REST and SP1 in disease pathogenesis. The study pipeline is a rapid, robust method for genetic variant assessment that will support rare diseases patient diagnosis. The results also provide valuable information on genome wide perturbations key to disease mechanism that can be targeted for drug treatments.
Publisher: Cold Spring Harbor Laboratory
Date: 22-12-2010
Abstract: Core promoters are critical regions for gene regulation in higher eukaryotes. However, the boundaries of promoter regions, the relative rates of initiation at the transcription start sites (TSSs) distributed within them, and the functional significance of promoter architecture remain poorly understood. We produced a high-resolution map of promoters active in the Drosophila melanogaster embryo by integrating data from three independent and complementary methods: 21 million cap analysis of gene expression (CAGE) tags, 1.2 million RNA ligase mediated rapid amplification of cDNA ends (RLM-RACE) reads, and 50,000 cap-trapped expressed sequence tags (ESTs). We defined 12,454 promoters of 8037 genes. Our analysis indicates that, due to non-promoter-associated RNA background signal, previous studies have likely overestimated the number of promoter-associated CAGE clusters by fivefold. We show that TSS distributions form a complex continuum of shapes, and that promoters active in the embryo and adult have highly similar shapes in 95% of cases. This suggests that these distributions are generally determined by static elements such as local DNA sequence and are not modulated by dynamic signals such as histone modifications. Transcription factor binding motifs are differentially enriched as a function of promoter shape, and peaked promoter shape is correlated with both temporal and spatial regulation of gene expression. Our results contribute to the emerging view that core promoters are functionally diverse and control patterning of gene expression in Drosophila and mammals.
Publisher: Springer Science and Business Media LLC
Date: 23-08-2009
DOI: 10.1038/NATURE08283
Publisher: MDPI AG
Date: 29-04-2013
DOI: 10.3390/IJMS14059305
Publisher: Research Square Platform LLC
Date: 20-05-2022
DOI: 10.21203/RS.3.RS-1613398/V1
Abstract: Immune checkpoint therapy (ICT) causes durable tumor responses in a subgroup of patients. Profiling T cell receptor beta (TCRβ) repertoire structure in ICT responders and non-responders provides mechanistic insight into what constitutes an effective anti-tumor response, and could result in the development of predictive biomarkers of response to identify and stratify patients for ICT. To examine how the TCRβ repertoire dynamics contribute to ICT response, we utilized an established murine model that excludes variation in host genetics, environmental factors and tumor mutation burden, limiting variation between animals to naturally diverse TCRβ repertoires. Oligoclonal expansion of TCRβ clonotypes that corresponded with a low TCRβ diversity was observed in responding tumors prior to ICT. We modeled TCRβ cluster dynamics during ICT and found that select clonotypes expanded slower in responders compared to non-responders. Clonally expanded CD8+ tumor infiltrating T cells in non-responders exhibited a T cell exhaustion phenotype. We conclude that an early burst of clonal expansion followed by a contraction during ICT is associated with response.
Publisher: Frontiers Media SA
Date: 18-11-2015
Publisher: Springer Science and Business Media LLC
Date: 2009
Publisher: Springer Science and Business Media LLC
Date: 19-07-2018
DOI: 10.1038/S41598-018-29279-9
Abstract: Chronic renal disease (CRD) associated with cardiovascular disease (CVD) and/or type 2 diabetes (T2D) is a significant problem in Aboriginal Australians. Whole exome sequencing data (N = 72) showed enrichment for ClinVar pathogenic variants in gene sets/pathways linking lipoprotein, lipid and glucose metabolism. The top Ingenuity Pathway Analysis canonical pathways were Farnesoid X Receptor and Retinoid Receptor (FXR/RXR; P = 1.86 × 10⁻⁷), Liver X Receptor and Retinoid Receptor (LXR/RXR; P = 2.88 × 10⁻⁶), and atherosclerosis signalling (P = 3.80 × 10⁻⁶). Top pathways/processes identified using Enrichr included: Reactome 2016 chylomicron-mediated lipid transport (P = 3.55 × 10⁻⁷); Wiki 2016 statin (P = 8.29 × 10⁻⁸); GO Biological Processes 2017 chylomicron remodelling (P = 1.92 × 10⁻⁸). ClinVar arylsulfatase A pseudodeficiency (ARSA-PD) pathogenic variants were common, including the missense variant c.511G>A (p.Asp171Asn; rs74315466; frequency 0.44) only reported in Polynesians. This variant is in cis with known ARSA-PD 3′ regulatory c.*96A>G (rs6151429; frequency 0.47) and missense c.1055A>G (p.Asn352Ser; rs2071421; frequency 0.47) variants. These latter two variants are associated with T2D (risk haplotype GG; odds ratio 2.67; 95% CI 2.32–3.08; P = 2.43 × 10⁻⁴) in genome-wide association data (N = 402), but are more strongly associated with quantitative traits (DBP, SBP, ACR, eGFR) for hypertension and renal function in non-diabetic than diabetic subgroups. Traits associated with CVD, CRD and T2D in Aboriginal Australians provide novel insight into function of ARSA-PD variants.
Publisher: Oxford University Press (OUP)
Date: 18-11-2011
DOI: 10.1093/BIOINFORMATICS/BTQ614
Abstract: Motivation: The sequence alignment/map format (SAM) is a commonly used format to store the alignments between millions of short reads and a reference genome. Often certain positions within the reads are inherently more likely to contain errors due to the protocols used to prepare the samples. Such biases can have adverse effects on both mapping rate and accuracy. To understand the relationship between potential protocol biases and poor mapping we wrote SAMstat, a simple C program plotting nucleotide overrepresentation and other statistics in mapped and unmapped reads in a concise html page. Collecting such statistics also makes it easy to highlight problems in the data processing and enables non-experts to track data quality over time. Results: We demonstrate that studying sequence features in mapped data can be used to identify biases particular to one sequencing protocol. Once identified, such biases can be considered in the downstream analysis or even be removed by read trimming or filtering techniques. Availability: SAMStat is open source and freely available as a C program running on all Unix-compatible platforms. The source code is available from samstat.sourceforge.net. Contact: timolassmann@gmail.com
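The kind of statistic this abstract describes, nucleotide composition by read position split into mapped and unmapped reads, can be illustrated with a minimal Python sketch. This is not SAMstat's C implementation: the function names and the toy records are hypothetical, and the sketch assumes uncompressed SAM text, treats FLAG bit 0x4 as "unmapped", and does not re-orient reads aligned to the reverse strand.

```python
from collections import Counter

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def position_composition(sam_lines):
    """Count bases at each read position, separately for mapped and unmapped reads."""
    counts = {"mapped": [], "unmapped": []}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, seq = int(fields[1]), fields[9]  # FLAG is column 2, SEQ is column 10
        if seq == "*":
            continue  # sequence not stored for this record
        group = "unmapped" if flag & UNMAPPED_FLAG else "mapped"
        per_pos = counts[group]
        while len(per_pos) < len(seq):
            per_pos.append(Counter())  # grow to the longest read seen so far
        for i, base in enumerate(seq):
            per_pos[i][base] += 1
    return counts


def frequencies(per_pos):
    """Convert per-position base counts into per-position base fractions."""
    return [{b: n / sum(c.values()) for b, n in c.items()} for c in per_pos]


# Hypothetical toy records: two mapped reads and one unmapped read.
toy_sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGA\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tGGGG\tIIII",
]
comp = position_composition(toy_sam)
mapped_freq = frequencies(comp["mapped"])
```

Comparing the mapped and unmapped tables position by position is one simple way to spot a base that is overrepresented at a given read position in only one of the two groups.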
Publisher: Frontiers Media SA
Date: 14-10-2020
Publisher: Oxford University Press (OUP)
Date: 07-09-2009
DOI: 10.1093/BIOINFORMATICS/BTP527
Abstract: Motivation: Next-generation parallel sequencing technologies produce large quantities of short sequence reads. Due to experimental procedures various types of artifacts are commonly sequenced alongside the targeted RNA or DNA sequences. Identification of such artifacts is important during the development of novel sequencing assays and for the downstream analysis of the sequenced libraries. Results: Here we present TagDust, a program identifying artifactual sequences in large sequencing runs. Given a user-defined cutoff for the false discovery rate, TagDust identifies all reads explainable by combinations and partial matches to known sequences used during library preparation. We demonstrate the quality of our method on sequencing runs performed on Illumina's Genome Analyzer platform. Availability: Executables and documentation are available from genome.gsc.riken.jp/osc/english/software/. Contact: timolassmann@gmail.com
Publisher: Springer Science and Business Media LLC
Date: 31-10-2017
Abstract: Rhesus macaque was the second non-human primate whose genome has been fully sequenced and is one of the most used model organisms to study human biology and disease, thanks to the close evolutionary relationship between the two species. But compared to human, where several previously unknown RNAs have been uncovered, the macaque transcriptome is less studied. Publicly available RNA expression resources for macaque are limited, even for brain, which is highly relevant to study human cognitive abilities. In an effort to complement those resources, FANTOM5 profiled 15 distinct anatomical regions of the aged macaque central nervous system using Cap Analysis of Gene Expression, a high-resolution, annotation-independent technology that allows monitoring of transcription initiation events with high accuracy. We identified 25,869 CAGE peaks, representing bona fide promoters. For each peak we provide detailed annotation, expanding the landscape of ‘known’ macaque genes, and we show concrete examples on how to use the resulting data. We believe this data represents a useful resource to understand the central nervous system in macaque.
Publisher: Elsevier BV
Date: 03-2019
DOI: 10.1016/J.CELL.2019.02.040
Abstract: The introduction of exome sequencing in the clinic has sparked tremendous optimism for the future of rare disease diagnosis, and there is exciting opportunity to further leverage these advances. To provide diagnostic clarity to all of these patients, however, there is a critical need for the field to develop and implement strategies to understand the mechanisms underlying all rare diseases and translate these to clinical care.
Publisher: Springer Science and Business Media LLC
Date: 21-11-2019
DOI: 10.1038/S41467-019-13345-5
Abstract: Whole genome and exome sequencing is a standard tool for the diagnosis of patients suffering from rare and other genetic disorders. The interpretation of the tens of thousands of variants returned from such tests remains a major challenge. Here we focus on the problem of prioritising variants with respect to the observed disease phenotype. We hypothesise that linking patterns of gene expression across multiple tissues to the phenotypes will aid in discovering disease causing variants. To test this, we construct classifiers that learn associations between tissue-specific gene expression and disease phenotypes. We find that using Genotype-Tissue Expression project (GTEx) expression data in conjunction with disease agnostic variant prioritisation methods (CADD or MetaSVM) results in consistent improvements in classification accuracy. Our method represents a previously overlooked avenue of utilising existing expression data for clinical diagnostics, and also opens the door to use of other functional genomic data sets in the same manner.
Publisher: Oxford University Press (OUP)
Date: 09-12-2005
DOI: 10.1093/NAR/GKI1020
Publisher: The Royal Society
Date: 08-2018
DOI: 10.1098/RSOB.180011
Abstract: The promoters of immediate early genes (IEGs) are rapidly activated in response to an external stimulus. These genes, also known as primary response genes, have been identified in a range of cell types, under diverse extracellular signals and using varying experimental protocols. Whereas genomic dissection on a case-by-case basis has not resulted in a comprehensive catalogue of IEGs, a rigorous meta-analysis of eight genome-wide FANTOM5 CAGE (cap analysis of gene expression) time course datasets reveals successive waves of promoter activation in IEGs, recapitulating known relationships between cell types and stimuli: we obtain a set of 57 (42 protein-coding) candidate IEGs possessing promoters that consistently drive a rapid but transient increase in expression over time. These genes show significant enrichment for known IEGs reported previously, pathways associated with the immediate early response, and include a number of non-coding RNAs with roles in proliferation and differentiation. Surprisingly, we also find strong conservation of the ordering of activation for these genes, such that 77 pairwise promoter activation orderings are conserved. Using the leverage of comprehensive CAGE time series data across cell types, we also document the extensive alternative promoter usage by such genes, which is likely to have been a barrier to their discovery until now. The common activation ordering of the core set of early-responding genes we identify may indicate conserved underlying regulatory mechanisms. By contrast, the considerably larger number of transiently activated genes that are specific to each cell type and stimulus illustrates the breadth of the primary response.
Publisher: American Diabetes Association
Date: 11-2016
DOI: 10.2337/DB16-0631
Abstract: White adipose tissue (WAT) can develop into several phenotypes with different pathophysiological impact on type 2 diabetes. To better understand the adipogenic process, the transcriptional events that occur during in vitro differentiation of human adipocytes were investigated and the findings linked to WAT phenotypes. Single-molecule transcriptional profiling provided a detailed map of the expressional changes of genes, enhancers, and long noncoding RNAs, where different types of transcripts share common dynamics during differentiation. Common signatures include early downregulated, transient, and late induced transcripts, all of which are linked to distinct developmental processes during adipogenesis. Enhancers expressed during adipogenesis overlap significantly with genetic variants associated with WAT distribution. Transiently expressed and late induced genes are associated with hypertrophic WAT (few but large fat cells), a phenotype closely linked to insulin resistance and type 2 diabetes. Transcription factors that are expressed early or transiently affect differentiation and adipocyte function and are controlled by several well-known upstream regulators such as glucocorticosteroids, insulin, cAMP, and thyroid hormones. Taken together, our results suggest a complex but highly coordinated regulation of adipogenesis.
Publisher: Cold Spring Harbor Laboratory
Date: 11-04-2017
DOI: 10.1101/126474
Abstract: The FANTOM5 expression atlas is a quantitative measurement of the activity of nearly 200,000 promoter regions across nearly 2,000 different human primary cells, tissue types and cell lines. Generation of this atlas was made possible by the use of CAGE, an experimental approach to localise transcription start sites at single-nucleotide resolution by sequencing the 5′ ends of capped RNAs after their conversion to cDNAs. While 50% of CAGE-defined promoter regions could be confidently associated to adjacent transcriptional units, nearly 100,000 promoter regions remained gene-orphan. To address this, we used the CAGEscan method, in which random-primed 5′-cDNAs are paired-end sequenced. Pairs starting in the same region are assembled in transcript models called CAGEscan clusters. Here, we present the production and quality control of CAGEscan libraries from 56 FANTOM5 RNA sources, which enhances the FANTOM5 expression atlas by providing experimental evidence associating core promoter regions with their cognate transcripts.
Publisher: Springer Science and Business Media LLC
Date: 03-2017
DOI: 10.1038/NATURE21374
Publisher: Research Square Platform LLC
Date: 24-09-2021
DOI: 10.21203/RS.3.RS-892399/V1
Abstract: Little is known about the dynamic biological events that underpin therapeutic efficacy in immune checkpoint blockade (ICB) in cancer, due to the inability to frequently sample tumors in patients. Here, we mapped the transcriptional profiles of 144 responding and non-responding tumors within two mouse models at four time points during ICB. We found that responding tumors displayed on/fast-off kinetics of type-I-interferon (IFN) signaling. Phenocopying of this kinetics using time-dependent sequential dosing of recombinant IFNs and neutralizing antibodies markedly improved ICB efficacy, but only when IFNβ was targeted, not IFNα. We identified Ly6C+/CD11b+ inflammatory monocytes as the primary source of IFNβ and found that active type-I-IFN signaling in tumor-infiltrating inflammatory monocytes was associated with T cell expansion in patients treated with ICB. Together, our results suggest that on/fast-off modulation of IFNβ signaling is critical to the therapeutic response to ICB, which can be exploited to drive clinical outcomes towards response.
Publisher: Springer Science and Business Media LLC
Date: 29-04-2020
DOI: 10.1038/S41597-020-0463-1
Abstract: Whole exome sequencing (WES) is a popular and successful technology which is widely used in both research and clinical settings. However, there is a paucity of reference data for Aboriginal Australians to underpin the translation of health-based genomic research. Here we provide a catalogue of variants called after sequencing the exomes of 50 Aboriginal individuals from the Northern Territory (NT) of Australia and compare these to 72 previously published exomes from a Western Australian (WA) population of Martu origin. Sequence data for both NT and WA samples were processed using an ‘intersect-then-combine’ (ITC) approach, using GATK and SAMtools to call variants. A total of 289,829 variants were identified in at least one individual in the NT cohort and 248,374 variants in at least one individual in the WA cohort. Of these, 166,719 variants were present in both cohorts, whilst 123,110 variants were private to the NT cohort and 81,655 were private to the WA cohort. Our data set provides a useful reference point for genomic studies on Aboriginal Australians.
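The shared and private counts in this abstract follow from simple set arithmetic on the two call sets, which the Python sketch below illustrates. The function name `compare_cohorts` and the toy variant keys are hypothetical; only the five counts are taken from the abstract, and the sketch does not reproduce the authors' GATK/SAMtools pipeline.

```python
def compare_cohorts(cohort_a, cohort_b):
    """Split two variant sets into shared, private-to-A and private-to-B."""
    shared = cohort_a & cohort_b
    return shared, cohort_a - shared, cohort_b - shared


# Hypothetical toy variants keyed as (chromosome, position, ref, alt).
nt_toy = {("1", 100, "A", "G"), ("2", 200, "C", "T"), ("3", 300, "G", "A")}
wa_toy = {("2", 200, "C", "T"), ("4", 400, "T", "C")}
shared_toy, nt_only_toy, wa_only_toy = compare_cohorts(nt_toy, wa_toy)

# Counts quoted in the abstract: each cohort total is shared + private.
NT_TOTAL, WA_TOTAL = 289_829, 248_374
SHARED, NT_PRIVATE, WA_PRIVATE = 166_719, 123_110, 81_655

nt_consistent = NT_TOTAL == SHARED + NT_PRIVATE
wa_consistent = WA_TOTAL == SHARED + WA_PRIVATE

# Distinct variants across both cohorts (size of the union).
union_size = SHARED + NT_PRIVATE + WA_PRIVATE
```

Under these figures both cohort totals are consistent with the reported partition, and the union of the two call sets comes to 371,484 distinct variants.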
Publisher: Cold Spring Harbor Laboratory
Date: 06-11-2019
DOI: 10.1101/832444
Abstract: Single-cell and single-nucleus RNA sequencing have been widely adopted in studies of heterogeneous tissues to estimate their cellular composition and obtain transcriptional profiles of individual cells. However, the current fragmentary understanding of artefacts introduced by sample preparation protocols impedes the selection of optimal workflows and compromises data interpretation. To bridge this gap, we compared performance of several workflows applied to adult mouse kidneys. Our study encompasses two tissue dissociation protocols, two cell preservation methods, bulk tissue RNA sequencing, single-cell and three single-nucleus RNA sequencing workflows for the 10x Genomics Chromium platform. These experiments enable a systematic comparison of recovered cell types and their transcriptional profiles across the workflows and highlight protocol-specific biases important for the experimental design and data interpretation.
Publisher: Cold Spring Harbor Laboratory
Date: 19-05-2011
Abstract: We report the development of a simplified cap analysis of gene expression (CAGE) protocol adapted for single-molecule sequencers that avoids second strand synthesis, ligation, digestion, and PCR. HeliScopeCAGE directly sequences the 3′ end of cap trapped first-strand cDNAs. As with previous versions of CAGE, we better define transcription start sites (TSS) than known models, identify novel regions of transcription and alternative promoters, and find two major classes of TSS signal, sharp peaks and broad regions. However, using this protocol, we observe reproducible evidence of regulation at the much finer level of individual TSS positions. The libraries are quantitative over 5 orders of magnitude and highly reproducible (Pearson's correlation coefficient of 0.987). We have also scaled down the sample requirement to 5 μg of total RNA for a standard HeliScopeCAGE library and 100 ng for a low-quantity version. When the same RNA was run as 5-μg and 100-ng versions, the 100 ng was still able to detect expression for ∼60% of the 13,468 loci detected by a 5-μg library using the same threshold, allowing comparative analysis of even rare cell populations. Testing the protocol for differential gene expression measurements on triplicate HeLa and THP-1 samples, we find that the log fold change compared to Illumina microarray measurements is highly correlated (0.871). In addition, HeliScopeCAGE finds differential expression for thousands more loci including those with probes on the array. Finally, although the majority of tags are 5′ associated, we also observe a low level of signal on exons that is useful for defining gene structures.
Publisher: Elsevier BV
Date: 04-2010
Publisher: Springer Science and Business Media LLC
Date: 11-01-2018
Publisher: Springer Science and Business Media LLC
Date: 28-11-2017
Abstract: The promoter landscape of several non-human model organisms is far from complete. As a part of FANTOM5 data collection, we generated 13 profiles of transcription initiation activities in dog and rat aortic smooth muscle cells, mesenchymal stem cells and hepatocytes by employing CAGE (Cap Analysis of Gene Expression) technology combined with single molecule sequencing. Our analyses show that the CAGE profiles recapitulate known transcription start sites (TSSs) consistently, in addition to uncovering novel TSSs. Our dataset can thus be used with high confidence to support gene annotation in dog and rat species. We identified 28,497 and 23,147 CAGE peaks, or promoter regions, for rat and dog respectively, and associated them to known genes. This approach could be seen as a standard method for improvement of existing gene models, as well as discovery of novel genes. Given that the FANTOM5 data collection includes dog and rat matched cell types in human and mouse as well, this data would also be useful for cross-species studies.
Publisher: Cold Spring Harbor Laboratory
Date: 27-05-2009
DOI: 10.1261/RNA.1528909
Abstract: Small nucleolar RNAs (snoRNAs) guide RNA modification and are localized in nucleoli and Cajal bodies in eukaryotic cells. Components of the RNA silencing pathway associate with these structures, and two recent reports have revealed that a human and a protozoan snoRNA can be processed into miRNA-like RNAs. Here we show that small RNAs with evolutionary conservation of size and position are derived from the vast majority of snoRNA loci in animals (human, mouse, chicken, fruit fly), Arabidopsis, and fission yeast. In animals, sno-derived RNAs (sdRNAs) from H/ACA snoRNAs are predominantly 20–24 nucleotides (nt) in length and originate from the 3′ end. Those derived from C/D snoRNAs show a bimodal size distribution at ∼17–19 nt and >27 nt and predominantly originate from the 5′ end. SdRNAs are associated with AGO7 in Arabidopsis and Ago1 in fission yeast with characteristic 5′ nucleotide biases and show altered expression patterns in fly loquacious and Dicer-2 and mouse Dicer1 and Dgcr8 mutants. These findings indicate that there is interplay between the RNA silencing and snoRNA-mediated RNA processing systems, and that sdRNAs comprise a novel and ancient class of small RNAs in eukaryotes.
Publisher: Oxford University Press (OUP)
Date: 25-02-2015
DOI: 10.1189/JLB.6TA1014-477RR
Abstract: The generation of myeloid cells from their progenitors is regulated at the level of transcription by combinatorial control of key transcription factors influencing cell-fate choice. To unravel the global dynamics of this process at the transcript level, we generated transcription profiles for 91 human cell types of myeloid origin by use of CAGE profiling. The CAGE sequencing of these samples has allowed us to investigate diverse aspects of transcription control during myelopoiesis, such as identification of novel transcription factors, miRNAs, and noncoding RNAs specific to the myeloid lineage. We further reconstructed a transcription regulatory network by clustering coexpressed transcripts and associating them with enriched cis-regulatory motifs. With the use of the bidirectional expression as a proxy for enhancers, we predicted over 2000 novel enhancers, including an enhancer 38 kb downstream of IRF8 and an intronic enhancer in the KIT gene locus. Finally, we highlighted relevance of these data to dissect transcription dynamics during progressive maturation of granulocyte precursors. A multifaceted analysis of the myeloid transcriptome is made available (www.myeloidome.roslin.ed.ac.uk). This high-quality dataset provides a powerful resource to study transcriptional regulation during myelopoiesis and to infer the likely functions of unannotated genes in human innate immunity.
Publisher: Springer Science and Business Media LLC
Date: 06-01-2018
DOI: 10.1007/S12311-017-0912-3
Abstract: Laser-capture microdissection was used to isolate external germinal layer tissue from three developmental periods of mouse cerebellar development: embryonic days 13, 15, and 18. The cerebellar granule cell-enriched mRNA library was generated with next-generation sequencing using the Helicos technology. Our objective was to discover transcriptional regulators that could be important for the development of cerebellar granule cells, the most numerous neurons in the central nervous system. Through differential expression analysis, we have identified 82 differentially expressed transcription factors (TFs) from a total of 1311 differentially expressed genes. In addition, with TF-binding sequence analysis, we have identified 46 TF candidates that could be key regulators responsible for the variation in the granule cell transcriptome between developmental stages. Altogether, we identified 125 potential TFs (82 from differential expression analysis, 46 from motif analysis with 3 overlaps in the two sets). From this gene set, 37 TFs are considered novel due to the lack of previous knowledge about their roles in cerebellar development. The results from transcriptome-wide analyses were validated with existing online databases, qRT-PCR, and in situ hybridization. This study provides an initial insight into the TFs of cerebellar granule cells that might be important for development and provides valuable information for further functional studies on these transcriptional regulators.
Publisher: Springer Science and Business Media LLC
Date: 12-2005
Abstract: The alignment of multiple protein sequences is a fundamental step in the analysis of biological data. It has traditionally been applied to analyzing protein families for conserved motifs, phylogeny, structural properties, and to improve sensitivity in homology searching. The availability of complete genome sequences has increased the demands on multiple sequence alignment (MSA) programs. Current MSA methods suffer from being either too inaccurate or too computationally expensive to be applied effectively in large-scale comparative genomics. We developed Kalign, a method employing the Wu-Manber string-matching algorithm, to improve both the accuracy and speed of multiple sequence alignment. We compared the speed and accuracy of Kalign to other popular methods using Balibase, Prefab, and a new large test set. Kalign was as accurate as the best other methods on small alignments, but significantly more accurate when aligning large and distantly related sets of sequences. In our comparisons, Kalign was about 10 times faster than ClustalW and, depending on the alignment size, up to 50 times faster than popular iterative methods. Kalign is a fast and robust alignment method. It is especially well suited for the increasingly important task of aligning large numbers of sequences.
Publisher: Oxford University Press (OUP)
Date: 2016
Abstract: Genomics consortia have produced large datasets profiling the expression of genes, micro-RNAs, enhancers and more across human tissues or cells. There is a need for intuitive tools to select subsets of such data that are most relevant for specific studies. To this end, we present SlideBase, a web tool which offers a new way of selecting genes, promoters, enhancers and microRNAs that are preferentially expressed/used in a specified set of cells/tissues, based on the use of interactive sliders. With the help of sliders, SlideBase enables users to define custom expression thresholds for individual cell types/tissues, producing sets of genes, enhancers etc. which satisfy these constraints. Changes in slider settings result in simultaneous changes in the selected sets, updated in real time. SlideBase is linked to major databases from genomics consortia, including FANTOM, GTEx, The Human Protein Atlas and BioGPS. Database URL: slidebase.binf.ku.dk
Publisher: Springer Science and Business Media LLC
Date: 19-08-2022
DOI: 10.1038/S41467-022-32567-8
Abstract: The biological determinants of the response to immune checkpoint blockade (ICB) in cancer remain incompletely understood. Little is known about dynamic biological events that underpin therapeutic efficacy due to the inability to frequently sample tumours in patients. Here, we map the transcriptional profiles of 144 responding and non-responding tumours within two mouse models at four time points during ICB. We find that responding tumours display on/fast-off kinetics of type-I-interferon (IFN) signaling. Phenocopying of this kinetics using time-dependent sequential dosing of recombinant IFNs and neutralizing antibodies markedly improves ICB efficacy, but only when IFNβ is targeted, not IFNα. We identify Ly6C+/CD11b+ inflammatory monocytes as the primary source of IFNβ and find that active type-I-IFN signaling in tumour-infiltrating inflammatory monocytes is associated with T cell expansion in patients treated with ICB. Together, our results suggest that on/fast-off modulation of IFNβ signaling is critical to the therapeutic response to ICB, which can be exploited to drive clinical outcomes towards response.
Publisher: Springer Science and Business Media LLC
Date: 07-2009
DOI: 10.1038/NG0709-859A
Publisher: Springer Science and Business Media LLC
Date: 19-04-2009
DOI: 10.1038/NG.312
Abstract: It has been reported that relatively short RNAs of heterogeneous sizes are derived from sequences near the promoters of eukaryotic genes. In conjunction with the FANTOM4 project, we have identified tiny RNAs with a modal length of 18 nt that map within -60 to +120 nt of transcription start sites (TSSs) in human, chicken and Drosophila. These transcription initiation RNAs (tiRNAs) are derived from sequences on the same strand as the TSS and are preferentially associated with G+C-rich promoters. The 5' ends of tiRNAs show peak density 10-30 nt downstream of TSSs, indicating that they are processed. tiRNAs are generally, although not exclusively, associated with highly expressed transcripts and sites of RNA polymerase II binding. We suggest that tiRNAs may be a general feature of transcription in metazoa and possibly all eukaryotes.
Publisher: Springer Science and Business Media LLC
Date: 30-04-2018
DOI: 10.1038/S41598-018-24509-6
Abstract: Mycobacterium tuberculosis (Mtb) infection reveals complex and dynamic host-pathogen interactions, leading to host protection or pathogenesis. Using a unique transcriptome technology (CAGE), we investigated the promoter-based transcriptional landscape of IFNγ (M1) or IL-4/IL-13 (M2) stimulated macrophages during Mtb infection in a time-kinetic manner. Mtb infection widely and drastically altered macrophage-specific gene expression, which is far larger than that of M1 or M2 activations. Gene Ontology enrichment analysis for Mtb-induced differentially expressed genes revealed various terms, related to host-protection and inflammation, enriched in up-regulated genes. On the other hand, terms related to dis-regulation of cellular functions were enriched in down-regulated genes. Differential expression analysis revealed known as well as novel transcription factor genes in Mtb infection, many of them significantly down-regulated. IFNγ or IL-4/IL-13 pre-stimulation induce additional differentially expressed genes in Mtb-infected macrophages. Cluster analysis uncovered significant numbers, prolonging their expressional changes. Furthermore, Mtb infection augmented cytokine-mediated M1 and M2 pre-activations. In addition, we identified unique transcriptional features of Mtb-mediated differentially expressed lncRNAs. In summary we provide a comprehensive in depth gene expression/regulation profile in Mtb-infected macrophages, an important step forward for a better understanding of host-pathogen interaction dynamics in Mtb infection.
Publisher: Oxford University Press (OUP)
Date: 2023
DOI: 10.1093/BIOINFORMATICS/BTAD019
Abstract: SAMStat is an efficient program to extract quality control metrics from fastq and SAM/BAM files. A distinguishing feature is that it displays sequence composition, base quality composition and mapping error profiles split by mapping quality. This allows users to rapidly identify reasons for poor mapping including the presence of untrimmed adapters or poor sequencing quality at individual read positions. Here, we present a major update to SAMStat. The new version now supports paired-end and long-read data. Quality control plots are drawn using the plotly javascript library. The source code of SAMStat and code to reproduce the results are found here: imolassmann/samstat.
Publisher: Public Library of Science (PLoS)
Date: 19-04-2011
Publisher: American Association for Cancer Research (AACR)
Date: 14-01-2016
DOI: 10.1158/0008-5472.CAN-15-0484
Abstract: Genes that are commonly deregulated in cancer are clinically attractive as candidate pan-diagnostic markers and therapeutic targets. To globally identify such targets, we compared Cap Analysis of Gene Expression profiles from 225 different cancer cell lines and 339 corresponding primary cell samples to identify transcripts that are deregulated recurrently in a broad range of cancer types. Comparing RNA-seq data from 4,055 tumors and 563 normal tissues profiled in The Cancer Genome Atlas and FANTOM5 datasets, we identified a core transcript set with theranostic potential. Our analyses also revealed enhancer RNAs, which are upregulated in cancer, defining promoters that overlap with repetitive elements (especially SINE/Alu and LTR/ERV1 elements) that are often upregulated in cancer. Lastly, we documented for the first time upregulation of multiple copies of the REP522 interspersed repeat in cancer. Overall, our genome-wide expression profiling approach identified a comprehensive set of candidate biomarkers with pan-cancer potential, and extended the perspective and pathogenic significance of repetitive elements that are frequently activated during cancer progression. Cancer Res 76(2) 216–26. ©2015 AACR.
Publisher: American Society of Hematology
Date: 24-04-2014
DOI: 10.1182/BLOOD-2013-02-482893
Abstract: In granulopoiesis, changes in DNA methylation preferably occur at points of lineage restriction in low CpG areas. DNA methylation is dynamic in enhancer elements and appears to regulate the expression of key transcription factors and neutrophil genes.
Publisher: American Association for the Advancement of Science (AAAS)
Date: 17-07-2019
DOI: 10.1126/SCITRANSLMED.AAV7816
Abstract: A STAT1-driven inflammatory phenotype associated with response to checkpoint blocking antibodies sensitizes cancers to immunotherapy.
Publisher: Springer Science and Business Media LLC
Date: 21-01-2019
DOI: 10.1038/S41467-018-08126-5
Abstract: Single-cell transcriptomic profiling is a powerful tool to explore cellular heterogeneity. However, most of these methods focus on the 3′-end of polyadenylated transcripts and provide only a partial view of the transcriptome. We introduce C1 CAGE, a method for the detection of transcript 5′-ends with an original sample multiplexing strategy in the C1™ microfluidic system. We first quantify the performance of C1 CAGE and find it as accurate and sensitive as other methods in the C1 system. We then use it to profile promoter and enhancer activities in the cellular response to TGF-β of lung cancer cells and discover subpopulations of cells differing in their response. We also describe enhancer RNA dynamics revealing transcriptional bursts in subsets of cells with transcripts arising from either strand in a mutually exclusive manner, validated using single molecule fluorescence in situ hybridization.
Publisher: Springer Science and Business Media LLC
Date: 21-10-2020
DOI: 10.1038/S41597-020-00703-Y
Abstract: Exome sequencing is widely used in the diagnosis of rare genetic diseases and provides useful variant data for analysis of complex diseases. There is not always adequate population-specific reference data to assist in assigning a diagnostic variant to a specific clinical condition. Here we provide a catalogue of variants called after sequencing the exomes of 45 babies from Rio Grande do Norte in Brazil. Sequence data were processed using an ‘intersect-then-combine’ (ITC) approach, using GATK and SAMtools to call variants. A total of 612,761 variants were identified in at least one individual in this Brazilian cohort, including 559,448 single nucleotide variants (SNVs) and 53,313 insertions/deletions. Of these, 58,111 overlapped with nonsynonymous (nsSNVs) or splice site (ssSNVs) SNVs in dbNSFP. As an aid to clinical diagnosis of rare diseases, we used the American College of Medical Genetics and Genomics (ACMG) guidelines to assign pathogenic/likely pathogenic status to 185 (0.32%) of the 58,111 nsSNVs and ssSNVs. Our data set provides a useful reference point for diagnosis of rare diseases in Brazil.
Publisher: Springer Science and Business Media LLC
Date: 13-06-2010
DOI: 10.1038/NMETH.1470
Publisher: Oxford University Press (OUP)
Date: 09-06-2016
DOI: 10.1093/BIOINFORMATICS/BTW337
Abstract: With the emergence of large-scale Cap Analysis of Gene Expression (CAGE) datasets from individual labs and the FANTOM consortium, one can now analyze the cis-regulatory regions associated with gene transcription at an unprecedented level of refinement. By coupling transcription factor binding site (TFBS) enrichment analysis with CAGE-derived genomic regions, CAGEd-oPOSSUM can identify TFs that act as key regulators of genes involved in specific mammalian cell and tissue types. The webtool allows for the analysis of CAGE-derived transcription start sites (TSSs) either provided by the user or selected from ∼1300 mammalian samples from the FANTOM5 project with pre-computed TFBS predicted with JASPAR TF binding profiles. The tool helps power insights into the regulation of genes through the study of the specific usage of TSSs within specific cell types and/or under specific conditions. Availability and Implementation: The CAGEd-oPOSSUM web tool is implemented in Perl, MySQL and Apache and is available at cagedop.cmmt.ubc.ca/CAGEd_oPOSSUM. Contacts: anthony.mathelier@ncmm.uio.no or wyeth@cmmt.ubc.ca Supplementary information: Supplementary data are available at Bioinformatics online.
Publisher: Wiley
Date: 14-08-2002
DOI: 10.1016/S0014-5793(02)03189-7
Abstract: A renewed interest in the multiple sequence alignment problem has given rise to several new algorithms. In contrast to traditional progressive methods, computationally expensive score optimization strategies are now predominantly employed. We systematically tested four methods (Poa, Dialign, T-Coffee and ClustalW) for the speed and quality of their alignments. As test sequences we used structurally derived alignments from BAliBASE and synthetic alignments generated by Rose. The tests included alignments of variable numbers of domains embedded in random spacer sequences. Overall, Dialign was the most accurate in cases with low sequence identity, while T-Coffee won in cases with high sequence identity. The fast Poa algorithm was almost as accurate, while ClustalW could compete only in strictly global cases with high sequence similarity.
Publisher: BMJ
Date: 09-2021
DOI: 10.1136/BMJOPEN-2021-053720
Abstract: The absence of a diagnostic test for acute rheumatic fever (ARF) is a major impediment in managing this serious childhood condition. ARF is an autoimmune condition triggered by infection with group A Streptococcus. It is the precursor to rheumatic heart disease (RHD), a leading cause of health inequity and premature mortality for Indigenous peoples of Australia, New Zealand and internationally. ‘Searching for a Technology-Driven Acute Rheumatic Fever Test’ (START) is a biomarker discovery study that aims to detect and test a biomarker signature that distinguishes ARF cases from non-ARF, and use systems biology and serology to better understand ARF pathogenesis. Eligible participants with ARF diagnosed by an expert clinical panel according to the 2015 Revised Jones Criteria, aged 5–30 years, will be recruited from three hospitals in Australia and New Zealand. Age, sex and ethnicity-matched individuals who are healthy or have non-ARF acute diagnoses or RHD, will be recruited as controls. In the discovery cohort, blood samples collected at baseline, and during convalescence in a subset, will be interrogated by comprehensive profiling to generate possible diagnostic biomarker signatures. A biomarker validation cohort will subsequently be used to test promising combinations of biomarkers. By defining the first biomarker signatures able to discriminate between ARF and other clinical conditions, the START study has the potential to transform the approach to ARF diagnosis and RHD prevention. The study has approval from the Northern Territory Department of Health and Menzies School of Health Research ethics committee and the New Zealand Health and Disability Ethics Committee. It will be conducted according to ethical standards for research involving Indigenous Australians and New Zealand Māori and Pacific Peoples. Indigenous investigators and governance groups will provide oversight of study processes and advise on cultural matters.
Publisher: Cold Spring Harbor Laboratory
Date: 21-03-2016
DOI: 10.1101/044172
Abstract: Sex differences in susceptibility and progression have been reported in numerous diseases. Female cells have two copies of the X chromosome with X-chromosome inactivation imparting mono-allelic gene silencing for dosage compensation. However, a subset of genes, named escapees, escape silencing and are transcribed bi-allelically resulting in sexual dimorphism. Here we conducted analyses of the sexes using human datasets to gain perspectives in such regulation. We identified transcription start sites of escapees (escTSSs) based on higher transcription levels in female cells using FANTOM5 CAGE data. Significant over-representations of YY1 transcription factor binding motif and ChIP-seq peaks around escTSSs highlighted its positive association with escapees. Furthermore, YY1 occupancy is significantly biased towards the inactive X (Xi) at long non-coding RNA loci that are frequent contacts of Xi-specific superloops. Our study elucidated the importance of YY1 on transcriptional activity on Xi in general through sequence-specific binding, and its involvement at superloop anchors.
Publisher: Oxford University Press (OUP)
Date: 14-10-2014
DOI: 10.1002/STEM.1791
Abstract: Mesenchymal stem/stromal cells (MSCs) are the precursors of various cell types that compose both normal and cancer tissue microenvironments. In order to support the widely diversified parenchymal cells and tissue organization, MSCs are characterized by a large degree of heterogeneity, although available analyses of molecular and transcriptional data do not provide clear evidence. We have isolated MSCs from high-grade serous ovarian cancers (HG-SOCs) and various normal tissues (N-MSCs), demonstrated their normal genotype and analyzed their transcriptional activity with respect to the large comprehensive FANTOM5 sample dataset. Our integrative analysis conducted against the extensive panel of primary cells and tissues of the FANTOM5 project allowed us to mark the HG-SOC-MSCs CAGE-seq transcriptional heterogeneity and to identify a cell-type-specific transcriptional activity showing a significant relationship with primary mesothelial cells. Our analysis shows that MSCs isolated from different tissues are highly heterogeneous. The mesothelial-related gene signature identified in this study supports the hypothesis that HG-SOC-MSCs are bona fide representatives of the ovarian district. This finding indicates that HG-SOC-MSCs could actually derive from the coelomic mesothelium, suggesting that they might be linked to the epithelial tumor through common embryological precursors. Stem Cells 2014 :2998–3011
Publisher: Springer Science and Business Media LLC
Date: 03-2014
DOI: 10.1038/NATURE13182
Publisher: Springer Science and Business Media LLC
Date: 05-11-2020
DOI: 10.1038/S41598-020-76157-4
Abstract: The bone marrow microenvironment (BMM) plays a key role in leukemia progression, but its molecular complexity in pre-B cell acute lymphoblastic leukemia (B-ALL), the most common cancer in children, remains poorly understood. To gain further insight, we used single-cell RNA sequencing to characterize the kinetics of the murine BMM during B-ALL progression. Normal pro- and pre-B cells were found to be the most affected at the earliest stages of disease and this was associated with changes in expression of genes regulated by the AP1-transcription factor complex and regulatory factors NELFE, MYC and BCL11A. Granulocyte–macrophage progenitors show reduced expression of the tumor suppressor long non-coding RNA Neat1 and disruptions in the rate of transcription. Intercellular communication networks revealed monocyte-dendritic precursors to be consistently active during B-ALL progression, with enriched processes including cytokine-mediated signaling pathway, neutrophil-mediated immunity and regulation of cell migration and proliferation. In addition, we confirmed that the hematopoietic stem and progenitor cell compartment was perturbed during leukemogenesis. These findings extend our understanding of the complexity of changes and molecular interactions among the normal cells of the BMM during B-ALL progression.
Publisher: Springer Science and Business Media LLC
Date: 10-12-2020
DOI: 10.1038/S41525-020-00161-W
Abstract: Exome sequencing has enabled molecular diagnoses for rare disease patients but often with initial diagnostic rates of ~25−30%. Here we develop a robust computational pipeline to rank variants for reassessment of unsolved rare disease patients. A comprehensive web-based patient report is generated in which all deleterious variants can be filtered by gene, variant characteristics, OMIM disease and Phenolyzer scores, and all are annotated with an ACMG classification and links to ClinVar. The pipeline ranked 21/34 previously diagnosed variants as top, with 26 in total ranked ≤7th, 3 ranked ≥13th, and 5 that failed the pipeline filters. Pathogenic/likely pathogenic variants by ACMG criteria were identified for 22/145 unsolved cases, and a previously undefined candidate disease variant for 27/145. This open access pipeline supports the partnership between clinical and research laboratories to improve the diagnosis of unsolved exomes. It provides a flexible framework for iterative developments to further improve diagnosis.
Publisher: Springer Science and Business Media LLC
Date: 16-05-2014
Publisher: Elsevier BV
Date: 05-2022
DOI: 10.1016/J.GENE.2022.146287
Abstract: There are an estimated > 400 million people living with a rare disease globally, with genetic variants the cause of approximately 80% of cases. Next Generation Sequencing (NGS) rapidly identifies genetic variants; however, they are often of unknown significance. Low throughput functional validation in specialist laboratories is the current ad hoc approach for functional validation of genetic variants, which creates major bottlenecks in patient diagnosis. This study investigates the application of CRISPR gene editing followed by genome wide transcriptomic profiling to facilitate patient diagnosis. As proof-of-concept, we introduced a variant in the Euchromatin histone methyl transferase (EHMT1) gene into HEK293T cells. We identified changes in the regulation of the cell cycle, neural gene expression and suppression of gene expression changes on chromosome 19 and chromosome X, that are in keeping with Kleefstra syndrome clinical phenotype and/or provide insight into disease mechanism. This study demonstrates the utility of genome editing followed by functional readouts to rapidly and systematically validate the function of variants of unknown significance in patients suffering from rare diseases.
Publisher: American Association for the Advancement of Science (AAAS)
Date: 27-02-2015
Abstract: In order to understand cellular differentiation, it is important to understand the timing of the regulation of gene expression. Arner et al. used cap analysis of gene expression (CAGE) to analyze gene enhancer and promoter activities in a number of human and mouse cell types. The RNA of enhancers was transcribed first, followed by that of transcription factors, and finally by genes that are not transcription factors. Science , this issue p. 1010
Publisher: Cold Spring Harbor Laboratory
Date: 30-07-2015
Abstract: Promoters are central to the regulation of gene expression. Changes in gene regulation are thought to underlie much of the adaptive diversification between species and phenotypic variation within populations. In contrast to earlier work emphasizing the importance of enhancer evolution and subtle sequence changes at promoters, we show that dramatic changes such as the complete gain and loss (collectively, turnover) of functional promoters are common. Using quantitative measures of transcription initiation in both humans and mice across 52 matched tissues, we discriminate promoter sequence gains from losses and resolve the lineage of changes. We also identify expression divergence and functional turnover between orthologous promoters, finding only the latter is associated with local sequence changes. Promoter turnover has occurred at the majority ( %) of protein-coding genes since humans and mice diverged. Tissue-restricted promoters are the most evolutionarily volatile where retrotransposition is an important, but not the sole, source of innovation. There is considerable heterogeneity of turnover rates between promoters in different tissues, but the consistency of these in both lineages suggests that the same biological systems are similarly inclined to transcriptional rewiring. The genes affected by promoter turnover show evidence of adaptive evolution. In mice, promoters are primarily lost through deletion of the promoter containing sequence, whereas in humans, many promoters appear to be gradually decaying with weak transcriptional output and relaxed selective constraint. Our results suggest that promoter gain and loss is an important process in the evolutionary rewiring of gene regulation and may be a significant source of phenotypic diversification.
Publisher: Springer Science and Business Media LLC
Date: 04-01-2019
Publisher: Springer Science and Business Media LLC
Date: 21-08-2017
DOI: 10.1038/NBT.3947
Publisher: Springer Science and Business Media LLC
Date: 26-03-2014
Abstract: DNA methylation in promoters is closely linked to downstream gene repression. However, whether DNA methylation is a cause or a consequence of gene repression remains an open question. If it is a cause, then DNA methylation may affect the affinity of transcription factors (TFs) for their binding sites (TFBSs). If it is a consequence, then gene repression caused by chromatin modification may be stabilized by DNA methylation. Until now, these two possibilities have been supported only by non-systematic evidence and they have not been tested on a wide range of TFs. An average promoter methylation is usually used in studies, whereas recent results suggested that methylation of individual cytosines can also be important. We found that the methylation profiles of 16.6% of cytosines and the expression profiles of neighboring transcriptional start sites (TSSs) were significantly negatively correlated. We called the CpGs corresponding to such cytosines “traffic lights”. We observed a strong selection against CpG “traffic lights” within TFBSs. The negative selection was stronger for transcriptional repressors as compared with transcriptional activators or multifunctional TFs as well as for core TFBS positions as compared with flanking TFBS positions. Our results indicate that direct and selective methylation of certain TFBS that prevents TF binding is restricted to special cases and cannot be considered as a general regulatory mechanism of transcription.
Publisher: Springer Science and Business Media LLC
Date: 12-2014
Publisher: Springer Science and Business Media LLC
Date: 29-08-2017
Abstract: In the FANTOM5 project, transcription initiation events across the human and mouse genomes were mapped at a single base-pair resolution and their frequencies were monitored by CAGE (Cap Analysis of Gene Expression) coupled with single-molecule sequencing. Approximately three thousand samples, consisting of a variety of primary cells, tissues, cell lines, and time series samples during cell activation and development, were subjected to a uniform pipeline of CAGE data production. The analysis pipeline started by measuring RNA extracts to assess their quality, and continued to CAGE library production by using a robotic or a manual workflow, single molecule sequencing, and computational processing to generate frequencies of transcription initiation. The resulting data represent the consequence of transcriptional regulation in each analyzed state of mammalian cells. Non-overlapping peaks over the CAGE profiles, approximately 200,000 and 150,000 peaks for the human and mouse genomes, were identified and annotated to provide precise location of known promoters as well as novel ones, and to quantify their activities.
Publisher: Public Library of Science (PLoS)
Date: 05-09-2017
Publisher: MDPI AG
Date: 06-01-2015
DOI: 10.3390/IJMS16011192
Publisher: Public Library of Science (PLoS)
Date: 18-12-2015
Publisher: Cold Spring Harbor Laboratory
Date: 09-2012
Abstract: Statistical models have been used to quantify the relationship between gene expression and transcription factor (TF) binding signals. Here we apply the models to the large-scale data generated by the ENCODE project to study transcriptional regulation by TFs. Our results reveal a notable difference in the prediction accuracy of expression levels of transcription start sites (TSSs) captured by different technologies and RNA extraction protocols. In general, the expression levels of TSSs with high CpG content are more predictable than those with low CpG content. For genes with alternative TSSs, the expression levels of downstream TSSs are more predictable than those of the upstream ones. Different TF categories and specific TFs vary substantially in their contributions to predicting expression. Between two cell lines, the differential expression of TSS can be precisely reflected by the difference of TF-binding signals in a quantitative manner, arguing against the conventional on-and-off model of TF binding. Finally, we explore the relationships between TF-binding signals and other chromatin features such as histone modifications and DNase hypersensitivity for determining expression. The models imply that these features regulate transcription in a highly coordinated manner.
Publisher: Oxford University Press (OUP)
Date: 21-03-2016
DOI: 10.1093/NAR/GKW162
Publisher: Springer Science and Business Media LLC
Date: 04-2020
DOI: 10.1038/S41596-020-0299-3
Abstract: The therapeutic response to immune checkpoint blockade (ICB) is highly variable, not only between different cancers but also between patients with the same cancer type. The biological mechanisms underlying these differences in response are incompletely understood. Identifying correlates in patient tumor samples is challenging because of genetic and environmental variability. Murine studies usually compare different tumor models or treatments, introducing potential confounding variables. This protocol describes bilateral murine tumor models, derived from syngeneic cancer cell lines, that display a symmetrical yet dichotomous response to ICB. These models enable detailed analysis of whole tumors in a highly homogeneous background, combined with knowledge of the therapeutic outcome within a few weeks, and could potentially be used for mechanistic studies using other (immuno-)therapies. We discuss key considerations and describe how to use two cell lines as fully optimized models. We discuss experimental details, including proper inoculation technique to achieve symmetry and one-sided surgical tumor removal, which takes only 5 min per mouse. Furthermore, we outline the preparation of bulk tissue or single-cell suspensions for downstream analyses such as bulk RNA-seq, immunohistochemistry, single-cell RNA-seq and flow cytometry.
Publisher: Oxford University Press (OUP)
Date: 2006
DOI: 10.1093/NAR/GKJ149
Publisher: Springer Science and Business Media LLC
Date: 20-09-2016
DOI: 10.1038/SREP33666
Abstract: Periodontitis affects over half of the adult population and represents a major public health problem. Previously, we isolated a subset of gingival fibroblasts (GFs) from periodontitis patients, designated as periodontitis-associated fibroblasts (PAFs), which were highly capable of collagen degradation. To elucidate their molecular profiles, GFs isolated from healthy and periodontitis-affected gingival tissues were analyzed by CAGE-seq and integrated with the FANTOM5 atlas. GFs from healthy gingival tissues displayed distinctive patterns of CAGE profiles as compared to fibroblasts from other organ sites and were characterized by specific expression of developmentally important transcription factors such as BARX1, PAX9, LHX8 and DLX5. In addition, a novel long non-coding RNA associated with LHX8 was described. Furthermore, we identified DLX5 regulating expression of the long variant of RUNX2 transcript, which was specifically active in GFs but not in their periodontitis-affected counterparts. Knockdown of these factors in GFs resulted in altered expression of extracellular matrix (ECM) components. These results indicate activation of DLX5 and RUNX2 via its distal promoter represents a unique feature of GFs and is important for ECM regulation. Down-regulation of these transcription factors in PAFs could be associated with their property to degrade collagen, which may impact on the process of periodontitis.
Publisher: Public Library of Science (PLoS)
Date: 17-04-2015
Publisher: Public Library of Science (PLoS)
Date: 30-01-2012
Publisher: Elsevier BV
Date: 11-2015
DOI: 10.1016/J.CELREP.2015.10.002
Abstract: VEGF-C/VEGFR-3 signaling plays a central role in lymphatic development, regulating the budding of lymphatic progenitor cells from embryonic veins and maintaining the expression of PROX1 during later developmental stages. However, how VEGFR-3 activation translates into target gene expression is still not completely understood. We used cap analysis of gene expression (CAGE) RNA sequencing to characterize the transcriptional changes invoked by VEGF-C in LECs and to identify the transcription factors (TFs) involved. We found that MAFB, a TF involved in differentiation of various cell types, is rapidly induced and activated by VEGF-C. MAFB induced expression of PROX1 as well as other TFs and markers of differentiated LECs, indicating a role in the maintenance of the mature LEC phenotype. Correspondingly, E14.5 Mafb(-/-) embryos showed impaired lymphatic patterning in the skin. This suggests that MAFB is an important TF involved in lymphangiogenesis.
Publisher: Frontiers Media SA
Date: 07-11-2018
Publisher: American Society of Hematology
Date: 24-04-2014
DOI: 10.1182/BLOOD-2013-02-486944
Abstract: Transcription and enhancer profiling reveal cell type–specific regulome architectures and transcription factor networks in conventional and regulatory T cells.
Publisher: Springer Science and Business Media LLC
Date: 18-11-2016
DOI: 10.1038/SREP37324
Abstract: Sex differences in susceptibility and progression have been reported in numerous diseases. Female cells have two copies of the X chromosome with X-chromosome inactivation imparting mono-allelic gene silencing for dosage compensation. However, a subset of genes, named escapees, escape silencing and are transcribed bi-allelically resulting in sexual dimorphism. Here we conducted in silico analyses of the sexes using human datasets to gain perspectives into such regulation. We identified transcription start sites of escapees (escTSSs) based on higher transcription levels in female cells using FANTOM5 CAGE data. Significant over-representations of YY1 transcription factor binding motif and ChIP-seq peaks around escTSSs highlighted its positive association with escapees. Furthermore, YY1 occupancy is significantly biased towards the inactive X (Xi) at long non-coding RNA loci that are frequent contacts of Xi-specific superloops. Our study suggests a role for YY1 in transcriptional activity on Xi in general through sequence-specific binding, and its involvement at superloop anchors.
Publisher: Oxford University Press (OUP)
Date: 22-12-2009
DOI: 10.1093/NAR/GKN1006
Publisher: American Society of Hematology
Date: 24-04-2014
DOI: 10.1182/BLOOD-2013-02-483792
Abstract: Generated a reference transcriptome for ex vivo, cultured, and stimulated mast cells, contrasted against a broad collection of primary cells. Identified BMPs as function-modulating factors for mast cells.
Publisher: Springer Science and Business Media LLC
Date: 16-07-2015
DOI: 10.1038/SREP11999
Abstract: The analysis of CAGE (Cap Analysis of Gene Expression) time-course has been proposed by the FANTOM5 Consortium to extend the understanding of the sequence of events facilitating cell state transition at the level of promoter regulation. To identify the most prominent transcriptional regulations induced by growth factors in human breast cancer, we apply here the Complexity Invariant Dynamic Time Warping motif EnRichment (CIDER) analysis approach to the CAGE time-course datasets of MCF-7 cells stimulated by epidermal growth factor (EGF) or heregulin (HRG). We identify a multi-level cascade of regulations rooted by the Serum Response Factor (SRF) transcription factor, connecting the MAPK-mediated transduction of the HRG stimulus to the negative regulation of the MAPK pathway by the members of the DUSP family phosphatases. The finding confirms the known primary role of FOS and FOSL1, members of AP-1 family, in shaping gene expression in response to HRG induction. Moreover, we identify a new potential regulation of DUSP5 and RARA (known to antagonize the transcriptional regulation induced by the estrogen receptors) by the activity of the AP-1 complex, specific to HRG response. The results indicate that a divergence in AP-1 regulation determines cellular changes of breast cancer cells stimulated by ErbB receptors.
Publisher: Cold Spring Harbor Laboratory
Date: 09-2012
Abstract: The human genome contains many thousands of long noncoding RNAs (lncRNAs). While several studies have demonstrated compelling biological and disease roles for individual examples, analytical and experimental approaches to investigate these genes have been hampered by the lack of comprehensive lncRNA annotation. Here, we present and analyze the most complete human lncRNA annotation to date, produced by the GENCODE consortium within the framework of the ENCODE project and comprising 9277 manually annotated genes producing 14,880 transcripts. Our analyses indicate that lncRNAs are generated through pathways similar to that of protein-coding genes, with similar histone-modification profiles, splicing signals, and exon/intron lengths. In contrast to protein-coding genes, however, lncRNAs display a striking bias toward two-exon transcripts, they are predominantly localized in the chromatin and nucleus, and a fraction appear to be preferentially processed into small RNAs. They are under stronger selective pressure than neutrally evolving sequences, particularly in their promoter regions, which display levels of selection comparable to protein-coding genes. Importantly, about one-third seem to have arisen within the primate lineage. Comprehensive analysis of their expression in multiple human organs and brain regions shows that lncRNAs are generally lower expressed than protein-coding genes, and display more tissue-specific expression patterns, with a large fraction of tissue-specific lncRNAs expressed in the brain. Expression correlation analysis indicates that lncRNAs show particularly striking positive correlation with the expression of antisense coding genes. This GENCODE annotation represents a valuable resource for future studies of lncRNAs.
Publisher: Public Library of Science (PLoS)
Date: 03-2018
Publisher: American Association for Cancer Research (AACR)
Date: 12-2019
DOI: 10.1158/1535-7163.MCT-19-0273
Abstract: Cancer precision medicine aims to predict the drug likely to yield the best response for a patient. Genomic sequencing of tumors is currently being used to better inform treatment options; however, this approach has had a limited clinical impact due to the paucity of actionable mutations. An alternative to mutation status is the use of gene expression signatures to predict response. Using data from two large-scale studies, The Genomics of Drug Sensitivity of Cancer (GDSC) and The Cancer Therapeutics Response Portal (CTRP), we investigated the relationship between the sensitivity of hundreds of cell lines to hundreds of drugs, and the relative expression levels of the targets these drugs are directed against. For approximately one third of the drugs considered (73/222 in GDSC and 131/360 in CTRP), sensitivity was significantly correlated with the expression of at least one of the known targets. Surprisingly, for 8% of the annotated targets, there was a significant anticorrelation between target expression and sensitivity. For several cases, this corresponded to drugs targeting multiple genes in the same family, with the expression of one target significantly correlated with sensitivity and another significantly anticorrelated, suggesting a possible role in resistance. Furthermore, we identified nontarget genes that are significantly correlated or anticorrelated with drug sensitivity, and found literature linking several to sensitization and resistance. Our analyses provide novel and important insights into both potential mechanisms of resistance and relative efficacy of drugs against the same target.
Publisher: Oxford University Press (OUP)
Date: 19-08-2010
DOI: 10.1093/NAR/GKQ729
Publisher: Springer Science and Business Media LLC
Date: 12-04-2016
Abstract: Genetic analyses, including genome-wide association studies and whole exome sequencing (WES), provide powerful tools for the analysis of complex and rare genetic diseases. To date there are no reference data for Aboriginal Australians to underpin the translation of health-based genomic research. Here we provide a catalogue of variants called after sequencing the exomes of 72 Aboriginal individuals to a depth of 20X coverage in ∼80% of the sequenced nucleotides. We determined 320,976 single nucleotide variants (SNVs) and 47,313 insertions/deletions using the Genome Analysis Toolkit. We had previously genotyped a subset of the Aboriginal individuals (70/72) using the Illumina Omni2.5 BeadChip platform and found ~99% concordance at overlapping sites, which suggests high quality genotyping. Finally, we compared our SNVs to six publicly available variant databases, such as dbSNP and the Exome Sequencing Project, and 70,115 of our SNVs did not overlap any of the single nucleotide polymorphic sites in all the databases. Our data set provides a useful reference point for genomic studies on Aboriginal Australians.
Location: United Kingdom of Great Britain and Northern Ireland
No related grants have been discovered for Timo Lassmann.