ARDC Research Link Australia

Publication

Mutational analysis of driver genes with tumor suppressive and oncogenic roles in gastric cancer

Publisher: PeerJ

Date: 17-07-2017

Abstract: Gastric cancer (GC) is a complex disease with heterogeneous genetic mechanisms. Genomic mutational profiling of gastric cancer not only expands our knowledge about cancer progression at a fundamental genetic level, but could also provide guidance on new treatment decisions, which are currently based on tumor histology. The fact that precision medicine-based treatment is successful in a subset of tumors indicates the need for better identification of clinically related molecular tumor phenotypes, especially with regard to driver mutations in tumor suppressor genes (TSGs) and oncogenes (ONGs). We surveyed 313 TSGs and 160 ONGs associated with 48 protein-coding and 19 miRNA genes with both TSG and ONG roles. Using public cancer mutational profiles, we confirmed the dual roles of CDKN1A and CDKN1B. In addition to the widely recognized alterations, we identified another 82 frequently mutated genes in a public gastric cancer cohort. In summary, these driver mutation profiles of individual GC will form the basis of personalized treatment of gastric cancer, leading to substantial therapeutic improvements.

Publication

Bioinformatic investigation and functional analysis of 214 hereditary genes identified non-coding RNAs as therapeutic tool for breast cancer management

Publisher: Elsevier BV

Date: 06-2022

DOI: 10.1016/J.GENREP.2022.101565

Publication

GPCR and IR genes in Schistosoma mansoni miracidia

Publisher: Springer Science and Business Media LLC

Date: 26-10-2016

DOI: 10.1186/S13071-016-1837-2

Publication

A pan-cancer study of copy number gain and up-regulation in human oncogenes

Publisher: Elsevier BV

Date: 10-2018

DOI: 10.1016/J.LFS.2018.09.032

Abstract: There has been limited research on CNVs in oncogenes, so we conducted a systematic pan-cancer analysis of CNVs and their associated gene expression changes. The aim of the present study was to provide insight into the relationships between gene expression and oncogenesis. We collected all the oncogenes from the ONGene database and overlapped them with CNVs in TCGA tumour samples from the Catalogue of Somatic Mutations in Cancer database. We further conducted an integrative analysis of CNVs with gene expression using data from the matched TCGA tumour samples. From our analysis, we found 637 oncogenes associated with CNVs in 5900 tumour samples. There were 204 oncogenes with frequent copy number gain (CNG). These 204 oncogenes were enriched in cancer-related pathways including the MAPK cascade and Ras GTPase signalling pathways. By using corresponding tumour sample data to perform integrative analyses of CNVs and gene expression changes, we identified 95 oncogenes with consistent CNG occurrence and up-regulation in the tumour samples, which may represent the recurrent driving force for oncogenesis. Surprisingly, eight oncogenes showed concordant CNG and gene up-regulation in at least 250 tumour samples: INTS8 (355), ECT2 (326), LSM1 (310), DDHD2 (298), COPS5 (286), EIF3E (281), TPD52 (258) and ERBB2 (254). As the first report of abundant CNGs in oncogenes and concordant changes in gene expression, our results may be valuable for the design of CNV-based cancer diagnostic strategies.

Publication

Comparative study of excretory-secretory proteins released by Schistosoma mansoni-resistant, susceptible and naïve Biomphalaria glabrata.

Publisher: Springer Science and Business Media LLC

Date: 14-09-2019

DOI: 10.1186/S13071-019-3708-0

Abstract: Schistosomiasis is a harmful neglected tropical disease caused by infection with Schistosoma spp., such as Schistosoma mansoni. Schistosoma must transition within a molluscan host to survive. Chemical analyses of schistosome-molluscan interactions indicate that host identification involves chemosensation, including naïve host preference. Advances in proteomic techniques enable sophisticated comparative analyses between infected and naïve snail host proteins. This study aimed to compare resistant, susceptible and naïve Biomphalaria glabrata snail-conditioned water (SCW) to identify potential attractants and deterrents. Behavioural bioassays were performed on S. mansoni miracidia to compare the effects of susceptible, F1 resistant and naïve B. glabrata SCW. The F1 resistant and susceptible B. glabrata SCW excretory–secretory proteins (ESPs) were fractionated using SDS-PAGE, identified with LC-MS/MS and compared to naïve snail ESPs. Protein-protein interaction (PPI) analyses based on published studies (including experiments, co-expression, text-mining and gene fusion) identified S. mansoni and B. glabrata protein interactions. Data are available via ProteomeXchange with identifier PXD015129. A total of 291, 410 and 597 ESPs were detected in the susceptible, F1 resistant and naïve SCW, respectively. Less overlap in ESPs was identified between susceptible and naïve snails than between F1 resistant and naïve snails. F1 resistant B. glabrata ESPs were predominantly associated with anti-pathogen activity and detoxification, such as leukocyte elastase and peroxiredoxin. Susceptible B. glabrata ESPs included several proteins correlated with immunity and anti-inflammation, such as glutathione S-transferase and zinc metalloproteinase, and with S. mansoni sporocyst presence. PPI analyses found that the uncharacterised S. mansoni protein Smp_142140.1 potentially interacts with numerous B. glabrata proteins. This study identified ESPs released by F1 resistant, susceptible and naïve B. glabrata to explain S. mansoni miracidia interplay. Susceptible B. glabrata ESPs shed light on potential S. mansoni miracidia deterrents. Further targeted research on specific ESPs identified in this study could help inhibit B. glabrata and S. mansoni interactions and stop human schistosomiasis.

Publication

The pan-cancer analysis of gain-of-functional mutations to identify the common oncogenic signatures in multiple cancers.

Publisher: Elsevier BV

Date: 05-2019

DOI: 10.1016/J.GENE.2019.02.039

Abstract: Oncogenes can potentially cause uncontrolled cell growth, leading to cancer development, and these genes are normally mutated and over-expressed in tumor cells. Genomic alteration of oncogenes might result in oncogenesis and promotion of cancer progression. To date, researchers have focused mainly on the roles of oncogenes in particular cancers, while investigations of oncogenes with gain-of-function mutations in multiple cancer types are less represented in the literature. Furthermore, the effects of those gain-of-function mutations have not been validated at the gene expression level. To meet this demand, we performed a systematic analysis of gene expression in oncogenes to identify the occurrence of gain-of-function mutations in pan-cancer. We identified 33,551 oncogenic mutations in 5000 samples. From our analysis, we identified three tissues with the highest frequency of gain-of-function oncogenic mutations in hundreds of samples: breast (739 samples), central nervous system (646 samples) and large intestine (498 samples). By further counting the number of occurrences of oncogenes across cancer types, we identified a list of cross-cancer mutational signatures of 99 oncogenes highly mutated in >400 samples in breast, central nervous system and large intestine samples. By overlapping with gene expression data in the matched tumor samples, we further identified 1875 gain-of-function mutations/events with consistent gene up-regulation in 1031 samples from multiple cancers. This result may offer additional insight into the relationship between gene dosage and oncogenesis and may be useful in targeted cancer therapy. In summary, this study provides the first global examination of the genetic alteration of oncogenes across cancer types. Clinical association analysis has shown that these 99 genes have a significant effect on survival.

Publication

Early Miocene elevation in northern Tibet estimated by palaeobotanical evidence

Publisher: Springer Science and Business Media LLC

Date: 15-05-2015

DOI: 10.1038/SREP10379

Abstract: The area and elevation of the Tibetan Plateau over time has directly affected Asia’s topography, the characteristics of the Asian monsoon and modified global climate, but in ways that are poorly understood. Charting the uplift history is crucial for understanding the mechanisms that link elevation and climate irrespective of time and place. While some palaeoelevation data are available for southern and central Tibet, clues to the uplift history of northern Tibet remain sparse and largely circumstantial. Leaf fossils are extremely rare in Tibet but here we report a newly discovered early Miocene barberry (Berberis) from Wudaoliang in the Hoh-Xil Basin in northern Tibet, at a present altitude of 4611 ± 9 m. Considering the fossil and its nearest living species probably occupied a similar or identical environmental niche, the palaeoelevation of the fossil locality, corrected for Miocene global temperature difference, is estimated to have been between 1395 and 2931 m, which means this basin has been uplifted ~2–3 km in the last 17 million years. Our findings contradict hypotheses that suggest northern Tibet had reached or exceeded its present elevation prior to the Miocene.

Publication

Cellular Metabolic Network Analysis: Discovering Important Reactions in Treponema pallidum

Publisher: Hindawi Limited

Date: 2015

DOI: 10.1155/2015/328568

Abstract: T. pallidum, the syphilis-causing pathogen, differs markedly in its metabolism from other bacterial pathogens. The need for a safe and effective syphilis vaccine calls for the identification of important steps in T. pallidum’s metabolism. Here, we apply Flux Balance Analysis to represent the reactions quantitatively, which makes it possible to cluster all reactions in T. pallidum. By calculating minimal cut sets and analyzing the topological structure of the metabolic network of T. pallidum, critical reactions are identified. As a comparison, we also apply the analytical approaches to the metabolic network of H. pylori to find coregulated drug targets and unique drug targets for different microorganisms. Based on the clustering results, all reactions are further classified into various roles. A general picture of their metabolic networks is thereby obtained, and two types of reactions, both of which are involved in nucleic acid metabolism, are found to be essential for T. pallidum. It is also discovered that both hubs of reactions and the isolated reactions in purine and pyrimidine metabolism play important roles in T. pallidum. These reactions could be potential drug targets for treating syphilis.

Publication

TSGene: a web resource for tumor suppressor genes

Publisher: Oxford University Press (OUP)

Date: 12-10-2012

DOI: 10.1093/NAR/GKS937

Publication

Expression of epithelial-mesenchymal transition-related genes increases with copy number in multiple cancer types

Publisher: Impact Journals, LLC

Date: 25-03-2016

DOI: 10.18632/ONCOTARGET.8371

Publication

Greenlip Abalone (Haliotis laevigata) Genome and Protein Analysis Provides Insights into Maturation and Spawning.

Publisher: Oxford University Press (OUP)

Date: 10-2019

DOI: 10.1534/G3.119.400388

Abstract: Wild abalone (Family Haliotidae) populations have been severely affected by commercial fishing, poaching, anthropogenic pollution, and environmental and climate changes. These issues have stimulated an increase in aquaculture production; however, production growth has been slow due to a lack of genetic knowledge and resources. We have sequenced a draft genome for the commercially important temperate Australian ‘greenlip’ abalone (Haliotis laevigata, Donovan 1808) and generated 11 tissue transcriptomes from a female adult abalone. Phylogenetic analysis of the greenlip abalone with reference to the Pacific abalone (Haliotis discus hannai) indicates that these abalone species diverged approximately 71 million years ago. This study presents an in-depth analysis into the features of reproductive dysfunction, where we provide the putative biochemical messenger components (neuropeptides) that may regulate reproduction including gonad maturation and spawning. Indeed, we isolate the egg-laying hormone neuropeptide and under trial conditions induce spawning at 80% efficiency. Altogether, we provide a solid platform for further studies aimed at stimulating advances in abalone aquaculture production. The H. laevigata genome and resources are made available to the public on the abalone ‘omics website, abalonedb.org.

Publication

REGene: a literature-based knowledgebase of animal regeneration that bridge tissue regeneration and cancer

Publisher: Springer Science and Business Media LLC

Date: 15-03-2016

DOI: 10.1038/SREP23167

Abstract: Regeneration is a common phenomenon across multiple animal phyla. Regeneration-related genes (REGs) are critical for fundamental cellular processes such as proliferation and differentiation. Identification of REGs and elucidating their functions may help to further develop effective treatment strategies in regenerative medicine. So far, REGs have been largely identified by small-scale experimental studies and a comprehensive characterization of the diverse biological processes regulated by REGs is lacking. Therefore, there is an ever-growing need to integrate REGs at the genomics, epigenetics and transcriptome level to provide a reference list of REGs for regeneration and regenerative medicine research. Towards achieving this, we developed the first literature-based database called REGene (REgeneration Gene database). In the current release, REGene contains 948 human (929 protein-coding and 19 non-coding genes) and 8445 homologous genes curated from gene ontology and extensive literature examination. Additionally, the REGene database provides detailed annotations for each REG, including: gene expression, methylation sites, upstream transcription factors and protein-protein interactions. An analysis of the collected REGs reveals strong links to a variety of cancers in terms of genetic mutation, protein domains and cellular pathways. We have prepared a web interface to share these regeneration genes, supported by refined browsing and searching functions at REGene.bioinfo-minzhao.org/.

Publication

lnCaNet: pan-cancer co-expression network for human lncRNA and cancer genes

Publisher: Oxford University Press (OUP)

Date: 18-01-2016

DOI: 10.1093/BIOINFORMATICS/BTW017

Abstract: Summary: Thousands of human long non-coding RNAs (lncRNAs) have been identified in cancers and play important roles in a wide range of tumorigenesis processes. However, the functions of the vast majority of human lncRNAs are still elusive. Emerging studies have revealed that the expression levels of most lncRNAs show discordant patterns with those of their protein-coding gene neighbors in various model organisms. Therefore, it may be useful to infer lncRNAs’ potential biological functions in cancer development from more comprehensive functional views of co-expressed cancer genes, beyond the mere physical proximity of genes. To this end, we performed thorough searches and analyses of the interactions between lncRNAs and non-neighboring cancer genes and provide a comprehensive co-expression data resource, LnCaNet. In the current version, LnCaNet contains 8 494 907 pre-computed significant co-expression pairs of 9641 lncRNAs and 2544 well-classified cancer genes in 2922 matched TCGA samples. In detail, we integrated 10 cancer gene lists from public databases and calculated their co-expression with all the lncRNAs in 11 TCGA cancer types separately. Based on the resulting 110 co-expression networks, we identified 17 common regulatory pairs related to extracellular space shared across the 11 cancers. We expect LnCaNet will enable researchers to explore lncRNA expression patterns, their affected cancer genes and pathways, their biological significance in the context of specific cancer types and other useful annotation related to particular kinds of lncRNA-cancer gene interaction. Availability and implementation: lncanet.bioinfo-minzhao.org/ Contact: m.zhao@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.

Publication

Multi-tissue transcriptomics for construction of a comprehensive gene resource for the terrestrial snail Theba pisana

Publisher: Springer Science and Business Media LLC

Date: 08-02-2016

DOI: 10.1038/SREP20685

Abstract: The land snail Theba pisana is native to the Mediterranean region but has become one of the most abundant invasive species worldwide. Here, we present three transcriptomes of this agricultural pest derived from three tissues: the central nervous system, hepatopancreas (digestive gland), and foot muscle. Sequencing of the three tissues produced 339,479,092 high quality reads and a global de novo assembly generated a total of 250,848 unique transcripts (unigenes). BLAST analysis mapped 52,590 unigenes to NCBI non-redundant protein databases and further functional analysis annotated 21,849 unigenes with gene ontology. We report that T. pisana transcripts have representatives in all functional classes and a comparison of differentially expressed transcripts amongst all three tissues demonstrates enormous differences in their potential metabolic activities. The differentially expressed genes include those with sequence similarity to genes associated with multiple bacterial diseases and neurological diseases. To provide a valuable resource that will assist functional genomics study, we have implemented a user-friendly web interface, ThebaDB (thebadb.bioinfo-minzhao.org/). This online database allows for complex text queries, sequence searches, and data browsing by enriched functional terms and KEGG mapping.

Publication

IQdb: an intelligence quotient score-associated gene resource for human intelligence

Publisher: Oxford University Press (OUP)

Date: 2013

DOI: 10.1093/DATABASE/BAT063

Publication

Identifying the Common Cell-Free DNA Biomarkers across Seven Major Cancer Types

Publisher: MDPI AG

Date: 05-06-2023

DOI: 10.20944/PREPRINTS202306.0288.V1

Abstract: Blood-based circulating cell-free DNA (cfDNA) detection offers a non-invasive and easily accessible way for early cancer detection. Despite the extensive utility of cfDNA, there are still many challenges in developing clinical biomarkers. For example, cfDNA with genetic alterations often composes a small portion of the DNA circulating in plasma, which can be confounded by cfDNA contributed by normal cells. Therefore, filtering out the potential false-positive cfDNA mutations from healthy populations will be important for cancer-based biomarkers. Additionally, many low-frequency genetic alterations are easily overlooked in cancer tests based on small amounts of cfDNA. We hypothesize that the combination of diverse types of cancer studies on cfDNA can provide new insight for identifying low-frequency genetic variants across cancer types for early clinical detection of cancers. By building a standardized computational pipeline for 1358 cfDNA samples across seven cancer types, we prioritize 129 shared genetic variants in the major cancer types. Further functional analysis of the 129 variants found that they are mainly enriched in ribosome pathways such as cotranslational protein targeting to membrane, and that some of them are tumor suppressors or oncogenes or are related to cancer initiation. In summary, our integrative analysis revealed the important roles of ribosome proteins as common biomarkers in early cancer diagnosis.

Publication

dbEMT: an epithelial-mesenchymal transition associated gene resource

Publisher: Springer Science and Business Media LLC

Date: 23-06-2015

DOI: 10.1038/SREP11459

Abstract: As a cellular process that changes epithelial cells to mesenchymal cells, epithelial-mesenchymal transition (EMT) plays important roles in development and cancer metastasis. Recent studies on cancer metastasis have identified many new susceptibility genes that control this transition. However, there is no comprehensive resource for EMT that integrates various genetic studies, and the relationship between EMT and the risk of complex diseases such as cancer is still unclear. To investigate the cellular complexity of EMT, we have constructed dbEMT (dbemt.bioinfo-minzhao.org/), the first literature-based gene resource for exploring EMT-related human genes. We manually curated 377 experimentally verified genes from the literature. Functional analyses highlighted the prominent role of proteoglycans in tumor metastatic cascades. In addition, the disease enrichment analysis provides a clue for the potential transformation in affected tissues or cells in Alzheimer’s disease and Type 2 Diabetes. Moreover, the global mutation pattern of EMT-related genes across multiple cancers may reveal common cancer metastasis mechanisms. Our further reconstruction of the EMT-related protein-protein interaction network uncovered a highly modular structure. These results illustrate the importance of dbEMT to our understanding of cell development and cancer metastasis and also highlight the utility of dbEMT for elucidating the functions of EMT-related genes.

Publication

Consistent analysis of differentially expressed genes across 7 cell types in papillary thyroid carcinoma

Publisher: Elsevier BV

Date: 2023

DOI: 10.1016/J.CSBJ.2023.10.045

Publication

TSGene 2.0: an updated literature-based knowledgebase for tumor suppressor genes

Publisher: Oxford University Press (OUP)

Date: 20-11-2015

DOI: 10.1093/NAR/GKV1268

Publication

CIGene: a literature-based online resource for cancer initiation genes

Publisher: Springer Science and Business Media LLC

Date: 25-07-2018

DOI: 10.1186/S12864-018-4944-Y

Publication

Distinct and Competitive Regulatory Patterns of Tumor Suppressor Genes and Oncogenes in Ovarian Cancer

Publisher: Public Library of Science (PLoS)

Date: 30-08-2012

DOI: 10.1371/JOURNAL.PONE.0044175

Publication

Integrative analysis of common genes and driver mutations implicated in hormone stimulation for four cancers in women

Publisher: PeerJ

Date: 06-06-2019

DOI: 10.7717/PEERJ.6872

Abstract: Cancer is one of the leading causes of death of women worldwide, and breast, ovarian, endometrial and cervical cancers contribute significantly to this every year. Developing early genetic-based diagnostic tools may be an effective approach to increase the chances of survival and provide more treatment opportunities. However, current cancer genetic studies are mainly conducted independently and hence lack common driver genes involved in cancers in women. To explore the potential common molecular mechanism, we integrated four comprehensive literature-based databases to explore the shared implicated genetic effects. Using a total of 460 endometrial, 2,068 ovarian, 2,308 breast and 537 cervical cancer-implicated genes, we identified 52 genes which are common in all four types of cancers in women. Furthermore, we defined their potential functional role in endogenous hormonal regulation pathways within the context of four cancers in women. For example, these genes are strongly associated with hormonal stimulation, which may facilitate rapid diagnosis and treatment management decision making. Additional mutational analyses on combined The Cancer Genome Atlas datasets consisting of 5,919 gynaecological and breast tumor samples were conducted to identify the frequently mutated genes across cancer types. For those common implicated genes for hormonal stimulants, we found that three quarters of the 5,919 samples had genomic alterations, with the highest frequency in MYC (22%), followed by NDRG1 (19%), ERBB2 (14%), PTEN (13%), PTGS2 (13%) and CDH1 (11%). We also identified 38 hormone-related genes, eight of which are associated with the ovulation cycle. A further systems biology approach applied to the shared genes identified 20 novel genes, of which 12 were involved in hormone regulation in these four cancers in women. Identification of common driver genes for hormone stimulation provided a unique angle on the potential of hormone stimulant-related genes for cancer diagnosis and prognosis.

Publication

Decode the Stable Cell Communications Based on Neuropeptide-Receptors Network in 36746 Tumor Cells

Publisher: MDPI AG

Date: 22-12-2021

DOI: 10.3390/BIOMEDICINES10010014

Abstract: Background: As chemical signals of hormones, neuropeptides are essential to regulate cell growth by interacting with their receptors to achieve cell communications in cancer tissues. Previously, neuropeptide transcriptome analysis was limited to tissue-based bulk expression levels. The molecular mechanisms of neuropeptides and their receptors at the single-cell level remain unclear. We conducted a systematic single-cell transcriptome data integration analysis to clarify the similarities and variations of neuropeptide-mediated cell communication between various malignancies. Methods: Based on the single-cell expression information in 72 cancer datasets across 24 cancer types, we characterized actively expressed neuropeptides and receptors as having log values of the quantitative transcripts per million ≥ 1. Then, we created the putative cell-to-cell communication network for each dataset by using the known interactions of those actively expressed neuropeptides and receptors. To focus on the stable cell communication events, we identified neuropeptides and downstream receptors whose interactions were detected in more than half of all conceivable cell-cell interactions (square of the total cell population) in a dataset. Results: Focusing on those actively expressed neuropeptides and receptors, we built over 76 million cell-to-cell communications across 70 cancer datasets. Then the stable cell communication analyses were applied to each dataset, and about 14 million stable cell-to-cell communications could be detected based on 16 neuropeptides and 23 receptors. Further functional analysis indicates these 39 genes could regulate blood pressure and are significantly associated with patients’ survival among over ten thousand The Cancer Genome Atlas (TCGA) pan-cancer samples. By zooming in on lung cancer-specific clinical features, we discovered that the 39 genes appeared to be enriched in patients who smoke. In skin cancer, they may differ in patients with distinct histological subtypes and molecular drivers. Conclusions: At the single-cell level, stable cell communications across cancer types demonstrated some common and distinct neuropeptide-receptor patterns, which could be helpful in determining the status of neuropeptide-based cell communication and developing a peptide-based therapy strategy.

Publication

Neuropeptides encoded by the genomes of the Akoya pearl oyster Pinctata fucata and Pacific oyster Crassostrea gigas: A bioinformatic and peptidomic survey

Publisher: Springer Science and Business Media LLC

Date: 02-10-2014

DOI: 10.1186/1471-2164-15-840

Publication

METSP: A Maximum-Entropy Classifier Based Text Mining Tool for Transporter-Substrate Identification with Semistructured Text

Publisher: Hindawi Limited

Date: 2015

DOI: 10.1155/2015/254838

Abstract: The substrates of a transporter are not only useful for inferring the function of the transporter, but also important for discovering compound-compound interactions and reconstructing metabolic pathways. Though plenty of data has been accumulated with the development of new technologies such as in vitro transporter assays, the search for substrates of transporters is far from complete. In this article, we introduce METSP, a maximum-entropy classifier devoted to retrieving transporter-substrate pairs (TSPs) from semistructured text. Based on the high quality annotation from UniProt, METSP achieves high precision and recall in cross-validation experiments. When METSP is applied to 182,829 human transporter annotation sentences in UniProt, it identifies 3942 sentences with transporter and compound information. Finally, 1547 high-confidence human TSPs are identified for further manual curation, of which 58.37% are pairs with novel substrates not annotated in public transporter databases. METSP is the first efficient tool to extract TSPs from semistructured annotation text in UniProt. This tool can help to determine the precise substrates and drugs of transporters, thus facilitating drug-target prediction, metabolic network reconstruction, and literature classification.

Publication

Concordance between somatic copy number loss and down-regulated expression: A pan-cancer study of cancer predisposition genes

Publisher: Springer Science and Business Media LLC

Date: 08-12-2016

DOI: 10.1038/SREP37358

Abstract: Cancer predisposition genes (CPGs) are a class of cancer genes in which germline variants lead to increased risk of cancer. Research has revealed that copy number variation (CNV) may be linked to cancer susceptibility in CPGs. In this pan-cancer analysis, we explored the relationship between somatic CNV and gene expression changes in CPGs. Based on 827 human CPGs curated from the literature, we first identified 729 CPGs with precise CNV information from 5067 tumor samples using TCGA CNV data. Among them, 128 CPGs tended to have more frequent copy number losses (CNLs) compared with copy number gains (CNGs). Then by correlating these CNV data with TCGA gene expression data, we obtained 49 CPGs with concordant CNLs and gene down-regulation. Intriguingly, five CPGs showed concordance between CNL and down-regulation in 50 or more tumor samples: MTAP (216 samples), PTEN (143), MCPH1 (86), SMAD4 (63), and MINPP1 (51), which may represent the recurrent driving force for gene expression change during oncogenesis. Moreover, network analysis revealed that these 49 CPGs were tightly connected. In summary, this study provides the first observation of concordance between CNLs and down-regulation of CPGs in pan-cancer, which may help better understand the CPG biology in tumorigenesis and cancer progression.

Publication

Exploring the role of post-translational modulators of transcription factors in triple-negative breast cancer gene expression

Publisher: Elsevier BV

Date: 06-2020

DOI: 10.1016/J.MGENE.2020.100681

Publication

The bioinformatics tools for the genome assembly and analysis based on third-generation sequencing.

Publisher: Oxford University Press (OUP)

Date: 21-11-2018

DOI: 10.1093/BFGP/ELY037

Abstract: The application of third-generation sequencing (TGS) technology in genetics and genomics has provided opportunities to categorize and explore the individual genomic landscapes and mutations relevant for diagnosis and therapy using whole genome sequencing and de novo genome assembly. In general, the emerging TGS technology can produce high quality long reads for the determination of overlapping reads and transcript isoforms. However, this technology still faces challenges such as the accuracy of nucleotide base identification and high error rates. Here, we surveyed 39 TGS-related tools for de novo assembly and genome analysis to identify the differences among their characteristics, such as the required input, the interaction with the user, sequencing platforms, type of reads, error models, the possibility of introducing coverage bias, the simulation of genomic variants and outputs provided. Decision trees are summarized to help researchers find the most suitable tools to analyze TGS data. Our comprehensive survey and evaluation of computational features of existing methods for TGS may provide a valuable guideline for researchers.

Publication

Systematic review of next-generation sequencing simulators: computational tools, features and perspectives

Publisher: Oxford University Press (OUP)

Date: 11-04-2016

DOI: 10.1093/BFGP/ELW012

Abstract: High-throughput next-generation sequencing (NGS) technologies have rapidly generated a large volume of genomic data. To aid the development and evaluation of new statistical models and computational methods, NGS-based simulators have been proposed to construct better experimental workflows. However, the comparative performance of these NGS simulators remains unclear. In this review, we conducted a comprehensive investigation of NGS simulators for various sequencing techniques, including DNA sequencing, metagenomic sequencing, RNA-seq, ChIP-seq and bisulfite sequencing for methylation.

Publication

PathLocdb: a comprehensive database for the subcellular localization of metabolic pathways and its application to multiple localization analysis

Publisher: Springer Science and Business Media LLC

Date: 2010

DOI: 10.1186/1471-2164-11-S4-S13

Publication

Gonadotropin-releasing hormone and adipokinetic hormone/corazonin-related peptide in the female prawn

Publisher: Elsevier BV

Date: 09-2016

DOI: 10.1016/J.YGCEN.2016.07.008

Abstract: Crustacean neuropeptides (NPs) play important roles in the regulation of most physiological activities, including growth, molting and reproduction. In this study, we have performed an in silico analysis of female prawn (Macrobrachium rosenbergii) neural transcriptomes to identify NPs not previously identified. We predict that approximately 1309 proteins are destined for the secretory pathway, many of which are likely post-translationally processed to generate active peptides. Within this neural secretome, we identified a gene transcript that encoded a precursor protein with striking similarity to a gonadotropin-releasing hormone (GnRH). We additionally identified another GnRH NP superfamily member, the adipokinetic hormone/corazonin-related peptide (ACP). M. rosenbergii GnRH and ACP were widespread throughout the nervous tissues, implicating them as potential neuromodulators. Furthermore, GnRH was found in non-neural tissues, including the stomach, gut, heart, testis and ovary, in the latter most prominently within secondary oocytes. The GnRH/corazonin receptor-like gene is specific to the ovary, whereas the receptor-like gene expression is more widespread. Administration of GnRH had no effect on ovarian development and maturation, nor any effect on total hemolymph lipid levels, while ACP administration decreased oocyte proliferation (at high dose) and stimulated a significant increase in total hemolymph lipids. In conclusion, our targeted analysis of the M. rosenbergii neural secretome has revealed the decapod GnRH and ACP genes. We propose that ACP in crustaceans plays a role in the lipid metabolism and the inhibition of oocyte proliferation, while the role of the GnRH remains to be clearly defined, possibly through experiments involving gene silencing.

Publication

Computational tools for copy number variation (CNV) detection using next-generation sequencing data: Features and perspectives

Publisher: Springer Science and Business Media LLC

Date: 09-2013

DOI: 10.1186/1471-2105-14-S11-S1

Publication

OCGene: a database of experimentally verified ovarian cancer-related genes with precomputed regulation information

Publisher: Springer Science and Business Media LLC

Date: 31-12-2015

DOI: 10.1038/CDDIS.2015.380

Publication

An evidence-based knowledgebase of metastasis suppressors to identify key pathways relevant to cancer metastasis

Publisher: Springer Science and Business Media LLC

Date: 21-10-2015

DOI: 10.1038/SREP15478

Abstract: Metastasis suppressor genes (MS genes) are genes that play important roles in inhibiting the process of cancer metastasis without preventing growth of the primary tumor. Identification of these genes and understanding their functions are critical for investigation of cancer metastasis. Recent studies on cancer metastasis have identified many new susceptibility MS genes. However, the comprehensive illustration of diverse cellular processes regulated by metastasis suppressors during the metastasis cascade is lacking. Thus, the relationship between MS genes and cancer risk is still unclear. To unveil the cellular complexity of MS genes, we have constructed MSGene (MSGene.bioinfo-minzhao.org/), the first literature-based gene resource for exploring human MS genes. In total, we manually curated 194 experimentally verified MS genes and mapped to 1448 homologous genes from 17 model species. Follow-up functional analyses associated 194 human MS genes with epithelium/tissue morphogenesis and epithelial cell proliferation. In addition, pathway analysis highlights the prominent role of MS genes in activation of platelets and coagulation system in tumor metastatic cascade. Moreover, global mutation pattern of MS genes across multiple cancers may reveal common cancer metastasis mechanisms. All these results illustrate the importance of MSGene to our understanding on cell development and cancer metastasis.

Publication

A gene browser of colorectal cancer with literature evidence and pre-computed regulatory information to identify key tumor suppressors and oncogenes

Publisher: Springer Science and Business Media LLC

Date: 08-2016

DOI: 10.1038/SREP30624

Abstract: Colorectal cancer (CRC) is a cancer of growing incidence that associates with a high mortality rate worldwide. There is a poor understanding of the heterogeneity of CRC with regard to causative genetic mutations and gene regulatory mechanisms. Previous studies have identified several susceptibility genes in small-scale experiments. However, the information has not been comprehensively and systematically compiled and interpreted. In this study, we constructed the gbCRC, the first literature-based gene resource for investigating CRC-related human genes. The features of our database include: (i) manual curation of experimentally-verified genes reported in the literature; (ii) comprehensive integration of five reliable data sources; and (iii) pre-computed regulatory patterns involving transcription factors, microRNAs and long non-coding RNAs. In total, 2067 genes associating with 2819 PubMed abstracts were compiled. Comprehensive functional annotations are associated with all the genes, including gene expression profiles, homologous genes in other model species, protein-protein interactions, somatic mutations, and potential methylation sites. These comprehensive annotations and this pre-computed regulatory information highlighted the importance of the gbCRC with regard to the unexplored regulatory network of CRC. This information is available in a plain text format that is free to download.

Publication

Human liver rate-limiting enzymes influence metabolic flux via branch points and inhibitors

Publisher: Springer Science and Business Media LLC

Date: 2009

DOI: 10.1186/1471-2164-10-S3-S31

Publication

CNVannotator: A comprehensive annotation server for copy number variation in the human genome

Publisher: Public Library of Science (PLoS)

Date: 14-11-2013

DOI: 10.1371/JOURNAL.PONE.0080170

Publication

Pedican: an online gene resource for pediatric cancers with literature evidence

Publisher: Springer Science and Business Media LLC

Date: 15-06-2015

DOI: 10.1038/SREP11435

Abstract: Pediatric cancer (PC), that is cancer occurring in children, is the leading cause of death among children worldwide, with an incidence of 175,000 per year. Elucidating the genetic abnormalities and underlying cellular mechanisms may provide less toxic curative treatments. Therefore, it is important to understand the pathology of pediatric cancer at the genetic, genomic and epigenetic level. To unveil the cellular complexity of PC, we have developed a database of pediatric cancers (Pedican), the first literature-based pediatric gene data resource by comprehensive literature curation and data integration. In the current release, Pedican contains 735 human genes, 88 gene fusion and 24 chromosome abnormal events curated from 2245 PubMed abstracts. Pedican provides detailed annotations for each gene, such as Entrez gene information, involved pathways, protein–protein interactions, mutations, gene expression, methylation sites, TF regulation and post-translational modification. Additionally Pedican has a user-friendly web interface, which allows sophisticated text query, sequence searches and browsing by highlighted literature evidence and hundreds of cancer types. Overall, our curated pediatric cancer-related gene list maps the genomic and cellular landscape for various pediatric cancers, providing a valuable resource for further experiment design. The Pedican is available at pedican.bioinfo-minzhao.org/ .

Publication

CSGene: a literature-based database for cell senescence genes and its application to identify critical cell aging pathways and associated diseases

Publisher: Springer Science and Business Media LLC

Date: 14-01-2016

DOI: 10.1038/CDDIS.2015.414

Abstract: Cell senescence is a cellular process in which normal diploid cells cease to replicate and is a major driving force for human cancers and aging-associated diseases. Recent studies on cell senescence have identified many new genetic components and pathways that control cell aging. However, there is no comprehensive resource for cell senescence that integrates various genetic studies and relationships with cell senescence, and the risk associated with complex diseases such as cancer is still unexplored. We have developed the first literature-based gene resource for exploring cell senescence genes, CSGene. We compiled 504 experimentally verified genes from public data resources and published literature. Pathway analyses highlighted the prominent roles of cell senescence genes in the control of rRNA gene transcription and unusual rDNA repeat that constitute a center for the stability of the whole genome. We also found a strong association of cell senescence with HIV-1 infection and viral carcinogenesis that are mainly related to promoter/enhancer binding and chromatin modification processes. Moreover, pan-cancer mutation and network analysis also identified common cell aging mechanisms in cancers and uncovered a highly modular network structure. These results highlight the utility of CSGene for elucidating the complex cellular events of cell senescence.

Publication

Reproducible combinatorial regulatory networks elucidate novel oncogenic microRNAs in non-small cell lung cancer

Publisher: Cold Spring Harbor Laboratory

Date: 14-07-2014

DOI: 10.1261/RNA.042754.113

Abstract: While previous studies reported aberrant expression of microRNAs (miRNAs) in non-small cell lung cancer (NSCLC), little is known about which miRNAs play central roles in NSCLC's pathogenesis and its regulatory mechanisms. To address this issue, we presented a robust computational framework that integrated matched miRNA and mRNA expression profiles in NSCLC using feed-forward loops. The network consists of miRNAs, transcription factors (TFs), and their common predicted target genes. To discern the biological meaning of their associations, we introduced the direction of regulation. A network edge validation strategy using three independent NSCLC expression profiling data sets pinpointed reproducible biological regulations. Reproducible regulation, which may reflect the true molecular interaction, has not been applied to miRNA–TF co-regulatory network analyses in cancer or other diseases yet. We revealed eight hub miRNAs that connected to a higher proportion of targets validated by independent data sets. Network analyses showed that these miRNAs might have strong oncogenic characteristics. Furthermore, we identified a novel miRNA–TF co-regulatory module that potentially suppresses the tumor suppressor activity of the TGF-β pathway by targeting a core pathway molecule (TGFBR2). Follow-up experiments showed two miRNAs (miR-9-5p and miR-130b-3p) in this module had increased expression while their target gene TGFBR2 had decreased expression in a cohort of human NSCLC. Moreover, we demonstrated these two miRNAs directly bind to the 3′ untranslated region of TGFBR2 . This study enhanced our understanding of miRNA–TF co-regulatory mechanisms in NSCLC. The combined bioinformatics and validation approach we described can be applied to study other types of diseases.

Publication

AutismKB: an evidence-based knowledgebase of autism genetics

Publisher: Oxford University Press (OUP)

Date: 12-2011

DOI: 10.1093/NAR/GKR1145

Publication

SynDB: a Synapse protein DataBase based on synapse ontology

Publisher: Oxford University Press (OUP)

Date: 03-01-2007

DOI: 10.1093/NAR/GKL876

Publication

dbLGL: an online leukemia gene and literature database for the retrospective comparison of adult and childhood leukemia genetics with literature evidence.

Publisher: Oxford University Press (OUP)

Date: 2018

DOI: 10.1093/DATABASE/BAY062

Publication

In silico neuropeptidome of female Macrobrachium rosenbergii based on transcriptome and peptide mining of eyestalk, central nervous system and ovary

Publisher: Public Library of Science (PLoS)

Date: 29-05-2015

DOI: 10.1371/JOURNAL.PONE.0123848

Publication

dbCPG: A web resource for cancer predisposition genes

Publisher: Impact Journals, LLC

Date: 12-05-2016

DOI: 10.18632/ONCOTARGET.9334

Publication

CMGene: A literature-based database and knowledge resource for cancer metastasis genes

Publisher: Elsevier BV

Date: 05-2017

DOI: 10.1016/J.JGG.2017.04.006

Publication

circVAR database: genome-wide archive of genetic variants for human circular RNAs.

Publisher: Springer Science and Business Media LLC

Date: 29-10-2020

DOI: 10.1186/S12864-020-07172-Y

Abstract: Circular RNAs (circRNAs) play important roles in regulating gene expression through binding miRNAs and RNA binding proteins. Genetic variation of circRNAs may affect complex traits/diseases by changing their binding efficiency to target miRNAs and proteins. There is a growing demand for investigations of the functions of genetic changes using large-scale experimental evidence. However, there is no online genetic resource for circRNA genes. We performed extensive genetic annotation of 295,526 circRNAs integrated from circBase, circNet and circRNAdb. All pre-computed genetic variants were presented at our online resource, circVAR, with data browsing and search functionality. We explored the chromosome-based distribution of circRNAs and their associated variants. We found that, based on mapping to the 1000 Genomes and ClinVAR databases, chromosome 17 has a relatively large number of circRNAs and associated common and health-related genetic variants. Following the annotation of genome wide association studies (GWAS)-based circRNA variants, we found many non-coding variants within circRNAs, suggesting novel mechanisms for common diseases reported from GWAS studies. For cancer-based somatic variants, we found that chromosome 7 has many highly complex mutations that have been overlooked in previous research. We used the circVAR database to collect SNPs and small insertions and deletions (INDELs) in putative circRNA regions and to identify their potential phenotypic information. To provide a reusable resource for the circRNA research community, we have published all the pre-computed genetic data concerning circRNAs and associated genes together with data query and browsing functions at ircvar .

Publication

EDdb: A web resource for eating disorder and its application to identify an extended adipocytokine signaling pathway related to eating disorder

Publisher: Springer Science and Business Media LLC

Date: 12-2013

DOI: 10.1007/S11427-013-4573-2

Abstract: Eating disorder is a group of physiological and psychological disorders affecting approximately 1% of the female population worldwide. Although the genetic epidemiology of eating disorder is becoming increasingly clear with accumulated studies, the underlying molecular mechanisms are still unclear. Recently, integration of various high-throughput data expanded the range of candidate genes and started to generate hypotheses for understanding potential pathogenesis in complex diseases. This article presents EDdb (Eating Disorder database), the first evidence-based gene resource for eating disorder. Fifty-nine experimentally validated genes from the literature in relation to eating disorder were collected as the core dataset. Another four datasets with 2824 candidate genes across 601 genome regions were expanded based on the core dataset using different criteria (e.g., protein-protein interactions, shared cytobands, and related complex diseases). Based on human protein-protein interaction data, we reconstructed a potential molecular sub-network related to eating disorder. Furthermore, with an integrative pathway enrichment analysis of genes in EDdb, we identified an extended adipocytokine signaling pathway in eating disorder. Three genes in EDdb (ADIPO (adiponectin), TNF (tumor necrosis factor) and NR3C1 (nuclear receptor subfamily 3, group C, member 1)) link the KEGG (Kyoto Encyclopedia of Genes and Genomes) "adipocytokine signaling pathway" with the BioCarta "visceral fat deposits and the metabolic syndrome" pathway to form a joint pathway. In total, the joint pathway contains 43 genes, among which 39 genes are related to eating disorder. As the first comprehensive gene resource for eating disorder, EDdb ( eddb.cbi.pku.edu.cn ) enables the exploration of gene-disease relationships and cross-talk mechanisms between related disorders. 
Through pathway statistical studies, we revealed that abnormal body weight caused by eating disorder and obesity may both be related to dysregulation of the novel joint pathway of adipocytokine signaling. In addition, this joint pathway may be the common pathway for body weight regulation in complex human diseases related to unhealthy lifestyle.

Publication

Literature-based knowledgebase of pancreatic cancer gene to prioritize the key genes and pathways

Publisher: Elsevier BV

Date: 09-2016

DOI: 10.1016/J.JGG.2016.04.006

Publication

ONGene: A literature-based database for human oncogenes

Publisher: Elsevier BV

Date: 02-2017

DOI: 10.1016/J.JGG.2016.12.004

Publication

Synergetic regulatory networks mediated by oncogene-driven microRNAs and transcription factors in serous ovarian cancer

Publisher: Royal Society of Chemistry (RSC)

Date: 2013

DOI: 10.1039/C3MB70172G

Publication

The genome of the oyster Saccostrea offers insight into the environmental resilience of bivalves.

Publisher: Oxford University Press (OUP)

Date: 08-10-2018

DOI: 10.1093/DNARES/DSY032

Publication

Characterization of Schizophrenia Adverse Drug Interactions through a Network Approach and Drug Classification

Publisher: Hindawi Limited

Date: 2013

DOI: 10.1155/2013/458989

Abstract: Antipsychotic drugs are medications commonly used for schizophrenia (SCZ) treatment, which include two groups: typical and atypical. SCZ patients have multiple comorbidities, and the coadministration of drugs is quite common. This may result in adverse drug-drug interactions, which are events that occur when the effect of a drug is altered by the coadministration of another drug. Therefore, it is important to provide a comprehensive view of these interactions for further coadministration improvement. Here, we extracted SCZ drugs and their adverse drug interactions from the DrugBank and compiled a SCZ-specific adverse drug interaction network. This network included 28 SCZ drugs, 241 non-SCZs, and 991 interactions. By integrating the Anatomical Therapeutic Chemical (ATC) classification with the network analysis, we characterized those interactions. Our results indicated that SCZ drugs tended to have more adverse drug interactions than other drugs. Furthermore, SCZ typical drugs had significant interactions with drugs of the “alimentary tract and metabolism” category while SCZ atypical drugs had significant interactions with drugs of the categories “nervous system” and “antiinfectives for systemic uses.” This study is the first to characterize the adverse drug interactions in the course of SCZ treatment and might provide useful information for the future SCZ treatment.

Publication

Identification of novel prognosis-related genes associated with cancer using integrative network analysis

Publisher: Springer Science and Business Media LLC

Date: 19-02-2018

DOI: 10.1038/S41598-018-21691-5

Abstract: Prognosis identifies the seriousness and the chances of survival of a cancer patient. However, it remains a challenge to identify the key cancer genes in prognostic studies. In this study, we collected 2064 genes that were related to prognostic studies by using gene expression measurements curated from published literatures. Among them, 1820 genes were associated with copy number variations (CNVs). The further functional enrichment on 889 genes with frequent copy number gains (CNGs) revealed that these genes were significantly associated with cancer pathways including regulation of cell cycle, cell differentiation and mitogen-activated protein kinase (MAPK) cascade. We further conducted integrative analyses of CNV and their target genes expression using the data from matched tumour samples of The Cancer Genome Atlas (TCGA). Ultimately, 95 key prognosis-related genes were extracted, with concordant CNG events and increased up-regulation in at least 300 tumour samples. These genes, and the number of samples in which they were found, included: ACTL6A (399), ATP6V1C1 (425), EBAG9 (412), FADD (308), MTDH (377), and SENP5 (304). This study provides the first observation of CNV in prognosis-related genes across pan-cancer. The systematic concordance between CNG and up-regulation of gene expression in these novel prognosis-related genes may indicate their prognostic significance.

Publication

ECGene: A Literature-Based Knowledgebase of Endometrial Cancer Genes

Publisher: Hindawi Limited

Date: 13-01-2016

DOI: 10.1002/HUMU.22950

Publication

eSnail: A transcriptome‐based molecular resource of the central nervous system for terrestrial gastropods

Publisher: Wiley

Date: 12-11-2018

DOI: 10.1111/1755-0998.12722

Abstract: To expand on emerging terrestrial gastropod molecular resources, we have undertaken transcriptome-based sequencing of the central nervous system (CNS) from six ecologically invasive terrestrial gastropods. Focusing on snail species Cochlicella acuta and Helix aspersa and reticulated slugs Deroceras invadens, Deroceras reticulatum, Lehmannia nyctelia and Milax gagates, we obtained a total of 367,869,636 high-quality reads and compared them with existing CNS transcript resources for the invasive Mediterranean snail, Theba pisana. In total, we obtained 419,289 unique transcripts (unigenes) from 1,410,569 assembled contigs, with blast search analysis of multiple protein databases leading to the annotation of 124,268 unigenes, of which 92,544 mapped to ncbi nonredundant protein databases. We found that these transcriptomes have representatives in most biological functions, based on comparison of gene ontology, kegg pathway and protein family contents, demonstrating a high range of transcripts responsible for regulating metabolic activities and molecular functions occurring within the CNS. To provide an accessible genetic resource, we also demonstrate the presence of 66,687 microsatellites and 304,693 single-nucleotide variants, which can be used for the design of potentially thousands of unique primers for functional screening. An online "eSnail" database with a user-friendly web interface was implemented to query all the information obtained herein (snail). We demonstrate the usefulness of the database through the mining of molluscan neuropeptides. As the most comprehensive CNS transcriptome resource for terrestrial gastropods, eSnail may serve as a useful gateway for researchers to explore gastropod CNS function for multiple purposes, including for the development of biocontrol approaches.

Publication

Proteomic analysis of the Schistosoma mansoni miracidium

Publisher: Public Library of Science (PLoS)

Date: 22-01-2016

DOI: 10.1371/JOURNAL.PONE.0147247

Publication

Gene Dosage Analysis on the Single-Cell Transcriptomes Linking Cotranslational Protein Targeting to Metastatic Triple-Negative Breast Cancer.

Publisher: MDPI AG

Date: 10-09-2021

DOI: 10.3390/PH14090918

Abstract: Many recent efforts have been put into the association between expression heterogeneity and different cell types and states using single-cell RNA transcriptome analysis. There is only limited understanding of gene dosage effects for the genetic heterogeneity at the single-cell level. By focusing on concordant copy number variation (CNV) and expression, we presented a computational framework to explore dosage effect for aggressive metastatic triple-negative breast cancer (TNBC) at the single-cell level. In practice, we collected CNV and single-cell expression data from the same patients with independent technologies. By focusing on 47,198 consistent copy number gains (CNG) and gene up-regulation from 1145 single cells, ribosome proteins with important roles in protein targeting were enriched. Independent validation in another metastatic TNBC dataset further prioritized signal recognition particle-dependent protein targeting as the top functional module. More interesting, the increased ribosome gene copies in TNBC may associate with their enhanced stemness and metastatic potential. Indeed, the prioritization of a well-upregulated functional module confirmed by high copy numbers at the single-cell level and contributing to patient survival may indicate the possibility of targeted therapy based on ribosome proteins for TNBC.

Publication

Meta-analysis of gene expression studies in endometrial cancer identifies gene expression profiles associated with aggressive disease and patient outcome

Publisher: Springer Science and Business Media LLC

Date: 10-11-2016

DOI: 10.1038/SREP36677

Abstract: Although endometrioid endometrial cancer (EEC; comprising ~80% of all endometrial cancers diagnosed) is typically associated with favourable patient outcome, a significant portion (~20%) of women with this subtype will relapse. We hypothesised that gene expression predictors of the more aggressive non-endometrioid endometrial cancers (NEEC) could be used to predict EEC patients with poor prognosis. To explore this hypothesis, we performed meta-analysis of 12 gene expression microarray studies followed by validation using RNA-Seq data from The Cancer Genome Atlas (TCGA) and identified 1,253 genes differentially expressed between EEC and NEEC. Analysis found 121 genes were associated with poor outcome among EEC patients. Forward selection likelihood-based modelling identified a 9-gene signature associated with EEC outcome in our discovery RNA-Seq dataset which remained significant after adjustment for clinical covariates, but was not significant in a smaller RNA-Seq dataset. Our study demonstrates the value of employing meta-analysis to improve the power of gene expression microarray data, and highlights genes and molecular pathways of importance for endometrial cancer therapy.

Publication

Mutational analysis revealed 97 key cancer metastasis genes from extracellular vesicles associated with patient survival

Publisher: Elsevier BV

Date: 12-2020

DOI: 10.1016/J.MGENE.2020.100781

Publication

Constructing a comprehensive gene co-expression based interactome in Bos taurus

Publisher: PeerJ

Date: 04-12-2017

DOI: 10.7717/PEERJ.4107

Abstract: Integrating genomic information into cattle breeding is an important approach to exploring genotype-phenotype relationships for complex traits related to dairy and meat production. To assist with genomic-based selection, a reference map of interactome is needed to fully understand and identify the functional relevant genes. To this end, we constructed a co-expression analysis of 92 tissues and this represents the systematic exploration of gene-gene relationship in Bos taurus. By using robust WGCNA (Weighted Gene Correlation Network Analysis), we described the gene co-expression network of 5,000 protein-coding genes with majority variations in expression across 92 tissues. Further module identifications found 55 highly organized functional clusters representing diverse cellular activities. To demonstrate the re-use of our interaction for functional genomics analysis, we extracted a sub-network associated with DNA binding genes in Bos taurus. The subnetwork was enriched within regulation of transcription from RNA polymerase II promoter representing central cellular functions. In addition, we identified 28 novel linker genes associated with more than 100 DNA binding genes. Our WGCNA-based co-expression network reconstruction will be a valuable resource for exploring the molecular mechanisms of incompletely characterized proteins and for elucidating larger-scale patterns of functional modulization in the Bos taurus genome.

Publication

circExp database: an online transcriptome platform for human circRNA expressions in cancers

Publisher: Oxford University Press (OUP)

Date: 2021

DOI: 10.1093/DATABASE/BAAB045

Abstract: Circular RNA (circRNA) is a highly stable, single-stranded, closed-loop RNA that works as RNA or as a protein decoy to regulate gene expression. In humans, thousands of circRNA transcriptional products precisely express in specific developmental stages, tissues and cell types. Due to their stability and specificity, circRNAs are ideal biomarkers for cancer diagnosis and prognosis. To provide an integrated and standardized circRNA expression profile for human cancers, we performed extensive data curation across 11 technical platforms, collecting 48 expression profile data sets for 18 cancer types and amassing 860 751 expression records. We also identified 189 193 differential expression signatures that are significantly different between normal and cancer samples. All the pre-calculated expression analysis results are organized into 132 plain text files for bulk download. Our online interface, circExp, provides data browsing and search functions. For each data set, a dynamic expression heatmap provides a profile overview. Based on the processed data, we found that 52 circRNAs were consistently and differentially expressed in 20 or more processed analyses. By mapping those circRNAs to their parent protein-coding genes, we found that they may have profoundly affected the survival of 10 797 patients in The Cancer Genome Atlas pan-cancer data set. In sum, we developed circExp and demonstrated that it is useful to identify circRNAs that have potential diagnostic and prognostic significance for a variety of cancer types. In this online and reusable database, found at ircexp, we have provided pre-calculated expression data about circRNAs and their parental genes, as well as data browsing and searching functions. Database URL: ircexp/

Publication

First Insight into the Human Liver Proteome from PROTEOME^SKY-LIVER^Hu 1.0, a Publicly Available Database

Publisher: American Chemical Society (ACS)

Date: 03-09-2009

DOI: 10.1021/PR900532R

Abstract: Herein, we report proteome and transcriptome profiles of the human adult liver and present an initial analysis. Overall, the human liver proteome (HLP) data set comprises 6788 identified proteins with at least two peptide matches at 95% confidence, including 3721 proteins newly identified in liver. The human liver transcriptome (HLT) data set consists of 11 205 expressed genes. The HLP is the largest proteome data set for a human organ and is the first direct association between a proteome and its transcriptome derived from the same sample. Although it is hard to approach complete coverage of the HLP currently, several conclusions based on this data set are clearly reached: (1) The 5816 protein-encoding genes (PEGs) represented by the HLP and the 11 104 PEGs represented in the HLT have been identified from 20 070 PEGs in IPI Human v3.07 and 19 478 PEGs in the integrated human transcriptome database, respectively. (2) The patterns of chromosomal distribution of the genes corresponding to the HLP are highly consistent with those of the HLT. Some chromosomal regions, such as 16p13.3, 19q13.31, 19q13.42, and Xq28, exhibit particularly high densities of liver-specific genes, which perform the important functions related to normal physiology or/and pathology in this organ. (3) The HLP spans 6 orders of magnitude in relative protein abundance and 78% of the proteins fall in the middle of this range. Of newly identified liver proteins, 82.5% are of low abundance. (4) Proteins involved in metabolism, transport, and coagulation and those containing active domains for metabolism, transport, and biosynthesis are significantly enriched in liver. (5) All 94 metabolic pathways in KEGG are touched to different extent. Of which, for 48 pathways, particularly those involved in metabolism of carbohydrates and amino acids, more than 80% of the component proteins have been detected.
The liver-specific pathways, such as those participating in metabolism of bile acid and bilirubin and in biotransformation, are identified with remarkably high coverage. A total of 31 members of the cytochrome P450 family are identified, four of which have been observed for the first time in human liver. (6) Transport proteins involved in energy metabolism and secretion of both protein and bile acid are highly abundant. Three ion channels are described for the first time in liver. (7) The 800 proteins related to signal transduction and primarily involved in cellular recognition, localization, communication, and inflammation are present in the HLP data set. Insulin and adipocytokine pathways, which are involved in the regulation of glucose and fatty acids, are highly covered. (8) Transcription factors (309 in total) have been recognized at relatively low detection rates and abundance however, transcription factors regulating gene expression related to transport, metabolism, and biosynthesis are detected at relatively higher coverage and the protein products of their target genes (100 in total), such as metabolic enzymes and plasma proteins, are also identified. (9) The overlap between the human liver and plasma proteomes is particularly noteworthy in the coagulation/anticoagulation/fibrinolysis and complement system. There is a significantly positive linear correlation between the abundance of coagulator proteins in liver and plasma.

Publication

Tertiary water striders (Hemiptera, Gerromorpha, Gerridae) from the central Tibetan Plateau and their palaeobiogeographic implications

Publisher: Elsevier BV

Date: 05-2019

DOI: 10.1016/J.JSEAES.2017.12.014

Publication

TSdb: A database of transporter substrates linking metabolic pathways and transporter systems on a genome scale via their shared substrates

Publisher: Springer Science and Business Media LLC

Date: 2011

DOI: 10.1007/S11427-010-4125-Y

Abstract: TSdb (tsdb.cbi.pku.edu.cn) is the first manually curated central repository that stores formatted information on the substrates of transporters. In total, 37608 transporters with 15075 substrates from 884 organisms were curated from UniProt functional annotation. A unique feature of TSdb is that all the substrates are mapped to identifiers from the KEGG Ligand compound database. Thus, TSdb links current metabolic pathway schema with compound transporter systems via the shared compounds in the pathways. Furthermore, all the transporter substrates in TSdb are classified according to their biochemical properties, biological roles and subcellular localizations. In addition to the functional annotation of transporters, extensive compound annotation that includes inhibitor information from the KEGG Ligand and BRENDA databases has been integrated, making TSdb a useful source for the discovery of potential inhibitory mechanisms linking transporter substrates and metabolic enzymes. User-friendly web interfaces are designed for easy access, query and download of the data. Text and BLAST searches against all transporters in the database are provided. We will regularly update the substrate data with evidence from new publications.

Publication

RLEdb: a database of rate-limiting enzymes and their regulation in human, rat, mouse, yeast and E. coli

Publisher: Springer Science and Business Media LLC

Date: 26-05-2009

DOI: 10.1038/CR.2009.61

Publication

Evidence for a saponin biosynthesis pathway in the body wall of the commercially significant sea cucumber Holothuria scabra

Publisher: MDPI AG

Date: 07-11-2017

DOI: 10.3390/MD15110349

Publication

A Genomics Resource for 12 Edible Seaweeds to Predict Seaweed-Secreted Peptides with Potential Anti-Cancer Function

Publisher: MDPI AG

Date: 04-10-2022

DOI: 10.3390/BIOLOGY11101458

Abstract: Seaweeds are multicellular marine macroalgae with natural compounds that have potential anticancer activity. To date, the identification of those compounds has relied on purification and assay, yet few have been documented. Additionally, the genomes and associated proteomes of edible seaweeds that have been identified thus far are scattered among different resources, with no systematic summary available, which hinders the development of large-scale omics analysis. To enable this, we constructed a comprehensive genomics resource for the edible seaweeds. These data could be used for systematic metabolomics and a proteome search for anti-cancer compounds and peptides. In brief, we integrated and annotated 12 publicly available edible seaweed genomes (8 species and 268,071 proteins). In addition, we integrated the new seaweed genomic resources with established cancer bioinformatics pipelines to help identify potential seaweed proteins that could help mitigate the development of cancer. We present 7892 protein domains that were predicted to be associated with cancer proteins based on a protein domain–domain interaction. The most enriched protein families were associated with protein phosphorylation and insulin signalling, both of which are recognised to be crucial molecular components for patient survival in various cancers. In addition, we found 6692 seaweed proteins that could interact with over 100 tumour suppressor proteins, of which 147 are predicted to be secreted proteins. In conclusion, our genomics resource not only may be helpful in exploring the genomics features of these edible seaweeds but also may provide a new avenue to explore the molecular mechanisms for seaweed-associated inhibition of human cancer development.

Publication

Online database for brain cancer-implicated genes: exploring the subtype-specific mechanisms of brain cancer.

Publisher: Springer Science and Business Media LLC

Date: 18-06-2021

DOI: 10.1186/S12864-021-07793-X

Abstract: Brain cancer is one of the eight most common cancers occurring in people aged 40+ and is the fifth-leading cause of cancer-related deaths for males aged 40–59. Accurate subtype identification is crucial for precise therapeutic treatment, which largely depends on understanding the biological pathways and regulatory mechanisms associated with different brain cancer subtypes. Unfortunately, the subtype-implicated genes that have been identified are scattered in thousands of published studies. So, systematic literature curation and cross-validation could provide a solid base for comparative genetic studies about major subtypes. Here, we constructed a literature-based brain cancer gene database (BCGene). In the current release, we have a collection of 1421 unique human genes gathered through an extensive manual examination of over 6000 PubMed abstracts. We comprehensively annotated those curated genes to facilitate biological pathway identification, cancer genomic comparison, and differential expression analysis in various anatomical brain regions. By curating cancer subtypes from the literature, our database provides a basis for exploring the common and unique genetic mechanisms among 40 brain cancer subtypes. By further prioritizing the relative importance of those curated genes in the development of brain cancer, we identified 33 top-ranked genes with evidence mentioned only once in the literature, which were significantly associated with survival rates in a combined dataset of 2997 brain cancer cases. BCGene provides a useful tool for exploring the genetic mechanisms of and gene priorities in brain cancer. BCGene is freely available to academic users at cgene/ .

Publication

Identifying the Common Cell-Free DNA Biomarkers across Seven Major Cancer Types

Publisher: MDPI AG

Date: 29-06-2023

DOI: 10.3390/BIOLOGY12070934

Abstract: Blood-based detection of circulating cell-free DNA (cfDNA) is a non-invasive and easily accessible method for early cancer detection. Despite the extensive utility of cfDNA, there are still many challenges to developing clinical biomarkers. For example, cfDNA with genetic alterations often composes a small portion of the DNA circulating in plasma, which can be confounded by cfDNA contributed by normal cells. Therefore, filtering out the potential false-positive cfDNA mutations from healthy populations will be important for cancer-based biomarkers. Additionally, many low-frequency genetic alterations are easily overlooked in a small number of cfDNA-based cancer tests. We hypothesize that the combination of diverse types of cancer studies on cfDNA will provide us with a new perspective on the identification of low-frequency genetic variants across cancer types for promoting early diagnosis. By building a standardized computational pipeline for 1358 cfDNA samples across seven cancer types, we prioritized 129 shared genetic variants in the major cancer types. Further functional analysis of the 129 variants found that they are mainly enriched in ribosome pathways such as cotranslational protein targeting the membrane, some of which are tumour suppressors, oncogenes, and genes related to cancer initiation. In summary, our integrative analysis revealed the important roles of ribosome proteins as common biomarkers in early cancer diagnosis.

Publication

A systems biology approach to identify intelligence quotient score-related genomic regions and pathways relevant to potential therapeutic treatments

Publisher: Springer Science and Business Media LLC

Date: 25-02-2014

DOI: 10.1038/SREP04176

Publication

GCGene: A gene resource for gastric cancer with literature evidence

Publisher: Impact Journals, LLC

Date: 26-04-2016

DOI: 10.18632/ONCOTARGET.9030

Publication

Application of omics research in seaweeds with a focus on red seaweeds

Publisher: Oxford University Press (OUP)

Date: 23-04-2021

DOI: 10.1093/BFGP/ELAB023

Abstract: Targeted ‘omics’ research for seaweeds, utilizing various computational and informatics frameworks, has the potential to rapidly develop our understanding of biological processes at the molecular level and contribute to solutions for the most pressing environmental and social issues of our time. Here, a systematic review into the current status of seaweed omics research was undertaken to evaluate the biological diversity of seaweed species investigated (red, green and brown phyla), the levels to which the work was undertaken (from full genome to transcripts, proteins or metabolites) and the field of research to which it has contributed. We report that from 1994 to 2021 the majority of seaweed omics research has been performed on the red seaweeds (45% of total studies), with more than half of these studies based upon two genera Pyropia and Gracilaria. A smaller number of studies examined brown seaweed (key genera Saccharina and Sargassum) and green seaweed (primarily Ulva). Overall, seaweed omics research is most highly associated with the field of evolution (46% of total studies), followed by the fields of ecology, natural products and their biosynthesis, omics methodology and seaweed–microbe interactions. Synthesis and specific outcomes derived from omics studies in the red seaweeds are provided. Together, these studies have provided a broad-scale interrogation of seaweeds, facilitating our ability to answer fundamental queries and develop applied outcomes. Crucial to the next steps will be establishing analytical tools and databases that can be more broadly utilized by practitioners and researchers across the globe because of their shared interest in the key seaweed genera.

Publication

UTP11 deficiency suppresses cancer development via nucleolar stress and ferroptosis

Publisher: Elsevier BV

Date: 06-2023

DOI: 10.1016/J.REDOX.2023.102705

Publication

A Proteomic Analysis for the Red Seaweed Asparagopsis taxiformis

Publisher: MDPI AG

Date: 20-01-2023

DOI: 10.3390/BIOLOGY12020167

Abstract: The red seaweed Asparagopsis taxiformis is a promising ruminant feed additive with anti-methanogenic properties that could contribute to global climate change solutions. Genomics has provided a strong foundation for in-depth molecular investigations, including proteomics. Here, we investigated the proteome of A. taxiformis (Lineage 6) in both sporophyte and gametophyte stages, using soluble and insoluble extraction methods. We identified 741 unique non-redundant proteins using a genome-derived database and 2007 using a transcriptome-derived database, which included numerous proteins predicted to be of fungal origin. We further investigated the genome-derived proteins to focus on seaweed-specific proteins. Ontology analysis indicated a relatively large proportion of ion-binding proteins (i.e., iron, zinc, manganese, potassium and copper), which may play a role in seaweed heavy metal tolerance. In addition, we identified 58 stress-related proteins (e.g., heat shock and vanadium-dependent haloperoxidases) and 44 photosynthesis-related proteins (e.g., phycobilisomes, photosystem I, photosystem II and ATPase), which were in general more abundantly identified from female gametophytes. Forty proteins were predicted to be secreted, including ten rhodophyte collagen-alpha-like proteins (RCAPs), which displayed overall high gene expression levels. These findings provide a comprehensive overview of expressed proteins in A. taxiformis, highlighting the potential for targeted protein extraction and functional characterisation for future biodiscovery.

Publication

The neuropeptidome of the Crown-of-Thorns Starfish, Acanthaster planci

Publisher: Elsevier BV

Date: 08-2017

DOI: 10.1016/J.JPROT.2017.05.026

Abstract: Outbreaks of Crown-of-Thorns Starfish (COTS; Acanthaster planci) are a major cause of destruction of coral communities on the Australian Great Barrier Reef. While factors relating to population explosions and the social interactions of COTS have been well studied, little is known about the neural mechanisms underlying COTS physiology and behaviour. One of the major classes of chemical messengers that regulate physiological and behavioural processes in animals is the neuropeptides. Here, we have analysed COTS genome and transcriptome sequence data to identify neuropeptide precursor proteins in this species. A total of 48 neuropeptide precursors were identified, including homologs of neuropeptides that are evolutionarily conserved throughout the Bilateria, and others that are novel. Proteomic mass spectrometry was employed to confirm the presence of neuropeptides in extracts of radial nerve cords. These transcriptomic and proteomic resources provide a foundation for functional studies that will enable a better understanding of COTS physiology and behaviour, and may facilitate development of novel population biocontrol methods. The Crown-of-Thorns Starfish (COTS) is one of the primary factors leading to coral loss on the Great Barrier Reef, Australia. The combined gene and proteomic findings of this study reveal the COTS neuropeptidome, including both echinoderm-like neuropeptides and novel putative neuropeptides. This represents the most comprehensive neuropeptidome for an echinoderm, contributing to the evolving knowledge of the COTS molecular neurobiology that may assist towards the development of biocontrol methods.

Publication

Changes in the neuropeptide content of Biomphalaria ganglia nervous system following Schistosoma infection

Publisher: Springer Science and Business Media LLC

Date: 02-06-2017

DOI: 10.1186/S13071-017-2218-1

Publication

WFDC3 inhibits tumor metastasis by promoting the ERβ-mediated transcriptional repression of TGFBR1 in colorectal cancer

Publisher: Springer Science and Business Media LLC

Date: 13-07-2023

DOI: 10.1038/S41419-023-05956-0

Abstract: Estrogen plays a protective role in colorectal cancer (CRC) and primarily functions through estrogen receptor β (ERβ). However, clinical strategies for CRC therapy associated with ERβ are still under investigation. Our discoveries identified WFDC3 as a tumor suppressor that facilitates estrogen-induced inhibition of metastasis through the ERβ/TGFBR1 signaling axis. WFDC3 interacts with ERβ and increases its protein stability by inhibiting its proteasome-dependent degradation. WFDC3 represses TGFBR1 expression through ERβ-mediated transcription. Blocking TGFβ signaling with galunisertib, a drug used in clinical trials that targets TGFBR1, impaired the migration of CRC cells induced by WFDC3 depletion. Moreover, there was clinical significance to WFDC3 in CRC, as CRC patients with high WFDC3 expression in tumor cells had favorable prognoses. Therefore, this work suggests that WFDC3 could be an indicator for therapies targeting the estrogen/ERβ pathway in CRC patients.

Publication

Single-cell sequencing reveals the potential oncogenic expression atlas of human iPSC-derived cardiomyocytes.

Publisher: The Company of Biologists

Date: 15-02-2021

DOI: 10.1242/BIO.053348

Abstract: Human induced pluripotent stem cells (iPSCs) are an important source for regenerative medicine. However, the links between pluripotency and oncogenic transformation raise safety issues. To understand the characteristics of iPSC-derived cells at single-cell resolution, we directly reprogrammed two human iPSC lines into cardiomyocytes and collected cells from four time points during cardiac differentiation for single-cell sequencing. We captured 32,365 cells and identified five molecularly distinct clusters that aligned well with our reconstructed differentiation trajectory. We discovered a set of dynamic expression events related to the upregulation of oncogenes and the decreasing expression of tumor suppressor genes during cardiac differentiation, which were similar to the gain-of-function and loss-of-function patterns during oncogenesis. In practice, we characterized the dynamic expression of the TP53 and Yamanaka factor genes (OCT4, SOX2, KLF4 and MYC), which are widely used for generating human iPSC lines, and revealed the co-occurrence of MYC overexpression and TP53 silencing in some human iPSC-derived TNNT2+ cardiomyocytes. In summary, our oncogenic expression atlas is valuable for human iPSC applications, and the single-cell resolution highlights clues potentially associated with the carcinogenic risk of human iPSC-derived cells.

Publication

The crown-of-thorns starfish genome as a guide for biocontrol of this coral reef pest

Publisher: Springer Science and Business Media LLC

Date: 04-2017

DOI: 10.1038/NATURE22033

Abstract: The crown-of-thorns starfish (COTS, the Acanthaster planci species group) is a highly fecund predator of reef-building corals throughout the Indo-Pacific region. COTS population outbreaks cause substantial loss of coral cover, diminishing the integrity and resilience of reef ecosystems. Here we sequenced genomes of COTS from the Great Barrier Reef, Australia and Okinawa, Japan to identify gene products that underlie species-specific communication and could potentially be used in biocontrol strategies. We focused on water-borne chemical plumes released from aggregating COTS, which make the normally sedentary starfish become highly active. Peptide sequences detected in these plumes by mass spectrometry are encoded in the COTS genome and expressed in external tissues. The exoproteome released by aggregating COTS consists largely of signalling factors and hydrolytic enzymes, and includes an expanded and rapidly evolving set of starfish-specific ependymin-related proteins. These secreted proteins may be detected by members of a large family of olfactory-receptor-like G-protein-coupled receptors that are expressed externally, sometimes in a sex-specific manner. This study provides insights into COTS-specific communication that may guide the generation of peptide mimetics for use on reefs with COTS outbreaks.

Publication

A Database of Lung Cancer-Related Genes for the Identification of Subtype-Specific Prognostic Biomarkers

Publisher: MDPI AG

Date: 24-02-2023

DOI: 10.3390/BIOLOGY12030357

Abstract: The molecular subtype is critical for accurate treatment and follow-up in patients with lung cancer; however, information regarding subtype-associated genes is dispersed among thousands of published studies. Systematic curation and cross-validation of the scientific literature would provide a solid foundation for comparative genetic studies of the major molecular subtypes of lung cancer. Here, we constructed a literature-based lung cancer gene database (LCGene). In the current release, we collected and curated 2507 unique human genes, including 2267 protein-coding and 240 non-coding genes from comprehensive manual examination of 10,960 PubMed article abstracts. Extensive annotations were added to aid identification of differentially expressed genes, potential gene editing sites, and non-coding gene regulation. For instance, we prepared 607 curated genes with CRISPR knockout information in 43 lung cancer cell lines. Further comparison of these implicated genes among different subtypes identified several subtype-specific genes with high mutational frequencies. Common tumor suppressors and oncogenes shared by lung adenocarcinoma and lung squamous cell carcinoma, for example, exhibited different mutational frequencies and prognostic features, suggesting the presence of subtype-specific biomarkers. Our retrospective analysis revealed 43 small cell lung cancer-specific genes. Moreover, 52 tumor suppressors and oncogenes shared by lung adenocarcinoma and squamous cell carcinoma confirmed the different molecular mechanisms of these two cancer subtypes. The subtype-based genetic differences, when combined, may provide insight into subtype-specific biomarkers for genetic testing.

Publication

High similarity of phylogenetic profiles of rate-limiting enzymes with inhibitory relation in Human, Mouse, Rat, budding Yeast and E. coli

Publisher: Springer Science and Business Media LLC

Date: 2011

DOI: 10.1186/1471-2164-12-S3-S10

Publication

A stop-gain mutation in GXYLT1 promotes metastasis of colorectal cancer via the MAPK pathway

Publisher: Springer Science and Business Media LLC

Date: 22-04-2022

DOI: 10.1038/S41419-022-04844-3

Abstract: Genomic instability plays a key role in the initiation and progression of colorectal cancer (CRC). Although cancer driver genes in CRC have been well characterized, identifying novel genes associated with carcinogenesis and treatment remains challenging because of tumor heterogeneity. Here, we analyzed the genomic alterations of 45 samples from CRC patients in northern China by whole-exome sequencing. In addition to the identification of six well-known CRC driver genes (APC, TP53, KRAS, FBXW7, PIK3CA, and PABPC), two tumor-related genes (MTCH2 and HSPA6) were detected, along with RRP7A and GXYLT1, which have not been previously linked to cancer. GXYLT1 was mutated in 40% (18/45) of the samples in our cohort. Functionally, GXYLT1 promoted migration and invasion in vitro and metastasis in vivo, while the GXYLT1 S212* mutant induced a significantly greater effect. Furthermore, both GXYLT1 and GXYLT1 S212* interacted with ERK2. GXYLT1 induced metastasis via a mechanism involving the Notch and MAPK pathways, whereas the GXYLT1 S212* mutant mainly promoted metastasis by activating the MAPK pathway. We propose that GXYLT1 acts as a novel metastasis-associated driver gene and GXYLT1 S212* might serve as a potential indicator for therapies targeting the MAPK pathway in CRC.

Publication

dbEMT 2.0: An updated database for epithelial-mesenchymal transition genes with experimentally verified information and precalculated regulation information for cancer metastasis

Publisher: Elsevier BV

Date: 12-2019

DOI: 10.1016/J.JGG.2019.11.010

Publication

Integrative analysis to identify oncogenic gene expression changes associated with copy number variations of enhancer in ovarian cancer

Publisher: Impact Journals, LLC

Date: 23-09-2017

DOI: 10.18632/ONCOTARGET.21227

Publication

Analysis of rhodopsin G protein-coupled receptor orthologs reveals semiochemical peptides for parasite (Schistosoma mansoni) and host (Biomphalaria glabrata) interplay.

Publisher: Springer Science and Business Media LLC

Date: 17-05-2022

DOI: 10.1038/S41598-022-11996-X

Abstract: Schistosomiasis is a medically significant disease caused by helminth parasites of the genus Schistosoma. The schistosome life cycle requires chemically mediated interactions with an intermediate (aquatic snail) and definitive (human) host. Blocking parasite development within the snail stage requires improved understanding of the interactions between the snail host and the Schistosoma water-borne free-living form (miracidium). Innovations in snail genomics and aquatic chemical communication provide an ideal opportunity to explore snail-parasite coevolution at the molecular level. Rhodopsin G protein-coupled receptors (GPCRs) are of particular interest in studying how trematode parasites navigate towards their snail hosts. The potential role of GPCRs in parasites makes them candidate targets for new antihelminthics that disrupt the intermediate host life-cycle stages, thus preventing subsequent human infections. A genomic-bioinformatic approach was used to identify GPCR orthologs between the snail Biomphalaria glabrata and miracidia of its obligate parasite Schistosoma mansoni. We show that 8 S. mansoni rhodopsin GPCRs expressed within the miracidial stage share overall amino acid similarity with 8 different B. glabrata rhodopsin GPCRs, particularly within transmembrane domains, suggesting conserved structural features. These GPCRs include an orphan peptide receptor as well as several with strong sequence homologies with rhabdomeric opsin receptors, a serotonin receptor, a sulfakinin (SK) receptor, an allatostatin-A (buccalin) receptor and an FMRFamide receptor. Buccalin and FMRFa peptides were identified in water conditioned by B. glabrata, and we show synthetic buccalin and FMRFa can stimulate significant rates of change of direction and turn-back responses in S. mansoni miracidia. Ortholog GPCRs were identified in S. mansoni miracidia and B. glabrata. These GPCRs may detect similar ligands, including snail-derived odorants that could facilitate miracidial host finding. These results lay the foundation for future research elucidating the mechanisms by which GPCRs mediate host finding, which can lead to the potential development of novel anti-schistosome interventions.

Publication

Genome-wide transcriptomics and copy number profiling identify patient-specific CNV-lncRNA-mRNA regulatory triplets in colorectal cancer

Publisher: Elsevier BV

Date: 02-2023

DOI: 10.1016/J.COMPBIOMED.2023.106545

Publication

Human transporter database: Comprehensive knowledge and discovery tools in the human transporter genes

Publisher: Public Library of Science (PLoS)

Date: 18-02-2014

DOI: 10.1371/JOURNAL.PONE.0088883

Publication

Molecular insights into land snail neuropeptides through transcriptome and comparative gene analysis

Publisher: Springer Science and Business Media LLC

Date: 17-04-2015

DOI: 10.1186/S12864-015-1510-8

Publication

Identification of consistent post-translational regulatory triplets related to oncogenic and tumour suppressive modulators in childhood acute lymphoblastic leukemia

Publisher: PeerJ

Date: 14-07-2021

DOI: 10.7717/PEERJ.11803

Abstract: Acute lymphoblastic leukemia (ALL) is the most common type of childhood cancer. It can be caused by mutations that turn on oncogenes or turn off tumour suppressor genes (TSGs). For instance, changes in certain genes including Rb and p53 are common in ALL cells. Oncogenes and TSGs may serve as modulator genes that regulate gene expression levels via their respective target genes. To investigate the regulatory relationship between oncogenes, tumour suppressor genes and transcription factors at the post-translational level in childhood ALL, we performed an integrative network analysis of gene regulation at the post-translational level for childhood ALL based on publicly available cancer gene expression data, including the TARGET and GEO databases. We collected 259 childhood ALL-related genes from the latest online leukemia database, Leukemia Gene Literature Database. These 259 genes were selected from a comprehensive systematic literature review with experimental evidence. The identified and curated genes were also associated with patient survival cases and we incorporated this pediatric ALL-related gene list into our analysis. We extracted the known human TFs from the TRRUST database. Among 259 childhood ALL-related genes, 101 unique regulators were mapped to the list of oncogenes and TSGs from the ONGene and the TSGene databases, and these included 74 TSGs, 62 oncogenes and 46 TF genes. The resulting regulation was presented as a hierarchical regulatory network with transcription factors (TFs) as intermediate regulators connecting the top modulators (oncogenes and TSGs) to the common target genes. Cross-validation was applied to the results from the TARGET dataset by identifying the consistent regulatory motifs based on three independent ALL expression datasets. A three-layer regulatory network of consistent positive modulators in childhood ALL was constructed in which 74 modulators (40 oncogenes, 34 TSGs) are considered as the most important regulators. The middle layer and the bottom layer contain 34 TFs and 176 target genes, respectively. Oncogenes mostly participated in positive regulation of gene expression and the transcription process of RNA polymerase II, while TSGs were mainly involved in the negative regulation of gene expression. In addition, the oncogene-specific targets were enriched with regulators of the MAPK cascade while tumour suppressor-specific targets were associated with cell death. The results revealed that oncogenes and TSGs possess a different functional regulatory pattern with regard to not only their biological functions but also their specific target genes in childhood ALL cancer progression. Taken together, our findings could contribute to a better understanding of the important regulatory mechanisms and this method could be used to analyse the targeted genes at the post-translational level in childhood ALL through integrative network analysis.

Publication

Concordance of copy number loss and down-regulation of tumor suppressor genes: a pan-cancer study

Publisher: Springer Science and Business Media LLC

Date: 08-2016

DOI: 10.1186/S12864-016-2904-Y

Publication

Integrative proteomic analysis reveals potential high-frequency alternative open reading frame-encoded peptides in human colorectal cancer

Publisher: Elsevier BV

Date: 12-2018

DOI: 10.1016/J.LFS.2018.11.018

Abstract: Identification of alternative open reading frame-encoded peptides (AEPs) for the diagnosis of colorectal cancer at the proteome level is largely unexplored because of a lack of comprehensive proteomics data. Here, we performed a comprehensive integrative analysis of mass spectral data published by the Clinical Proteomic Tumor Analysis Consortium and characterized 93 high-confidence AEPs encoded within 75 genes. Four cancer-related genes, ABCF2, AR, RBM10 and NRG1, appeared to have AEPs identified frequently, in >20 out of 95 colorectal cancer samples. Further network analysis of the identified AEPs found the enrichment of novel AEPs within hormone androgen receptor and a highly-modularised network with 42 genes associated with patient survival. Our results not only suggested a mechanistic view of how AEPs work in cancer progression, but also shed light on somatic amino acid mutations in AEPs, which might have been overlooked previously because of their low frequencies. In particular, potential high-frequency mutations in 77 samples associated with EDARADD may contribute to the discovery of new biomarkers and the development of innovative therapeutic approaches.

Publication

Proteomic analysis of the venom and venom sac of the woodwasp, Sirex noctilio - Towards understanding its biological impact

Publisher: Elsevier BV

Date: 09-2016

DOI: 10.1016/J.JPROT.2016.07.002

Abstract: The European horntail woodwasp, Sirex noctilio, is an invasive insect that attacks conifer hosts, particularly Pinus species. Venom injected by female S. noctilio, together with its symbiotic fungus, damages the normal physiology of Pinus, leading to death of the tree. To identify the proteinaceous components in the venom and uncover the interplay between venom proteins and tree proteins, this work clarified the overall profile of proteins produced in the venom gland apparatus. For the venom sac proteome, proteins were digested in solution in either a natural or deglycosylated state prior to nanoHPLC LTQ-Orbitrap analysis under CID/ETD mode. Here, we report the identification of 1454 and 1225 proteins in venom and sac, respectively, with 410 proteins common to both. Approximately 90 proteins were predicted to be secretory, of which 8 have features characteristic of toxins. Chemosensory binding proteins were also identified. Gene ontology and KEGG pathway analyses were employed to predict the protein functions and biological pathways in venom and sac. Protein-protein interaction (PPI) analysis suggested that one-step responses represent the majority of the Sirex-Pinus PPIs, and the proteins representing network hub nodes could be of importance for the development of pest management strategies. The woodwasp Sirex noctilio is an invasive species in many parts of the world, including Australia and North America, where it is considered among the top 10 most serious forest insects. Where they have been introduced, the female woodwasps attack living pine trees, causing significant economic losses. Central to this destruction is the woodwasp's life cycle requirement to bore a hole to deposit eggs and a toxic mucus that disables the tree's network for transporting water and nutrients, yet aids in larval survival.
Here we specifically examine the mucus gland apparatus and its contents, revealing the protein components that together with 'noctilisin' facilitate this complex association. The identification of chemosensory binding proteins further supports a role for the woodwasp ovipositor as an instrument for early stages of host tree selection. These findings could provide important clues towards the development of novel control tools against this pest.

Publication

Copy number alteration of neuropeptides and receptors in multiple cancers

Publisher: Springer Science and Business Media LLC

Date: 04-07-2017

DOI: 10.1038/S41598-017-04832-0

Abstract: Neuropeptides are peptide hormones used as chemical signals by the neuroendocrine system to communicate between cells. Recently, neuropeptides have been recognized for their ability to act as potent cellular growth factors on many cell types, including cancer cells. However, the molecular mechanism for how this occurs is unknown. To clarify the relationship between neuropeptides and cancer, we manually curated a total of 127 human neuropeptide genes by integrating information from the literature, homologous sequences, and database searches. Using human ligand-receptor interaction data, we first identified an interactome of 226 interaction pairs between 93 neuropeptides and 133 G-protein coupled receptors. We further identified four neuropeptide-receptor functional modules with ten or more genes, all of which were highly mutated in multiple cancers. We have identified a number of neuropeptide signaling systems with both oncogenic and tumour-suppressing roles for cancer progression, such as the insulin-like growth factors. By focusing on the neuroendocrine prostate cancer mutational data, we found prevalent amplification of neuropeptides and receptors in about 72% of samples. In summary, we report the first observation of abundant copy number variations on neuropeptides and receptors, which will be valuable for the design of peptide-based cancer prognosis, diagnosis and treatment.

Publication

Fragmented mitochondrial genomes of seal lice (family Echinophthiriidae) and gorilla louse (family Pthiridae): frequent minichromosomal splits and a host switch of lice between seals.

Publisher: Springer Science and Business Media LLC

Date: 08-04-2022

DOI: 10.1186/S12864-022-08530-8

Abstract: The mitochondrial (mt) genomes of 15 species of sucking lice from seven families have been studied to date. These louse species have highly dynamic, fragmented mt genomes that differ in the number of minichromosomes, the gene content, and gene order in a minichromosome between families and even between species of the same genus. In the present study, we analyzed the publicly available data to understand mt genome fragmentation in seal lice (family Echinophthiriidae) and gorilla louse, Pthirus gorillae (family Pthiridae), in particular the role of minichromosome split and minichromosome merger in the evolution of fragmented mt genomes. We show that 1) at least three ancestral mt minichromosomes of sucking lice have split in the lineage leading to seal lice, 2) one minichromosome ancestral to primate lice has split in the lineage to the gorilla louse, and 3) two ancestral minichromosomes of seal lice have merged in the lineage to the northern fur seal louse. Minichromosome split occurred 15-16 times in total in the lineages leading to species in six families of sucking lice investigated. In contrast, minichromosome merger occurred only four times in the lineages leading to species in three families of sucking lice. Further, three ancestral mt minichromosomes of sucking lice have split multiple times independently in different lineages of sucking lice. Our analyses of mt karyotypes and gene sequences also indicate the possibility of a host switch of crabeater seal louse to Weddell seals. We conclude that: 1) minichromosome split contributes more than minichromosome merger in mt genome fragmentation of sucking lice, and 2) mt karyotype comparison helps understand the phylogenetic relationships between sucking louse species.

Publication

Deciphering Signaling Pathway Networks to Understand the Molecular Mechanisms of Metformin Action

Publisher: Public Library of Science (PLoS)

Date: 17-06-2015

DOI: 10.1371/JOURNAL.PCBI.1004202

Publication

Multiomics analysis of the giant triton snail salivary gland, a crown-of-thorns starfish predator

Publisher: Springer Science and Business Media LLC

Date: 20-07-2017

DOI: 10.1038/S41598-017-05974-X

Abstract: The giant triton snail (Charonia tritonis) is one of the few natural predators of the adult Crown-of-Thorns starfish (COTS), a corallivore that has been damaging to many reefs in the Indo-Pacific. Charonia species have large salivary glands (SGs) that are suspected to produce venom and/or sulphuric acid, which can immobilize their prey and neutralize the intrinsic toxic properties of COTS. To date, there is little information on the types of toxins produced by tritons. In this paper, the predatory behaviour of C. tritonis is described. Then, the C. tritonis SG, which itself is made up of an anterior lobe (AL) and posterior lobe (PL), was analyzed using an integrated transcriptomics and proteomics approach to identify putative toxin- and feeding-related proteins. A de novo transcriptome database and in silico protein analysis predict that ~3800 proteins have features consistent with being secreted. A gland-specific proteomics analysis confirmed the presence of numerous SG-AL and SG-PL proteins, including those with similarity to cysteine-rich venom proteins. Sulfuric acid biosynthesis enzymes were identified, specific to the SG-PL. Our analysis of the C. tritonis SG (AL and PL) has provided a deeper insight into the biomolecular toolkit used for predation and feeding by C. tritonis.

Publication

Biomolecular changes that occur in the antennal gland of the giant freshwater prawn (Machrobrachium rosenbergii)

Publisher: Public Library of Science (PLoS)

Date: 29-06-2017

DOI: 10.1371/JOURNAL.PONE.0177064

Publication

Comprehensive analyses of tumor suppressor genes in protein-protein interaction networks: A topological perspective

Publisher: IEEE

Date: 12-2012

DOI: 10.1109/GENSIPS.2012.6507738

Min Zhao

Researcher

Related Links

Publications

Mutational analysis of driver genes with tumor suppressive and oncogenic roles in gastric cancer

Bioinformatic investigation and functional analysis of 214 hereditary genes identified non-coding RNAs as therapeautic tool for breast cancer management

GPCR and IR genes in Schistosoma mansoni miracidia

A pan-cancer study of copy number gain and up-regulation in human oncogenes

Comparative study of excretory-secretory proteins released by Schistosoma mansoni-resistant, susceptible and naïve Biomphalaria glabrata.

The pan-cancer analysis of gain-of-functional mutations to identify the common oncogenic signatures in multiple cancers.

Early Miocene elevation in northern Tibet estimated by palaeobotanical evidence

Cellular Metabolic Network Analysis: Discovering Important Reactions in Treponema pallidum

TSGene: a web resource for tumor suppressor genes

Expression of epithelial-mesenchymal transition-related genes increases with copy number in multiple cancer types

Greenlip Abalone (Haliotis laevigata) Genome and Protein Analysis Provides Insights into Maturation and Spawning.

REGene: a literature-based knowledgebase of animal regeneration that bridge tissue regeneration and cancer

lnCaNet: pan-cancer co-expression network for human lncRNA and cancer genes

Multi-tissue transcriptomics for construction of a comprehensive gene resource for the terrestrial snail Theba pisana

IQdb: an intelligence quotient score-associated gene resource for human intelligence

Identifying the Common Cell-Free DNA Biomarkers across Seven Major Cancer Types

dbEMT: an epithelial-mesenchymal transition associated gene resource

Consistent analysis of differentially expressed genes across 7 cell types in papillary thyroid carcinoma

TSGene 2.0: an updated literature-based knowledgebase for tumor suppressor genes

CIGene: a literature-based online resource for cancer initiation genes

Distinct and Competitive Regulatory Patterns of Tumor Suppressor Genes and Oncogenes in Ovarian Cancer

Integrative analysis of common genes and driver mutations implicated in hormone stimulation for four cancers in women

Decode the Stable Cell Communications Based on Neuropeptide-Receptors Network in 36746 Tumor Cells

Neuropeptides encoded by the genomes of the Akoya pearl oyster Pinctata fucata and Pacific oyster Crassostrea gigas: A bioinformatic and peptidomic survey

METSP: A Maximum-Entropy Classifier Based Text Mining Tool for Transporter-Substrate Identification with Semistructured Text

Concordance between somatic copy number loss and down-regulated expression: A pan-cancer study of cancer predisposition genes

Exploring the role of post-translational modulators of transcription factors in triple-negative breast cancer gene expression

The bioinformatics tools for the genome assembly and analysis based on third-generation sequencing.

Systematic review of next-generation sequencing simulators: computational tools, features and perspectives

PathLocdb: a comprehensive database for the subcellular localization of metabolic pathways and its application to multiple localization analysis

Gonadotropin-releasing hormone and adipokinetic hormone/corazonin-related peptide in the female prawn

Computational tools for copy number variation (CNV) detection using next-generation sequencing data: Features and perspectives

OCGene: a database of experimentally verified ovarian cancer-related genes with precomputed regulation information

An evidence-based knowledgebase of metastasis suppressors to identify key pathways relevant to cancer metastasis

A gene browser of colorectal cancer with literature evidence and pre-computed regulatory information to identify key tumor suppressors and oncogenes

Human liver rate-limiting enzymes influence metabolic flux via branch points and inhibitors

CNVannotator: A comprehensive annotation server for copy number variation in the human genome

Pedican: an online gene resource for pediatric cancers with literature evidence

CSGene: a literature-based database for cell senescence genes and its application to identify critical cell aging pathways and associated diseases

Reproducible combinatorial regulatory networks elucidate novel oncogenic microRNAs in non-small cell lung cancer

AutismKB: an evidence-based knowledgebase of autism genetics

SynDB: a Synapse protein DataBase based on synapse ontology

dbLGL: an online leukemia gene and literature database for the retrospective comparison of adult and childhood leukemia genetics with literature evidence.

In silico neuropeptidome of female Macrobrachium rosenbergii based on transcriptome and peptide mining of eyestalk, central nervous system and ovary

dbCPG: A web resource for cancer predisposition genes

CMGene: A literature-based database and knowledge resource for cancer metastasis genes

circVAR database: genome-wide archive of genetic variants for human circular RNAs.

EDdb: A web resource for eating disorder and its application to identify an extended adipocytokine signaling pathway related to eating disorder

Literature-based knowledgebase of pancreatic cancer gene to prioritize the key genes and pathways

ONGene: A literature-based database for human oncogenes

Synergetic regulatory networks mediated by oncogene-driven microRNAs and transcription factors in serous ovarian cancer

The genome of the oyster Saccostrea offers insight into the environmental resilience of bivalves.

Characterization of Schizophrenia Adverse Drug Interactions through a Network Approach and Drug Classification

Identification of novel prognosis-related genes associated with cancer using integrative network analysis

ECGene: A Literature-Based Knowledgebase of Endometrial Cancer Genes

eSnail: A transcriptome‐based molecular resource of the central nervous system for terrestrial gastropods

Proteomic analysis of the Schistosoma mansoni miracidium

Gene Dosage Analysis on the Single-Cell Transcriptomes Linking Cotranslational Protein Targeting to Metastatic Triple-Negative Breast Cancer.

Meta-analysis of gene expression studies in endometrial cancer identifies gene expression profiles associated with aggressive disease and patient outcome

Mutational analysis revealed 97 key cancer metastasis genes from extracellular vesicles associated with patient survival

Constructing a comprehensive gene co-expression based interactome in Bos taurus

circExp database: an online transcriptome platform for human circRNA expressions in cancers

First Insight into the Human Liver Proteome from PROTEOMESKY-LIVERHu 1.0, a Publicly Available Database

Tertiary water striders (Hemiptera, Gerromorpha, Gerridae) from the central Tibetan Plateau and their palaeobiogeographic implications

TSdb: A database of transporter substrates linking metabolic pathways and transporter systems on a genome scale via their shared substrates

RLEdb: a database of rate-limiting enzymes and their regulation in human, rat, mouse, yeast and E. coli

Evidence for a saponin biosynthesis pathway in the body wall of the commercially significant sea cucumber Holothuria scabra

A Genomics Resource for 12 Edible Seaweeds to Predict Seaweed-Secreted Peptides with Potential Anti-Cancer Function

Online database for brain cancer-implicated genes: exploring the subtype-specific mechanisms of brain cancer.

A systems biology approach to identify intelligence quotient score-related genomic regions and pathways relevant to potential therapeutic treatments

GCGene: A gene resource for gastric cancer with literature evidence
