ARDC Research Link Australia

ORCID Profile
0000-0002-1735-2630

Current Organisations
Wellcome Sanger Institute, University of Cambridge

Publications

Publication

A Novel Edge Weighting Method to Enhance Network Community Detection

Publisher: IEEE

Date: 10-2016

DOI: 10.1109/SMC.2015.42

Publication

Simulating the dynamics of targeted capture sequencing with CapSim

Publisher: Oxford University Press (OUP)

Date: 28-10-2018

DOI: 10.1093/BIOINFORMATICS/BTX691

Abstract: Targeted sequencing using capture probes has become increasingly popular in clinical applications due to its scalability and cost-effectiveness. The approach also allows for higher sequencing coverage of the targeted regions resulting in better analysis statistical power. However, because of the dynamics of the hybridization process, it is difficult to evaluate the efficiency of the probe design prior to the experiments which are time consuming and costly. We developed CapSim, a software package for simulation of targeted sequencing. Given a genome sequence and a set of probes, CapSim simulates the fragmentation, the dynamics of probe hybridization and the sequencing of the captured fragments on Illumina and PacBio sequencing platforms. The simulated data can be used for evaluating the performance of the analysis pipeline, as well as the efficiency of the probe design. Parameters of the various stages in the sequencing process can also be evaluated in order to optimize the experiments. CapSim is publicly available under BSD license at github.com/Devika1/capsim. Supplementary data are available at Bioinformatics online.

Publication

Development of computational tools for analysis of polyploid plant genomes with application to hexaploid sweetpotato

Publisher: University of Queensland Library

Date: 2019

DOI: 10.14264/UQL.2019.320

Publication

High-throughput multiplexed tandem repeat genotyping using targeted long-read sequencing

Publisher: Cold Spring Harbor Laboratory

Date: 17-06-2019

DOI: 10.1101/673251

Abstract: Tandem repeats (TRs) are highly prone to variation in copy numbers due to their repetitive and unstable nature, which makes them a major source of genomic variation between individuals. However, population variation of TRs has not been widely explored due to the limitations of existing tools, which are either low-throughput or restricted to a small subset of TRs. Here, we used the SureSelect targeted sequencing approach combined with Nanopore sequencing to overcome these limitations. We achieved an average of 3062-fold target enrichment on a panel of 142 TR loci, generating an average of 97X sequence coverage on 7 samples utilizing 2 MinION flow-cells with 200 ng of input DNA per sample. We identified a subset of 110 TR loci with length less than 2 kb and GC content greater than 25%, for which we achieved an average genotyping rate of 75%, increasing to 91% for the highest-coverage sample. Alleles estimated from targeted long-read sequencing were concordant with gold standard PCR sizing analysis and, moreover, highly correlated with alleles estimated from whole genome long-read sequencing. We demonstrate a targeted long-read sequencing approach that enables simultaneous analysis of hundreds of TRs with accuracy comparable to PCR sizing analysis. Our approach is feasible to scale for more targets and more samples, facilitating large-scale analysis of TRs.

Publication

Ongoing human chromosome end extension revealed by analysis of BioNano and nanopore data

Publisher: Cold Spring Harbor Laboratory

Date: 14-02-2017

DOI: 10.1101/108365

Abstract: The majority of human chromosome ends remain incompletely assembled due to their highly repetitive structure. In this study, we use BioNano data to anchor and extend chromosome ends from two European trios as well as two unrelated Asian genomes. BioNano assembled chromosome ends are structurally divergent from the reference genome, including both missing sequence (10%) and extensions (22%). These extensions are heritable and in some cases divergent between Asian and European samples. Six ninths of the extension sequence in NA12878 can be confirmed and filled by nanopore data. We identify two sequence families in these sequences which have undergone substantial duplication in multiple primate lineages. We show that these sequence families have arisen from progenitor interstitial sequence on the ancestral primate chromosome 7. Comparison of chromosome end sequences from 15 species revealed that chromosome end missing sequence matches the corresponding phylogenetic relationship and revealed a rate of chromosome extension per chromosome of 0.0020 bp per year on average.

Publication

Transcriptional and epi-transcriptional dynamics of SARS-CoV-2 during cellular infection

Publisher: Cold Spring Harbor Laboratory

Date: 22-12-2020

DOI: 10.1101/2020.12.22.423893

Abstract: SARS-CoV-2 uses subgenomic (sg)RNA to produce viral proteins for replication and immune evasion. We applied long-read RNA and cDNA sequencing to in vitro human and primate infection models to study transcriptional dynamics. Transcription-regulating sequence (TRS)-dependent sgRNA was upregulated earlier in infection than TRS-independent sgRNA. An abundant class of TRS-independent sgRNA consisting of a portion of ORF1ab containing nsp1 joined to ORF10 and 3’UTR was upregulated at 48 hours post infection in human cell lines. We identified double-junction sgRNA containing both TRS-dependent and independent junctions. We found multiple sites at which the SARS-CoV-2 genome is consistently more modified than sgRNA, and that sgRNA modifications are stable across transcript clusters, host cells and time since infection. Our work highlights the dynamic nature of the SARS-CoV-2 transcriptome during its replication cycle. Our results are available via an interactive web-app at coinlab.mdhs.unimelb.edu.au/.

Publication

Insights into population structure of East African sweetpotato cultivars from hybrid assembly of chloroplast genomes

Publisher: F1000 Research Ltd

Date: 05-09-2018

DOI: 10.12688/GATESOPENRES.12856.1

Abstract: Background: The chloroplast (cp) genome is an important resource for studying plant diversity and phylogeny. Assembly of the cp genomes from next-generation sequencing data is complicated by the presence of two large inverted repeats contained in the cp DNA. Methods: We constructed a complete circular cp genome assembly for the hexaploid sweetpotato using extremely low coverage ( ×) Oxford Nanopore whole-genome sequencing (WGS) data coupled with Illumina sequencing data for polishing. Results: The sweetpotato cp genome of 161,274 bp contains 152 genes, of which there are 96 protein coding genes, 8 rRNA genes and 48 tRNA genes. Using the cp genome assembly as a reference, we constructed complete cp genome assemblies for a further 17 sweetpotato cultivars from East Africa and an I. triloba line using Illumina WGS data. Analysis of the sweetpotato cp genomes demonstrated the presence of two distinct subpopulations in East Africa. Phylogenetic analysis of the cp genomes of the species from the Convolvulaceae Ipomoea section Batatas revealed that the most closely related diploid wild species of the hexaploid sweetpotato is I. trifida. Conclusions: Nanopore long reads are helpful in construction of cp genome assemblies, especially in solving the two long inverted repeats. We are generally able to extract cp sequences from WGS data of sufficiently high coverage for assembly of cp genomes. The cp genomes can be used to investigate the population structure and the phylogenetic relationship for the sweetpotato.

Publication

YaHS: yet another Hi-C scaffolding tool

Publisher: Oxford University Press (OUP)

Date: 16-12-2022

DOI: 10.1093/BIOINFORMATICS/BTAC808

Abstract: We present YaHS, a user-friendly command-line tool for the construction of chromosome-scale scaffolds from Hi-C data. It can be run with a single-line command, requires minimal input from users (an assembly file and an alignment file) which is compatible with similar tools and provides assembly results in multiple formats, thereby enabling rapid, robust and scalable construction of high-quality genome assemblies with high accuracy and contiguity. YaHS is implemented in C and licensed under the MIT License. The source code, documentation and tutorial are available at anger-tol/yahs. Supplementary data are available at Bioinformatics online.

Publication

Self organized parallel genetic algorithm to automatically realize diversified convergence

Publisher: IEEE

Date: 06-2012

DOI: 10.1109/CEC.2012.6256642

Publication

Genome sequences of two diploid wild relatives of cultivated sweetpotato reveal targets for genetic improvement

Publisher: Springer Science and Business Media LLC

Date: 02-11-2018

DOI: 10.1038/S41467-018-06983-8

Abstract: Sweetpotato [Ipomoea batatas (L.) Lam.] is a globally important staple food crop, especially for sub-Saharan Africa. Agronomic improvement of sweetpotato has lagged behind other major food crops due to a lack of genomic and genetic resources and inherent challenges in breeding a heterozygous, clonally propagated polyploid. Here, we report the genome sequences of its two diploid relatives, I. trifida and I. triloba, and show that these high-quality genome assemblies are robust references for hexaploid sweetpotato. Comparative and phylogenetic analyses reveal insights into the ancient whole-genome triplication history of Ipomoea and evolutionary relationships within the Batatas complex. Using resequencing data from 16 genotypes widely used in African breeding programs, genes and alleles associated with carotenoid biosynthesis in storage roots are identified, which may enable efficient breeding of varieties with high provitamin A content. These resources will facilitate genome-enabled breeding in this important food security crop.

Publication

Assembly of whole-chromosome pseudomolecules for polyploid plant genomes using outcrossed mapping populations

Publisher: Cold Spring Harbor Laboratory

Date: 22-03-2017

DOI: 10.1101/119271

Abstract: The assembly of whole-chromosome pseudomolecules for plant genomes remains challenging due to polyploidy and high repeat content. We developed an approach for constructing complete pseudomolecules for polyploid species using genotyping-by-sequencing data from outcrossing mapping populations coupled with high coverage whole genome sequence data of a reference genome. Our approach combines de novo assembly with linkage mapping to arrange scaffolds into pseudomolecules. We show that the method is able to reconstruct simulated chromosomes for both diploid and tetraploid genomes. Comparisons to three existing genetic mapping tools show that our method outperforms the other methods in accuracy on both grouping and ordering, and is robust to the presence of substantial amounts of missing data and genotyping errors. We applied our method to three real datasets including a diploid Ipomoea trifida and two tetraploid potato mapping populations. The linkage maps show significant concordance with the reference chromosomes. We resolved seven assembly errors for the published Ipomoea trifida genome assembly as well as anchored an unplaced scaffold in the published potato genome.

Publication

High-throughput multiplexed tandem repeat genotyping using targeted long-read sequencing [version 1; peer review: 1 approved with reservations, 1 not approved]

Publisher: F1000 Research Ltd

Date: 02-09-2020

DOI: 10.12688/F1000RESEARCH.25693.1

Abstract: Background: Tandem repeats (TRs) are highly prone to variation in copy numbers due to their repetitive and unstable nature, which makes them a major source of genomic variation between individuals. However, population variation of TRs has not been widely explored due to the limitations of existing approaches, which are either low-throughput or restricted to a small subset of TRs. Here, we demonstrate a targeted sequencing approach combined with Nanopore sequencing to overcome these limitations. Methods: We selected 142 TR targets and enriched these regions using the Agilent SureSelect target enrichment approach with only 200 ng of input DNA. We barcoded the enriched products and sequenced them on an Oxford Nanopore MinION sequencer. We used VNTRTyper and Tandem-genotypes to genotype TRs from long-read sequencing data. Gold standard PCR sizing analysis was used to validate genotyping results from targeted sequencing data. Results: We achieved an average of 3062-fold target enrichment on a panel of 142 TR loci, generating an average of 97X coverage per sample with 200 ng of input DNA per sample. We successfully genotyped an average of 75% of targets, and the genotyping rate increased to 91% for the highest-coverage sample, for targets with length less than 2 kb and GC content greater than 25%. Alleles estimated from targeted long-read sequencing were concordant with gold standard PCR sizing analysis and highly correlated with alleles estimated from whole genome long-read sequencing. Conclusions: We demonstrate a targeted long-read sequencing approach that enables simultaneous analysis of hundreds of TRs with accuracy comparable to PCR sizing analysis. Our approach is feasible to scale for more targets and more samples, facilitating large-scale analysis of TRs.

Publication

YaHS: yet another Hi-C scaffolding tool

Publisher: Cold Spring Harbor Laboratory

Date: 09-06-2022

DOI: 10.1101/2022.06.09.495093

Abstract: We present YaHS, a user-friendly command-line tool for construction of chromosome-scale scaffolds from Hi-C data. It can be run with a single-line command, requires minimal input from users (an assembly file and an alignment file) which is compatible with similar tools, and provides assembly results in multiple formats, thereby enabling rapid, robust and scalable construction of high-quality genome assemblies with high accuracy and contiguity. YaHS is implemented in C and licensed under the MIT License. The source code, documentation and tutorial are available at -zhou/yahs.

Publication

A novel fitness allocation algorithm for maintaining a constant selective pressure during GA procedure

Publisher: Elsevier BV

Date: 2015

DOI: 10.1016/J.NEUCOM.2012.07.063

Publication

Global, regional, and national levels of maternal mortality, 1990–2015: a systematic analysis for the Global Burden of Disease Study 2015

Publisher: Ovid Technologies (Wolters Kluwer Health)

Date: 2017

DOI: 10.1097/01.OGX.0000511935.64476.66

Publication

Ongoing human chromosome end extension revealed by analysis of BioNano and nanopore data

Publisher: Springer Science and Business Media LLC

Date: 09-11-2018

DOI: 10.1038/S41598-018-34774-0

Abstract: The majority of human chromosome ends remain incompletely assembled due to their highly repetitive structure. In this study, we use BioNano data to anchor and extend chromosome ends from two European trios as well as two unrelated Asian genomes. At least 11 BioNano assembled chromosome ends are structurally divergent from the reference genome, including both missing sequence and extensions. These extensions are heritable and in some cases divergent between Asian and European samples. Six out of nine predicted extension sequences from NA12878 can be confirmed and filled by nanopore data. We identify two multi-kilobase sequence families both enriched more than 100-fold in extension sequence (p-values 1e-5) whose origins can be traced to interstitial sequence on ancestral primate chromosome 7. Extensive sub-telomeric duplication of these families has occurred in the human lineage subsequent to divergence from chimpanzees.

Publication

Transcriptional and epi-transcriptional dynamics of SARS-CoV-2 during cellular infection

Publisher: Elsevier BV

Date: 05-2021

DOI: 10.1016/J.CELREP.2021.109108

Related Organisations

Organisation

University Of Queensland Institute For Molecular Bioscience

Location: Australia

Organisation

Victorian Comprehensive Cancer Centre

Location: Australia

Organisation

Wellcome Sanger Institute

Location: United Kingdom of Great Britain and Northern Ireland

Organisation

University Of Cambridge

Location: United Kingdom of Great Britain and Northern Ireland

Related Funding Activities

No related grants have been discovered for Chenxi Zhou.

Chenxi Zhou

Researcher
