ARDC Research Link Australia

Publication

Targeted next-generation sequencing of 22 mismatch repair genes identifies Lynch syndrome families

Publisher: Wiley

Date: 25-01-2016

DOI: 10.1002/CAM4.628

Publication

Genome-wide analysis of chemically induced mutations in mouse in phenotype-driven screens

Publisher: Springer Science and Business Media LLC

Date: 26-10-2015

DOI: 10.1186/S12864-015-2073-4

Publication

Predicting SUMOylation sites in developmental transcription factors of Drosophila melanogaster

Publisher: Elsevier BV

Date: 08-2010

DOI: 10.1016/J.NEUCOM.2010.01.022

Publication

Isling: A Tool for Detecting Integration of Wild-Type Viruses and Clinical Vectors

Publisher: Elsevier BV

Date: 06-2022

DOI: 10.1016/J.JMB.2021.167408

Abstract: Detecting viral and vector integration events is a key step when investigating interactions between viral and host genomes. This is relevant in several fields, including virology, cancer research and gene therapy. For example, investigating integrations of wild-type viruses such as human papillomavirus and hepatitis B virus has proven to be crucial for understanding the role of these integrations in cancer. Furthermore, identifying the extent of vector integration is vital for determining the potential for genotoxicity in gene therapies. To address these questions, we developed isling, the first tool specifically designed for identifying viral integrations in both wild-type and vector from next-generation sequencing data. Isling addresses complexities in integration behaviour including integration of fragmented genomes and integration junctions with ambiguous locations in a host or vector genome, and can also flag possible vector recombinations. We show that isling is up to 1.6-fold faster and up to 170% more accurate than other viral integration tools, and performs well on both simulated and real datasets. Isling is therefore an efficient and application-agnostic tool that will enable a broad range of investigations into viral and vector integration. These include comparisons between integrations of wild-type viruses and gene therapy vectors, as well as assessing the genotoxicity of vectors and understanding the role of viruses in cancer.

Publication

Studying the functional conservation of cis-regulatory modules and their transcriptional output

Publisher: Springer Science and Business Media LLC

Date: 29-04-2008

DOI: 10.1186/1471-2105-9-220

Publication

Unlocking HDR-mediated nucleotide editing by identifying high-efficiency target sites using machine learning

Publisher: Springer Science and Business Media LLC

Date: 26-02-2019

DOI: 10.1038/S41598-019-39142-0

Abstract: Editing individual nucleotides is a crucial component for validating genomic disease association. It is currently hampered by CRISPR-Cas-mediated “base editing” being limited to certain nucleotide changes, and only achievable within a small window around CRISPR-Cas target sites. The more versatile alternative, HDR (homology directed repair), has a 3-fold lower efficiency with known optimization factors being largely immutable in experiments. Here, we investigated the variable efficiency-governing factors on a novel mouse dataset using machine learning. We found the sequence composition of the single-stranded oligodeoxynucleotide (ssODN), i.e. the repair template, to be a governing factor. Furthermore, different regions of the ssODN have variable influence, which reflects the underlying mechanism of the repair process. Our model improves HDR efficiency by 83% compared to traditionally chosen targets. Using our findings, we developed CUNE (Computational Universal Nucleotide Editor), which enables users to identify and design the optimal targeting strategy using traditional base editing or – for-the-first-time – HDR-mediated nucleotide changes.

Publication

Ankyrin-1 gene exhibits allelic heterogeneity in conferring protection against malaria

Publisher: Cold Spring Harbor Laboratory

Date: 08-03-2017

DOI: 10.1101/114959

Abstract: Allelic heterogeneity is a common phenomenon where a gene exhibits different phenotypes depending on the nature of its genetic mutations. In the context of genes affecting malaria susceptibility, it allowed us to explore and understand the intricate host-parasite interactions during malaria infections. In this study, we described a gene encoding erythrocytic ankyrin-1 (Ank-1) which exhibits allelic-dependent heterogeneous phenotypes during malaria infections. We conducted an ENU mutagenesis screen on mice and identified two Ank-1 mutations, one resulting in an amino acid substitution (MRI95845), and the other in a truncated Ank-1 protein (MRI96570). Both mutations caused hereditary spherocytosis-like phenotypes and conferred differing protection against Plasmodium chabaudi infections. Upon further examination, the Ank-1 (MRI96570) mutation was found to inhibit intra-erythrocytic parasite maturation, whereas Ank-1 (MRI95845) caused increased bystander erythrocyte clearance during infection. This is the first description of allelic heterogeneity in ankyrin-1 from the direct comparison between two Ank-1 mutations. Despite the lack of direct evidence from population studies, these data further supported the protective roles of ankyrin-1 mutations in conferring malaria protection. This study also emphasized the importance of such phenomena to achieve a better understanding of host-parasite interactions, which could be the basis of future studies.

Publication

Genetic and immunopathological analysis of CHCHD10 in Australian amyotrophic lateral sclerosis and frontotemporal dementia and transgenic TDP-43 mice

Publisher: BMJ

Date: 05-11-2020

DOI: 10.1136/JNNP-2019-321790

Abstract: Since the first report of CHCHD10 gene mutations in amyotrophic lateral sclerosis (ALS)/frontotemporal dementia (FTD) patients, genetic variation in CHCHD10 has been inconsistently linked to disease. A pathological assessment of the CHCHD10 protein in patient neuronal tissue also remains to be reported. We sought to characterise the genetic and pathological contribution of CHCHD10 to ALS/FTD in Australia. Whole-exome and whole-genome sequencing data from 81 familial and 635 sporadic ALS, and 108 sporadic FTD cases, were assessed for genetic variation in CHCHD10. CHCHD10 protein expression was characterised by immunohistochemistry, immunofluorescence and western blotting in control, ALS and/or FTD postmortem tissues and further in a transgenic mouse model of TAR DNA-binding protein 43 (TDP-43) pathology. No causal, novel or disease-associated variants in CHCHD10 were identified in Australian ALS and/or FTD patients. In human brain and spinal cord tissues, CHCHD10 was specifically expressed in neurons. A significant decrease in CHCHD10 protein level was observed in ALS patient spinal cord and FTD patient frontal cortex. In a TDP-43 mouse model with a regulatable nuclear localisation signal (rNLS TDP-43 mouse), CHCHD10 protein levels were unaltered at disease onset and early in disease, but were significantly decreased in cortex in mid-stage disease. Genetic variation in CHCHD10 is not a common cause of ALS/FTD in Australia. However, we showed that in humans, CHCHD10 may play a neuron-specific role and a loss of CHCHD10 function may be linked to ALS and/or FTD. Our data from the rNLS TDP-43 transgenic mice suggest that a decrease in CHCHD10 levels is a late event in aberrant TDP-43-induced ALS/FTD pathogenesis.

Publication

Genetic analysis of GLT8D1 and ARPP21 in Australian familial and sporadic amyotrophic lateral sclerosis

Publisher: Elsevier BV

Date: 05-2021

DOI: 10.1016/J.NEUROBIOLAGING.2021.01.005

Publication

A Navigation System for Base Editing: Are We There Yet?

Publisher: Mary Ann Liebert Inc

Date: 08-2020

DOI: 10.1089/CRISPR.2020.29097.DCB

Publication

Data-driven platform for identifying variants of interest in COVID-19 virus

Publisher: Elsevier BV

Date: 2022

DOI: 10.1016/J.CSBJ.2022.06.005

Publication

Evidence for polygenic and oligogenic basis of Australian sporadic amyotrophic lateral sclerosis

Publisher: BMJ

Date: 14-05-2021

DOI: 10.1136/JMEDGENET-2020-106866

Abstract: Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease with phenotypic and genetic heterogeneity. Approximately 10% of cases are familial, while remaining cases are classified as sporadic. To date, genes and several hundred genetic variants have been implicated in ALS. Seven hundred and fifty-seven sporadic ALS cases were recruited from Australian neurology clinics. Detailed clinical data and whole genome sequencing (WGS) data were available from 567 and 616 cases, respectively, of which 426 cases had both datasets available. As part of a comprehensive genetic analysis, 853 genetic variants previously reported as ALS-linked mutations or disease-associated alleles were interrogated in sporadic ALS WGS data. Statistical analyses were performed to identify correlation between clinical variables, and between phenotype and the number of ALS-implicated variants carried by an individual. Relatedness between individuals carrying identical variants was assessed using identity-by-descent analysis. Forty-three ALS-implicated variants from 18 genes, including C9orf72, ATXN2, TARDBP, SOD1, SQSTM1 and SETX, were identified in Australian sporadic ALS cases. One-third of cases carried at least one variant and 6.82% carried two or more variants, implicating a potential oligogenic or polygenic basis of ALS. Relatedness was detected between two sporadic ALS cases carrying a SOD1 p.I114T mutation, and among three cases carrying a SQSTM1 p.K238E mutation. Oligogenic/polygenic sporadic ALS cases showed earlier age of onset than those with no reported variant. We confirm phenotypic associations among ALS cases, and highlight the contribution of genetic variation to all forms of ALS.

Publication

Monozygotic twins and triplets discordant for amyotrophic lateral sclerosis display differential methylation and gene expression

Publisher: Springer Science and Business Media LLC

Date: 04-06-2019

DOI: 10.1038/S41598-019-44765-4

Abstract: Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease characterised by the loss of upper and lower motor neurons. ALS exhibits high phenotypic variability including age and site of onset, and disease duration. To uncover epigenetic and transcriptomic factors that may modify an ALS phenotype, we used a cohort of Australian monozygotic twins (n = 3 pairs) and triplets (n = 1 set) that are discordant for ALS and represent sporadic ALS and the two most common types of familial ALS, linked to C9orf72 and SOD1. Illumina Infinium HumanMethylation450K BeadChip, EpiTYPER and RNA-Seq analyses in these ALS-discordant twins/triplets and control twins (n = 2 pairs) implicated genes with consistent longitudinal differential DNA methylation and/or gene expression. Two identified genes, RAD9B and C8orf46, showed significant differential methylation in an extended cohort of ALS cases and controls. Combined longitudinal methylation-transcription analysis within a single twin set implicated CCNF, DPP6, RAMP3, and CCS, which have been previously associated with ALS. Longitudinal transcriptome data showed an 8-fold enrichment of immune function genes and under-representation of transcription and protein modification genes in ALS. Examination of these changes in a large Australian sporadic ALS cohort suggests a broader role in ALS. Furthermore, we observe that increased methylation age is a signature of ALS in older patients.

Publication

Artificial Intelligence and Machine Learning in Bioinformatics

Publisher: Elsevier

Date: 2019

DOI: 10.1016/B978-0-12-809633-8.20325-7

Publication

NGSANE

Publisher: CSIRO

Date: 2014

DOI: 10.4225/08/5490FD3575E9D

Publication

Balancing the safeguarding of privacy and data sharing: perceptions of genomic professionals on patient genomic data ownership in Australia

Publisher: Springer Science and Business Media LLC

Date: 11-01-2023

DOI: 10.1038/S41431-022-01273-W

Abstract: There are inherent complexities and tensions in achieving a responsible balance between safeguarding patients’ privacy and sharing genomic data for advancing health and medical science. A growing body of literature suggests establishing patient genomic data ownership, enabled by blockchain technology, as one approach for managing these priorities. We conducted an online survey, applying a mixed methods approach to collect quantitative (using scale questions) and qualitative data (using open-ended questions). We explored the views of 117 genomic professionals (clinical geneticists, genetic counsellors, bioinformaticians, and researchers) towards patient data ownership in Australia. Data analysis revealed most professionals agreed that patients have rights to data ownership. However, there is a need for a clearer understanding of the nature and implications of data ownership in this context as genomic data often is subject to collective ownership (e.g., with family members and laboratories). This research finds that while the majority of genomic professionals acknowledge the desire for patient data ownership, bioinformaticians and researchers expressed more favourable views than clinical geneticists and genetic counsellors, suggesting that their views on this issue may be shaped by how closely they interact with patients as part of their professional duties. This research also confirms that stronger health system infrastructure is a prerequisite for enabling patient data ownership, which needs to be underpinned by appropriate digital infrastructure (e.g., central vs. decentralised data storage), patient identity ownership (e.g., limited vs. self-sovereign identity), and policy at both federal and state levels.

Publication

WHO O2CoV2: oxygen requirements and respiratory support in patients with COVID-19 in low-and-middle income countries—protocol for a multicountry, prospective, observational cohort study

Publisher: BMJ

Date: 08-2023

DOI: 10.1136/BMJOPEN-2022-071346

Abstract: SARS-CoV-2 has been identified as the cause of the disease officially named COVID-19, primarily a respiratory illness. COVID-19 was characterised as a pandemic on 11 March 2020. It has been estimated that approximately 20% of people with COVID-19 require oxygen therapy. Oxygen has been listed on the WHO Model List of Essential Medicines and the Essential Medicines List for Children for almost two decades. The COVID-19 pandemic has highlighted, more than ever, the acute need for scale-up of oxygen therapy. Detailed data on the use of oxygen therapy in low-and-middle income countries at the patient and facility level are needed to target interventions better globally. We aim to describe the requirements and use of oxygen at the facility and patient level of approximately 4500 patients with COVID-19 in 30 countries. Our objectives are specifically to characterise type and duration of different modalities of oxygen therapy delivered to patients; describe demographics and outcomes of hospitalised patients with COVID-19; and describe facility-level oxygen production and support. Primary analyses will be descriptive in nature. Respiratory support transitions will be described in Sankey plots, and Kaplan-Meier models will be used to estimate probability of each transition. A multistate model will be used to study the course of hospital stay of the study population, evaluating transitions of escalating respiratory support to the absorbing states. WHO Ad Hoc COVID-19 Research Ethics Review Committee (ERC) has approved this global protocol. When this protocol is adopted at specific country sites, national ERCs may make required adjustments in accordance with their respective national research ethics guidelines. Dissemination of this protocol and global findings will be open access through peer-reviewed scientific journals, study website, press and online media. NCT04918875.

Publication

Stress analysis of nano porous material using computed tomography images

Publisher: Wiley

Date: 03-2019

DOI: 10.1002/MAWE.201800206

Publication

Fast and accurate exhaustive higher-order epistasis search with BitEpi

Publisher: Springer Science and Business Media LLC

Date: 05-08-2021

DOI: 10.1038/S41598-021-94959-Y

Abstract: Complex genetic diseases may be modulated by a large number of epistatic interactions affecting a polygenic phenotype. Identifying these interactions is difficult due to computational complexity, especially in the case of higher-order interactions where more than two genomic variants are involved. In this paper, we present BitEpi, a fast and accurate method to test all possible combinations of up to four bi-allelic variants (i.e. Single Nucleotide Variant or SNV for short). BitEpi introduces a novel bitwise algorithm that is 1.7 and 56 times faster for 3-SNV and 4-SNV search than established software. The novel entropy statistic used in BitEpi is 44% more accurate to identify interactive SNVs, incorporating a p-value-based significance testing. We demonstrate BitEpi on real world data of 4900 samples and 87,000 SNPs. We also present EpiExplorer to visualize the potentially large number of individual and interacting SNVs in an interactive Cytoscape graph. EpiExplorer uses various visual elements to facilitate the discovery of true biological events in a complex polygenic environment.
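As a rough illustration of what an exhaustive epistasis search involves, the sketch below tests every SNV combination of a given order and ranks the combinations by an entropy-based information gain. It is a minimal sketch under stated assumptions, not BitEpi's published algorithm or statistic: the function names (`genotype_masks`, `epistasis_search`), the 0/1/2 genotype encoding, the use of Python integers as per-genotype sample bitmasks, and the toy data are all hypothetical choices made for this example.

```python
from itertools import combinations, product
from math import log2


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def genotype_masks(column, keep):
    """Build one sample bitmask per genotype (0, 1, 2) for the kept samples."""
    masks = [0, 0, 0]
    for i, genotype in enumerate(column):
        if keep[i]:
            masks[genotype] |= 1 << i
    return masks


def binary_entropy(cases, controls):
    """Entropy (in bits) of the case/control split inside one genotype cell."""
    if cases == 0 or controls == 0:
        return 0.0
    p = cases / (cases + controls)
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def epistasis_search(snps, labels, order=2):
    """Score every SNV combination of the given order by information gain.

    snps   : list of genotype columns, each a list of 0/1/2 per sample
    labels : list of 1 (case) or 0 (control) per sample
    Returns (gain, combination) tuples sorted from highest to lowest gain.
    """
    n = len(labels)
    is_case = [y == 1 for y in labels]
    is_ctrl = [y == 0 for y in labels]
    case_masks = [genotype_masks(col, is_case) for col in snps]
    ctrl_masks = [genotype_masks(col, is_ctrl) for col in snps]
    baseline = binary_entropy(sum(is_case), sum(is_ctrl))
    all_samples = (1 << n) - 1

    results = []
    for combo in combinations(range(len(snps)), order):
        conditional = 0.0
        # Visit each of the 3**order genotype cells of the contingency table.
        for cell in product(range(3), repeat=order):
            cases, controls = all_samples, all_samples
            for snp, genotype in zip(combo, cell):
                cases &= case_masks[snp][genotype]      # bitwise AND selects samples
                controls &= ctrl_masks[snp][genotype]
            n_cases, n_controls = popcount(cases), popcount(controls)
            weight = (n_cases + n_controls) / n
            conditional += weight * binary_entropy(n_cases, n_controls)
        results.append((baseline - conditional, combo))
    return sorted(results, reverse=True)


# Toy data (hypothetical): the phenotype is the XOR of SNV 0 and SNV 1,
# so neither SNV is informative alone but the pair is fully informative.
toy_snps = [
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
]
toy_labels = [0, 1, 1, 0, 0, 1, 1, 0]
ranking = epistasis_search(toy_snps, toy_labels, order=2)
```

The number of combinations grows combinatorially with the order, which is why the abstract stresses speed for 3-SNV and 4-SNV searches; this sketch makes no attempt at that level of optimisation.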

Publication

A novel ENU-induced ankyrin-1 mutation impairs parasite invasion and increases erythrocyte clearance during malaria infection in mice

Publisher: Cold Spring Harbor Laboratory

Date: 31-08-2016

DOI: 10.1101/072587

Abstract: Genetic defects in various red blood cell (RBC) cytoskeletal proteins have been long associated with changes in susceptibility towards malaria infection. In particular, while ankyrin (Ank-1) mutations account for approximately 50% of hereditary spherocytosis (HS) cases, an association with malaria is not well-established, and conflicting evidence has been reported. We describe a novel N-ethyl-N-nitrosourea (ENU)-induced ankyrin mutation MRI61689 that gives rise to two different ankyrin transcripts: one with an introduced splice acceptor site resulting in a frameshift, the other with a skipped exon. Ank-1 (MRI61689/+) mice exhibit an HS-like phenotype including reduction in mean corpuscular volume (MCV), increased osmotic fragility and reduced RBC deformability. They were also found to be resistant to rodent malaria Plasmodium chabaudi infection. Parasites in Ank-1 (MRI61689/+) erythrocytes grew normally, but red cells showed resistance to merozoite invasion. Uninfected Ank-1 (MRI61689/+) erythrocytes were also more likely to be cleared from circulation during infection (the “bystander effect”). This increased clearance is a novel resistance mechanism which was not observed in previous ankyrin mouse models. We propose that this bystander effect is due to reduced deformability of Ank-1 (MRI61689/+) erythrocytes. This paper highlights the complex roles ankyrin plays in mediating malaria resistance.

Publication

NGSANE: A lightweight production informatics framework for high-throughput data analysis

Publisher: Oxford University Press (OUP)

Date: 26-01-2014

DOI: 10.1093/BIOINFORMATICS/BTU036

Abstract: Summary: The initial steps in the analysis of next-generation sequencing data can be automated by way of software ‘pipelines’. However, individual components depreciate rapidly because of the evolving technology and analysis methods, often rendering entire versions of production informatics pipelines obsolete. Constructing pipelines from Linux bash commands enables the use of hot swappable modular components as opposed to the more rigid program call wrapping by higher level languages, as implemented in comparable published pipelining systems. Here we present Next Generation Sequencing ANalysis for Enterprises (NGSANE), a Linux-based, high-performance-computing-enabled framework that minimizes overhead for set up and processing of new projects, yet maintains full flexibility of custom scripting when processing raw sequence data. Availability and implementation: Ngsane is implemented in bash and publicly available under BSD (3-Clause) licence via GitHub at github.com/BauerLab/ngsane. Contact: Denis.Bauer@csiro.au Supplementary information: Supplementary data are available at Bioinformatics online.

Publication

VariantSpark

Publisher: CSIRO

Date: 2020

DOI: 10.25919/0HQ7-6G52

Publication

Host Porphobilinogen Deaminase Deficiency Confers Malaria Resistance in Plasmodium chabaudi but Not in Plasmodium berghei or Plasmodium falciparum During Intraerythrocytic Growth

Publisher: Frontiers Media SA

Date: 03-09-2020

DOI: 10.3389/FCIMB.2020.00464

Publication

Lessening Organ Dysfunction With Vitamin C (LOVIT) Trial: Statistical Analysis Plan

Publisher: JMIR Publications Inc.

Date: 20-05-2022

DOI: 10.2196/36261

Abstract: The LOVIT (Lessening Organ Dysfunction with Vitamin C) trial is a blinded multicenter randomized clinical trial comparing high-dose intravenous vitamin C to placebo in patients admitted to the intensive care unit with proven or suspected infection as the main diagnosis and receiving a vasopressor. We aim to describe a prespecified statistical analysis plan (SAP) for the LOVIT trial prior to unblinding and locking of the trial database. The SAP was designed by the LOVIT principal investigators and statisticians, and approved by the steering committee and coinvestigators. The SAP defines the primary and secondary outcomes, and describes the planned primary, secondary, and subgroup analyses. The SAP includes a draft participant flow diagram, tables, and planned figures. The primary outcome is a composite of mortality and persistent organ dysfunction (receipt of mechanical ventilation, vasopressors, or new renal replacement therapy) at 28 days, where day 1 is the day of randomization. All analyses will use a frequentist statistical framework. The analysis of the primary outcome will estimate the risk ratio and 95% CI in a generalized linear mixed model with binomial distribution and log link, with site as a random effect. We will perform a secondary analysis adjusting for prespecified baseline clinical variables. Subgroup analyses will include age, sex, frailty, severity of illness, Sepsis-3 definition of septic shock, baseline ascorbic acid level, and COVID-19 status. We have developed an SAP for the LOVIT trial and will adhere to it in the analysis phase. DERR1-10.2196/36261
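The primary-outcome estimate described above is a risk ratio with a 95% CI. As a simplified, unadjusted illustration, the sketch below computes a crude risk ratio from a 2×2 table with a Wald interval on the log scale. This is an assumption-laden stand-in, not the SAP's analysis: it ignores the site random effect of the generalized linear mixed model, and the function name `risk_ratio` and the counts in the usage line are hypothetical.

```python
from math import exp, log, sqrt


def risk_ratio(events_treat, n_treat, events_ctrl, n_ctrl, z=1.959964):
    """Crude risk ratio with a Wald confidence interval on the log scale.

    z = 1.959964 is the standard normal quantile for a two-sided 95% interval.
    Returns (risk_ratio, lower_bound, upper_bound).
    """
    risk_treat = events_treat / n_treat
    risk_ctrl = events_ctrl / n_ctrl
    rr = risk_treat / risk_ctrl
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = sqrt(
        1 / events_treat - 1 / n_treat + 1 / events_ctrl - 1 / n_ctrl
    )
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, for illustration only (not trial data).
rr, lower, upper = risk_ratio(50, 100, 40, 100)
```

A mixed model with site as a random effect would generally give a different interval, because it accounts for clustering of outcomes within sites.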

Publication

Domain-specific introduction to machine learning terminology, pitfalls and opportunities in CRISPR-based gene editing

Publisher: Oxford University Press (OUP)

Date: 02-02-2020

DOI: 10.1093/BIB/BBZ145

Abstract: The use of machine learning (ML) has become prevalent in the genome engineering space, with applications ranging from predicting target site efficiency to forecasting the outcome of repair events. However, jargon and ML-specific accuracy measures have made it hard to assess the validity of individual approaches, potentially leading to misinterpretation of ML results. This review aims to close the gap by discussing ML approaches and pitfalls in the context of CRISPR gene-editing applications. Specifically, we address common considerations, such as algorithm choice, as well as problems, such as overestimating accuracy and data interoperability, by providing tangible examples from the genome-engineering domain. Equipping researchers with the knowledge to effectively use ML to better design gene-editing experiments and predict experimental outcomes will help advance the field more rapidly.

Publication

Variantspark: Cloud-based machine learning for association study of complex phenotype and large-scale genomic data

Publisher: Oxford University Press (OUP)

Date: 08-2020

DOI: 10.1093/GIGASCIENCE/GIAA077

Abstract: Many traits and diseases are thought to be driven by more than one gene (polygenic). Polygenic risk scores (PRS) hence expand on genome-wide association studies by taking multiple genes into account when risk models are built. However, PRS only considers the additive effect of individual genes but not epistatic interactions or the combination of individual and interacting drivers. While evidence of epistatic interactions is found in small datasets, large datasets have not been processed yet owing to the high computational complexity of the search for epistatic interactions. We have developed VariantSpark, a distributed machine learning framework able to perform association analysis for complex phenotypes that are polygenic and potentially involve a large number of epistatic interactions. Efficient multi-layer parallelization allows VariantSpark to scale to the whole genome of population-scale datasets with 100,000,000 genomic variants and 100,000 samples. Compared with traditional monogenic genome-wide association studies, VariantSpark better identifies genomic variants associated with complex phenotypes. VariantSpark is 3.6 times faster than ReForeSt and the only method able to scale to ultra-high-dimensional genomic data in a manageable time.

Publication

Scalable genomic data exchange and analytics with sBeacon

Publisher: Springer Science and Business Media LLC

Date: 14-09-2023

DOI: 10.1038/S41587-023-01972-9

Publication

Artificial Intelligence in Medicine: Applications, Limitations and Future Directions

Publisher: Springer Nature Singapore

Date: 2022

DOI: 10.1007/978-981-19-1223-8_5

Publication

Adenosine monophosphate deaminase 3 activation shortens erythrocyte half-life and provides malaria resistance in mice

Publisher: American Society of Hematology

Date: 09-2016

DOI: 10.1182/BLOOD-2015-09-666834

Abstract: AMPD3 activation reduces red blood cell half-life, which is associated with increased oxidative stress and phosphatidylserine exposure. AMPD3 activation causes malaria resistance through increased RBC turnover and increased RBC production.

Publication

Early life events influence whole-of-life metabolic health via gut microflora and gut permeability

Publisher: Informa UK Limited

Date: 19-03-2015

DOI: 10.3109/1040841X.2013.837863

Publication

INSIDER

Publisher: CSIRO

Date: 2021

DOI: 10.25919/G1EA-7G47

Publication

The inequity of targeted cystic fibrosis reproductive carrier screening tests in a multiethnic Australian population

Publisher: Wiley

Date: 15-12-2022

DOI: 10.1002/PD.6285

Abstract: European and Australian guidelines for cystic fibrosis (CF) reproductive carrier screening recommend testing a small number of high frequency CF causing variants, rather than comprehensive CFTR sequencing. The study objective was to determine variant detection rates of commercially available targeted reproductive carrier screening tests in Australia. Next‐generation DNA sequencing of the CFTR gene was performed on 2552 individuals from a whole population sample to identify CF causing variants. The variant detection rates of two commercially available Australian reproductive carrier screening tests, which target 50 or 175 CF causing variants, in this population were calculated. The ethnicity of individuals was determined using principal component analysis. Variant detection rates of the tests for 50 and 175 CF causing variants were 88.2% and 90.8%, respectively. No CF causing variants in individuals of East Asian ethnicity (n = 3) were detected by either test, while .6% (n = 69) of CF causing variants in Europeans would be identified by either test. Reproductive carrier screening tests for a targeted set of high frequency CF variants are unable to detect approximately 10% of CF variants in a multiethnic Australian population, and individuals of East Asian ethnicity are disproportionally affected by this test limitation.

Publication

VariantSpark: Population scale clustering of genotype information

Publisher: Springer Science and Business Media LLC

Date: 12-2015

DOI: 10.1186/S12864-015-2269-7

Publication

Supporting pandemic response using genomics and bioinformatics: A case study on the emergent SARS-CoV-2 outbreak

Publisher: Hindawi Limited

Date: 25-05-2020

DOI: 10.1111/TBED.13588

Publication

Genetic correlation between amyotrophic lateral sclerosis and schizophrenia

Publisher: Springer Science and Business Media LLC

Date: 21-03-2017

DOI: 10.1038/NCOMMS14774

Abstract: We have previously shown higher-than-expected rates of schizophrenia in relatives of patients with amyotrophic lateral sclerosis (ALS), suggesting an aetiological relationship between the diseases. Here, we investigate the genetic relationship between ALS and schizophrenia using genome-wide association study data from over 100,000 unique individuals. Using linkage disequilibrium score regression, we estimate the genetic correlation between ALS and schizophrenia to be 14.3% (7.05–21.6; P = 1 × 10⁻⁴) with schizophrenia polygenic risk scores explaining up to 0.12% of the variance in ALS (P = 8.4 × 10⁻⁷). A modest increase in comorbidity of ALS and schizophrenia is expected given these findings (odds ratio 1.08–1.26) but this would require very large studies to observe epidemiologically. We identify five potential novel ALS-associated loci using conditional false discovery rate analysis. It is likely that shared neurobiological mechanisms between these two disorders will engender novel hypotheses in future preclinical and clinical studies.

Publication

Methylome and transcriptome maps of human visceral and subcutaneous adipocytes reveal key epigenetic differences at developmental genes

Publisher: Springer Science and Business Media LLC

Date: 02-07-2019

DOI: 10.1038/S41598-019-45777-W

Abstract: Adipocytes support key metabolic and endocrine functions of adipose tissue. Lipid is stored in two major classes of depots, namely visceral adipose (VA) and subcutaneous adipose (SA) depots. Increased visceral adiposity is associated with adverse health outcomes, whereas the impact of SA tissue is relatively metabolically benign. The precise molecular features associated with the functional differences between the adipose depots are still not well understood. Here, we characterised transcriptomes and methylomes of isolated adipocytes from matched SA and VA tissues of individuals with normal BMI to identify epigenetic differences and their contribution to cell type and depot-specific function. We found that DNA methylomes were notably distinct between different adipocyte depots and were associated with differential gene expression within pathways fundamental to adipocyte function. Most striking differential methylation was found at transcription factor and developmental genes. Our findings highlight the importance of developmental origins in the function of different fat depots.

Publication

TRIBES: A user-friendly pipeline for relatedness detection and disease gene discovery

Publisher: Cold Spring Harbor Laboratory

Date: 02-07-2019

DOI: 10.1101/686253

Abstract: TRIBES is a user-friendly pipeline for relatedness detection in genomic data. TRIBES is the first tool which is both accurate up to 7th-degree relatives (e.g. third cousins) and combines essential data processing steps into a single user-friendly pipeline. Furthermore, using a proof-of-principle cohort comprising amyotrophic lateral sclerosis cases with known relationship structures and a known causal mutation in SOD1, we demonstrated that TRIBES can successfully uncover disease susceptibility loci. TRIBES has multiple applications in addition to disease gene mapping, including sample quality control in genome wide association studies and avoiding consanguineous unions in family planning. TRIBES is freely available on GitHub: ehrc/TRIBES/. Contact: natalie.twine@csiro.au

Publication

Dual-functioning transcription factors in the developmental gene network of Drosophila melanogaster

Publisher: Springer Science and Business Media LLC

Date: 02-07-2010

DOI: 10.1186/1471-2105-11-366

Publication

VariantSpark, a cloud-based random forest GWAS platform, identifies novel loci and epistasis in Alzheimer's disease

Publisher: Springer Science and Business Media LLC

Date: 17-10-2023

DOI: 10.1038/S41598-023-44378-Y

Publication

Predicting CRISPR-Cas12a guide efficiency for targeting using Machine Learning

Publisher: Cold Spring Harbor Laboratory

Date: 03-2023

DOI: 10.1101/2023.02.28.530512

Abstract: Genome editing through the development of CRISPR (Clustered Regularly Interspaced Short Palindromic Repeat)–Cas technology has revolutionized many fields in biology. Beyond Cas9 nucleases, Cas12a (formerly Cpf1) has emerged as a promising alternative to Cas9 for editing AT-rich genomes. Despite this promise, guide RNA efficiency prediction by computational tools still lacks accuracy. Through a computational meta-analysis, we report here that Cas12a target and off-target cleavage behaviour depends on nucleotide bias combined with nucleotide mismatches relative to the protospacer adjacent motif (PAM) site. These features were used to train a random forest machine learning algorithm that improves accuracy by at least 15% over existing algorithms for predicting guide RNA efficiency for the Cas12a enzyme. Despite this progress, our report underscores the need for more representative datasets and further benchmarking to reliably and accurately predict guide RNA efficiency and off-target effects for Cas12a enzymes.

Publication

Novel Alzheimer’s disease genes and epistasis identified using machine learning GWAS platform

Publisher: Cold Spring Harbor Laboratory

Date: 05-10-2023

DOI: 10.1101/2023.10.04.23296569

Publication

Optimized nickase- and nuclease-based prime editing in human and mouse cells

Publisher: Cold Spring Harbor Laboratory

Date: 02-07-2021

DOI: 10.1101/2021.07.01.450810

Abstract: Precise genomic modification using prime editing (PE) holds enormous potential for research and clinical applications. Currently, the delivery of PE components to mammalian cell lines requires multiple plasmid vectors. To overcome this limitation, we generated all-in-one prime editing (PEA1) constructs that carry all the components required for PE, along with a selection marker. We tested these constructs (with selection) in HEK293T, K562, HeLa and mouse embryonic stem (ES) cells. We discovered that PE efficiency in HEK293T cells was much higher than previously observed, reaching up to 95% (mean 67%). The efficiency in K562 and HeLa cells, however, remained low. To improve PE efficiency in K562 and HeLa, we generated a nuclease prime editor and tested this system in these cell lines as well as mouse ES cells. PE-nuclease greatly increased prime editing initiation; however, installation of the intended edits was often accompanied by extra insertions derived from the repair template. Finally, we show that zygotic injection of the nuclease prime editor can generate correct modifications in mouse fetuses with up to 100% efficiency. In summary, PE-nuclease and the PEA1 plasmids provide new tools to generate intended edits with high efficiency.

Publication

VARSCOT: variant-aware detection and scoring enables sensitive and personalized off-target detection for CRISPR-Cas9

Publisher: Springer Science and Business Media LLC

Date: 27-06-2019

DOI: 10.1186/S12896-019-0535-5

Publication

Host porphobilinogen deaminase deficiency confers malaria resistance in Plasmodium chabaudi but not in Plasmodium berghei or Plasmodium falciparum during intraerythrocytic growth

Publisher: Cold Spring Harbor Laboratory

Date: 27-03-2019

DOI: 10.1101/589242

Abstract: Inherited mutations that give rise to abnormalities and deficiencies in erythrocyte proteins and enzymes are an important component of host resistance to malaria infection. Understanding how such mutations confer protection against the disease may be useful for developing new treatment strategies. A mouse ENU-induced mutagenesis screen for novel malaria resistance-conferring mutations identified a novel nonsense mutation in the gene encoding porphobilinogen deaminase (PBGD) in mice, denoted here as Pbgd MRI58155. Heterozygote Pbgd MRI58155 mice exhibited approximately 50% reduction in cellular PBGD activity in both mature erythrocytes and reticulocytes, although enzyme activity was approximately 10 times higher in reticulocytes than erythrocytes. When challenged with blood-stage P. chabaudi, which preferentially infects erythrocytes, heterozygote mice showed a modest but significant resistance to infection, including reduced parasite growth. A series of assays conducted to investigate the mechanism of resistance indicated that mutant erythrocyte invasion by P. chabaudi was normal, but that following intraerythrocytic establishment a significantly greater proportion of parasites died, which reduced their ability to propagate. The Plasmodium resistance phenotype was not recapitulated in Pbgd-deficient mice infected with P. berghei, which prefers reticulocytes, or when P. falciparum was cultured in erythrocytes from patients with acute intermittent porphyria (AIP), which had modestly (20-50%) reduced levels of PBGD. Furthermore, the growth of Pbgd-null P. falciparum and Pbgd-null P. berghei parasites, which grew at the same rate as their wild-type counterparts in normal cells, was not affected by the PBGD-deficient background of the AIP erythrocytes or Pbgd-deficient mice. Our results confirm the dispensability of parasite PBGD for P. berghei infection and intraerythrocytic growth of P. falciparum, but for the first time identify a requirement for host erythrocyte PBGD by P. chabaudi during in vivo blood stage infection. The causative agent of malaria, Plasmodium, adopts a parasitic lifestyle during erythrocyte infection, and as such relies on host cell factors for its survival and growth. Host-encoded mutations that alter the availability of these factors confer disease resistance, including several well-known genetic erythrocyte abnormalities that have arisen due to the historical evolutionary pressure of malaria. This study identified in mice a novel malaria resistance-conferring host mutation in the heme biosynthesis enzyme, porphobilinogen deaminase (PBGD), and compared the relative requirements by Plasmodium for the host versus parasite-encoded forms of PBGD in both in vivo and in vitro settings. The findings demonstrated that parasite PBGD was dispensable, but that the host enzyme was important specifically during in vivo infection by P. chabaudi, and collectively suggest that Plasmodium requires a certain threshold of the enzyme to sustain its intraerythrocytic growth. Plasmodium may therefore be vulnerable to other interventions that limit host PBGD activity.

Publication

High Activity Target-Site Identification Using Phenotypic Independent CRISPR-Cas9 Core Functionality

Publisher: Mary Ann Liebert Inc

Date: 04-2018

DOI: 10.1089/CRISPR.2017.0021

Abstract: The activity of CRISPR-Cas9 target sites can be measured experimentally through phenotypic assays or mutation rate and used to build computational models to predict activity of novel target sites. However, currently published models have been reported to perform poorly in situations other than their training conditions. In this study, we therefore investigate how different sources of data influence predictive power and identify the best data set for the most robust predictive model. We use the activity of 28,606 target sites and a machine learning approach to train a predictive model of CRISPR-Cas9 activity, outperforming other published methods by an average increase in accuracy of 80% for prediction of the degree of activity and 13% for classification into active and inactive categories. We find that using data sets that measure CRISPR-Cas9 activity through sequencing provides more accurate predictions of activity. Our model, dubbed TUSCAN, is highly scalable, predicting the activity of 5000 target sites in under 7 s, making it suitable for genome-wide screens. We conclude that sophisticated machine learning methods can classify binary CRISPR-Cas9 activity; however, predicting fine-scale activity scores will require larger data sets directly measuring indel insertion rate.

Publication

Effects of low-dose hydrocortisone and hydrocortisone plus fludrocortisone in adults with septic shock: a protocol for a systematic review and meta-analysis of individual participant data

Publisher: BMJ

Date: 12-2020

DOI: 10.1136/BMJOPEN-2020-040931

Abstract: The benefits and risks of low-dose hydrocortisone in patients with septic shock have been investigated in numerous randomised controlled trials and trial-level meta-analyses. Yet, the routine use of this treatment remains controversial. To overcome the limitations of previous meta-analyses inherent to the use of aggregate data, we will perform an individual patient data meta-analysis (IPDMA) on the effect of hydrocortisone with or without fludrocortisone compared with placebo or usual care on 90-day mortality and other outcomes in patients with septic shock. To assess the benefits and risks of hydrocortisone, with or without fludrocortisone, for adults with septic shock, we will search major electronic databases from inception to September 2020 (Cochrane Central Register of Controlled Trials, MEDLINE, EMBASE and Latin American Caribbean Health Sciences Literature), complemented by a search for unpublished trials. The primary analysis will compare hydrocortisone with or without fludrocortisone to placebo or no treatment in adult patients with septic shock. Secondary analyses will compare hydrocortisone to placebo (or usual care), hydrocortisone plus fludrocortisone to placebo (or usual care), and hydrocortisone versus hydrocortisone plus fludrocortisone. The primary outcome will be all-cause mortality at 90 days. We will conduct both one-stage IPDMA using mixed-effect models and machine learning with targeted maximum likelihood analyses. We will assess the risk of bias related to unshared data and related to the quality of individual trials. This IPDMA will use existing data from completed randomised clinical trials and will comply with the ethical and regulatory requirements regarding data sharing for each of the component trials. The findings of this study will be submitted for publication in a peer-reviewed journal with a straightforward policy for open access. PROSPERO registration: CRD42017062198.

Publication

Unlocking HDR-mediated Nucleotide Editing by identifying high-efficiency target sites using machine learning

Publisher: Cold Spring Harbor Laboratory

Date: 07-11-2018

DOI: 10.1101/464610

Abstract: Editing individual nucleotides is a crucial component for validating genomic disease association. It currently is hampered by CRISPR-Cas-mediated “base editing” being limited to certain nucleotide changes, and only achievable within a small window around CRISPR-Cas target sites. The more versatile alternative, HDR (homology directed repair), has a 4-fold lower efficiency with known optimization factors being largely immutable in experiments. Here, we investigated the variable efficiency-governing factors on a novel mouse dataset using machine learning. We found the sequence composition of the repair template (ssODN) to be a governing factor, where different regions of the ssODN have variable influence, which reflects the underlying biophysical mechanism. Our model improves HDR efficiency by 83% compared to traditionally chosen targets. Using our findings, we develop CUNE (Computational Universal Nucleotide Editor), which enables users to identify and design the optimal targeting strategy using traditional base editing or, for the first time, HDR-mediated nucleotide changes. CUNE can be run via the web at: une

Publication

IBD analysis of Australian amyotrophic lateral sclerosis SOD1-mutation carriers identifies five founder events and links sporadic cases to existing ALS families

Publisher: Cold Spring Harbor Laboratory

Date: 28-06-2019

DOI: 10.1101/685925

Abstract: Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disorder characterised by the loss of upper and lower motor neurons resulting in paralysis and eventual death. Approximately 10% of ALS cases have a family history of disease, while the remaining cases present as apparently sporadic. Heritability studies suggest a significant genetic component to sporadic ALS, and although most sporadic cases have an unknown genetic etiology, some familial ALS mutations have also been found in sporadic cases. This suggests that some sporadic cases may be unrecognised familial cases with reduced disease penetrance. Identifying a familial basis of disease in apparently sporadic ALS cases has significant genetic counselling implications for immediate relatives. A powerful strategy to uncover a familial link is identity-by-descent (IBD) analysis which detects genomic regions that have been inherited from a common ancestor. We performed IBD analysis on 90 Australian familial ALS cases from 25 families and three sporadic ALS cases, each of whom carried one of three SOD1 mutations (p.I114T, p.V149G and p.E101G). We identified five unique haplotypes that carry these mutations in our cohort, indicative of five founder events. This included two different haplotypes that carry SOD1 p.I114T, where one haplotype was present in one sporadic case and 20 families, while the second haplotype was found in the remaining two sporadic cases and one family, thus linking these familial and sporadic cases. Furthermore, we linked two families that carry SOD1 p.V149G and found that SOD1 p.E101G arose independently in each family that carries this mutation.

Publication

A novel ENU-induced ankyrin-1 mutation impairs parasite invasion and increases erythrocyte clearance during malaria infection in mice

Publisher: Springer Science and Business Media LLC

Date: 16-11-2016

DOI: 10.1038/SREP37197

Abstract: Genetic defects in various red blood cell (RBC) cytoskeletal proteins have long been associated with changes in susceptibility towards malaria infection. In particular, while ankyrin (Ank-1) mutations account for approximately 50% of hereditary spherocytosis (HS) cases, an association with malaria is not well-established, and conflicting evidence has been reported. We describe a novel N-ethyl-N-nitrosourea (ENU)-induced ankyrin mutation MRI61689 that gives rise to two different ankyrin transcripts: one with an introduced splice acceptor site resulting in a frameshift, the other with a skipped exon. Ank-1(MRI61689/+) mice exhibit an HS-like phenotype including reduction in mean corpuscular volume (MCV), increased osmotic fragility and reduced RBC deformability. They were also found to be resistant to rodent malaria Plasmodium chabaudi infection. Parasites in Ank-1(MRI61689/+) erythrocytes grew normally, but red cells showed resistance to merozoite invasion. Uninfected Ank-1(MRI61689/+) erythrocytes were also more likely to be cleared from circulation during infection (the “bystander effect”). This increased clearance is a novel resistance mechanism which was not observed in previous ankyrin mouse models. We propose that this bystander effect is due to reduced deformability of Ank-1(MRI61689/+) erythrocytes. This paper highlights the complex roles ankyrin plays in mediating malaria resistance.

Publication

Predicting SUMOylation sites

Publisher: Springer Berlin Heidelberg

Date: 2008

DOI: 10.1007/978-3-540-88436-1_3

Publication

Mutation analysis of MATR3 in Australian familial amyotrophic lateral sclerosis

Publisher: Elsevier BV

Date: 03-2015

DOI: 10.1016/J.NEUROBIOLAGING.2014.11.010

Abstract: Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease that arises from the progressive degeneration of the motor neurons. Recently, mutations in the matrin 3 (MATR3) gene were described in both ALS and autosomal dominant distal myopathy with vocal cord and pharyngeal weakness. We sought to determine the prevalence of MATR3 mutations in Australian familial ALS (n = 106) using whole exome sequencing. No mutations were identified, indicating that MATR3 mutations are not a common cause of ALS in Australian familial cases with predominately European ancestry.

Publication

A bioinformatic pipeline for simulating viral integration data

Publisher: Elsevier BV

Date: 06-2022

DOI: 10.1016/J.DIB.2022.108161

Publication

Predicting structural disruption of proteins caused by crossover

Publisher: IEEE

Date: 2005

DOI: 10.1109/CIBCB.2005.1594962

Publication

GOANA: A Universal High-Throughput Web Service for Assessing and Comparing the Outcome and Efficiency of Genome Editing Experiments

Publisher: Mary Ann Liebert Inc

Date: 04-2021

DOI: 10.1089/CRISPR.2020.0068

Publication

Thresholding Gini Variable Importance with a Single-Trained Random Forest: An Empirical Bayes Approach

Publisher: Elsevier BV

Date: 2023

DOI: 10.1016/J.CSBJ.2023.08.033

Publication

The Current State and Future of CRISPR-Cas9 gRNA Design Tools

Publisher: Frontiers Media SA

Date: 12-07-2018

DOI: 10.3389/FPHAR.2018.00749

Publication

STAR: predicting recombination sites from amino acid sequence

Publisher: Springer Science and Business Media LLC

Date: 08-10-2006

DOI: 10.1186/1471-2105-7-437

Publication

Interoperable medical data: The missing link for understanding COVID‐19

Publisher: Hindawi Limited

Date: 29-01-2021

DOI: 10.1111/TBED.13892

Publication

VariantSpark, A Random Forest Machine Learning Implementation for Ultra High Dimensional Data

Publisher: Cold Spring Harbor Laboratory

Date: 15-07-2019

DOI: 10.1101/702902

Abstract: The demands on machine learning methods to cater for ultra high dimensional datasets (datasets with millions of features) have been increasing in domains like life sciences and the Internet of Things (IoT). While Random Forests are suitable for “wide” datasets, current implementations such as Google’s PLANET lack the ability to scale to such dimensions. Recent improvements by Yggdrasil begin to address these limitations but do not extend to Random Forest. This paper introduces CursedForest, a novel Random Forest implementation on top of Apache Spark and part of the VariantSpark platform, which parallelises processing of all nodes over the entire forest. CursedForest is 9 and up to 89 times faster than Google’s PLANET and Yggdrasil, respectively, and is the first method capable of scaling to millions of features.

Publication

Human and microbial transcriptomics from lean and obese individuals with colorectal cancer: A comparison of Total and Poly A RNA sequencing from clinical samples.

Publisher: American Association for Cancer Research (AACR)

Date: 04-2013

DOI: 10.1158/1538-7445.AM2013-LB-237

Abstract: Australia and New Zealand have the highest worldwide CRC incidence, with CRC being the 2nd most commonly diagnosed cancer and the 3rd most common cause of cancer death among both men and women. CRC is a complex disease arising from the impact of environmental factors, including diet and lifestyle choices, on different genetic backgrounds. Obesity and type 2 diabetes are significant risk factors for CRC, the levels of which are increasing in Australia. As a consequence, CRC is projected to increase, becoming the most common Australian cancer by 2025. While the association of obesity with CRC has been reported by a number of studies, the mechanism(s) that contribute to CRC development in the context of obesity are unknown. Understanding these mechanisms and potential genetic susceptibility is crucial for developing effective intervention and screening strategies. Recent evidence suggests that gut microbial communities are altered in obese individuals, resulting in a change in the predominant species and overall loss of community diversity, but it is unknown if these changes impact CRC development or progression. We have established a comprehensive tissue, blood and microbial collection from 150 lean and obese colorectal cancer patients in the Hunter Region, Australia and commenced an integrative pilot next generation sequencing project to begin to unravel the link between obesity and CRC. Twelve individuals (6 lean and 6 obese) were selected for complementary next-generation sequencing generating matched sequencing data sets (Total RNA-seq, Poly A RNA-seq, microbial RNA-seq and exome-seq) from normal colon, colon tumor, adipose and digesta samples. For this purpose, we have developed a method for simultaneously isolating human and microbial RNA from gut lumen of sufficient quality and quantity from limited material.
Results presented will evaluate the differences in host and microbial transcriptomics in normal colon and tumor tissue of lean and obese individuals with CRC and methodological comparisons between Total RNA-seq and Poly A RNA-seq will be reported. Citation Format: Desma M. Grice, Denis C. Bauer, Konsta Duesing, Dongmei Li, Paul Greenfield, Sarah Nielsen, Brian Draganic, Steve Smith, Peter Pockney, Rodney Scott, Garry N. Hannan. Human and microbial transcriptomics from lean and obese individuals with colorectal cancer: A comparison of Total and Poly A RNA sequencing from clinical samples. [abstract]. In: Proceedings of the 104th Annual Meeting of the American Association for Cancer Research 2013 Apr 6-10 Washington, DC. Philadelphia (PA): AACR Cancer Res 2013 (8 Suppl):Abstract nr LB-237. doi:10.1158/1538-7445.AM2013-LB-237

Publication

Ankyrin-1 gene exhibits allelic heterogeneity in conferring protection against malaria

Publisher: Oxford University Press (OUP)

Date: 09-2017

DOI: 10.1534/G3.117.300079

Abstract: Allelic heterogeneity is a common phenomenon where a gene exhibits a different phenotype depending on the nature of its genetic mutations. In the context of genes affecting malaria susceptibility, it allows us to explore and understand the intricate host–parasite interactions during malaria infections. In this study, we describe a gene encoding erythrocytic ankyrin-1 (Ank-1) which exhibits allele-dependent heterogeneous phenotypes during malaria infections. We conducted an ENU mutagenesis screen on mice and identified two Ank-1 mutations, one resulting in an amino acid substitution (MRI95845), and the other in a truncated Ank-1 protein (MRI96570). Both mutations caused hereditary spherocytosis-like phenotypes and conferred differing protection against Plasmodium chabaudi infections. Upon further examination, the Ank-1(MRI96570) mutation was found to inhibit intraerythrocytic parasite maturation, whereas Ank-1(MRI95845) caused increased bystander erythrocyte clearance during infection. This is the first description of allelic heterogeneity in ankyrin-1 from the direct comparison between two Ank-1 mutations. Despite the lack of direct evidence from population studies, these data further support the protective role of ankyrin-1 mutations against malaria. This study also emphasizes the importance of such phenomena in achieving a better understanding of host–parasite interactions, which could be the basis of future studies.

Publication

Optimizing static thermodynamic models of transcriptional regulation

Publisher: Oxford University Press (OUP)

Date: 27-04-2009

DOI: 10.1093/BIOINFORMATICS/BTP283

Abstract: Motivation: Modeling transcriptional regulation using thermodynamic modeling approaches has become increasingly relevant as a way to gain a detailed understanding of transcriptional regulation. Thermodynamic models are able to model the interactions between transcription factors (TFs) and DNA that lead to a specific transcriptional output of the target gene. Such models can be ‘trained’ by fitting their free parameters to data on the transcription rate of a gene and the concentrations of its regulating factors. However, the parameter fitting process is computationally very expensive and this limits the number of alternative types of model that can be explored. Results: In this study, we evaluate the ‘optimization landscape’ of a class of static, quantitative models of regulation and explore the efficiency of a range of optimization methods. We evaluate eight optimization methods: two variants of simulated annealing (SA), four variants of gradient descent (GD), a hybrid SA/GD algorithm and a genetic algorithm. We show that the optimization landscape has numerous local optima, resulting in poor performance for the GD methods. SA with a simple geometric cooling schedule performs best among all tested methods. In particular, we see no advantage to using the more sophisticated ‘LAM’ cooling schedule. Overall, a good approximate solution is achievable in minutes using SA with a simple cooling schedule. Contact: d.bauer@uq.edu.au t.bailey@imb.uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.
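
Illustrative note: the abstract above reports that simulated annealing with a simple geometric cooling schedule coped best with a landscape full of local optima. The following is a minimal sketch of that general technique only, not the authors' implementation. The one-dimensional cost function `rugged`, the function name `simulated_annealing` and all parameter values (`t_start`, `alpha`, `step`, and so on) are assumptions chosen for illustration; the thermodynamic model being fitted in the paper is not reproduced here.

```python
import math
import random


def simulated_annealing(cost, x0, step=0.5, t_start=10.0, alpha=0.95,
                        steps_per_temp=50, t_min=1e-3, seed=0):
    """Minimise a 1-D cost function with geometric cooling (T <- alpha * T).

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)          # local RNG so runs are reproducible
    x, fx = x0, cost(x0)
    best_x, best_fx = x, fx
    t = t_start
    while t > t_min:
        for _ in range(steps_per_temp):
            cand = x + rng.uniform(-step, step)   # random local move
            fc = cost(cand)
            # Metropolis criterion: always accept improvements, accept
            # worse moves with probability exp(-(fc - fx) / t).
            if fc <= fx or rng.random() < math.exp((fx - fc) / t):
                x, fx = cand, fc
                if fx < best_fx:
                    best_x, best_fx = x, fx
        t *= alpha                                # geometric cooling step
    return best_x, best_fx


def rugged(x):
    """Hypothetical multimodal test function (1-D Rastrigin), minimum 0 at x = 0."""
    return x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))


if __name__ == "__main__":
    best_x, best_cost = simulated_annealing(rugged, x0=4.0)
    print(f"best x = {best_x:.4f}, cost = {best_cost:.4f}")
```

Early on, the high temperature lets the search accept uphill moves and leave local optima, which a pure gradient descent cannot do. As the temperature decays by the constant factor `alpha`, the search becomes increasingly greedy.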

Publication

Triplexator: Detecting nucleic acid triple helices in genomic and transcriptomic data

Publisher: Cold Spring Harbor Laboratory

Date: 05-2012

DOI: 10.1101/GR.130237.111

Abstract: Double-stranded DNA is able to form triple-helical structures by accommodating a third nucleotide strand in its major groove. This sequence-specific process offers a potent mechanism for targeting genomic loci of interest that is of great value for biotechnological and gene-therapeutic applications. It is likely that nature has leveraged this addressing system for gene regulation, because computational studies have uncovered an abundance of putative triplex target sites in various genomes, with enrichment particularly in gene promoters. However, to draw a more complete picture of the in vivo role of triplexes, not only the putative targets but also the sequences acting as the third strand and their capability to pair with the predicted target sites need to be studied. Here we present Triplexator, the first computational framework that integrates all aspects of triplex formation, and showcase its potential by discussing research examples for which the different aspects of triplex formation are important. We find that chromatin-associated RNAs have a significantly higher fraction of sequence features able to form triplexes than expected at random, suggesting their involvement in gene regulation. We furthermore identify hundreds of human genes that contain sequence features in their promoter predicted to be able to form a triplex with a target within the same promoter, suggesting the involvement of triplexes in feedback-based gene regulation. With focus on biotechnological applications, we screen mammalian genomes for high-affinity triplex target sites that can be used to target genomic loci specifically and find that triplex formation offers a resolution of ∼1300 nt.

Publication

STREAM: Static Thermodynamic REgulAtory Model of transcription

Publisher: Oxford University Press (OUP)

Date: 06-09-2008

DOI: 10.1093/BIOINFORMATICS/BTN467

Abstract: Motivation: Understanding the transcriptional regulation of a gene in detail is a crucial step towards uncovering and ultimately utilizing the regulatory grammar of the genome. Modeling transcriptional regulation using thermodynamic equations has become an increasingly important approach towards this goal. Here, we present STREAM, the first publicly available framework for modeling, visualizing and predicting the regulation of the transcription rate of a target gene. Given the concentrations of a set of transcription factors (TFs), the TF binding sites (TFBSs) in a regulatory DNA region, and the transcription rate of the target gene, STREAM will optimize its parameters to generate a model that best fits the input data. This trained model can then be used to (a) validate that the given set of TFs is able to regulate the target gene and (b) predict the transcription rate under different conditions (e.g. different tissues, knockout/additional TFs or mutated/missing TFBSs). Availability: The platform-independent executable of STREAM, as well as a tutorial and the full documentation, are available at bioinformatics.org.au/stream/. STREAM requires Java version 5 or higher. Contact: d.bauer@imb.uq.edu.au t.bailey@imb.uq.edu.au

Publication

Feasibility of Targeted Next-Generation DNA Sequencing for Expanding Population Newborn Screening

Publisher: Oxford University Press (OUP)

Date: 14-07-2023

DOI: 10.1093/CLINCHEM/HVAD066

Abstract: Newborn screening (NBS) is an effective public health intervention that reduces death and disability from treatable genetic diseases, but many conditions are not screened due to a lack of a suitable assay. Whole genome and whole exome sequencing can potentially expand NBS but there remain many technical challenges preventing their use in population NBS. We investigated if targeted gene sequencing (TGS) is a feasible methodology for expanding NBS. We constructed a TGS panel of 164 genes which screens for a broad range of inherited conditions. We designed a high-volume, low-turnaround laboratory and bioinformatics workflow that avoids the technical and data interpretation challenges associated with whole genome and whole exome sequencing. A methods-based analytical validation of the assay was completed and test performance in 2552 newborns examined. We calculated annual birth estimates for each condition to assess cost-effectiveness. Assay analytical sensitivity was & % and specificity was 100%. Of the newborns screened, 1.3% tested positive for a condition. On average, each individual had 225 variants to interpret and 1.8% were variants of uncertain significance (VUS). The turnaround time was 7 to 10 days. Maximum batch size was 1536 samples. We demonstrate that a TGS assay could be incorporated into an NBS program soon to increase the number of conditions screened. Additionally, we conclude that NBS using TGS may be cost-effective.

Publication

Genomics and personalised whole-of-life healthcare

Publisher: Elsevier BV

Date: 09-2014

DOI: 10.1016/J.MOLMED.2014.04.001

Abstract: Genome sequencing has the potential for stratified cancer treatment and improved diagnostics for rare disorders. However, sequencing needs to be utilised in risk stratification on a population scale to deepen the impact on the health system by addressing common diseases, where individual genomic variants have variable penetrance and minor impact. As the accuracy of genomic risk predictors is bounded by heritability, environmental factors such as diet, lifestyle, and microbiome have to be considered. Large-scale, longitudinal research programmes need to study the intrinsic properties between both genetics and environment to unravel their risk contribution. During this discovery process, frameworks need to be established to counteract unrealistic expectations. Sufficient scientific evidence is needed to interpret sources of uncertainty and inform decision making for clinical management and personal health.

Publication

Fast and Accurate Exhaustive Higher-Order Epistasis Search with BitEpi

Publisher: Cold Spring Harbor Laboratory

Date: 29-11-2019

DOI: 10.1101/858282

Abstract: Complex genetic diseases may be modulated by a large number of epistatic interactions affecting a polygenic phenotype. Identifying these interactions is difficult due to computational complexity, especially in the case of higher-order interactions where more than two genomic variants are involved. In this paper, we present BitEpi, a fast and accurate method to test all possible combinations of up to four bi-allelic variants (i.e. Single Nucleotide Variant or SNV for short). BitEpi introduces a novel bitwise algorithm that is 2.1 and 56 times faster for 3-SNV and 4-SNV search, respectively, than established software. The novel entropy statistic used in BitEpi is 44% more accurate in identifying interactive SNVs, incorporating a p-value-based significance testing. We demonstrate BitEpi on real world data of 4,900 samples and 87,000 SNPs. We also present EpiExplorer to visualize the potentially large number of individual and interacting SNVs in an interactive Cytoscape graph. EpiExplorer uses various visual elements to facilitate the discovery of true biological events in a complex polygenic environment.
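
Illustrative note: the abstract above mentions a bitwise algorithm for exhaustive SNV-combination testing without describing it. The sketch below is not BitEpi's algorithm. It only shows one common way bit operations can speed up such a search, under the assumption that genotypes are coded 0, 1 or 2 per sample: each SNV genotype becomes a bitmask over samples, so the case/control count for any genotype combination is a bitwise AND followed by a population count. The function names and the toy data are hypothetical.

```python
from itertools import product


def genotype_masks(genotypes):
    """Return three bitmasks (one per genotype code 0/1/2) for a single SNV.

    Bit i of masks[g] is set when sample i carries genotype g.
    """
    masks = [0, 0, 0]
    for sample_idx, g in enumerate(genotypes):
        masks[g] |= 1 << sample_idx
    return masks


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def contingency_counts(snvs, phenotype):
    """Count (cases, controls) for every genotype combination of the given SNVs.

    snvs: list of genotype lists, one list per SNV, all of equal length.
    phenotype: list of 0/1 labels per sample (1 = case).
    """
    n = len(phenotype)
    all_mask = (1 << n) - 1
    case_mask = sum(1 << i for i, p in enumerate(phenotype) if p == 1)
    per_snv = [genotype_masks(g) for g in snvs]
    table = {}
    for combo in product(range(3), repeat=len(snvs)):
        carriers = all_mask
        for snv_idx, g in enumerate(combo):
            carriers &= per_snv[snv_idx][g]   # samples matching so far
        cases = popcount(carriers & case_mask)
        table[combo] = (cases, popcount(carriers) - cases)
    return table


if __name__ == "__main__":
    # Toy data: 6 samples, 2 SNVs (hypothetical values for illustration).
    snv_a = [0, 1, 2, 1, 0, 2]
    snv_b = [1, 1, 0, 2, 1, 0]
    labels = [1, 1, 0, 1, 0, 0]
    for combo, counts in sorted(contingency_counts([snv_a, snv_b], labels).items()):
        print(combo, counts)
```

The resulting table has 3^k cells for k SNVs and is the input that an association statistic (an entropy measure in the paper's case) would be computed from.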

Publication

Evaluation of computational programs to predict HLA genotypes from genomic sequencing data

Publisher: Oxford University Press (OUP)

Date: 11-2016

DOI: 10.1093/BIB/BBW097

Publication

A comparative study of multi-omics integration tools for cancer driver gene identification and tumour subtyping

Publisher: Oxford University Press (OUP)

Date: 27-11-2020

DOI: 10.1093/BIB/BBZ121

Abstract: Oncogenesis and cancer can arise as a consequence of a wide range of genomic aberrations including mutations, copy number alterations, expression changes and epigenetic modifications encompassing multiple omics layers. Integrating genomic, transcriptomic, proteomic and epigenomic datasets via multi-omics analysis provides the opportunity to derive a deeper and holistic understanding of the development and progression of cancer. There are two primary approaches to integrating multi-omics data: multi-staged (focused on identifying genes driving cancer) and meta-dimensional (focused on establishing clinically relevant tumour or sample classifications). A number of ready-to-use bioinformatics tools are available to perform both multi-staged and meta-dimensional integration of multi-omics data. In this study, we compared nine different integration tools using real and simulated cancer datasets. The performance of the multi-staged integration tools was assessed at the gene, function and pathway levels, while meta-dimensional integration tools were assessed based on the sample classification performance. Additionally, we discuss the influence of factors such as data representation, sample size, signal and noise on multi-omics data integration. Our results provide current and much needed guidance regarding selection and use of the most appropriate and best performing multi-omics integration tools.

Publication

Identity by descent analysis identifies founder events and links SOD1 familial and sporadic ALS cases

Publisher: Springer Science and Business Media LLC

Date: 07-08-2020

DOI: 10.1038/S41525-020-00139-8

Abstract: Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disorder characterised by the loss of upper and lower motor neurons resulting in paralysis and eventual death. Approximately 10% of ALS cases have a family history of disease, while the remainder present as apparently sporadic cases. Heritability studies suggest a significant genetic component to sporadic ALS, and although most sporadic cases have an unknown genetic aetiology, some familial ALS mutations have also been found in sporadic cases. This suggests that some sporadic cases may be unrecognised familial cases with reduced disease penetrance in their ancestors. A powerful strategy to uncover a familial link is identity-by-descent (IBD) analysis, which detects genomic regions that have been inherited from a common ancestor. IBD analysis was performed on 83 Australian familial ALS cases from 25 families and three sporadic ALS cases, each of whom carried one of three SOD1 mutations (p.I114T, p.V149G and p.E101G). We defined five unique 350-SNP haplotypes that carry these mutations in our cohort, indicative of five founder events. This included two founder haplotypes that carry SOD1 p.I114T linking familial and sporadic cases. We found that SOD1 p.E101G arose independently in each family that carries this mutation and linked two families that carry SOD1 p.V149G. The age of disease onset varied between cases that carried each SOD1 p.I114T haplotype. Linking families with identical ALS mutations allows for larger sample sizes and increased statistical power to identify putative phenotypic modifiers.

Publication

Lessening Organ Dysfunction With Vitamin C (LOVIT) Trial: Statistical Analysis Plan (Preprint)

Publisher: JMIR Publications Inc.

Date: 27-01-2022

DOI: 10.2196/PREPRINTS.36261

Abstract: The LOVIT (Lessening Organ Dysfunction with Vitamin C) trial is a blinded multicenter randomized clinical trial comparing high-dose intravenous vitamin C to placebo in patients admitted to the intensive care unit with proven or suspected infection as the main diagnosis and receiving a vasopressor. We aim to describe a prespecified statistical analysis plan (SAP) for the LOVIT trial prior to unblinding and locking of the trial database. The SAP was designed by the LOVIT principal investigators and statisticians, and approved by the steering committee and coinvestigators. The SAP defines the primary and secondary outcomes, and describes the planned primary, secondary, and subgroup analyses. The SAP includes a draft participant flow diagram, tables, and planned figures. The primary outcome is a composite of mortality and persistent organ dysfunction (receipt of mechanical ventilation, vasopressors, or new renal replacement therapy) at 28 days, where day 1 is the day of randomization. All analyses will use a frequentist statistical framework. The analysis of the primary outcome will estimate the risk ratio and 95% CI in a generalized linear mixed model with binomial distribution and log link, with site as a random effect. We will perform a secondary analysis adjusting for prespecified baseline clinical variables. Subgroup analyses will include age, sex, frailty, severity of illness, Sepsis-3 definition of septic shock, baseline ascorbic acid level, and COVID-19 status. We have developed an SAP for the LOVIT trial and will adhere to it in the analysis phase. ERR1-10.2196/36261

Publication

Genome-wide Analyses Identify KIF5A as a Novel ALS Gene

Publisher: Elsevier BV

Date: 03-2018

DOI: 10.1016/J.NEURON.2018.02.027

Publication

Three-dimensional disorganization of the cancer genome occurs coincident with long-range genetic and epigenetic alterations

Publisher: Cold Spring Harbor Laboratory

Date: 06-04-2016

DOI: 10.1101/GR.201517.115

Abstract: A three-dimensional chromatin state underpins the structural and functional basis of the genome by bringing regulatory elements and genes into close spatial proximity to ensure proper, cell-type–specific gene expression profiles. Here, we performed Hi-C chromosome conformation capture sequencing to investigate how three-dimensional chromatin organization is disrupted in the context of copy-number variation, long-range epigenetic remodeling, and atypical gene expression programs in prostate cancer. We find that cancer cells retain the ability to segment their genomes into megabase-sized topologically associated domains (TADs); however, these domains are generally smaller due to establishment of additional domain boundaries. Interestingly, a large proportion of the new cancer-specific domain boundaries occur at regions that display copy-number variation. Notably, a common deletion on 17p13.1 in prostate cancer spanning the TP53 tumor suppressor locus results in bifurcation of a single TAD into two distinct smaller TADs. Change in domain structure is also accompanied by novel cancer-specific chromatin interactions within the TADs that are enriched at regulatory elements such as enhancers, promoters, and insulators, and associated with alterations in gene expression. We also show that differential chromatin interactions across regulatory regions occur within long-range epigenetically activated or silenced regions of concordant gene activation or repression in prostate cancer. Finally, we present a novel visualization tool that enables integrated exploration of Hi-C interaction data, the transcriptome, and epigenome. This study provides new insights into the relationship between long-range epigenetic and genomic dysregulation and changes in higher-order chromatin interactions in cancer.

Publication

Cover Image

Publisher: Hindawi Limited

Date: 07-2020

DOI: 10.1111/TBED.13242

Publication

Triplex-Inspector: an analysis tool for triplex-mediated targeting of genomic loci

Publisher: Oxford University Press (OUP)

Date: 05-06-2013

DOI: 10.1093/BIOINFORMATICS/BTT315

Abstract: Summary: At the heart of many modern biotechnological and therapeutic applications lies the need to target specific genomic loci with pinpoint accuracy. Although landmark experiments demonstrate technological maturity in manufacturing and delivering genetic material, the genomic sequence analysis to find suitable targets lags behind. We provide a computational aid for the sophisticated design of sequence-specific ligands and selection of appropriate targets, taking gene location and genomic architecture into account. Availability: Source code and binaries are downloadable from www.bioinformatics.org.au/triplexator/inspector. Contact: t.bailey@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.

Publication

Hybridization-based reconstruction of small non-coding RNA transcripts from deep sequencing data

Publisher: Oxford University Press (OUP)

Date: 16-06-2012

DOI: 10.1093/NAR/GKS505

Publication

Assigning roles to DNA regulatory motifs using comparative genomics

Publisher: Oxford University Press (OUP)

Date: 10-02-2010

DOI: 10.1093/BIOINFORMATICS/BTQ049

Abstract: Motivation: Transcription factors (TFs) are crucial during the lifetime of the cell. Their functional roles are defined by the genes they regulate. Uncovering these roles not only sheds light on the TF at hand but puts it into the context of the complete regulatory network. Results: Here, we present an alignment- and threshold-free comparative genomics approach for assigning functional roles to DNA regulatory motifs. We incorporate our approach into the Gomo algorithm, a computational tool for detecting associations between a user-specified DNA regulatory motif [expressed as a position weight matrix (PWM)] and Gene Ontology (GO) terms. Incorporating multiple species into the analysis significantly improves Gomo's ability to identify GO terms associated with the regulatory targets of TFs. Including three comparative species in the process of predicting TF roles in Saccharomyces cerevisiae and Homo sapiens increases the number of significant predictions by 75 and 200%, respectively. The predicted GO terms are also more specific, yielding deeper biological insight into the role of the TF. Adjusting motif (binding) affinity scores for individual sequence composition proves to be essential for avoiding false positive associations. We describe a novel DNA sequence-scoring algorithm that compensates a thermodynamic measure of DNA-binding affinity for individual sequence base composition. Gomo's prediction accuracy proves to be relatively insensitive to how promoters are defined. Because Gomo uses a threshold-free form of gene set analysis, there are no free parameters to tune. Biologists can investigate the potential roles of DNA regulatory motifs of interest using Gomo via the web (meme.nbcr.net). Contact: t.bailey@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.

Publication

Sorting the nuclear proteome

Publisher: Oxford University Press (OUP)

Date: 14-05-2006

DOI: 10.1093/BIOINFORMATICS/BTR217

Abstract: Motivation: Quantitative experimental analyses of the nuclear interior reveal a morphologically structured yet dynamic mix of membraneless compartments. Major nuclear events depend on the functional integrity and timely assembly of these intra-nuclear compartments. Yet, unknown drivers of protein mobility ensure that they are in the right place at the time when they are needed. Results: This study investigates determinants of associations between eight intra-nuclear compartments and their proteins in heterogeneous genome-wide data. We develop a model based on a range of candidate determinants, capable of mapping the intra-nuclear organization of proteins. The model integrates protein interactions, protein domains, post-translational modification sites and protein sequence data. The predictions of our model are accurate with a mean AUC (over all compartments) of 0.71. We present a complete map of the association of 3567 mouse nuclear proteins with intra-nuclear compartments. Each decision is explained in terms of essential interactions and domains, and qualified with a false discovery assessment. Using this resource, we uncover the collective role of transcription factors in each of the compartments. We create diagrams illustrating the outcomes of a Gene Ontology enrichment analysis. Associated with an extensive range of transcription factors, the analysis suggests that PML bodies coordinate regulatory immune responses. Contact: m.boden@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.

Publication

Cpipe: A shared variant detection pipeline designed for diagnostic settings

Publisher: Springer Science and Business Media LLC

Date: 10-07-2015

DOI: 10.1186/S13073-015-0191-X

Publication

Gut permeability, its interaction with gut microflora and effects on metabolic health are mediated by the lymphatics system, liver and bile acid

Publisher: Future Medicine Ltd

Date: 08-2015

DOI: 10.2217/FMB.15.54

Abstract: There is evidence to link obesity (and metabolic syndrome) with alterations in gut permeability and microbiota. The underlying mechanisms have been questioned and have prompted this review. We propose that the gut barrier function is a primary driver in maintaining metabolic health with poor health being linked to 'gut leakiness'. This review will highlight changes in intestinal permeability and how it may change gut microflora and subsequently affect metabolic health by influencing the functioning of major bodily organs/organ systems: the lymphatic system, liver and pancreas. We also discuss the likelihood that metabolic syndrome undergoes a cyclic worsening facilitated by an increase in intestinal permeability leading to gut dysbiosis, culminating in ongoing poor health leading to further exacerbated gut leakiness.

Publication

Genetic Analysis of Tryptophan Metabolism Genes in Sporadic Amyotrophic Lateral Sclerosis

Publisher: Frontiers Media SA

Date: 14-06-2021

DOI: 10.3389/FIMMU.2021.701550

Abstract: The essential amino acid tryptophan (TRP) is the initiating metabolite of the kynurenine pathway (KP), which can be upregulated by inflammatory conditions in cells. Neuroinflammation-triggered activation of the KP and excessive production of the KP metabolite quinolinic acid are common features of multiple neurodegenerative diseases, including amyotrophic lateral sclerosis (ALS). In addition to its role in the KP, genes involved in TRP metabolism, including its incorporation into proteins, and synthesis of the neurotransmitter serotonin, have also been genetically and functionally linked to these diseases. ALS is a late onset neurodegenerative disease that is classified as familial or sporadic, depending on the presence or absence of a family history of the disease. Heritability estimates support a genetic basis for all ALS, including the sporadic form of the disease. However, the genetic basis of sporadic ALS (SALS) is complex, with the presence of multiple gene variants acting to increase disease susceptibility and is further complicated by interaction with potential environmental factors. We aimed to determine the genetic contribution of 18 genes involved in TRP metabolism, including protein synthesis, serotonin synthesis and the KP, by interrogating whole-genome sequencing data from 614 Australian sporadic ALS cases. Five genes in the KP (AFMID, CCBL1, GOT2, KYNU, HAAO) were found to have either novel protein-altering variants, and/or a burden of rare protein-altering variants in SALS cases compared to controls. Four genes involved in TRP metabolism for protein synthesis (WARS) and serotonin synthesis (TPH1, TPH2, MAOA) were also found to carry novel variants and/or gene burden. These variants may represent ALS risk factors that act to alter the KP and lead to neuroinflammation. These findings provide further evidence for the role of TRP metabolism, the KP and neuroinflammation in ALS disease pathobiology.

Publication

Genetic and Pathological Assessment of hnRNPA1, hnRNPA2/B1, and hnRNPA3 in Familial and Sporadic Amyotrophic Lateral Sclerosis

Publisher: S. Karger AG

Date: 2017

DOI: 10.1159/000481258

Abstract: Background: Mutations in the genes encoding the heterogeneous nuclear ribonucleoproteins hnRNPA1 and hnRNPA2/B1 have been reported in a multisystem proteinopathy that includes amyotrophic lateral sclerosis (ALS) and inclusion body myopathy associated with Paget disease of the bone and frontotemporal dementia. Mutations were also described in the prion-like domain of hnRNPA1 in patients with classic ALS. Another hnRNP protein, hnRNPA3, has been found to be associated with the ALS/frontotemporal dementia protein C9orf72. Objective: To further assess their role in ALS, we examined these hnRNPs in spinal cord tissue from sporadic (SALS) and familial ALS (FALS) patients, including C9orf72 repeat expansion-positive patients, and controls. We also sought to determine the prevalence of HNRNPA1, HNRNPA2B1, and HNRNPA3 mutations in Australian ALS patients. Methods: Immunostaining was used to assess hnRNPs in ALS patient spinal cords. Mutation analysis of the HNRNPA1, HNRNPA2B1, and HNRNPA3 genes was performed in FALS and of their prion-like domains in SALS patients. Results: Immunostaining of spinal motor neurons of ALS patients with the C9orf72 repeat expansion showed significant mislocalisation of hnRNPA3, and no differences in hnRNPA1 or A2/B1 localisation, compared to controls. No novel or known mutations were identified in HNRNPA1, HNRNPA2B1, or HNRNPA3 in Australian ALS patients. Conclusions: hnRNPA3 pathology was identified in motor neurons of ALS patients with C9orf72 repeat expansions, implicating hnRNPA3 in the pathogenesis of C9orf72-linked ALS. hnRNPA3 warrants further investigation into the pathogenesis of ALS linked to C9orf72. This study also determined that HNRNP mutations are not a common cause of FALS and SALS in Australia.

Publication

Blue: correcting sequencing errors using consensus and context

Publisher: Oxford University Press (OUP)

Date: 11-06-2014

DOI: 10.1093/BIOINFORMATICS/BTU368

Abstract: Motivation: Bioinformatics tools, such as assemblers and aligners, are expected to produce more accurate results when given better quality sequence data as their starting point. This expectation has led to the development of stand-alone tools whose sole purpose is to detect and remove sequencing errors. A good error-correcting tool would be a transparent component in a bioinformatics pipeline, simply taking sequence data in any of the standard formats and producing a higher quality version of the same data containing far fewer errors. It should not only be able to correct all of the types of errors found in real sequence data (substitutions, insertions, deletions and uncalled bases), but it has to be both fast enough and scalable enough to be usable on the large datasets being produced by current sequencing technologies, and work on data derived from both haploid and diploid organisms. Results: This article presents Blue, an error-correction algorithm based on k-mer consensus and context. Blue can correct substitution, deletion and insertion errors, as well as uncalled bases. It accepts both FASTQ and FASTA formats, and corrects quality scores for corrected bases. Blue also maintains the pairing of reads, both within a file and between pairs of files, making it compatible with downstream tools that depend on read pairing. Blue is memory efficient, scalable and faster than other published tools, and usable on large sequencing datasets. On the tests undertaken, Blue also proved to be generally more accurate than other published algorithms, resulting in more accurately aligned reads and the assembly of longer contigs containing fewer errors. One significant feature of Blue is that its k-mer consensus table does not have to be derived from the set of reads being corrected. This decoupling makes it possible to correct one dataset, such as a small set of 454 mate-pair reads, with the consensus derived from another dataset, such as Illumina reads derived from the same DNA sample. Such cross-correction can greatly improve the quality of small (and expensive) sets of long reads, leading to even better assemblies and higher quality finished genomes. Availability and implementation: The code for Blue and its related tools is available from www.bioinformatics.csiro.au/Blue. These programs are written in C# and run natively under Windows and under Mono on Linux. Contact: paul.greenfield@csiro.au Supplementary information: Supplementary data are available at Bioinformatics online.

Denis Bauer

Researcher

Publications

Targeted next-generation sequencing of 22 mismatch repair genes identifies Lynch syndrome families

Genome-wide analysis of chemically induced mutations in mouse in phenotype-driven screens

Predicting SUMOylation sites in developmental transcription factors of Drosophila melanogaster

Isling: A Tool for Detecting Integration of Wild-Type Viruses and Clinical Vectors

Studying the functional conservation of cis-regulatory modules and their transcriptional output

Unlocking HDR-mediated nucleotide editing by identifying high-efficiency target sites using machine learning

Ankyrin-1 gene exhibits allelic heterogeneity in conferring protection against malaria

Genetic and immunopathological analysis of CHCHD10 in Australian amyotrophic lateral sclerosis and frontotemporal dementia and transgenic TDP-43 mice

Genetic analysis of GLT8D1 and ARPP21 in Australian familial and sporadic amyotrophic lateral sclerosis

A Navigation System for Base Editing: Are We There Yet?

Data-driven platform for identifying variants of interest in COVID-19 virus

Evidence for polygenic and oligogenic basis of Australian sporadic amyotrophic lateral sclerosis

Monozygotic twins and triplets discordant for amyotrophic lateral sclerosis display differential methylation and gene expression

Artificial Intelligence and Machine Learning in Bioinformatics

NGSANE

Balancing the safeguarding of privacy and data sharing: perceptions of genomic professionals on patient genomic data ownership in Australia

WHO O2CoV2: oxygen requirements and respiratory support in patients with COVID-19 in low-and-middle income countries—protocol for a multicountry, prospective, observational cohort study

Stress analysis of nano porous material using computed tomography images

Fast and accurate exhaustive higher-order epistasis search with BitEpi

A novel ENU-induced ankyrin-1 mutation impairs parasite invasion and increases erythrocyte clearance during malaria infection in mice

NGSANE: A lightweight production informatics framework for high-throughput data analysis

VariantSpark

Host Porphobilinogen Deaminase Deficiency Confers Malaria Resistance in Plasmodium chabaudi but Not in Plasmodium berghei or Plasmodium falciparum During Intraerythrocytic Growth

Lessening Organ Dysfunction With Vitamin C (LOVIT) Trial: Statistical Analysis Plan

Domain-specific introduction to machine learning terminology, pitfalls and opportunities in CRISPR-based gene editing

Variantspark: Cloud-based machine learning for association study of complex phenotype and large-scale genomic data

Scalable genomic data exchange and analytics with sBeacon

Artificial Intelligence in Medicine: Applications, Limitations and Future Directions

Adenosine monophosphate deaminase 3 activation shortens erythrocyte half-life and provides malaria resistance in mice

Early life events influence whole-of-life metabolic health via gut microflora and gut permeability

INSIDER

The inequity of targeted cystic fibrosis reproductive carrier screening tests in a multiethnic Australian population

VariantSpark: Population scale clustering of genotype information

Supporting pandemic response using genomics and bioinformatics: A case study on the emergent SARS-CoV-2 outbreak

Genetic correlation between amyotrophic lateral sclerosis and schizophrenia

Methylome and transcriptome maps of human visceral and subcutaneous adipocytes reveal key epigenetic differences at developmental genes

TRIBES: A user-friendly pipeline for relatedness detection and disease gene discovery

Dual-functioning transcription factors in the developmental gene network of Drosophila melanogaster

VariantSpark, a cloud-based random forest GWAS platform, identifies novel loci and epistasis in Alzheimer's disease

Predicting CRISPR-Cas12a guide efficiency for targeting using Machine Learning

Novel Alzheimer’s disease genes and epistasis identified using machine learning GWAS platform

Optimized nickase- and nuclease-based prime editing in human and mouse cells

VARSCOT: variant-aware detection and scoring enables sensitive and personalized off-target detection for CRISPR-Cas9

Host porphobilinogen deaminase deficiency confers malaria resistance in Plasmodium chabaudi but not in Plasmodium berghei or Plasmodium falciparum during intraerythrocytic growth

High Activity Target-Site Identification Using Phenotypic Independent CRISPR-Cas9 Core Functionality

Effects of low-dose hydrocortisone and hydrocortisone plus fludrocortisone in adults with septic shock: a protocol for a systematic review and meta-analysis of individual participant data

Unlocking HDR-mediated Nucleotide Editing by identifying high-efficiency target sites using machine learning

IBD analysis of Australian amyotrophic lateral sclerosis SOD1-mutation carriers identifies five founder events and links sporadic cases to existing ALS families

A novel ENU-induced ankyrin-1 mutation impairs parasite invasion and increases erythrocyte clearance during malaria infection in mice

Predicting SUMOylation sites

Mutation analysis of MATR3 in Australian familial amyotrophic lateral sclerosis

A bioinformatic pipeline for simulating viral integration data

Predicting structural disruption of proteins caused by crossover

GOANA: A Universal High-Throughput Web Service for Assessing and Comparing the Outcome and Efficiency of Genome Editing Experiments

Thresholding Gini Variable Importance with a Single-Trained Random Forest: An Empirical Bayes Approach

The Current State and Future of CRISPR-Cas9 gRNA Design Tools

STAR: predicting recombination sites from amino acid sequence

Interoperable medical data: The missing link for understanding COVID‐19

VariantSpark, A Random Forest Machine Learning Implementation for Ultra High Dimensional Data

Human and microbial transcriptomics from lean and obese individuals with colorectal cancer: A comparison of Total and Poly A RNA sequencing from clinical samples.

Ankyrin-1 gene exhibits allelic heterogeneity in conferring protection against malaria

Optimizing static thermodynamic models of transcriptional regulation

Triplexator: Detecting nucleic acid triple helices in genomic and transcriptomic data

STREAM: Static Thermodynamic REgulAtory Model of transcription

Feasibility of Targeted Next-Generation DNA Sequencing for Expanding Population Newborn Screening

Genomics and personalised whole-of-life healthcare

Fast and Accurate Exhaustive Higher-Order Epistasis Search with BitEpi

Evaluation of computational programs to predict HLA genotypes from genomic sequencing data

A comparative study of multi-omics integration tools for cancer driver gene identification and tumour subtyping

Identity by descent analysis identifies founder events and links SOD1 familial and sporadic ALS cases

Lessening Organ Dysfunction With Vitamin C (LOVIT) Trial: Statistical Analysis Plan (Preprint)

Genome-wide Analyses Identify KIF5A as a Novel ALS Gene