Efficient data mining methods for evidence-based decision making. This project aims to develop efficient data mining methods for causal predictions. Evidence-based decision making (EBD), such as evidence-based medicine and policy, is always preferable. To support EBD, causal predictions forecast how outcomes change when conditions are manipulated. Progress has been made in theoretical research on causal inference based on observational data, but few methods can automatically mine causal signals from the data, and methods for efficient causal predictions based on data are even fewer. This project will apply its methods to biomedical problems. The outcomes could support smart, data-driven, evidence-based decision making in many areas, such as therapeutics and government policy making.
Deep correction of DNA sequencing errors by data mining algorithms. This project aims to investigate the many layers of error correction problems in terabytes of genomic sequence data and to solve these problems with novel data mining algorithms. High-throughput sequencing platforms have generated massive amounts of useful raw data, but have also introduced widespread errors. The new algorithms are capable of correcting errors at deeper layers to further enhance data quality. Expected outcomes include advances in knowledge for the genomic data industry and interdisciplinary collaboration between biotechnology and data mining. The project also provides significant benefit for genomic decisions in forensics and personalised medicine, which demand accurate genomic information.
Reconstructing proteins to explain and engineer biological diversity. The aim of this project is to develop computational methods to construct entirely new proteins. Computational reconstruction of enzymes that have been extinct for over 400 million years has revealed remarkable opportunities for biotechnological innovation. The intended outcomes are to develop bioinformatics methods to broaden the scope of ancestral protein reconstruction to include protein super-families, to establish what specific changes led to the evolutionary success of a protein, and to re-run evolution to generate proteins that perform in conditions suitable for industrial and agricultural applications, in particular the production of hydroxylated fatty acids for bioplastics. By examining proteins from many life forms, the project plans to develop a novel bioinformatics strategy to understand their evolution and engineer new proteins for use in production of chemical commodities.
Searching for near-exact protein models. This project aims to develop novel and efficient heuristic-based algorithms leading to near-accurate protein tertiary structure models. Knowledge about protein structures is fundamental to our understanding of living systems. Progress on experimental determination of these structures has been extremely limited, and this remains an open challenge in molecular biology. Computational prediction of protein structures from sequences is emerging as a promising approach, but its accuracy is far from satisfactory. The software systems developed in this project will be used in structural identification of target proteins in drug design. This will make the drug design process more efficient, saving time and cost and potentially saving lives.
Flexible user-guided network layout for biomedical applications. This project will develop techniques for automatic layout of biological network diagrams, allowing users to guide the layout while satisfying any required placement constraints and drawing conventions. As part of the project, these methods will be integrated into several real-world systems biology applications for network browsing and authoring.
RNA structure prediction by deep learning and evolution-derived restraints. This project addresses the long-standing structure-folding problem of ribonucleic acids (RNA), whose solution is essential for elucidating the roles of noncoding RNAs in living organisms. The proposed approach will detect hidden homologous sequences and enhance evolutionary covariation signals by developing new algorithms for search and smarter neural networks for deep learning. The project expects to generate new tools for structure-based probing of RNA evolutionary and functional mechanisms. The outcomes should provide significant benefits through high-accuracy computational modelling of RNA structures that are difficult and costly to solve by current structural biology techniques but important for enabling biotech and clinical applications.
BioPPSy: An open source BIOchemical Property Prediction SYstem. Computer software will be developed for the prediction of the pharmacokinetic properties of small molecules to assist in the development of new compounds with drug-like properties. The software will be made freely available to promote its use and further development.
Evolutionary analyses of short-read sequences from pooled samples. This project aims to provide biologists with a means of making sound statistical inferences about evolution by using next-generation sequencing data from mixed samples. When biologists make statements about history, they use evolutionary trees, frequently reconstructed from the genetic data of many individuals. Next-generation sequencing provides large amounts of genetic data at low cost, but biologists have difficulty using these data for evolutionary research, particularly when they sample mixtures of DNA from many individuals. The anticipated value of this project is that it allows evolutionary biologists to capitalise on the benefits of next-generation sequencing without sacrificing their ability to make reliable inferences about history.