Novel techniques for statistical and mathematical analyses of sequence data. Algorithms will be developed for analysing and comparing the sequences of DNA letters and amino acids constantly being generated in massive quantities by biological research. The novel approach taken is based on the statistical frequency of occurrence of short words and is designed specifically for situations where current methods fail.
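As a rough illustration of what a word-frequency comparison can look like, the sketch below counts overlapping words of length k in two sequences and sums the products of the counts for matching words (the quantity commonly called the D2 statistic). It is not the project's own algorithm, which the summary does not specify; the function names, the choice of k, and the example sequences are assumptions made purely for illustration.

```python
from collections import Counter


def word_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2_statistic(seq_a, seq_b, k):
    """Sum, over all words, of (count in seq_a) * (count in seq_b).

    Words absent from seq_b contribute zero, because a Counter
    returns 0 for a missing key.
    """
    counts_a = word_counts(seq_a, k)
    counts_b = word_counts(seq_b, k)
    return sum(n * counts_b[word] for word, n in counts_a.items())


if __name__ == "__main__":
    # Hypothetical toy sequences; real inputs would be far longer.
    print(d2_statistic("ACGTACGT", "ACGT", k=2))
```

Because only word counts are compared, no alignment of the two sequences is needed, which is the general appeal of word-based methods when alignment is unreliable or too costly.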
Evolutionary analyses of short-read sequences from pooled samples. This project aims to provide biologists with a means of making sound, statistical inferences about evolution by using next-generation data from mixed samples. When biologists make statements about history, they use evolutionary trees, frequently reconstructed from the genetic data of many individuals. Next-generation sequencing provides large amounts of genetic data at low cost, but biologists have difficulty using these data for evolutionary research, particularly when they sample mixtures of DNA from many individuals. The anticipated value of this project is that it allows evolutionary biologists to capitalise on the benefits of next-generation sequencing, without sacrificing their ability to make reliable inferences about history.