Next-generation techniques for analysing massive data sets. To process enormous amounts of data, leading computing companies are turning to modern computing frameworks, for which little theory of efficient computational techniques has been developed. This project will resolve key theoretical questions and provide fast techniques for poorly understood pattern recognition and bioinformatics problems.