Algorithms and Data Structures in Genomics
We aim to develop fast and memory-efficient tools for the analysis of DNA and RNA sequencing data. We have developed several state-of-the-art tools for sequence alignment (Edlib), long-read mapping (Graphmap and Graphmap2), de novo assembly of single genomes and metagenomes (Racon, Raven and RA), and classification of microbes in metagenomic samples. Our work builds on classical algorithms and data structures for strings and graphs. We prefer C++, which enables optimization at various levels, and we parallelize our tools with SIMD instructions (SPOA), MPI and CUDA on GPUs (SW#).
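As a minimal illustration of the kind of classical string algorithm this work builds on, the sketch below computes the edit (Levenshtein) distance between two sequences with a textbook two-row dynamic program. It is not the algorithm or API of Edlib or of any other tool named above; the function name `edit_distance` and the unit costs are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Textbook Levenshtein distance with unit costs for substitution,
// insertion and deletion. Hypothetical helper for illustration only;
// it does not reproduce the implementation of any tool named above.
std::size_t edit_distance(const std::string& a, const std::string& b) {
    // prev holds the DP row for the first i-1 characters of a,
    // curr the row for the first i characters; O(|b|) memory in total.
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;  // delete all i characters of a
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t substitute =
                prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            const std::size_t remove = prev[j] + 1;
            const std::size_t insert = curr[j - 1] + 1;
            curr[j] = std::min({substitute, remove, insert});
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}
```

Calling `edit_distance("ACGT", "AGT")` gives 1, a single deletion. The quadratic running time of this version is what makes bit-parallel, SIMD and GPU variants attractive for long sequences.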
AI in Genomics
Most problems in genomics can be expressed in terms of graphs, strings or raw sequencing signals. We therefore apply contemporary AI methods from graph neural networks, natural language processing and audio recognition to genomics problems such as de novo assembly, detection of DNA and RNA modifications, and fast detection of microbes in a sample.
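To make the graph view concrete, the sketch below turns a set of reads into a de Bruijn graph, in which every k-mer becomes an edge from its (k-1)-mer prefix to its (k-1)-mer suffix. This is a generic textbook construction chosen for brevity, not the graph representation used by the assemblers or models described here; the names `DeBruijnGraph` and `build_de_bruijn` are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Adjacency map: (k-1)-mer -> set of (k-1)-mers that directly follow it
// in at least one read. Hypothetical type for illustration only.
using DeBruijnGraph = std::map<std::string, std::set<std::string>>;

// Builds a de Bruijn graph from reads; reads shorter than k are skipped.
DeBruijnGraph build_de_bruijn(const std::vector<std::string>& reads,
                              std::size_t k) {
    DeBruijnGraph graph;
    if (k < 2) return graph;  // need non-empty (k-1)-mers as nodes
    for (const std::string& read : reads) {
        if (read.size() < k) continue;
        for (std::size_t i = 0; i + k <= read.size(); ++i) {
            // The k-mer starting at i contributes the edge prefix -> suffix.
            graph[read.substr(i, k - 1)].insert(read.substr(i + 1, k - 1));
        }
    }
    return graph;
}
```

For the reads `ACGT` and `CGTA` with k = 3, the result contains the edges AC to CG, CG to GT and GT to TA. A graph of this kind is the sort of structure that graph algorithms, or a graph neural network, could take as input.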
AI in Structural Biology
The use of AI in structural biology is one of the hottest research topics. We use graph neural networks for representation, together with supervised and reinforcement learning methods, to solve problems related to RNA folding and stability and to protein interaction sites.