Most studies in population genetics use data consisting of single nucleotide polymorphisms (SNPs); however, some studies use alternative methods that work directly with the actual allele sequences of individuals. Deep learning (DL) architectures such as Basset and DeepSEA were designed to address biological questions in regulatory genomics (e.g., chromatin accessibility prediction), but a similar, sophisticated architecture can be implemented to address problems in population genetics. One such problem, and the focus of this paper, is the study of recombination initiation maps of individual human genomes, which contain double-strand breaks (DSBs) that mostly occur at discrete hotspots.
This project is still in progress.