Analysis of tested guide RNAs used for gene editing via CRISPR to predict optimal future guide sequences
Despite the apparent simplicity of this sequence-matching strategy for targeting the CRISPR machinery to the correct location in the DNA, not all sgRNAs work equally well (discussed in Doench et al., 2016). It is hypothesized that other features of the sgRNAs themselves, as well as of the target and the surrounding DNA sequences in the genome, affect how well a given sgRNA will work (i.e., how efficiently it targets the CRISPR machinery to the correct place in the genome rather than somewhere else). This project aims to use a machine learning approach to identify the key features of optimal sgRNA sequences, making targeting of the CRISPR gene editing machinery more efficient and effective. These features could then be used to rank newly designed sgRNAs and predict their success or failure. Such a tool would improve gene editing success rates, reduce off-target effects, and thus potentially help fast-track the use of CRISPR technology for human health applications.
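To make "features of the sgRNAs themselves" concrete, here is a minimal sketch of how a guide sequence might be turned into numeric features for a machine learning model. The function name `sgrna_features` and the specific features (GC content, longest homopolymer run, position-specific nucleotides) are assumptions chosen for illustration; they are not necessarily the features this project will use or find important.

```python
from typing import Dict

NUCLEOTIDES = "ACGT"


def sgrna_features(guide: str) -> Dict[str, float]:
    """Turn a guide sequence into a flat dictionary of numeric features.

    Hypothetical helper: the feature set is illustrative only.
    """
    # Accept RNA or DNA spelling by mapping U to T.
    guide = guide.upper().replace("U", "T")
    if not guide or any(base not in NUCLEOTIDES for base in guide):
        raise ValueError("guide must be a non-empty string of A, C, G, T/U")

    features: Dict[str, float] = {}

    # Fraction of G and C bases in the guide.
    features["gc_content"] = (guide.count("G") + guide.count("C")) / len(guide)

    # Length of the longest run of a single repeated base.
    longest = run = 1
    for prev, cur in zip(guide, guide[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    features["longest_homopolymer"] = float(longest)

    # One-hot encoding of the base at each position (1-indexed).
    for pos, base in enumerate(guide, start=1):
        for nt in NUCLEOTIDES:
            features[f"pos{pos}_{nt}"] = 1.0 if base == nt else 0.0

    return features
```

Feature dictionaries like this, paired with measured activity for previously tested guides, could be fed to a standard regression or classification model, whose predicted scores would then be used to rank new candidate guides. Features of the target site and surrounding genomic sequence would need separate inputs that this sketch does not cover.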
Team Lead: Matthew Emery | matthew.emery44@gmail.com | @lstmemery | Data Scientist | Imbellus Inc.