(External) De novo viral genome assembly and other applications using Oxford Nanopore long read data

Advisor: Dr. Oliver Lung, Canadian Food Inspection Agency

Suggested Co-Advisors: Zeny Feng, Stefan Kremer, Baozhong Meng, Dan Tulpan

The efficiency and success of discovering novel and unexpected pathogens from metagenomics data depend heavily on the availability of bioinformatics tools that identify the organisms present in a sample. Many existing tools and approaches rely on reference databases to classify reads and assemble genomes. This is problematic for novel pathogens, for which no closely related reference may be available. Reference-free (de novo) genome assembly methods have the advantage of assembling contigs without relying on a reference sequence. Additionally, long-read sequencing technologies such as Oxford Nanopore and Pacific Biosciences facilitate de novo assembly by generating longer reads, which can yield more contiguous contigs. Long-read de novo genome assemblers have been used successfully to recover complete and nearly complete genomes of many groups of organisms, such as animals, plants, and bacteria; however, their application to viral genome assembly remains challenging because viral genomes are generally smaller and less abundant in metagenomics data.

The main aim of the project is to assess the performance of various long-read de novo assemblers in reconstructing viral genomes from metagenomics data and to develop best-practice guidelines. Students will first familiarize themselves with the bioinformatics tools through a literature review and will select a set of de novo assemblers for further testing. They will then perform a thorough benchmark assessment of the software using various metagenomics datasets generated in the lab.
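As an illustration of what one step of such a benchmark might look like, the sketch below computes basic contiguity statistics (contig count, total length, largest contig, N50) from an assembler's FASTA output. It is only a minimal example under stated assumptions: the function names (`fasta_lengths`, `n50`, `summarize`) and the toy contigs are hypothetical, the project does not prescribe these metrics, and a real assessment would likely also compare contigs against known genomes with established evaluation tools.

```python
from typing import Dict, Iterable, List


def fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of each sequence in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read; None before the first header
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def summarize(lengths: List[int]) -> Dict[str, int]:
    """Collect simple contiguity statistics for one assembly."""
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "largest_bp": max(lengths, default=0),
        "n50_bp": n50(lengths),
    }


if __name__ == "__main__":
    # Toy assembly (hypothetical data): three contigs of 8, 4, and 2 bases.
    toy = [">contig_1", "ACGTACGT", ">contig_2", "ACGT", ">contig_3", "AC"]
    print(summarize(fasta_lengths(toy)))
```

Running the same summary over the output of each assembler on the same dataset would give one comparable row per tool. For a real file, the lines can be supplied with `open(path)` in place of the toy list.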

This is a one-semester project with the potential to extend to two semesters.

Knowledge/Skills

Familiarity with metabarcoding, metagenomics, tools used in computational biology, and shell scripting

If students have any additional skills, we encourage them to point these out, as they would be an asset.