Data Science: Taming the complicated world of wheat genetics

Wheat as seen under a microscope.

By Samantha McReavy  

Using high-performance computing (HPC) power provided by the SOSCIP consortium cloud services, researchers at the University of Guelph are identifying novel genes linked to disease resistance in wheat to develop a robust crop variety. Disease-resistant varieties make better use of farmers’ resources.  

Post-doctoral researchers Soren Seifi and Mina Kaviani and senior scientist Mitra Serajazari in the Department of Plant Agriculture wheat breeding lab are using novel resistance genes to design breeding programs that will prevent crop loss from diseases such as fusarium head blight and stripe rust. 

Ontario produces on average 2.5 million tonnes of wheat a year, making the crop an essential part of the agricultural industry. But because of the damp and humid climate in wheat-growing parts of the province, yield is seriously threatened by multiple diseases. New varieties must be exceptionally robust. 

Fungal diseases such as fusarium head blight produce toxic compounds called mycotoxins, which contaminate grain and animal feed and are harmful to both humans and livestock.

“It’s not just a matter of economics—it’s a matter of health,” says Seifi. 

Using the SOSCIP cloud computing service, researchers analyzed two wheat varieties. The first, Avocet+YR15, is resistant to stripe rust, a fungal disease that reduces wheat productivity. The second, Tenacious, is highly resistant to fusarium head blight. The researchers are studying how these resistant lines protect themselves against the two infections.

“Understanding wheat’s immune response to disease at the genomic level requires supercomputers for big data modelling and analysis,” says Seifi. The team works with 24 central processing units and 250 GB of RAM.

SOSCIP’s HPC cloud—one of the most advanced cloud computing services in Canada—is required for the analysis, storage, and reconfiguration of the raw data generated from wheat’s enormous genome, which is about five times the size of the human genome. (The wheat genome contains many repeated patterns.) 

Each sample plant creates a huge raw data file. That file is placed into a bioinformatics pipeline, a collection of software tools that convert raw data into a manageable form. Researchers analyze information from this pipeline to determine which genes are involved in disease resistance and, eventually, to create resistant wheat varieties. 
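
For readers curious about what "a collection of software tools that convert raw data into a manageable form" can look like in practice, here is a minimal Python sketch of the general idea: each stage takes the output of the previous one. It is not the Guelph team's pipeline. The toy reads, the gene names, the quality threshold of 20, and the function names are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from functools import reduce

# Hypothetical toy input: (transcript_id, mean_base_quality) pairs standing in
# for sequencing reads that have already been assigned to transcripts.
RAW_READS = [
    ("gene_A", 35), ("gene_A", 12), ("gene_B", 38),
    ("gene_A", 31), ("gene_C", 8), ("gene_B", 30),
]

def drop_low_quality(reads, min_quality=20):
    """Stage 1: discard reads below an assumed quality threshold."""
    return [read for read in reads if read[1] >= min_quality]

def count_per_transcript(reads):
    """Stage 2: collapse the surviving reads into one count per transcript."""
    return Counter(transcript_id for transcript_id, _ in reads)

def run_pipeline(data, stages):
    """Feed the output of each stage into the next one, in order."""
    return reduce(lambda acc, stage: stage(acc), stages, data)

counts = run_pipeline(RAW_READS, [drop_low_quality, count_per_transcript])
print(dict(counts))  # {'gene_A': 2, 'gene_B': 2}
```

Six raw records shrink to two per-transcript counts, which is the kind of "manageable form" a researcher can then compare across plants. Real pipelines chain far heavier tools over files that are orders of magnitude larger, which is where the HPC resources come in.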

To date, more than 300,000 mRNA transcripts have been analyzed, resulting in more than 300 potential avenues to control fusarium head blight and stripe rust. 

Ultimately, researchers aim to deploy data from bioinformatics and molecular analyses to help improve breeding programs.  

Funding for this project was provided by SOSCIP and Grain Farmers of Ontario.