Weakly conserved protein sequences – noise or pattern?

Advisors:
Steffen Graether (MCB)
Dan Ashlock (Math and Stats)

Dehydrins are proteins that protect plants from damage caused by drought and cold. They are intrinsically disordered, meaning that they have no well-defined 3D structure. Dehydrins have an unusual architecture: they consist of 1-3 conserved segments surrounded by regions of variable length that show only very weak conservation (generally a preference for small and polar amino acids). These regions are known as phi-segments. The question we wish to answer is whether the phi-segment is essentially random noise constrained only by amino acid composition, or whether there is an underlying pattern that is not obvious from visual inspection. The project will use a number of techniques to assess randomness, including BLAST searches on protein fragments, reduced amino acid alphabets to measure sequence distributions and distances, calculation of evolutionary rates, and AI pattern recognition. Understanding the phi-segment will be key to understanding the evolution of this unusual protein family.
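
To make the "noise restricted by composition" idea concrete, here is a minimal sketch of one possible randomness test built on a reduced amino acid alphabet. It is an illustration under stated assumptions, not the project's method: the five-class grouping, the choice of 2-mer entropy as the statistic, and the toy input sequence (which is not a real dehydrin) are all hypothetical. A sequence is mapped to the reduced alphabet, and its k-mer entropy is compared with that of shuffled copies, which keep the composition but destroy any ordering.

```python
import math
import random
from collections import Counter

# Hypothetical five-class reduced alphabet (an assumption for illustration):
# a = aliphatic, r = aromatic, s = small/polar, + = basic, - = acidic.
GROUPS = {"a": "AVLIMC", "r": "FWY", "s": "GSTNQP", "+": "KRH", "-": "DE"}
TO_REDUCED = {aa: label for label, members in GROUPS.items() for aa in members}


def reduce_sequence(seq):
    """Map each of the 20 standard residues to its class label.

    Raises KeyError on non-standard letters such as X or B.
    """
    return "".join(TO_REDUCED[aa] for aa in seq.upper())


def kmer_counts(seq, k):
    """Count overlapping k-mers in a string."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_entropy(seq, k):
    """Shannon entropy (bits) of the k-mer frequency distribution."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def shuffled(seq, rng):
    """Return a random permutation of seq (same composition, new order)."""
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)


def low_entropy_p_value(seq, k=2, n_shuffles=1000, seed=0):
    """Empirical p-value that shuffles are at least as ordered as seq.

    A small value suggests more ordering than composition alone explains.
    """
    rng = random.Random(seed)
    reduced = reduce_sequence(seq)
    observed = kmer_entropy(reduced, k)
    hits = sum(
        kmer_entropy(shuffled(reduced, rng), k) <= observed
        for _ in range(n_shuffles)
    )
    return (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    toy = "GKE" * 6  # made-up periodic sequence, not a real phi-segment
    print(reduce_sequence(toy))
    print(low_entropy_p_value(toy))
```

A strictly periodic toy input like the one above should score as far more ordered than its shuffles, whereas a sequence that really is composition-constrained noise should look like a typical shuffle. Real phi-segments would need a more careful null model, for example one that accounts for variable segment length.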