Music and Science Meet at the Micro Level: Time-Frequency Methods and Granular Synthesis
Barry Truax (Simon Fraser University, Canada)
Musical research over the last century has become increasingly entwined with the scientific areas of acoustics, psychoacoustics, and electroacoustics, among others. During the last half century, the computer has become the central site of this research, which includes sound synthesis, digital signal processing and computer-assisted composition. One of the most striking developments in this encounter has been to push the frontiers of models of sound and music to the micro level, a domain now generally termed "microsound". At this level, concepts of frequency and time are conjoined by a quantum relationship, with an uncertainty principle relating them that is precisely analogous to the more famous uncertainty principle of quantum physics. Dennis Gabor articulated this quantum principle of sound in 1947 in his critique of the "timeless" Fourier theorem.
Gabor illustrated the quantum as a rectangular area in the time-frequency plane, such that when the duration of a sound is shortened, its spectrum in the frequency domain is broadened. The auditory system balances its frequency and temporal resolution in a manner that is consistent with the perception of linguistic phonemes, where the simultaneous recognition of both spectral and temporal shapes plays a crucial role in the rapid identification of speech. The analogy to the Heisenberg uncertainty principle of quantum physics is not metaphorical but exact, because just as velocity is the rate of change of position (hence the accuracy of determination of one is linked to a lack of accuracy in the other), so frequency can be thought of as the rate of change of temporal phase.
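The trade-off between duration and bandwidth can be checked numerically. The following is a minimal Python sketch, not taken from Gabor's paper: it assumes a Gaussian-enveloped complex sinusoid as the grain, measures the RMS duration and RMS bandwidth of its energy distribution, and uses illustrative names (rms_width, grain_uncertainty) and arbitrary parameter values. Under these assumptions the product of the two widths should stay close to the lower bound 1/(4*pi), however short or long the grain is.

```python
import cmath
import math


def rms_width(axis, weights):
    """RMS spread of `axis` values around their weighted mean."""
    total = sum(weights)
    mean = sum(a * w for a, w in zip(axis, weights)) / total
    var = sum((a - mean) ** 2 * w for a, w in zip(axis, weights)) / total
    return math.sqrt(var)


def grain_uncertainty(sigma_t, sr=2000.0, n=400, f0=500.0):
    """Return (RMS duration in s, RMS bandwidth in Hz) of a Gaussian grain.

    sigma_t is the standard deviation of the amplitude envelope in seconds.
    sr, n and f0 are illustrative choices, not values from the text.
    """
    times = [(k - n / 2) / sr for k in range(n)]
    grain = [math.exp(-t * t / (2.0 * sigma_t ** 2))
             * cmath.exp(2j * math.pi * f0 * t) for t in times]
    energy_t = [abs(x) ** 2 for x in grain]

    # Naive discrete Fourier transform over frequencies in [-sr/2, sr/2).
    freqs = [m * sr / n for m in range(-n // 2, n // 2)]
    energy_f = []
    for f in freqs:
        acc = sum(x * cmath.exp(-2j * math.pi * f * t)
                  for x, t in zip(grain, times))
        energy_f.append(abs(acc) ** 2)

    return rms_width(times, energy_t), rms_width(freqs, energy_f)


if __name__ == "__main__":
    for sigma in (0.005, 0.01, 0.02):
        dt, df = grain_uncertainty(sigma)
        print(f"sigma={sigma:.3f} s  dt={dt:.5f} s  df={df:.2f} Hz  "
              f"product={dt * df:.4f}")
```

Halving the envelope length doubles the measured bandwidth, which is the rectangular "quantum" of constant area described above.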
A class of contemporary methods of sound synthesis and signal processing known as time-frequency models, which emerged over the last two decades, has its basis at this quantum level, such that changes in a signal's time domain result in spectral alterations and vice versa. The best known of these methods are granular synthesis and the granulation of sampled sound, both of which produce their results by generating high densities of acoustical quanta called grains. These grains are composed of enveloped waveforms, usually less than 50 ms in duration (implying a repetition rate of more than 20 Hz when grains follow one another without gaps), such that a sequence of grains fuses into a continuous sound, just as the perception of pitch emerges with pulses repeating at rates above 20 Hz. So-called "Gabor grains" have the frequency of the waveform independent of the grain duration, whereas "wavelets" maintain an inverse relation between frequency and duration, and hence are useful in analysis and re-synthesis models.
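As an illustration of the grain concept, here is a minimal Python sketch under stated assumptions: the grain is a sine wave under a Hann envelope, the "wavelet" variant simply fixes the number of cycles so that duration shrinks as frequency rises, and grains are summed into an output buffer at a chosen density. The function names, the Hann envelope and all parameter values are assumptions made for the example, not a description of any particular synthesis system.

```python
import math
import random


def hann_grain(freq_hz, dur_s, sr=44100):
    """Gabor-style grain: frequency and duration are chosen independently."""
    n = max(2, int(round(dur_s * sr)))
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
            * math.sin(2.0 * math.pi * freq_hz * i / sr)
            for i in range(n)]


def wavelet_grain(freq_hz, cycles=8, sr=44100):
    """Wavelet-style grain: duration is inversely proportional to frequency."""
    return hann_grain(freq_hz, cycles / freq_hz, sr)


def grain_stream(total_s, grains_per_s, freq_hz, dur_s,
                 sr=44100, jitter_s=0.0, seed=0):
    """Sum identical grains into a buffer at a given density (grains/second).

    jitter_s randomly displaces each onset by up to +/- jitter_s seconds.
    """
    rng = random.Random(seed)
    out = [0.0] * int(total_s * sr)
    grain = hann_grain(freq_hz, dur_s, sr)
    for k in range(int(total_s * grains_per_s)):
        onset = k / grains_per_s + rng.uniform(-jitter_s, jitter_s)
        start = int(round(onset * sr))
        for i, sample in enumerate(grain):
            j = start + i
            if 0 <= j < len(out):
                out[j] += sample
    return out


if __name__ == "__main__":
    # 30 ms grains at 40 grains per second overlap and fuse into one texture.
    texture = grain_stream(1.0, 40, 440.0, 0.03, sr=8000, jitter_s=0.002)
    print(len(texture), max(abs(x) for x in texture))
```

With a regular onset spacing the stream acquires a pitch at the grain rate; adding jitter blurs that periodicity into a texture.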
Several other established synthesis methods are now regarded as time-frequency models, for instance the VOSIM and FOF models, both originally designed for speech simulation. Each is based on an enveloped, repeating waveform. Moreover, it is the time domain parameters involved in each model that control the bandwidth of the result, usually intended to shape the formant regions of the simulated vowels. Michael Clarke realized the relationship of the FOF method to granular synthesis early on, and has proposed a hybrid version called FOG. In his work, a fused formant-based sound can disintegrate into a rhythmic pattern or granular texture, and then revert to the original sound, even maintaining phase coherence in the process.
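The way a time-domain parameter sets the bandwidth can be sketched with a simplified, FOF-like grain. This is a Python illustration under assumptions, not the published FOF or VOSIM algorithm: each grain is a sine at the formant frequency with a purely exponential decay whose rate is pi times the desired bandwidth, so that the -3 dB width of the resulting spectral peak is approximately that bandwidth. Published FOF descriptions also include a smooth attack segment, which is omitted here, and all names and values are hypothetical.

```python
import cmath
import math


def fof_grain(formant_hz, bandwidth_hz, dur_s, sr=8000):
    """Exponentially decaying sine; faster decay gives a wider formant."""
    alpha = math.pi * bandwidth_hz  # decay rate in 1/s (assumed mapping)
    n = int(round(dur_s * sr))
    return [math.exp(-alpha * i / sr)
            * math.sin(2.0 * math.pi * formant_hz * i / sr)
            for i in range(n)]


def fof_train(f0_hz, formant_hz, bandwidth_hz, total_s, grain_s=0.05, sr=8000):
    """Retrigger the grain once per fundamental period and sum the overlaps."""
    out = [0.0] * int(total_s * sr)
    grain = fof_grain(formant_hz, bandwidth_hz, grain_s, sr)
    period = sr / f0_hz  # samples between grain onsets
    k = 0
    while True:
        start = int(round(k * period))
        if start >= len(out):
            break
        for i, sample in enumerate(grain):
            if start + i < len(out):
                out[start + i] += sample
        k += 1
    return out


def spectrum_mag(signal, freq_hz, sr=8000):
    """Magnitude of the signal's Fourier transform at a single frequency."""
    return abs(sum(x * cmath.exp(-2j * math.pi * freq_hz * i / sr)
                   for i, x in enumerate(signal)))


if __name__ == "__main__":
    grain = fof_grain(1000.0, 50.0, 0.1)
    peak = spectrum_mag(grain, 1000.0)
    edge = spectrum_mag(grain, 1025.0)  # half the bandwidth above the peak
    print(edge / peak)
```

Lowering the fundamental of such a train far enough separates the grains in time, which is the kind of transition between a fused sound and a rhythmic pattern described above.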
In my own work, the granular concept has informed most of my processing of sampled sound, the most striking application being to stretch the sound in time without necessarily changing its pitch. It is a revealing paradox that by linking time and frequency at the micro level, one can manipulate them independently at the macro level. In fact, all of the current methods for stretching sound are based on some form of windowing operation, usually with overlapping envelopes whose shape and frequency of repetition are controllable. The perceptual effect of time stretching is also very suggestive. As the temporal shape of a sound becomes elongated, whether by a small percentage or a very large amount, one's attention shifts towards the spectral components of the sound, either discrete frequency components, harmonics or inharmonics, or resonant regions and broadband textures. I often refer to this process as listening "inside" the sound, and typically link the pitches that emerge from the spectrum with those used by live performers. In other cases, the expanded resonances of even a simple speech or environmental sound suggest a magnification of its imagery and associations, as in my work Basilica where the stretched bell resonances suggest entering the large volume of the church itself.
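The independence of duration and pitch under granulation can be illustrated with a minimal Python sketch. It is a generic overlap-add scheme under assumed parameters, not the PODX implementation: grains are read from the source at a hop that is smaller than the hop at which they are written, each grain is played at its original rate under a Hann window, and the result is divided by the summed window weights. The function name and all values are hypothetical.

```python
import math


def granular_stretch(source, stretch, grain_len=400, hop_out=200):
    """Stretch `source` (a list of samples) in time by the factor `stretch`.

    Grains are written every hop_out samples but read every
    hop_out / stretch samples, so the read pointer moves more slowly
    through the source while each grain keeps its original pitch.
    """
    window = [0.5 * (1.0 - math.cos(2.0 * math.pi * i / grain_len))
              for i in range(grain_len)]
    hop_in = hop_out / stretch
    out_len = int(len(source) * stretch)
    out = [0.0] * out_len
    norm = [0.0] * out_len
    k = 0
    while True:
        read = int(round(k * hop_in))
        write = k * hop_out
        if read + grain_len > len(source) or write + grain_len > out_len:
            break
        for i in range(grain_len):
            out[write + i] += window[i] * source[read + i]
            norm[write + i] += window[i]
        k += 1
    # Normalise by the overlapping window weights; uncovered samples stay 0.
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, norm)]


if __name__ == "__main__":
    sr = 8000
    tone = [math.sin(2.0 * math.pi * 400.0 * i / sr + 0.3) for i in range(sr)]
    stretched = granular_stretch(tone, 2.0)
    print(len(tone), len(stretched))
```

In this sketch successive grains of a steady tone line up in phase only for particular combinations of frequency and hop size; in general their overlap introduces amplitude fluctuations, which is one reason practical systems make the envelope shape and the rate of grain repetition controllable.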
Barry Truax is a Professor in both the School of Communication and the School for the Contemporary Arts at Simon Fraser University, where he teaches courses in acoustic communication and electroacoustic music. He has worked with the World Soundscape Project, editing its Handbook for Acoustic Ecology, and has published a book, Acoustic Communication, dealing with all aspects of sound and technology. As a composer, Truax is best known for his work with the PODX computer music system, which he has used for tape solo works and for works that combine tape with live performers or computer graphics. In 1991 his work Riverrun was awarded the Magisterium at the International Competition of Electroacoustic Music in Bourges, France, a category open only to electroacoustic composers with 20 or more years' experience. He is also the recipient of one of the 1999 Awards for Teaching Excellence at Simon Fraser University.