Mathematically, convolution is an operation that combines two functions into a third, which takes on characteristics of both inputs. In signal processing, this operation models how a system's output behaves for a given input. (1) This is only possible when the system's impulse response is known.
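As a minimal sketch of the operation itself, the discrete convolution of a toy input with an assumed impulse response can be computed directly (the signals here are made up purely for illustration):

```python
import numpy as np

# A toy input signal and a hypothetical system's impulse response.
x = np.array([1.0, 0.5, 0.25])   # input signal
h = np.array([1.0, -1.0])        # impulse response (assumed for illustration)

# Discrete convolution: y[n] = sum over k of x[k] * h[n - k]
y = np.convolve(x, h)
print(y)  # [ 1.   -0.5  -0.25 -0.25]
```

The output is longer than either input (length `len(x) + len(h) - 1`), reflecting how the system's response "smears" each input sample over time.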
An ideal impulse response characterizes the behavior of a system completely. As the name suggests, an impulse response is generated by sending an impulse (a Dirac delta function) through the system. The frequency domain makes clear why this works: an impulse's spectrum spans all frequencies at a magnitude of 1. Since convolution in time is equivalent to multiplication in frequency, a system driven by an impulse outputs exactly its own frequency behavior, i.e. its impulse response.
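This can be demonstrated with a simple discrete-time sketch: feeding a unit impulse through an FIR filter (whose coefficients are chosen arbitrarily here) returns the filter's own coefficients, which are its impulse response.

```python
import numpy as np
from scipy.signal import lfilter

# A simple FIR system (coefficients chosen only for illustration).
b = np.array([0.5, 0.3, 0.2])

# Discrete-time impulse: 1 followed by zeros.
impulse = np.zeros(8)
impulse[0] = 1.0

# Feeding the impulse through the system yields its impulse response;
# for an FIR filter this is just its coefficients, padded with zeros.
h = lfilter(b, [1.0], impulse)
print(h)  # [0.5 0.3 0.2 0.  0.  0.  0.  0. ]
```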
With the impulse response in hand, an engineer can model how a system will react to any input simply by convolving that input with the impulse response. The frequency domain interpretation is that each frequency component of the input is scaled by the system's response at that frequency, as encoded in the impulse response.
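The equivalence between time-domain convolution and frequency-domain multiplication can be verified numerically. The sketch below (using arbitrary random signals) computes the same output both ways:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)   # arbitrary input signal
h = rng.standard_normal(16)   # arbitrary impulse response

# Direct (time-domain) convolution.
y_time = np.convolve(x, h)

# Frequency-domain route: zero-pad both signals to the full output
# length, multiply their spectra, and transform back.
n = len(x) + len(h) - 1
y_freq = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h, n), n)

print(np.allclose(y_time, y_freq))  # True
```

Zero-padding to the full output length matters: it makes the FFT's circular convolution match the linear convolution.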
Convolution has become something of a buzzword in the field of audio applications. Technically, as discussed above, convolution can model how any sort of audio processing affects a signal. In the audio industry, however, most people know convolution through convolution reverb: a technique that simulates the acoustic reverberation of a space and applies it to an audio recording, so that a recording can sound as if it were captured somewhere else. A recent, well-known example comes from researchers at the Center for Computer Research in Music and Acoustics, who created a simulated acoustic of the Hagia Sophia and then had a vocal ensemble perform a traditional Byzantine piece within this simulated environment. (2)
This was possible because the researchers had characterized the acoustic response of the Hagia Sophia via impulse response recordings, which capture the space's response to very short bursts of sound. A real-time DSP plugin then runs a continuous convolution on the input audio to generate the reverberation.
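A bare-bones convolution reverb can be sketched as follows. The "dry" signal and the impulse response here are synthetic stand-ins (a tone burst and exponentially decaying noise); a real system would use a measured room IR and block-based processing for live use.

```python
import numpy as np
from scipy.signal import fftconvolve

fs = 48_000  # sample rate in Hz (assumed for illustration)

# Stand-in "dry" recording: a short 440 Hz tone burst.
t = np.arange(int(0.25 * fs)) / fs
dry = np.sin(2 * np.pi * 440 * t)

# Stand-in room impulse response: exponentially decaying noise,
# a crude model of reverberation (a real IR would be measured).
rng = np.random.default_rng(1)
decay = np.exp(-3 * np.arange(int(1.0 * fs)) / fs)
ir = rng.standard_normal(len(decay)) * decay
ir /= np.max(np.abs(ir))

# Convolution reverb: the wet signal is the dry signal convolved with the IR.
wet = fftconvolve(dry, ir)
print(len(wet) == len(dry) + len(ir) - 1)  # True: the reverb tail extends the signal
```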
Deconvolution refers to the inverse operation of convolution: recovering the input function from an output. In theory, this is possible when both the output and the impulse response are known, since the multiplication in the frequency domain instead becomes a division. In audio terms, this would mean one could recover the original sound played in a room without the sound of the room included. An example would be someone speaking in a large stairwell, a highly reverberant space where intelligibility is low because the reverberation smears the transient information of the speech. If the impulse response of the stairwell were captured, it would theoretically be possible to obtain a much clearer recording of the speech.
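The frequency-domain division can be sketched in a noise-free setting. The signals are arbitrary, and the small regularization term is an assumption added to guard against near-zero spectral bins; this naive approach degrades quickly once measurement noise is present.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(128)   # original "dry" signal
h = rng.standard_normal(32)    # impulse response
y = np.convolve(x, h)          # observed reverberant output

# Deconvolution: division in the frequency domain. A tiny regularization
# term (eps) avoids dividing by near-zero bins of H.
n = len(y)
H = np.fft.rfft(h, n)
eps = 1e-12
X = np.fft.rfft(y, n) * np.conj(H) / (np.abs(H) ** 2 + eps)
x_est = np.fft.irfft(X, n)[: len(x)]

print(np.allclose(x_est, x, atol=1e-6))  # True in this noise-free example
```

This regularized division is essentially a simple Wiener-style inverse filter; more robust deconvolution methods estimate the noise level and adapt the regularization per frequency.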
Our objective is to create a flexible and reliable reverberation deconvolution tool, so that users can generate more intelligible recordings. Such a tool has broad potential in film, television, podcasting, and media in general. Audio recordings are often made in acoustically untreated and unpredictable spaces, and we would like to alleviate the detrimental impact of these confounding variables on clients' audio.