Sound Synthesis I: How Sound Recording Works

Hello, and welcome to the first of several blog entries inspired by one of my projects, Project Noah. Actually that’s just my code name – its real name is Next Generation Sound Synthesis. I actually get paid for working on this one!

As the name suggests, the project is about new ways of synthesising sound, creating more realistic sounding digital instruments and better acoustic models. I think it’s a pretty interesting area, and the approach being taken (physical modelling synthesis) shows a lot of promise. But before I get onto that, I’d like to go back to basics a bit and talk about computer synthesis in general, giving some examples of the different ways it’s been done over the years, what they sound like, and their strengths and weaknesses. I’ll be talking mainly from a computer programmer’s perspective rather than a musician’s, so my examples will draw mainly from the sound chips and software used to make sounds in computers and games consoles rather than from music synthesizers. (Although I do play music keyboards, I don’t know a great deal about the technical side of them, especially not the earlier ones.)

In fact, before I even start on that, I’m going to go even further back to basics and talk about how sound recording works in this first entry. (If it’s not immediately clear how that’s relevant to synthesis, I hope it will become clearer by the end).

Recording Sound

Sound is made up of vibrations in the air, which we hear when they are picked up by our eardrums. To record sound and be able to play it back later, we need some means of capturing and storing the shape of those vibrations as they happen, and also a means of turning the stored shapes back into vibrations again so that they can be heard.

The earliest methods of sound recording didn’t rely on any electronics at all: they were entirely mechanical. A diaphragm would pick up the vibrations from the air, then a needle connected to the diaphragm would etch out a groove in some soft medium, initially wax cylinders and later flat discs. The cylinder or disc would be turned by hand or by clockwork. The groove’s shape would correspond to the shape of the sound waves as they changed over time.


This isn’t actually a mechanical gramophone, but it is the oldest one I could easily get hold of. It used to be my Granny’s.

To play back the sound, the process was reversed; the needle was made to run along the groove, transmitting its shape to the diaphragm, which would vibrate in the right way to recreate the original sound (more or less – the quality of these early systems left a lot to be desired).

It’s worth pausing for a moment to say something about how the shape of a sound wave relates to what it actually sounds like to us. First of all, and maybe not surprisingly, a sound wave with bigger variations (larger peaks and troughs) sounds louder than one with smaller variations. So this:


sounds exactly the same as this:


except the first one is a lot louder.
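To make that concrete, here’s a minimal sketch in Python that works out the height of a simple wave at a series of points in time, once with tall peaks and once with short ones. The function name, the 440 peaks per second and the number of points are all just my own choices for illustration.

```python
import math

def simple_wave(peaks_per_second, peak_height, duration_s=0.01, points_per_second=48000):
    """Return the height of a sine-shaped wave at evenly spaced points in time."""
    count = round(duration_s * points_per_second)
    return [peak_height * math.sin(2 * math.pi * peaks_per_second * i / points_per_second)
            for i in range(count)]

# Same shape and same spacing of peaks; only the height differs.
loud = simple_wave(440, peak_height=1.0)
quiet = simple_wave(440, peak_height=0.25)

# The quiet wave is just the loud one scaled down to a quarter of its height.
print(max(loud), max(quiet))
```

Every point in the quiet wave is exactly a quarter of the matching point in the loud one, which is why the two would sound the same apart from the volume.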

So the height of the peaks in the wave (often called the amplitude) determines the loudness, more or less. The pitch (that is, whether the sound is high like a flute or low like a tuba) depends on how close together the peaks are. When there are a lot of peaks in quick succession like this:


the sound is high pitched. When there aren’t so many, like this:


the sound will be deeper. How quickly the peaks follow one another is called the frequency of the sound wave. Frequency is measured in hertz (abbreviated to Hz), which is simply the number of peaks the wave has per second. Humans can hear frequencies in the range of about 20Hz up to 20,000Hz, but are much better at hearing sounds around 1,000Hz than sounds at either of those extremes. Also, the ability to hear high frequencies tends to tail off quite dramatically with age, so it’s unlikely that adults will be able to hear all the way up to 20,000Hz (20kHz).
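As a quick check of the "peaks per second" idea, here’s a rough Python sketch that builds one second of a simple wave and then counts its peaks. The helper names and the 48,000 points per second are assumptions of mine, and the peak counting is only meant for clean waves like these, not for real recordings.

```python
import math

def one_second_of_sine(frequency_hz, points_per_second=48000):
    """Return one second of a sine wave as a list of heights."""
    return [math.sin(2 * math.pi * frequency_hz * i / points_per_second)
            for i in range(points_per_second)]

def count_peaks(wave):
    """Count the points that are higher than the point before and not lower than the point after."""
    return sum(1 for i in range(1, len(wave) - 1)
               if wave[i - 1] < wave[i] >= wave[i + 1])

low = one_second_of_sine(100)     # a fairly deep tone
high = one_second_of_sine(1000)   # a much higher tone

print(count_peaks(low), count_peaks(high))
```

The number of peaks counted in one second matches the frequency that was asked for, so the 1,000Hz wave has ten times as many peaks packed into the same second as the 100Hz one.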

Real sound waves (such as speech, music, everyday noises) are usually more complex than the ones I’ve shown above and are made up of a whole mixture of different frequencies and amplitudes, which also vary over time. This makes things more interesting from the perspective of synthesising sounds.
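Here’s a small Python sketch of what "a mixture of different frequencies and amplitudes" can look like in practice: several simple waves added together point by point. The three ingredients below are made up purely for illustration, and a real sound would also change its mixture over time, which this sketch doesn’t attempt.

```python
import math

def complex_wave(components, duration_s=0.02, points_per_second=48000):
    """Add together sine waves given as (frequency_hz, amplitude) pairs."""
    count = round(duration_s * points_per_second)
    wave = []
    for i in range(count):
        t = i / points_per_second  # time of this point, in seconds
        wave.append(sum(amplitude * math.sin(2 * math.pi * frequency_hz * t)
                        for frequency_hz, amplitude in components))
    return wave

# Hypothetical ingredients: a strong low tone plus two weaker, higher ones.
components = [(220, 1.0), (440, 0.5), (660, 0.25)]
wave = complex_wave(components)

print(len(wave), max(wave))
```

The result still repeats 220 times a second, but its shape is lumpier than a single smooth wave, and it’s that extra detail in the shape that gives a sound its character.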

Electronic Recording

The simple mechanical recording system was improved with the advent of electronics. Electronic recording was more complex but resulted in much better sound quality. In the electronic system, a microphone is used to turn the sound vibrations into electrical signals whose voltage varies over time in the same shape as the sound waves. Having the sound in electronic form opens up lots more possibilities – for example, it can be boosted by electronic amplifiers, allowing a stronger signal to be stored, and the sound to be played back at much louder volumes. It can also be much more easily mixed with other sound signals, very useful for editing recordings.
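Amplifying and mixing are easy to picture if you treat a signal as a list of voltages measured at successive moments. In this Python sketch the function names and the voltage values are hypothetical, and a real amplifier or mixing desk does this continuously in circuitry rather than on lists of numbers.

```python
def amplify(signal, gain):
    """Boost (or reduce) a signal by multiplying every voltage by the same gain."""
    return [gain * voltage for voltage in signal]

def mix(signal_a, signal_b):
    """Combine two signals by adding their voltages moment by moment."""
    return [a + b for a, b in zip(signal_a, signal_b)]

# Two made-up signals, as voltages at four successive moments.
voice = [0.0, 0.5, 0.0, -0.5]
guitar = [0.25, 0.25, -0.25, -0.25]

louder_voice = amplify(voice, 2)
combined = mix(louder_voice, guitar)

print(louder_voice)
print(combined)
```

Amplifying keeps the shape of the signal and only changes its height, while mixing produces a new shape that contains both of the originals.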


The first electronic systems still stored the sound as a groove cut in a disc, just as the original mechanical systems had. And as in the mechanical systems, the groove was the same shape as the original sound waves: there was no fancy encoding or conversion going on. Later, sound was also stored as a varying magnetic field on magnetic tape. The variations in magnetic field strength, like the shape of the grooves, corresponded directly to the shape of the sound being recorded. This is known as analogue recording.


Tune in next time for lots of information about the next big innovation in sound recording: digital recording!

