Making the Online Botanic Gardens Station Model (Part 2: The Viewer)

Last time, I talked about how the 3D model itself was made. In this post, I’ll discuss how I embedded it into a web page so it can be explored in a web browser.

Not so long ago, it was difficult or impossible to produce real-time 3D graphics in a web browser, at least if you wanted your page to work in a variety of browsers without requiring any special plug-ins. That’s all changed with the advent of WebGL, which allows the powerful OpenGL graphics library to be accessed from JavaScript running in the browser. WebGL is what’s used to render the Botanic Gardens Station model.

The finished WebGL viewer

There are already a number of frameworks built on top of WebGL that make it easier to use, but I decided I was going to build on WebGL directly – I would learn more that way, and have as much control as possible over how the viewer looked and worked. But before I could get onto displaying any graphics, I needed to somehow get my model out of Blender and into the web environment.

I did this by exporting the model to Wavefront OBJ format (a very standard 3D format that’s easy to work with), then writing a Python script to convert the important bits of this to JSON format. Initially I had the entire model in a single JSON file, but it started to get pretty big, so I had the converter split it over several files. The viewer loads the central model file when it starts up, then starts loading the others in the background while the user is free to explore the central part. This (along with a few other tricks, like reducing the number of digits of precision in the file, and omitting the vertex normals and having the viewer calculate them instead) reduces the initial page load time and makes it less likely that people will give up waiting and close the tab before the model even appears.
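
To make the idea concrete, here’s a minimal sketch of such a converter in Python, under some assumptions: the file names are hypothetical, and only vertex positions and faces are kept (the real converter also splits the output across several files and deals with materials, texture co-ordinates and so on).

    import json

    def obj_to_json(obj_path, json_path, precision=3):
        """Convert the vertex and face data from a Wavefront OBJ file to JSON.

        Rounds coordinates to `precision` decimal places and omits vertex
        normals entirely (the viewer can recalculate them), both of which
        shrink the file considerably."""
        vertices = []
        faces = []
        with open(obj_path) as f:
            for line in f:
                parts = line.split()
                if not parts:
                    continue
                if parts[0] == 'v':    # vertex position: "v x y z"
                    vertices.append([round(float(c), precision) for c in parts[1:4]])
                elif parts[0] == 'f':  # face: "f v1/vt1/vn1 v2/vt2/vn2 ..."
                    # keep only the vertex indices (OBJ indices start at 1)
                    faces.append([int(p.split('/')[0]) - 1 for p in parts[1:]])
        with open(json_path, 'w') as f:
            json.dump({'vertices': vertices, 'faces': faces}, f,
                      separators=(',', ':'))   # no whitespace in the output

    obj_to_json('station.obj', 'station.json')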

How not to convert quads to triangles

Once the model is loaded and processed, it can be displayed. One feature of WebGL is that (in common with the OpenGL ES API used on mobile devices) it doesn’t have any built-in support for lighting and shading – all of that has to be coded manually, in shader programs that are compiled onto the graphics card at start-up. While this does increase the learning curve significantly, it also allows for a lot of control over exactly how the lighting looks. This was useful for the Botanics model – after visiting the station in real life, one of my friends observed that photographing it is tricky due to the high contrast between the daylight pouring in through the roof vents and the dark corners that are in the shade. It turns out that getting the lighting for the model to look realistic is tricky for similar reasons.

The final model uses four distinct shader programs:

  1. A “full brightness” shader that doesn’t actually do any lighting calculations and just displays everything exactly as it is in the texture images. This is only used for the “heads up display” overlay (consisting of the map, the information text, the loading screen, etc.). I tried using it for the outdoor parts of the model as well but it looked rubbish.
  2. A simple directional light shader. This is what I eventually settled on for the outdoor parts of the model. It still doesn’t look great, but it’s a lot better than the full brightness one.
  3. A spotlight shader. This is used in the tunnels and also in some parts of the station itself. The single spotlight is used to simulate a torch beam coming from just below the camera and pointing forwards. There’s also a bit of ambient light so that the area outwith the torch beam isn’t completely black.
  4. A more complex shader that supports the torch beam as above, but also three other “spotlights” in fixed positions to represent the light pouring in through the roof vents. This is only used for elements of the model that are directly under the vents.

The full brightness shader in all its horrible glory

Although there’s no specular reflection in any of the shaders (I suspect it wouldn’t make a huge difference as there’s not a lot of shiny surfaces in the station), the two with the spotlights are still quite heavyweight – for the torch beam to appear properly circular, almost everything has to be done per-pixel in the fragment shader. I’m not a shader expert so there’s probably scope for making them more efficient, but for now they seem to run acceptably fast on the systems I’ve tested them on.
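
The shaders themselves are written in GLSL, but the heart of the torch-beam calculation is simple enough to sketch in Python (the names and the cone angle here are illustrative, not the values the viewer actually uses):

    import math

    def spotlight_intensity(frag_pos, light_pos, beam_dir, cutoff_deg=15.0,
                            ambient=0.1):
        """Per-pixel spotlight test: full brightness inside the cone,
        ambient only outside it. A fragment shader evaluates this for every
        pixel, which is what makes the beam properly circular.
        beam_dir is assumed to be a unit vector."""
        to_frag = [f - l for f, l in zip(frag_pos, light_pos)]
        length = math.sqrt(sum(c * c for c in to_frag))
        to_frag = [c / length for c in to_frag]
        # cosine of the angle between the beam axis and this fragment
        cos_angle = sum(a * b for a, b in zip(to_frag, beam_dir))
        if cos_angle > math.cos(math.radians(cutoff_deg)):
            return 1.0       # inside the torch beam
        return ambient       # dark corners outside the beam

    # a fragment straight ahead of the torch is fully lit
    print(spotlight_intensity((0, 0, -5), (0, 0, 0), (0, 0, -1)))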

Can’t see the wood or the trees

In Part 1, I mentioned that the trees weren’t modelled in Blender like the rest of the model was. I considered doing this, but realised it would make the already quite large model files unacceptably huge. (Models of organic things such as plants, animals and humans tend to require far more vertices and polygons to look any good than models of architecture do). Instead I chose to implement a “tree generator” in JavaScript – so instead of having to save all of the bulky geometry for the trees to the model file, I could save a compact set of basic parameters, and the geometry itself would be generated in the browser and never have to be sent over the internet.
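
I won’t reproduce the real generator here, but a toy version shows the principle: a handful of parameters plus a fixed random seed deterministically expand into a much larger set of branches. (This sketch just counts segments; the real generator follows the paper’s algorithm and builds full vertex data, and its parameters are the paper’s rather than these hypothetical ones.)

    import math
    import random

    def grow(length, radius, depth, params, rng):
        """Recursively generate branch segments from a compact parameter set.
        Returns a list of (length, radius, angle) tuples - a stand-in for
        real geometry."""
        branches = [(length, radius, 0.0)]
        if depth == 0:
            return branches
        for _ in range(params['children']):
            angle = math.radians(params['branch_angle'] + rng.uniform(-10, 10))
            branches += [(l, r, a + angle) for (l, r, a) in
                         grow(length * params['length_ratio'],
                              radius * params['radius_ratio'],
                              depth - 1, params, rng)]
        return branches

    # hypothetical parameters - a few numbers instead of megabytes of geometry
    params = {'children': 3, 'branch_angle': 35.0,
              'length_ratio': 0.6, 'radius_ratio': 0.5}
    tree = grow(length=5.0, radius=0.3, depth=4, params=params,
                rng=random.Random(42))   # fixed seed: the same tree every time
    print(len(tree), 'branch segments')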

A Black Tupelo with no leaves

The generator is based on the well-known algorithm described in this paper. It took me weeks to get it working right and by the end I never wanted to see another rotation matrix again as long as I lived. I wouldn’t be surprised if it fails for some obscure cases, but it works now for the example trees in the paper, and produces trees for the Botanics model that are probably better looking than anything I could model by hand. I didn’t mean to spend so much time on it, but hopefully I’ll be able to use it again for future projects so it won’t have been wasted time.

A Black Tupelo with leaves

(Blender also has its own tree generator based on the same algorithm, called Sapling. I didn’t use it as it would have caused the same file size problem as modelling the trees manually in Blender would).

Spurred on by my success at generating the trees programmatically (eventually!), I decided to apply a similar concept to generating entire regions of woodland for the cutting at the Kirklee end of the tunnel. Given a base geometry to sprout from and some parameters to control the density, the types of trees to include, etc., the woodland generator pseudo-randomly places trees and plants into the 3D world, again only requiring a compact set of parameters to be present in the model file.
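
Here’s a sketch of the idea with an invented parameter block (the real generator sprouts the trees from a base geometry rather than a rectangle). Because the placement is seeded, every visitor sees the same woodland, even though only these few parameters cross the network.

    import random

    def scatter_trees(region, params, seed):
        """Pseudo-randomly place trees in a rectangular region.
        `region` is (min_x, min_z, max_x, max_z)."""
        rng = random.Random(seed)        # fixed seed: deterministic layout
        placements = []
        for _ in range(params['count']):
            x = rng.uniform(region[0], region[2])
            z = rng.uniform(region[1], region[3])
            species = rng.choices(list(params['species']),
                                  weights=list(params['species'].values()))[0]
            placements.append((species, x, z, rng.uniform(*params['scale'])))
        return placements

    # hypothetical parameter block of the kind stored in the model file
    params = {'count': 40,
              'species': {'black_tupelo': 0.7, 'bush': 0.3},
              'scale': (0.8, 1.2)}
    for tree in scatter_trees((0, 0, 50, 20), params, seed=1234)[:3]:
        print(tree)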

The viewer also contains a texture overlay system, which is capable of adding graffiti, dirt, mineral deposits or whatever to a texture after it’s been downloaded. This is achieved by having a second hidden HTML 5 canvas on the page on which the textures are composited before being sent to the GPU. (The same hidden canvas is also used for rendering text before it’s overlaid onto the main 3D view canvas, since the 2D text printing functions can’t be used directly on a 3D canvas).

Why not just have pre-overlaid versions of the textures and download them along with the other textures? That would work, but would increase the size of the data needing to be downloaded: if you transferred both graffiti’d and non-graffiti’d versions of a brick wall texture (for example), you’d be transferring all of the detail of the bricks themselves twice. Whereas if you create the graffiti’d version in the browser, you can get away with transferring the brick texture once, along with a mostly transparent (and therefore much more compressible) file containing the graffiti image. You also gain flexibility as you can move the overlays around much more easily.
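
In the viewer this compositing happens on the hidden canvas in JavaScript; the equivalent operation, sketched here with Python’s Pillow library and hypothetical file names, is just an alpha composite:

    from PIL import Image  # pip install pillow

    # Base texture carries all the brick detail; the overlay is mostly
    # transparent, so it compresses well. (Both images are assumed to be
    # the same size.)
    brick = Image.open('brick.png').convert('RGBA')
    graffiti = Image.open('graffiti.png').convert('RGBA')

    combined = Image.alpha_composite(brick, graffiti)  # paste overlay on top
    combined.save('brick_graffiti.png')                # ready for the GPU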

A selection of the station model’s many items of graffiti

The rest of the code is reasonably straightforward. Input is captured using standard HTML event handlers, and the viewpoint moves through the model along the same curve used to apply the curve modifier in Blender. Other data in addition to the model geometry (for example the information text, the parameters and positions for the trees, etc.) is incorporated into the first JSON model file by the converter script so that it can be modified without changing the viewer code.

So that’s the viewer. Having never used WebGL and never coded anything of this level of complexity in JavaScript before, I’m impressed at how well it actually works. I certainly learned a lot in the process of making it, and I’m hoping to re-use as much of the code as possible for some future projects.

 

Making the Online Botanic Gardens Station Model (Part 1: The Model)

One of my “fun projects” this year has been to make an interactive model of the abandoned Botanic Gardens Station in Glasgow. Although I’ve dabbled in 3D modelling before, including making a documentary video about Scotland Street Tunnel last year, the Botanics project turned out to be by far the most complicated 3D thing I’ve made, as well as by far the most complicated bit of web coding to make a viewer for it. It’s been a lot of fun as well as a hell of a learning experience, so I thought I’d write it up here in case anyone is interested.

The finished model, viewed in Chrome for Linux

In Part 1, I’ll talk about making the actual 3D model. Part 2 will cover the viewer code that actually makes it possible to explore the model from the comfort of your web browser.

I made the station model using Blender, a very capable free, open source 3D package. While various software and hardware now exists that allows you to generate a 3D model automatically from photographs or video, I didn’t have access to or knowledge of it, and I’m not sure how well it would work in a confined and oddly shaped space like the Botanic Gardens Station anyway. So I did it the old fashioned way instead, using the photos I took when I explored the station as a reference and crafting the 3D model to match using Blender’s extensive modelling tools.

The whole model in Blender

I tried to keep the dimensions as close to reality as I could, using one grid square in Blender per metre, referring to the published sizes of the station and tunnels where possible, and estimating the scale of everything else as best I could.

It was actually surprisingly easy and quick to throw together a rough model of the station itself – most of the elements (the platforms, stairs, walls, roof, etc.) are made up of fairly simple geometric shapes and I had the basic structure there within a couple of hours. But as with a lot of these things, the devil is in the details and I spent countless more hours refining it and adding the trickier bits.

The beginnings of the station model

Because there’s quite a lot of repetition and symmetry in the station design, I was able to make use of some of Blender’s modifiers to massively simplify the task. The mirror modifier can be used for items that are symmetrical, allowing you to model only one side of something and have the mirror image of it magically appear for the other side. (In fact, apart from the roof the station is almost completely symmetrical, which saved me a lot of modelling time and effort). The array modifier is even more powerful: it can replicate a single model any number of times in any direction, which allowed me to model a single short section of roof or tunnel or wall and then have it stretch away into the distance with just a few clicks.

Tunnel, modelled with array modifier

Finally, the curve modifier was very valuable. The entire station (and much of the surrounding tunnel) is built on a slight curve, which would be a nightmare to model directly. But thanks to the curve modifier, I was able to model the station and tunnels as if they were completely straight, and then add the curve as a final step, which was much easier. (I still don’t find the curve modifier very intuitive; it took quite a lot of playing around and reading tutorials online to get the effect I wanted, and even now I don’t fully understand how I did it. But the important thing is, it works!).
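
Both of these modifiers can also be driven from Blender’s Python API, which gives a feel for what they do. A minimal sketch, to be run inside Blender (the object names, counts and offsets here are hypothetical):

    import bpy  # Blender's built-in Python API

    obj = bpy.data.objects['TunnelSection']

    # Array modifier: repeat one short tunnel section along its length
    arr = obj.modifiers.new(name='Array', type='ARRAY')
    arr.count = 20                                   # twenty copies, end to end
    arr.relative_offset_displace = (0.0, 1.0, 0.0)   # offset each copy along Y

    # Curve modifier: bend the straight result along the station's curve
    curve = obj.modifiers.new(name='Curve', type='CURVE')
    curve.object = bpy.data.objects['StationCurve']
    curve.deform_axis = 'POS_Y'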

Tunnel + curve modifier = curving tunnel

Texturing the model (that is, applying the images that are “pasted onto” the 3D surfaces to add details and make them look more realistic) turned out to be at least as tricky as getting the actual geometry right. The textures had been a major weak point of my Scotland Street model and I wanted much better ones for the Botanics. Eventually I discovered the great texture resource at textures.com, which had high quality images for almost everything I needed, and under a license that allowed me to do what I wanted with them – this is where most of the textures for the model came from. The remainder are either hand drawn (the graffiti), extracted from my photos (the tunnel portal exteriors and the calcite), or generated by a program I wrote a while ago when I was experimenting with Perlin Noise (some of the rusted metal).

The fiddly part was assigning texture co-ordinates to all the vertices in the model. I quickly discovered that it would have been much easier to do this as I went along, rather than completing all the geometry first and then going back to add textures later on (especially where I’d “applied” array modifiers, meaning that I now had to assign texture co-ordinates individually for each copy of the geometry instead of just doing it once). Lesson learned for next time. At first I found this stage of the process really difficult, but by the time I’d textured most of the model I was getting a much better feel for how it should be done.

The model in Blender, with textures applied

(The trees and bushes weren’t in fact modelled using Blender… more about them next time!).

 

A very geeky web project

Update: the Glasgow version of the map is now live!

My interest in railways started off about 3 years ago, as simply a desire to squeeze into disused and supposedly-sealed-up tunnels and take photos of them. Normal enough, you might think. But since then it’s grown into a more general interest. I’ve collected a lot of books on railways, especially the ones around Edinburgh and Glasgow (in fact, so many that I’m starting to fear for the structural integrity of my bookshelf). I haven’t yet graduated to skulking on station platforms in all weathers wearing a cagoule and meticulously writing down the numbers of all the passing trains, but it may just be a matter of time now.

Maybe I inherited it from my mother. She writes a whole blog about trains and railways, here.

My rapidly growing collection of railway books (minus a few that are scattered around the house, wherever I last read them)

One thing I found while researching the history of the rail network was that I always wanted more maps to help me visualise what was going on. There were a few good ones in the books, but I often found myself struggling to imagine how things were actually laid out in the past, and how the old lines fitted in with the present day railways. I wished there was some sort of interactive map out there that would let you change the date and watch how the railway network changed over time, but I couldn’t find anything like that (the closest thing I found was a “Railway Atlas” book that has a map of the present day network in each area with a map from 1922 on the opposite page). So I decided to make one.

(Actually, I decided to make two: one for Edinburgh and one for Glasgow. The Glasgow one is taking a bit longer due to the more complex network on that side of the country, but I’m hoping to release it soon).

The project fitted in well with some other things I’d been wanting to do as well. I’ve always had an interest in maps and have been collecting the Ordnance Survey 1:50000 series (among others) for most of my life now, so when I discovered that Ordnance Survey now release a lot of their data for free, I was excited at the possibilities. I knew that the OS OpenData would make a good basis for my railway maps. I’d also been wanting to experiment with some of the newer web technologies for a while, and coding the viewer for the maps seemed like a good opportunity to do that.

My (mostly) Ordnance Survey map collection. I don’t have a problem. Honest, I don’t. I can stop any time I want to.

As with a lot of projects, it seemed simple at first but once I actually started work on it, I quickly realised it was going to take longer than I thought. There were two main elements to it:

  1. The data sets. To be able to draw the map, I would need detailed data on all of the railway lines and stations in the Edinburgh and Glasgow areas, past and present, including their names, opening and closing dates, which companies built them, and so on. As far as I knew, this information didn’t even exist in any one single source, and if it did it was sure to be under copyright so I wouldn’t be able to just take it and use it. I was going to have to create the data sets pretty much from scratch.
  2. The viewer. Once I had the data, I needed to make a web page that could display it in the form I wanted. I already had quite a clear idea in my head of what this would look like: it would show the map (of course), which could be scrolled and zoomed just like Google or Bing Maps, and there would also be a slider for changing the date. The lines on the map would be colour coded to show which company they were owned by, or their current status, and special lines like tunnels and freight routes would also be shown differently.

It turned out I also needed to build a third major element as well: an editor for creating the data sets. Previously when I’d drawn maps, I’d either used the Google map maker (which has copyright problems if you want to actually use your creations for anything), or drawn them using Inkscape (which, great though it is, isn’t really designed for making maps in). I didn’t think either of those was going to cut it for this project… I needed something better, something that had all the features I needed, but was free from copyright issues. So I decided to make a map editor first.

Step 1: The Editor

At this point, anyone who’s a software engineer and has had it drummed into them “Don’t re-invent the wheel!” is probably shaking their head in exasperation. “You built your own map editor? Why would you do that? Surely there must be one out there already that you could have used!”. To be honest, I’m sure there was, but I don’t regret my decision to make my own. I had three good reasons for doing it that way:

  1. I would learn a lot more.
  2. I could make an editor that was very well suited to the maps I wanted to make. It would have all the features I needed, but wouldn’t be cluttered with extra ones I didn’t need. And I would know exactly how to use it, and would be able to change it if anything started to annoy me.
  3. It would be fun!

I’d had my eye on the Qt GUI toolkit for a while, wanting to give it a try and see if it was better than the others I’d used in the past. So I downloaded Qt Creator and got building.

Of course, I needed some map data first, so I downloaded one of the Ordnance Survey OpenData products: “OS OpenMap Local”, for grid squares NS and NT. (Ordnance Survey products don’t use the latitude and longitude co-ordinates familiar to users of Google Maps or OpenStreetMap; they have their own “National Grid” system that divides the UK into hundred kilometre squares, and uses numerical co-ordinates within those squares). These came in the form of two enormous (nearly a gigabyte altogether) GML files.

GML stands for “Geography Markup Language”, and is a standard XML grammar used for expressing geographical information. The contents of the OpenMap Local files are actually pretty simple conceptually; there’s just a hell of a lot of them! They mostly consist of great long lists of map elements (which can be areas such as forests or lakes or buildings, linear items like roads or railways, or point locations like railway stations) with names, national grid references, and any other relevant information. I wanted to use this information to display a background map in my map editor, on top of which I could draw out the railway routes for my interactive map.

I knew that parsing several hundred megabytes of XML data was likely to be pretty slow, and I didn’t really want the editor to have to do this every time I started it up, so I wrote a Python script that would trawl through the GML files and extract just the bits I was interested in, saving them in a much more compact file format for the actual editor to read.
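
The script itself boils down to a streaming XML parse. A simplified sketch using Python’s standard library (the element names here are illustrative – the real OS schema uses namespaced GML tags, and the real script keeps more attributes):

    import json
    import xml.etree.ElementTree as ET

    def extract(gml_path, out_path, wanted=('RailwayTrack', 'Road', 'Building')):
        """Stream through a huge GML file, keeping only selected feature types.
        iterparse never loads the whole document at once, which matters when
        the input is hundreds of megabytes."""
        features = []
        for event, elem in ET.iterparse(gml_path, events=('end',)):
            tag = elem.tag.rsplit('}', 1)[-1]      # strip the XML namespace
            if tag in wanted:
                # find the feature's coordinate list by local tag name
                coords = next((c.text for c in elem.iter()
                               if c.tag.rsplit('}', 1)[-1] == 'posList'), '')
                features.append({'type': tag, 'coords': (coords or '').split()})
                elem.clear()                       # free this feature's subtree
        with open(out_path, 'w') as f:
            json.dump(features, f)

    extract('OSOpenMapLocal_NT.gml', 'nt_features.json')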

Now I was onto the fun part: actually displaying the map data on the screen. Thankfully, Qt’s excellent graphics functionality was a great help here. After writing a quick function to translate OS national grid references to screen co-ordinates, and using it to project the map data onto the screen, I was looking at a crude map of Edinburgh. I spent a while tweaking the details to get it to look the way I wanted it: changing the colours of each type of element, changing the line widths for different types of road, hiding the more minor details when the view was zoomed out (OpenMap Local is very detailed and includes the outline for every single building, so trying to display all of that when you’re zoomed out far enough to see an entire city results in a very cluttered map, not to mention one that displays very slowly!).
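
The grid-reference translation is a nice self-contained bit of code. Here’s a sketch of it in Python (the screen projection layered on top of this is just a scale and an origin offset):

    def grid_letters_to_offset(letters):
        """Convert a two-letter 100 km square code (e.g. 'NT') to easting and
        northing offsets in metres. The first letter picks a 500 km block,
        the second a 100 km square within it; the letter 'I' is not used."""
        l1 = ord(letters[0].upper()) - ord('A')
        l2 = ord(letters[1].upper()) - ord('A')
        if l1 > 7: l1 -= 1   # skip 'I'
        if l2 > 7: l2 -= 1
        # the grid's false origin is at square 'SV'
        east = ((l1 - 2) % 5) * 500000 + (l2 % 5) * 100000
        north = (19 - (l1 // 5) * 5) * 100000 - (l2 // 5) * 100000
        return east, north

    def gridref_to_metres(ref):
        """Full reference to metres, e.g. 'NT 27 73' -> (327000, 673000).
        The digits split evenly between easting and northing."""
        ref = ref.replace(' ', '')
        east, north = grid_letters_to_offset(ref[:2])
        digits = ref[2:]
        half = len(digits) // 2
        scale = 10 ** (5 - half)    # pad the figures out to metre precision
        return (east + int(digits[:half]) * scale,
                north + int(digits[half:]) * scale)

    print(gridref_to_metres('NT 27 73'))   # central Edinburgh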

Edinburgh, courtesy of Ordnance Survey’s OpenData, and my map editor.

Once I had the background map displaying to my satisfaction, I turned my attention to the actual editing functions and finding a suitable way to store the data for the railway map…

Step 2: The Data

The data model for the interactive map is pretty simple. The three main concepts are: segments (simple sections of track without any junctions), stations (pretty self explanatory I hope) and events. An event is a change in one of the segments’ or stations’ properties at a certain date. For example, the segment that represents Scotland Street Tunnel has an event in 1847 when it came into use (a “change of status” event), another in 1862 when it was taken over by the North British Railway company (a “change of company” event), and another in 1868 when it was abandoned (another “change of status”). When the events are complete and accurate, this gives the viewer all the information it needs to work out how the map should look at any particular date. For a file format, I decided on JSON – it was straightforward, easy to access from both Qt and JavaScript, and easy to inspect and edit by hand for debugging.
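
As an illustration, a segment in the file might look something like this. The field names and co-ordinates are invented for the example, but the events are the Scotland Street Tunnel ones described above (years only here, though the stored dates can be day-accurate):

    {
      "segments": [{
        "name": "Scotland Street Tunnel",
        "points": [[325000, 674000], [325400, 675800]],
        "events": [
          {"date": 1847, "type": "status",  "value": "open"},
          {"date": 1862, "type": "company", "value": "North British Railway"},
          {"date": 1868, "type": "status",  "value": "abandoned"}
        ]
      }],
      "stations": []
    }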

Editing the data for Scotland Street Tunnel

I considered storing the data in a database rather than a file and having the viewer page query it in the background to retrieve whatever data it needed. But for this particular application, the data is relatively small (about 150KB for the Edinburgh map), and the viewer needs almost all of it pretty much straight away, so throwing a database into the mix would just have added complexity for no good reason.

Creating the data set was by far the most time-consuming part of the whole process. Every railway line and station, past and present, had to be painstakingly added to the map, and then all of the event dates had to be input. I collated the information from many different sources: present-day railway lines are part of the Ordnance Survey OpenData that I was using for the background map, so it was easy enough to trace over those. However, disused lines are not included, so I had to refer to old maps to see their routes and then draw them onto my map as best I could. For the dates, I referred to several books and websites – “An Illustrated History of Edinburgh’s Railways”, and the corresponding volume for Glasgow, were particularly valuable. Where possible, the event dates are accurate to the nearest day, although the current viewer only cares about the year.

The whole data set for Edinburgh, loaded into the editor

I think I made the right choice in creating my own map editor – if I’d used existing software, it’s doubtful that I would have got the maps done any more quickly. There would have been a learning curve of course, but even after I’d got past that, I doubt I would have been as productive in a general map editor as I was in my specialised one.

Step 3: The Viewer

The viewer was the final piece of the jigsaw, and although I’d given it some thought, I didn’t properly start work on it until the Edinburgh map data was nearly completed. Unlike for the editor, there was only one real choice of technology for the viewer – if I wanted it to run on a web page and work across virtually all modern devices, it was going to have to be HTML5.

HTML5 extends previous versions of HTML with new elements like the canvas tag, which allows graphics to be rendered in real time from JavaScript – in days gone by, this required a plug-in such as Flash or Java, but now it can be done in a vanilla browser without anything added. I hadn’t used the canvas before, but a bit of quick experimentation confirmed that it was more than capable of doing everything I needed for my interactive map. I also made use of the jQuery library to simplify operations such as fetching the map data from the web server in the background.

First, I wrote a small library of line drawing routines for all the different sorts of railways: dashed lines for tunnels, crossed lines for freight, and dashed-line-within-solid-line for single track railways (as used on some OS maps). These aren’t supported directly by the canvas, but it only took just over a hundred lines of JavaScript code to add them. Then I was ready to build a map renderer on top.
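
The core of such a routine is just walking along each line and emitting alternating “pen down” and “pen up” stretches. Here’s the geometry of it sketched in Python (the real code is JavaScript issuing moveTo/lineTo calls on the canvas, and handles multi-point polylines):

    import math

    def dash_segments(p0, p1, dash=8.0, gap=4.0):
        """Split the line p0->p1 into alternating drawn/skipped pieces and
        return the list of sub-segments to actually draw."""
        (x0, y0), (x1, y1) = p0, p1
        length = math.hypot(x1 - x0, y1 - y0)
        ux, uy = (x1 - x0) / length, (y1 - y0) / length   # unit direction
        segments, pos, drawing = [], 0.0, True
        while pos < length:
            step = min(dash if drawing else gap, length - pos)
            if drawing:
                segments.append(((x0 + ux * pos, y0 + uy * pos),
                                 (x0 + ux * (pos + step), y0 + uy * (pos + step))))
            pos += step
            drawing = not drawing       # alternate dash and gap
        return segments

    print(dash_segments((0, 0), (30, 0)))  # three dashes with gaps between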

Different line styles and their uses

I had a basic version up and running pretty quickly, but it took a lot longer to implement all the features I wanted: background images, scrolling and zooming, the slider for changing the date, clicking on items for more information. Getting the background images lined up perfectly with the lines and stations turned out to be the trickiest part, though it really shouldn’t have been hard. It took me an embarrassingly long time of debugging before I realised I was truncating a scaling factor to two decimal places in one place but not in another, and once that was fixed everything looked fine.

It lives! The finished product

There are still a few things that annoy me about the end product (the mobile browser and touch screen support, especially, could be better), but overall I’m pretty happy with it. It was fun to create and I learned a lot in the process… about the history of the local railways of course; about how geographical data is stored and processed; about programming GUIs with Qt; and about creating interactive graphics using HTML5.

 

Pi Emulators Now Work Again!

Or they should do, at least…


Whenever I start getting comments and emails rolling in about compilation problems, I always know a new version of Raspbian must have been released, complete with the seemingly-obligatory moving around of libraries and header files ;). I’ve updated the Makefiles to deal with this, so if you were having problems, please head on over to my emulator download page and grab the latest version of the source:

http://www.gcat.org.uk/emul/

As always, please let me know if you have any problems, but hopefully it should work now (at least until the next distro update moves things again!).

 

Android Emulators Update

I just made a minor update to my Android emulators for 8-bit machines (the Raspberry Pi versions have not been changed). Since I updated my HTC One X to Android 4.1.1, the sound in all three of the emulators had been really horrible and distorted (yes, even more so than usual 😉 ). So it seemed a good time to update them to use 16-bit sound output, which seems to be better supported in Android. It turns out that 8-bit samples, which I was using before, aren’t actually guaranteed to work at all on every device, so this change would have been worth making even without the sudden appearance of the distortion.

Nothing else has changed except that they’re now being built with a newer version of the Android SDK; however, they should still work on all devices back to Android 2.1, and indeed they do still work on my old Wildfire. Please let me know if you encounter any problems.

Much as I like Android and Google and HTC in some ways, they do seem to like changing things that worked perfectly well already, and not always for the better. Almost every system update for my phone seems to turn into a fresh game of hunt-the-process-that’s-draining-the-whole-battery-and-guess-how-to-make-it-stop… including the ones that claim to improve battery life. And the latest update not only broke 8-bit sound, the phone also refuses point blank to talk to my desktop PC anymore, either as a USB disk drive or for app debugging purposes – both worked fine before. Ah well… got to keep the users and developers on their toes, I guess.

 

Let me be the first (err, second actually) to say I’ll miss netbooks

I was interested to see this article in the Register. The majority of the commentary online about the death of netbooks seems to be along the lines of “Tablets are so much cooler and slicker, netbooks were clunky and annoying to use and who really needs a full PC these days anyway, especially when they’re travelling? Hardly anyone, that’s who. Good riddance netbooks”. But I for one am disappointed that they’ve stopped making them… I can’t see that anything else is going to meet my needs quite so well when I’m travelling… and finally someone agrees with me!

I took my HP netbook running Xubuntu away with me several times last year. I always found it useful, but on the three trips where I combined work with pleasure, it was indispensable. It was light enough to carry around in my backpack without taking up half my cabin baggage allowance or knackering my shoulders. It was cheap enough that if it did get damaged or stolen it wouldn’t be the end of the world (yes, I do have insurance, but you never know when they’re going to worm their way out of actually paying up). Its battery lasts a genuine six hours on a single charge, even when I’m actually doing work on it. It has a proper (if fairly small) keyboard so typing emails or documents on it doesn’t make me lose the will to live. It has enough storage space to keep most of my important files locally in case I can’t get online.


Most of all, it actually runs a proper full operating system! This isn’t something I’m just arbitrarily demanding because I’m a technology snob. I really do need it and do make use of it. At my technical meeting in Madrid in September, I was running a Tomcat web server, a MySQL database server, a RabbitMQ server running on top of an Erlang interpreter, and a full Java development environment. Try doing that on an iPad or an Android tablet! You might think all of that would be pretty painful on a single core Atom with 2GB of memory, but it actually ran surprisingly well. I wouldn’t want to work like that all the time but for a three day meeting it was perfectly adequate and usable. The full OS also means I can encrypt the whole disk which gives me a lot of peace of mind that my files are secure even if the thing does get stolen.

But now I’m starting to get worried about what I’m going to replace it with when the netbook finally departs for the great electronics recycling centre in the sky. Despite the market being flooded with all sorts of portable computing devices, I can’t see any that are going to do what I want quite so well as the netbook did.

Get a tablet? Urgh, no thanks… I’m sure they have their place, but even if I added a proper keyboard there is no way I’d get all that development software to run on Android or iOS. OK, I wouldn’t be surprised if there is some way to hack some of it into working on Android, but Android is hardly a standard or well supported environment for it. It’s not going to Just Work the way it does on a standard PC running Windows or Ubuntu.

Get a Microsoft Surface Pro? This tablet actually does run a full version of Windows 8 (or will when it comes out), but at $900 it costs nearly three times as much as my netbook did. I couldn’t justify spending that on something I’m going to throw into my backpack and take all over the place with me. I’d be constantly worrying it was going to get broken or stolen.

Get an “ultrabook”? Again would do the things I need, but again would cost WAY more than the netbook, would almost certainly weigh a lot more than the netbook, and I’d be very surprised if it had comparable battery life either (at least not without spending even more money on SSDs, spare batteries, etc.). For the “pleasure” part of my Madrid trip I was staying in a hostel room with seven other people. There was ONE power socket between the eight of us. When travelling, battery life really does matter.

Get a Chromebook and install a full Linux distribution on it? This is actually the option I’d lean towards at present. Chromebooks have price, portability and battery life on their side and apparently are easy to install Linux on. The downsides would be the ARM processor (which could limit software compatibility as well as making even the lowly Atom look blazingly fast in comparison), and the lack of local storage (Chromebooks generally seem to have a few gigabytes of storage. My netbook has a few hundred gigabytes!). So, still not an ideal option, but unless some enterprising company resurrects the netbook concept, could be the best of a bad lot :(.

(I freely admit I’m in a small minority here… not many people need to run multiple servers on their computer while travelling, and not many of those that do tend to extend their business trips with nights in hostels. But that doesn’t stop it being annoying that something that met my needs perfectly is no longer being made 😉 ).

Sound Synthesis IV: Next Generation Sound Synthesis

Last time we looked at (and listened to!) various methods of digital sound synthesis, beginning with the very primitive systems used by early computers, and ending with the sample-based methods in widespread use today. This time I’m going to talk about a new and very promising method currently in development.

What’s wrong with sample-based synthesis?

Our glockenspiel test sound already sounded pretty good using the sample-based method… do we really need a more advanced method? The answer is, although sample-based synthesis does work very well for certain instruments under certain conditions, it doesn’t work well the whole time.

Although it’s based on samples of real instruments, it’s still not fully realistic. Often the same sample will be used for different notes and different volumes, with the synth altering the frequency and amplitude of the sample as needed. But on a real piano (for example), the notes will all sound subtly different. A high C won’t sound exactly the same as a low C with its frequency increased, and pressing a key hard will result in a very different sound from pressing the same key softly – it won’t just be louder. Some of the better synths will use a larger number of samples in an attempt to capture these nuances, but the amount of data can become unmanageable. And that’s just considering one note at a time. In a real piano, when multiple notes are being played at the same time, the vibrations in all the different strings will influence each other in quite complex ways to create the overall sound.

It gets even worse for string and brass instruments. For example, changing from one note to another on a trumpet can sound totally different depending on how fast the player opens and closes the valves, and it is unlikely that a sample-based system will be able to reproduce all the possibilities properly without recording an unrealistically large number of samples. In some genres of music, the player may do things with the instrument that were never intended, such as playing it with a valve only part way open. A sample-based system would have no way of dealing with such unforeseen cases – if no-one recorded a sample for that behaviour, it can’t synthesise it.

The other problem with many of the synthesis methods is one of control. Even if it were possible to get them to generate the desired sound, it’s not always very obvious how to do it. FM synthesisers, for example, take a bewildering array of parameters, many of which can seem like “magic numbers” that don’t bear any obvious relation to the sound being generated. To play a note, sound envelopes and frequencies need to be set for every operator, the waveforms can be adjusted, and the overall configuration of the operators also needs to be set. Hardly intuitive stuff for people accustomed to thinking in terms of instruments and notes.

Physical Modelling Synthesis

A newer synthesis method has the potential to solve both the realism problem and the control problem, giving musicians virtual instruments that not only sound more realistic but are much easier to “play” and will correctly handle all situations, even ones that weren’t envisaged when the synth was designed. This is called Physical Modelling Synthesis, and it’s the basis for the project I’m working on just now.

The basic idea is that instead of doing something abstract that just happens to give the result you want (like FM synthesis, for example), or “cheating” with recordings to give a better sounding result (like sample-based synthesis), you simulate exactly how a real instrument would behave. This means building a mathematical model of the entire instrument as well as anything else that’s relevant (the surrounding air, for example). Real instruments create sound because they vibrate in a certain audible way when they are played – whether that’s by hitting them, bowing them, plucking their strings, blowing into them, or whatever. Physical modelling synthesis works by calculating exactly how the materials that make up the instrument would vibrate given certain inputs.

How do we model an instrument mathematically? It can get very complex, especially for instruments that are made up of lots of different parts (for example, a piano has hundreds of strings, a sound board, and a box filled with air surrounding them all). But let’s start by looking at something simpler: a metal bar that could be, for example, one note of a glockenspiel.

[Diagram: a metal bar, like one note of a glockenspiel]

To simulate the behaviour of the bar, we can divide it into pieces called elements. Then for each element we store a number, which will represent the movement of that part of the bar as it vibrates. To begin with, the bar will be still and not vibrating, so all these numbers will be zero:

[Diagram: the bar divided into elements, every value zero]

We also need something else in this setup – we need a way to hear what’s going on, otherwise the whole exercise would be a bit pointless. So, we’ll take an output from towards the right hand end of the bar:

[Diagram: an output point near the right-hand end of the bar]

Think of this like a sort of “virtual microphone” that can be placed anywhere on our instrument model. All it does is take the number from the element it’s placed on – it doesn’t care about any of the other elements at all. At the moment the number (like all the others) is stuck at zero, which means the microphone will be picking up silence. As it should be, because a static, non-moving bar doesn’t make any sound.

Now we need to make the bar vibrate so that it does generate some sound. To do this, we will simulate hitting the bar with a beater near its left hand end:

[Diagram: the beater striking the bar near its left-hand end]

What happens when the beater hits the bar? Essentially, it just makes the bar move slightly. So now, instead of all zeroes in our element numbers, we have a non-zero value in the element that’s just been hit by the beater, to represent this movement:

[Diagram: the struck element now holds a non-zero value]

But the movement of the bar won’t stay confined to this little section nearest where the beater hit. Over time, it will propagate along the whole length of the bar, causing it to vibrate at its resonant frequency. After some short length of time, the bar might look like this:

[Diagram: the displacement starting to spread along the bar]

and then like this:

[Diagram: the displacement spreading further]

then this:

[Diagram: most of the bar now displaced in one direction or another]

As you can see, the value from the beater strike has “spread out” along the bar so now the majority of the bar is displaced in one direction or another. The details of how this is done depend on the material and exactly how the bar is modelled, but basically each time the computer updates the bar, the number in each box is calculated based on the previous numbers in all the surrounding boxes. (The values that were in those boxes immediately before the update are the most influential, but for some models numbers from longer ago come into play as well). Sometimes the boxes at the ends of the bar are treated differently from the other boxes – in fact, they are different, because unlike the boxes in the middle they only have a neighbouring box on one side of them, not both. There are various different ways of treating the edge boxes, and these are referred to as the model’s boundary conditions. They can get quite complex so I won’t say more about them here.
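
Here’s the shape of that update loop, sketched in Python for the simplest possible case: a plain wave-equation model with fixed ends. (A real glockenspiel bar is stiff, so a proper model needs a more complicated update that reaches further back in time and further along the bar, as described above; the constants here are illustrative.)

    N = 100                # number of elements along the bar
    dt = 1.0 / 44100       # one timestep per audio sample (see below)
    c = 0.45               # propagation constant (stable for values up to 1)

    u_prev = [0.0] * N     # element values one timestep ago
    u      = [0.0] * N     # element values now
    u[5] = 1.0             # the beater strike: displace an element near one end

    mic = 80               # "virtual microphone" element near the other end
    output = []

    for step in range(200):
        u_next = [0.0] * N   # ends stay at zero: a simple boundary condition
        for i in range(1, N - 1):
            # each new value depends on the neighbours' current values and
            # this element's own current and previous values
            u_next[i] = (2 * u[i] - u_prev[i]
                         + c * (u[i - 1] - 2 * u[i] + u[i + 1]))
        u_prev, u = u, u_next
        output.append(u[mic])    # one output sample per timestep

    print(output[-10:])   # by now the wave has reached the microphone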

Above I said “some short length of time”, but that’s quite vague. We actually want to wait a very specific length of time, called the timestep, between updates to the bar. The timestep is generally chosen to match the sampling rate of the audio being output, so that the microphone can just pick up one value each time the bar is updated and output it. So, for a CD quality sample rate of 44100Hz, a timestep lasts 1/44100th of a second, or 0.0000226757 seconds.

If the model is working properly, the result of all this will be that the bar vibrates at its resonant frequency – just like the bar of a real glockenspiel. Every timestep, the “microphone” will pick up a value, and when this sequence of values is played back through speakers, it should sound like a metal bar being hit by a beater.

Here are the first 20 values picked up by the microphone: 0, 0, 0.022, -0.174, -0.260, 0.111, 0.255, 0.123, 0.426, 0.705, 0.495, 0.342, 0.293, 0.116, 0.016, 0.009, 0.033, -0.033, -0.312, -0.321, -0.030

and here’s a graph showing the wave produced by them:

[Graph: the waveform produced by the first values from the microphone]

To simulate a whole glockenspiel, we can model several of these bars, each one a slightly different length so as to produce a different note, and take audio outputs from all of them. Then if we hit them with our virtual beater at the right times, we can hear our test sample, this time generated by physical modelling synthesis:

physical modelling sound

I used a very primitive version of physical modelling synthesis to generate this sample, so it doesn’t sound amazing. I also used a bit of trial and error tweaking to get the bar lengths I wanted, so the tuning isn’t perfect. Both the project, and my knowledge of this type of synthesis, are still in fairly early stages just now! In the next section I’ll talk about what we can do to improve the accuracy of the models, and therefore also the quality of the sound produced.

Accuracy and model complexity

In our project we are mainly going for quality rather than speed. We want to try and generate the best quality of sound that we can from these models; if it takes a minute (or even an hour) of computer time to generate a second of audio, we don’t see that as a huge problem. But obviously we’d like things to run as fast as possible, and if it’s taking days or weeks to generate short audio samples, that is a problem. So I’ll say a bit about how we’re trying to improve the quality of the models, as well as how we hope to keep the compute time from becoming unmanageable.

A long thin metal bar is one of the simplest things to model and we can get away with using a one-dimensional row of elements (as demonstrated above) for this. But for other instruments (or parts of instruments), more complex models may be required. To model a cymbal, for example, we will need a two-dimensional grid of elements spaced across the surface of the cymbal. And for something big and complicated like a whole piano, we would most likely need individual 1D models for each string, a 2D model for the sound board, and a 3D model for the air surrounding everything, all connected and interacting with each other in order to get an accurate synthesis. In fact, any instrument model can generally be improved by embedding it in a 3D space model, so that it is affected by the acoustics of the room it is in.

There are also different ways of updating the model’s elements each timestep. Simple linear models are very easy and fast to compute and are sufficient for many purposes (for example, modelling the vibration of air in a room). Non-linear models are much more complicated to update and need more compute time, but may be necessary in order to get accurate sound from gongs, brass instruments, and others.

Inputs (for example, striking, bowing, blowing the model instruments) and how they are modelled can have an effect as well. The simplest way to model a strike is to add a number to one of the elements of the model for just a single timestep as shown in the example above, but it’s more realistic to add a force that gradually increases and then gradually decreases again across several timesteps. Bowing and blowing are more complicated. With most of these there is some kind of trade-off between the accuracy of the input and the amount of computational resources needed to model it.
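
For example, a common smoother alternative to the single-timestep strike is a raised-cosine pulse. A sketch:

    import math

    def strike_force(duration_steps, peak):
        """A smoother strike: a raised-cosine pulse spread over several
        timesteps, instead of dumping the whole force in at once."""
        return [0.5 * peak * (1 - math.cos(2 * math.pi * n / duration_steps))
                for n in range(duration_steps + 1)]

    # add this to the struck element over 20 consecutive timesteps,
    # e.g. u[5] += force[step] inside the update loop sketched earlier
    force = strike_force(20, peak=1.0)
    print([round(f, 3) for f in force])   # rises to the peak, then falls away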

2D models and especially 3D models can consume a lot of memory and take a huge number of calculations to update. For CD quality audio, quite a finely spaced grid is required and even a moderately sized 3D room model can easily max out the memory available on most current computers. Accurately modelling the acoustics of a larger room, such as a concert hall, using this method is currently not realistic due to lack of memory, but should become feasible within a few years.

The number of calculations required to update large models is also a challenge, but not an insurmountable one. Especially for the 3D acoustic models, the largest ones, we usually want to do the same (or very similar) calculations again and again and again on a massive number of points. Fortunately, there is a type of computer hardware that is very good at doing exactly this: the GPU.

GPU stands for graphics processing unit, and these processors were indeed originally designed for generating graphics, where the same relatively simple calculations need to be applied to every polygon or every pixel on the screen many, many times. In the last few years there has been a lot of interest in using GPUs for other sorts of calculations, for example scientific simulations, and now many of the world’s most powerful supercomputers contain GPUs. They are ideal for much of the processing in our synthesis project where the simple calculations being applied to every point in a 3D room model closely parallel the calculations being applied to every pixel on the screen when rendering an image.

Advantages of Physical Modelling Synthesis

You might wonder, when sample-based synthesis is getting so good and is so much easier to perform, why bother with physical modelling synthesis? There are three main reasons:

  • Sound quality. With a good enough model, physical modelling synthesis can theoretically sound just as good as a real instrument. Even with simpler models, certain instrument types (e.g. brass) can sound a lot better than sample-based synthesis.
  • Flexibility. If you want to do something more unusual, for example hitting the strings of a violin with the wooden side of the bow instead of bowing them with the hair, or playing a wind instrument with the valves half-open, you are probably going to be out of luck with a sample-based synthesiser. Unless whoever designed the synthesiser foresaw exactly what you want and included samples of it, there will be no way to do it. But physical modelling synthesis can – you can use the same instrument model and just modify the inputs however you want.
  • Ease of control. I mentioned at the beginning that older types of synthesiser can be hard to control – although they may theoretically be able to generate the sound you want, it might not be at all obvious how to get them to do it, because the input parameters don’t bear much obvious relation to things in the “real world”. FM is particularly bad for this – to play a note you might have to do something like: “Set the frequency of operator 1 to 1000Hz, set its waveform type to full sine wave, set its attack rate to 32, its decay rate to 18, its sustain level to 5 and its release rate to 4. Now set operator 2’s frequency to 200Hz, its attack rate to 50, decay rate 2, sustain level 14, release rate 3. Now chain the operators together so that 2 is modulating 1”. (In reality the quoted text would be some kind of programming language rather than English, but you get the idea). Your only options for getting the sound you want are likely to be trial and error, or using a library of existing sounds that someone else came up with by trial and error.

Contrast this with how you might play a note on a physical modelling synthesiser: “Hit the left hand bar of my glockenspiel model with the virtual beater 10mm from its front end, with a force of 10N”. Much better, isn’t it? You might still use a bit of trial and error to find the optimum location and force for the hit, but the model’s input parameters are a lot closer to things we understand from the real world, so it will be a lot less like groping around in the dark. This is because we are trying to model the real world as accurately as possible, unlike FM and sample-based synthesisers which are abstract systems attempting to generate sound as simply as possible.

Here’s a link to the Next Generation Sound Synthesis project website. The project’s been running for a year and has four years still to go. We’re investigating several different areas, including how to make good quality mathematical models for various types of instruments, how to get them to run as fast as possible, and also how to make them effective and easy to use for musicians.

Of course, whatever happens I doubt we will be able to synthesise the bassoon ;).

Sound Synthesis III: Early Synthesis Methods

Digital Sound Synthesis

Before I delve into describing different types of synthesis, I should start with a disclaimer: I’m coming at this mainly from the angle of how old computers (and video game systems) used to synthesise sound rather than talking about music synthesisers, because that’s where most of my knowledge is. Although I have owned various keyboards, I don’t have a deep knowledge of exactly how they work as I’m more of a pianist than a keyboard player really. There is quite a bit of overlap between methods used in computers and methods used in musical instruments though, especially more recently.

To illustrate the different synthesis methods, I’m going to be using the same example sound over and over again, synthesised in different ways. It’s the glockenspiel part from the opening of Sonic Triangle’s sort-of Christmas song “It Could Be Different”. For comparison to the synthesised versions, here it is played (not particularly well, but you should get the idea!) on a real glockenspiel:

glockenspiel

(In fact, in the original recording of the song, it isn’t a real glockenspiel. It’s the sample-based synthesis of my Casio keyboard… there’ll be more about that sort of synthesis later).

If you have trouble hearing the sounds in this post, try right clicking the links, saving them to your hard drive and opening them from there. Seriously, I can’t believe that in 2013 there still isn’t an easy way of putting sounds on web pages that works on all major browsers. Grrrr!

Primitive Methods

As we saw last time, digital sound recordings (which include CDs, DVDs, and any music files on a computer) are just very long lists of numbers that were created by feeding a sound wave into an analogue-to-digital converter. To play them back, we feed the numbers into a digital-to-analogue converter and then play back the resulting sound using a loudspeaker. But what if, instead of using a list of numbers that was previously recorded, we used a computer program to generate a list of numbers and then played them back in the same way? This is the basis of digital sound synthesis – creating entirely new sounds that never existed in reality.

Very old (1980s) home computers and games consoles tended to only be able to generate very primitive, “beepy” sounding music. This was because they were generating basic sound wave shapes that aren’t like anything you’d get from a real musical instrument. The simplest of all, used by a lot of early computers, is a square wave:

[Diagram: a square wave]

square wave sound

Another option is the triangle wave, with a slightly softer sound:

[Diagram: a triangle wave]

triangle wave sound

The sound could be improved by giving each note a “shape” (known as its envelope), so that a glockenspiel sound, for example, would start loud and then die away, like a real glockenspiel does:

[Diagram: a triangle wave shaped by a decaying envelope]

triangle wave with envelope sound

None of these methods sound particularly nice, and it’s hard to imagine any musician using them now unless they were deliberately going for a retro electronic sort of effect. But they have the advantage of being very easy to synthesise, requiring only a simple electronic circuit or a few lines of program code. (I wrote a program to generate the sound samples in this section from scratch in about half an hour). The square wave, for example, only has two possible levels, so all the computer has to do is keep track of how long to go before switching to the other level. The length of time spent on each level determines the pitch of the sound produced, and the difference in height between the levels determines the volume.
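
As an illustration of just how little code this takes, here’s a complete square-wave note generator with a simple decaying envelope, in Python (the frequency and decay constant are arbitrary choices):

    import math
    import struct
    import wave

    RATE = 44100

    def square_note(freq, seconds, volume=0.5):
        """Generate one square-wave note with a decaying envelope.
        The wave itself only has two levels; the envelope scales them so the
        note starts loud and dies away, like a struck glockenspiel bar."""
        samples = []
        period = RATE / freq       # samples per cycle
        for n in range(int(RATE * seconds)):
            level = 1.0 if (n % period) < (period / 2) else -1.0
            envelope = math.exp(-3.0 * n / (RATE * seconds))  # exponential decay
            samples.append(level * envelope * volume)
        return samples

    # write a high C (523 Hz) to a WAV file for listening
    data = square_note(523.25, 0.5)
    with wave.open('square.wav', 'w') as f:
        f.setnchannels(1); f.setsampwidth(2); f.setframerate(RATE)
        f.writeframes(b''.join(struct.pack('<h', int(s * 32767)) for s in data))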

FM Synthesis

I remember being very excited when we upgraded from our old ZX Spectrum +3, which could only do square wave synthesis, to a PC and a Sega Megadrive that were capable of FM (Frequency Modulation) Synthesis. They could actually produce the sounds of different instruments! Looking back now, they didn’t sound very much like the instruments they were supposed to, but it was still a big improvement on square waves.

FM synthesis involves combining two (or sometimes more) waves together to produce a single, more complex wave. The waves are generally sine waves and the combination process is called frequency modulation – it means the frequency of one wave (the “carrier”) is altered over time in a way that depends on the other wave (the “modulator”) to produce the final sound wave. So, at low points on the modulator wave, the carrier wave’s peaks will be spread out with a longer distance between them, while at the high points of the modulator they will be bunched up closer together, like this:

[Diagram: a carrier wave being frequency-modulated by a slower modulator wave]

Some FM synthesisers can combine more than two waves together in various ways to give a richer range of possible sounds.
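
A minimal two-operator FM “synthesiser” really is this small. (Strictly speaking this modulates the carrier’s phase, which is how the Yamaha chips implement their “FM”; the frequencies and modulation index below are arbitrary.)

    import math

    RATE = 44100

    def fm_note(carrier_freq, mod_freq, mod_index, seconds):
        """Two-operator FM: the modulator wiggles the carrier's phase.
        mod_index controls how strong the effect is - more modulation means
        more extra harmonics and a brighter, more metallic sound."""
        samples = []
        for n in range(int(RATE * seconds)):
            t = n / RATE
            modulator = math.sin(2 * math.pi * mod_freq * t)
            samples.append(math.sin(2 * math.pi * carrier_freq * t
                                    + mod_index * modulator))
        return samples

    # a non-integer carrier:modulator ratio gives a roughly bell-like tone
    note = fm_note(carrier_freq=440.0, mod_freq=615.0, mod_index=3.0, seconds=0.5)
    print(note[:5])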

Here’s our glockenspiel snippet synthesised in FM:

fm sound

(In case you’re curious, this was done using DOSBox, which emulates the Yamaha OPL-2 FM synthesiser chip used in the old Adlib and SoundBlaster sound cards common in DOS PCs, and the Allegro MIDI player example program. Describing how to get an ancient version of Allegro up and running on a modern computer would make a whole blog post in itself, but probably not a very interesting one).

It’s certainly a step up from the square wave and triangle wave versions. But it still sounds unnatural; you would be unlikely to mistake it for a real glockenspiel.

FM synthesis is a lot more complicated to perform than the older primitive methods, but by the 90s FM synthesiser chips were cheap enough to put in games consoles and add-in sound cards for PCs. Contrary to popular belief, they are not analogue (or hybrid analogue-digital) synths; they are fully digital devices apart from the final conversion to analogue at the end of the process.

In case you were wondering, this is pretty much the same “frequency modulation” process that is used in FM radio. The main difference between the two is that in FM radio, you have a modulator wave that is an audio signal, but the carrier wave is a very high frequency radio wave (up in the megahertz, millions-of-hertz range). In FM synthesis, both the carrier and modulator are audio frequency waves.

Sample-based Synthesis

Today, when you hear decent synthesised sound coming from a computer or a music keyboard, it’s very likely to be using sample-based methods. (This is often referred to as “wavetable synthesis”, but strictly speaking that term refers only to a specific subset of the sample-based methods). Sample-based synthesis is not really true synthesis in the same way that the other methods I’ve talked about are – it’s more a clever mixture of recording and synthesis.

Sample-based synthesis works by using short recordings of real instruments and manipulating and combining them to generate the final sound. For example, it might contain a recording of someone playing middle C on a grand piano. When it needs to play back a middle C, it can play back the recording unchanged. If it needs the note below, it will “stretch out” the sample slightly to increase its wavelength and lower its frequency. Similarly, for the note above it can “compress” the sample so that its frequency increases. It can also adjust the volume if the desired note is louder or quieter than the original recording. If a chord needs to be played, several instances of the sample can be played back simultaneously, adjusted to different pitches.
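The “stretching” and “compressing” amounts to resampling the recording. Here’s a rough sketch, assuming the recording is just a list of sample values (the middle_c_recording name below is hypothetical):

    def pitch_shift(sample, semitones):
        # Step through the recording faster to raise the pitch, slower to
        # lower it; one semitone is a frequency ratio of 2^(1/12).
        step = 2 ** (semitones / 12.0)
        out, pos = [], 0.0
        while pos < len(sample) - 1:
            i = int(pos)
            frac = pos - i
            # Linear interpolation between neighbouring samples
            out.append(sample[i] * (1 - frac) + sample[i + 1] * frac)
            pos += step
        return out

    # One semitone down: the note comes out longer as well as lower
    # note_below = pitch_shift(middle_c_recording, -1)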

This synthesis method is not too computationally intensive; sound cards capable of sample-based synthesis (such as the Gravis Ultrasound and the Sound Blaster AWE32/64) became affordable in the mid 90s, and today’s computers can easily do it in software. Windows, for example, has a built-in sample-based synthesiser that is used to play back MIDI sound if there isn’t a hardware synth connected. Sound quality can be very good for some instruments – it is typically very good for percussion instruments, reasonable for ensemble sounds (like a whole string section or a choir), and not so good for solo string and wind instruments. The quality also depends on how good the samples themselves are and how intelligent the synth is at combining them.

Here’s the glockenspiel phrase played on a sample-based synth (namely my Casio keyboard):

[Audio: sample-based sound]

This is a big step up from the other synths – this time we have something that might even be mistaken for a real glockenspiel! But it’s not perfect… if you listen carefully, you’ll notice that all of the notes sound suspiciously similar to each other, unlike the real glockenspiel recording where they are noticeably different.

Next time I’ll talk about the limitations of the methods I’ve described in this post, and what can be done about them.

 

Sound Synthesis II: Digital Recording

Digital Recording

Things changed with the advent of compact discs, and later DVDs and MP3s as well. Instead of storing the continuously changing shape of the sound wave, these store the sound digitally.

What do we mean by digitally? It means the sound is stored as a collection of numbers. In fact, the numbers are binary, which means only two digits are allowed – 0 and 1. The music on a CD, or in an MP3 file, is nothing more than a very long string of 0s and 1s.

How do you get from the shape of the sound to a string of numbers? After all, the sound wave graphics we saw last time look very different from 1000110111011011010111011000100. First of all, you sample the sound signal. That means you look at where it is at certain individual points in time, and ignore it the rest of the time. Imagine drawing the shape of a sound wave on a piece of graph paper like this:

[Image: digital1 – a sound wave drawn on graph paper]

To sample this signal, we can look at where the signal is each time it crosses one of the vertical lines. We don’t care what it’s doing the rest of the time – only its intersections with the lines matter now. Here’s the same sound, but instead of showing the full wave, we just show the samples (as Xs):

[Image: digital2 – the samples shown as Xs]

To simplify things further so we can stick to dealing with whole numbers, we’re also going to move each sample to the nearest horizontal grid line. This means that all the samples will be exactly on an intersection where two of the grid lines cross:

[Image: digital3 – the samples moved to the nearest grid intersections]

So far, so good. We have a scattering of Xs across the graph paper. Hopefully you can see that they still form the shape of the original sound wave quite well. From here, it’s easy to turn our sound wave into a stream of numbers, one for each sample. We just look at each vertical grid line and note the number of the horizontal grid line where our sample is:

[Image: digital4 – the samples with their numeric values]
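In code, the whole graph-paper process boils down to a few lines. Here’s a toy Python sketch with deliberately coarse made-up numbers (ten levels and a very low sample rate so the output stays readable – nothing like real audio quality):

    import math

    def digitise(signal, sample_rate, seconds, levels):
        # Sample at regular points in time (the vertical grid lines) and round
        # each sample to the nearest level (the horizontal grid lines).
        samples = []
        for n in range(int(sample_rate * seconds)):
            value = signal(n / sample_rate)          # somewhere in -1.0 .. +1.0
            samples.append(round((value + 1) / 2 * (levels - 1)))
        return samples

    # A slow test wave on a 10-level grid, a bit like the diagrams above
    samples = digitise(lambda t: math.sin(2 * math.pi * 3 * t), 30, 1.0, 10)
    print(samples)                              # whole numbers from 0 to 9
    print([format(s, "04b") for s in samples])  # the same values in binary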

The wave we started with is now in digital form: 5, 9, 5, 6, 7, 1, 2, 6, 4, 6. It’s still in ordinary decimal numbers, but we could convert it to binary if we wanted to. (I won’t go into details of how to convert to binary here, but if you’re curious, there are plenty of explanations of binary online – here’s one). We can record this stream of numbers in a file on a computer disk, on a CD, etc. When we want to play it back, we can reverse the process we went through above to get back the original sound wave. First we plot the positions of the samples onto graph paper:

[Image: digital3 – the samples plotted back onto the grid]

And now we draw the sound wave – all we have to do is join up our samples:

[Image: digital5 – the reconstructed wave, samples joined up]

Voila! All ready to be played back again.
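In code, “joining up” the samples is just linear interpolation between each pair of values. A quick sketch, using the ten numbers from above:

    def reconstruct(samples, upsample=10):
        # Insert extra points on the straight line between each pair of
        # stored samples to approximate the original smooth wave.
        out = []
        for a, b in zip(samples, samples[1:]):
            for k in range(upsample):
                out.append(a + (b - a) * k / upsample)
        out.append(samples[-1])
        return out

    print(reconstruct([5, 9, 5, 6, 7, 1, 2, 6, 4, 6]))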

This might look very spiky and different from the original smooth sound wave. That’s because I’ve used a widely spaced grid with only a few points here so you can see what’s going on. In real digital audio applications, very fine grids and lots of samples are used so that the reconstructed wave is very, very close to the original – to show just one second of CD quality sound, you would need a grid with 65,536 horizontal and 44,100 vertical lines!

(In electronics, the device that turns an analogue sound wave into samples is called an analogue to digital converter, and its cousin that performs the inverse task is a digital to analogue converter. As you probably guessed, it’s not really done using graph paper).

But why?

At this point you may be wondering, why bother with digital recording? It seems like we just went through a complicated process and gained nothing – in fact, we actually lost some detail in the sound wave, which doesn’t look quite the same after what it’s been through! There are several advantages to digital recording:

  • Digital recordings can be easily manipulated and edited using a computer. Computers (at least all the ones in common use today) can only deal with digital information – anything analogue, such as sounds and pictures, has to be digitised before they will work with it. This opens up a huge range of possibilities, allowing much more sophisticated effects and editing techniques than could be accomplished in the analogue domain. It also allows us to do clever things like compressing the information so it takes up less space while still sounding much the same (this is what the famous MP3 files do).
  • I noted above that we lost a bit of detail in our sound wave when we converted it to digital and then converted it back. However, in real life situations digital recordings generally give much better sound quality than analogue recordings. This is because the small inaccuracies introduced in the digitisation process are usually much smaller and less noticeable than the background noise that inevitably gets into analogue recording and playback equipment no matter how careful you are. Digital is more or less immune to background noise for reasons I’ll explain shortly.
  • Digital recordings can be copied an unlimited number of times without losing any quality. This is closely related to the point above about sound quality. If you’re old enough to have copied records or cassettes onto blank tapes, or taped songs off the radio, you may have noticed this in action. The copy always sounds worse than the original, with more background noise. If you make another copy from that copy instead of from the original, it will be worse still. But it isn’t like that with digital recording – if you copy a CD to another CD, or copy an MP3 file from one computer to another, there is no loss of quality – the copy sounds exactly like the original, and if you make another copy from the copy, it will also sound exactly like the original. (This isn’t just a case of the loss in quality being so small you can’t hear it – there genuinely is no loss whatsoever. The copies are absolutely identical!).

Notes on background noise

I mentioned above that digital recordings are more or less immune to background noise and that’s one of their big advantages. But first of all, what is background noise, where does it come from, and what does it do to our sound signals?

Background noise is any unwanted interference that gets into the sound signal at some point during the recording or playback process. It can come from several different sources – if the electrical signal is weak (like the signal from a microphone or from a record player’s pick-up), it can be affected by electromagnetic interference from power lines or other devices in the area. If there is dust or dirt on the surface of a record or tape, this will also distort the signal that’s read back from it.

There is no getting away from background noise; it will always appear from somewhere. If we have a vinyl record with a sound signal recorded onto it that looks like this:

[Image: digital1 – the original recorded signal]

by the time it gets played back through the speakers, noise from various sources will have been added to the original signal and it might look more like this:

[Image: digital7 – the signal with noise added]

Once the noise is there, it’s very difficult or impossible to get rid of it again, mainly because there’s no reliable way to tell it apart from the original signal. So ideally we want to minimise its chances of getting there in the first place. This is where digital recording comes in. Let’s say we have the same sound signal recorded onto a CD instead of a vinyl record. Because it’s in digital form, it will be all 0s and 1s instead of a continuously varying wave like on the vinyl. So the information on the CD will look something like this:

[Image: digital8 – the digital signal, with only two levels]

This time there are only two levels, one representing binary 0 and the other binary 1.

There will still be some noise added to the signal when it gets read back from the CD – maybe there is dust on the disc’s surface or electrical interference getting to the laser pick-up. So the signal we get back will look more like this:

[Image: digital9 – the digital signal with noise added]

But this time the noise doesn’t matter. As long as we can still tell what is meant to be a 0 and what is a 1, small variations don’t make any difference. In this case it’s very obvious that the original signal shape was meant to be this:

[Image: digital8 – the recovered digital signal]

So, despite the noise, we recovered exactly the original set of samples. We can pass them through the digital to analogue converter (DAC) and get back this:

[Image: digital1 – the original sound wave, reconstructed]

a much more accurate version of the original sound wave than we got from the analogue playback. Although the noise still got into the signal we read from the CD, it’s disappeared as if by magic and doesn’t affect what we hear like it did with the record.
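Here’s a toy demonstration of why the noise doesn’t matter. We send a string of bits as two analogue levels, add random noise, and recover the original exactly by checking which level each received value is closest to (the noise level here is made up; the trick works as long as the noise stays below half the gap between the levels):

    import random

    def transmit(bits, noise_level):
        # Each bit becomes an analogue level (0.0 or 1.0) plus random noise.
        return [b + random.uniform(-noise_level, noise_level) for b in bits]

    def recover(received):
        # Anything nearer the top level was a 1, anything nearer the bottom a 0.
        return [1 if level > 0.5 else 0 for level in received]

    original = [0, 1, 1, 0, 1, 0, 0, 1]
    print(recover(transmit(original, 0.3)) == original)  # True – noise gone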

(Of course, digital recording isn’t completely immune to noise. If the noise level was so high that we could no longer tell what was meant to be a 0 and what was a 1, the system would break down, but it’s normally easy enough to stop this from happening. Also, we can’t prevent noise from getting into the signal after it’s converted back to analogue form, but again this is a relatively small problem as the majority of the recording and playback system works in digital form).

Does digital recording really sound better?

Not everyone thinks so. A lot of people say they prefer the sound of analogue recordings, often saying they have a “warmer” sound compared with the “colder” sound of digital. In my opinion, yes, there is a difference, but digital is more faithful to the original sound – the “warmth” people talk about is actually distortion introduced by the less accurate recording method! It’s absolutely fine to prefer that sound, in the same way that it’s absolutely fine to prefer black and white photography or impressionist paintings even though they’re less realistic than a colour photograph or a painting with lots of fine detail.

“Ah”, you might say. “But surely a perfect analogue recording would have to be better than a digital recording? Because you’re recording everything rather than just samples of it”. Technically this is true… but in reality (a) there’s no such thing as a perfect analogue recording because there are so many ways for noise to get in, and (b) at CD quality or better, the loss of information from digitising the sound is minuscule, too small for anyone to be able to hear. Double-blind tests have been done where audio experts listened to sounds and had to determine whether the sound had been converted to digital and back or not. No-one was able to reliably tell.

Phew! That was longer than I meant it to be. That’s the background… next time I really will start on actual sound synthesis, I promise!

 

Why I don’t like restricted computing

Sort-of-preachy, serious-ish post today, sorry.

Back in the mists of time when I first started using computers, you could basically do whatever you wanted with them, in the sense that you could run any program you wanted, whether you wrote it yourself or bought it or acquired it from some dodgy geezer down the pub or whatever. It wasn’t necessarily easy to get it to do what you wanted, but at least the computer companies didn’t go out of their way to make it particularly difficult either. Now, with systems like the iPad and iPhone gaining popularity and Windows 8 on the horizon, that’s no longer true, and I find it worrying.

For those of you that don’t know, you can’t install just any software you want on an iPhone or iPad like you can on a PC – you can only install things from the official app store. Unless you jailbreak the thing or pay to become a developer, that is. The same situation exists with Windows Phone 7 (though not, thankfully, with Android and BlackBerry – they both allow you to install apps from anywhere you want, more like a traditional PC).

You could argue, though, that the iPhone is still an improvement over previous mobile phones in this regard. After all, on earlier phones you couldn’t generally install extra software at all, and at least on the iPhone you can do this, even if you are restricted in where it comes from. But locking users into one source for their software appears to be a trend that’s migrating away from phones and towards more traditional computers. The upcoming Windows 8 contains two examples of this:

1. “Metro” apps, which are a new kind of application more similar to the ones you get on Windows Phones, will only be installable from the official Microsoft store. While you’ll still be able to install traditional Windows programs from anywhere like before, the new style apps won’t allow this.

2. The new ARM version of Windows 8 is mandating locked-down boot code, preventing users from replacing the operating system with something else. Installing Linux, for example, which is generally fairly easy on current PCs, would be made much more difficult or impossible on a Windows 8 ARM system. (There were fears that this would be the case for Windows 8 on all platforms, including standard PC systems, though this thankfully seems to have fallen by the wayside for now). By the way, there is no technical reason for this difference whatsoever – it’s like a car manufacturer deciding that if you buy a blue car from them you’ll be allowed to service it yourself and get hold of spare parts and so on, but if you buy a red one, the bonnet will be welded shut to stop you doing anything to it.

So why do I think this is a bad thing? Well, several reasons, but mainly this: I got into programming and doing other more advanced (and fun!) things with computers by tinkering around with my own computer (first a Spectrum, then a PC) at home. There was nothing to stop me writing my own programs, or modifying ones other people had written to see what would happen. It’s now my career and I’m pretty convinced that without those early experiences, I wouldn’t be doing it now. At the very least I certainly wouldn’t be as good at it or as enthusiastic about it. I worry that if the trend towards locked-down systems continues, the next generation of kids won’t get this same chance. They will miss out on experimenting and having fun with technology the way I did, and everyone will miss out on the things they might have created as a result.

The argument about whether this is a good thing or not rages on in various forums online. I already know a lot of what people will say in defence of the closed systems, so I’ll pre-empt it by giving my responses first.

“Kids will still be able to learn programming at school, or on specialist teaching devices”

Yes, no-one’s suggesting that learning programming is going to become impossible, but there’s no denying that it could get difficult enough to put a lot of people off. If the trend towards closed systems continues to the point where you can’t write programs on a standard PC anymore without either paying a “developer fee” or jumping through hoops to jailbreak it, there are going to be significant extra barriers in the way. And some people who might have persisted if it was as easy as just downloading one of the many free programming environments available online (as it is today) are going to give up if they find they have to make costly purchases or risky modifications to their computers before they even get to write a line of code.

And yes, school computing departments are likely to have programmer-friendly computers of some description. But I know I would never have got into programming in such a big way if I was restricted to doing it in school classes, and I’m sure the same applies to a lot of people. I did most of my learning and experimentation on my own at home.

“This isn’t like the old 8-bit days. Computers are too complicated for anyone to be able to understand now anyway, so it doesn’t matter if you’re prevented from programming them yourself”

They’re still perfectly understandable to many of us, thank you very much. The concepts are still very much the same. It’s true that computers are a lot more complex than they were in the days of the Spectrum and C64, probably complex enough that no one person can be an expert on every area of them now… but at the same time, software has improved a lot and got much better at hiding that complexity. So most times you don’t need to think about the details of the hardware while you’re coding an app… though in some cases it is useful to know those details, and in those cases it’s nice to be able to find them without slamming into an artificial brick wall that really doesn’t need to be there.

“Haven’t you got anything better to complain about? Surely this is trivial compared to global warming/starvation in Africa/the economic crisis”

Yes, it is, but I still think it’s important enough to be worth talking about, and I’m a lot more qualified to talk about this than about any of those other issues.

“This is nothing to do with you. Companies have a right to sell locked down computers and people have a right to buy them. If you don’t like them, just don’t buy them!”

I certainly don’t intend to buy them; in fact, one of the main reasons I chose an HTC phone is that their policy on this is much better than most phone manufacturers’ (they allow you to unlock the phone’s bootloader and replace the entire operating system with a new one of your choice if you want to). And yes, people have a right to choose whatever hardware they want for themselves. But I also have a right to say why I don’t like this trend and try to warn other people of the possible downsides before we reach the point of no return. This is especially important because the downsides may not become obvious to most people until it’s too late to do much about them.

“Only a tiny minority of geeks want to write their own apps or install a different operating system. They will be knowledgeable enough to choose hardware that doesn’t restrict them”

Until we reach the point where it’s virtually impossible to find hardware that doesn’t restrict them, that is.

In any case, I think that argument is flawed. Not everyone knows when they buy a computer that they will at some point in the future want to replace the OS. I’m typing this right now on my netbook, and when I bought it I fully intended to just use it with the operating system it came with (Windows 7). It wasn’t until after I’d had it a while and found it was quite slow that I decided to give Linux a try on it. Thankfully today it’s quite easy to install Linux on any computer, even one that wasn’t designed for it, so it worked even though I hadn’t specifically looked for a Linux-compatible machine when purchasing it. Tomorrow I might not be so lucky if current trends continue.

I’ve also installed Linux for my dad and brother at various points when they were having problems with their Windows PCs and didn’t have the recovery disk anymore. They certainly wouldn’t have specifically looked for Linux compatibility when they bought their computers, but they were very grateful for it when it saved them from having to buy a new copy of Windows or a whole new computer just to get back up and running. (Though one can see why Microsoft and the hardware manufacturers might be less keen on users having this option).

I’m not convinced that it is only a “tiny minority” that want more flexibility with their computers, anyway. I think if you looked at the percentage of people who’d used their PCs for something that might not be approved of (or even envisaged at all) by Microsoft or Apple – whether that’s using a Bittorrent client, installing a free alternative to some of the built in software of Windows/OSX, or messing around with a programming language or game creator or something similar – it would be a lot more than a tiny minority. Almost everyone I know has probably used their PC for at least one thing that is likely to become difficult or impossible in future if Windows transitions to a locked down, iPhone-like model.

It’s hard to find numbers for the percentage of iPhones that have been jailbroken, but most estimates seem to put it around 10-15% in the West and much higher (33-50%) in China. That’s hardly a tiny number of people, and it doesn’t even include the percentage who might like to jailbreak if they knew it was possible and knew what it could do, or the people who decided not to buy an iPhone at all because of the restrictions but might have if they were a bit more open.

Even if it was only a “tiny minority” of geeks that want to do this stuff, I’m still not convinced that’s a good argument for preventing them from doing it. They may be tiny in number, but they’re likely to be playing a disproportionately large part in creating the next generation of innovative software. If they’re held back from doing so, it will be bad for the majority as well in the longer term.

“Phones and tablets aren’t computers, they’re appliances, so it’s fine that you can’t modify the software on them. Would you expect to be able to modify the software in your microwave?”

This seems a completely arbitrary distinction to me. Of course a microwave is an appliance; it may technically have a computer inside to control the electrics, but you wouldn’t be able to run a word processor or browse the web on it, so it’s no big loss that you can’t install your own software on it. I’d even say the original (pre-Touch) iPod was more of an appliance, because its hardware made it not particularly useful for anything other than playing music. But tablets and phones are pretty obviously just small computers. They can do a lot of the same things a full size PC can do, access the same websites, even run some of the same apps. If you add a Bluetooth keyboard to your iPad, as many people do, it’s not that much different from a small netbook – so why is one a “computer” and one an “appliance”? It’s entirely an artificial distinction intended to muddy the waters and get people to accept that it’s OK for tablet computers to be locked down and restricted because they’re somehow “different” from traditional computers with a keyboard.

“This is a good thing for most people. It means better security and higher quality software”

If you can’t install software, you can’t install buggy software or malware, right? So it must be a good thing!

I don’t really believe this argument. If (for example) Apple wanted to, they could include a checkbox in iOS like the Android one that allows you to install apps from anywhere you like; they could also include an HTC-style bootloader unlocking tool on their website; and these things would have absolutely zero impact on security or usability for most people. How could they? Most people would never use them anyway (at least that’s what we keep being told). Sure, if people really wanted to, they could then go and install malware from dodgy websites. If people really want to, there’s nothing to stop them driving the wrong way up a motorway or walking down a dark alley in a bad area at night laden with expensive jewellery either, but I don’t hear many people clamouring for rules to stop people from driving their own cars, or going out after dark without a police escort. So why this desire to stop people installing software on their own computers?

In any case, if you think locking down computer systems makes them immune to bugs and security flaws, just look at the iOS 6 maps fiasco, the alarm clock fiascos, the security hole that allowed iPhones to be jailbroken just by visiting a website, numerous games console hacks and jailbreaks, and one study that found the official App Store had more spyware in it than the unofficial, unsanctioned Cydia store (sorry, I’ve lost the link for this one).

So if security and usability isn’t the real reason for doing this, what is? I can think of at least two:

1. Money. Apple takes a 30% cut of anything you buy from the App Store, and anything (such as music, ebooks) that you buy from within apps as well. Allowing competitors to operate app stores, or allowing you to install apps from outwith the store would obviously interfere with them getting this cut. I keep hoping this will be ruled illegal under EU competition law or something – I don’t think a car manufacturer would get away with demanding a 30% cut of any accessories you buy, or a stereo manufacturer demanding 30% from all the CDs you buy, so I’m not sure why it’s considered acceptable in the computing world.

2. Control. If they can control what you can and can’t install on your computer, they can prevent you installing things they don’t approve of. For example, programs that copy DVDs and Blu-rays can be outlawed, as well as software that competes with one of the supplier’s own products. Bittorrent clients could be forbidden altogether, and web browsers could be made to automatically block access to sites that Apple (or the authorities) disapprove of.

“It doesn’t matter, people will just jailbreak them anyway”

Typically, jailbreaking relies on finding holes in the security that aren’t supposed to be there. Often these are fixed in future software or hardware updates, meaning that new jailbreaking methods have to be found. Often they are found quickly, but there are no guarantees – some devices haven’t been jailbroken even after months or years. Security is getting ever more sophisticated as well.

Relying on jailbreaks being available is like agreeing to live in a prison cell because you’ve found a secret tunnel in the floor that allows you to go out whenever you like. Maybe that will work for a while, but what are you going to do if the guards discover the tunnel and block it up? Maybe you’ll find another one… then another… what if one day there isn’t another one? If you value being able to go outside, the only safe option is to not agree to live in the cell in the first place.

There is also the fact that jailbreaking will likely void your warranty and may permanently damage your device if it goes wrong, so a lot of people will be hesitant to do this to their expensive phone or tablet.

“This will never happen, you’re just scaremongering and you’ve got no evidence”

Right now, I don’t know what’s going to happen. No-one does. I would love nothing more than to be proved wrong about all of this, to still be living in a world where people can do whatever they want with their computers (and indeed phones and whatever other devices are around by then) in ten or twenty years’ time. But currently there is a definite shift going on away from general purpose programmable systems, and that’s enough to make me very concerned. I’d rather say something now and help raise awareness even if it does seem like scaremongering than sit back and watch in silence as Apple and Microsoft attempt to flush the technology landscape that got me to where I am today down the toilet.

“Apple and Microsoft would never use their app store rules for censorship or to block competitors, only to block buggy or dangerous software”

Right now it is likely that they are treading carefully, because they won’t want to scare people away to the many open alternatives that still exist at this point. But if closed platforms become more widespread in the future, there will be nothing to stop them from being a lot more restrictive. Even so, there have already been some bad policies and decisions from both Apple and Microsoft. Many of the rules have nothing whatsoever to do with the security or quality of software and are purely about censoring content or preventing competition. This article points out that the majority of recent popular PC games would be forbidden from the Microsoft Store due to their content. Both companies ban or severely restrict apps that compete with their own – so in future there may be no more installing Firefox or Chrome; you could be stuck with Internet Explorer whether you like it or not. Apple have already used their app store restrictions to censor political cartoons. They also used to have a policy against any app that allowed any sort of scripting, which prohibited (among other things) emulators for old computers and consoles, and software designed to teach kids programming.

Yes, some of these decisions were eventually reversed, others were not. But in any case, can you count on every bad decision that affects you being reversed in the future? No. That’s why the only safe option is not to give them this power in the first place.

Phew. That was longer than it was supposed to be!