Lockdown Blog 2: Training an AI to recognise the Simpsons

Introduction

I’ve been meaning to have a proper play around with modern artificial intelligence techniques for a while, and lockdown, with more time on my hands, seemed like a good time to give it a go. So I trained a deep learning neural network to recognise characters from The Simpsons. (As you do).

This was actually my third foray into neural networks: I used one (not very successfully) for my final year university project way back in the mists of time, and I also experimented with training one to generate text last year. (Among other things, I fed it megabytes of text from my diary and got it to generate its own diary entries based on them, which was pretty hilarious if not particularly useful). But this was the first time I’ve attempted to use one in what’s probably their biggest application area, namely computer vision and image recognition.

I thought recognising Simpsons characters would be a good way to get started with this, for several reasons. Firstly, I really like the Simpsons (or at least I did until it all went downhill in the late 90s or so). Secondly, it was relatively easy to get hold of large numbers of Simpsons images for training and testing the network (more on that in a moment). And thirdly, because cartoon characters look so distinctive, it would be easier to get a computer to tell them apart than it would be for (say) real people.

Before I go any further I’d like to give a shout out to the fantastic Practical Deep Learning For Coders course made by the developers of fast.ai. I watched all the course videos a few months ago and found them incredibly interesting and inspiring, possessing the rare combination of being instantly accessible but also going into the subject in great depth. As an illustration of what I mean, after the first half hour or so of the opening lecture you’re already up and running with training a classifier to tell different cat and dog breeds apart, while the second half of the videos delve right into the code, explaining it right down to a line-by-line analysis of the algorithms that make up a neural network. Highly, highly recommended for anyone who knows how to code and is at all interested in AI.

Preparing the data

The first step in building a deep learning model is getting together some data that you can use for training and testing the neural network. In my case, I needed as many images from The Simpsons as I could get hold of, and I also needed to “tag” them (or at least most of them) with the names of the characters that appeared in them.

I decided to write a Python script that would download random images from Frinkiac, which is basically a Simpsons screen grab search engine, often used for making memes and so on. I felt a bit bad as it probably wasn’t intended for this usage, but in my defence I was quite gentle with it – I left my script running over a period of days, grabbing a single image at a time and then sleeping for a while so as not to hammer the site’s bandwidth. By the end of this process I had a completely random selection of around 3,000 screen captures from the first 17 seasons of the show sitting on my hard drive.

The next step was to “tag” these with the names of the characters that appeared in them. You might wonder why I had to do this… after all, my aim was to get the computer to identify the characters automatically, not to have to do it myself, right? Well yes, but in order to train a neural network to perform this sort of recognition task, you need to give it “labelled” data – that is, you show it each image along with a label describing what’s in it – in quite a similar way to how you might train a person to recognise characters they weren’t previously familiar with, in fact.

I wasn’t looking forward to this bit as I knew it would take quite a bit of time-consuming manual work – I was going to have to look at every image myself and identify the characters present, then enter that information into the computer somehow. To ease the pain, I built a little web app to try and make this process as fast as possible. It showed me the images in turn, allowing me to tag each one and move on to the next with the minimum of key presses, writing the image names and tags into a CSV file that I could use with the AI software later on. In all I think it took me maybe an hour to write the web app and about 2 hours to tag the images, which wasn’t as bad as I’d feared.
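For the curious, the heart of an app like that can be tiny. Here’s a rough sketch of the keyboard handling in JavaScript – not the app’s actual code, and the images array and showImage helper are invented stand-ins:

```javascript
// One key per character toggles a tag; Enter saves the current image's
// tags as a CSV row and moves on to the next image.
const KEYS = { h: 'homer', m: 'marge', b: 'bart', l: 'lisa' };
const rows = [];
let current = 0;
let tags = new Set();

document.addEventListener('keydown', (e) => {
  if (e.key in KEYS) {
    const tag = KEYS[e.key];
    tags.has(tag) ? tags.delete(tag) : tags.add(tag); // press again to undo
  } else if (e.key === 'Enter') {
    rows.push(`${images[current]},${[...tags].join(' ')}`);
    tags = new Set();
    current += 1;
    showImage(images[current]);
  }
});
```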

Initially I had planned to train the network to recognise all the named characters in the show, but I later realised I probably didn’t have enough data for this – some of the more minor characters only showed up a handful of times in my training images, not really enough to make the recognition reliable. So instead I decided to focus on just the four main characters: Homer, Marge, Bart and Lisa.

Training the model

Once I had the tagged training data ready, I turned my attention to actually training a neural network on it. I used the same software as the fast.ai course I mentioned above, namely the fast.ai library (which is built on PyTorch), with the code written in the form of a Jupyter Notebook for easy experimentation. I used a ResNet34, a classic architecture for image recognition, though I also tried a larger ResNet50 to see if it worked any better (it didn’t). Training (on my GeForce 1050Ti) only took about 5 minutes, and then I was able to play with the resulting model, testing it on images it hadn’t seen before.

Overall, I was reasonably happy with it, for a first attempt. It worked very well indeed (almost perfectly) for images that included a reasonably close shot of one of the characters’ faces. For example:

Prediction: 99.78% Bart
Prediction: 99.01% Marge
Prediction: 98.98% Lisa
Prediction: 99.99% Homer

(You may notice that the model doesn’t just give a straight yes or no prediction, but a percentage score indicating how confident it is that each character does appear in the image).

The model doesn’t work so well for more complicated situations such as characters being partially hidden, characters viewed from an unusual angle, characters wearing unusual clothing (especially clothing that covers up some of their distinctive features), characters far away in the distance so that they appear very small in the image, and so on. Below are some examples where it doesn’t make such a confident prediction, and my speculation as to why that might be.

Prediction: 35.4% Marge. The model thinks it’s more likely that Marge is in the image than any of the other characters (who all scored likelihoods of less than 10%), but still isn’t very confident, probably because she’s in a slightly unusual position and has her head turned away.

Prediction: 54.23% Homer. The model thinks there’s a decent chance Homer is in this image, but isn’t very sure, probably because only the top of his face is visible in this one.

Prediction: 99.88% Lisa, 11.55% Bart. The model is very certain Lisa is here, but nowhere near as confident about Bart. I think this is probably because Bart is partially hidden behind Lisa and Maggie, while Lisa is fully visible.

Prediction: 97.23% Bart, 97.49% Homer, 88.81% Lisa, 68.79% Marge. This time the model correctly identifies that all four characters are in the image, but it’s significantly less certain about Marge than the others, probably because her face is obscured behind Bart.

Prediction: 87.75% Bart, 53.09% Homer, 94.77% Lisa, 39.25% Marge. In this shot, all the characters are present but not in their usual clothing. Bart and Lisa are recognised with a high degree of confidence, but the model is understandably not so confident about Homer, since only the top of his face and head is visible. Surprisingly, it’s even less confident about Marge, maybe because her trademark hair is mostly hidden from view.

Prediction: 98.25% Homer, 74.86% Marge. The model is a lot less confident about Marge than Homer, presumably because Homer’s glass is obscuring most of her face.

Prediction: 91.71% Homer, 93.39% Marge, 67.27% Lisa. Homer and Marge are recognised with more than 90% certainty as expected. Interestingly, the model also thinks that Lisa is probably here, I’m guessing because Maggie looks very similar to Lisa in some ways, notably her hair and eyes.

So that’s my model. I have no doubt at all that it could be done much better by someone with more expertise (or, for that matter, a better training data set), but as someone who started programming back in the days when it would have been unimaginable for a computer to do this, it’s amazingly cool to see it working as well as it does.

Can I play with it?

Not just yet, I’m afraid. I’d like to find out how to make models like this available online for people to have a go with, but I’m not there yet. I’m new to all this and don’t want to end up overloading my web host, or running up a huge bill if I go down the cloud hosting route, so I’d definitely want to do some research and testing before attempting it.

Game Project part 6: My tools

Right now on the game project, I’m working on something that I think is going to be pretty cool once it’s done, but it’s going to take me a while to finish, or even to get to the point where it’s worth writing about here. So I thought I’d fill the interlude by writing a bit about the tools I’m using to develop the game, in case anyone is interested. I’ve mentioned some of these already in passing, but in this post I’ll try and flesh out the full picture a bit more, and also say a bit about why I chose the tools I did (since they probably seem an odd choice to a lot of people).

As I said before, I am writing the game in JavaScript so that it runs in a browser environment, using WebGL for the graphics. The major advantages of this approach are:

  1. I don’t have to grapple with all the annoying differences between the various platforms I want to target (Windows, macOS and Linux to begin with); I can just target the web environment and the game will run on almost anything that has a browser.
  2. Once the game’s finished, people won’t have to download and install it in order to play it, they can just navigate to a web page.

The major disadvantage is that the performance will be worse than “native” code in C++, but the game isn’t likely to be super demanding so this shouldn’t be too much of a problem in practice.

For testing the game while I develop it I’m mostly using Google Chrome, since that’s what I have on all my machines anyway. I’ll test it on other browsers (Firefox, Safari, etc.) at some point and check that everything works as it should, though I’m trying to steer clear of relying on obscure or browser-specific features as much as possible.

The only JavaScript library I’m currently relying on is glmatrix, a very useful library for doing the vector and matrix operations that are very common in 3D graphics applications. I’ve used it before and there seemed no point in re-inventing the wheel for something so basic and ubiquitous. Other than glmatrix I’m targeting the browser directly and writing all the code myself, though bits of it are adapted from my previous projects rather than being written from scratch for this game.
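To give a flavour of what it does, here’s the sort of thing glmatrix gets used for in a renderer – building the projection and model-view matrices for a draw call. (An illustrative sketch: canvas and objectAngle are placeholders, and if you load the library via a script tag the same functions live on the glMatrix global.)

```javascript
import { mat4, vec3 } from 'gl-matrix';

// A 45-degree perspective projection matched to the canvas shape.
const projection = mat4.create();
mat4.perspective(projection, Math.PI / 4,
                 canvas.width / canvas.height, 0.1, 1000.0);

// A camera two metres up and ten metres back, looking towards the origin.
const modelView = mat4.create();
mat4.lookAt(modelView,
            vec3.fromValues(0, 2, 10), // camera position
            vec3.fromValues(0, 1, 0),  // point being looked at
            vec3.fromValues(0, 1, 0)); // world "up" direction
mat4.rotateY(modelView, modelView, objectAngle); // spin the model
```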

XAMPP running an Apache server on Windows

As well as a browser, it’s also necessary to have a web server installed for testing purposes, because the game expects to load its data files from a server via HTTP rather than from a local disk. So I have Apache, the most widely used web server in the world, installed on all my machines. On Linux I installed it through the normal package manager; on Windows and Mac I’ve installed XAMPP, which does a great job of packaging up Apache plus some other useful software and making it easy to install and use.

To edit the actual game code I’m using GNU Emacs. This is largely due to habit: Emacs has been my editor of choice for nearly 20 years now and its most useful shortcut keys are burned into my brain so deeply that I don’t even have to think about them anymore. It’s a pretty powerful editor and has some decent support for JavaScript, so it probably wouldn’t be a bad choice even if it wasn’t for my long history of using it for everything.

Editing the rendering code using GNU Emacs

I’m storing the code in a Fossil repository. I wrote a whole blog post about Fossil when I first discovered it, so I won’t say too much here. I still like its philosophy of storing all the code along with wiki pages and a bug tracker in a single compact file, while the Fossil executable itself is also a single file that can be dropped onto any system without going through a complex installation process. I probably wouldn’t use Fossil to co-ordinate a huge software project with multiple authors, but for a personal project like this one it’s ideal.

I’m using Fossil’s wiki facility to make notes as I go, covering things like the binary file formats used for models, and the exact process of authoring content for the game engine. Hopefully I’ll manage to keep this up going forwards as I know from past experience how useful it will be later on! I haven’t used the bug tracker system yet but I might later on if it becomes useful.

So that the code is automatically synchronised between all my machines, I keep the Fossil repository file in my Dropbox. I also keep many of the asset files in Dropbox, though I haven’t checked them into Fossil as it’s not really designed for working with large binary files, and putting them in there would just bloat the repository file and cause it to take longer to sync. I’ve used this workflow, for want of a better word, on various projects for the last few years and found it works pretty well for me.

Editing terrain in Blender

I think that pretty much covers the coding and testing side of things. The other major aspect to the game is creating the assets (3D models, landscapes, animations, textures, icons, and so on). My primary 3D software is Blender. As I said in the first post of this series, it’s free, it’s very powerful (people have made entire animated movies using it), it runs on almost anything, and it’s got a great community behind it. Its biggest downside is probably the steep learning curve, but since I’m already mostly over that after using it for my previous 3D dabblings, there was no reason not to use it for this.

I covered MakeHuman, which I’m using for the (you’ve guessed it) human models, in some depth in part 2, so I won’t go into it further here.

Even when making 3D games, some 2D image editing is still usually required for preparing textures, making heads-up displays, that kind of thing. Gimp has been my paint program of choice for years now. Apparently it’s got a horrible interface compared to Photoshop, but then I’ve never really used Photoshop so I guess you can’t miss what you’ve never had. In any case, Gimp has never let me down so far in terms of features. I’ve only used it once so far for this particular game, for tiling the ground textures into a single image (the “snap to grid” feature was very helpful here), but I’m sure I’ll be breaking it out again before too long. My other favourite image editor is Inkscape, which handles vector graphics rather than bitmapped images. In the past I’ve found it great for designing icons and stuff like that.

Editing the terrain texture map in Gimp

I think that’s pretty much all my tools covered now. If you’ve been following along, you’ll notice that all of them are free to use and the majority are open source, which is no accident. I’m not a free software zealot who demands that all software must be freely licensed, but there are some good practical reasons why I much prefer using open source tools wherever possible.

Firstly, I don’t have to pay for them. I’m not a complete cheapskate, but at the same time I have a family to support now and better things to do with my limited money than pour it into (for example) Adobe’s pockets for a Photoshop subscription when Gimp meets my needs perfectly well.

Secondly, they tend to be cross-platform. That’s important for me because I regularly use all three major operating system platforms (Windows, Linux and macOS) so I much prefer tools that work on all three of them, as all of the tools I’ve described in this post do. I like it this way; it means I’m not locked into one particular platform and am free to switch whenever I want without having to throw away the time I’ve invested in learning this stuff and start from square one with a whole new suite of software. For example, last year it made sense for me to get a MacBook (I needed a Mac for a project and didn’t own a decent laptop at the time) and, even though I hadn’t used one since high school, I was able to install all my preferred free tools and get up and running with it very quickly.

Thirdly, they’re not going anywhere. With commercial software there’s always the worry that the company will go out of business or will discontinue their product after I’ve come to rely on it. Sure, I could continue using an old version (unless it’s a subscription service *shudder* ) but it might not keep working forever, or might keep me locked into an older operating system and unable to upgrade. That’s much less likely with open source, because even if the original developer disappears, someone else from the community can step up and take over maintenance. (I could even do it myself in some theoretical world where I have time for such things 😉 ).

So yeah. While I could probably make my game more rapidly if I switched to some expensive all-singing-all-dancing commercial Windows-only solution, I’m happy with the approach I’m taking, and it also fits well with my desire to understand everything and be able to tinker with the low level code if I want to. I hope you found this at least somewhat interesting or informative. Next time I’ll be back with a proper progress report to share with you.

 

Game Project part 5: Billboards, Culling and Depth Cuing: not just a load of random words…

… though you might be forgiven for thinking that at first 😉 . Why do so many things in computing have such weird names?

In my post about trees, I mentioned that having too many trees in the scene can make the game engine run pretty slowly, because each one contains a lot of polygons and vertices. This could be a problem for me because some of the areas of my game are going to be quite big and contain quite a lot of trees, and I want the game to perform reasonably well even on quite modest computers. So in this post I’m going to talk about some of the tricks that can be used to speed up the rendering of complex 3D scenes, which I’ve been spending a lot of time lately coding up for my game engine.

Culling

The first trick is a pretty simple one: don’t waste time drawing things that aren’t going to be visible in the final scene. This might seem obvious but in fact it’s quite a common approach in simple 3D graphics applications just to throw everything onto the screen and let the GPU (Graphics Processing Unit) sort out what’s visible and what isn’t. (My game engine as described in the earlier posts used this method). This still works fine because the GPU is smart enough not to try and draw anything that shouldn’t be there, but it’s inefficient because we’ve wasted time on processing objects and sending them to the GPU when we didn’t need to. It would be better if we could avoid as much of this work as possible.

This is where culling comes in. It refers to the process of removing items from the graphics pipeline as early as possible so as not to waste time on them. There are various methods of doing this, because there are various reasons why items might not be visible:

  1. They’re behind the viewer.
  2. They’re too far to the side to be visible.
  3. From the viewer’s point of view they’re completely hidden behind other objects.

The first two cases aren’t too hard to deal with. We can imagine the area of the world that’s visible to the viewer as a big sideways pyramid shape projecting out into 3D space (often called the view frustum), then we can immediately cull anything that falls completely outside of this pyramid, because it can’t be visible. The details of how this is done are quite complicated and involve projections and various different co-ordinate systems, but it’s reasonably efficient to do.

There are a couple of ways of making the culling even more efficient:

  1. Instead of examining every vertex of an object to see if it’s in or out of the frustum, it’s common to work with the object’s bounding box instead. This is an imaginary cuboid that’s just big enough to contain all of the object’s 3D points within it. It’s much faster just to test the 8 corners of the bounding box against the frustum, and it still gives us nearly all the same benefits as testing the vertices individually. (There’s a sketch of this test just after the list.)
  2. If you arrange your 3D scene in a hierarchical form (often called a scene graph), then you can cull large parts of the hierarchy with very little effort. For example, if your scene graph contains a node that represents a house, and various nodes within that that represent individual rooms, and various nodes in each room that represent the furniture, then you can start by clipping the top level “house” node against the frustum. If it’s outside, you can immediately cull all of the room nodes and furniture nodes lower down the hierarchy and not have to spend any more time dealing with them.
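The bounding box test from point 1 above is simple enough to sketch. Something like this would do it – my illustration rather than my engine’s actual code, assuming the six frustum planes are stored with inward-pointing normals, so that a point p is inside a plane when n·p + d ≥ 0:

```javascript
// Cull test for an axis-aligned bounding box (given as min/max corners)
// against six frustum planes of the form {n: [x, y, z], d}.
function aabbOutsideFrustum(min, max, planes) {
  for (const { n, d } of planes) {
    // Pick the box corner furthest along the plane normal; if even that
    // corner is behind the plane, the whole box must be outside.
    const px = n[0] >= 0 ? max[0] : min[0];
    const py = n[1] >= 0 ? max[1] : min[1];
    const pz = n[2] >= 0 ? max[2] : min[2];
    if (n[0] * px + n[1] * py + n[2] * pz + d < 0) return true;
  }
  return false; // intersecting or fully inside: send it to the GPU
}
```

If this returns true for a scene graph node, the node and everything below it can be skipped for the current frame.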

(The view frustum only extends a limited distance from the viewer, so it’s also common to cull things that are too far away from the viewer. However, if this distance is too short it can cause far away objects that should be visible to disappear from the scene).

The case where an object is hidden behind another object is a bit trickier to deal with, because there’s usually no easy way to tell for sure whether this is the case or not, and we don’t want to get into complicated calculations to work it out – the whole point of culling things in the first place was to avoid doing too many calculations! However, there are exceptions: indoor scenes are a bit more amenable to this sort of optimisation because (for example) if you’ve got a completely solid wall separating one room of a building from another, you know straight away that when the viewer is in the first room, nothing in the second room is ever going to be visible (and vice versa).

Depth Cuing

Sometimes, though, even when we’ve culled everything we realistically can, things still run too slowly. For example, imagine a 3D scene looking down from a hill over a big city spread out down below. There could be hundreds or even thousands of buildings and trees and other objects visible to the viewer, and we can’t just start removing them without the player noticing, but on the other hand it’s a hell of a lot of work for the computer to render them all. What can we do?

One other option is depth cuing. This involves using less detailed models for certain objects when they’re further away from the viewer. For example, I can instruct my tree generator code to use fewer vertices on the stems and trunks, and simpler shapes made up of fewer triangles for the leaves. This wouldn’t look good for trees close to the camera, because you’d notice the shapes looking less curved and more blocky, but for trees in the distance it’s not too bad.

MakeHuman can also use less detailed “proxy” meshes, which would be an option for adding depth cuing to human models.

Full detail MakeHuman model (left), and with low resolution proxy mesh (right)

Ideally we’d generate the less detailed models of the objects automatically, but it’s also possible to make them manually in Blender if necessary.
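Picking which model to draw is then just a matter of comparing distances each frame. Here’s a sketch of the idea (the names are invented for this post, not taken from my engine):

```javascript
// Distance-based model selection, assuming each object keeps an array of
// {maxDistance, mesh} entries sorted nearest-first, and gl-matrix's vec3
// for the distance calculation.
function selectMesh(object, cameraPos) {
  const dist = vec3.distance(object.position, cameraPos);
  for (const level of object.lodLevels) {
    if (dist <= level.maxDistance) return level.mesh;
  }
  // Beyond every threshold: fall back to the coarsest model available.
  return object.lodLevels[object.lodLevels.length - 1].mesh;
}
```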

Billboards

In 3D graphics terms, billboards are a bit like depth cuing taken to the extreme. In this case, instead of replacing a 3D model with a less detailed 3D model, we replace it with a flat rectangle with the object “painted” onto it via a texture – just like a billboard!

Obviously this is quite a drastic step and it only really looks acceptable for objects that are pretty far away from the camera, but the speed improvement can be dramatic. We’re going from having to render a tree model that might contain thousands of vertices and polygons to rendering a single flat surface composed of four points and two triangles!

In fact, older 3D games used to make extensive use of “billboard sprites” – all of the enemies and power-ups in Doom were drawn this way, as were the trees and some other things in Super Mario 64. The downsides are that they can look quite pixellated and blocky close up, and also that (unless the game creators included images of the objects from different angles) they look the same no matter what angle you view them from.

Creating texture images for every object that we might want to turn into a billboard would be a lot of work, and the resulting images would take up a lot of space as well. Fortunately, we don’t have to do this; WebGL is quite capable of creating the billboard images on-the-fly when they’re required, using a technique called render-to-texture. Basically, this means that instead of drawing a 3D scene directly onto the screen like normal, we draw it into an image stored on the GPU, and that image can then be used as a texture when drawing future scenes.
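Here’s a sketch of the WebGL setup involved in creating a render target (size being whatever resolution you want the billboard texture to be):

```javascript
// Create a texture plus a framebuffer that renders into it, with a depth
// buffer attached so the 3D model draws correctly into the texture.
function createRenderTarget(gl, size) {
  const texture = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, size, size, 0,
                gl.RGBA, gl.UNSIGNED_BYTE, null); // empty to start with
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);

  const framebuffer = gl.createFramebuffer();
  gl.bindFramebuffer(gl.FRAMEBUFFER, framebuffer);
  gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                          gl.TEXTURE_2D, texture, 0);

  const depth = gl.createRenderbuffer();
  gl.bindRenderbuffer(gl.RENDERBUFFER, depth);
  gl.renderbufferStorage(gl.RENDERBUFFER, gl.DEPTH_COMPONENT16, size, size);
  gl.framebufferRenderbuffer(gl.FRAMEBUFFER, gl.DEPTH_ATTACHMENT,
                             gl.RENDERBUFFER, depth);

  gl.bindFramebuffer(gl.FRAMEBUFFER, null); // back to drawing on screen
  return { framebuffer, texture };
}
```

Bind the framebuffer, draw the tree into it once, and the resulting texture can then be mapped onto the billboard quad like any other texture.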

That little pixellated tree was my very first attempt at a billboard sprite!

This is an incredibly useful technique. As well as making billboards, it can also be used for implementing things like display screens and mirrors in games, and some 3D systems use it extensively for doing multiple rendering passes so that they can do clever stuff with lights and shading. I’d never used it myself before, but once I’d coded it up for generating the billboards, I was pleased that it seemed to work pretty well.

Up close, it’s pretty obvious which tree is the 3D model and which is the billboard…

… but from a bit of a distance the billboard looks a lot more convincing

One potential problem with both depth cuing and billboards is known as “pop in”. This is the effect you sometimes see when you’re walking forwards in a game and you see a sudden visible “jump” in the scenery coming towards you, because you’ve now got close enough to it that the billboard (or less accurate model) being used for speed has been replaced by the proper 3D model. It’s difficult to get rid of “pop in” altogether, because no matter how good the billboard is, it’s never going to look exactly the same as the original model, even from quite a distance; but we can minimise it by using as good a substitute as possible and by only using it for objects a long way from the viewer.

Phew! That was pretty long and quite technical this time, but I’m really pleased to have got all of this stuff into the game engine and working. (It’s swelled the engine code up to a much larger 3,751 lines, but it’ll be worth it). I’ve tried to make it all as general as possible – there’s a mechanism in the code now for any object in the game world to say to the engine, “Hey, you can replace me with a 256×256 pixel billboard once I’m 20 metres away from the camera!” or “Here’s a less detailed model you can use once I’m 10 metres away!”, so it should be useful for speeding up all sorts of things in the future. Hopefully next time I should be back doing something a bit more fun… I haven’t quite decided what yet, but it’ll probably involve adding more elements to the game world, so stay tuned for that.

But why now?

You might reasonably ask why I chose to do all this optimisation work so early on in the project. After all, there were plenty of more interesting (to most people anyway!) things I could have been working on instead, like adding streets and buildings to my town. Also, the general advice given to programmers is not to get caught up in optimising code too early, because it complicates the code and because you might end up wasting your time if it turns out it would have run fast enough anyway. I had three main reasons for disregarding this advice:

  1. I already knew from similar projects I’d done recently that I was going to need these optimisations or the engine would be nowhere near fast enough.
  2. I also expected that the billboarding was going to be (along with the skeletal animation) one of the trickiest things to code, so I wanted to get them both out of the way as early as possible. If it turned out that they were beyond my coding ability, or beyond what JavaScript could realistically cope with, I’d rather find that out now than after spending months perfecting the rest of the game, only to discover I couldn’t actually finish it.
  3. In my experience it’s usually easier to build fast code from the start than it is to try and “retrofit” speed to slow code later on. Some optimisations require a certain code architecture to work properly, and it’s not ideal if you find you’ve already written 10,000 lines of code using a completely different architecture.

Anyway, I’m happy. It’s all working now and the coding difficulty should hopefully be mostly downhill from this point onwards.

 

Game Project part 4: Vegetation… that’s what you need

Oops. I didn’t mean to leave it quite so long after part 3 before posting this. I have been doing quite a lot on the game, it’s just either been dull (but necessary) rearranging of the code that I’m not going to bother writing about, or it’s been other coding stuff that I’ll discuss in part 5 (my coding has got a bit out of sync with my posts on here).

Having now got the basics of human models and animation working using WebGL, for the next several posts I’ll be shifting my attention back to creating the game world, and building the tools I need to give the characters a more interesting environment to explore. Starting, in this post, with trees!

As luck would have it, I already had a very useful chunk of tree-related code that I wrote for the Botanic Gardens Station model I made a couple of years ago, and I was able to integrate that code mostly unchanged into my new game engine, allowing me to place trees into the game world. But although the main core of the tree generation code was already there, I wanted to add three major things for this game:

  1. A nice tree editor. I always intended to make one for the Botanics project, but in the end I only used one type of tree for that model, so I did it by just editing numbers in the code until I got it to look vaguely like I wanted it to. For the game I want a more powerful way of editing tree types.
  2. An easy way of placing trees into the game world. I want the ability to place individual trees at specific points, but I also want to be able to generate areas of woodland without having to manually specify the exact location of every single tree. I already had something like this for the Botanics model which I could adapt.
  3. A way to keep the game running fast even with lots of trees in the world. The tree models contain quite a lot of polygons, and on less powerful systems (like the MacBook Air that I’m using for much of the game development) things can easily slow down to a crawl when there are lots of trees visible. I need to speed up the code so that it can cope better with this.

Altogether that added up to quite a lot of work, so much so that I’m going to split off the third item into a separate blog post about optimisation and just concentrate on 1 and 2 here.

No. 1: The Larch. The Larch *

I wrote most of the tree editor on the train down to London. All the people standing in the aisles who would have been on the previous train if it hadn’t been cancelled made it a bit hard to concentrate, but luckily this code didn’t require too much thought – it was mostly a case of adding edit controls to a web page and writing code to move values to and from them. The resulting editor isn’t particularly advanced or pretty, but it works and will be far better than trying to create tree types by editing numbers in the code like I was before.

It’s a bit like a very primitive MakeHuman for trees, in the sense that it lets you edit tree models by tweaking meaningful(ish) parameters like the lengths of the branches, the density of the leaves and the overall shape rather than having to worry about the individual vertices and faces like you would in traditional 3D editing. Once I’m done editing each tree, I can copy the text below the tree graphic and paste it into my JavaScript code to include that tree type in the game.
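To give a rough idea (with every parameter name invented for this post rather than taken from the real editor), a tree type definition might look something like this:

```javascript
// A hypothetical tree type of the sort the editor spits out for pasting
// into the game code; all field names here are illustrative.
const exampleTreeType = {
  trunkHeight: 7.0,        // metres
  trunkRadius: 0.2,
  branchLevels: 3,         // how many times the branches subdivide
  branchesPerLevel: 6,
  branchLengthFactor: 0.6, // each level shorter than its parent
  leafDensity: 0.8,        // 0 to 1: how thickly the leaves cover it
  leafSize: 0.3,
};
```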

Placing trees using Blender

One downside to generating geometry such as trees within my code is that I can’t easily use a 3D editor like Blender to put them in the scene – if I modelled a tree in Blender, put it in my scene and exported it, I’d get a full 3D mesh, which isn’t what I want, because the tree meshes are supposed to be generated within my JavaScript code instead.

However, I still want to use Blender to build the game scenes. I’m already using it for my terrain, and it has lots of amazing editing tools that will be handy for future parts of the game. I could just leave the trees out of the Blender scenes and put them in some other way (either by building some editing tool of my own or just by tweaking the co-ordinates manually), but that would be more work, and I’d have to remember that there were going to be trees in certain places when editing the other stuff in Blender, so it’s not ideal.

“That’s not a cube, it’s a tree”. “Oh. I see you’ve played Cubey-Treey before!”

Instead I’m just putting placeholder objects in the Blender scene (I’m using cubes, but they could be anything), giving them names that make it obvious they’re meant to be trees and that also include their types. Then I extended the terrain converter program that I wrote way back in post 1 to recognise the placeholders and spit out a list of their positions and types in a form that I can easily incorporate into the game code. That way I can move the trees around and change their types from within Blender like I wanted, but I still get the advantages that come with generating the tree models within my game engine.

But what about whole areas of woodland? As I said earlier on, I didn’t want to have to tediously place every tree manually for those. Once again I’m using the idea of a “placeholder” object in the Blender scene. This time it’s merely a base that trees will later sprout from. The code I wrote for the Botanic Gardens Station model can automatically place trees on this base according to some parameters. I can tell it what types of tree I want in this woodland, how densely they cover the ground, how close they’re allowed to be to each other, etc. and it will generate a woodland for me with very little effort on my part!

My terrain model in Blender. That yellow rectangle is an area that I’m marking out to become a woodland.

Of course, there’s always a risk that I won’t like what it comes up with. If so, I can change a “seed” number and it will place the trees differently, though still obeying the same constraints as before.
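The placement logic itself is simple enough to sketch. Something along these lines would do it – this version uses the well-known mulberry32 PRNG so that the same seed always produces the same layout, and all of the parameter names are mine rather than the real code’s:

```javascript
// A small deterministic PRNG: the same seed always yields the same
// sequence, which is what makes the woodland reproducible.
function mulberry32(seed) {
  return function () {
    seed = (seed + 0x6D2B79F5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Scatter trees over a rectangular base, rejecting any position that
// lands too close to a tree we've already placed.
function placeWoodland(area, density, minSpacing, seed) {
  const rand = mulberry32(seed);
  const trees = [];
  const attempts = Math.floor(area.width * area.depth * density);
  for (let i = 0; i < attempts; i++) {
    const x = area.x + rand() * area.width;
    const z = area.z + rand() * area.depth;
    if (trees.every(t => (t.x - x) ** 2 + (t.z - z) ** 2 >= minSpacing ** 2)) {
      trees.push({ x, z });
    }
  }
  return trees;
}
```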

Back in the game engine, my character can now go for a walk in the woods.

So now the game has a slightly more interesting landscape! There are still many more elements to be added of course, but my next task is going to be to speed it up. It’s already slowing my laptops down noticeably when I add a woodland to the scene (though my nice big Linux desktop PC with proper graphics card doesn’t even break a sweat), so that needs to be addressed before I go too much further.

* may not actually be a larch

Game Project part 3: An animated discussion

… well, a blog post about animation, at least 😉 .

Last time I left my preliminary game characters to move around the preliminary game world with their legs gliding across the ground and their arms sticking out like scarecrows’. My task since then has been to get them animated properly.

There are a few different ways of handling animation in 3D. At the most basic level, if you want to make a character model appear to walk, you need to show the model with its arms and legs in different positions each frame to give the illusion of motion. So one method would be to just make a separate 3D model for each animation frame and show them in quick succession, one after another.

This is a pretty simple approach, but it has its drawbacks. Firstly, you’re going to have to make a lot of models! If you want a walking animation that runs at 30fps and is 2 seconds long, you need to edit and export 60 separate models, one for each frame. That’s a lot of work, and if you then want to make a walking animation for a second character, you need to do it all over again. Secondly, a lot of models means a lot of data, and in the case of a web-based game like mine a lot of data is bad news because it’s all got to be transferred over the internet whenever someone plays the game.

These drawbacks can be overcome by using skeletal animation instead. In this case you assign a skeleton (a hierarchy of straight line “bones”) to your character model, as well as a “skin” that describes how the character mesh deforms when the skeleton moves. Then you can create animations simply by determining the shape of the skeleton for each frame, and the model itself will automatically contort itself into the right shape. This means that you don’t need to store a complete new copy of the mesh for each frame, only the angles of each bone, which is a much smaller amount of data. Even better, as long as all your human models share the same skeleton structure, you can apply the same animations to all of them.

A skeleton “walking” in Blender:

Despite the big advantages of skeletal animation, I hadn’t been planning to use it at first for this game, because it requires some slightly complicated calculations that JavaScript isn’t ideally suited to. But once I’d thought about it a bit I realised that realistically I would have to use it to keep the amount of data involved (and the amount of manual effort to create the animations) manageable, so I coded it up. It wasn’t as bad as I feared and the core code to deform a mesh based on a skeleton only came to about 100 lines of JavaScript in the end.
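For the curious, the core idea (usually known as linear blend skinning) looks something like the sketch below – not my actual code, and it assumes up to four bone influences per vertex, with gl-matrix doing the vector maths:

```javascript
// Deform a mesh against a skeleton. boneIndices/boneWeights hold four
// influences per vertex, and each entry of boneMatrices is assumed to
// already include the bone's inverse bind pose, so it takes a rest-pose
// vertex straight to its current position.
function skinVertices(positions, boneIndices, boneWeights, boneMatrices, out) {
  const rest = vec3.create();
  const moved = vec3.create();
  const vertexCount = positions.length / 3;
  for (let v = 0; v < vertexCount; v++) {
    vec3.set(rest, positions[v * 3], positions[v * 3 + 1], positions[v * 3 + 2]);
    const result = vec3.create();
    for (let i = 0; i < 4; i++) {
      const weight = boneWeights[v * 4 + i];
      if (weight === 0) continue;
      // Transform by this bone, then accumulate in proportion to how
      // strongly the bone influences this vertex.
      vec3.transformMat4(moved, rest, boneMatrices[boneIndices[v * 4 + i]]);
      vec3.scaleAndAdd(result, result, moved, weight);
    }
    out.set(result, v * 3); // write the deformed vertex back out
  }
}
```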

Even small bugs in the animation code can cause effects that surrealist artists would have loved

You can make animations using Blender’s “Pose Mode” and animation timeline, then save them to BVH (BioVision Hierarchy) files. I wrote (yup, you’ve guessed it) yet another converter tool to convert these into binary files for the game engine, as well as extending my previous Collada converter to include the skeleton and skin information from MakeHuman. MakeHuman has various built-in skeletons that you can add to your human. I use the CMU skeleton, partly because at 31 bones it’s the simplest one on offer, and partly because it works with some nice ready made animations that I’ll talk about in a bit.

Here’s a simple animation of a character’s head turning that I made in Blender. It probably won’t win me any awards, but it served its purpose of reminding me how to create animations:

I didn’t, however, make that “walking skeleton” animation that I showed earlier. It came from a great animation resource, the Carnegie Mellon University Motion Capture Database. This contains a huge library of animations captured by filming real people with ping pong balls* attached to them performing various actions, and they’re free to use for any purpose. I will probably use some of these in the game, though they probably won’t have all the animations I need and I’ll still have to make some myself, so I’m a bit worried that the CMU ones will show up mine as being pretty rubbish! Still, we’ll get to that later.

Here’s one of my game characters walking with an animation from the CMU database:

With the animation in place it suddenly starts to look much more like an actual game, though admittedly a pretty dull one at this point!

Next I think I’ll turn my attention to re-organising the code a bit so that it can better manage all the assets required for an actual game – not the most glamorous of tasks but it needs to be done and will be well worth it. Then I’ll probably switch back to working on the game world and add something more interesting than a bare landscape for the characters to explore! Hopefully talk to you again soon.

(In case anyone’s interested, the current JavaScript game code runs to 1,558 lines).

* may not actually be ping pong balls

Game Project part 2: I guess this is character building

Last time I worked out a method of editing terrain using Blender, exporting it and then rendering it using JavaScript and WebGL. This time we’re going to add something a bit more interesting: namely a character!

Now, the characters in my game are going to be humans, and modelling 3D humans is not an easy thing to do unless you’re a pretty experienced 3D artist (which I’m not). Fortunately for me and others like me, there’s a great little free program called MakeHuman that does most of the difficult bits for you! In fact, I would go so far as to say that I probably wouldn’t be building this game at all if it wasn’t for MakeHuman, because I’d be worried about my character models looking so depressingly awful that they’d ruin the whole thing.

Making a male character in MakeHuman

MakeHuman is (to me, at least) one of those tools that’s so amazing that it’s almost difficult to believe it really exists. Basically, you start it up and you’re confronted with a 3D human model and a bunch of different controls (mostly sliders) that you can use to change various properties of the human. The sliders on the initial screen control very important properties that affect the shape of the entire human: here you can control gender (a continuum between male and female rather than a binary choice), age (from 1 to 90 years), height, weight, race (as a mix of African, Asian and Caucasian) and a handful of other things. But if you drill down into the other tabs, there are sliders to change just about every detail you can imagine (for example, there are 6 sliders just in the “Nose size” category, and there are also “Nose size detail” and “Nose features” categories!). The quality of the generated models is very good indeed.

As well as giving you a huge amount of control over the shape of your human model, MakeHuman also provides a wealth of other useful features. On the “Materials” tab you can assign skin and eye textures to the model, and there’s also a “Pose/Animate” tab that controls the character’s pose and allows you to add a skeleton for skeletal animation (more on that in a future post). And unless you’re exclusively making naked, hairless characters (not that there’d be anything wrong with that 😉 ), you’ll definitely want to visit the “Geometries” tab to add some hair and clothes to your model. If MakeHuman has a weakness it’s probably that the selection of hair, clothes and other accessories included is a bit sparse, but you can make your own using Blender or download ones other people have made from the MakeHuman Community site, which has a much better selection.

Making a female character in MakeHuman

As you’ve probably guessed by now, I rather like MakeHuman! I’m only really scratching the surface of it here as this is supposed to be a blog about my game rather than about MakeHuman, but there’s plenty more information on the MakeHuman Community website and elsewhere online if you’re interested.

The licensing of MakeHuman is set up so that as long as you use an unmodified official build of the software and export using the built-in exporters, the resulting model is under the Creative Commons Zero licence, allowing you to do anything you want with it, even incorporate it into a commercial product. In order to get my character models out of MakeHuman and into my JavaScript game engine, I decided to use a similar approach to that used for the terrain last time: export to a standard file format and then write a converter program to convert to a compact binary format for the engine.

MakeHuman supports the same OBJ format that I used for exporting the terrain from Blender, but I didn’t use it this time; it doesn’t support all the features of MakeHuman, and although that limitation wouldn’t be a problem right now, it would become one in the near future. Instead I used Collada, a complex XML-based format that captures a lot more information than OBJ does. Loading Collada files is a lot more involved than loading OBJ, but luckily I already had some C++ code from a previous project that was capable of extracting everything I needed. I modified this to write out the important data (at this point just the basic 3D mesh for the human body, plus accessories like the hair and clothes) to a binary file that I could load into my engine.

I also had to make quite a lot of changes to the engine at this point. When I wrote my last blog post, all it could do was display a single terrain mesh with a single texture mapped onto it. I rewrote the code to include a scene graph system, allowing multiple models to be placed in the 3D world in a hierarchical fashion, and also to support multiple textures. Then I wrote a loader for the models I converted from MakeHuman and added my human to the scene.
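A scene graph doesn’t have to be complicated: at heart it’s a node type with a local transform and a list of children, and rendering is a recursive walk that multiplies the transforms together on the way down. A minimal sketch (renderMesh stands in for the engine’s actual draw routine):

```javascript
// One node of the scene graph. Pure grouping nodes have mesh = null.
class SceneNode {
  constructor(mesh = null) {
    this.mesh = mesh;
    this.localTransform = mat4.create(); // gl-matrix identity matrix
    this.children = [];
  }

  draw(parentTransform) {
    // This node's world transform is its parent's combined with its own.
    const world = mat4.create();
    mat4.multiply(world, parentTransform, this.localTransform);
    if (this.mesh) renderMesh(this.mesh, world);
    for (const child of this.children) child.draw(world);
  }
}
```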

Male character in the game world

(The meshes exported from MakeHuman contain a lot of faces and are very detailed – in fact probably too detailed for my purposes. Fortunately it gives you the option of using a less detailed but rougher-looking “proxy” mesh in place of the detailed one, so that’s what I’m doing for my characters, to keep file sizes small and rendering reasonably fast. However, the clothes and hair meshes are still very detailed, so I suspect I will end up making my own, both to avoid bogging the game down with too many polygons and to make them look just how I want).

At this point (once I’d ironed out the inevitable bugs) I had a character in my game world, but all it could do was stand there. I wanted some movement, so I added code allowing me to move the human across the terrain using the keyboard, keeping the model at ground level at all times. I also added a very basic “floating camera” that would follow along behind the character. All of this (the models, the physics, the camera) is very preliminary and will need a lot of improvement in the future, but right now it’s quite cool to see the humans and the terrain working together in the game engine like this.
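To give a feel for how basic this is at the moment, the per-frame update boils down to something like the following sketch, where heightAt and camera.lookAt stand in for my engine’s equivalents and the constants are made up:

```javascript
// Move the character with the keyboard, clamp it to the terrain, and
// trail the camera a fixed distance behind and above.
function updateCharacter(character, input, camera, dt) {
  const walkSpeed = 2.0; // metres per second
  const turnSpeed = 1.5; // radians per second
  if (input.turnLeft) character.heading += turnSpeed * dt;
  if (input.turnRight) character.heading -= turnSpeed * dt;
  if (input.forward) {
    character.x += Math.sin(character.heading) * walkSpeed * dt;
    character.z += Math.cos(character.heading) * walkSpeed * dt;
  }

  // Keep the model glued to the ground at all times.
  character.y = heightAt(character.x, character.z);

  camera.x = character.x - Math.sin(character.heading) * 5;
  camera.z = character.z - Math.cos(character.heading) * 5;
  camera.y = character.y + 2;
  camera.lookAt(character.x, character.y + 1, character.z);
}
```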

Female character in the game world

“But”, you might say, “Why are their arms sticking out like scarecrows’?”. If I’d been arsed to make a video rather than just post still screenshots, you would probably also comment on the fact that they’re sliding along the ground without their legs moving at all. Fear not! My very next step will be to add some animation to the characters. I was originally planning to have that done for this post but once I realised how much work it would be I decided to split it off, so stay tuned for that.

 

Game Project part 1: Do you think it’s going terrain today?

Last time I talked about the new game project I’m planning to start. I was feeling quite enthusiastic about it and I had a bit of time while I was away in the caravan, so I decided it was time to actually start doing stuff on it!

I won’t say too much about the actual concept of the game yet, and in fact it might still change a bit before it’s finished, but it’s going to be set in a 3D town that you can wander round. So one of the first things to do will be to get the bare bones of a game engine running that can display the 3D world using WebGL, and also get an editing pipeline working so that I can create and edit the environment.

For most of the 3D editing I’m planning to use Blender. It’s free, it’s very powerful, it has a great community, it runs on almost everything and I already know how to use it, so what’s not to like? At some point I might want a more customised editing experience, maybe either writing a plugin or two for Blender or adding an editing mode to the game engine itself, but for the moment vanilla Blender will do.

The first element of the environment that I’m going to focus on is the actual ground, since it’s (literally) the foundation on which everything else will be built. I could model the ground as a standard 3D mesh, but it would be more efficient to treat it as a special case: it’s basically a single plane but with variations in height and texture across it, so we can store it as a 2D array of height values, plus another 2D array of material indices. My plan for the ground was as follows:

  • Add the ground as a “grid” object in Blender, and model the height variations using Blender’s extensive array of modelling tools
  • Export the geometry in OBJ format (a nice simple format for doing further conversions on)
  • Write a converter program in C++ to convert the OBJ file into a compact binary format containing the height values for each point and the material values for each square
  • Create a single texture image containing tiled textures for all the materials used
  • Write JavaScript code to parse the binary file and actually display the terrain!

(We could load and parse the OBJ file directly in JavaScript, but OBJ files are significantly larger, and size matters when working in a browser environment, because every data file has to be downloaded over the internet when running the game).
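One nice property of the height-array representation is that queries against it are cheap. Finding the ground height at an arbitrary point (which the game will need later for keeping characters on the ground) is just a bilinear blend of the four surrounding samples – a sketch, assuming one grid square per metre and leaving out the bounds checks:

```javascript
// heights is a flat Float32Array of gridW * gridH samples.
function heightAt(x, z) {
  const gx = Math.floor(x), gz = Math.floor(z);
  const fx = x - gx, fz = z - gz; // fractional position within the square
  const h = (i, j) => heights[j * gridW + i];
  // Blend along x on the near and far edges, then blend between them.
  const hNear = h(gx, gz) * (1 - fx) + h(gx + 1, gz) * fx;
  const hFar = h(gx, gz + 1) * (1 - fx) + h(gx + 1, gz + 1) * fx;
  return hNear * (1 - fz) + hFar * fz;
}
```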

Editing terrain in Blender

The terrain work went reasonably smoothly once I got started. Editing the heights as a Blender grid worked well, with the proportional editing tool being very useful. Writing the C++ converter tool didn’t take too long, and the binary terrain files it creates are about 10 times smaller than the original OBJ files exported from Blender, so it’s well worth doing the conversion.

Terrain rendered in WebGL

Writing a WebGL renderer for the terrain was a bit more involved. The main problem I ran into was an unexpected one: I could see dark lines appearing along the edges of the terrain “tiles” when they should have joined up with each other seamlessly. I eventually traced this to my decision to store all of the textures for the ground in a single image. This works fine for the most part, but I hadn’t foreseen that it would cause problems with the filtering used by WebGL to make the textures look smoother at different scales. This causes a slight blurring effect and when you have multiple textures side-by-side in a single image, it causes their edges to “bleed” into each other slightly.

I solved this by putting 4 copies of each texture into the image, in a 2×2 layout, and mapping the centre section onto the terrain, so that the blurred edges are never used. This reduces the amount of texture data that can be stored in a single image, but it’s still better than storing each texture in a separate image and having to waste time switching between them when rendering.
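In UV terms the fix looks something like this sketch (tilesPerRow is an assumed atlas layout parameter; the usable window runs from 25% to 75% of each block, clear of the blurred edges):

```javascript
// Map a tile-local (u, v) in [0, 1] into the centre of that material's
// 2x2-duplicated block in the atlas. Because the textures tile
// seamlessly, the central window still contains one full copy.
function atlasUV(tileIndex, u, v) {
  const block = 1 / tilesPerRow; // one material's block, in UV space
  const col = tileIndex % tilesPerRow;
  const row = Math.floor(tileIndex / tilesPerRow);
  return [
    (col + 0.25 + u * 0.5) * block,
    (row + 0.25 + v * 0.5) * block,
  ];
}
```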

Now that it’s done I’m reasonably happy with how it looks. I was a bit worried that the height differences might distort the textures too much, but they actually don’t seem to. I do plan to add some additional texture images for variety, and it should also look a lot better once some of the other elements of the scene (buildings, roads, vegetation, etc.) are in place.

Another shot of the rendered terrain

The WebGL renderer is still in its very early stages; right now all it can do is render a single 3D terrain object with a plain sky blue background, illuminated by a single directional light source, and allow me to move the camera around for testing. Obviously it’ll need a lot of other stuff added to enable it to show everything else required for the game, as well as to make things look a bit nicer and run a bit faster – but we’ll get onto that next time.

(Incidentally, the texture images are all from textures.com, a great resource for anyone doing anything 3D related. You can get loads of textures of all sorts from there and they’re free to use for most purposes).

New game project

I’ve decided it’s time to start a new game project. I haven’t done one (well, not a proper one) in a few years but things now seem to be nudging me back in that direction.

A screenshot from the first full game I created. Yes, it’s a fan-made Dizzy game for the Spectrum.

Since the Union Canal Unlocked project finished a year or so ago, I’ve been working on a 3D project in my spare time, but I’ve become frustrated that it’s not really going anywhere, at least not very fast. I had very ambitious goals for it and maybe I’m just starting to realise how long it would realistically take for me to achieve them. I may go back to it at some point, but right now I’m getting tired of pouring time and effort into code that may not actually produce any interesting output for several months or even years.

At the same time, a few things have happened that reminded me how much I used to enjoy making games. I read through an old diary from the time when I was making my first one (well, the first one I actually finished), back in 1994, which seems impossibly long ago now. It brought back the feeling of achievement and progress I used to get from making another screen or another graphic. I’ve also recently played through a game that a friend made a few years ago, and another friend started studying game design just a few weeks ago. It feels like the right time to go back to it.

My second game, also for the Spectrum. This was going to have a ridiculously ambitious 56 levels but I only got around to making 6

Of course, it’s going to be challenging to find the time, especially with our new arrival in the household! But in a way that just makes me more determined to use my scarce time more effectively, on something that I’ll actually find rewarding, rather than trying to force myself to work on something I’ve lost interest in.  Even if I only manage to do a little bit each week, I’ll get there eventually.

I have a rough plan for the new game, which will no doubt get refined and altered a lot once I get started on it. It’s going to be my first 3D game (except for a little joke one I made late last year), something I’ve shied away from in the past mainly due to the additional complexity of 3D asset creation, but after actually completing some 3D models in the last few years, I feel a bit more confident that I can do it, and I think it will fit my concept better.

My third game, this time a historical Scottish game for DOS PCs. I only finished one level of this one

I’ve decided to write it to run in a browser, using JavaScript and WebGL. This will have its pros and cons. On the plus side, it’s technology I already have quite a bit of experience of; the game will automatically work on pretty much every platform without much extra effort on my part; people won’t have to install it before playing it; and not using a ready-made game engine will give me freedom to do everything exactly the way I want (plus I find tinkering with the low level parts of the code quite fun!). On the minus side it likely won’t run as fast as it would have as a “native” app, though I don’t see this being a huge problem in practice as what I have in mind shouldn’t be too demanding; and building the engine from scratch will take quite a lot of work.

To begin with I’m just going to target computers rather than tablets and phones. The control system I have in mind will work with keyboards and mice but not so well with touchscreens. At some point later on I might add a touch control scheme since most of the rest of the code should work fine on touch devices.

“Return of the Etirites”, probably the best game I ever made. It’s basically a rip-off of Mystic Quest on the Game Boy

I’m intending to write a series of posts on here to chronicle my progress. Of course, it’s always a bit dangerous to commit to something like this publicly, but that’s part of my reason for wanting to do it… I hope it will encourage me to actually do some stuff and not just think about it! And it will give me something nice and constructive to write blog posts about, instead of Brexit 😉 . It might take me a while to get the first post up, because as anyone who’s used WebGL (or done any OpenGL coding without using the fixed pipeline) knows, it takes quite a lot of code to even display anything at all. But once the basics are done it should be possible to build on it incrementally and progress a bit more rapidly.

Wish me luck!

 

My take on “Codes of Conduct” for software projects

The news that the Linux kernel development project has adopted a new code of conduct has prompted a lot of comment. As someone who’s been a software developer for all my working life and who’s written about vaguely related stuff before, I thought I would stick my oar in as well, at least to address what I think are some widespread misconceptions.

First off, I’ll say a bit about myself and my own experience. I’ve been a software professional for 16 years. During that time I seem to have impressed a lot of the people I’ve worked with. I have more than once “rescued” projects that were previously thought to be doomed and turned them into success stories. Collaborators who have worked with me in the past have frequently requested to work with me specifically when they approach my organisation for further consultancy. Last year I was promoted to a fairly senior technical position, and also last year I did my first paid freelance project, receiving glowing praise from the client for the way I handled it.

I’m not saying this to brag. I’m normally a pretty modest person and believe me, talking about myself in those terms doesn’t come easily. I’m saying it because it’s going to be relevant to what I say next.

I’m also, by pretty much any definition, a snowflake. (That’s the term these days, isn’t it?). I don’t like confrontation and I tend to avoid it as much as I can. I find it hurtful being on the receiving end of harsh words or blunt criticism and I also tend to avoid situations where this is likely to happen. When it does happen I find I need to retreat and lick my wounds for a while before I feel ready to face the world again.

I didn’t choose to be this way, and if I’d been given the choice I wouldn’t have chosen it, because to be honest it’s pretty damned inconvenient. But it’s the way I am, the way I’ve always been for as long as I can remember. (Again, this may not seem relevant yet, but trust me, I’m bringing it up for a reason).

It’s maybe not surprising, then, that I’m broadly supportive of any initiative that tries to make software development a friendlier place. I don’t follow Linux kernel development closely enough to have a strong opinion on it specifically, but some open source communities have certainly acquired reputations for being quite harsh and unpleasant working environments. That’s probably a factor in my own choice not to contribute to them – although I have contributed a bit to open source in the past, these days if I want to code for fun I prefer to just tinker with my own solo projects and avoid all that potential drama and discomfort.

Not everyone agrees, of course, and sites like Slashdot are awash with comments about how this is a disaster, how it’s going to destroy software quality, and how it’s the beginning of the end of Linux now that the Social Justice Warriors have started to take over. I’m not going to attempt to address every point made, but I would like to pick up on a few common themes that jumped out at me from reading people’s reactions.

Fear of harsh criticism makes people deliver

The main justification put forward for keeping the status quo seems to be that people will up their game and produce better code if they’re afraid of being flamed or ridiculed. I don’t believe this works in practice, at least not for everyone.

I remember years ago when I was learning to drive, my first instructor started acting increasingly like a bully. When I made mistakes (as everyone does when they’re learning something new), he would shout at me, swear at me and taunt me by bringing up mistakes I’d made weeks before. But far from spurring me on to improve my driving, this just wound me up and made me stressed and flustered, causing me to make even more mistakes, in turn causing him to throw more abuse my way, and so on. It got so bad that I started to dread my driving lessons and when I was out in the car with him I lost all confidence and became terrified of making even the tiniest mistake.

After a few weeks I got fed up with this so I phoned the driving school and told them I wanted a different instructor, someone who would build up my confidence rather than destroy it. They assigned me to a great instructor, an experienced and patient older man who I got on very well with, and the contrast was dramatic. My driving improved straight away and I started to actually look forward to my lessons. Within a few weeks I was ready to take my test, which I passed on the first attempt. I always remember this experience when I hear someone express the opinion that abuse will make people perform better.

Of course, everyone responds differently to these situations. I knew someone who said he was glad his driving instructor shouted at him because, after all, it was potentially a life-or-death situation and this would help him to take it seriously. So I’m not saying everyone’s experience will be the same as mine, just pointing out that not everyone responds positively under that sort of pressure.

Furthermore, someone who goes to pieces in the face of abuse might still be perfectly capable in other circumstances. I was able to drive just fine once I got away from that first instructor, and since then I’ve driven all over the country, driven minibuses and towed caravans without incident.

People will use the code of conduct to blow grievances out of all proportion and seek attention

Personally, as someone who hates conflict and hates being the centre of attention, I can’t imagine anything I’d be less likely to do than go out of my way to draw attention and publicity to myself. If anything I think I’d more likely be far too reticent about seeking help if someone was violating a code of conduct, and I imagine it would be the same for most of the people who would actually benefit the most from the code.

That’s not to say everyone would be the same, of course. There might well be a vocal minority who would act in this way, but that shouldn’t stop us from trying to improve things for people who genuinely do need it. In any case, whether a given behaviour really constitutes gratuitous “attention seeking” or whether it’s out of proportion is very much a subjective judgement.

Emotionally fragile people have nothing to offer anyway

I hope my description above of my own working life has shown that we do have something to offer. I think this belief is due to confusion between “people who are good at software development” and “people who are good at being loud and obnoxious”. If you create a working environment so toxic that 70% of people can’t cope with it and leave, that doesn’t mean you’ve retained the 30% best developers, it means you’ve retained the 30% of people best equipped to thrive in an abusive environment. I see no reason to think there’s going to be much correlation there.

I think a similar argument can be made about the contentious “safe spaces” I’ve written about before. Many of their opponents argue that it’s healthier to be exposed to a diverse range of different points of view rather than living in a bubble. I completely agree, but I disagree about how best to achieve that. A complete free-for-all isn’t necessarily a reliable way to foster open debate – you can easily end up with a situation where the loudest, most abrasive people come to dominate and everyone else is reluctant to express a contrary opinion for fear of being abused and ridiculed. If you genuinely want (and I’m not convinced many of the detractors actually do want this) to hear as wide a range of opinions as possible, you need an environment where everyone feels comfortable expressing themselves.

Maybe if there were unlimited good software developers in the world you could make a case for only working with the emotionally hardy ones and avoiding the possible difficulties of dealing with us “snowflakes”. But there aren’t. In most places developers are highly in demand, so it makes no sense to dismiss people who might actually be able to make a valuable contribution.

It’s not up to us to accommodate your emotional frailties, it’s up to you to get over them

Of all the views expressed in these discussions, I think this is the one that irks me the most. It implies that anyone who reacts badly to harsh words and insults could easily “get over it” if they chose to do so, and that just doesn’t tally with my experience at all.

I’ve spent many decades trying to “get over” the problems I’ve had. I’ve spent a five-figure sum of money on therapy. I’ve read more self-help books than I care to remember and filled notebooks cover-to-cover with the exercises from them. I’ve forced myself into numerous situations that terrified me in the hope that they would be good for me. I’ve practised mindfulness, attended support groups, taken medication, taken up exercise, talked things over with friends and family, spent long hours in painful introspection. You name it, I’ve probably tried it.

And you know what? I’m a lot better than I was. At the start of the process I could barely even hold a conversation with someone unless I knew them well, and I certainly wouldn’t have been able to hold down a job. Now I function reasonably well most days, I do pretty well at work and I have a decent social life as well. But despite all this progress, I’m still pretty emotionally sensitive, and I still don’t cope well with insults and intimidation. Maybe I’ll get even better in the future (I certainly hope to and intend to), but I suspect I will always find that kind of situation unpleasant enough to want to avoid it when possible, even if I no longer find it as debilitating as I once did.

So it makes me pretty angry when people who don’t even know me assume that, because I still get upset more easily than most, I obviously just haven’t tried hard enough. It’s noticeable that these people almost never explain how you should “get over it”. Some of them seem to just assume that if you keep putting yourself in the situation that upsets you then you’ll eventually adjust and be OK with it, but this has never worked particularly well for me – as with the driving lessons example I gave above, it typically just leads to me feeling more stressed and harassed.

Basically, I think this one is an example of the just-world fallacy. It’s uncomfortable to realise that some people might struggle with certain situations through no fault of their own and that there might not be any easy solution open to them. It raises all kinds of awkward questions about whether we should be making adjustments to help them and so on, not to mention the fear of “maybe this could happen to me too some day”. It’s much neater to pretend that those people must have done something to deserve their problems, or at the very least that they must be “choosing” to forego a perfectly good solution.

Whilst I do have a tiny bit of sympathy for some of the objections to the way things are going (I wouldn’t personally relish software development becoming yet another field where social skills and confidence are valued over actual technical ability, for example), overall I find it really hard to take most of the objectors seriously. They moan and whinge about what a disaster it would be to have to treat others with basic civility, then go on to accuse the other side of being over-sensitive and blowing things out of proportion. They heap disdain on people for having problems they never asked for and almost certainly don’t want, but fail to put forward any useful suggestions on how to deal with those problems.

Making the Online Botanic Gardens Station Model (Part 2: The Viewer)

Last time, I talked about how the 3D model itself was made. In this post, I’ll discuss how I embedded it into a web page so it can be explored in a web browser.

Not so long ago, it was difficult or impossible to produce real-time 3D graphics in a web browser, at least if you wanted your page to work in a variety of browsers without requiring any special plug-ins. That’s all changed with the advent of WebGL, which allows the powerful OpenGL graphics library to be accessed from JavaScript running in the browser. WebGL is what’s used to render the Botanic Gardens Station model.

The finished WebGL viewer

There are already a number of frameworks built on top of WebGL that make it easier to use, but I decided I was going to build on WebGL directly – I would learn more that way, as well as having as much control as possible over how the viewer looked and worked. But before I could get onto displaying any graphics, I needed to somehow get my model out of Blender and into the web environment.

I did this by exporting the model to Wavefront OBJ format (a very standard 3D format that’s easy to work with), then writing a Python script to convert the important bits of it to JSON. Initially I had the entire model in a single JSON file, but it started to get pretty big, so I had the converter split it over several files. The viewer loads the central model file when it starts up, then loads the others in the background while the user is free to explore the central part. This (along with a few other tricks, like reducing the number of digits of precision, and omitting the vertex normals and having the viewer calculate them instead) reduces the initial page load time and makes it less likely that people will give up waiting and close the tab before the model even appears.
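
Incidentally, recalculating the normals in the browser sounds like a lot of work but really isn’t. The viewer’s actual code is tangled up with its model format, but the basic technique looks something like this sketch, assuming flat arrays of vertex positions and triangle indices:

```javascript
// Rebuild smooth per-vertex normals from positions and triangle indices.
// positions: Float32Array [x0, y0, z0, x1, y1, z1, ...]
// indices:   Uint16Array or Uint32Array, three entries per triangle
function computeVertexNormals(positions, indices) {
  const normals = new Float32Array(positions.length);
  for (let i = 0; i < indices.length; i += 3) {
    const a = indices[i] * 3, b = indices[i + 1] * 3, c = indices[i + 2] * 3;
    // Two edge vectors of the triangle
    const e1x = positions[b] - positions[a],
          e1y = positions[b + 1] - positions[a + 1],
          e1z = positions[b + 2] - positions[a + 2];
    const e2x = positions[c] - positions[a],
          e2y = positions[c + 1] - positions[a + 1],
          e2z = positions[c + 2] - positions[a + 2];
    // Face normal = cross product. Its length is proportional to the
    // triangle's area, which conveniently weights the average below
    // towards bigger faces.
    const nx = e1y * e2z - e1z * e2y;
    const ny = e1z * e2x - e1x * e2z;
    const nz = e1x * e2y - e1y * e2x;
    for (const v of [a, b, c]) {
      normals[v] += nx;
      normals[v + 1] += ny;
      normals[v + 2] += nz;
    }
  }
  // Normalise the accumulated sums to unit length
  for (let i = 0; i < normals.length; i += 3) {
    const len = Math.hypot(normals[i], normals[i + 1], normals[i + 2]) || 1;
    normals[i] /= len;
    normals[i + 1] /= len;
    normals[i + 2] /= len;
  }
  return normals;
}
```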

How not to convert quads to triangles

Once the model is loaded and processed, it can be displayed. One feature of WebGL is that (in common with the OpenGL ES API used on mobile devices) it doesn’t have any built-in support for lighting and shading – all of that has to be coded manually, in shader programs that are compiled onto the graphics card at startup. While this does increase the learning curve significantly, it also allows for a lot of control over exactly how the lighting looks. This was useful for the Botanics model – after visiting the station in real life, one of my friends observed that photographing it is tricky due to the high contrast between the daylight pouring in through the roof vents and the dark corners that are in the shade. It turns out that getting the lighting for the model to look realistic is tricky for similar reasons.
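
For anyone who hasn’t seen WebGL code before, “compiled onto the graphics card at startup” amounts to a chunk of boilerplate along these lines (a generic sketch rather than the viewer’s exact code):

```javascript
// Compile a vertex/fragment shader pair and link them into a program
// object that can then be selected with gl.useProgram() when drawing.
function buildProgram(gl, vertexSource, fragmentSource) {
  function compile(type, source) {
    const shader = gl.createShader(type);
    gl.shaderSource(shader, source);
    gl.compileShader(shader);
    if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
      throw new Error(gl.getShaderInfoLog(shader));
    }
    return shader;
  }
  const program = gl.createProgram();
  gl.attachShader(program, compile(gl.VERTEX_SHADER, vertexSource));
  gl.attachShader(program, compile(gl.FRAGMENT_SHADER, fragmentSource));
  gl.linkProgram(program);
  if (!gl.getProgramParameter(program, gl.LINK_STATUS)) {
    throw new Error(gl.getProgramInfoLog(program));
  }
  return program;
}
```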

The final model uses four distinct shader programs:

  1. A “full brightness” shader that doesn’t actually do any lighting calculations and just displays everything exactly as it is in the texture images. This is only used for the “heads up display” overlay (consisting of the map, the information text, the loading screen, etc.). I tried using it for the outdoor parts of the model as well but it looked rubbish.
  2. A simple directional light shader. This is what I eventually settled on for the outdoor parts of the model. It still doesn’t look great, but it’s a lot better than the full brightness one.
  3. A spotlight shader. This is used in the tunnels and also in some parts of the station itself. The single spotlight is used to simulate a torch beam coming from just below the camera and pointing forwards. There’s also a bit of ambient light so that the area outwith the torch beam isn’t completely black.
  4. A more complex shader that supports the torch beam as above, but also three other “spotlights” in fixed positions to represent the light pouring in through the roof vents. This is only used for elements of the model that are directly under the vents.

The full brightness shader in all its horrible glory

Although there’s no specular reflection in any of the shaders (I suspect it wouldn’t make a huge difference, as there aren’t many shiny surfaces in the station), the two with the spotlights are still quite heavyweight – for the torch beam to appear properly circular, almost everything has to be done per-pixel in the fragment shader. I’m not a shader expert so there’s probably scope for making them more efficient, but for now they seem to run acceptably fast on the systems I’ve tested them on.
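
To give a flavour of what “per-pixel” means in practice, a heavily stripped-down version of the torch beam calculation might look something like the fragment shader below. This is a sketch of the general technique rather than the model’s actual shader, and the uniform names are illustrative:

```javascript
// Per-pixel spotlight ("torch beam") fragment shader, as a GLSL string
// ready to be compiled. vPosition is the fragment's position in eye
// space, interpolated from the vertex shader; because the cone test
// runs once per pixel, the edge of the beam stays properly circular.
const torchFragmentSource = `
  precision mediump float;
  varying vec3 vPosition;
  uniform vec3 uLightPos;   // torch position, just below the camera
  uniform vec3 uLightDir;   // direction the torch points (normalised)
  uniform float uCutoff;    // cosine of the beam's half-angle
  uniform vec3 uBaseColour; // surface colour (sampled from a texture
                            // in a real shader)

  void main() {
    vec3 toFragment = normalize(vPosition - uLightPos);
    // Inside the cone when the angle to the beam axis is small enough,
    // i.e. when the dot product exceeds the cutoff
    float spot = dot(toFragment, uLightDir);
    float beam = smoothstep(uCutoff, uCutoff + 0.02, spot);
    float ambient = 0.15; // so areas outwith the beam aren't pitch black
    gl_FragColor = vec4(uBaseColour * (ambient + beam), 1.0);
  }
`;
```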

Can’t see the wood or the trees

In Part 1, I mentioned that the trees weren’t modelled in Blender like the rest of the model was. I considered doing this, but realised it would make the already quite large model files unacceptably huge. (Models of organic things such as plants, animals and humans tend to require far more vertices and polygons to look any good than models of architecture do). Instead I chose to implement a “tree generator” in JavaScript – so instead of having to save all of the bulky geometry for the trees to the model file, I could save a compact set of basic parameters, and the geometry itself would be generated in the browser and never have to be sent over the internet.
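
Just to illustrate the “compact parameters in, bulky geometry out” idea (the real generator, described below, is far more sophisticated), a toy recursive branch generator might look like this. The parameter names are made up for the sketch:

```javascript
// Toy tree generator: a handful of numbers in, a pile of geometry out.
// Returns branch segments as pairs of [x, y, z] points, which could then
// be turned into cylinders or line geometry for rendering.
// `rand` is any 0..1 random function; pass a seeded PRNG to get the same
// tree every time.
function growTree(params, rand) {
  const segments = [];
  function branch(x, y, z, dx, dy, dz, length, level) {
    const ex = x + dx * length, ey = y + dy * length, ez = z + dz * length;
    segments.push([[x, y, z], [ex, ey, ez]]);
    if (level >= params.levels) return;
    for (let i = 0; i < params.branchesPerLevel; i++) {
      // Perturb the parent's direction randomly, with a slight upward
      // bias so the tree doesn't droop
      const ndx = dx + (rand() - 0.5) * params.spread;
      const ndy = dy + rand() * 0.5;
      const ndz = dz + (rand() - 0.5) * params.spread;
      const n = Math.hypot(ndx, ndy, ndz) || 1;
      branch(ex, ey, ez, ndx / n, ndy / n, ndz / n,
             length * params.lengthRatio, level + 1);
    }
  }
  branch(0, 0, 0, 0, 1, 0, params.trunkLength, 0); // trunk grows straight up
  return segments;
}

// e.g. growTree({ levels: 4, branchesPerLevel: 3, spread: 1.2,
//                 lengthRatio: 0.6, trunkLength: 5 }, Math.random);
```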

A Black Tupelo with no leaves

The generator is based on the well-known algorithm described in this paper. It took me weeks to get it working right and by the end I never wanted to see another rotation matrix again as long as I lived. I wouldn’t be surprised if it fails for some obscure cases, but it works now for the example trees in the paper, and produces trees for the Botanics model that are probably better looking than anything I could model by hand. I didn’t mean to spend so much time on it, but hopefully I’ll be able to use it again for future projects so it won’t have been wasted time.

A Black Tupelo with leaves

(Blender also has its own tree generator based on the same algorithm, called Sapling. I didn’t use it as it would have caused the same file size problem as modelling the trees manually in Blender would).

Spurred on by my success at generating the trees programmatically (eventually!), I decided to apply a similar concept to generating entire regions of woodland for the cutting at the Kirklee end of the tunnel. Given a base geometry to sprout from and some parameters to control the density, the types of trees to include, etc., the woodland generator pseudo-randomly places trees and plants into the 3D world, again only requiring a compact set of parameters to be present in the model file.
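
The key trick here is using a seeded pseudo-random number generator, so that the same parameters always grow exactly the same woodland on every visitor’s machine. Here’s a sketch of the idea, using mulberry32 (a well-known tiny seedable PRNG); the field names are illustrative rather than the model file’s actual format:

```javascript
// mulberry32: a tiny, fast seedable PRNG returning values in [0, 1)
function mulberry32(seed) {
  return function () {
    let t = (seed += 0x6D2B79F5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Scatter trees over a rectangular region. Only `region`, `density`,
// `treeTypes` and `seed` need to travel over the network; the tree
// positions themselves are reproduced in the browser.
function scatterWoodland(region, density, treeTypes, seed) {
  const rand = mulberry32(seed);
  const trees = [];
  const count = Math.floor(region.width * region.depth * density);
  for (let i = 0; i < count; i++) {
    trees.push({
      x: region.x + rand() * region.width,
      z: region.z + rand() * region.depth,
      type: treeTypes[Math.floor(rand() * treeTypes.length)],
      scale: 0.8 + rand() * 0.4, // a little size variation
    });
  }
  return trees;
}
```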

The viewer also contains a texture overlay system, which is capable of adding graffiti, dirt, mineral deposits or whatever else to a texture after it’s been downloaded. This is achieved by having a second, hidden HTML5 canvas on the page, on which the textures are composited before being sent to the GPU. (The same hidden canvas is also used for rendering text before it’s overlaid onto the main 3D view canvas, since the 2D text printing functions can’t be used directly on a 3D canvas.)

Why not just have pre-overlaid versions of the textures and download them along with the other textures? That would work, but would increase the size of the data needing to be downloaded: if you transferred both graffiti’d and non-graffiti’d versions of a brick wall texture (for example), you’d be transferring all of the detail of the bricks themselves twice. Whereas if you create the graffiti’d version in the browser, you can get away with transferring the brick texture once, along with a mostly transparent (and therefore much more compressible) file containing the graffiti image. You also gain flexibility as you can move the overlays around much more easily.
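
The compositing step itself is only a few lines of standard canvas and WebGL calls – conceptually something like this sketch (with illustrative names again):

```javascript
// Composite an overlay (graffiti, dirt, ...) onto a base texture image
// using a hidden 2D canvas, then upload the result to the GPU.
function makeOverlaidTexture(gl, baseImage, overlayImage) {
  const canvas = document.createElement('canvas'); // never added to the page
  canvas.width = baseImage.width;
  canvas.height = baseImage.height;
  const ctx = canvas.getContext('2d');
  ctx.drawImage(baseImage, 0, 0);
  // The overlay is mostly transparent, so this only changes a few pixels
  ctx.drawImage(overlayImage, 0, 0, canvas.width, canvas.height);

  const texture = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, canvas);
  gl.generateMipmap(gl.TEXTURE_2D); // assumes power-of-two dimensions
  return texture;
}
```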

A selection of the station model’s many items of graffiti

The rest of the code is reasonably straightforward. Input is captured using standard HTML event handlers, and the viewpoint moves through the model along the same curve used to apply the curve modifier in Blender. Other data in addition to the model geometry (for example the information text, the parameters and positions for the trees, etc.) is incorporated into the first JSON model file by the converter script so that it can be modified without changing the viewer code.
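
Moving the viewpoint along the curve just means interpolating between the curve’s points by distance travelled. Assuming the curve arrives as a flat list of 3D points, a minimal version looks something like this:

```javascript
// Walk a piecewise-linear curve and return the interpolated position
// after travelling `dist` units along it from the start.
function pointAlongCurve(points, dist) {
  for (let i = 0; i < points.length - 1; i++) {
    const [ax, ay, az] = points[i];
    const [bx, by, bz] = points[i + 1];
    const segLen = Math.hypot(bx - ax, by - ay, bz - az);
    if (dist <= segLen) {
      const t = segLen > 0 ? dist / segLen : 0; // fraction along this segment
      return [ax + (bx - ax) * t, ay + (by - ay) * t, az + (bz - az) * t];
    }
    dist -= segLen;
  }
  return points[points.length - 1].slice(); // past the end: clamp to the last point
}
```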

So that’s the viewer. Having never used WebGL and never coded anything of this level of complexity in JavaScript before, I’m impressed at how well it actually works. I certainly learned a lot in the process of making it, and I’m hoping to re-use as much of the code as possible for some future projects.