Lockdown Blog 2: Training an AI to recognise the Simpsons

Introduction

I’ve been meaning to have a proper play around with modern artificial intelligence techniques for a while, and lockdown, with more time on my hands, seemed like a good time to give it a go. So I trained a deep learning neural network to recognise characters from The Simpsons. (As you do).

This was actually my third foray into neural networks: I used one (not very successfully) for my final year university project way back in the mists of time, and I also experimented with training one to generate text last year. (Among other things, I fed it megabytes of text from my diary and got it to generate its own diary entries based on them, which was pretty hilarious if not particularly useful). But this was the first time I’ve attempted to use one in what’s probably their biggest application area, namely computer vision and image recognition.

I thought recognising Simpsons characters would be a good way to get started with this, for several reasons. Firstly, I really like the Simpsons (or at least I did until it all went downhill in the late 90s or so). Secondly, it was relatively easy to get hold of large numbers of Simpsons images for training and testing the network (more on that in a moment). And thirdly, because cartoon characters look so distinctive, it would be easier to get a computer to tell them apart than it would be with (for example) real people.

Before I go any further I’d like to give a shout out to the fantastic Practical Deep Learning For Coders course made by the developers of fast.ai. I watched all the course videos a few months ago and found them incredibly interesting and inspiring, with the rare combination of being instantly accessible while also going into the subject in great depth. As an illustration of what I mean: after the first half hour or so of the opening lecture you’re already up and running with training a classifier to tell different cat and dog breeds apart, while the second half of the course delves right into the code, explaining it right down to a line-by-line analysis of the algorithms that make up a neural network. Highly, highly recommended for anyone who knows how to code and is at all interested in AI.

Preparing the data

The first step in building a deep learning model is getting together some data that you can use for training and testing the neural network. In my case, I needed as many images from The Simpsons as I could get hold of, and I also needed to “tag” them (or at least most of them) with the names of the characters that appeared in them.

I decided to write a Python script that would download random images from Frinkiac, which is basically a Simpsons screen grab search engine, often used for making memes and so on. I felt a bit bad as it probably wasn’t intended for this usage, but in my defence I was quite gentle with it – I left my script running over a period of days, grabbing a single image at a time and then sleeping for a while so as not to hammer the site’s bandwidth. By the end of this process I had a completely random selection of around 3,000 screen captures from the first 17 seasons of the show sitting on my hard drive.
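The downloader itself only needs a few lines. Here’s a minimal sketch of the sort of script I mean – note that the URL format shown is my own assumption for illustration, not taken from any documented Frinkiac API:

```python
import time
import urllib.request

# ASSUMPTION: this URL format is a guess for illustration only --
# it is not taken from any documented Frinkiac API.
FRINKIAC = "https://frinkiac.com"

def image_url(episode: str, timestamp: int) -> str:
    """Build the (assumed) URL of a single screen capture."""
    return f"{FRINKIAC}/img/{episode}/{timestamp}.jpg"

def grab_one(episode: str, timestamp: int, out_dir: str = "screencaps") -> None:
    """Download one image, then pause so as not to hammer the site."""
    url = image_url(episode, timestamp)
    urllib.request.urlretrieve(url, f"{out_dir}/{episode}_{timestamp}.jpg")
    time.sleep(60)  # be gentle: at most one image per minute

# Usage (not run here): grab_one("S05E12", 123456)
```

The `time.sleep` between requests is the important bit – it keeps the load on the site to a trickle, at the cost of the whole download taking days.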

The next step was to “tag” these with the names of the characters that appeared in them. You might wonder why I had to do this… after all, my aim was to get the computer to identify the characters automatically, not to have to do it myself, right? Well yes, but in order to train a neural network to perform this sort of recognition task, you need to give it “labelled” data – that is, you show it an image along with a label describing what’s in it, in much the same way you might train a person to recognise characters they weren’t previously familiar with.

I wasn’t looking forward to this bit as I knew it would take quite a lot of time-consuming manual work – I was going to have to look at every image myself, identify the characters present, then enter that information into the computer somehow. To ease the pain, I built a little web app to make this process as fast as possible. It showed me the images in turn, allowing me to tag each one and move on to the next with a minimum of key presses, writing the image names and tags into a CSV file that I could use with the AI software later on. In all I think it took me maybe an hour to write the web app and about 2 hours to tag the images, which wasn’t as bad as I’d feared.
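The CSV itself can be as simple as one row per image, with all the tags in a single space-separated column. A small sketch of the idea (the filenames and column names here are hypothetical, not the ones my app actually used):

```python
import csv

# Hypothetical rows of the kind the tagging app might record:
# one image filename plus a space-separated list of character tags.
rows = [
    ("S05E12_123456.jpg", "homer marge"),
    ("S07E01_654321.jpg", "bart"),
]

with open("labels.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["image", "tags"])
    writer.writerows(rows)

# Reading it back gives a mapping from filename to tag list.
with open("labels.csv", newline="") as f:
    labels = {r["image"]: r["tags"].split() for r in csv.DictReader(f)}
```

Keeping multiple tags in one column like this matters because an image can contain several characters at once.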

Initially I had planned to train the network to recognise all the named characters in the show, but I later realised I probably didn’t have enough data for this – some of the more minor characters only showed up a handful of times in my training images, not really enough to make the recognition reliable. So instead I decided to focus on just the four main characters: Homer, Marge, Bart and Lisa.
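Cutting the tag set down is just a one-off filtering pass over the labels. Something along these lines (a sketch, assuming the filename-to-tags mapping produced by the tagging step):

```python
# The four characters with enough training examples to be reliable.
MAIN_CAST = {"homer", "marge", "bart", "lisa"}

def filter_labels(labels):
    """Keep only main-cast tags; drop images containing none of them."""
    filtered = {}
    for image, tags in labels.items():
        kept = [t for t in tags if t in MAIN_CAST]
        if kept:  # images with only minor characters contribute nothing
            filtered[image] = kept
    return filtered
```

Images showing only minor characters get dropped entirely, since they carry none of the labels the network is being trained on.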

Training the model

Once I had the tagged training data ready, I turned my attention to actually training a neural network on it. I used the same software as the fast.ai course I mentioned above, namely fast.ai itself (which is built on PyTorch), with the code written in the form of a Jupyter Notebook for easy experimentation. I used a ResNet34, a classic architecture for image recognition, though I also tried a larger ResNet50 to see if it worked any better (it didn’t). Training (on my GeForce 1050Ti) only took about 5 minutes, after which I was able to play with the resulting model, testing it on images it hadn’t seen before.
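For anyone curious what this looks like in code, the whole training setup fits in a handful of lines with fastai. This is a sketch rather than my exact notebook – the CSV column layout is the hypothetical one from the tagging step, and the precise API details vary between fastai versions:

```python
from fastai.vision.all import *

# Assumes a labels.csv with "image" and "tags" columns
# (tags space-separated), as produced by the tagging step.
df = pd.read_csv("labels.csv")
dls = ImageDataLoaders.from_df(
    df, path="screencaps",
    label_delim=" ",   # multi-label: several characters per image
    item_tfms=Resize(224),
)

# A ResNet34 pretrained on ImageNet, fine-tuned on the screen grabs.
learn = cnn_learner(dls, resnet34, metrics=partial(accuracy_multi, thresh=0.5))
learn.fine_tune(4)

# Per-character probabilities for an unseen image.
labels, _, probs = learn.predict(PILImage.create("new_frame.jpg"))
```

Starting from a network pretrained on ImageNet is what makes the five-minute training time possible – only the final layers need to learn anything Simpsons-specific at first.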

Overall, I was reasonably happy with it, for a first attempt. It worked very well indeed (almost perfectly) for images that included a reasonably close shot of one of the characters’ faces. For example:

Prediction: 99.78% Bart
Prediction: 99.01% Marge
Prediction: 98.98% Lisa
Prediction: 99.99% Homer

(You may notice that the model doesn’t just give a straight yes or no prediction, but a percentage score indicating how confident it is that each character does appear in the image).
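Those independent percentages are characteristic of a multi-label setup: each character gets its own output, squashed through a sigmoid, rather than one softmax shared across all of them – which is why the scores don’t need to add up to 100% and several characters can score highly at once. A minimal illustration (the raw output values are made up):

```python
import math

def sigmoid(x):
    """Squash a raw network output (logit) into a 0-1 confidence."""
    return 1.0 / (1.0 + math.exp(-x))

# Made-up raw outputs for one image, one per character.
logits = {"homer": 6.1, "marge": 0.2, "bart": -2.4, "lisa": 5.0}

# Each character gets an independent confidence, so several
# can be near 100% at once and the scores needn't sum to 100%.
confidences = {name: round(100 * sigmoid(z), 2) for name, z in logits.items()}
```

This is also why, in the examples below, the model can be 97% sure about Bart and 97% sure about Homer in the same image without any contradiction.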

The model doesn’t work so well for more complicated situations such as characters being partially hidden, characters viewed from an unusual angle, characters wearing unusual clothing (especially clothing that covers up some of their distinctive features), characters far away in the distance so that they appear very small in the image, and so on. Below are some examples where it doesn’t make such a confident prediction, and my speculation as to why that might be.

Prediction: 35.4% Marge. The model thinks it’s more likely that Marge is in the image than any of the other characters (who all scored likelihoods of less than 10%), but still isn’t very confident, probably because she’s in a slightly unusual position and has her head turned away.

Prediction: 54.23% Homer. The model thinks there’s a decent chance Homer is in this image, but isn’t very sure, probably because only the top of his face is visible in this one.

Prediction: 99.88% Lisa, 11.55% Bart. The model is very certain Lisa is here, but nowhere near as confident about Bart. I think this is probably because Bart is partially hidden behind Lisa and Maggie, while Lisa is fully visible.

Prediction: 97.23% Bart, 97.49% Homer, 88.81% Lisa, 68.79% Marge. This time the model correctly identifies that all four characters are in the image, but it’s significantly less certain about Marge than the others, probably because her face is obscured behind Bart.

Prediction: 87.75% Bart, 53.09% Homer, 94.77% Lisa, 39.25% Marge. In this shot, all the characters are present but not in their usual clothing. Bart and Lisa are recognised with a high degree of confidence, but the model is understandably not so confident about Homer, since only the top of his face and head is visible. Surprisingly, it’s even less confident about Marge, maybe because her trademark hair is mostly hidden from view.

Prediction: 98.25% Homer, 74.86% Marge. The model is a lot less confident about Marge than Homer, presumably because Homer’s glass is obscuring most of her face.

Prediction: 91.71% Homer, 93.39% Marge, 67.27% Lisa. Homer and Marge are recognised with more than 90% certainty as expected. Interestingly, the model also thinks that Lisa is probably here, I’m guessing because Maggie looks very similar to Lisa in some ways, notably her hair and eyes.

So that’s my model. I have no doubt at all that it could be done much better by someone with more expertise (or, for that matter, a better training data set), but as someone who started programming back in the days when it would have been unimaginable for a computer to do this, it’s amazingly cool to see it working even as well as it does.

Can I play with it?

I’d like to find out how to make models like this available online for people to have a go of, but I’m not there yet. I’m new to all this and don’t want to end up overloading my web host, or running up a huge bill if I go down the cloud hosting route, so I’d definitely want to do some research and testing before attempting it.

Lockdown Blog 1

I haven’t posted on here in a while… I think it’s fair to say that, back in the now very innocent-seeming days of mid-2019, I did not expect my next post to be written from a country in full lockdown, forbidden from leaving our homes except for a few very specific reasons. I don’t think anyone else saw it coming either.

As I write this, we’ve been in lockdown for just over a week, and I personally have been working from home for just over two weeks. I should acknowledge right from the start that I’m in a pretty fortunate position compared to a lot of people: no-one close to me seems to have got the virus yet, I have a pleasant and secure place to spend lockdown with people I love, and I’m relatively safe from the financial effects of all this as well. I fully support the lockdown and I know that people suffering from the virus and those on the front line of treating it are much worse off than I am.

That said, I also think it’s important to acknowledge that this is an unprecedented upheaval for almost all of us, and that it’s clearly going to affect everyone one way or another. So I think it’s completely legitimate to talk about how it might affect our mental health and what might be good coping strategies, even for those of us not on the front line.

Speaking for myself, a few weeks ago when it started to become clear what was about to happen, I was utterly dreading it. Probably not an uncommon reaction, but I had particular reason to be worried. As I’ve mentioned in previous posts, I’ve spent a lot of my life (almost the whole of the first 15 years of adulthood, in fact) living with clinical depression and anxiety. Eventually I managed to get this mostly under control, but the only way I ever found to keep the depression at bay was to keep doing lots of exciting things to keep my mood up: folk dancing, solo foreign travel, urban exploration, taking part in the Beltane Fire Festival, and so on.

I’ve pretty much spent the last several years making sure I always had a few of those things lined up to look forward to within the next few months, and it has made a massive difference: to put it bluntly, the difference between life feeling worth living, and… well… not worth living. So hearing the news that none of these activities were going to be possible for several months, quite likely not for the whole of this year, I didn’t know what I was going to do. I felt as if, after years of thinking I’d never be able to walk again, I’d finally learned to hobble around with the aid of crutches, only to now be told I wasn’t allowed to use the crutches for the next several months. And I really, REALLY didn’t want to go back to the way I used to live my life before I found the crutches.

After that initial panic was over with, though, I feel like I’ve settled into a new routine a bit better than expected. It actually reminds me a bit of two previous periods of my life, in some ways at least. One is the time 18 months ago when my son had just been born and I was off work on paternity leave. All the usual rules and day-to-day routine just went out the window and suddenly there was only one objective: to survive each new day as it came. I don’t mind admitting that I had some pretty dark thoughts at times during those early weeks, wondering whether I’d ever get a decent (or even adequate) night’s sleep again, wondering whether I’d ever get my life back, whether I’d ever be able to do the adult things I’d once so enjoyed again or whether I was destined to sacrifice everything for this tiny little new person for evermore. (In the end it was nowhere near as bad as I feared and, while some stuff obviously has changed, I was back to sleeping OK and back to doing most of my activities within a few months).

The other period this reminds me of is when I was a child myself, in the sense that my horizons seem to have suddenly and drastically shrunk back to nearly where they were back then. As a child, almost my whole life revolved around my home, my school a short walk away, and nearby places like the shops and the green spaces where we would walk our dog. Going anywhere further afield, like to visit extended family, go on holiday, go for a walk in the countryside or even go into the city centre felt like a rare special treat in comparison. As for going abroad, I’d never been at all.

After I became an adult, the world seemed to open up: the city centre became somewhere I would go every day for work and often for multiple nights out per week; I would go for frequent weekends away, sometimes as many as two or three in the same month; everywhere within an hour or two’s drive could be visited on a whim just by jumping in the car on a day off; and I would go abroad, either for work or pleasure, anything up to four or five times a year. That became the new normal for me. Now it’s abruptly reversed and I’m suddenly back in that closed, parochial world of childhood again, only even more so this time.

Whilst neither of those past experiences were quite like what’s happening now, I feel I did learn some stuff from them that might help in the present. I’m trying to view the current situation very much like I viewed the early days of fatherhood: focussing on surviving one day at a time, not worrying about anything bar the essentials, and trying to keep the faith that things will go back to normal eventually. I’m also trying to remember the habits that got me through spending so much time in or near the house back in my teens: enjoying music, TV, movies and video games, being creative, and looking forward to the fun stuff I can do in future when the opportunities arise. I’ll probably write some more entries about specific things I’m doing (I’ve already got a few ideas) over the coming days and weeks.

Of course, it’s a bit hard to look forward to fun stuff in the future when we have absolutely no idea how long this is all going to go on for. I find myself really hoping that the government are going to follow the “hammer and dance” strategy set out in this very informative article, because that would mean only a few weeks of strict lockdown, followed by relaxing some restrictions and applying some more targeted and intelligent measures instead. But it’s hard to tell from the briefings whether this is their intention, and I’m not qualified to judge whether it’s even a viable plan at all. So I’m trying to prepare myself for the possibility that we might be locked in for much longer than that.