Teaching machines to think

At this moment, your brain is being bombarded with data. It comes not just from the conventional five senses, but also from signals about your balance, hunger, the positions of your limbs and the urgency of your next bowel movement.

To cope with this assault, the brain must be ruthless in its compression. For instance, light signals are summarised by the eye into abstract descriptions of shape, colour and movement before they even reach the brain. What we perceive is inferred from these hints, with obvious gaps filled in by memory and intuition.

Of course, in the rush to discard unimportant information the brain can make the occasional mistake. It can be as innocuous as a misread word or a face glanced over at a party, or as serious as a cyclist slipping past at a junction.

What happens when we turn over such decisions of salience to our computers? Machine learning techniques have been studied for decades, and their proponents are equipping computers with the ability to learn strategies for decision-making from input data, much as humans do.

And just as the brain seeks to summarise its input, half the battle comes in choosing features of the data to feed our algorithm. For instance, let's say we have a four-legged mammal. Knowing the sound it makes will help us decide whether it is a cat or a dog, but knowing if it has a wet nose will not.
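To make the cat-and-dog example concrete, here is a minimal Python sketch. The six animals and the scoring rule are invented purely for illustration. For each feature, it guesses the most common species among animals sharing that feature value, then counts how often the guess is right.

```python
from collections import Counter, defaultdict

# Hypothetical toy data, invented for illustration: (sound, wet_nose, species)
ANIMALS = [
    ("meow", True, "cat"),
    ("meow", True, "cat"),
    ("meow", True, "cat"),
    ("woof", True, "dog"),
    ("woof", True, "dog"),
    ("woof", True, "dog"),
]


def accuracy_using(feature_index):
    """Guess the most common species for each feature value, then score the guesses."""
    species_by_value = defaultdict(Counter)
    for row in ANIMALS:
        species_by_value[row[feature_index]][row[-1]] += 1
    # For each feature value, the majority guess is right this many times.
    correct = sum(counts.most_common(1)[0][1] for counts in species_by_value.values())
    return correct / len(ANIMALS)


sound_accuracy = accuracy_using(0)  # the sound separates cats from dogs
nose_accuracy = accuracy_using(1)   # every animal has a wet nose, so this is a coin toss

print(f"sound: {sound_accuracy:.0%}, wet nose: {nose_accuracy:.0%}")
```

On this toy data the sound alone gets every animal right, while the wet nose does no better than guessing, because it is the same for both species.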

Google has recently introduced Priority Inbox, which uses simple statistical techniques to rank the messages that arrive at your inbox in order of importance. Their servers crunch your behaviour into fine-grained features such as whether you generally reply to emails from your mother, or how often you open messages that contain the word "Viagra". Google can then check incoming messages for indicators of how likely you are to read or reply to them.

In some cases, the cost of a false positive (a message from your mum being classified as spam) is rather low. But machine learning has been applied in the field of medicine, to the point where some systems for the classification of skin blemishes can outperform human doctors. In such cases, a false negative (allowing someone with a serious condition to remain undiagnosed) has a much greater cost.
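One way to picture how unequal costs change a decision is to weigh each possible mistake by how likely it is. The sketch below is an illustration only, not a description of any real diagnostic system: the function name, the probability and the cost figures are all made up.

```python
def should_flag(probability_serious, cost_false_negative, cost_false_positive):
    """Flag a case when ignoring it carries the higher expected cost.

    All three inputs are hypothetical numbers chosen for illustration.
    """
    # If we ignore the case, we pay the false-negative cost when it was serious.
    expected_cost_if_ignored = probability_serious * cost_false_negative
    # If we flag the case, we pay the false-positive cost when it was harmless.
    expected_cost_if_flagged = (1 - probability_serious) * cost_false_positive
    return expected_cost_if_flagged < expected_cost_if_ignored


# A blemish judged only 10% likely to be serious.
equal_costs = should_flag(0.1, cost_false_negative=1, cost_false_positive=1)
medical_costs = should_flag(0.1, cost_false_negative=100, cost_false_positive=1)

print(equal_costs, medical_costs)
```

With equal costs the unlikely case is let through. Once a missed diagnosis is treated as a hundred times worse than a needless referral, the same 10% chance is enough to raise the alarm.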

When we seek to emulate our human capacity for learning, we must be mindful of our human capacity for mistakes. We are perhaps the most advanced system for learning in the universe, and yet we require decades of experience to reach a level where our expert judgements can be trusted. Machine learning is becoming more widespread, from supplying your film recommendations to recognising your handwriting, and increasingly in helping financial institutions tease out trends from market data. In situations where the cost of misjudgement is significant, who is to blame if the machine makes a mistake?

Rhodri is a full-time software engineer for Softwire Ltd, and records next-generation music as Uther Moads.