The human brain is exceptionally good at many things: conscious thought, emotion, memory, control of movement, not to mention the five senses that let us take in the world around us. It’s also good at detecting patterns in the countless signals that come in through those senses. But the brain isn’t perfect, and some problems are so complex that they are better suited to computers than to people. This is the realm of machine learning.
“Machine learning” is a bit of a misnomer. To the uninitiated, the term conjures up images of R2-D2, Wall-E, or HAL. In reality it’s just as powerful, but doesn’t have quite the same star power: it refers to any systematic way of finding patterns or similarities in data, usually for the purpose of making predictions. There are lots of different ways to do this, ranging from techniques you can do by hand (machine learning with pencil and paper!) to computationally intensive algorithms that tax even the most powerful supercomputer.
Machine learning algorithms come in different flavors. One commonly used class of algorithms is known as “supervised” algorithms, because the learning is guided by observations with known outcomes—these are treated as the “ground truth.”
For example, to create a model for predicting who will win an NFL football game, you might start by looking at games played over the past decade, where you already know who won. You would then create a number of “features”—signals for each game that might be useful for predicting a win or a loss, such as a team’s average points scored per game, statistics about its offense and defense, its opponent, and so on.
If you were only interested in predicting who would win a game in the future, you would then use an algorithm to create a model for “classification”—simply taking all of the features for a new game and mapping them onto either a win or a loss. If you cared about predicting the final score, or how much a team would win or lose by, you’d use an algorithm for “regression,” and your output would be a real number (like “win by 7 points”) rather than just a classification (“win”).
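The classification and regression ideas above can be sketched in a few lines of plain Python. Everything here is invented for illustration, not a real NFL model: the historical games, the features (average points scored and allowed, average point differential), and the choice of a nearest-neighbor rule for classification and a least-squares line for regression are all assumptions made for the sake of the example.

```python
import math

# Hypothetical historical games: features are (avg points scored, avg points
# allowed) for the team in question; all numbers are invented for illustration.
history = [
    ((28.0, 17.0), "win"),
    ((30.5, 20.0), "win"),
    ((17.5, 27.0), "loss"),
    ((20.0, 24.5), "loss"),
]

def classify(features):
    """Classification: map a new game's features onto "win" or "loss".

    The model here is 1-nearest-neighbor: copy the outcome of the most
    similar past game."""
    _, label = min(history, key=lambda row: math.dist(row[0], features))
    return label

def predict_margin(point_diff):
    """Regression: predict a real-numbered point margin instead of a label.

    A least-squares line fit to (avg point differential, final margin)
    pairs, again with invented numbers."""
    xs = [11.0, 10.5, -9.5, -4.5]   # avg point differential per past game
    ys = [10.0, 7.0, -13.0, -3.0]   # final margin of that game
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my + slope * (point_diff - mx)

print(classify((29.0, 18.0)))       # prints: win
print(round(predict_margin(11.0)))  # prints: 9  ("win by 9 points")
```

The two functions share the same inputs (features of an upcoming game) but differ in their output type, which is exactly the classification/regression distinction: a label versus a real number.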
Another common type of algorithm is known as “unsupervised,” and the name is apt: unlike its supervised counterpart, it doesn’t use known outcomes to guide the learning. Clustering algorithms are one example. They take features from many different observations and simply look for similarities that make certain observations group together.
If you were to take physical measurements of people—height, weight, etc.—and apply a clustering algorithm, you might find that two clusters emerge, one for men and one for women. This is because women tend to be more similar to other women in their physical measurements, and men more similar to other men.
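The height-and-weight example can also be sketched in plain Python, here as a bare-bones k-means clusterer with two clusters. The measurements are made up, and the two starting centers are seeded deterministically so the demo is reproducible; note that the algorithm is never told anything about sex, yet the two groups it finds line up with it.

```python
# Hypothetical (height in cm, weight in kg) measurements, invented for
# illustration. The algorithm receives no labels of any kind.
people = [
    (162, 55), (158, 50), (165, 58), (160, 53),
    (180, 82), (185, 90), (178, 78), (183, 85),
]

def kmeans(points, iters=20):
    """Bare-bones k-means: alternately assign each point to its nearest
    center, then move each center to the mean of its assigned group."""
    centers = [points[0], points[-1]]  # two starting centers (k = 2),
    k = len(centers)                   # chosen deterministically for the demo
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sum(
                (a - b) ** 2 for a, b in zip(p, centers[i])))
            groups[nearest].append(p)
        centers = [tuple(sum(dim) / len(g) for dim in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return groups

for group in kmeans(people):
    print(group)  # two groups of four: shorter/lighter and taller/heavier
```

A production implementation would pick starting centers at random (restarting several times), but the alternating assign-and-average loop is the whole of the idea.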
The perk of using unsupervised algorithms is that they can reveal underlying structure in a data set (like the presence of two different sexes) even when you didn’t know you should be looking for it. Often, finding clusters when you didn’t expect them is a great motivation for doing further research to find out the meaning of the groupings.