What is Machine Learning?

This post follows my previous post, Getting Started with Machine Learning. In this post I will try to answer the question “What is Machine Learning?” from the perspective of a software developer. So, with some help from the book “An Introduction to Statistical Learning” and without too much math, I will try to define what machine learning is.

If your focus is on applying Machine Learning, it pretty much comes down to predicting things.

So in essence it boils down to a function that takes a set of input values and returns a value, the prediction. The mathematicians write it like this:

Y = f(X)

Where X is all the inputs, f is the function and Y is the predicted value. That’s all there is to it…

Let’s take a small example that most people can relate to. Let’s say we want to predict the wage of a number of people. To be able to make a reasonable prediction we need some information about the people, like their seniority, their education and where they live. As a programmer, the challenge is to write a function that takes a set of inputs (i.e. a person’s seniority and education plus the zip and country code of where they live) and returns the person’s predicted yearly income. If you think back to your early days in school you might remember this function:

Y = a·X + b

This is the function for a straight line. Let us take this function as our very first and very naive attempt at a wage prediction. If X is the seniority (i.e. the number of years a person has been working), then the challenge is to figure out the values of a and b. Figuring out a and b is done during the training step of setting up a prediction function with machine learning. I will get back to the training step in a later post. To get the function a little bit closer to our wage example, it would look like this:

Y = a·seniority + b·education + c·zip + d·country + e
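As an aside, here is a minimal Python sketch of the simpler one-input version, Y = a·X + b. The names fit_line and predict_wage and the sample numbers are made up for illustration. The least-squares fit shown here is only one possible way to find a and b, not necessarily what a machine learning library does internally.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Find a and b for Y = a*X + b using ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much X and Y vary together, and how much X varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    a = covariance / variance
    b = mean_y - a * mean_x
    return a, b


def predict_wage(seniority_years: float, a: float, b: float) -> float:
    """The prediction function itself: a straight line."""
    return a * seniority_years + b


# Made-up training data: years of seniority and yearly wage.
seniority = [0.0, 5.0, 10.0]
wages = [30000.0, 40000.0, 50000.0]

a, b = fit_line(seniority, wages)
print(predict_wage(7.0, a, b))  # prints 44000.0
```

With this invented data the fit gives a = 2000 and b = 30000, so every extra year of seniority adds 2000 to the predicted wage.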

In fact, this type of function is known as “Linear Regression” in machine learning. As you might imagine there are many different types of functions, but more on that in a later post. First let us understand the function above a little better. If you were given the values of a, b, c, d and e, how would you write the function? The first part, “a times seniority”, is easy, but what about the next part? Some value times education, e.g. 2.23 times M.CS., doesn’t really make sense. Let us examine the data types of the input values.

What are the possible input values to a Machine Learning function?

From a computer science point of view there are 3 different types of input variables to a machine learning function: Numbers, Booleans and Enumerations. Numbers are types like int, long, decimal and float. Booleans are True/False or 1/0 as any programmer would expect.

Enumerations

Enumerations are fundamentally values of type enum to a C# or Java programmer. In reality these values are often of type string, but inside the prediction function they are treated as enums. In our wage example education, zip and country code are examples of this type. Exactly how an enum is used internally in a prediction function is not important from a computer science point of view.
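For the curious, here is a minimal Python sketch of one common way an enum can be turned into numbers, often called one-hot encoding: each possible value gets its own 0/1 input with its own weight. This is an assumption for illustration, not a claim about how any particular library does it. The Education values, the weights and the function names are all made up.

```python
from enum import Enum
from typing import List


class Education(Enum):
    # Hypothetical set of education levels.
    HIGH_SCHOOL = "High school"
    BSC = "B.Sc."
    MCS = "M.CS."


def one_hot(value: Education) -> List[float]:
    """Return one 0/1 number per enum member, with 1.0 for the given value."""
    return [1.0 if member is value else 0.0 for member in Education]


def predict_wage_with_education(
    seniority_years: float,
    education: Education,
    a: float,
    education_weights: List[float],
    e: float,
) -> float:
    """Linear prediction where the enum contributes one weight per member."""
    encoded = one_hot(education)
    education_term = sum(w * x for w, x in zip(education_weights, encoded))
    return a * seniority_years + education_term + e


print(one_hot(Education.MCS))  # prints [0.0, 0.0, 1.0]

# Made-up weights: one per Education member, in definition order.
wage = predict_wage_with_education(
    5.0, Education.MCS, 2000.0, [0.0, 5000.0, 12000.0], 30000.0
)
print(wage)  # prints 52000.0
```

Seen this way, “2.23 times M.CS.” becomes “2.23 times 1 if the education is M.CS., otherwise 2.23 times 0”, which is something a linear function can work with.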

The fact that you can only use numbers, booleans and enumerations might seem like a limitation, but when you start working with it you’ll see that it is not that bad 🙂

The point of Machine Learning

The really interesting point about machine learning is that you don’t have to write the prediction function. There are already libraries with a number of different types of functions out there for you to use. The challenge when building a solution that uses Machine Learning is to select the right function and configure it in an optimal way.

In my next post I will describe a set of steps involved in building a Machine Learning solution.
