In 2018, AI has been expanding its reach across many industries. Gartner projected that the global business value derived from AI would grow 70% over 2017 to reach $1.2 trillion in 2018. This shows how quickly everything ‘AI-enabled’ is growing, both in products and in new business models.
We learn from past experience, while machines follow a set of instructions written by us. But what if we could tune machines to learn from the past the way we do? For machines, past experience has a name: “Data”. Learning from this data is called “Machine Learning”.
Machine Learning is a lot more than just learning; it’s also about understanding and reasoning. In this article, we are going to talk about the basics of machine learning.
Let's say that we are studying the housing market, and we want to predict the price of a house given its size. We have previous experience (data): the sizes and prices of some houses. We put this data on a grid where the x-axis is the house size and the y-axis is the price of that house; the blue dots are the data we collected.
We can see that these blue points roughly form a line, and the red line is the one that best fits the data. That red line is our best guess for the price of a house of any given size. This method is known as “Linear Regression”. So, how did we find this line? Well, we can find it just by looking at it. But what about computers? They can’t really eyeball the line between the dots! So at first, the computer starts with just a random line.
So, in order to see how bad the line is, we are going to calculate the error, which is represented by the red lines between our random line and the 3 blue dots (AKA the sum of the distances from the line to the three points).
Error = length of red line 1 + length of red line 2 + length of red line 3
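To make the error calculation concrete, here is a minimal Python sketch. It assumes the "distance" from a point to the line is the vertical gap between the actual price and the price the line predicts. The three houses and the function name line_error are made up for illustration.

```python
def line_error(slope, intercept, sizes, prices):
    """Sum of vertical distances between the line and each data point."""
    return sum(
        abs(price - (slope * size + intercept))
        for size, price in zip(sizes, prices)
    )


# Hypothetical data: three houses (size in square meters, price in thousands).
sizes = [50, 100, 150]
prices = [150, 250, 350]

# A poor first guess: price = 1.0 * size + 0
random_line_error = line_error(1.0, 0.0, sizes, prices)

# A much better line: price = 2.0 * size + 50
better_line_error = line_error(2.0, 50.0, sizes, prices)

print(random_line_error, better_line_error)  # 450.0 0.0
```

The smaller the number, the closer the line sits to the dots, so the second line is clearly the better guess for this toy data.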
Then, we are going to move the line around to see if we can decrease that error, and keep moving it up and down until we finally arrive at a good solution. This general procedure is known as “Gradient Descent”. Actually, this method is more general: if the data doesn’t form a line, a very similar method can fit a circle, a parabola, or even a higher-degree curve to it.
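Here is a rough Python sketch of that "keep nudging the line" idea for the straight-line case. A few things in it are assumptions, not part of the article: it uses the average squared distance instead of the plain distance (a common choice because it changes smoothly as the line moves), it starts from a flat line at zero rather than a truly random one so the result is repeatable, and the data, learning rate, and step count are made up.

```python
def gradient_descent(sizes, prices, learning_rate=0.1, steps=5000):
    """Fit price = slope * size + intercept by repeatedly nudging the line."""
    slope, intercept = 0.0, 0.0  # starting line (fixed here for repeatability)
    n = len(sizes)
    for _ in range(steps):
        # How far the line's prediction is from each actual price.
        residuals = [(slope * x + intercept) - y for x, y in zip(sizes, prices)]
        # Gradient of the mean squared error with respect to each parameter.
        grad_slope = (2 / n) * sum(r * x for r, x in zip(residuals, sizes))
        grad_intercept = (2 / n) * sum(residuals)
        # Move a small step in the direction that lowers the error.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Hypothetical data, scaled to small numbers so the steps stay stable:
# size in hundreds of square meters, price in hundreds of thousands.
sizes = [0.5, 1.0, 1.5]
prices = [1.5, 2.5, 3.5]

slope, intercept = gradient_descent(sizes, prices)
print(round(slope, 3), round(intercept, 3))  # 2.0 0.5
```

The learning rate controls how big each nudge is: too large and the line overshoots and never settles, too small and it takes many more steps to get close.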
Next time, we are going to talk more about machine learning algorithms, such as how machine learning detects spam emails using “Naive Bayes”, and much more. So, follow our blog and stay tuned!