Deep Learning is one of the most exciting and fastest-growing fields within data science.
One question that practicing data scientists regularly face from business users is: “What is Deep Learning, and how is it different from the traditional data science algorithms that some companies have already been using for some time?”
Let’s tackle the latter question first, using one of the standard teaching datasets in the industry: MNIST (a dataset of handwritten digits).
To understand how deep learning improves upon traditional machine learning algorithms, let’s first see how we would recognize digits with those traditional algorithms.
The MNIST dataset contains pictures of various handwritten digits, as follows:
Given that each digit is an image, we’ll be working with its pixel values. In the above case, as humans we can clearly see that both images represent the same number: 3.
However, when the machine reads the pixel values of the two images, they do not match each other exactly. Moreover, the two images are tilted in opposite directions: the left image tilts to the right and the right image tilts to the left.
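To make the pixel mismatch concrete, here is a minimal sketch in plain Python. The two 5x5 binary grids are hypothetical stand-ins for real MNIST images (which are 28x28 grayscale), and the helper names `pixel_match_fraction` and `shared_ink_pixels` are invented for this illustration.

```python
# Hypothetical 5x5 binary "images": 1 = ink, 0 = background.
# Both show the same vertical stroke; image_b is shifted one column to the right.
image_a = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
image_b = [[0] + row[:-1] for row in image_a]


def pixel_match_fraction(a, b):
    """Fraction of pixel positions whose values are identical in both images."""
    pairs = [(pa, pb) for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b)]
    return sum(pa == pb for pa, pb in pairs) / len(pairs)


def shared_ink_pixels(a, b):
    """Number of positions where both images have ink."""
    return sum(pa and pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


print(pixel_match_fraction(image_a, image_b))  # 0.6
print(shared_ink_pixels(image_a, image_b))     # 0
```

Even though a person would say both grids show the same stroke, not a single ink pixel lines up, which is why a model that compares raw pixel positions struggles with small shifts and tilts.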
Note that it is not only the pixel values that differ; the content of the image differs too: Vishy is smiling in one and strikes a more serious pose in the other.
Deep Learning comes to the rescue.
One way to avoid the problem of differing pixel values for the same label is to understand the orientation of the various strokes within the number. In the case of digit recognition, irrespective of whether the number 3 sits in the middle of the image or in the top right corner, it is made up of two curves that each look like a mirror image of the letter “c”. Each mirrored “c” is in turn made up of two nearly straight lines at the top and bottom and a curve in the middle.
So, how does deep learning infer these various strokes and curves of an image and come up with a prediction?
This brings us back to the starting question of what deep learning is and how it works. In this specific example, we’ll consider one method within the deep learning family, called Convolutional Neural Networks (CNN).
A CNN is a special architecture of artificial neural network that, in addition to the regular neural network task of matrix multiplication, performs a specialized function called pooling.
Pooling helps account for the adjacency of pixels. For example, if pixel 20 is highlighted in the first image and pixel 21 is highlighted in the second, pooling relates the two in such a way that the images are treated as likely similar.
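The adjacency idea can be sketched with 2x2 max pooling, one common form of pooling. This is an illustrative assumption rather than the exact setup of the network discussed here: the 4x4 grids and the `max_pool_2x2` helper are hypothetical, and the two highlighted pixels are chosen so that they fall inside the same pooling window.

```python
def max_pool_2x2(image):
    """Keep the largest value in each non-overlapping 2x2 window."""
    pooled = []
    for r in range(0, len(image) - 1, 2):
        pooled_row = []
        for c in range(0, len(image[0]) - 1, 2):
            window = [
                image[r][c], image[r][c + 1],
                image[r + 1][c], image[r + 1][c + 1],
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# The highlighted pixel sits one position further right in the second image.
first = [
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
second = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

print(first == second)                              # False
print(max_pool_2x2(first) == max_pool_2x2(second))  # True
print(max_pool_2x2(first))                          # [[0, 1], [0, 0]]
```

The raw grids differ, but their pooled versions are identical. Had the two pixels straddled a window boundary, the pooled outputs would still differ, which is one reason real networks stack several convolution and pooling stages.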
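Moreover, a CNN combines convolution (sliding a small filter over the image and computing a weighted sum of the pixels at each position, which is a form of matrix multiplication) with pooling in such a way that we are left with a set of strokes that help determine the label.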
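As a rough sketch of how a filter can pick out one kind of stroke, the snippet below slides a 3x3 filter over a 5x5 image and computes a weighted sum at each position. The filter values are hand-picked here to respond to horizontal strokes; in a trained CNN they would be learned. The `convolve_valid` helper and both test images are assumptions made for this example.

```python
def convolve_valid(image, kernel):
    """Slide the kernel over the image (no padding) and take weighted sums."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    output = []
    for r in range(out_rows):
        output_row = []
        for c in range(out_cols):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[r + i][c + j] * kernel[i][j]
            output_row.append(total)
        output.append(output_row)
    return output


# Hand-picked filter that responds strongly to a horizontal stroke.
horizontal_filter = [
    [-1, -1, -1],
    [2, 2, 2],
    [-1, -1, -1],
]

# One image with a horizontal stroke, one with a vertical stroke.
horizontal_stroke = [[1] * 5 if r == 2 else [0] * 5 for r in range(5)]
vertical_stroke = [[1 if c == 2 else 0 for c in range(5)] for _ in range(5)]

print(convolve_valid(horizontal_stroke, horizontal_filter))
# [[-3, -3, -3], [6, 6, 6], [-3, -3, -3]]
print(convolve_valid(vertical_stroke, horizontal_filter))
# [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

The filter produces a strong positive response along the horizontal stroke and no response to the vertical one. A CNN applies many such filters, each tuned to a different stroke, and passes the responses on to pooling.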
This scenario is best illustrated by the following example:
More details on how to come up with the optimal weights in the network above will be covered in a later post.
After building the model (i.e., coming up with the optimal weights), when a new image with label 7 comes in, the CNN searches for matches with the strokes (components) it has generated, and if a match is found it outputs the class.
In this way, the CNN (the deep learning model) learns the minute details of an image by passing it through multiple filters and then pooling (looking for adjacency among highlighted pixels).
Thus, deep learning helps identify the underlying components of an image, thereby eliminating the need to create hand-crafted variables.
While we talked about CNNs in this example, there are other variants of deep learning algorithms that retain a memory of earlier inputs and reuse it to come up with the final features.
Thus, deep learning comes to the rescue for unstructured data, such as natural language text, images, and video, which traditionally has not been analyzed because of the limitations of existing algorithms. The growth projections for data availability over the next few years are given below:
Given the unprecedented growth in unstructured data collection, Deep Learning comes to the forefront in solving new problems that could not be solved before. It is therefore one of the top skills to have, as it opens up rewarding opportunities to create exciting and impactful products.