Taking a Deep Learning dive with The Fifth Elephant


Mumbai: There is tremendous buzz around machine learning, broadly described as a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. However, despite an exponential increase in power, computers have typically proved incompetent at things that are really simple to human beings—like recognizing the dog in a picture containing a dog, or understanding speech.

The trend, however, is changing. Consider ‘Deep Learning’, a collection of techniques that make possible computational tasks previously thought impossible. Facebook Inc, for instance, uses it to identify faces, and when Google Inc recently announced that its algorithms could not only ‘see’ a dog but also identify it as a Pomeranian, it heralded the maturity of Deep Learning techniques.

While Natural Language Processing (NLP) has seen steady progress since the 1980s, Deep Learning has resulted in new advances there as well, a local example being ‘conversational search’ from Bengaluru-based start-up Infilect Technologies Pvt. Ltd.

Even though Deep Learning is based on concepts and algorithms that were discovered decades ago, it took off only in the mid-2000s, as computer hardware (especially graphics cards, now called Graphics Processing Units, or GPUs) became powerful enough and the amount and quality of data exploded due to the internet.

These algorithms come with a bonus—automated feature engineering. As Cisco Systems’ senior statistician Bargava Subramanian puts it, “Traditional Machine Learning relies on the analyst telling the computer what information (called features) should be used to build the predictive model. The problem? The model’s efficiency is only as good as how insightful the analyst is. Deep Learning, a step towards AI, can learn the features themselves from the raw data.”
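Subramanian’s contrast can be made concrete with a toy sketch. It is not drawn from any speaker’s material: the XOR data, the function names and the network size are all assumptions chosen for illustration. A simple linear model fails on the raw inputs, succeeds once an ‘analyst’ supplies an extra feature by hand, and a small neural network is then left to work out its own internal features from the same raw inputs.

```python
import math
import random

# Toy data (an assumption for illustration): XOR.
# No straight line separates the two classes in raw (x1, x2) space.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
Y = [0, 1, 1, 0]


def linear_predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def train_perceptron(rows, labels, epochs=1000):
    """Perceptron rule; stops once a full pass makes no mistakes."""
    w, b = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(rows, labels):
            err = y - linear_predict(w, b, x)
            if err != 0:
                mistakes += 1
                w = [wi + err * xi for wi, xi in zip(w, x)]
                b += err
        if mistakes == 0:
            break
    return w, b


def accuracy(rows, labels, w, b):
    hits = sum(linear_predict(w, b, x) == y for x, y in zip(rows, labels))
    return hits / len(rows)


# 1) Traditional model on raw inputs: cannot reach 100% on XOR.
w_raw, b_raw = train_perceptron(X, Y)
raw_acc = accuracy(X, Y, w_raw, b_raw)

# 2) Same model after the analyst hand-crafts the feature x1 * x2.
X_hand = [(x1, x2, x1 * x2) for x1, x2 in X]
w_hand, b_hand = train_perceptron(X_hand, Y)
hand_acc = accuracy(X_hand, Y, w_hand, b_hand)


def train_tiny_network(rows, labels, hidden=8, epochs=3000, lr=0.5, seed=0):
    """One hidden tanh layer, sigmoid output, cross-entropy loss, plain SGD.
    The hidden units act as features learned from the raw inputs."""
    rng = random.Random(seed)
    n_in = len(rows[0])
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
             for j in range(hidden)]
        z = sum(W2[j] * h[j] for j in range(hidden)) + b2
        return h, 1.0 / (1.0 + math.exp(-z))

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in zip(rows, labels):
            h, p = forward(x)
            total -= math.log(max(p if y == 1 else 1.0 - p, 1e-12))
            dz = p - y  # gradient of the loss w.r.t. the output pre-activation
            for j in range(hidden):
                dh = dz * W2[j] * (1.0 - h[j] ** 2)  # uses W2[j] before its update
                W2[j] -= lr * dz * h[j]
                for i in range(n_in):
                    W1[j][i] -= lr * dh * x[i]
                b1[j] -= lr * dh
            b2 -= lr * dz
        losses.append(total / len(rows))
    return losses, (lambda x: forward(x)[1])


# 3) Network on raw inputs: no hand-crafted feature is supplied.
losses, net = train_tiny_network(X, Y)

print("linear model, raw inputs:      accuracy", raw_acc)
print("linear model, analyst feature: accuracy", hand_acc)
print("tiny network, raw inputs:      mean loss", losses[0], "->", losses[-1])
print("tiny network predictions:", [round(net(x), 2) for x in X])
```

The first two numbers are the point of the quote: the same linear model goes from failing to succeeding only because a person knew which feature to add. The network receives no such hint. How well it does on a given run depends on its starting weights and settings, which is part of why practitioners describe network design as more craft than recipe.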

But Deep Learning is not a magic wand. In the words of Searchmetrics Inc.’s senior data scientist Abhishek Thakur, “Deep Learning is hard and there are no hard and fast rules for designing a deep network. However, there are certain practices one can follow and these will come out only after dealing with many different datasets and trying out different methods.”

HasGeek, a Bangalore-based company that’s behind events like The Fifth Elephant, is organizing a two-day conference on data and machine learning this month to make these concepts easy to understand and implement.

The speakers include the likes of Arjun Jain, co-founder of Perceptive Code, who will speak about a new hybrid architecture that consists of a deep Convolutional Network and a Markov Random Field, showing how this architecture can be applied to articulated human pose estimation in monocular images.

Jain did post-doctoral research at the Computer Science department at New York University’s Courant Institute, and worked as a developer for New Zealand-based Weta Digital’s vision-based motion capture system that has been used in many feature films. He was credited for his work in Steven Spielberg’s The Adventures of Tintin.

Other speakers include Anand Chandrasekaran, chief technology officer (CTO) and co-founder of Mad Street Den, an artificial intelligence (AI) company specializing in computer vision, who will briefly describe the history and origin of deep learning, explain how it contrasts with ‘traditional’ machine learning, and touch upon the dominant concepts that shape its practice today. Chandrasekaran has been a member of teams working on DARPA projects in cognition and vision.

Sundara Nagalingam, the head of Manufacturing and Energy businesses for NVIDIA India, will showcase his company’s advances in developing a comprehensive software development kit, aimed at helping developers train deep neural networks at speeds that consistently beat previous records.

The event’s second day will take place at Google’s Bangalore campus with Chandrasekaran talking about deep learning in computer vision, and Nischal H.P. and Ragotham S.—co-founders of Unnati Data Labs—giving an introduction to its application in NLP covering common architectures.