"Machine Learning is not a magic potion"

Sudhir Raikar, IIFL | Mumbai | March 09, 2016 11:24 IST

Over-fitting due to noisy data is a common occurrence. The greater the uncertainty in the prediction criteria, the more noise there is in the data. The trick is in separating the noise from the signal.

Ashutosh Bijoor - Chief Technology Officer at Pittsburgh-based Accion Labs (http://accionlabs.com/) - is not your everyday software pro. A compulsive outdoor sports practitioner and a passionate history buff, he owes his clarity of thought and purpose to his affinity for Mathematics and his ability to articulate ideas visually. With over two decades of rock-solid tech experience across verticals including High Tech, Engineering, Software, Insurance, Banking, Chemicals, Pharmaceuticals, Healthcare, Media and Entertainment, Bijoor has led and managed cross-functional teams for several organizations worldwide - from mammoth MNCs to up-and-coming start-ups. His core competence spans emerging technologies, software architectures, framework design and agile process definition, enterprise solutions and off-the-shelf products in domains ranging from Big Data, Business Intelligence and Advanced Text Search & Analytics to Graphics & Image Processing and Sound & Video Processing. Bijoor spoke to IIFL’s Sudhir Raikar on a host of issues including Big Data, Analytics, Machine Learning, Holistic Education and, last but not least, his love for Mathematics and passion for outdoor sport. Excerpts from the Q & A interaction…
 


What in your reckoning are the biggest Big Data challenges - both biz and tech?
The central business challenge while adopting any new technology is to suitably transform legacy business processes in order to truly leverage its capabilities. As Eli Goldratt, father of the Theory of Constraints, aptly remarks, any technology can bring about benefits if and only if it diminishes a limitation. In the absence of the new technology, he contends, organizations develop modes of behavior, measurements and policies that inadvertently tend to accommodate the limitation. That’s precisely why the new technology will invariably fail to deliver benefits unless the organization acknowledges and acts upon the need to change its modes of behavior, measurements and policies. Goldratt’s astute observation applies to Big Data technologies as well.
 
Big Data technologies eliminate the limitation of not being able to store and analyze data that is large in volume, velocity or variety. For example, it’s now possible to analyze reams of data about individual customer behavior measured through digital and mobile technology. Yet there are organizations that use Big Data to measure the effectiveness of conventional marketing campaigns that were designed only to “spray and pray”. Obviously, such an exercise will yield no benefits. Before they expect Big Data to deliver on its promise, they need to design intelligent, dynamically targeted marketing models that leverage the capabilities of Big Data - and that is not easy. Umpteen Big Data implementations prove self-defeating on this very count.
 
Is there a need to bring Machine Learning out of its conventional confines to be able to solve problems like over-fitting and noise with a more holistic perspective? Will collaborations with mathematicians and economists help in this context?
Yes, absolutely. Machine Learning is not a magic potion that can solve all problems automatically. Over-fitting due to noisy data is a common occurrence. The greater the uncertainty in the prediction criteria, the more noise there is in the data. The trick is in separating the noise from the signal. For Machine Learning to be useful, it requires data scientists to have a more holistic understanding of the problem domain. Without that, the temptation is to design overly complex models that are susceptible to problems such as over-fitting. With holistic understanding, the models become leaner and thereby more accurate in their predictions. Whether this holistic understanding can be achieved through collaboration between mathematicians and computer scientists on the one hand and domain experts such as economists on the other remains to be verified.
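The contrast between lean and overly complex models can be illustrated with a small experiment. The sketch below is not from the interview: the linear "signal", the noise level, the sample sizes and the helper names (`fit_line`, `fit_memoriser`, `run_experiment`) are all assumptions chosen for illustration. It compares a two-parameter straight-line fit with a nearest-neighbour lookup that memorises every training point, noise included.

```python
import random


def fit_line(xs, ys):
    """Lean model: ordinary least squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def fit_memoriser(xs, ys):
    """Overly complex model: 1-nearest-neighbour lookup.

    It reproduces every training point exactly, so it also
    reproduces the noise attached to each point.
    """
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(rng, n, noise_sd=0.5):
    # Assumed ground truth for illustration: y = 2x + 1 plus Gaussian noise.
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def run_experiment(seed=0):
    rng = random.Random(seed)
    train_x, train_y = make_data(rng, 30)    # small, noisy training set
    test_x, test_y = make_data(rng, 1000)    # fresh data from the same source
    lean = fit_line(train_x, train_y)
    memoriser = fit_memoriser(train_x, train_y)
    return {
        "lean_train": mse(lean, train_x, train_y),
        "lean_test": mse(lean, test_x, test_y),
        "memoriser_train": mse(memoriser, train_x, train_y),
        "memoriser_test": mse(memoriser, test_x, test_y),
    }


if __name__ == "__main__":
    for name, value in run_experiment().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the memorising model scores a perfect zero error on the data it was trained on, yet does worse than the straight line on fresh data, because what it has learned is largely noise. The lean model only works here because its form matches the assumed signal, which is where knowledge of the problem domain comes in.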
 
In the same context, is there a need to design a more holistic and close-knit syllabus to help the graduates of tomorrow become adept at working on the cusp of several disciplines?
A holistic educational syllabus will go a long way in enabling data scientists to have a multi-disciplinary understanding of their chosen domains. We are already seeing the emergence of curricula such as liberal arts where students can combine subjects from multiple disciplines across business, science and commerce. Going forward, educational institutions would need to find ways to move away from their industrial roots to address a larger number of segments that are prevalent, and even dominant, in today’s global marketplace. Education cannot be limited to a point-in-time learning process - it is by its very nature continuous and ongoing. Hence, we need to create and institutionalize educational models that enable such continued learning with the same efficacy that conventional 10+2+3 learning delivers today.
 
Considering the huge import of Big Data across spheres, will data cleaning and curation become even more significant with scalability and usability in mind, not just error-detection?
As the volume of data increases, common erroneous data points lose their significance. Analytical models are supposed to account for such anomalies, and lean models that are built with knowledge of the subject matter are less susceptible to problems like over-fitting. I do not think that an increase in the volume of data will make data cleaning or curation more important or significant.
 
How have the BI, Analytics and data mining spaces evolved over the years? Has Big Data engulfed them or replaced them, or do all three now complement each other?
Each of these technologies and methodologies has a distinct role to play. They complement each other. There are some overlaps of course, but we often come across situations where the best solution to a problem is not warranted unless the value delivered is sufficient to justify the investment. None of these technologies is cost-effective or easy to implement. They all involve change management, and that is far more difficult and expensive than implementing the technical solution. Hence it is best to avoid getting into a hype trap and to make an objective decision about which solution adequately addresses the given problem or need.
 
 

Do you agree with Pedro Domingos, author of "The Master Algorithm", who talks of a rather Utopian unification of competing tribes of Machine Learning paving the way for a new 'Theory of Everything'?
At heart, I am a very simple technology person. Occasionally, I do enjoy reading about grand ideas (Asimov with his psychohistory is my all-time favorite) but my head is invariably preoccupied with finding solutions to problems and challenges that I can address in my lifetime. Customers today demand continuous integration and deployment of their solutions, which means innovation cycles are very short and demand delivery of workable results almost immediately. So I would steer clear of debating what’s possible and what’s not, whether in the realm of Machine Learning or any other Learning.
 
Tell us about your love for Mathematics. After your Masters, what proved the trigger for your shift towards technology over a more mainstream career in Mathematics?
I hated Mathematics in school. I got 3 marks out of 100 in class 7, after which my parents gently coaxed me into joining a boarding school where I sort of wised up. But my real exposure to Mathematics as I see it today came only during my first year of B.Sc. That’s when I really found that Mathematics was not just fiddling with funny formulae but a structured way of rational thought. I started learning Mathematics in the form of pictures and diagrams. I loved it so much that I thought I should do something more concrete with it. Having cleared IIT JEE, I chose to do a five-year integrated course in Mathematics from IIT Kanpur. That was where I was exposed to the fascinating world of Image Processing and Pattern Recognition under the guidance of Prof. R.K.S. Rathore (http://home.iitk.ac.in/~rksr/). He gave fresh insights into the wonderful chemistry between Mathematics and images.
 
After my graduation, I had to make the customary choice between doing further studies or taking up a job. Neither seemed attractive to me. I craved doing something that used Mathematics in the real world. I took a bank loan, started my own business and created a few products in the Image Processing space before moving through various exciting technologies to finally land at Accion Labs. Thanks to Mathematics and my forte in visual thinking, I am able to relate easily to emerging technologies.
 
Accion Labs’ service spectrum - of catering to distinct client pools - start-up, growth and mature organizations - is undoubtedly one of your unique value props. How was the idea of this segmentation conceived?
The business needs of a technology-driven company change drastically as it evolves from the start-up and growth phases to finally mature into a full-fledged organization. The segmentation hence is quite natural in the first place. We tailored our service portfolios and organizational structures for each of these segments individually, thereby tuning the service offerings, teams, technologies, pricing and key business drivers to suit the given segment. Besides, we continually keep up with evolving and emerging needs on our clients’ behalf. The personalised and proactive approach helps create long-term relationships with each of our clients.
 
Could you share a few of Accion's accomplishments and breakthroughs?
Accion Labs is one of the most exciting technology services companies. That’s because it focuses on emerging technologies and hence gets to work with some of the most exciting product ideas that utilize emerging trends such as Rich Internet Applications, Micro Services, Polyglot Persistence, Big Data Analytics and Automated CI/CD or DevOps. Accion works primarily with technology companies - both startups that are working on cool new ideas as well as growth-stage and mature companies that are reengineering their products to leverage the benefits of the emerging technology trends. If you want to work on emerging technologies, then you have only two choices. One, if you have the business acumen, run your own business. Two, join a company like Accion where you get to work on exciting ideas using other people’s money.
 
Your non-technology exploits are equally prolific. How do you find time for them?
I discovered my passion for outdoor sport quite late in life as I was busy running my own business. It was only after I crossed 40 that I suddenly realized how much fun it was to be outdoors in tune with nature. I am now outdoors almost every weekend, trekking and cycling extensively. I have recently completed my first Olympic-level triathlon. I think I am healthier now than I was in my 30s. I want to urge young people to take up some outdoor sport of their choice. We are so preoccupied with our digital selves that we end up ignoring our real selves all the time… Start now because it’s never too late!

 
