A three-time world champion, who is considered to be one of the best gymnasts ever. A social entrepreneur, who provides clean and safe drinking water to rural communities in war-ravaged Liberia. A YouTube star, who, according to TIME magazine, is helping Germans accept Syrian refugees “one smile at a time” through his viral videos.
Simone Biles, Saran Kaba Jones, Firas Alshater: that’s quite a list to be part of. Even more so when you are the only Indian to make it to TIME magazine’s 2016 list of “10 millennials who are changing the world.” Umesh Sachdev, the co-founder and CEO of Chennai-based Uniphore, is on the coveted list for “building a phone that can understand almost any language.”
Power of speech
Sachdev and his classmate Ravi Saraogi, however, didn’t get off to a great start. Their first venture, started in 2006 to track lost mobile phones, failed to take off. They then moved to Chennai in 2007 to be a part of the Rural Technology and Business Incubator (RTBI) at IIT-Madras. This time, the duo wanted to focus on the problem before building a solution and looking for a product-market fit. But they were clear they wanted to focus on mobile technology.
In keeping with the incubator’s objective, the plan was to bridge the digital and information divide prevalent in rural India. At the time, about 95% of mobile subscribers used only voice. The solution, therefore, had to be built on that medium. “Our focus was on speech because that’s the only technology that could have solved some of the problems we were looking at. But it is a very difficult science to crack given that there are so many different languages and variations in terms of dialect,” says Sachdev.
To test their hypothesis, they set up a call centre that provided information on subjects such as agriculture, health, education, employment and entertainment. They put up posters in three districts in Tamil Nadu, asking people to call in for information. “We wanted to know what kind of information they were looking for, how they were transacting and what modes they were using to transact,” says Saraogi, the COO of the company.
In just three months, they got 10,000 calls, with most queries related to agriculture, employment and financial services. The call centre was able to answer 90% of them by relying on the Internet, but it wasn’t a cost-effective solution. They had to automate. And more importantly, the responses had to be in local languages…
Sachdev and Saraogi set out to build a mobile platform based on speech recognition. The first offering was based on interactive voice response (IVR), especially developed for low-end phones. With the smartphone wave, however, Uniphore’s offerings too have come of age. At the very core is the engine that allows software applications to understand and respond to speech, thus enabling humans to engage and instruct machines, and deliver customised information on a variety of subjects.
As of now, the company has three products based on the speech recognition engine. The first is Akeira, a virtual assistant that can process more than 25 global languages and 150 dialects across sectors. The product, which can be deployed on entrepreneurial websites, mobile apps or even web-enabled kiosks, processes queries using natural language processing, and instantly provides an answer in the native language of the speaker. “It was a blessing that we started in India because we have so many different languages and dialects. So we are able to address a larger international market,” opines Saraogi.
But catering to multiple sectors meant the company had to innovate a lot when it came to machine learning. That’s because each industry, be it banking, insurance, travel or farming, has its own terminology and body of knowledge. The process is not very different from human learning, though: it involves words, phonetics, dialects and the capability to derive meaning from context, besides domain knowledge. The one advantage, however, is that Akeira can go through 10,000-15,000 pages of reading material and be ready to answer any question within hours.
Besides the ‘geeky’ Akeira, Uniphore also offers a voice-based biometric system that authenticates the customer’s voice before every transaction by verifying the voiceprints — which are as unique as fingerprints — within 15 seconds. Little wonder then, that banks, microfinance institutions and insurance companies have been using Uniphore’s system to authenticate customers, enabling them to access online services using their native language.
The company already has six patents related to speech recognition and voice biometrics. Its latest offering, the patented speech analytics product auMina, came together when someone at American Express, who was part of the IIT alumni network, approached them. “You do a lot of work around speech recognition technology. We have a lot of data in our call centres and wanted to know if you can mine the data to help us understand what our consumers want and improve our overall customer experience,” he said. That’s when, Sachdev says, they realised that there was a huge opportunity out there and that they already had the technology to leverage it.
About 52 million hours of data, he says, is generated every day at call centres, but less than 1% is used for quality checks, which work more like a post-mortem analysis. The duo, thus, developed their own patented speech analytics software that helps companies predict consumer sentiment in real time. It can identify new selling opportunities or help take corrective action in the case of unhappy customers, adds the co-founder. “So far, companies have viewed call centres as cost centres. With the help of this software, they can increase revenue, reduce costs, retain customers and mitigate risks,” says Saraogi, adding that it can also help detect fraud and errors early by highlighting high-risk transactions.
While the advent of Apple’s Siri and Google Talk has increased awareness about speech recognition, enterprises are just warming up to the idea. “When you are working in a new area, you are building something that doesn’t exist. The primary challenge is how do you build the technology. And once you have built it, how do you get people to know about it,” says Sachdev. Currently, Uniphore has to do a couple of pilots to show clients the possible outcomes and the benefits of the platform. It offers solutions on a SaaS (Software as a Service) model, a subscription model wherein companies pay based on usage and the number of consumers.
But Uniphore’s advantage, Sachdev says, lies in the fact that there are not many companies offering similar products. While there are companies in the US, Europe and Israel that offer one of the three products, he says, there is no one that offers the entire spectrum of speech analytics, a virtual assistant and voice biometrics. “The uniqueness of being one vendor that can offer all three gives us a lot of edge.”
He adds, “It takes time to do incremental stuff even for deep-pocketed companies in the Valley. You can’t throw 100 smart engineers at the problem. Just like a child would learn a language, the computers have to go through the same learning. You can build a better, faster algorithm, but it takes time for machines to go through the data-intensive learning. Besides, we are the only one offering this in more than 25 global languages and 150 dialects. That creates a huge entry barrier.”
No wonder then that it has piqued the interest of some marquee investors such as Kris Gopalakrishnan, Rajan Anandan (as part of Indian Angel Network), IDG Ventures, YourNest Angel Fund and Ray Stata, co-founder of Analog Devices in the US, who took part in its pre-Series A and Series A rounds in the last two years. “In India, there are only 250 million people who understand English so there was a definite need to bridge the information divide. Speech recognition is a difficult technology to crack given the various nuances. Uniphore’s ability to solve the language problem makes it one of a kind not just in India but in the world,” says Nagaraja Prakasam of Indian Angel Network, who led the pre-Series A round.
Without disclosing the exact figure, Sachdev says that the funding is around $5-7 million. He adds that having people like Gopalakrishnan, who have built a multi-billion dollar company, has helped Uniphore gain a better long-term perspective.
The speech recognition market is estimated to be about $4 billion. “In the next couple of years, speech recognition is going to become pervasive, with IoT becoming more mainstream. Uniphore is best placed to leverage this multi-billion dollar opportunity,” says Ranjith Menon, executive director, IDG Ventures. Companies have already started looking at non-English-speaking markets for new customers. That plays right into the hands of Uniphore given its expertise in multiple languages and dialects. The growing adoption of mobile transactions in developing countries and the need to reduce instances of fraud in financial transactions and healthcare are likely to drive the growth of voice biometrics. Uniphore, definitely, will be at the forefront of that change too.