
Changemakers 2024: How Sarvam is Pioneering a Truly Desi AI

A tech duo refuses to play by the rules of Big Tech players operating from the West

Pratyush Kumar (left) and Vivek Raghavan

After OpenAI launched ChatGPT in 2022, big tech firms like Meta and Google, along with some heavily funded start-ups in the US and Europe, started training their own large language models (LLMs). It seemed evident that, just as the West had won the age of the internet on the strength of its ability to burn cash, the age of artificial intelligence (AI) too would be theirs to rule.


At the time, experts said Indian start-ups and companies should not think about building their own LLMs, the foundation of generative AI applications, and should instead focus on creating applications for health care, agriculture or education. The rationale: India does not have the depth of capital necessary to build LLMs from the ground up.

But accepting this rationale would mean Indian innovators would always remain hostage to rent-seeking big tech firms of the West. “All those big companies have invested billions of dollars in this technology. Once they realise they need to recoup the cost, they will raise the prices. All that will happen in dollars,” says Tanuj Bhojwani, head of People+AI, a community attempting to use AI for social good. He explains that companies like Microsoft and Amazon are investing in LLM start-ups such as OpenAI and Anthropic, and that these AI services will be sold bundled with the tech giants’ cloud offerings. “And then you get locked to pay for both,” he adds.


Enter Sarvam

Just as the Indian technology ecosystem was grappling with the consequences of banking on foreign LLMs, Vivek Raghavan and Pratyush Kumar founded Sarvam AI to build India’s own LLM.

Neither Raghavan nor Kumar fit the founder archetype that venture capital (VC) firms are always on the lookout for. They are neither kids fresh out of India’s top engineering colleges nor battle-hardened senior executives of tech unicorns.

They are, however, scholars with a wealth of experience: both hold PhDs and have founded multiple tech companies in the past.


Raghavan had previously made important contributions to the semiconductor industry and sold two companies to American chip-design major Synopsys. For the last decade-and-a-half, he has been involved with developing the India Stack, the digital public infrastructure that includes Aadhaar and the Unified Payments Interface (UPI). Kumar, on the other hand, has worked on systems research, from graphics processing unit (GPU) design and programming to language models, at Microsoft Research, IIT Madras (as faculty) and IBM Research.


Harshjit Sethi, managing director at VC firm Peak XV that has backed Sarvam, says, “We came across Kumar’s work in AI4Bharat [a research lab at IIT Madras] and were lost in the depths of his blog on spiritualism and AI. In him, we noticed a rare combination of a world-class researcher with boundless ambition and an unusual clarity of thought about how to serve India’s AI revolution.”

Sarvam has a deep understanding of the nuances of both Indian languages and Indian use cases, says Hemant Mohapatra, partner at venture capital firm Lightspeed, which led the start-up’s Series A funding round of $41mn last year. He adds, “We believe India will need a unique approach to AI that is tailored to the needs and scale of the population.”

Secret Sauce

For the past two years, the mantra in the world of generative AI has been ‘bigger is better’: the more data you use to train your models, the better the results. Sarvam spotted a problem in this approach.


Firstly, using more data meant driving up the costs of data collection and processing. Those costs would eventually be passed on as higher prices charged to recoup the investment. In the Indian context, it would mean making AI less affordable.

Secondly, the datasets that Western companies use to train their AI models are mostly in English. Very little of the data, around 1%, is in Indian languages. Consequently, these models are markedly less accurate, and costlier to run, when processing queries in Indian languages. In this, Sarvam saw an opportunity. By using a smaller corpus of data focused on Indian languages, it could create more accurate and affordable AI software for Indians.

When Sarvam launched its first multilingual LLM in October (it had launched a Hindi LLM last year), the model outperformed peers such as Meta’s Llama and Google’s Gemma, which were trained on bigger datasets, on Indian-language tasks. Sarvam is taking a full-stack route: it is also building end products based on its LLMs that can be used by enterprises and consumers.


It has also created a voice-based bot called Bulbul that speaks 10 Indian languages and can be used in myriad applications, from ecommerce to online education. At an Nvidia conference last month, the start-up said it was working with the Unique Identification Authority of India to build an AI box that can be set up within the biometric data organisation’s premises for its own use. One might say the pioneers of a truly Indian AI have arrived.
