Indian AI Models Critical for Strategic, Cultural Reasons: Bharat GenAI

As a sovereign country, it is important for India to have a large language model that it has full visibility into and control over, says Rishi Bal, Executive Vice President of Bharat GenAI

Shruti Tripathi

Updated on: 19 November 2025 4:30 pm

Summary of this article

India needs incentives to boost adoption of indigenous AI models, says Bharat GenAI’s EVP Rishi Bal, stressing that startups must be encouraged to choose local LLMs over foreign ones.
Bharat GenAI, led by IIT Bombay under NM-ICPS, is building a multimodal LLM for Indian languages with support from IITs and IIM Indore.
Bal highlights challenges in dataset creation, the rise of smaller task-specific models, and India’s urgency to build foundational capabilities to compete globally.

As India accelerates efforts to build its AI ecosystem, the country must incentivize the use of homegrown large language models to ensure startups adopt indigenous solutions over foreign alternatives, according to Rishi Bal, Executive Vice President of Bharat GenAI.

Bharat GenAI is a government-funded Multimodal Large Language Model (LLM) project for Indian languages, led by IIT Bombay, under the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS), and involves collaboration with other academic institutions like IITs and IIM Indore.

In an exclusive interview with Outlook Business, Bal shared key highlights around developing foundational models in AI, from the challenges of sourcing training datasets to the growing relevance of smaller models, and how India can keep pace with global leaders in the AI ecosystem.

Can you walk me through the complete model-building lifecycle step by step, from problem definition through deployment and ongoing maintenance? Please explain each stage in simple terms.

The AI you see in applications is essentially the front-end; the intelligence comes from the underlying models. Building these models begins with deciding what we train them with, and for us that starts with India’s goals. As an Indian AI initiative, sovereignty is central: we want the “engine and steering wheel” of AI to be built here, ensuring India always has access to and control over its models.

The second pillar is Indianness: our models must not only carry global knowledge but also reflect India’s languages, culture, and perspectives. India’s linguistic diversity is vast, with dialects and accents shifting every few hundred kilometers. Cultural context matters too: where a global model might say “call 911” if there’s a stranger at the door, an Indian instinct might be to invite them in and offer water. Even on sensitive issues like Kashmir, our models must reflect India’s perspective clearly.

The third pillar is accessibility: AI should not be limited to urban tech workers but must serve farmers, teachers, and small businesses.

That’s why we have created domain-specific models like AgriParam for agriculture and AyurParam for Ayurveda, while also prioritizing voice-first interactions to make AI more inclusive for a mobile-first population.

Bharat GenAI is developing models ranging from 2B to 1T parameters. What is the strategic rationale for covering such a wide spectrum? Are the smaller models intended for edge devices, while the larger ones are designed for enterprise and national-scale use cases?

We have thought carefully about balancing small and large models. Smaller models are cheaper to run and vital for real-world use, but the best way to create them is by first building larger ones. Intelligence truly emerges at scale.

Around the 100-billion-parameter mark, models start to demonstrate reasoning abilities, much like the leap from a simple brain to a complex one. That’s why our roadmap is progressive: we begin with 3B, then 7B, 30B, and 100B models. From there, we can distill them into lighter, more efficient versions optimized for cost, accessibility, and deployment in diverse environments.

The IT Minister has highlighted the importance of building smaller, use-case specific models alongside larger ones. From a commercial standpoint in India, do you see these smaller models as more practical and more widely useful than the larger ones?

Cost is critical for accessibility, especially in India, where revenue per user is lower than in the US, making affordable AI essential. At the same time, India’s needs are diverse: AI will touch governance, education, healthcare, and agriculture, requiring not just one model but an entire ecosystem of models, some general-purpose and others specialized.

Bharat GenAI’s role goes beyond building models; it aims to grow the broader AI ecosystem. This includes open-sourcing models so that startups can build on them, collaborating with a consortium of seven academic institutes and 75+ students and researchers to develop talent, and offering services like free OCR (Optical Character Recognition) to digitize heritage texts, which in turn enrich AI training. Our focus is on nurturing the complete ecosystem (talent, tools, data, and startups) to make AI practical, inclusive, and sustainable.

Since Bharat GenAI is developing multilingual models, how challenging is it to source datasets for multiple Indian languages? Hindi has comparatively rich resources, but what about other regional languages?

Data scarcity is a major challenge. Even Hindi has far less data available than English. To address this, we have taken a grassroots approach: collaborating with publishers who contribute data while ensuring it is not republished, partnering with local radio stations, and running OCR on scanned heritage books. This foundational work focuses on building datasets through community participation, ensuring that our models genuinely reflect the diversity and richness of India.

Bharat GenAI emphasizes an open-source approach, but open-source initiatives often struggle with monetization and long-term sustainability. How does the consortium plan to ensure the project remains viable and impactful over time?

Sustainability is key. While we value government support, we are also exploring alternative funding through corporate CSR contributions and donations, as well as commercial services such as customized enterprise models, company-specific tweaks, and support contracts similar to Red Hat’s approach with Linux.

The open models will remain freely available, but enterprises requiring reliability, SLAs, and tailored solutions will pay, providing a sustainable model to support the project’s long-term growth.

Many Indian startups currently build applications using foreign AI models like OpenAI, despite the national security implications. How can Bharat GenAI encourage them to adopt Indian-made models instead?

It comes down to two factors: quality and cost. We need to offer models that match or surpass global alternatives, while remaining affordable for startups. Our advantage is proximity: with teams based in Indian cities, we can work closely with startups to tailor models to their specific needs. On the policy side, India could consider incentives for using sovereign AI and keeping data local. Striking the right balance between access to global models and nurturing Indian alternatives is key.

Finally, what is the expected timeline for the development and rollout of Bharat GenAI models?

We are releasing models on Hugging Face every month, starting with smaller models now and gradually rolling out larger ones as we secure more compute resources. Since over 90% of the IndiaAI Mission’s funding goes toward compute, which is the largest expense, this approach ensures a steady stream of models that will grow in both size and capability over the coming months.

Published At: 19 November 2025 4:24 pm