You have probably heard of DeepSeek, a Chinese AI start-up that is making waves in the technology sector, particularly in the field of Generative AI. DeepSeek has made a strong impact for two key reasons. First, its models are the first LLMs from outside the US to match or even surpass OpenAI’s models in certain benchmarks. Second, they are significantly more cost-effective than other models.
An LLM (Large Language Model) is a type of AI that can understand and generate human-like text. Think of it as a super-smart autocomplete system that has read and learned from massive amounts of text like books, websites, articles, and more.
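To make the autocomplete analogy concrete, here is a minimal sketch in Python of a toy next-word predictor. It is not how DeepSeek or any real LLM is built: real models use neural networks trained on vast amounts of text, while this example only counts which word follows which in a tiny made-up sample. The function names and the sample text are illustrative assumptions.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def autocomplete(model, prompt, max_new_words=5):
    """Extend the prompt by repeatedly appending the most frequent follower."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_words):
        candidates = model.get(words[-1])
        if not candidates:
            break  # the last word was never seen with a follower
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Tiny made-up sample text (an assumption for illustration only).
corpus = (
    "the model reads text . the model reads books . "
    "the model predicts the next word"
)
model = train_bigram_model(corpus)
print(autocomplete(model, "the", max_new_words=2))  # prints: the model reads
```

The sketch keeps only the core idea of predicting the next word from what came before. An actual LLM looks at far more context than the single previous word and learns its predictions rather than counting them.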
India’s LLM
India’s tech ecosystem is falling behind in the global AI race. Legacy tech companies like Infosys are not focusing on building foundational LLMs. In the view of these companies, India’s goal should not be to build one more LLM; the country should leave that to the big players in Silicon Valley, who are spending billions of dollars on it. They believe India should instead use these existing models to create synthetic data, build small language models quickly, and train them on appropriate data.
Here is a list of some Indian start-ups and companies that have built their own foundational LLMs.
Sarvam AI's Sarvam 2B
Founded by Vivek Raghavan and Pratyush Kumar, Sarvam AI developed Sarvam 2B, an open-source language model trained on 4 trillion tokens. It supports 10 Indian languages, including Hindi, Tamil, and Telugu, and focuses on translation, summarisation, and other natural language processing (NLP) tasks. The model was developed in collaboration with NVIDIA.
When Sarvam launched its first multilingual LLM, it outperformed peers such as Meta’s Llama and Google’s Gemma, which were trained on bigger datasets. Sarvam aims to democratise AI in India by making language models accessible to businesses, researchers, and developers.
Tech Mahindra's Project Indus
Project Indus was developed by Tech Mahindra's R&D team to address the growing need for AI models focused on Indian languages. As an open-source model, it is available on platforms like Hugging Face and is designed to improve enterprise AI solutions for customer support, chatbots, and automation.
Although Project Indus’ reach has been limited, it represents a significant effort by an Indian IT giant to develop homegrown AI capabilities.
Gyan AI's Paramanu
Gyan AI, founded by Venkat Srinivasan, built Paramanu, a family of lightweight AI models with sizes ranging from 13mn to 367mn parameters. Optimised for Indian languages like Assamese, Bangla, Hindi, and Tamil, these models require fewer computational resources while outperforming larger models like GPT-3.5-Turbo in human evaluations.
Gyan AI describes it as a hallucination-free, auto-curating and self-organising research engine built on a transparent and fully auditable language model. Paramanu is particularly useful for start-ups, government agencies, and businesses that need cost-effective AI solutions for Indian-language processing.
Yellow.ai's YellowG
Yellow.ai developed YellowG, a proprietary AI model focused on enterprise automation. YellowG powers customer service AI, chatbots, and voice assistants that help businesses automate interactions while maintaining a human-like conversational experience.
YellowG is used in banking, e-commerce, and telecom industries, enabling round-the-clock customer engagement across multiple communication channels.
Uniphore's Conversational AI
Uniphore’s Conversational AI is an LLM that specialises in AI-driven speech recognition and automation. Its conversational AI models integrate natural language processing (NLP) and machine learning (ML) to improve call center efficiency, reduce customer wait times, and enhance response quality.
Uniphore’s technology supports multiple Indian and global languages, which makes it useful in customer service for enhancing communication, automating responses, and supporting human agents in real time.
Ola's Krutrim
Krutrim, founded by Ola’s Bhavish Aggarwal, is billed as India’s first homegrown large language model and is designed to rival global AI systems. It is trained on a vast dataset of Indian languages and contextual data, enabling accurate text generation and conversational AI.
Krutrim uses NLP to understand human language and help with tasks like writing emails, planning travel, and learning new skills. It is being integrated into Ola’s ecosystem, including mobility, financial services, and customer support, while also serving third-party businesses that require Indianised AI solutions.
Hanooman AI's Hanooman
Developed by a consortium led by IIT Bombay and supported by Reliance Jio, Hanooman is a multimodal AI model capable of handling text, speech, and vision-based tasks in multiple Indian languages. Designed for chatbots, search engines, and video analysis, Hanooman aims to make AI more accessible to Indian businesses.
With Reliance's backing, it is expected to scale rapidly and integrate into various telecom, retail, and digital services.
CoRover's BharatGPT
CoRover, founded by Ankush Sabharwal, built BharatGPT, an AI-powered multilingual virtual assistant designed specifically for Indian government services, banking, and e-commerce. It supports regional Indian languages, helping businesses deploy chatbots and AI-driven customer support for users who prefer vernacular languages over English. BharatGPT plays a crucial role in digital inclusion by enabling seamless communication for non-English speakers across various sectors.
Krutrim vs DeepSeek: Head-to-Head Comparison
Krutrim AI specialises in NLP, enhancing the understanding of local languages and dialects. DeepSeek excels in processing massive datasets, particularly in surveillance and facial recognition technologies.
DeepSeek's latest model, DeepSeek-R1, outperforms Krutrim's most recent model, Krutrim Pro, in several critical metrics, especially in data processing speed, accuracy, and scalability. With 500 billion parameters, DeepSeek-R1 efficiently handles large-scale datasets, processing up to 10 million tokens per second. In comparison, Krutrim Pro, with 150 billion parameters, processes 3 million tokens per second.
DeepSeek’s strengths include image recognition and surveillance, achieving high accuracy rates in facial recognition (97%) and object detection (92%). It is capable of processing 50 terabytes of data daily, excelling in real-time data analysis and large-scale deep learning applications. Meanwhile, Krutrim performs well in sentiment analysis (89%) and text classification (85%), processing 20 terabytes of data daily. Additionally, Krutrim has a 95% accuracy rate in understanding regional languages.
DeepSeek-R1 also surpasses Krutrim Pro in scalability, supporting distributed systems for handling vast datasets, while Krutrim Pro is limited to language-specific tasks. DeepSeek-R1 adapts quickly to new data, training on visual datasets within weeks, while Krutrim Pro requires several months to fine-tune for specific dialects.
Unlike legacy tech companies, new-age AI start-ups in India are taking the initiative to build foundational LLMs within the country, focussing on vernacular languages and the Indian customer base. However, these start-ups are struggling to secure substantial investment for development. In 2024, Indian AI start-ups raised a total of $166mn, significantly lower than the $518.2mn raised in 2022, according to Business Standard.
The US, on the other hand, is investing $500bn to build robust domestic AI infrastructure and fend off foreign competition in the AI space. Britain’s government has launched an AI opportunities action plan, investing around $14bn in AI development. China, too, has produced a powerful global AI competitor in DeepSeek.