Abu Dhabi’s G42 Launches Largest Hindi Language AI Model, NANDA 87B

An 87B-parameter Hindi-centric AI model pushing the boundaries of regional language intelligence.

Manu Jain, CEO of G42 India
Summary
  1. G42 has released NANDA 87B, an open-weight Hindi–English large language model developed with MBZUAI, Inception, and Cerebras.

  2. The 87-billion-parameter model is built on Llama-3.1 and trained on over 65 billion Hindi tokens.

  3. It supports Hindi, Hinglish, and bilingual use cases, including translation, summarisation, and instruction-following.

  4. NANDA 87B is available as an open-weight model via MBZUAI’s Hugging Face repository.

Abu Dhabi-based technology conglomerate G42 on Monday announced the release of NANDA 87B, an open-weight Hindi-English large language model, expanding its work in regional language AI. The model was developed by Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) in collaboration with Inception, a G42 subsidiary, and AI hardware company Cerebras.

The release comes amid growing interest in regional language AI in India, as the country’s internet user base continues to expand beyond English-speaking audiences. Industry estimates indicate that a majority of new internet users in India prefer content in local languages, increasing demand for language models tailored to Indian linguistic contexts.

NANDA 87B contains 87 billion parameters and is built on the Llama-3.1 70B architecture. 

According to G42, the model was trained on a curated bilingual dataset with more than 65 billion Hindi tokens, making it one of the largest Hindi-centric language models released with open weights.

The model uses a Hindi-focused tokenizer designed to improve efficiency and reduce computational overhead during both training and inference. It is designed to handle formal Hindi written in Devanagari script, conversational Hindi, and Hinglish, reflecting common language usage across digital platforms in India. G42 said the model performs tasks such as translation, summarisation, transliteration, and instruction-following.
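
G42 has not described how the NANDA tokenizer is built, so the following is only a rough illustration of why a Hindi-focused vocabulary can matter. It assumes a tokenizer that falls back to raw UTF-8 bytes when its vocabulary lacks Devanagari pieces; in that case each Devanagari character costs three bytes, against one for an ASCII letter. The function name and sample words are chosen for illustration and are not drawn from the model.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per Unicode code point in `text`."""
    return len(text.encode("utf-8")) / len(text)


english = "language"
hindi = "भाषा"  # "language" in Hindi, written in Devanagari

# ASCII letters take one byte each; Devanagari code points take three.
print(utf8_bytes_per_char(english))  # 1.0
print(utf8_bytes_per_char(hindi))    # 3.0
```

A vocabulary that stores common Hindi words or syllables as single tokens avoids this byte-level fragmentation, which is one plausible source of the efficiency gains G42 describes.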

Commenting on the release, Manu Jain, CEO of G42 India, said the updated model builds on the first NANDA release announced last year. “This version represents a step forward in scale and capability compared with the earlier model,” Jain said, adding that the company is continuing to expand its AI operations in India. He said potential use cases could include education, media, and enterprise applications.

On the research side, MBZUAI said the project aligns with its focus on foundation models for underrepresented languages. Richard Morton, Executive Director of the Institute of Foundation Models at MBZUAI, said the work reflects an effort to make high-quality language models more widely accessible. “Our aim is to support large linguistic communities through open-access language technology and collaborative research,” Morton said.

Inception, which focuses on AI product development within G42, said the open-weight nature of the model allows developers and organisations to adapt it for different use cases. Ashish Koshy, CEO of Inception, said the model is intended for developers, educators, content creators, and enterprises working across India’s digital ecosystem.

The model was trained on Condor Galaxy, a large-scale AI supercomputer built by G42 and Cerebras for training and inference workloads. Cerebras provides specialised hardware designed for large language model training at scale.

NANDA 87B has been released as an open-weight model and is available through MBZUAI’s Hugging Face repository, allowing researchers, developers, and enterprises to access and deploy the model. G42 said the release is part of its broader work on multilingual AI systems focused on non-English languages.
