Outlook Business Desk
AI chipmaker Nvidia has unveiled Nemotron 3 Super, a 120‑billion‑parameter AI model designed to power advanced reasoning and scale multi-agent workflows efficiently for developers and researchers.
The model ships with open weights under a permissive licence, allowing deployment on workstations, in data centres or on cloud platforms, with full customisation and fine-tuning.
According to the company, Nemotron 3 Super is trained entirely on synthetic data, with Nvidia publishing 10 trillion tokens of it along with detailed reinforcement-learning methods for research and development.
Designed to handle multi-agent workflows, Nemotron 3 Super offers a 1‑million-token context window, letting it retain full workflow state and avoid goal drift across tasks.
Nemotron 3 Super uses a hybrid mixture-of-experts architecture, selectively activating 12 billion of its 120 billion parameters. This improves inference speed and accuracy while economising on memory and compute resources.
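In rough terms, mixture-of-experts routing means a small gate picks which expert sub-networks run for each token, so only a fraction of the total parameters does any work. The sketch below is purely illustrative (the expert count and routing rule are made up, not Nvidia's implementation); it only shows the top-k selection idea behind the 12-of-120-billion figure.

```python
# Toy sketch of mixture-of-experts routing (illustrative only, not
# Nvidia's actual architecture). A router scores every expert, but
# only the top-k highest-scoring experts are executed per token, so
# only a fraction of total parameters is active.

import random

NUM_EXPERTS = 10   # hypothetical expert count
TOP_K = 1          # experts activated per token (~1/10, mirroring 12B of 120B)

def route(score_fn, num_experts=NUM_EXPERTS, k=TOP_K):
    """Return indices of the k highest-scoring experts for one token."""
    scores = [score_fn(e) for e in range(num_experts)]
    return sorted(range(num_experts), key=lambda e: scores[e], reverse=True)[:k]

random.seed(0)
active = route(lambda e: random.random())
print(f"active experts: {active} -> {TOP_K}/{NUM_EXPERTS} of capacity used")
```

Because the inactive experts are never evaluated, compute per token scales with the active fraction rather than the full parameter count.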
Meanwhile, Multi-Token Prediction allows the model to predict multiple future tokens simultaneously, delivering up to 3x faster inference while maintaining high accuracy on AI agent tasks.
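The speed-up from multi-token prediction is easy to see with back-of-the-envelope arithmetic: if each decoding step emits k tokens instead of one, the number of sequential steps drops roughly k-fold. The snippet below is a hedged illustration of that counting argument, not Nvidia's decoding code.

```python
# Illustrative arithmetic for multi-token prediction: emitting several
# tokens per decoding step reduces the number of sequential steps,
# which is where the claimed ~3x inference speed-up comes from.

def decode_steps(num_tokens: int, tokens_per_step: int) -> int:
    """Sequential decoding steps needed to emit num_tokens."""
    return -(-num_tokens // tokens_per_step)  # ceiling division

baseline = decode_steps(3000, 1)  # one token per step: 3000 steps
mtp = decode_steps(3000, 3)       # three tokens per step: 1000 steps
print(baseline, mtp, baseline / mtp)  # -> 3000 1000 3.0
```

In practice the gain depends on how often the extra predicted tokens are accepted, which is why the article's figure is an upper bound of sorts.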
Running on Nvidia’s Blackwell GPUs with NVFP4 precision, Nemotron 3 Super cuts memory usage and speeds up inference by up to 4x, supporting high-performance workflows efficiently.
The model can also handle complex subtasks such as loading entire codebases, processing thousands of pages of reports, and performing accurate tool-calling for tasks like cybersecurity.
Nemotron 3 Super is accessible via Perplexity, OpenRouter, Hugging Face and build.nvidia.com, and is integrated into AI agents like CodeRabbit, Factory and Greptile for enhanced efficiency and accuracy.