News

DeepSeek Rolls Out V3 Model Updates, Strengthening Programming Capabilities to Outpace OpenAI

Chinese AI start‑up DeepSeek upgrades its V3 model with the V3‑0324 update, enhancing programming capabilities and shifting to the MIT license. This strategic move aims to outpace competitors and drive cost‑effective AI innovation.

DeepSeek AI

Chinese AI start‑up DeepSeek has added updates to its V3 model aimed at boosting its programming capabilities in an effort to outpace competitors, Bloomberg reported.

The V3‑0324 update was posted on the AI community platform Hugging Face without an accompanying public statement. The update was also published on GitHub, and the model’s license was changed to the MIT license, a widely used open‑source license originating at the Massachusetts Institute of Technology.

DeepSeek's update follows the unveiling of its R1 model in late January, which appeared to match or even outperform competitors' models while using older hardware and only a fraction of their budgets. V3 is an earlier DeepSeek model; R1's showing sparked a debate over whether cutting‑edge platforms can be developed for considerably less than the billions of dollars that US companies are spending on data‑centre infrastructure.

What is DeepSeek‑V3?

DeepSeek‑V3 is a mixture‑of‑experts (MoE) language model with 671 billion parameters, 37 billion of which are activated per token. It was trained on 14.8 trillion high‑quality tokens, enabling it to handle complex tasks in coding, mathematics and reasoning.

The model is built upon an innovative architecture. It includes Multi‑head Latent Attention (MLA), which improves the model’s inference efficiency; an auxiliary‑loss‑free load‑balancing strategy for its MoE layers; and Multi‑Token Prediction (MTP), which helps it handle complex prompts.
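The key idea behind MoE, described above, is that a router sends each token to only a few experts, so only a fraction of the total parameters run per token. The following is a minimal sketch of top‑k MoE routing; the expert count, dimensions and k are toy values for illustration, not DeepSeek‑V3's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, k = 16, 8, 2  # toy sizes, not DeepSeek-V3's real config
router = rng.standard_normal((d_model, n_experts))
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x):
    """Route token x to its top-k experts; only those k experts run."""
    logits = x @ router
    top = np.argsort(logits)[-k:]                 # indices of the top-k experts
    w = np.exp(logits[top])
    w /= w.sum()                                  # softmax over the top-k only
    # Only k of n_experts weight blocks are touched for this token, which is
    # why "activated" parameters are a small fraction of the total.
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

x = rng.standard_normal(d_model)
y = moe_forward(x)
```

Here each token activates k / n_experts = 2/8 of the expert parameters, mirroring (at toy scale) how DeepSeek‑V3 activates 37 billion of its 671 billion parameters per token.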

DeepSeek Wave

The Chinese AI start‑up recently created a buzz in the AI industry as its large language models (LLMs) reportedly outperformed top models from industry pioneer OpenAI. The start‑up’s R1 model, unveiled in late January, and the latest V3 model beat OpenAI’s o1 Preview and GPT‑4o on multiple benchmarks.

DeepSeek recently revealed cost and revenue figures that put its ‘theoretical profit’ at more than five times its costs. The data associated with DeepSeek’s V3 and R1 models claimed a cost‑profit ratio of up to 545% per day.

The start‑up, however, clarified that these are hypothetical numbers and that the actual revenue could be significantly lower. This is because it has monetised only a small set of its services and offers discounts during off‑peak hours. The costs also don’t factor in all the R&D and training expenses for building its models.
