
Microsoft Unveils Phi-4-Reasoning Small Language Models to Rival OpenAI’s o3-Mini


Microsoft on Wednesday unveiled three small language models: Phi-4-reasoning, Phi-4-reasoning-plus and Phi-4-mini-reasoning. The most capable of these models demonstrates performance comparable to OpenAI’s o3-mini on at least one benchmark.


“We are excited to introduce Phi-4-reasoning, Phi-4-reasoning-plus and Phi-4-mini-reasoning—marking a new era for small language models and once again redefining what is possible with small and efficient AI,” Microsoft stated in a blog post on its website.

As their names suggest, the new permissively licensed models are reasoning models, designed to spend more inference time working through and verifying answers to complex queries. They expand Microsoft's Phi family of small models, introduced a year ago to give AI developers a foundation for building edge applications.

Phi-4-reasoning and Phi-4-reasoning-plus

Phi-4-reasoning is a 14-billion-parameter open-weight reasoning model that rivals much larger models on complex reasoning tasks. Trained through supervised fine-tuning on carefully selected reasoning demonstrations from OpenAI’s o3-mini, it constructs detailed reasoning chains, effectively leveraging additional inference-time computation.

The model demonstrates that meticulous data curation and high-quality synthetic datasets enable smaller models to compete with larger ones.
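To make that concrete, here is a hypothetical sketch of what one curated reasoning demonstration could look like in chat format. The field names and the <think> delimiters are illustrative assumptions, not Microsoft's published data schema.

```python
# A hypothetical supervised fine-tuning example: a prompt paired with a
# teacher-written reasoning trace and a final answer. Field names and the
# <think> delimiters are illustrative assumptions, not a published schema.
sft_example = {
    "messages": [
        {"role": "user", "content": "If 3x + 7 = 25, what is x?"},
        {
            "role": "assistant",
            "content": (
                "<think>Subtract 7 from both sides: 3x = 18. "
                "Divide both sides by 3: x = 6.</think>\n"
                "x = 6"
            ),
        },
    ]
}
```

Fine-tuning on many such traces teaches the smaller model to reproduce the teacher's step-by-step style rather than just its final answers.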


Phi-4-reasoning-plus builds on Phi-4-reasoning's capabilities with reinforcement learning that encourages longer chains of thought, consuming roughly 1.5 times as many inference-time tokens as Phi-4-reasoning to achieve greater accuracy.
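One way to observe that extra inference-time budget is to count the tokens a model spends inside its reasoning trace, as in the sketch below. It assumes the models wrap their chain of thought in <think> tags, per the convention described on their Hugging Face model cards; the repo id is likewise an assumption drawn from that listing.

```python
from transformers import AutoTokenizer

# Count the tokens spent inside the <think>...</think> reasoning trace.
# The tag convention and repo id are assumptions from the model card.
tok = AutoTokenizer.from_pretrained("microsoft/Phi-4-reasoning")

def reasoning_token_count(output_text: str) -> int:
    start = output_text.find("<think>")
    end = output_text.find("</think>")
    if start == -1 or end == -1 or end < start:
        return 0
    trace = output_text[start + len("<think>"):end]
    return len(tok.encode(trace, add_special_tokens=False))
```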

Despite their significantly smaller size, both Phi-4-reasoning and Phi-4-reasoning-plus surpass OpenAI’s o1-mini and DeepSeek-R1-Distill-Llama-70B on most benchmarks, including mathematical reasoning and PhD-level science problems.

They also outperform the full DeepSeek-R1 model (671 billion parameters) on AIME 2025, the 2025 qualifying exam for the US Math Olympiad. Both models are accessible through Azure AI Foundry and Hugging Face.
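For those who want to try the model locally, a minimal sketch using the Hugging Face transformers library follows. The repo id is assumed from the Hugging Face listing, and the sampling settings are illustrative defaults rather than recommended values; Phi-4-reasoning-plus should load the same way under its own repo id.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-4-reasoning"  # assumed Hugging Face repo id
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are there below 100?"}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models spend tokens on their chain of thought before the
# final answer, so allow a generous generation budget.
out = model.generate(inputs, max_new_tokens=4096, do_sample=True, temperature=0.8)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```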

Phi-4-mini-reasoning

Phi-4-mini-reasoning is designed to meet the demand for a compact reasoning model. This transformer-based language model is optimised for mathematical reasoning, delivering high-quality, step-by-step problem solving in environments with limited computational resources or tight latency budgets.

Fine-tuned using synthetic data generated by the DeepSeek-R1 model, Phi-4-mini-reasoning balances efficiency with advanced reasoning capabilities.

Trained on more than 1 million diverse math problems spanning middle-school to PhD level, it is well suited to educational applications, embedded tutoring and lightweight deployment on edge or mobile systems. The model is available for testing on Azure AI Foundry and Hugging Face.
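As one sketch of that resource-constrained use case, the snippet below loads Phi-4-mini-reasoning with 4-bit quantisation via bitsandbytes. The repo id and quantisation settings are assumptions, and 4-bit loading requires a CUDA GPU, so treat this as an illustration rather than a recommended configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/Phi-4-mini-reasoning"  # assumed Hugging Face repo id

# 4-bit weights shrink the memory footprint for constrained hardware.
quant = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=quant, device_map="auto"
)

messages = [{"role": "user", "content": "Solve for x: 2x^2 - 8 = 0"}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=1024)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```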
