Microsoft Restores Azure Services After Major Global Outage — Here’s What Went Wrong

Outlook Business Desk

Global Azure Outage

Microsoft confirmed a worldwide Azure cloud outage on October 29 that disrupted key services including Microsoft 365, Outlook, Xbox Live, and Copilot. The company later said it had fully restored all affected services after more than eight hours of disruption across multiple industries.

DNS Failure

The company said the outage was triggered by a Domain Name System (DNS) failure in Azure Front Door (AFD), its global content delivery and routing service. The incident ran from 15:45 UTC on October 29 to 00:05 UTC on October 30.

Services Affected

The disruption impacted major Azure-based services, including App Service, Azure Databricks, Healthcare APIs, Azure SQL Database, Virtual Desktop, and Microsoft’s Copilot and Sentinel tools, affecting both consumers and enterprises worldwide.

Global Industries Affected

The outage also affected key sectors worldwide. Alaska Airlines reported disruptions to its website, while Vodafone UK and Heathrow Airport also faced interruptions tied to Azure downtime, slowing access to critical business systems and tools.

Updates Paused

Microsoft temporarily blocked all customer configuration changes to Azure Front Door while recovery continued. The company said a small number of users might still face issues, with updates shared through Azure Service Health.

Configuration Error

The company said a faulty tenant configuration in Azure Front Door caused invalid settings across several nodes, disrupting content delivery and routing. This led to slower performance, connection errors, and service failures across multiple dependent platforms.

Controlled System Recovery

To stabilise the system, Microsoft rolled out its last known stable configuration and gradually redirected global traffic. This step-by-step recovery helped restore full operations without overloading the network as nodes came back online.
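
For readers curious how such a recovery works in principle, here is a minimal Python sketch of the two ideas described above: picking the last known good configuration and ramping traffic back in stages. It is an illustration under stated assumptions, not Microsoft's tooling; the `ConfigStore` class, the version labels, and the four-step ramp are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigStore:
    """Hypothetical record of deployed configuration versions."""
    history: list = field(default_factory=list)  # (version, is_healthy) pairs

    def record(self, version: str, is_healthy: bool) -> None:
        self.history.append((version, is_healthy))

    def last_known_good(self) -> str:
        # Walk backwards to find the most recent version marked healthy.
        for version, is_healthy in reversed(self.history):
            if is_healthy:
                return version
        raise LookupError("no healthy configuration on record")


def ramp_schedule(steps: int) -> list:
    """Return traffic percentages for a gradual ramp, ending at 100."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [round(100 * i / steps) for i in range(1, steps + 1)]


store = ConfigStore()
store.record("v41", True)
store.record("v42", True)
store.record("v43", False)  # the faulty deployment in this toy example

target = store.last_known_good()
for percent in ramp_schedule(4):
    # A real system would check node health before each increase.
    print(f"route {percent}% of traffic to nodes running {target}")
```

Shifting traffic in stages, rather than all at once, gives operators a chance to stop if the restored nodes start to struggle under load.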

Software Defect Found

Microsoft found that a software defect in its validation system allowed a faulty configuration to slip through safety checks. The company has now strengthened safeguards, improved rollback systems, and added extra validation steps to prevent similar issues in the future.
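
The role of a validation step can be sketched in a few lines of Python. This is a simplified illustration, not Azure Front Door's actual checks: the `tenant_id`, `routes`, and `origin` fields are assumed for the example, and each node is modelled as a plain list. The point is that deployment "fails closed", so a configuration with any detected problem never reaches a node.

```python
def validate_tenant_config(config: dict) -> list:
    """Return a list of problems; an empty list means the config passes."""
    problems = []
    # Hypothetical required keys, chosen only for illustration.
    for key in ("tenant_id", "routes"):
        if key not in config:
            problems.append(f"missing required key: {key}")
    routes = config.get("routes", [])
    if not isinstance(routes, list) or not routes:
        problems.append("routes must be a non-empty list")
    else:
        for i, route in enumerate(routes):
            if not isinstance(route, dict) or not route.get("origin"):
                problems.append(f"route {i} has no origin")
    return problems


def deploy(config: dict, nodes: list) -> bool:
    """Push config to every node only if validation finds no problems."""
    if validate_tenant_config(config):
        # Fail closed: a config with any problem is never propagated.
        return False
    for node in nodes:
        node.append(config)
    return True


good = {"tenant_id": "t-1", "routes": [{"origin": "app.example.com"}]}
bad = {"tenant_id": "t-2", "routes": [{"origin": ""}]}

nodes = [[], []]  # each inner list stands in for one node's applied configs
print(deploy(good, nodes))  # accepted and applied
print(deploy(bad, nodes))   # rejected before reaching any node
```

A defect in a check like `validate_tenant_config` is exactly the kind of gap that lets a bad configuration through, which is why a second, independent validation layer is a common safeguard.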
