Heather Oliver is a Technical Writer for Constellix and DNS Made Easy, subsidiaries of Tiggee LLC. She’s fascinated by technology and loves adding a little spark to complex topics. Want to connect? Find her on LinkedIn.
Overloaded Azure DNS Servers to Blame For Microsoft Outage
After a massive DNS-related outage on April Fools’ Day 2021, Microsoft has made a full recovery and released a detailed incident report explaining what went wrong.
Spikes in DNS Queries Overwhelm Azure’s System
According to Azure, its servers received an atypical surge of global DNS queries directed at some of the domains hosted on its platform. The company says its layers of caches would usually be able to handle a surge of this nature, but a specific sequence of events during the incident exposed a code defect that allowed the system to become overloaded. The overload was largely due to the number of retry requests made by DNS clients, which Azure’s mitigation systems still treated as genuine DNS traffic.
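To see why client retries matter so much, here is a toy model in Python. It is only a sketch under simplifying assumptions: every attempt fails independently with the same probability, and each client gives up after a fixed number of retries. The function names and numbers are illustrative and are not taken from Microsoft’s incident report.

```python
def expected_queries_per_lookup(failure_prob: float, max_retries: int) -> float:
    """Average number of queries one lookup generates in this toy model.

    Assumption: each attempt fails independently with `failure_prob`, and the
    client retries after every failure, up to `max_retries` extra attempts.
    """
    if not 0.0 <= failure_prob <= 1.0:
        raise ValueError("failure_prob must be between 0 and 1")
    if max_retries < 0:
        raise ValueError("max_retries must not be negative")
    # Attempt k (0-based) is only sent if the previous k attempts all failed.
    return sum(failure_prob ** k for k in range(max_retries + 1))


def offered_load(base_qps: float, failure_prob: float, max_retries: int) -> float:
    """Total queries per second reaching the servers, retries included."""
    return base_qps * expected_queries_per_lookup(failure_prob, max_retries)


if __name__ == "__main__":
    # Hypothetical numbers: 100,000 lookups per second, up to 3 retries each.
    for p in (0.0, 0.5, 0.9):
        print(f"failure rate {p:.0%}: {offered_load(100_000, p, 3):,.0f} queries/s")
```

In this model, a 90% failure rate with three retries more than triples the load on the servers. Every one of those extra queries is a legitimate request, which may help explain why mitigation systems would not filter them out.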
Microsoft Scrambles to Fix Code Defect After Outage
Since Microsoft domains, including Xbox Live and Office 365, use Azure’s DNS, the entire product line was affected to one degree or another. As preventative measures, Microsoft is working to correct the code defect and to improve the way it monitors usual traffic.
Even if Microsoft does manage to improve its systems, one has to wonder why domains still rely on Azure DNS as their sole provider in the first place, or even why Microsoft does. With Secondary DNS, a second, independent provider keeps answering queries when the first cannot, so incidents like this can easily be avoided. Rather than creating services and cloud environments that purposefully or inadvertently lead to vendor lock-in, companies should be putting the needs of their clients and their clients’ customers first.
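The idea behind Secondary DNS can be sketched in a few lines of Python. This is a simulation, not a real DNS client: the nameserver names, the `fake_query` function, and the address it returns are all hypothetical stand-ins. Real recursive resolvers already try the other nameservers listed for a domain when one fails to respond, so the loop below only illustrates that behavior.

```python
from typing import Callable, Optional, Sequence

# Hypothetical nameservers from two independent providers (reserved .example names).
NAMESERVERS = [
    "ns1.primary-dns.example",
    "ns1.secondary-dns.example",
]


def resolve_with_failover(
    name: str,
    nameservers: Sequence[str],
    query: Callable[[str, str], str],
) -> Optional[str]:
    """Ask each nameserver in turn and return the first answer.

    `query(nameserver, name)` is a stand-in for a real DNS lookup; it should
    return an address or raise OSError (for example TimeoutError) on failure.
    """
    for nameserver in nameservers:
        try:
            return query(nameserver, name)
        except OSError:
            continue  # this provider is unreachable or overloaded, try the next
    return None  # every provider failed


def fake_query(nameserver: str, name: str) -> str:
    """Simulated lookup in which the primary provider is overloaded."""
    if "primary" in nameserver:
        raise TimeoutError(f"{nameserver} did not respond")
    return "203.0.113.10"  # documentation-only address (TEST-NET-3)


if __name__ == "__main__":
    print(resolve_with_failover("www.example.com", NAMESERVERS, fake_query))
```

With only the primary provider in the list, the same lookup returns nothing, which is the single-provider situation described above.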
Infrastructure, peering, and transit capacity play a major role in keeping DNS functioning smoothly and efficiently. But as Microsoft’s recent outage demonstrates, it takes even more than that. One huge downside of companies that provide multiple cloud services is their lack of dedicated DNS management. With all services running on the same network, performance suffers, and the likelihood of servers becoming overloaded increases.
As customers continue to voice their dissatisfaction when outages occur, it is our hope that providers across our industry will begin working to improve internet experiences as a whole, for everyone and not just for themselves. No matter how big a provider is, it is impossible to carry the weight of the entire world’s domains on its infrastructure, and to pretend otherwise is just ridiculous.