At about 11.25pm BST on 28 September, the company tweeted that it was investigating an issue affecting access to multiple Microsoft 365 services. “We’re working to identify the full impact and will provide more information shortly,” it wrote in a tweet from its MSFT365Status account.
Microsoft confirmed it was investigating the availability of the Azure AD service, which organisations use to authenticate to Microsoft 365. “Customers using Azure Active Directory may experience HTTP 503 errors when accessing the Azure portal,” it said.
However, some people on Twitter observed that the outage had implications extending beyond the Azure portal. One IT admin tweeted: “None of my clients with Azure AD-backed applications can log in or authenticate right now.”
Another noted that their corporate application was down because Azure AD authentication was unavailable. Some people complained they could not use Teams or other Microsoft online services such as Outlook 365.
The company initially said it had identified a recent change that appeared to be the source of the issue and announced it had rolled back the change to mitigate the impact. While monitoring the IT environment, Microsoft admitted it had not observed an increase in successful connections after rolling back the recent change. “We’re working to evaluate additional mitigation solutions while we investigate the root cause,” it said.
The additional mitigation involved rerouting network traffic to alternate IT infrastructure, which Microsoft said would improve the user experience while it continued to investigate the issue.
In 2019, Mark Russinovich, chief technology officer at Microsoft Azure, described how Azure AD had been architected so that it had no single point of failure (SPOF). He said Azure AD was a global service with multiple levels of internal redundancy and automatic recoverability and was deployed in over 30 datacentres around the world using Azure Availability Zones.