Upon investigation, we found several nodes in a partially unhealthy state, which appeared to be the underlying cause.
These nodes still counted towards overall cluster capacity, preventing new nodes from being added automatically. Additional capacity was requested manually, which resolved the problems caused by the shortfall.
The problematic nodes were then removed through a mix of manual and automated means, returning the cluster to full health.
Service was impacted for the US data centre from 08:05 to 08:35.
Posted Dec 27, 2019 - 09:24 GMT
Steps have been taken to resolve the issue, and we are monitoring the system.
Posted Dec 27, 2019 - 08:40 GMT
Failed responses and degraded performance in the US data centre.
Posted Dec 27, 2019 - 08:22 GMT
This incident affected: API, Background Processing, and Web App.