Upon investigation, we found several nodes in a partially unhealthy state, which appeared to be the underlying cause.
These nodes still counted towards overall cluster capacity, preventing new nodes from being added automatically. Additional capacity was requested manually, which resolved the problems caused by the shortfall.
The problematic nodes were then removed through a mix of manual and automated means, returning the cluster to full health.
Service was impacted for the US data centre from 08:05 to 08:35.
Posted Dec 27, 2019 - 09:24 GMT
Steps have been taken to resolve the issue, and we are monitoring the system.
Posted Dec 27, 2019 - 08:40 GMT
Failed responses and degraded performance in the US data centre.
Posted Dec 27, 2019 - 08:22 GMT
This incident affected: API, Background Processing, and Web App.