eventScribe service disruption
Incident Report for Cadmium
Postmortem

This incident occurred when a surge in usage caused a CPU spike across the eventScribe.net server pool, and the pool scaled up more quickly than planned. Typically, a new server is put into service from the standby “warm pool,” which contains servers that are ready for use. In this case, CPU levels rose across the cluster faster than the warm pool could supply ready servers. As a result, servers that were not yet ready were put into service, which caused errors. Once those servers became ready, the issue was resolved. We are working to implement changes to prevent this from happening again.
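
The failure mode described above can be illustrated with a minimal sketch of a readiness gate: only servers marked ready are put into service, and any shortfall is reported instead of being filled with unready machines. This is an illustration under stated assumptions, not the eventScribe.net implementation; the `Server` type, the `ready` flag, and the `select_servers_for_service` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str    # hypothetical identifier for illustration
    ready: bool  # True once the server has finished warming up


def select_servers_for_service(warm_pool, needed):
    """Pick up to `needed` ready servers from the warm pool.

    Returns (selected, shortfall), where shortfall is the number of
    requested servers that could not be supplied by ready machines.
    """
    needed = max(needed, 0)
    # Skip servers that are still warming up rather than routing traffic to them.
    ready = [s for s in warm_pool if s.ready]
    selected = ready[:needed]
    shortfall = needed - len(selected)
    return selected, shortfall


# Demand for three servers, but only two in the pool are ready.
pool = [Server("web-1", True), Server("web-2", False), Server("web-3", True)]
selected, shortfall = select_servers_for_service(pool, needed=3)
print([s.name for s in selected], shortfall)
```

In this sketch, a nonzero shortfall signals that demand is outpacing the warm pool; a real system might respond by enlarging the pool or delaying traffic until more servers pass their readiness checks.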

Posted Apr 30, 2024 - 09:00 EDT

Resolved
The issue has been resolved.
Posted Apr 29, 2024 - 08:38 EDT
Monitoring
A fix has been put in place, and service has been restored. Our team is continuing to monitor the system. If you are still experiencing issues, please notify your project manager.
Posted Apr 26, 2024 - 12:26 EDT
Investigating
Our team is aware of a service disruption affecting the eventScribe website for some users. We are working swiftly to identify the root cause.
Posted Apr 26, 2024 - 11:46 EDT
This incident affected: eventScribe Website.