On October 15, 2025, at approximately 4:20 PM UTC, the US cloud platform became largely unavailable because the Redis service was overloaded by poorly managed large batch deletions of assets. The incident was declared at 6:01 PM UTC, and customer impact ended by 6:19 AM UTC on October 16. The issue was fully resolved and closed on October 17 at 8:08 AM UTC.
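Incident Detection & Customer Impact Start: October 15, 2025, approximately 4:20 PM UTC
Incident Declaration: October 15, 2025, 6:01 PM UTC
Stable State Achieved: October 16, 2025, 6:19 AM UTC
Incident Resolution: October 17, 2025, 8:08 AM UTC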
The US cloud was largely unavailable to users for a total of 13 hours and 59 minutes, from October 15, 4:20 PM UTC to October 16, 6:19 AM UTC. The outage primarily impacted workflow V1 projects.
The US cloud was largely unavailable because our Redis service was overloaded. Large batch deletions of assets were poorly managed on our side and sent too many commands to Redis. These deletions also ran as very long transactions, which impacted our PostgreSQL database.
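To illustrate the failure mode described above, here is a minimal sketch of one common way to bound the load of a large deletion: split it into small chunks and pause between them. It is not a description of our actual fix. The names `chunked`, `delete_in_chunks`, and `delete_chunk`, as well as the default chunk size and pause, are hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Iterator, Sequence


def chunked(ids: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` ids."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def delete_in_chunks(
    asset_ids: Sequence[str],
    delete_chunk: Callable[[Sequence[str]], None],
    chunk_size: int = 500,   # assumed value, to be tuned against real load
    pause_s: float = 0.05,   # assumed value, gives other traffic room between chunks
) -> int:
    """Delete assets chunk by chunk and return the number of chunks processed.

    `delete_chunk` is a hypothetical callback. In a real system it would
    remove one chunk in a single short database transaction and send the
    matching cache commands as one bounded batch.
    """
    chunks = 0
    for chunk in chunked(asset_ids, chunk_size):
        delete_chunk(chunk)   # one short unit of work per chunk
        chunks += 1
        time.sleep(pause_s)   # throttle so the backend is not flooded
    return chunks
```

With this shape, a deletion of 1,200 assets at a chunk size of 500 becomes three short operations instead of one very long one, so no single transaction stays open for the whole deletion and the number of commands sent at once is capped by the chunk size.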
Immediate mitigation: Services were stabilized to restore normal operations.
Long-term mitigations:
We sincerely apologize for the inconvenience caused by this incident and its impact.
Thank you for your patience and continued trust.
The Kili Team