[Europe] Production incident - Assets loading issue
Incident Report for Kili
Postmortem

[Europe] Production issue September 26th, 2024 

Summary

An issue with our Asset distribution system occurred on September 26th, preventing users from loading the queue page. 

Incident Timing (UTC)

The customer impact lasted from Sep 26, 2024 06:30 am UTC to Sep 26, 2024 02:35 pm UTC.

Incident Timeline (UTC)

  • September 26th, 6:00 am

    • First internal alert
  • September 26th, 6:30 am

    • First alert from the customer
  • September 26th, 6:45 am 

    • Incident created internally
    • First investigations
  • September 26th, 7:25 am 

    • The issue has been identified and a fix is being implemented. 
  • September 26th, 2:35 pm 

    • A fix has been implemented
    • Incident resolved

End-User Impact

The queue page loaded indefinitely when users tried to open it for large projects.

What caused the incident?

The memory consumption limits we had set on some services prevented them from executing heavy tasks.

Corrective elements put in place to ensure that this does not happen again

We decreased the size of the request sent to our asset distribution service.

We increased the memory limit of the asset distribution service.
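
As an illustration of the first corrective action (smaller requests), one common approach is to split a large list of assets into fixed-size batches so that no single request forces the service to hold everything in memory at once. The sketch below is a generic example under that assumption, not the actual implementation: `split_into_batches`, `MAX_ASSETS_PER_REQUEST`, and the value 100 are hypothetical names and numbers chosen for illustration.

```python
from typing import Iterator, List, Sequence

# Hypothetical cap on how many assets a single request may reference.
MAX_ASSETS_PER_REQUEST = 100


def split_into_batches(
    asset_ids: Sequence[str],
    batch_size: int = MAX_ASSETS_PER_REQUEST,
) -> Iterator[List[str]]:
    """Yield consecutive slices of asset_ids, each at most batch_size long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(asset_ids), batch_size):
        yield list(asset_ids[start:start + batch_size])


if __name__ == "__main__":
    # A project with 250 assets is requested in three smaller calls.
    ids = [f"asset-{i}" for i in range(250)]
    for batch in split_into_batches(ids):
        print(len(batch))
```

With a bounded batch size, the memory needed per request no longer grows with the size of the project, which is the property that matters for large projects.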

Posted Oct 08, 2024 - 12:44 UTC

Resolved
An issue with our Asset distribution system occurred on September 26th, preventing users from loading the queue page.
Posted Sep 26, 2024 - 06:30 UTC