Frontend delays
Incident Report for Kili
Postmortem

We are very sorry for the slowness that you experienced in production this morning.

This was due to a sudden increase in user activity.

Please find below the actions taken to prevent such events in the future.

Explanation

The “Start labeling” button was not available because the underlying SQL query that prepares the labeling queue was too slow.

The query was slow because of a bug in the management of user sessions.

Remediation plan

Short term

We fixed the slow query by cleaning up the table it relies on, which cut the response time by a factor of 40.

We also increased the memory available to the backend to ease horizontal scalability.

Long term

Further actions will be taken in the coming weeks:

  • Improve asset serving (new service + signed URLs)
  • Increase the size of the DB
  • Decommission the internal scheduler service
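
For readers unfamiliar with signed URLs, the idea can be sketched as follows. This is a minimal illustration in Python and not a description of the planned service: the secret key, the asset path, and the `expires` and `signature` parameter names are all hypothetical. A signed URL carries an expiry time and an HMAC of the path, so a separate service can check access without a database lookup.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical secret shared by the signer and the asset service.
SECRET_KEY = b"example-secret"


def sign_url(path: str, expires_at: int, key: bytes = SECRET_KEY) -> str:
    """Return `path` with an expiry timestamp and an HMAC-SHA256 signature."""
    message = f"{path}:{expires_at}".encode()
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires_at, "signature": signature})
    return f"{path}?{query}"


def verify_url(url: str, now: Optional[int] = None, key: bytes = SECRET_KEY) -> bool:
    """Check that the signature matches the path and the URL has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False  # missing or malformed parameters
    current = int(time.time()) if now is None else now
    if current > expires_at:
        return False  # link has expired
    message = f"{parsed.path}:{expires_at}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    return hmac.compare_digest(signature, expected)
```

A link produced by `sign_url("/assets/image-1.png", int(time.time()) + 300)` would stay valid for five minutes. Changing the path or the expiry in the URL invalidates the signature.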
Posted Dec 20, 2021 - 10:38 UTC

Resolved
This incident has been resolved.
Posted Dec 16, 2021 - 09:15 UTC
Investigating
We're experiencing a degradation of frontend user experience and are currently looking into the issue.
Posted Dec 16, 2021 - 02:30 UTC
This incident affected: Europe (Kili API).