Frontend delays

Incident Report for Kili

Postmortem

We are very sorry for the slowness that you experienced in production this morning.

This was due to a sudden increase in user activity.

Please find below the actions taken to prevent such events in the future.

Explanation

The “Start labeling” button was unavailable because the underlying SQL query that prepares the labeling queue was too slow.

This was due to a bug in the management of the user session.

Remediation plan

Short term

We solved it by cleaning up the affected table, which divided the response time by 40.

We also increased the memory available to the backend to ease horizontal scalability.

Long term

Further actions will be taken in the coming weeks:

  • Improve asset serving (new service + signed URLs)
  • Increase the size of the DB
  • Decommission the internal scheduler service
Posted Dec 20, 2021 - 10:38 UTC

Resolved

This incident has been resolved.
Posted Dec 16, 2021 - 09:15 UTC

Investigating

We're experiencing a degradation of frontend user experience and are currently looking into the issue.
Posted Dec 16, 2021 - 02:30 UTC
This incident affected: Europe (Europe - Kili API).