Assignment Canvas Crash
Incident Report for Todd Rylaarsdam
Postmortem

Changes integrating LogRocket’s services into Assignment Canvas (AC) were deployed to production. Immediately after deployment, the PM2-managed process that runs the main Node.js app began erroring out and restarting repeatedly. As a result, nginx could not forward traffic to the Express app, and users received 504 errors during this time.

PM2 triggered an automatic rollback after 3 minutes of these errors. Normal service was restored once GitHub CD finished deploying a fixed, stable version of AC to production.
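
The report does not say which part of the integration failed. Assuming, purely for illustration, that new startup code threw an uncaught error, one common mitigation is to treat non-essential integrations as optional at boot so that a failure is logged instead of crashing the process. The sketch below is not AC's actual code: `initOptional` and `initSessionReplay` are hypothetical names, and the stand-in initializer does not call any real LogRocket API.

```typescript
// Hypothetical helper: run an optional integration's startup code without
// letting a thrown error take down the whole Node.js process.
type InitResult = { name: string; ok: boolean; error?: string };

function initOptional(name: string, init: () => void): InitResult {
  try {
    init();
    return { name, ok: true };
  } catch (err) {
    // Log and continue: the integration is disabled, but the app keeps running.
    const message = err instanceof Error ? err.message : String(err);
    console.error(`[startup] optional integration "${name}" failed: ${message}`);
    return { name, ok: false, error: message };
  }
}

// Stand-in for a third-party SDK setup call that throws at startup
// (assumed failure mode, e.g. a missing configuration value).
function initSessionReplay(): void {
  throw new Error("missing app id");
}

const result = initOptional("session-replay", initSessionReplay);
// Execution reaches this point even though the initializer threw, so the
// HTTP server could still start and nginx would have an upstream to reach.
```

With a guard like this, the failure shows up in the logs as a disabled integration rather than as a restart loop behind nginx.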

Posted Mar 18, 2021 - 10:40 CDT

Resolved
This incident has been resolved.
Posted Mar 18, 2021 - 10:32 CDT
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Mar 18, 2021 - 10:31 CDT
Update
A fix is being deployed to production.
Posted Mar 18, 2021 - 10:30 CDT
Identified
The issue has been identified and a fix is being implemented.
Posted Mar 18, 2021 - 10:29 CDT
Investigating
We are currently investigating this issue.
Posted Mar 18, 2021 - 10:25 CDT
This incident affected: Assignment Canvas.