Eptura Asset Detailed Root Cause Analysis (RCA) – Severity 2 Event May 2, 2024
We are profoundly grateful for your continued support and loyalty. We value your feedback and appreciate your patience as we worked to resolve this incident.
Description:
On May 2, 2024, at 6:02 AM MST, we received reports that customers were not able to print BI reports. This occurred shortly after a standard maintenance release, during which our team attempted a server configuration change to improve overall BI module performance. The configuration change was rolled back because of the errors encountered.
Type of Event:
S2 event - Service disruption. BI module was down.
Services/Modules Impacted:
BI Module
Remediation:
Once DevOps realized the new configuration was not working, they immediately initiated the roll back.
Timeline:
5-2-24 3:00 AM – BI configuration upgrade started during standard maintenance window
5-2-24 5:35 AM – End of maintenance window
5-2-24 5:51 AM – Roll back initiated
5-2-24 6:02 AM – First client reported that they were not able to print BI reports
5-2-24 6:12 AM – Fire alarm initiated by the support team
5-2-24 8:11 AM – DevOps completed the roll back and customers confirmed the module was back online
5-2-24 8:11 AM – Issue resolved, All Clear
Total Duration of Event:
2 hours 36 minutes
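The stated duration appears to match the interval from the end of the maintenance window (5:35 AM) to the All Clear (8:11 AM). That choice of start point is an assumption drawn from the timeline, not something the report states. A minimal sketch of the arithmetic, using a hypothetical helper named elapsed:

```python
from datetime import datetime

# Timestamp layout used in the timeline, e.g. "5-2-24 5:35 AM".
FMT = "%m-%d-%y %I:%M %p"


def elapsed(start: str, end: str) -> str:
    """Return the time between two timeline entries as 'H hours M minutes'."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    minutes = int(delta.total_seconds()) // 60
    return f"{minutes // 60} hours {minutes % 60} minutes"


maintenance_end = "5-2-24 5:35 AM"
first_report = "5-2-24 6:02 AM"
all_clear = "5-2-24 8:11 AM"

# Assumed start point: end of the maintenance window.
print(elapsed(maintenance_end, all_clear))  # 2 hours 36 minutes
# For comparison: measured from the first client report.
print(elapsed(first_report, all_clear))     # 2 hours 9 minutes
```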
Root Cause Analysis:
After the roll back procedure, the BI temp folder still needed to be cleaned and the BI service restarted before the module came back online.
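The two follow-up steps named above (clean the temp folder, restart the service) could be scripted so they are not missed after a roll back. The sketch below is illustrative only: the function names, the temp folder path, and the restart command are hypothetical placeholders and do not describe the actual BI module tooling.

```python
import shutil
import subprocess
from pathlib import Path


def clean_temp_folder(temp_dir: Path) -> int:
    """Delete everything inside temp_dir but keep the folder itself.

    Returns the number of top-level entries removed.
    """
    removed = 0
    for entry in temp_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # remove a subfolder and its contents
        else:
            entry.unlink()         # remove a file or symlink
        removed += 1
    return removed


def finish_rollback(temp_dir: Path, restart_command: list[str]) -> int:
    """Run the post-roll-back steps: clean the temp folder, then restart.

    restart_command is whatever command restarts the service in a given
    environment (an assumption; the report does not name one).
    """
    removed = clean_temp_folder(temp_dir)
    # check=True raises CalledProcessError if the restart command fails.
    subprocess.run(restart_command, check=True)
    return removed


# Hypothetical usage (placeholder path and command):
# finish_rollback(Path("/path/to/bi/temp"), ["systemctl", "restart", "bi-service"])
```

Cleaning before restarting means the service does not start against stale temporary files, and check=True surfaces a failed restart right away instead of letting it pass silently.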
Preventative Action:
Going forward, configuration changes will be tested and deployed in a controlled environment before they are released to production.