Issue:
Customers received HTTP 503 Service Unavailable errors that affected On-site functionality. We identified the issue through routine alert monitoring and reports of degraded service.
Root Cause:
The Ingress configuration was not properly applied after an update to the NGINX controller. The controller was upgraded to version 1.12.1 to address a critical security vulnerability. After the upgrade, the security evaluation of configuration snippets delayed configuration loading. While the Ingress configuration was still loading, traffic was routed to the default configuration, which caused the service disruption.
Actions Taken:
Next Steps:
All actions required to address the vulnerability are complete, and no further steps are needed at this time. We have also enhanced our testing process so that, if this error recurs, we can detect it and act before it affects Rebuy services.
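
As an illustration only, a post-upgrade check of the kind the testing process could include might look like the sketch below. It is not the actual test suite. The function names, the probe URLs, and the choice to treat 404 as a sign of default-configuration routing are all assumptions. Only the 503 status comes from this incident.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request to url.

    Connection-level failures (DNS, refused connection) are not caught
    here and will propagate to the caller.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the status is in err.code.
        return err.code


def failing_routes(
    urls: Iterable[str],
    fetch: Callable[[str], int] = http_status,
    # 503 is the status seen in this incident. 404 is an assumption about
    # what a default backend may return when a route is not loaded.
    bad_statuses: frozenset = frozenset({404, 503}),
) -> Dict[str, int]:
    """Probe each URL and return the ones that answered with a bad status."""
    failures: Dict[str, int] = {}
    for url in urls:
        status = fetch(url)
        if status in bad_statuses:
            failures[url] = status
    return failures
```

Run against a staging environment after a controller upgrade, a non-empty result from `failing_routes([...])` would indicate that some routes are not yet served by their intended configuration and that the rollout should pause. The `fetch` parameter allows the check to be exercised without network access.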