SD-WAN Service Interruption
At approximately 11:00 am on both Friday 25/07/25 and Wednesday 30/07/25, several SD-WAN services went offline. Our engineers identified an unresponsive host, which was promptly restarted to restore service. The issue was then escalated to our software vendor to investigate the root cause of the outage.
The vendor identified a failure in core services on the host device. The failover system, which should have shifted traffic to backup servers, did not activate correctly, resulting in a loss of connectivity. After the reboot, services were redistributed across multiple servers to reduce load and maintain stability.
Findings from Vendor and TechPath Engineers:
• One customer had extremely high bandwidth usage at the time, generating a large number of outbound connections to an external service.
• A CPU processing issue related to hyperthreading, a feature that lets a single physical CPU core run two threads at once, may have contributed. Disabling this feature has been recommended to improve stability.
Actions Taken:
• Load balancing has been applied across the aggregator routers to reduce system strain.
• Monitoring has been adjusted to detect and manage high-impact customer traffic that could affect router performance.
• Investigation into the CPU processing issue is ongoing, and further adjustments will be made if required.
• Additional hardware has been purchased and will be installed immediately upon arrival to further improve resilience.
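For readers interested in how high-impact traffic detection can work in principle, the following is a minimal illustrative sketch only, not the monitoring actually deployed. It assumes connection records are available as (customer, destination) pairs; the function name, record format, and threshold are hypothetical.

```python
from collections import Counter


def flag_heavy_customers(flows, max_connections=10_000):
    """Return the customers whose outbound connection count exceeds the limit.

    flows: iterable of (customer_id, destination) tuples, one per connection.
    max_connections: hypothetical threshold; a real value would be tuned
    to the capacity of the routers being protected.
    """
    # Count one connection per record, grouped by customer.
    counts = Counter(customer for customer, _destination in flows)
    # Report offenders in a stable, sorted order.
    return sorted(c for c, n in counts.items() if n > max_connections)


if __name__ == "__main__":
    # Made-up sample data using documentation-reserved IP addresses.
    sample = [("cust-a", "203.0.113.10")] * 5 + [("cust-b", "198.51.100.7")] * 2
    print(flag_heavy_customers(sample, max_connections=3))
```

In practice a check like this would run over a rolling time window, so that a sudden burst of outbound connections from one customer is flagged before it affects shared routers.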
We appreciate your patience and understanding while we work to rectify this issue.