TechPath SD-WAN incident

Minor incident TechPath SD-WAN
02-05-25 10:42 AM AEST · 12 minutes

Updates

Issue

TechPath SD-WAN Platform – Outage Report

Incident Summary:

On May 2, 2025, between 10:42 am and 10:54 am AEST, a subset of customers on the TechPath SD-WAN platform experienced an unexpected loss of connectivity. An extensive investigation found that one of the platform’s aggregators had become unresponsive to the routing engine responsible for managing customer traffic.

Root Cause Analysis:

The issue was traced to a failure in one of the SD-WAN platform aggregators, which stopped processing customer traffic. Although the routing engine ceased to forward traffic, the platform’s monitoring and stability detection mechanisms did not identify the aggregator as impaired. As a result, affected customer services were not automatically failed over to their secondary aggregator, prolonging the connectivity disruption.
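
To illustrate the failure mode described above, the following minimal Python sketch contrasts a liveness probe with a data-plane forwarding check. It is an illustration only: this report does not describe how the vendor's detection works, and every name here (AggregatorSample, is_forwarding, should_fail_over) and the use of packet counters are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AggregatorSample:
    """One polling sample for an aggregator. All fields are hypothetical."""
    responds_to_probe: bool  # liveness: did the device answer a keepalive?
    packets_in: int          # cumulative packets received from customers
    packets_out: int         # cumulative packets forwarded onward


def is_forwarding(prev: AggregatorSample, curr: AggregatorSample) -> bool:
    """Data-plane check: if traffic arrived since the last sample, was any forwarded?"""
    received = curr.packets_in - prev.packets_in
    forwarded = curr.packets_out - prev.packets_out
    if received == 0:
        # No customer traffic arrived, so no fault can be inferred from counters.
        return True
    return forwarded > 0


def should_fail_over(prev: AggregatorSample, curr: AggregatorSample) -> bool:
    """Fail over if the device is unreachable OR reachable but not forwarding."""
    if not curr.responds_to_probe:
        return True
    return not is_forwarding(prev, curr)


# A device that still answers probes but has stopped forwarding traffic:
previous = AggregatorSample(responds_to_probe=True, packets_in=1000, packets_out=1000)
stalled = AggregatorSample(responds_to_probe=True, packets_in=5000, packets_out=1000)
print(should_fail_over(previous, stalled))  # True
```

A check based on liveness alone would report the stalled device in this example as healthy, which is the kind of gap that delays failover to a secondary aggregator.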

Resolution and Remediation:

TechPath conducted a thorough investigation in collaboration with the software vendor, analysing hundreds of thousands of log entries to pinpoint the root cause. As a result of this investigation, the vendor has committed to exploring and implementing additional monitoring and health check mechanisms within the software stack to better detect similar aggregator failures and trigger timely failover in the future.

Customer Impact:

Only a small subset of customers were affected by this incident. We understand that even limited disruptions can have significant impacts, and we are committed to improving system resilience and response.

Next Steps:

TechPath will work closely with the software vendor to ensure improved failure detection and automated response capabilities are implemented.
Internal monitoring thresholds and detection logic will be reviewed and updated where applicable to provide earlier visibility into aggregator-level faults.

Apology and Acknowledgment:

We sincerely apologise for the delay in releasing this outage report. Due to the complexity of the incident, an in-depth investigation was necessary to ensure accurate findings and actionable outcomes. We appreciate your patience and understanding.

If you have any further questions or concerns, please contact our support team on 1300 033 300 or via support@techpath.com.au.

May 29, 2025 · 09:41 AM AEST
Monitoring

Engineers have confirmed that services have been restored. The issue was resolved by rebooting the impacted device, and we will continue to monitor. Investigation into the outage will continue, and this status will be updated once the issue is fully resolved.

May 2, 2025 · 12:59 AM AEST
Issue

TechPath connectivity engineers are currently investigating a possible issue impacting one of our core SD-WAN routers. An update to this notification is scheduled for 11:00 am.

May 2, 2025 · 12:54 AM AEST
