RCA - TRAC-3353 - IOT Connectivity - Critical - IoT Connectivity Core Network Service Disruption
Aeris understands and regrets the significant service impact experienced by our partners and customers. Please be assured that Aeris is fully committed to making the necessary improvements to prevent a recurrence. We are conducting a comprehensive review of our IoT Connectivity offering, service platforms, and supporting infrastructure, from engineering through operations, to ensure all risks are addressed. A clear action plan will be shared and implemented.
We value our partnership with you and remain dedicated to restoring and maintaining full confidence in our service delivery.
Event Start Date/Time (UTC): 22 June 2026, 13:40
Event End Date/Time (UTC): 23 June 2026, 20:17
Date/Time Reported (UTC): 22 June 2026, 13:40
Severity: Critical (Severity 1)
Services Affected: IoT Connectivity Services
Case Number: TRAC-3353 || ZD#177393
Duration: Up to 30 hours and 37 minutes, depending on the service
Description of Failure:
Aeris experienced a multi-stage service disruption affecting components of the IoT Core Network, including data and SMS services. The issue was first identified on June 22, 2026, at approximately 13:40 UTC, when Aeris monitoring systems detected alarms on the core network.
Impact varied by product and occurred as sustained service degradation with intermittent recovery, rather than a single continuous outage. Services were fully restored on June 23, 2026, at 20:17 UTC. Post‑restoration monitoring confirmed that key service performance indicators returned to baseline levels.
During the incident, data services experienced intermittent session establishment failures and increased latency, while SMS services experienced delayed delivery and intermittent message failures. Impact severity fluctuated over time and was most pronounced during peak traffic intervals.
The initial trigger for the incident was a fiber cut on the primary transport circuit on June 19, 2026, at approximately 20:20 UTC. Following the fiber cut, traffic was automatically rerouted to a backup path, with no immediate service impact observed. However, when traffic volumes increased during peak business hours on June 22, 2026, they exposed internal capacity limitations on this backup path. This resulted in packet loss, increased latency, and downstream service degradation across data and SMS services.
Aeris engineering teams and leadership were engaged around the clock throughout the incident, coordinating across network, platform, and vendor teams. Aeris deployed engineers and data‑center technicians onsite and worked directly with the circuit provider and multiple technology vendors to diagnose the issue and implement mitigation and recovery actions, ultimately restoring full service.
The investigation and restoration phase was prolonged, in part, by a software defect in a vendor network component that prevented throughput and congestion alerts from being reported. As a result, neither the vendor nor the Aeris teams could promptly observe or identify congestion on the backup path. This lack of visibility delayed fault isolation and corrective actions, increasing the overall restoration time.
Ultimately, the root cause is that Aeris did not scale the capacity of this backup path to keep pace with growing demand in this data center.
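To illustrate the kind of capacity gap described above, the following minimal Python sketch checks whether a backup path could carry peak demand after a failover. It is an illustration under stated assumptions only: the class and function names, the 80% planning threshold, and all traffic figures are hypothetical and are not measurements from this incident or a description of Aeris tooling.

```python
from dataclasses import dataclass


@dataclass
class PathCapacity:
    """Hypothetical description of a transport path (illustrative only)."""
    name: str
    capacity_gbps: float  # total capacity of the path in Gbps


def failover_headroom(peak_demand_gbps: float, backup: PathCapacity,
                      max_utilization: float = 0.8) -> float:
    """Return spare capacity (Gbps) on the backup path at the given demand.

    max_utilization is an assumed planning threshold, not a value taken
    from the incident. A negative result means the backup path would be
    congested if it had to carry all of the demand on its own.
    """
    usable_gbps = backup.capacity_gbps * max_utilization
    return usable_gbps - peak_demand_gbps


def can_absorb_failover(peak_demand_gbps: float, backup: PathCapacity,
                        max_utilization: float = 0.8) -> bool:
    """True if the backup path can carry the demand within the threshold."""
    return failover_headroom(peak_demand_gbps, backup, max_utilization) >= 0


if __name__ == "__main__":
    # Hypothetical figures: off-peak traffic fits, weekday peak does not.
    backup_path = PathCapacity(name="backup", capacity_gbps=10.0)
    for label, demand in [("off-peak", 6.0), ("weekday peak", 9.0)]:
        ok = can_absorb_failover(demand, backup_path)
        print(f"{label}: demand={demand} Gbps, fits on backup={ok}")
```

With these hypothetical figures, off-peak demand fits on the backup path while weekday peak demand does not. This mirrors the pattern in this incident, where the failover caused no immediate impact and degradation appeared only once peak business-hour traffic arrived.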
Service Impact Summary:
Global GSM / Fusion Global:
Data:
SMS:
Dual-Mode A-LH Service:
Data:
SMS (MO‑SMS only):
Fusion NA:
Data:
SMS (MO, MT, and delivery receipts):
Impairment Cause:
When traffic failed over to the backup network path, peak traffic volumes exceeded the capacity available on that path and congested core network nodes, which led to the service impairment.
Impairment Resolution:
The service issue was resolved through a series of corrective actions. Steps were taken to stabilize the network, including adjusting hardware settings, adding capacity where needed, and improving network configurations. Service stability was restored by June 23, 2026, at 20:17 UTC for the last affected service.
After our vendor repaired the fiber cut, the primary physical link was restored during a maintenance activity (177725 / IOTCHG‑10256) on June 25, 2026.
Corrective Action Items:
Action 1:
Backup Path Capacity Audit:
Conduct a comprehensive review to ensure backup capacity meets current demand
Action 1 Completion Date: In Progress / July 10th 2026
Action 2:
Implement Corrective Actions from Capacity Audit
Action 2 Completion Date: Target August 15th 2026 for all critical actions
Action 3:
Monitoring and Diagnostics:
Conduct an audit of monitoring along the backup path and identify improvements to ensure the necessary logs and metrics are available
Action 3 Completion Date: In Progress / July 17th 2026
Action 4:
Establish regular testing of the backup path
Action 4 Completion Date: Planned / August 30th 2026
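As one possible input to the monitoring audit in Action 3, link utilization can be derived independently from raw interface byte counters, so that congestion remains visible even if a component fails to raise its own throughput alerts. The Python sketch below is a minimal illustration under stated assumptions: the sample structure, function names, and the 90% threshold are hypothetical, counter wraparound is ignored, and it does not describe the monitoring Aeris or its vendors actually use.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Hypothetical snapshot of a cumulative interface byte counter."""
    timestamp_s: float  # time of the sample, in seconds
    bytes_total: int    # cumulative bytes transferred up to that time


def utilization(prev: CounterSample, curr: CounterSample,
                link_capacity_bps: float) -> float:
    """Return average link utilization (0.0 to 1.0) between two samples.

    Assumes the counter did not wrap or reset between the samples.
    """
    elapsed_s = curr.timestamp_s - prev.timestamp_s
    if elapsed_s <= 0:
        raise ValueError("samples must be in increasing time order")
    bits_sent = (curr.bytes_total - prev.bytes_total) * 8
    return bits_sent / elapsed_s / link_capacity_bps


def is_congested(prev: CounterSample, curr: CounterSample,
                 link_capacity_bps: float, threshold: float = 0.9) -> bool:
    """True if utilization meets or exceeds the assumed alert threshold."""
    return utilization(prev, curr, link_capacity_bps) >= threshold


if __name__ == "__main__":
    # Hypothetical 1 Gbps link sampled 60 seconds apart.
    earlier = CounterSample(timestamp_s=0.0, bytes_total=0)
    later = CounterSample(timestamp_s=60.0, bytes_total=7_125_000_000)
    print(f"utilization: {utilization(earlier, later, 1e9):.2f}")
    print(f"congested: {is_congested(earlier, later, 1e9)}")
```

A check of this kind could also be run during the regular backup path tests planned under Action 4, to confirm that congestion on the backup path is observable before it is needed in production.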