Routine software upgrade sank the network

The spokesperson added that Optus had changed the peering network to avoid the problem happening again, and would continue to work with international vendors and partners to increase the resilience of its network.

According to publicly listed information (which may not be exhaustive), the part of Optus’ network affected by last Wednesday’s outage peers with parent company Singtel’s network in Singapore; China Telecom; the US-headquartered global content delivery network Akamai; and Global Cloud Xchange, owned by Jersey-based 3i Infrastructure and formerly known as Flag Telecom.

This masthead revealed on Saturday that a senior Optus executive phoned an Akamai counterpart about 9am on the day of the outage believing Akamai may have been one of the peers that contributed to the outage. However, Akamai said on Saturday that there was “no present indication that this incident is related to an issue with Akamai”.

On Monday night it was more definitive: “Akamai did not trigger the outage,” an Akamai spokesperson said. “We stand ready to support Optus and our partners at all times.”

Optus pledged to fully co-operate with the reviews into the outage being undertaken by the government and the Senate.

It had previously been coy about the root cause of the outage, with CEO Kelly Bayer Rosmarin telling this masthead last week that the failure was “a network event” that “triggered a cascading failure which resulted in the shutdown of services to our customers”.

The under-fire telco is offering free data to disgruntled customers – but some commentators say it needs to do more. Credit: Louise Kennerley

Wednesday’s outage not only paralysed the nation’s telecommunication networks, but prompted long queues at Telstra and Vodafone retail stores as customers looked to shift providers.

It also affected other providers using the Optus network, including Amaysim, Vaya, Aussie Broadband, Moose Mobile, Coles Mobile, Spintel, Southern Phone, Gomo and Dodo Mobile.

Narelle Clark, who formerly worked at Optus and is now chief executive of the Internet Association of Australia, said Optus should have had router rules in place to reject any third-party update that exceeded its routers’ preset safety levels.

She observed that she had, over the span of her career, seen many incidents where routing updates sent between external parties had crashed individual routers. A simple typo in a “route map” when redistributed between internal networks can similarly overload routers.

It was “so easy” to accidentally share a significant update that causes problems, as the default configuration in routing updates is “send all, even today”, she said.

“This is exactly why it is important to ensure filtering is in place on the receiving end, so that the offending session is dropped rather than the update being passed on at all.

“At any time you have to assume that everybody who’s sending you routing information is prone to error. That’s why you always set those sorts of protective filters in place,” Clark said.
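
The kind of receiving-end protection Clark describes can be illustrated with a minimal Python sketch. It is not Optus’ configuration or any router vendor’s API: the `PeerSession` class, the `max_prefixes` threshold and the peer name are all hypothetical, chosen only to show a session being dropped when a peer sends more routes than the receiver is prepared to accept.

```python
from dataclasses import dataclass, field


@dataclass
class PeerSession:
    """Hypothetical model of one peering session with a prefix limit."""
    peer: str
    max_prefixes: int  # assumed safety threshold, set by the receiving network
    routes: set = field(default_factory=set)
    up: bool = True

    def receive_update(self, prefixes):
        """Return the new routes to pass on, or drop the session if over the limit."""
        if not self.up:
            return set()
        candidate = self.routes | set(prefixes)
        if len(candidate) > self.max_prefixes:
            # Limit exceeded: take the session down and discard everything
            # learned from this peer, so nothing is passed on internally.
            self.up = False
            self.routes.clear()
            return set()
        new_routes = candidate - self.routes
        self.routes = candidate
        return new_routes


# Illustrative use: a small update is accepted, an oversized one drops the session.
session = PeerSession(peer="example-peer", max_prefixes=3)
accepted = session.receive_update(["10.0.0.0/8", "192.0.2.0/24"])
flood = session.receive_update([f"198.51.100.{i}/32" for i in range(10)])
```

In this sketch the oversized update is never forwarded: the session is marked down and its routes are cleared, which is the “offending session is dropped” behaviour Clark refers to.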

Matt Tett, the managing director of Enex TestLab, which assesses everything from toasters and the internet to traffic systems, said: “At the end of the day Optus may need to shoulder some responsibility, rather than pointing a finger at an unnamed peering partner.

“What processes failed internally to allow this to occur?” he asked, “and if it was never registered as a potential risk or point of failure then what mitigation strategy have Optus now implemented to ensure it will not reoccur? What are the lessons learned and steps taken?”

He said if responsibility sat solely with a third-party supplier, Optus would’ve named who it was, like when the ABS named IBM during its 2016 Census collection failure.

As previously reported by this masthead, Optus is offering aggrieved customers a free data top-up, but the industry watchdog says it is prepared to force the telecommunications company to offer large compensation payments (up to $100,000 for a business that could prove a loss and up to $1500 for individuals with a claim) if it refuses to settle customers’ claims.

“If you can see a customer has clearly been impacted, we’d be encouraging them to really own the complaint and deal with it,” telecommunications industry ombudsman Cynthia Gebert said.

“But if we need to take a strong line with Optus to get the right outcome for their customers, that’s what we will do.”

Optus’ offer was immediately slammed by Greens communications spokesperson Sarah Hanson-Young, who said the “PR play” was not enough, and tech analyst Foad Fadaghi, who said “knee-jerk offers” could prompt more customers to ditch the business.

Embattled chief executive Bayer Rosmarin is due to front a Senate inquiry into the 16-hour outage, while also answering to a separate government inquiry announced by Communications Minister Michelle Rowland.

The Senate inquiry kicks off this Friday.

The outage came a year after Optus suffered a massive data breach, in which more than 9 million current and former customers had their records accessed.
