
Google is blaming load miscalculations during routing server
upgrades for the widespread failure of its Gmail email service for almost two hours last night.
Millions of users of both the free Gmail service and the paid-for
business version were faced with an "Unable to reach Gmail"
error message as their computers failed to connect to the
service.
In February, Google blamed a similar failure of its Gmail
service on routine maintenance in a European datacentre.
Ben Treynor, Google's engineering vice-president, apologised to
users for the latest outage in a
blog post, saying the outage was a "big deal" and Google was
treating it as such.
According to Treynor, Google took a small number of Gmail
servers offline to perform routing upgrades, but underestimated the
extra load this would put on the request routers that direct web
queries to the appropriate Gmail servers.
"At about 12:30pm Pacific a few of the request routers became
overloaded and in effect told the rest of the system 'stop sending
us traffic, we're too slow!'," he wrote.
The load was transferred to the remaining request routers,
causing a few more of them to also become overloaded, and within
minutes nearly all of the request routers were overloaded.
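The cascade Treynor describes can be illustrated with a toy model. The sketch below is not Google's routing logic: the function name, the router capacities and the total load are all hypothetical, and it assumes traffic is split evenly across whichever routers are still accepting requests.

```python
def simulate_cascade(capacities, total_load):
    """Toy model of a cascading overload (illustrative assumptions only).

    capacities: hypothetical per-router capacity, in requests per second.
    total_load: total incoming requests per second, split evenly across
                the routers still accepting traffic.
    Returns (routers_left, dropped_per_round).
    """
    active = list(capacities)
    dropped_per_round = []
    while active:
        share = total_load / len(active)
        # A router whose share exceeds its capacity stops taking traffic.
        survivors = [c for c in active if c >= share]
        if len(survivors) == len(active):
            break  # stable: every remaining router can carry its share
        dropped_per_round.append(len(active) - len(survivors))
        active = survivors
    return len(active), dropped_per_round


# Hypothetical fleet of ten routers carrying 800 requests per second.
fleet = [98, 102, 106, 110, 115, 120, 130, 140, 150, 150]

print(simulate_cascade(fleet, 800))      # full fleet: stable
print(simulate_cascade(fleet[:8], 800))  # two routers taken offline
```

With all ten routers the load is stable. With two offline, one router overloads first, its traffic pushes three more over their limit, and the remaining four follow, which mirrors the "a few, then a few more, then nearly all" pattern in Treynor's account.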
"As a result, people couldn't access Gmail via the web interface
because their requests couldn't be routed to a Gmail server," said
Treynor.
However, he said IMAP/POP access and mail processing continued
to work normally because these requests do not use the same
routers.
According to Treynor, the Gmail engineering team was alerted to
the failure within seconds and brought additional request routers
online to restore the Gmail service.
"We've turned our full attention to helping ensure this kind of
event doesn't happen again," said Treynor.
Google has already increased request router capacity beyond peak
demand and plans to implement further reliability improvements in
the coming weeks.
"Gmail remains more than 99.9% available to all users, and we're
committed to keeping events like today's notable for their rarity,"
said Treynor.
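To put the 99.9% figure in context, the short calculation below converts an availability percentage into the downtime it permits. The function name and the choice of a 365-day year as the measurement period are assumptions for illustration; Google's post does not say over what period its figure is measured.

```python
def downtime_budget_hours(availability, period_hours=365 * 24):
    """Hours of downtime permitted by a given availability fraction.

    availability: fraction between 0 and 1 (0.999 means 99.9%).
    period_hours: length of the measurement period; a 365-day year
                  is assumed here.
    """
    return (1 - availability) * period_hours


# 99.9% availability over an assumed 365-day year
print(round(downtime_budget_hours(0.999), 2))
```

On that assumption, 99.9% availability allows roughly 8.76 hours of downtime a year, so an outage of almost two hours would use up a little under a quarter of the annual allowance.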