Major Second Life outage
Filed under: Bugs, Server downtime, News items, Second Life
We're getting hundreds of reports of failures with Second Life, starting about 9:15AM SLT (US Pacific). Reported problems affect all transactions, inventory, the map, profiles, balances, teleportation, sim border crossings, IMs and ... well, pretty much every service. Users are reporting that logins are not only failing, but crashing the viewer.
This includes the Second Life website, which is non-responsive, though the Linden blog is live (it is not hosted with the rest of Linden Lab's equipment).
Update: Linden Lab has spotted the trouble too.
Update: After about one hour and ten minutes, the problem has been resolved. No information appears to be forthcoming on the cause or the cure, so we do not know whether to expect this to happen again today.

Reader Comments (Page 1 of 1)
Goldie Katsu said on 1:40PM 4-17-2008
From the "Updated - Linden Lab has spotted the problem" it really sounds like they need some classic systems monitoring à la a Network Management System. I would bet that there are some signs of problems developing long before it snowballs to this level. Databases have monitoring tools, network connections can be monitored, systems can be monitored and, if built sensibly, software can have monitoring components. Systems do have problems but this "everyone knows about them hours before Linden Lab" syndrome seems a bit...silly.
Tateru Nino said on 1:58PM 4-17-2008
Well, to be fair - they posted about it about 60 seconds after I hit the publish button myself, and I was in the thick of it when things started to fall apart.
Marianne McCann said on 5:31PM 4-17-2008
This month is definitely rivaling the end of 2006 for performance troubles. I really wonder jes what is goin' on to cause all this -- an more importantly, is it correctable in some sorta long-term fashion.
Pavig Lok said on 1:17AM 4-19-2008
I put a lot of this down to the het-grid changes to grid architecture (breaking the single monolithic grid into related subsystems that scale better). A few months ago I was jumping for joy with how well grid changes were rolling out, as I expected these kinds of issues to pop up sooner.
The bad thing is they're replacing their old problems with new ones. The good thing is that the old problems were built broken, while the new problems are built solvable (for the most part). Even so it's not surprising that they occasionally run into unforeseen issues as they tweak this server code, as the gotchas are predominantly new and untested instead of old and familiar.
Just my two cents :P