Downtime Postmortem


At about 7:20pm Pacific Time, I got a notification from our uptime monitoring that the site was down. After a minute or two I realized the problem: our main data disk had filled to the point where the database started refusing connections. Disk capacity alerts were not in place. We removed the old sheets data, and then we took the site down to copy our databases to a much larger disk (2.5x the size). After the rebuild, the integrity of the new disk turned out to be bad, and a few minutes after the site came back up it tanked. After some additional fix attempts, we needed to bring up yet another disk. That one also took some additional fix attempts, but as of 11:30 Pacific it looks to be working fine.

I apologize for the downtime. We'll get alerts added and a real 'read-only' system is in the works so that we can limp by on a failing disk in the future without having to go totally down. Old sheets data will be restored as soon as we can get to it.
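
For anyone curious what a disk capacity alert can look like, here is a minimal sketch. It is an illustration only, not what the site actually runs: the 90% threshold, the function names, and the path being checked are all assumptions, and a real setup would send the message to a monitoring or paging system rather than print it.

```python
import shutil


def should_alert(used_bytes, total_bytes, threshold=0.90):
    """Return True when the used fraction is at or above the threshold.

    The 0.90 default is an assumed value, not the site's real setting.
    """
    return (used_bytes / total_bytes) >= threshold


def disk_alert(path, threshold=0.90):
    """Return a warning string if the disk holding `path` is too full, else None."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    if should_alert(usage.used, usage.total, threshold):
        percent = 100 * usage.used / usage.total
        return f"WARNING: disk for {path} is {percent:.1f}% full"
    return None


if __name__ == "__main__":
    # "." is a placeholder; point this at the database's data directory.
    message = disk_alert(".")
    if message:
        print(message)
```

Run from cron or a similar scheduler every few minutes, a check like this would flag a filling disk well before the database starts refusing connections.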

Great work guys! As always, thanks for everything!

Perhaps it's time to start cleaning out failed / long dead games? XP

Good job!

I am just happy everything is ok!

Storage is (comparatively) cheap, so deletion is usually not necessary. But upgrading size does come up every few years I'm sure.

Nice work on the site as usual. It has an excellent uptime ratio considering the size and volunteer nature of the staff.

Originally Posted by Pipster
Perhaps it's time to start cleaning out failed / long dead games? XP

Good job!
Yeah, I am wondering why archived games aren't completely wiped after some set amount of time? Like, a month or so after archiving?

Archived games aren't deleted because there's no reason to - the storage for them is pretty minimal. Additionally, it's not much of an archive if it's missing history. People come back and need/want things from archived games all the time. Game info takes about 50MB total, and our entire history of over 10 million posts is really just about 7GB of storage.

The real kicker is sheets - over 30GB for those. And (I forget) another 10-20GB for the old sheets.

The main page still says it's down for me. Sometimes. All other pages are fine, and when I go from the Gaming Discussion and then to the main page, it's all good. Enter it on the address bar, though, and it's back to the maintenance screen.

EDIT: Main page works, as long as I don't enter the name on the address bar or click the logo. Go through any other way (the link near the top of the page showing the folder tree) and it's all good. Weird.



Last Database Backup 2017-09-25 09:00:06am local time