Not sure if you noticed, but it's been a bit quiet 'round these parts?
Short story: Server died. We're slowly getting it back to normal again.
Long story:
On Friday morning I awoke to around 300 emails from the server complaining that there was a hardware fault. The emails were coming through thick and fast.
We identified a failed hard drive in the server which the hosting company then replaced for me.
At this point I had two options - rebuild the disk with the server offline or online. Offline would have taken around 8 or 9 hours to rebuild the data. Online was going to take a lot longer, and the server was going to be very busy and slow. I opted for the online rebuild because having a slow server is better than no server.
The rebuild started and the server got really busy - it was ranging from 1700 to 2400% busy (so up to 24 times as busy as it should be). Things started going bad, so I started turning off bits and pieces to try to sort the load out and keep it running (I have a number of other websites and email systems on the server which are business related).
Eventually brute-force turning off this site and the other websites allowed it to cope with email and rebuilding the data.
This morning things looked like they had calmed down a lot, and while the server is still over 100% (hovering around 170%) it is functioning well and rebuilding the data at the same time.
I've now enabled the site for logged-in users so they can at least use it. Registrations are also closed (remember that we also get about 50k fake registration attempts a day from hackers) until things have settled down. At the moment we're 300GB into nearly a terabyte of data to be rebuilt. The ETA is 3-4 hours, but it's been saying that for the last 3 hours.
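For the curious, a rebuild ETA is just a rate calculation, which is also why it keeps slipping: the rate drops whenever the server gets busy, so "hours remaining" is a moving target. A rough sketch (the figures below are illustrative, not the actual array stats):

```python
# Rough ETA estimate for a RAID rebuild, based on how much data has been
# rebuilt so far and the average rate achieved so far. Figures are
# illustrative examples, not real stats from this server.

def rebuild_eta_hours(done_gb: float, total_gb: float, elapsed_hours: float) -> float:
    """Estimate hours remaining, assuming the average rate so far holds."""
    rate_gb_per_hour = done_gb / elapsed_hours
    return (total_gb - done_gb) / rate_gb_per_hour

# e.g. 300 GB done out of ~1000 GB after 3 hours of rebuilding:
print(rebuild_eta_hours(300, 1000, 3))  # 7.0 hours left at this rate
```

Of course, if the rebuild speeds up once the load eases, the real ETA shrinks, which is the hope here.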
I continue to monitor it all and if things go the other way again then I'll have to disable the site to ease the load. The ability to rebuild the data correctly trumps all else.
I'll update as and when I can.
Cheers,
Crispin