For a long while one of my servers “crashed” weekly. Every Saturday at 3am the load would spiral up and everything would become unresponsive (still working, but very slow), because everything was waiting on disk I/O. I kept putting off checking why this happened, and just rebooted the machine manually every Saturday (lazy, lazy). After a while I noticed it always started at 3am, so it had to be a cron job or some other scheduled task. I then noticed that one of my other servers also had a high load at 3am, but it was less severe and that machine recovered on its own.
Today I did some thinking and checking, and to blame was... me! I had configured smartmontools to run a weekly long self-test at 3am on all 4 disks in the software RAID5 at the same time. This apparently took so much I/O capacity away from the disks that the machine couldn’t cope. Whoopsie.
Note: this is a hobby environment, of course; in real production I would have spent more time figuring it out.
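
One way to avoid running the long self-test on every RAID member at once is to give each disk its own weekday. Below is a minimal sketch in Python that generates staggered `smartd.conf` lines using smartd's `-s L/MM/DD/d/HH` schedule syntax (where `d` is the day of week, 1 = Monday to 7 = Sunday). The device names `/dev/sda` to `/dev/sdd` and the helper name `staggered_smartd_lines` are assumptions for illustration, not my actual config.

```python
def staggered_smartd_lines(devices, hour=3):
    """Return one smartd.conf line per device, each scheduling the
    long self-test on a different weekday at the given hour."""
    if len(devices) > 7:
        raise ValueError("more than 7 devices: cannot give each its own weekday")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    lines = []
    for offset, device in enumerate(devices):
        weekday = offset + 1  # smartd counts 1 = Monday ... 7 = Sunday
        # -a monitors all SMART attributes; -s L/../../d/HH runs a long
        # self-test every month, every day-of-month, on weekday d at HH.
        lines.append(f"{device} -a -s L/../../{weekday}/{hour:02d}")
    return lines


if __name__ == "__main__":
    # Hypothetical device names for a 4-disk software RAID5.
    for line in staggered_smartd_lines(["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]):
        print(line)
```

With this, only one disk is busy with a long self-test on any given night, so the array should stay responsive. The generated lines would still need to be checked against the local `smartd.conf` before use.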

