"Linux Gazette...making Linux just a little more fun!"


Linux Primer Series

By Ron Jenkins


Introduction to Disaster Recovery

Copyright 1998 - 1999 Ron Jenkins. All Rights Reserved.
P.O. Box 229, Kirbyville, MO, 65679

Introduction
The Linux Primer Series, FKA the Linux Installation Primer, is a body of work designed to provide the reader with clear, concise information about the Linux Operating System and its many powerful features.

Disclaimer: While the author takes every precaution to ensure the accuracy of the information contained herein, the author assumes no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

Disclosure: Here I will declare any affiliations or business relationships I have, as soon as I get some, (hint, hint.)

New versions of this document
You can view the latest version of this document via the URL:
http://www.grapevine.net/~jenkinsr/primer/  (Not up yet, hopefully by the end of the month.)
I encourage you to mail any questions or comments about this document to Ron Jenkins, [email protected].

Recent Changes and News
News -
I had intended to address printing issues this month, but that was before all my goodies arrived in the post. (I'm not making any money yet, but I'm getting free stuff, always a good sign.)

Caldera and Debian have just put the latest releases of their distributions into the fray, RedHat announced 6.0 today (although so far I've played h*ll getting into the FTP site), and Slackware draws ever closer to its next release (with KDE for those that have to have it!).

So, I will hold off a bit until I can make sure the changes in the new distributions won't alter the printing functions.

So, I've decided to discuss that nasty thing we all hate called disaster recovery.

I have received hundreds of email messages from people inquiring about assistance in resolving a problem they might have.

As always, I am happy to help whenever I can, in any small way that I can. However, I get 50-100 messages per day, so please be patient with me; I will answer messages in the order they are received.

Finally, there have been several instances where I have replied to a request, only to have my mail bounced back to me by the remote server due to "excessive spamming" from netscape.net.

This is silly and completely unnecessary. Only unskilled or lazy admins take the approach of denying an entire domain based on the action of a few bad individuals.

So, if you do not receive a reply from me within a week, it is likely you have a lazy or unskilled admin at your ISP. I suggest you call them and request they deny service properly, on a case-by-case basis. If they are unwilling or unable to do this, get another ISP.

Changes - none

Before You Start
Sit down for a minute, and run through your disaster recovery plan in your mind. Get out a pencil and paper, and let me share a few of my mistakes with you, hopefully preventing you from making them all over again.

Hardware and Software, and Wetware Requirements
I will discuss various hardware and software solutions, as well as some things you may have overlooked. (Don't feel bad, I did too, and hosed some machines.)

The most important quality, or requirement, is the diligence to make a plan, stick to it, always, and constantly strive to update and improve it.

As we move more and more to an information-dependent society, a critical failure can cause heinous, cascading effects that could conceivably deny you your basic human rights, like your phone line and power for your computer and modem, and, oh yeah, civilization as we know it.

And most important of all, learn Ron's two rules of Disaster Recovery:
1. FEAR IS GOOD.
2. PARANOIA IS BETTER.
 

Disaster Recovery and its place in the information age

Overview
The purpose of this column will be to give you some ideas you may not have thought of before, and to remind you of a few we all skip all too often.

By the time you are finished with this column, you should have a broad grasp of the spectrum of tasks and problems grouped under the heading "Disaster Recovery," and know where to find further information and tools to develop your own unique implementation suitable to your installation.

Basic Tasks
Environmental Concerns
Electrical Concerns
Uninterruptible Power Supply (UPS)
Logical Diversity
Physical Diversity
Geographical Diversity

Detailed Tasks
Environmental Concerns
Computers, like all electronic devices, are sensitive to heat, dust, and gunk.

While it may not be practical for you to build a cold room for your home computer, there are still things you can do to minimize the wear and tear on your system:

Get an air cleaner to remove the dust from the room.

Either place your computer in a cool room in the house, or possibly get a window unit.

Don't smoke in the same room as the computer. You have no idea the nasty goop I've found inside a customer's case.

Electrical Concerns
Make sure you have sufficient power to operate your gigafloppin', numbercrunchin' game mother. And printer. And scanner. And fax. And so on.

Surge suppressors are a big market item, and deservedly so, as far as they go. However, to protect your data, whether at home or at work (sometimes these are the same place), it is critical to ensure your equipment is provided a clean, filtered, consistent, and constant source of power.

Surge suppressors merely attempt to keep excess energy, or a spike, from damaging your computer.

They do this by placing a Metal Oxide Varistor (MOV) between the energy source and the computer, sort of like a fuse. (This is not really how it works, but you get the idea.)

However, they have two great failings - most of them don't inform you when the MOV is no longer functional, and they provide no power themselves to allow the user to perform a clean shutdown of the system.

"Harshing" a UNIX box, or any box for that matter, can do really nasty things to the file system.

Uninterruptible Power Supply (UPS)
Enter the UPS. The UPS provides surge protection for the line cord (and, in some cases, the phone line), contains a battery to provide the clean shutdown capability, and can usually be configured to shut down your machine without human intervention.
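
To give a feel for what that unattended shutdown boils down to, here is a minimal Python sketch of the decision a monitoring program has to make. It is not any vendor's software or a real UPS interface: the function name, its parameters, and the 60 second safety margin are all assumptions made for illustration, and how you actually read the battery status depends entirely on your unit.

```python
def should_shut_down(on_battery, runtime_left_s, shutdown_needs_s, margin_s=60):
    """Decide whether it is time to start a clean shutdown.

    on_battery       -- True if mains power is gone and the UPS is carrying us
    runtime_left_s   -- seconds of battery the UPS reports (hypothetical input)
    shutdown_needs_s -- seconds this machine needs to shut down cleanly
    margin_s         -- extra cushion; 60 is an arbitrary illustrative value
    """
    if not on_battery:
        # Mains power is fine, nothing to do.
        return False
    # Go down while there is still enough battery to finish the job.
    return runtime_left_s <= shutdown_needs_s + margin_s


# Example: power is out, 150 seconds of battery left, shutdown takes 120.
if should_shut_down(True, 150, 120):
    print("Battery is running low: start a clean shutdown now.")
```

The point of the margin is the same as the point of the UPS itself: go down on your own terms, before the battery makes the decision for you.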

There are many good companies out there, producing many good UPSes. I can only say that I use Tripp-Lite and APS products exclusively, for two reasons - one, they are damn good units, and two, I have had occasion to put them to the test on the "Lifetime Warranty" claim, and both companies have come through and gone out of their way to get me back up and running.

Logical Diversity
Here comes the dirty word - BACKUPS!

We all know we should do them, and we have all been caught without them. But with the nature of business today, a backup failure can literally mean life or death for a company.

I'm not going to get into the specifics of the best device, best programs, etc.

Rather, I will try to inculcate (scare) you into adopting a plan, and STICKING WITH IT!

Whatever the media you have or plan to get, it is important to develop a written backup plan, coupled with a hard copy backup log.

These two tools are essential to make sure the backup was performed, verified, and labeled and stored in the proper place. It is also a good idea to have the backup operator sign in and out of the log.
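
If you want to keep a machine-readable copy alongside the hard copy, the checks listed above map onto a small record. The following Python sketch is only one way it might look: the class, its field names, and the is_complete helper are all invented for illustration and are not part of any backup package.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackupLogEntry:
    """One line of the backup log (all field names are hypothetical)."""
    date: str                         # e.g. "1999-05-03"
    operator: str                     # who ran the backup
    media_label: str = ""             # what is written on the tape
    storage_location: str = ""        # where the tape was put afterwards
    performed: bool = False           # the backup actually ran
    verified: bool = False            # the backup was read back and checked
    signed_in: Optional[str] = None   # time the operator signed in
    signed_out: Optional[str] = None  # time the operator signed out

    def is_complete(self):
        """True only if every step the log is meant to prove was done."""
        return (self.performed
                and self.verified
                and bool(self.media_label)
                and bool(self.storage_location)
                and self.signed_in is not None
                and self.signed_out is not None)


entry = BackupLogEntry(date="1999-05-03", operator="ron")
entry.performed = True
print(entry.is_complete())   # still missing verification, label, storage, sign-out
```

An entry that is not complete is the one to chase down the next morning, not three weeks later when somebody needs the tape.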

How often to back up?
A good question with no set answer.

What I usually tell my clients is to back up anything critical to operations daily and weekly, using a minimum of 16 media if they are a Monday - Friday shop, 18 if they are a Monday - Saturday shop.

This is implemented as follows:
5 tapes per week, lasting two weeks: 10 tapes total for incremental backups.

2 tapes per week, lasting three weeks: 6 tapes total for full backups.

This gives you a three-week rollback capability for when someone in a suit comes in freaking out about a report they deleted a week ago. (This could also be a good time for salary negotiation.)
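
The 16 and 18 figures fall out of simple arithmetic. Here is a minimal Python sketch of it, plus one possible way to decide which incremental tape goes in the drive on a given day. The function names, the "INC-A-Mon" style labels, and the assumption that a Monday - Saturday shop simply adds one incremental tape per week are all made up for illustration; adapt them to your own plan.

```python
import datetime

INCREMENTAL_WEEKS = 2   # incremental tapes are reused every two weeks
FULL_WEEKS = 3          # full-backup tapes are reused every three weeks
FULLS_PER_WEEK = 2      # full-backup tapes written each week
DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def tapes_needed(workdays_per_week):
    """Total media for the rotation: incrementals plus fulls."""
    incrementals = workdays_per_week * INCREMENTAL_WEEKS
    fulls = FULLS_PER_WEEK * FULL_WEEKS
    return incrementals + fulls


def incremental_tape_label(day, workdays_per_week=5):
    """Hypothetical label of the incremental tape for a given date.

    Returns None on days the shop is closed.
    """
    weekday = day.weekday()                  # Monday is 0
    if weekday >= workdays_per_week:
        return None
    # Ordinal 1 (0001-01-01) is a Monday in Python's calendar, so this
    # week counter ticks over every Monday and never skips at New Year.
    week_index = (day.toordinal() - 1) // 7
    week_set = "AB"[week_index % INCREMENTAL_WEEKS]
    return "INC-%s-%s" % (week_set, DAY_NAMES[weekday])


print(tapes_needed(5))   # Monday - Friday shop
print(tapes_needed(6))   # Monday - Saturday shop
print(incremental_tape_label(datetime.date(1999, 5, 3)))
```

Writing the label on the tape and in the log is what keeps the right tape going in on the right day; the sketch just makes the rotation explicit.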

Physical Diversity
This is the idea of storing the same data in or on more than one physical device. Some common implementations of this idea are disk mirroring, striping, and other hardware and software solutions grouped under the term Redundant Array of Inexpensive Disks (RAID).

Linux comes with the md RAID utilities on board, and I understand these are expanded in the new releases using the 2.2 kernel. Stay tuned.

Geographical Diversity
Remember the backup tapes you make? Make copies and put them somewhere other than your primary installation.

If they contain sensitive or proprietary data, a safety deposit box is a good idea.

All done now, right? Oh no, grasshopper. Now send one to a different part of the country than where you reside.

Can you say FLOOD? EARTHQUAKE? TORNADO? Sure, I knew you could.

Now that I've hopefully scared the heck out of you, or before you start laughing at the skinny, crippled UNIX nut that gets paid obscene fees, think for a minute about two things: What if your company lost all its data irrevocably? Could you still stay in business? Ah, now you see why the cryp makes the bucks.

Finishing Up
All kidding aside, I hope I have impressed on you, in some small way, the importance of a thorough, comprehensive plan to deal with data disasters.

I have seen companies go out of business because of this. I wrote this to try to keep you from being one of them.

Other sources of Information

General Linux references
http://www.redhat.com/
http://www.slackware.com/
http://www.calderasystems.com/
http://www.suse.com/

Topic specific references
Network Administrator's Guide
System Administrator's Guide
UPS HOWTO

To learn more
http://www.ugu.com/
http://www.webzone.net/jimm/dr_links.htm
 


Previous ``Linux Primer'' Columns

Linux Primer #1, September 1998
Linux Primer #2, October 1998
Linux Primer #3, November 1998
Linux Primer #4, December 1998
Linux Primer #5, January 1999
Linux Primer #6, February 1999
Linux Primer #7, March 1999
Linux Primer #8, March 1999


Copyright © 1999, Ron Jenkins
Published in Issue 41 of Linux Gazette, May 1999

