Highlights in backup and recovery

Written by Anders on April 28, 2008 – 6:42 am

After all the work and planning for your new backup and recovery system, you finally get it installed and configured. You start running backups. Life is good. But wait, someone just called and said there was a problem. How could there be a problem? You have been so careful and thorough. This can’t be happening! Well, it really does happen. In fact, if you look at your backup and recovery architecture, you will notice that a simple NetBackup backup of a client touches a significant amount of your enterprise.

This backup starts with a process on the master server that determines whether it is the proper time to do a backup. Then there are communications between the master server and the media server, followed by communications between the media server and the client. The media server communicates with a robotic library and requests that a specific tape be loaded into a specific drive. The client starts a process to read the data from its disk and starts sending the data across the network to the media server. The media server receives this data and passes it through shared memory to a tape drive while passing the metadata back to the master server, where it is stored in a catalog on a disk on the master server. When the backup is finished, the media server closes the tape and asks the library to take the tape from the drive and put it back in a specific slot.
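To make the reach of that single backup concrete, here is a minimal Python sketch that writes the stages above down as data. The component labels and the `Step`, `all_components`, and `suspects` names are illustrative choices for this sketch, not NetBackup identifiers, and the ordering rule in `suspects` is only a rough heuristic: when a stage fails, the components that no earlier stage had already exercised are listed first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One stage of the backup and the parts of the enterprise it touches."""
    description: str
    components: tuple


# Stages transcribed from the paragraph above. Labels are illustrative only.
BACKUP_STEPS = [
    Step("scheduler decides it is time to run", ("master server",)),
    Step("master contacts media server",
         ("master server", "network", "media server")),
    Step("media server contacts client",
         ("media server", "network", "client")),
    Step("library loads tape into drive",
         ("media server", "robotic library", "tape drive")),
    Step("client reads its disk and sends data",
         ("client", "client disk", "network", "media server")),
    Step("media server writes data to tape",
         ("media server", "shared memory", "tape drive")),
    Step("metadata stored in catalog",
         ("media server", "network", "master server", "catalog disk")),
    Step("tape closed and returned to its slot",
         ("media server", "robotic library", "tape drive")),
]


def all_components(steps):
    """Every distinct component the backup touches, in first-use order."""
    seen = []
    for step in steps:
        for component in step.components:
            if component not in seen:
                seen.append(component)
    return seen


def suspects(steps, failed_index):
    """Components of the failed stage, untested ones first (a heuristic)."""
    already_exercised = set()
    for step in steps[:failed_index]:
        already_exercised.update(step.components)
    failed = steps[failed_index].components
    fresh = [c for c in failed if c not in already_exercised]
    proven = [c for c in failed if c in already_exercised]
    return fresh + proven


if __name__ == "__main__":
    print(len(all_components(BACKUP_STEPS)), "components touched")
    print("failed at tape load, check:", suspects(BACKUP_STEPS, 3))
```

A component that worked in an earlier stage can still fail later, so treat the ordering as a starting point for where to look, not as proof.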

With that one backup, I have exercised disks on a couple of different systems, exercised the network, used a robotic library, and written to a drive. I’ve really exercised a good portion of the total enterprise. This makes backup and recovery one of the best enterprise-wide diagnostic tools. It also makes troubleshooting backup and recovery problems all the more important and difficult, since you could be troubleshooting network issues, client system issues, hardware issues at several different places, and overall operating system issues, just to mention a few. Here I give you a basic idea of how to approach these problems and identify some of the tools you will need.

The most important part of troubleshooting is to understand how your particular application works. Functionally, what services, daemons, or processes are started as a result of a backup job being executed? Backup applications are like avalanches; once the backup job starts, it tends to spawn several other supporting processes to accomplish the task. Therefore, being able to track the relationship of these services, processes, and daemons is the first step in your journey to troubleshooting your environment. If you don’t have a solid understanding of the backup architecture and its functional overview, your troubleshooting will be hit-or-miss. I have seen many backup administrators troubleshoot a problem from entirely the wrong direction, losing valuable time. If your software vendor hasn’t published such an overview, contact them and request it. Ask as many people as you can to get this information, because without it, you won’t have all the tools you need to maintain your environment properly.
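One way to track which processes a backup job spawned is to rebuild the parent and child relationships from a process listing. The sketch below is a minimal Python example that assumes you already have (pid, ppid, name) records, for instance parsed from `ps -e -o pid,ppid,comm` on a Unix system. The process names in `SAMPLE` are made up for illustration and are not the names your backup product uses.

```python
from collections import defaultdict

# Hypothetical (pid, ppid, name) records; real ones would come from your
# own process listing while a backup job is running.
SAMPLE = [
    (100, 1, "scheduler"),
    (210, 100, "backup-manager"),
    (220, 210, "tape-writer"),
    (230, 210, "client-reader"),
    (300, 1, "unrelated-daemon"),
]


def build_tree(records):
    """Map each parent PID to the list of (pid, name) children it started."""
    children = defaultdict(list)
    for pid, ppid, name in records:
        children[ppid].append((pid, name))
    return children


def descendants(children, root_pid):
    """Return (depth, pid, name) for everything under root_pid, depth-first."""
    out = []
    # Reverse so that children pop off the stack in their listed order.
    stack = [(child, 1) for child in reversed(children.get(root_pid, []))]
    while stack:
        (pid, name), depth = stack.pop()
        out.append((depth, pid, name))
        for child in reversed(children.get(pid, [])):
            stack.append((child, depth + 1))
    return out


if __name__ == "__main__":
    tree = build_tree(SAMPLE)
    for depth, pid, name in descendants(tree, 100):
        print("  " * depth + f"{name} (pid {pid})")
```

Taking this kind of snapshot while a job runs, and comparing it with the vendor's functional overview, shows which supporting process never started or exited early.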

Whatever the case, you should document for yourself how the software is deployed in your environment. If this is an inherited environment, you should have either an outside consultant or internal IT staff perform a site assessment to document your current state. It’s always better to do this assessment before trouble is looming over your head, so you can be prepared and proactive.


About The Author: Anders

Anders is a freelance graphic designer. He specializes in CSS/XHTML web design and design of print materials including business cards, brochures, and flyers. You can view his portfolio at andershaig.com.
