NetWorker Blog

Commentary from a long-term NetWorker consultant and Backup Theorist


Posts Tagged ‘library performance’

Meet your library

Posted by Preston on 2009-04-26

For larger sites in particular, we frequently end up in situations where backup or system administrators are sufficiently remote from the datacentre that they rarely interact with servers “face to face”. As remote management features continue to advance, allowing interaction with pseudo-shutdown servers and devices, this will only increase.

This level of remoteness can create unrealistic expectations of operational performance, particularly when the chips are down and something (e.g., a recovery) needs to be done urgently.

So there’s something very important you should do with your tape libraries – you should meet them. By meeting them, I mean the following:

  • Sit in front of them, with a laptop or console.
  • Make sure you can hear the library in operation.
  • Run at least the following commands:
    • Load;
    • Unload;
    • Relabel;
    • Inventory;
    • Import;
    • Device clean;
    • Export.
  • If possible, also do the following:
    • Monitor how long it takes media to rewind and become available for eject once end of media (EOM) is reached;
    • Generate a SCSI bus reset while media is being read from and observe how long it takes the library to recover;
    • Generate a SCSI bus reset while media is being written to and observe how long it takes the library to recover.

Knowing how long these operations take to complete fulfills two important (and overlapping) functions:

  1. You now have a timeframe for common activities to rely on when you’re otherwise stressed;
  2. You’re less likely to panic and intervene because something seems to be taking too long, when in actual fact you just don’t normally note how long an operation takes.
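If you want those timings written down rather than remembered, something like the following could do it. This is only a rough sketch: `time_operation` and `record_baseline` are hypothetical helper names, and the commands shown are placeholders that just sleep, so you would substitute whatever load, unload, inventory and similar commands your own library and backup software actually use.

```python
import subprocess
import sys
import time


def time_operation(label, command):
    """Run one command and return (label, seconds elapsed, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(command, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return label, elapsed, result.returncode


def record_baseline(operations):
    """Time each (label, command) pair in order and return the results."""
    return [time_operation(label, command) for label, command in operations]


if __name__ == "__main__":
    # Placeholder commands only (an assumption for illustration): replace
    # them with the real commands for your library before trusting any numbers.
    placeholder = [sys.executable, "-c", "import time; time.sleep(0.1)"]
    operations = [("load", placeholder), ("unload", placeholder)]
    for label, seconds, code in record_baseline(operations):
        print(f"{label}: {seconds:.1f}s (exit code {code})")
```

Run it while you are sitting in front of the library, so the numbers you record line up with what you see and hear the robot doing.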

This is pretty important: I’ve seen a lot of important recoveries go from, say, stressful to full panic when someone intervenes excessively on a tape library and doesn’t give it appropriate time to “recover” from errors or interrupts.

Meeting brings understanding, understanding brings patience, patience brings success.

Posted in General thoughts, Policies