NetWorker Blog

Commentary from a long term NetWorker consultant and Backup Theorist

  • This blog has moved!

    This blog has now moved to nsrd.info/blog. Please jump across to the new site for the latest articles (and all old archived articles).

  • Enterprise Systems Backup and Recovery

    If you find this blog interesting, and either have an interest in or work in data protection/backup and recovery environments, you should check out my book, Enterprise Systems Backup and Recovery: A Corporate Insurance Policy. Designed for system administrators and managers alike, it focuses on features, policies, procedures and the human element of ensuring that your company has a suitable, working backup system rather than just a bunch of copies made by unrelated software, hardware and processes.

HSM implications for backup

Posted by Preston on 2009-06-11

If you’ve been following this blog for a while, you’ll know that one key ongoing performance issue I keep returning to is the cost of walking dense filesystems as part of backups.

One area that people sometimes don’t take into consideration is the implications of backing up filesystems that use HSM – Hierarchical Storage Management. In an HSM environment, files are migrated from primary to secondary (or even tertiary) storage based on age and access times. To make this seamless to the user, a small stub file with the same name is left behind on the filesystem; when a user attempts to access that file, the stub triggers a recall from HSM storage.
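To make that a little more concrete, here is a rough idea of how a stub might be spotted from outside the HSM product. This is a minimal sketch only, assuming a Unix-like system where a migrated file still reports its full logical size but occupies almost no blocks on primary disk; real backup agents use the HSM vendor’s API or an explicit “offline” attribute rather than a heuristic like this, and the path and threshold below are placeholders.

```python
import os
import stat

def looks_like_hsm_stub(path, stub_threshold=8 * 1024):
    """Guess whether 'path' is an HSM stub rather than a resident file.

    Heuristic only: a migrated file typically still reports its original
    logical size (st_size) but occupies almost no blocks on primary disk.
    Real backup agents rely on the HSM vendor's API or an offline flag.
    """
    st = os.lstat(path)
    if not stat.S_ISREG(st.st_mode):
        return False
    allocated = st.st_blocks * 512          # bytes actually held on primary disk
    return st.st_size > stub_threshold and allocated < stub_threshold

if __name__ == "__main__":
    # Example: count stubs vs resident files under a (placeholder) mount point.
    stubs = resident = 0
    for root, _dirs, files in os.walk("/primary/filesystem"):
        for name in files:
            if looks_like_hsm_stub(os.path.join(root, name)):
                stubs += 1
            else:
                resident += 1
    print(f"stubs={stubs} resident={resident}")
```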

So, in order to free up space on primary disk for new data, big files are migrated and tiny stub files are left behind. Over time, more big files are migrated, and more tiny files accumulate. You can probably see where I’m heading: this high number of little files can result in performance issues for the backup. Obviously, HSM systems are configured so that they recognise backup agents and the stub is backed up rather than the original file being recalled, so we’re not concerned about, say, backing up 4TB of migrated data for a 1TB filesystem; instead, our concern is that the cost of walking a big filesystem with an inordinately large number of small files will seriously impede the backup process.
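To put some rough numbers on that walk cost: even if each stub only costs a fraction of a millisecond to stat, the totals add up quickly across tens of millions of files. The per-file overheads below are illustrative assumptions, not measurements from any particular environment – plug in figures from your own backup logs.

```python
# Back-of-envelope estimate of filesystem walk time for a backup.
# The per-file overheads are assumed values for illustration only.

def walk_hours(file_count, per_file_ms):
    """Hours spent just walking/stat-ing 'file_count' files at 'per_file_ms' each."""
    return file_count * per_file_ms / 1000.0 / 3600.0

# An HSM-managed filesystem where most data has been migrated can still
# hold tens of millions of stub files on primary disk.
for files in (1_000_000, 10_000_000, 50_000_000):
    for ms in (0.1, 0.5, 1.0):
        print(f"{files:>11,} files @ {ms:.1f} ms/file -> {walk_hours(files, ms):6.1f} h")
```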

If you’re planning HSM, think very carefully about how you’re going to back up the resulting filesystem.

(Coming soon: demonstrations of the impact of dense filesystems on backup performance.)
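In the meantime, if you want a feel for the effect on your own systems, a metadata-only walk is enough to show it. Here is a minimal sketch – the target path is a placeholder, and it reads no file data, only performing the directory traversal and lstat() calls an indexing pass would make:

```python
import os
import time

def time_filesystem_walk(top):
    """Walk 'top', lstat()ing every entry, and report entries per second.

    This mimics the metadata traversal a backup agent performs before
    (or while) reading any data, which is where dense filesystems hurt.
    """
    start = time.time()
    count = 0
    for root, dirs, files in os.walk(top):
        for name in dirs + files:
            try:
                os.lstat(os.path.join(root, name))
                count += 1
            except OSError:
                pass                      # entry vanished or is unreadable
    elapsed = time.time() - start
    rate = count / elapsed if elapsed else float("inf")
    print(f"{count} entries in {elapsed:.1f}s ({rate:.0f} entries/sec)")

# Example (placeholder path): time_filesystem_walk("/data/dense")
```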


2 Responses to “HSM implications for backup”

  1. Roger said

    Hi

    Could not agree more. Backing up file stubs is a royal pain.

    Disk Extender and others (Storage Migrator etc) that “require” you to back up these stubs should reconsider how things are done.

    Having worked with other HSM systems for a long time, I know that there are other ways to accomplish this. Take SAM-FS (LSC|SUN|IBM?) for example.

    Being a filesystem itself, it has control over a lot of aspects that HSM systems layered on top of another filesystem cannot have. So backing up a SAM-FS filesystem is done by running the filesystem “dump” command (actually samfsdump). Send that dump to whatever media is appropriate – tape, an NFS mount, or whatever – it’s just a file, and not an unreasonably large one at that. And quite fast.

    So when HSM is “done right” it can actually work.

  2. brerrabbit said

    Preston, when you look into doing your demos, I’d be interested in your thoughts regarding the HSM implications for NDMP backups. We’re a Celerra shop doing NDMP backups and are close to implementing an archive solution with Centera and the Rainfinity FMA.


 