NetWorker Blog

Commentary from a long term NetWorker consultant and Backup Theorist

  • This blog has moved!

    This blog has now moved to nsrd.info/blog. Please jump across to the new site for the latest articles (and all old archived articles).

  • Enterprise Systems Backup and Recovery

    If you find this blog interesting, and either have an interest in or work in data protection/backup and recovery environments, you should check out my book, Enterprise Systems Backup and Recovery: A Corporate Insurance Policy. Designed for system administrators and managers alike, it focuses on features, policies, procedures and the human element of ensuring that your company has a suitable, working backup system rather than just a bunch of copies made by unrelated software, hardware and processes.

Posts Tagged ‘disk backup’

EMC, Data Domain, VTLs and Disk Backup

Posted by Preston on 2009-11-30

With their recent acquisition of Data Domain, some people at EMC have become table-thumping experts overnight on why it’s absolutely imperative that you back up to Data Domain boxes as disk backup over NAS, rather than to a fibre-channel connected VTL.

Their argument seems to come from the numbers – the wrong numbers.

The numbers constantly quoted are sales of disk backup Data Domain vs VTL Data Domain. That is, some EMC and Data Domain reps will confidently assert that, by the numbers, significantly more Data Domain for Disk Backup has been sold than Data Domain with VTL. That’s like saying that Windows is superior to Mac OS X because it sells more. Or, to pick a slightly less controversial topic, it’s like saying that DDS is better than LTO because there have been more DDS drives and tapes sold than there have ever been LTO drives and tapes.

I.e., an argument by those numbers doesn’t wash. It rarely has, it rarely will, and nor should it. (Otherwise we’d all be afraid of sailing too far from shore because that’s how it had always been done before…)

Let’s look at the reality of how disk backup currently stacks up in NetWorker. And let’s preface this by saying that if backup products actually started using disk backup properly tomorrow, I would be the first to shout “Don’t let the door hit your butt on the way out” to every VTL on the planet. As a concept, I wish VTLs didn’t have to exist, but in the practical real world, I recognise their need and their current ascendancy over ADV_FILE. I have, almost literally at times, been dragged kicking and screaming to that conclusion.

Disk Backup, using ADV_FILE type devices in NetWorker:

  • Can’t move a saveset from a full disk backup unit to a non-full one; you have to clear the space first.
  • Can’t simultaneously clone from, stage from, backup to and recover from a disk backup unit. No, you can’t do that with tape either, but when disk backup units are typically on the order of several terabytes, and virtual tapes are on the order of maybe 50-200 GB, that’s a heck of a lot less contention time for any one backup.
  • Use tape/tape drive selection algorithms for deciding which disk backup unit gets used in which order, resulting in worst case capacity usage scenarios in almost all instances.
  • Can’t accept a saveset bigger than the disk backup unit. (It’s like, “Hello, AMANDA, I borrowed some ideas from you!”)
  • Can’t be part-replicated between sites. If you’ve got two VTLs and you really need to do back-end replication, you can replicate individual pieces of media between sites – again, significantly smaller than entire disk backup units. When you define disk backup units in NetWorker, that’s the “smallest” media you get.
  • Are traditionally space wasteful. NetWorker’s limited staging routines encourage clumps of disk backup space by destination pool – e.g., “here’s my daily disk backup units, I use them 30 days out of 31, and those over there that occupy the same amount of space (practically) are my monthly disk backup units, I use them 1 day out of 31. The rest of the time they sit idle.”
  • Have poor staging options (I’ll do another post this week on one way to improve on this).
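The sort of staging policy NetWorker lacks isn’t hard to describe. Here’s a minimal, purely illustrative sketch in Python of a watermark-driven staging pass – every name in it is hypothetical, and nothing here corresponds to an actual NetWorker interface:

```python
# Illustrative watermark-based staging pass: when a disk backup unit
# exceeds a high-water mark, pick the oldest savesets to stage off
# until usage drops below a low-water mark. All names are hypothetical;
# this is a sketch of the policy, not NetWorker code.

def select_for_staging(savesets, capacity_gb, high_pct=85, low_pct=70):
    """savesets: list of (saveset_id, size_gb, completion_time)."""
    used = sum(size for _, size, _ in savesets)
    if used / capacity_gb * 100 < high_pct:
        return []                       # below high water mark: nothing to do
    target = capacity_gb * low_pct / 100
    to_stage = []
    # Stage (and subsequently purge) the oldest savesets first.
    for ssid, size, _ in sorted(savesets, key=lambda s: s[2]):
        if used <= target:
            break
        to_stage.append(ssid)
        used -= size
    return to_stage

dbu = [("ss1", 300, 1), ("ss2", 250, 2), ("ss3", 400, 3)]   # 950 GB used
print(select_for_staging(dbu, 1000))    # 95% full: stage oldest until <= 700 GB
```

The point of the sketch is that the decision is driven by free space on the device, not by the tape-era pool clumping described above.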

If you get a table thumping sales person trying to tell you that you should buy Data Domain for Disk Backup for NetWorker, I’d suggest thumping the table back – you want the VTL option instead, and you want EMC to fix ADV_FILE.

Honestly EMC, I’ll lead the charge once ADV_FILE is fixed. I’ll champion it until I’m blue in the face, then suck from an oxygen tank and keep going – like I used to, before the inadequacies got too much. Until then though, I’ll keep skewering that argument of superiority by sales numbers.

Posted in Architecture, NetWorker | 3 Comments »

Quibbles – The maddening shortfall of ADV_FILE

Posted by Preston on 2009-11-25

Everyone who has worked with ADV_FILE devices knows this situation: a disk backup unit fills, and the saveset(s) being written hang until you clear up space, because as we know savesets in progress can’t be moved from one device to another:

Savesets hung on full ADV_FILE device until space is cleared

Honestly, what makes me really angry (I’m talking Marvin the Martian really angry here) is that if a tape device fills and another tape of the same pool is currently mounted, NetWorker will continue to write the saveset on the next available device:

Saveset moving from one tape device to another

What’s more, if it fills and there’s a drive that doesn’t currently have a tape mounted, NetWorker will mount a new tape in that drive and continue the backup in preference to dismounting the full tape and reloading a volume in the current drive.

There’s an expression for the behavioural discrepancy here: That sucks.
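The discrepancy can be summed up in a few lines. This is a toy model in Python, not NetWorker code – purely to make the contrast explicit:

```python
# Toy model of the fill behaviour described above (not NetWorker code).
# Tape devices fail a saveset over to another volume in the pool;
# ADV_FILE devices leave it hanging until space is manually cleared.

def on_device_full(device_type, other_volume_available):
    if device_type == "tape":
        if other_volume_available:
            return "continue saveset on next available device"
        return "load a fresh tape in an empty drive and continue"
    # ADV_FILE: no failover path exists
    return "hang until space is cleared"

print(on_device_full("tape", True))
print(on_device_full("ADV_FILE", True))   # free space on another DBU doesn't help
```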

If anyone wonders why I say VTLs shouldn’t need to exist, but I still go and recommend them and use them, that’s your number one reason.

Posted in NetWorker, Quibbles | 2 Comments »

When will tape die?

Posted by Preston on 2009-08-10

As you may have noticed, I have a great deal of disrespect for “tape is dead” stories. To be blunt, I think they’re about as plausible as theories that the moon landing was faked.

So I thought I might list the criteria I think will have to happen in order for tape to die:

  1. SSD will need to offer the same capacity, shelf-life and price as the equivalent tape storage.

There’s been a lot of talk lately of MAIDs – Massive Arrays of Idle Disks – being the successor/killer to tape, on the premise that such arrays would allow large amounts of either snapshotted or deduplicated data to be kept online, replicated into multiple locations, and otherwise in a nigh-perfect nearline state.

This isn’t the way of the future. Like VTLs, MAIDs are a stop-gap measure that will address specific issues to do with tape, but not replace tape. Like VTLs, if the building is burning down you can’t rush into the computer room, grab the MAID and run out like you can with a handful of tapes. And as with VTLs and disk backup units, it’s entirely conceivable that a targeted virus/trojan (or even a mistake) could wipe out the contents of a MAID.

No, we won’t get to the point where tape can “die” until such time as there is a high speed, safe, and comparatively cheap removable format/media that offers the same level of true offline protection.

The trouble with this is simple – it’s a constantly moving goalpost. Restricting ourselves to just LTO for the purposes of this discussion, it’s conceivable that SSDs might, in a few years, catch up with LTO-4; however, with LTO-5 due out “soon”, and LTO-6 on the roadmap, SSDs don’t need to catch up with a static format, they need to catch up with a format that is continuing to improve and expand, both in speed and capacity.
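To put rough numbers on that moving goalpost: LTO-4 is 800 GB native, and the published LTO roadmap has historically aimed at roughly doubling capacity each generation. The sketch below assumes that doubling holds – an assumption for illustration, not a guarantee from the LTO consortium:

```python
# Back-of-envelope view of the moving goalpost. LTO-4 is 800 GB native;
# the doubling-per-generation growth rate is an assumption based on the
# published LTO roadmap, not a guarantee.

def lto_native_capacity_gb(generation, base_generation=4, base_gb=800):
    return base_gb * 2 ** (generation - base_generation)

for gen in (4, 5, 6):
    print(f"LTO-{gen}: ~{lto_native_capacity_gb(gen)} GB native")
```

By this estimate, an SSD that catches LTO-4 in a few years would still be a factor of four short of LTO-6.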

So perhaps, instead of being so narrow as to suggest that tape might die when SSDs catch up, it might be more accurate to suggest that tape may have a chance of being replaced when some new technology evolves with sufficient density, price-point, performance and portability that it makes like-for-like replacement possible.

There are “old timers” in the computer industry who can tell me stories of punch card systems and valve computers. I’m a “medium timer” so to speak in that I can tell stories to more youthful people in computing about working with printer-terminals, programming in RPG and reel-to-reel tape. So, do I envisage in 10-20 years time trying to explain what “tape” was to people just starting in the industry?

No.

Posted in Architecture, Backup theory, NetWorker | 7 Comments »

Should you use NAS for disk backup?

Posted by Preston on 2009-05-29

Over at SearchStorage, there’s an article at the moment about using NAS disk as a disk backup target – i.e., where (in NetWorker), the ADV_FILE device would be created.

I have to say, I strongly disagree with the notion of using NAS mounted filesystems for disk backup, even if NetWorker lets you. In short, it’s a very bad idea, and primarily for performance reasons.

Consider this – the optimal backup configuration for NAS is to use NDMP wherever possible; otherwise, if we back up the volume(s) as they are mounted on another host, every backup involves a double network transfer – once to retrieve the data from the NAS device to the mounter, and then a second transfer to have the backup product copy the data from the mounter to backup storage.
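That double transfer is easy to cost out. A back-of-envelope sketch in Python – the data size and link speed are assumed figures, and the serial two-hop model is deliberately simplistic:

```python
# Rough cost of the double network transfer when backing up NAS via a
# mount on another host. All figures are assumed, for illustration only.

def backup_hours(data_gb, link_mb_per_sec, transfers):
    # Each transfer moves the full data set across the network once.
    seconds = data_gb * 1024 / link_mb_per_sec * transfers
    return round(seconds / 3600, 1)

data_gb = 2000      # 2 TB of NAS data (assumed)
usable = 100        # ~100 MB/s usable on gigabit ethernet (assumed)
print(backup_hours(data_gb, usable, transfers=1))   # direct: ~5.7 hours
print(backup_hours(data_gb, usable, transfers=2))   # via a mounter: ~11.4 hours
```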

So, let me ask the obvious question – if performance issues are a primary reason not to back up NAS via mounts, are there any compelling performance reasons why the reverse would be acceptable?

I don’t believe there are. If wishing to use array presented storage for disk backup, it would be far more advisable to use SAN storage, where the volume(s) are presented and attached as just another form of local storage.

Backing up to NAS is one of those activities that falls into the realm of “just because you can do something doesn’t mean you should do it.”

[Edit, 2009-11-15]

In recent discussions with a couple of vendors, I’m willing to entertain the notion that backing up to NAS may be acceptable in an enterprise environment, but my caveat would still be a dedicated 10Gbit Ethernet link between the NAS server and the backup server.

Posted in Architecture, Backup theory | 3 Comments »

Merits of Virtual Tape Libraries (VTLs)

Posted by Preston on 2009-05-06

Or, “In 5 years time will we reflect on VTLs as an example of a bad direction in data protection?”

Introduction

Many people are convinced that VTLs are the bee’s knees – they offer backup to disk while still working within the bounds of a tape library (or libraries), are frequently considered “easier to conceptualise”, and are generally regarded by many as a good thing.

Therefore I want to preface what I’m about to discuss with the following:

  • The company I work for sells VTLs
  • I have actively proposed and recommended VTLs in particular scenarios
  • I will continue to actively propose and recommend VTLs in particular scenarios

Thus, I am not “anti-VTL” as such – I see them as representing valid usage in today’s enterprise backup market, though I don’t see them as a be-all and end-all replacement to traditional backup to disk. I don’t see them going away within the next few years either.

What I do see them as is a solution to symptoms, not problems. Indeed, I see VTLs as fundamentally inelegant. That’s not to say backup to disk is itself elegant; rather, VTLs are an inelegant triage option, while disk backup as currently implemented is an inelegant solution.

Or to put it another way, I believe the world would be a better place if VTL did not exist if and only if disk backup worked as it should. This is regardless of whichever backup product you’re working in.

What is missing in disk backup

To elucidate my point that VTL is a solution for symptoms, not problems, I first need to elaborate on why VTLs are sometimes currently required – and to do that, I need to explain what’s wrong with disk backup.

  1. Temptation:
    • Since 99% of solutions still require tape (nothing says “off-site, off-line copy” better than tape), there is a temptation to try to keep an entire solution as “all tape”.
    • It’s too easy to put all your eggs in one basket – far too often sites that deploy disk backup do so on their production array (even if it is a dedicated set of LUNs comprising disks that aren’t used for production data); this introduces array-level performance issues as a secondary concern, but most importantly, introduces significant potential for cascading failures as a result of insufficient redundancy.
  2. Filesystem performance:
    • Depending on the operating system and file system used, fragmentation over time can cause performance issues.
    • Depending on the operating system and file system used, checking a multi-terabyte filesystem for consistency after crashes may be operationally unfeasible.
  3. Unintelligent management:
    • Media management has grown out of working mainly with tape; for instance, it took until 6.5 for NetBackup to support failing over a backup from one disk device to another (when the first one fills). NetWorker still isn’t there yet.
    • Disk is everywhere; indeed, spare disk is everywhere. Disk backup fails to take advantage of any distributed processing and storage that would be available within even a moderate organisation. I.e., for the average organisation, DAS isn’t going away any time soon. So why not actually intelligently make use of it?
    • Access to the contents of the disk backup filesystem is still available, both to other applications and to other users. Perhaps more frustratingly, such access is still required, yet it creates problems that should not exist.
    • Disk systems remain fundamentally more flexible than they are being used for in backup. It’s like having a Ferrari, but only ever driving around in 1st gear – or saying you’re good at ten pin bowling, even though you never play without bumper guards on the lanes. (I’d suggest that deduplication backup systems are the first good example of making intelligent and original use of backup to disk.)
    • Clients are typically unable to retrieve the data stored on disk backup without the presence of the backup server. While there are authorisation/security issues that must be considered, it’s wrong that disk backup requires the backup server to be active in order to readily retrieve the data. Furthermore, this creates operational demands on the backup server that should not exist.

What many of these issues come down to is the following:

  1. Use of traditional OS filesystems introduces fundamental limitations to disk backup.
  2. Coming at disk backup from the long-term perspective of tape adds what I’d call “programmer baggage”.
  3. Psychologically for some people it is easier to accept, “you need virtual tape and tape within your environment” than it is to accept “you need disk backup and tape within your environment“. Or rather, a quality and potentially expensive array that’s presenting itself as tape seems a better investment than a quality and potentially expensive array that should only be used by the backup system. Crazy, but true. (Even more so, reluctance to purchase disk backup that is highly redundant with RAID, hot spares, etc., occurs often. Purchase of VTLs as “black boxes” at stated capacity that employ the same, if not more levels of RAID and hot spares, is seen as “OK” by the same people who would quibble about RAID and hot spares for disk backup.)

I would argue that the issues above are not with the theoretical architecture and usage of disk backup, but with the actual implemented architecture and treatment of disk backup.

How to move disk backup forward

It’s clear that disk backup as an implementation, regardless of backup system or platform, has issues that hopefully over time will be addressed. Note however that I say hopefully, not probably.

So what needs to be done with disk backup?

  • Culturally, those who would shy from purchasing arrays (particularly those with redundancy) for the purpose of disk backup, but would happily sign a cheque for a VTL with a given/stated capacity need to, ahem, get over it. Just as it was necessary 10 years ago for the cultural shift to accepting that backup is necessary, there needs to be a cultural shift to understanding that it’s “six of one, half a dozen of the other”.
  • Backup vendors need to:
    • Ditch antiquated models of media management that are inherited from dealing with tape when dealing with disk. I’d argue that the deduplication products are the first true sign that this can be done.
    • Side-step the inherent limitations of filesystems and either implement their own, or come up with suitable raw-disk options that include appropriate accessibility tools (or liaise with operating system vendors to get filesystem variants designed exclusively for mass data storage needs).
    • Rearchitect their products to support massively distributed disk backup media. I liken this almost to the inverse of “the cloud”. Products like Mozy for instance (good, stable products for home users) backup to the cloud – the internet. The future of backup for the enterprise though is not in the cloud, but in the earth. Let’s call “earth-based storage” a paradigm where storage transcends individual operating system and filesystem boundaries and makes use of capacity no matter where it is within the logical bounds of an organisation.
  • Administrators and managers need to stop treating disk backup as “regular storage” that can be pinched and borrowed from. Did your backups fail because someone dropped a 1TB copy of a database onto the backup-to-disk area just because it was a nice big area? Guess what, that’s not the fault of the disk backup system.

What that means for VTL

Ultimately what this means for VTL (in my opinion), is that VTL is a solution to the problems inherent with the current state of implemented architecture for disk backup, not an alternative or better solution to the theoretical architecture of disk backup.

If disk backup were enhanced to reach a level of intelligent management and control that it is fundamentally capable of (being disk) it should erase the need for VTLs.

Back to the original question

Back then to our original question – will we, in 5 years time, reflect on VTLs as an example of a bad direction in data protection?

Yes, and no. Yes, we will because there’ll be a better understanding by that stage that VTLs are about triage. No, we won’t, because I don’t see disk backup architecturally reaching a point in 5 years time that it achieves everything it needs to in order to erase the need for VTLs.

Check back in 10 years though.

Posted in Architecture, Backup theory, General thoughts | 1 Comment »