NetWorker Blog

Commentary from a long term NetWorker consultant and Backup Theorist

  • This blog has moved!

    This blog has now moved to nsrd.info/blog. Please jump across to the new site for the latest articles (and all old archived articles).
  • Enterprise Systems Backup and Recovery

    If you find this blog interesting, and either have an interest in or work in data protection/backup and recovery environments, you should check out my book, Enterprise Systems Backup and Recovery: A Corporate Insurance Policy. Designed for system administrators and managers alike, it focuses on features, policies, procedures and the human element to ensuring that your company has a suitable and working backup system rather than just a bunch of copies made by unrelated software, hardware and processes.

Posts Tagged ‘VTL’

November’s top article

Posted by Preston on 2009-12-01

November saw the article, “Carry a jukebox with you (if you’re using Linux)” remain the top read story for another month. This details how to use the LinuxVTL open source software with NetWorker.

For those of you interested in setting this up for testing purposes, I’d also recommend reading the follow-up article I wrote this month, “NetWorker and LinuxVTL, redux”, which details recent advances Mark Harvey made in the code to allow NetWorker to use multiple virtual tape drives in the VTL. This makes LinuxVTL very capable as a supplement to a test or lab environment.

(As an aside, if you haven’t yet visited my new blog, I am the Anti-Cloud, you may want to flag it for reading. At Anti-Cloud, my goal is to point out the inadequacies of current attitudes by Public Cloud providers towards their customers, deflate some of the ridiculous hype that has grown out of Cloud Buzzword levels, and point out that not all of the revolutionary features are all that new, or revolutionary.)


Posted in Aside, NetWorker

EMC, Data Domain, VTLs and Disk Backup

Posted by Preston on 2009-11-30

With their recent acquisition of Data Domain, some people at EMC have become table thumping experts overnight on why it’s absolutely imperative that you backup to Data Domain boxes as disk backup over NAS, rather than as a fibre-channel connected VTL.

Their argument seems to come from the numbers – the wrong numbers.

The numbers constantly quoted are number of sales of disk backup Data Domain vs VTL Data Domain. That is, some EMC and Data Domain reps will confidently assert that by the numbers, a significantly higher percentage of Data Domain for Disk Backup has been sold than Data Domain with VTL. That’s like saying that Windows is superior to Mac OS X because it sells more. Or to perhaps pick a little less controversial topic, it’s like saying that DDS is better than LTO because there’s been more DDS drives and tapes sold than there’s ever been LTO drives and tapes.

I.e., an argument by those numbers doesn’t wash. It rarely has, it rarely will, and nor should it. (Otherwise we’d all be afraid of sailing too far from shore because that’s how it had always been done before…)

Let’s look at the reality of how disk backup currently stacks up in NetWorker. And let’s preface this by saying that if backup products actually started using disk backup properly tomorrow, I would be the first to shout “Don’t let the door hit your butt on the way out” to every VTL on the planet. As a concept, I wish VTLs didn’t have to exist, but in the practical real world, I recognise their need and their current ascendency over ADV_FILE. I have, almost literally at times, been dragged kicking and screaming to that conclusion.

Disk Backup, using ADV_FILE type devices in NetWorker:

  • Can’t move a saveset from a full disk backup unit to a non-full one; you have to clear the space first.
  • Can’t simultaneously clone from, stage from, backup to and recover from a disk backup unit. No, you can’t do that with tape either, but when disk backup units are typically in the order of several terabytes, and virtual tapes are in the order of maybe 50-200 GB, that’s a heck of a lot less contention time for any one backup.
  • Use tape/tape drive selection algorithms for deciding which disk backup unit gets used in which order, resulting in worst case capacity usage scenarios in almost all instances.
  • Can’t accept a saveset bigger than the disk backup unit. (It’s like, “Hello, AMANDA, I borrowed some ideas from you!”)
  • Can’t be part-replicated between sites. If you’ve got two VTLs and you really need to do back-end replication, you can replicate individual pieces of media between sites – again, significantly smaller than entire disk backup units. When you define disk backup units in NetWorker, that’s the “smallest” media you get.
  • Are traditionally space wasteful. NetWorker’s limited staging routines encourage clumps of disk backup space by destination pool – e.g., “here’s my daily disk backup units, I use them 30 days out of 31, and those over there that occupy the same amount of space (practically) are my monthly disk backup units, I use them 1 day out of 31. The rest of the time they sit idle.”
  • Have poor staging options (I’ll do another post this week on one way to improve on this).

If you get a table thumping sales person trying to tell you that you should buy Data Domain for Disk Backup for NetWorker, I’d suggest thumping the table back – you want the VTL option instead, and you want EMC to fix ADV_FILE.

Honestly EMC, I’ll lead the charge once ADV_FILE is fixed. I’ll champion it until I’m blue in the face, then suck from an oxygen tank and keep going – like I used to, before the inadequacies got too much. Until then though, I’ll keep skewering that argument of superiority by sales numbers.

Posted in Architecture, NetWorker

Quibbles – The maddening shortfall of ADV_FILE

Posted by Preston on 2009-11-25

Everyone who has worked with ADV_FILE devices knows this situation: a disk backup unit fills, and the saveset(s) being written hang until you clear up space because, as we know, savesets in progress can’t be moved from one device to another:

Savesets hung on full ADV_FILE device until space is cleared

Honestly, what makes me really angry (I’m talking Marvin the Martian really angry here) is that if a tape device fills and another tape of the same pool is currently mounted, NetWorker will continue to write the saveset on the next available device:

Saveset moving from one tape device to another

What’s more, if it fills and there’s a drive that currently doesn’t have a tape mounted, NetWorker will mount a new tape in that drive and continue the backup in preference to dismounting the full tape and reloading a volume in the current drive.

There’s an expression for the behavioural discrepancy here: That sucks.

If anyone wonders why I say VTLs shouldn’t need to exist, but I still go and recommend them and use them, that’s your number one reason.
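To make the discrepancy concrete, here is a minimal toy model of the two behaviours described above. It is not NetWorker code: the function name, the `can_span` flag and the sizes are all hypothetical, and it only illustrates "continue on the next volume" versus "hang on the full one".

```python
# Toy model only. This is NOT how NetWorker is implemented; it simply
# mimics the two behaviours described in the post.

def write_saveset(size_gb, volumes, can_span):
    """Write a saveset of size_gb onto volumes (a list of free GB per volume).

    can_span=True  models tape: when a volume fills, continue on the next one.
    can_span=False models ADV_FILE: when the device fills, the saveset hangs.
    The volumes list is updated in place.
    Returns (written_gb, state) where state is "done" or "hung".
    """
    remaining = size_gb
    for i, free in enumerate(volumes):
        chunk = min(remaining, free)
        volumes[i] -= chunk
        remaining -= chunk
        if remaining == 0:
            return size_gb, "done"
        if not can_span:
            # An in-progress saveset cannot move to another device.
            return size_gb - remaining, "hung"
    # Ran out of volumes entirely.
    return size_gb - remaining, "hung"


# Hypothetical example: a 300 GB saveset and two volumes with 200 GB free each.
tape_result = write_saveset(300, [200, 200], can_span=True)
disk_result = write_saveset(300, [200, 200], can_span=False)
print("tape-like:", tape_result)
print("ADV_FILE-like:", disk_result)
```

In the tape-like case the saveset completes across both volumes; in the ADV_FILE-like case it stalls once the first unit is full, even though the second unit has plenty of room.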

Posted in NetWorker, Quibbles

NetWorker and linuxvtl, Redux

Posted by Preston on 2009-11-14

Some time ago, I posted a blog entry titled Carry a Jukebox with you, if you’re using Linux, which referred to using linuxvtl with NetWorker. The linuxvtl project is run by my friend Mark Harvey, who has been working with enterprise backup products as long as me.

At the time I blogged, the key problem with the LinuxVTL implementation was that NetWorker didn’t recognise the alternate device IDs generated by the code – it relied on WWNNs, which were the same for each device.

I was over the moon when I received an email from Mark a short while ago saying he’s now got multiple devices working in a way that is compatible with NetWorker. This is a huge step forward for Linux VTL.

So, what’s changed?

While I’ve not had confirmation from Mark, I’m working on the basis that you do need the latest source code (mhvtl-2009-11-10.tgz as of the time of writing).

The next step, to quote Mark, is that we need to step away from StorageTek and define the library as SpectraLogic:

p.s. The “fix” is to define the robot as a Spectralogic NOT an L700.
The STK L700 does not follow the SMC standards too well. It looks like
NetWorker uses the ‘L700’ version and not the standards.
The Spectralogic follows the SMC standards (or at least their
interruption is the same as mine :) )

The final part is to update the configuration files to include details that allow the VTL code to generate unique WWNNs for NetWorker’s use.

Starting out with just 2 devices, here’s what my inquire output now looks like:

[root@tara ~]# inquire -l

-l flag found: searching all LUNs, which may take over 10 minutes per adapter
	for some fibre channel adapters.  Please be patient.

scsidev@0.0.0:SPECTRA PYTHON    5500|Autochanger (Jukebox), /dev/sg2
			        S/N:	XYZZY
			        ATNN=SPECTRA PYTHON          XYZZY
			        WWNN=11223344ABCDEF00
scsidev@0.1.0:QUANTUM SDLT600   5500|Tape, /dev/nst0
			        S/N:	ZF7584364
			        ATNN=QUANTUM SDLT600         ZF7584364
			        WWNN=11223344ABCDEF01
scsidev@0.2.0:QUANTUM SDLT600   5500|Tape, /dev/nst1
			        S/N:	ZF7584366
			        ATNN=QUANTUM SDLT600         ZF7584366
			        WWNN=11223344ABCDEF02

As you can see – each device has a different WWNN now, which is instrumental for NetWorker. (Note, I have adjusted the spacing slightly to make sure it fits in.)
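If you want to check this on a larger configuration without eyeballing the output, a short script can do it. The following is a minimal sketch, assuming the inquire output has the same layout as the listing above (a `scsidev@...` line per device followed by a `WWNN=` line); the function names are my own and not part of NetWorker.

```python
import re
from collections import Counter

# Sample text copied from the inquire -l output shown above.
INQUIRE_OUTPUT = """\
scsidev@0.0.0:SPECTRA PYTHON    5500|Autochanger (Jukebox), /dev/sg2
        S/N:    XYZZY
        ATNN=SPECTRA PYTHON          XYZZY
        WWNN=11223344ABCDEF00
scsidev@0.1.0:QUANTUM SDLT600   5500|Tape, /dev/nst0
        S/N:    ZF7584364
        ATNN=QUANTUM SDLT600         ZF7584364
        WWNN=11223344ABCDEF01
scsidev@0.2.0:QUANTUM SDLT600   5500|Tape, /dev/nst1
        S/N:    ZF7584366
        ATNN=QUANTUM SDLT600         ZF7584366
        WWNN=11223344ABCDEF02
"""


def wwnns_by_device(text):
    """Map each scsidev@b.t.l address to the WWNN reported beneath it."""
    devices = {}
    current = None
    for line in text.splitlines():
        if line.startswith("scsidev@"):
            current = line.split(":", 1)[0]
        else:
            match = re.match(r"\s*WWNN=(\S+)", line)
            if match and current:
                devices[current] = match.group(1)
    return devices


def duplicate_wwnns(devices):
    """Return the WWNNs that are reported by more than one device."""
    counts = Counter(devices.values())
    return sorted(wwnn for wwnn, n in counts.items() if n > 1)


devices = wwnns_by_device(INQUIRE_OUTPUT)
for address, wwnn in devices.items():
    print(address, wwnn)
print("duplicates:", duplicate_wwnns(devices))
```

An empty duplicates list is what you want to see before running jbconfig.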

Finally, here’s what my /etc/mhvtl/device.conf and /etc/mhvtl/library_contents files now look like:

[root@tara mhvtl]# cat device.conf 

VERSION: 2

# VPD page format:
# <page #> <Length> <x> <x+1>... <x+n>

# NOTE: The order of records is IMPORTANT...
# The 'Unit serial number:' should be last (except for VPD data)
# i.e.
# Order is : Vendor ID, Product ID, Product Rev and serial number finally
# Zero, one or more VPD entries.
#
# Each 'record' is sperated by one (or more) blank lines.
# Each 'record' starts at column 1

Library: 0 CHANNEL: 0 TARGET: 0 LUN: 0
 Vendor identification: SPECTRA
 Product identification: PYTHON
 Product revision level: 5500
 Unit serial number: XYZZY
 NAA: 11:22:33:44:ab:cd:ef:00

Drive: 1 CHANNEL: 0 TARGET: 1 LUN: 0
 Vendor identification: QUANTUM
 Product identification: SDLT600
 Product revision level: 5500
 Max density: 0x46
 NAA: 11:22:33:44:ab:cd:ef:01
 Unit serial number: ZF7584364
 VPD: b0 04 00 02 01 00

Drive: 2 CHANNEL: 0 TARGET: 2 LUN: 0
 Vendor identification: QUANTUM
 Product identification: SDLT600
 Product revision level: 5500
 Max density: 0x46
 NAA: 11:22:33:44:ab:cd:ef:02
 Unit serial number: ZF7584366
 VPD: b0 04 00 02 01 00

[root@tara mhvtl]# cat library_contents
# Define how many tape drives you want in the vtl..
# The ‘XYZZY_…’ is the serial number assigned to
# this tape device.
Drive 1: ZF7584364
Drive 2: ZF7584366
# Place holder for the robotic arm. Not really used.
Picker 1:
# Media Access Port
# (mailslots, Cartridge Access Port, <insert your favourate name here>)
# Again, define how many MAPs this vtl will contain.
MAP 1:
MAP 2:
MAP 3:
MAP 4:
# And the ‘big’ on, define your media and in which slot contains media.
# When the rc script is started, all media listed here will be created
# using the default media capacity.
Slot 1: 800843S3
Slot 2: 800844S3
Slot 3: 800845S3
Slot 4: 800846S3
Slot 5: 800847S3
Slot 6: 800848S3
Slot 7: 800849S3
Slot 8: 800850S3
Slot 9: 800851S3
Slot 10: 800852S3
Slot 11: 800853S3
Slot 12: 800854S3
Slot 13: 800855S3
Slot 14: 800856S3
Slot 15: 800857S3
Slot 16: 800858S3
Slot 17: 800859S3
Slot 18: 800860S3
Slot 19: 800861S3
Slot 20: 800862S3
Slot 21: BIG990S3
Slot 22: BIG991S3
Slot 23: BIG992S3
Slot 24: BIG993S3
Slot 25: BIG994S3
Slot 26: BIG995S3
Slot 27: BIG996S3
Slot 28: BIG997S3
Slot 29: BIG998S3
Slot 30: BIG999S3
Slot 31: CLN001L1
Slot 32: CLN002L1

NOTE in the “device.conf” file the NAA entries – these are key!
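Since a missing or repeated NAA entry is easy to overlook once you have more than a couple of drives, here is a minimal sketch of a sanity check over the two files. It assumes the record layout shown above, and it also assumes that the drive serial numbers in library_contents are meant to match those in device.conf (as they do in my listing). The function names are hypothetical and are not part of mhvtl.

```python
import re

# Trimmed copies of the two files shown above (one library, two drives).
DEVICE_CONF = """\
VERSION: 2

Library: 0 CHANNEL: 0 TARGET: 0 LUN: 0
 Vendor identification: SPECTRA
 Unit serial number: XYZZY
 NAA: 11:22:33:44:ab:cd:ef:00

Drive: 1 CHANNEL: 0 TARGET: 1 LUN: 0
 Vendor identification: QUANTUM
 NAA: 11:22:33:44:ab:cd:ef:01
 Unit serial number: ZF7584364

Drive: 2 CHANNEL: 0 TARGET: 2 LUN: 0
 Vendor identification: QUANTUM
 NAA: 11:22:33:44:ab:cd:ef:02
 Unit serial number: ZF7584366
"""

LIBRARY_CONTENTS = """\
Drive 1: ZF7584364
Drive 2: ZF7584366
Picker 1:
"""


def parse_device_conf(text):
    """Return one dict per Library/Drive record: kind, id, serial, naa."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        head = re.match(r"(Library|Drive):\s*(\d+)\b", line)
        if head:
            current = {"kind": head.group(1), "id": int(head.group(2)),
                       "serial": None, "naa": None}
            records.append(current)
        elif current is not None and line.startswith("Unit serial number:"):
            current["serial"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("NAA:"):
            current["naa"] = line.split(":", 1)[1].strip().lower()
    return records


def parse_library_drives(text):
    """Return {drive number: serial} from the 'Drive N: SERIAL' lines."""
    drives = {}
    for raw in text.splitlines():
        match = re.match(r"Drive\s+(\d+):\s*(\S+)", raw.strip())
        if match:
            drives[int(match.group(1))] = match.group(2)
    return drives


def find_problems(device_conf, library_contents):
    """List missing or duplicate NAA entries and mismatched drive serials."""
    problems = []
    records = parse_device_conf(device_conf)
    naas = [r["naa"] for r in records]
    for r in records:
        label = f"{r['kind']} {r['id']}"
        if r["naa"] is None:
            problems.append(f"{label}: no NAA entry")
        elif naas.count(r["naa"]) > 1:
            problems.append(f"{label}: NAA {r['naa']} is not unique")
    drives = parse_library_drives(library_contents)
    for r in records:
        if r["kind"] == "Drive" and drives.get(r["id"]) != r["serial"]:
            problems.append(f"Drive {r['id']}: serial differs from library_contents")
    return problems


print(find_problems(DEVICE_CONF, LIBRARY_CONTENTS) or "no problems found")
```

To run it against a real setup, read /etc/mhvtl/device.conf and /etc/mhvtl/library_contents into the two strings instead of using the embedded samples.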

With these changes done, jbconfig worked without missing a beat, and suddenly I had a 2 drive VTL running.

Great going, Mark!

While I’ve not yet tested, I suspect this fix will also ensure that the VTL can be configured on multiple storage nodes, which will be a fantastic improvement for library support work as well.

[Edit, 2009-11-18]

I’m pleased to say that the changes that have been made allow for the VTL to be created on more than one storage node. This presents excellent opportunities for debugging, testing and training:

LinuxVTL on server and storage node

Posted in Linux, NetWorker

Merits of target based deduplication

Posted by Preston on 2009-11-12

It goes without saying that we have to get smarter about storage. While I’m probably somewhat excessive in my personal storage requirements, I currently have 13TB of storage attached to my desktop machine alone. If I can do that at the desktop, think of what it means at the server level…

As disk capacities continue to increase, we have to work more towards intelligent use of storage rather than continuing the practice of just bolting on extra TBs whenever we want because it’s “easier”.

One of the things that we can do to more intelligently manage storage requirements for either operational or support production systems is to deploy deduplication where it makes sense.

That being said, the real merits of target based deduplication become most apparent when we compare it to source based deduplication, which is where the majority of this article will now take us.

A lot of people are really excited about source level deduplication, but like so many areas in backup, it’s not a magic bullet. In particular, I see proponents of source based deduplication start waving magic wands consisting of:

  1. “It will reduce the amount of data you transmit across the network!”
  2. “It’s good for WAN backups!”
  3. “Your total backup storage is much smaller!”

While each of these facts is true, they all come with big buts. From the outset, I don’t want it said that I’m vehemently opposed to source based deduplication; however, I will say that target based deduplication often has greater merits.

For the first item, this shouldn’t always be seen as a glowing recommendation. Indeed, it should only come into play if the network is a primary bottleneck – and that’s more likely going to be the case if doing WAN based backups as opposed to regular backups.

In regular backups, while there may be some benefit to reducing the amount of data transmitted, what you’re often not told is that this reduction comes at a cost – that being increased processor and/or memory load on the clients. Source based deduplication naturally has to shift some of the processing load back across to the client – otherwise the data will be transmitted and thrown away. (And otherwise proponents wouldn’t argue that you’ll transmit less data by using source based deduplication.)

So number one, if someone is blithely telling you that you’ll push less data across your network, ask yourself the following questions:

(a) Do I really need to push less data across the network? (I.e., is the network the bottleneck at all?)

(b) Can my clients sustain a 10% to 15% load increase in processing requirements during backup activities?

This makes the first advantage of source based deduplication somewhat less tangible than it normally comes across as.

Onto the second proposed advantage of source based deduplication – faster WAN based backups. Undoubtedly, this is true, since we don’t have to ship anywhere near as much data across the network. However, consider that we backup in order to recover. You may be able to reduce the amount of data you send across the WAN to backup, but unless you plan very carefully you may put yourself into a situation where recoveries aren’t all that useful. That is – you need to be careful to avoid trickle based recoveries. This often means that it’s necessary to put a source based deduplication node in each WAN connected site, with those nodes replicating to a central location. What’s the problem with this? Well, none from a recovery perspective – but it can considerably blow out the cost. Again, informed decisions are very important to counter-balance source based deduplication hyperbole.
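A quick back-of-the-envelope calculation shows why trickle recoveries are the catch. The sketch below uses entirely hypothetical figures (a 500 GB client, 5% of it sent per deduplicated backup, a 10 Mbit/s WAN link) and assumes that a full recovery has to pull the whole dataset back across the link; protocol overhead is ignored.

```python
def transfer_hours(data_gb, link_mbit_per_s):
    """Hours needed to move data_gb over a link, ignoring protocol overhead."""
    megabits = data_gb * 8 * 1000  # 1 GB = 8000 megabits (decimal units)
    return megabits / link_mbit_per_s / 3600


# Hypothetical figures, for illustration only.
client_data_gb = 500        # size of the client's data
sent_after_dedup_gb = 25    # assumed 5% unique data sent per backup
wan_mbit_per_s = 10         # assumed WAN link speed

backup_hours = transfer_hours(sent_after_dedup_gb, wan_mbit_per_s)
recovery_hours = transfer_hours(client_data_gb, wan_mbit_per_s)

print(f"backup over WAN:        {backup_hours:.1f} hours")
print(f"full recovery over WAN: {recovery_hours:.1f} hours")
```

With these assumed numbers the backup fits in a night while the full recovery takes days, which is exactly why a local deduplication node at each site ends up being necessary.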

Finally – “your total backup storage is much smaller!”. This is true, but it’s equally an advantage of target based deduplication as well; while the rates may have some variance the savings are still great regardless.

Now let’s look at a couple of other factors of source based deduplication that aren’t always discussed:

  1. Depending on the product you choose, you may get less OS and database support than you’re getting from your current backup product.
  2. The backup processes and clients will change. Sometimes quite considerably, depending on whether your vendor supports integration of deduplication backup with your current backup environment, or whether you need to change the product entirely.

It’s when we look at those two concerns that target based deduplication really starts to shine. You still get deduplication, but with significantly less interruption to your environment and your processes.

Regardless of whether target based deduplication is integrated into the backup environment as a VTL, or whether it’s integrated as a traditional backup to disk device, you’re not changing how the clients work. That means whatever operating systems and databases you’re currently backing up you’ll be able to continue to backup, and you won’t end up in the (rather unpleasant) situation of having different products for different parts of your backup environment. That’s hardly a holistic approach. It may also be the case that the hosts where you’d get the most out of deduplication aren’t eligible for it – again, something that won’t happen with target based deduplication.

The changes for integrating target based deduplication in your environment are quite small –  you just change where you’re sending your backups to, and let the device(s) handle the deduplication, regardless of what operating system or database or application or type of data is being sent. Now that’s seamless.

Equally so, you don’t need to change your backup processes for your current clients – if it’s not broken, don’t fix it, as the saying goes. While this can be seen by some as an argument for stagnation, it’s not; change for the sake of change is not always appropriate, whereas predictability and reliability are very important factors to consider in a data protection environment.

Overall, I prefer target based deduplication. It integrates better with existing backup products, reduces the number of changes required, and does not place restrictions on the data you’re currently backing up.

Posted in Backup theory, NetWorker

Most popular in August

Posted by Preston on 2009-09-01

The most visited post in August was again, Carry a jukebox with you (if you’re using Linux). I think part of this must be attributed to the linkage of Linux with Free. I.e., because Linux is seen as low cost (or no cost), there’s a core group, particularly of open source fans, who want to come up with a totally free solution for their environment, no matter what environment that is.

However, I don’t think that’s all that can be attributed to why this article keeps on drawing people in. Despite my reservations about VTLs, a lot of people are interested in deploying them. It’s important to stress again – I don’t dislike VTLs, I just wish we didn’t need them. Recognising though that we do need them, I can appreciate the management benefits that they bring to an environment.

From a support perspective of course I’m a big fan – with a VTL I can carry a jukebox around wherever I go.

The Linux VTL post even beat out old standards – the parallelism and NSR peer information related posts, which normally win hands down every month.

(From a policy and procedural perspective though, it was good to see that the introductory post to zero error policies, What is a Zero Error Policy?, got the next most attention. I can’t really stress enough how important I think zero error policies are to systems management in general, and backup/data protection specifically.)

Posted in Aside, NetWorker

When will tape die?

Posted by Preston on 2009-08-10

As you may have noticed, I have a great deal of disrespect for “tape is dead” stories. To be blunt, I think they’re about as plausible as theories that the moon landing was faked.

So I thought I might list the criteria I think will have to happen in order for tape to die:

  1. SSD will need to offer the same capacity, shelf-life and price as equivalent storage tape.

There’s been a lot of talk lately of MAIDs – Massive Arrays of Idle Disks – being the successor/killer to tape, on the premise that such arrays would allow large amounts of either snapshotted or deduplicated data to be kept online, replicated into multiple locations, and otherwise in a nigh-perfect nearline state.

This isn’t the way of the future. Like VTLs, MAIDs are a stop-gap measure that will address specific issues to do with tape, but not replace tape. Like VTLs, if the building is burning down you can’t rush into the computer room, grab the MAID and run out like you can with a handful of tapes. Similarly to VTLs and disk backup units, it’s entirely conceivable that a targeted virus/trojan (or even a mistake) could wipe out the content of a MAID.

No, we won’t get to the point where tape can “die” until such time as there is a high speed, safe, and comparatively cheap removable format/media that offers the same level of true offline protection.

The trouble with this is simple – it’s a constantly moving goalpost. Restricting ourselves to just LTO for the purposes of this discussion, it’s conceivable that SSDs might, in a few years, catch up with LTO-4; however, with LTO-5 due out “soon”, and LTO-6 on the roadmap, SSDs don’t need to catch up with a static format, they need to catch up with a format that is continuing to improve and expand, both in speed and capacity.

So perhaps, instead of being so narrow as to suggest that tape might die when SSDs catch up, it might be more accurate to suggest that tape may have a chance of being replaced when some new technology evolves with sufficient density, price-point, performance and portability that it makes like-for-like replacement possible.

There are “old timers” in the computer industry who can tell me stories of punch card systems and valve computers. I’m a “medium timer” so to speak in that I can tell stories to more youthful people in computing about working with printer-terminals, programming in RPG and reel-to-reel tape. So, do I envisage in 10-20 years time trying to explain what “tape” was to people just starting in the industry?

No.

Posted in Architecture, Backup theory, NetWorker

VTLs and media load/unload times

Posted by Preston on 2009-08-08

VTLs are fast, right? There are no physical media loads or unloads associated with tape loads and unloads, after all.

That’s the way the problem normally starts. I’ve periodically seen companies with VTLs make the assumption that just because there’s no correlation between tape load/unload operations and physical media operations, it’s safe to dial down the autochanger sleep times for load and unload operations.

If you’re not sure what I’m talking about, they’re here:

Autochanger load and unload sleep settings


So, given there’s no physical media to be loaded/unloaded, or robot head to do the loading/unloading, the temptation is to dial down the sleep timers to 1, or even 0.

The problem with this is the assumption that being all software, a VTL is so insanely fast that it doesn’t need any timers associated with its operations.

So, inevitably, what I’ve seen when the load/unload sleep timers are dialled down too low, is that odd autochanger errors start to creep into operations – typically when there’s a bunch of virtual tapes requiring labelling/recycling, or there’s a lot of virtual tapes being loaded/unloaded during busy backup operations.

I’d therefore make the following recommendations:

  • Never set the load or unload sleep timers to 1 or 0, even if basic testing shows it to be OK.
  • To determine appropriate settings, drop the timers from their default of 5 to 4 and see how backups run for a few days. If there are no issues you can repeat down to 3 seconds, then 2 seconds, but as per the above, don’t go below 2.
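The step-down procedure above can be sketched as a tiny rule. This is only an illustration: the names are hypothetical, and the "back off by one second if errors appear" branch is my own assumption about what to do when a lower setting causes trouble, not something NetWorker does for you.

```python
DEFAULT_SLEEP = 5  # default load/unload sleep (seconds), per the post
MIN_SLEEP = 2      # never go below this, even if basic testing looks fine


def next_sleep(current, errors_seen):
    """Suggest the next load/unload sleep value after a few days of backups.

    errors_seen: True if odd autochanger errors appeared at the current value.
    Assumption: on errors, step back up by one second (capped at the default).
    """
    if errors_seen:
        return min(current + 1, DEFAULT_SLEEP)
    return max(current - 1, MIN_SLEEP)


# Hypothetical run: no errors at 5, 4 and 3 seconds, then errors at 2 seconds.
value = DEFAULT_SLEEP
history = [value]
for errors in (False, False, False, True):
    value = next_sleep(value, errors)
    history.append(value)
print(history)
```

The point of the floor is that the function never returns 1 or 0, no matter how many clean runs it sees.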

While backup performance is (as much as anything) about shaving off critical seconds here and there, making those time savings at the risk of introducing issues, particularly issues that come up most under load, should be avoided at all times.

Posted in NetWorker

Last month’s top story

Posted by Preston on 2009-08-01

I thought that I might from now on try to do a brief comment at the start of each month on what the most popular story of the previous month was.

There is one caveat – as I alluded to in the previous “Top 5” post, I have some posts that get hit an awful lot of times. So, the absolute most-referenced posts, being “Fixing NSR peer information”, “Parallelism in NetWorker” and “Changing saveset browse/retention time”, are effectively disqualified from consideration for a “Last month’s top story”.

For this inaugural entry, we have “Carry a jukebox with you (if you’re using Linux)”. Outside of the above 3 articles, this one was viewed the most – and, for what it’s worth, generated a lot of follow-through clicks going through to Mark Harvey’s linuxvtl web page.

While not a production VTL, Mark’s Linux VTL software has already given me a great deal of efficiencies over this last month in my lab environment. I have versions of NetWorker on lab servers all with the VTL configured, making testing of a wide variety of options considerably easier than having physical tape libraries connected and powered on. I hope others are finding it similarly useful. One of the comments to the article was someone asking about a more complete set of instructions for getting the VTL up and running – I aim to have this done by the end of this weekend.

Posted in Aside, Linux

Carry a jukebox with you (if you’re using Linux)

Posted by Preston on 2009-07-13

Hi!

To read this article, please go to the maintained version at the NetWorker Information Hub.

Posted in Aside, NetWorker