NetWorker Blog

Commentary from a long term NetWorker consultant and Backup Theorist

  • This blog has moved!

    This blog has now moved to nsrd.info/blog. Please jump across to the new site for the latest articles (and all old archived articles).
  •  


     


     

  • Enterprise Systems Backup and Recovery

    If you find this blog interesting, and either have an interest in or work in data protection/backup and recovery environments, you should check out my book, Enterprise Systems Backup and Recovery: A Corporate Insurance Policy. Designed for system administrators and managers alike, it focuses on features, policies, procedures and the human element to ensuring that your company has a suitable and working backup system rather than just a bunch of copies made by unrelated software, hardware and processes.

Archive for the ‘NetWorker’ Category

Please change your channel: nsrd is moving!

Posted by Preston on 2009-12-22

I have greatly enjoyed running this site on WordPress for the last year, but I’ve decided I want to make a more complete site for NetWorker resources, and to do that, I need more than just a blog. So, as of this article, I’m moving the blog across to:

http://nsrd.info/blog

I’ll be leaving this blog around on WordPress in its current state for some time to come – there’s a lot of search engine results that link to it; however, if you’ve got links to this blog either in your bookmarks or on your website, I’d greatly appreciate a change of link across to nsrd.info/blog.

While I’m planning on hosting more than just a blog on the new site, for the moment I’ve focused on the blog transfer. The rest of the site is my summer holiday project*.


* Remember, I’m in the southern hemisphere, so that means something I’m actively working on now :-)

Posted in NetWorker | Tagged: | Comments Off on Please change your channel: nsrd is moving!

Show me the man pages

Posted by Preston on 2009-12-21

As a long term Unix admin, it’s frustrating when there are commands on my systems for which there aren’t man pages. As a long-term NetWorker user, it’s equally frustrating when there aren’t man pages for particular NetWorker commands.

When I’ve discussed this in the past, I’ve usually had a response of “that’s because you shouldn’t be running that command”. That’s a bad response. The correct response should be something along the lines of “oops, we’ll write a man page for the next release that states:

That command is for internal NetWorker use only. It does X. It should not be run manually.

Having undocumented commands that give no output, hang or produce strange results is just inviting frustration. Of just the nsr prefixed commands, on my current 7.6 lab server, the following commands are undocumented:

  • nsravamar
  • nsravtar
  • nsrbmr
  • nsrcatconfig
  • nsr_cp_install
  • nsrdmpix
  • nsrdsa_recover
  • nsrdsa_save
  • nsrfile
  • nsrfsra
  • nsrlmc
  • nsrndmp_2fh
  • nsrrcopy
  • nsrrcopy2
  • nsrvcbserv_tool

So out of the 55 nsr prefixed commands I have on my server, 15 (or 27%) are undocumented.

Note to EMC: This does not produce a healthy level of trust. Please – get some documentation on these commands, even if that documentation gives us a one line overview of where they’re used and tells us not to run them ourselves.

Posted in General Technology, General thoughts, NetWorker | Tagged: , , , | Comments Off on Show me the man pages

Fingers Crossed for some New Years Resolutions from EMC

Posted by Preston on 2009-12-15

Over at Storagebod’s Blog, Martin Glassborow has been providing a highly interesting and entertaining set of letters to Father Christmas, which I’d highly recommend reading.

So in the spirit of Martin’s postings, I’m going to do a slight variant – here’s a few things that I’ll be keeping my fingers crossed for that EMC will provide us in NetWorker in 2010.

Dear EMC,

Could you please add to your NetWorker New Years Resolution the following items?

  • Windows 2008 Service Pack 2 support as soon as possible!
  • Improvements for ADV_FILE devices:
    • Savesets can spill over from one disk backup unit to another should the first become full.
    • Better rotation/selection method for devices/volumes.
  • Tapes:
    • Ability to query the “offsite” mminfo flag.
  • Cloning:
    • Inline cloning (simultaneous clone + original generation).
    • Fixing the “validcopies” flag!
  • NetWorker Management Console:
    • Manual client backup and recovery operations integrated into the console.
    • Manual module backup and recovery operations integrated into the console.
  • Resources/Configuration:
    • Ability to rename clients (and other resources).
    • Syntax for including another directive within a directive.
    • PIDs for nsrmmds stored within the device resource for easy viewing.
  • Operations:
    • Client GUI manual backups to support pools other than Default.
    • More granular savesets.

I could think of a bunch of other enhancements, but these are the ones I’m hoping most are on your new years resolution list for NetWorker in 2010. I’m hoping I don’t have to put any of these on a wish list for your 2011 new years resolution list.

Thanks, and have a happy new year!

Preston de Guise.

Posted in NetWorker | Tagged: , , | 2 Comments »

The pros and cons of Legato License Manager

Posted by Preston on 2009-12-15

Bundled with NetWorker for some time now has been the peripheral product, Legato License Manager (LLM). (Now normally just referred to as “License Manager”.)

If you’ve never used LLM, you may wonder what use it serves. But to do that, we first need to look at the control zone, that region of space that encompasses one or more NetWorker datazones. This might resemble the following:

Control and datazonesBoth the NetWorker Management Console (NMC) server (typically the “gst” processes) and LLM reside in the control zone – that is, they exist to service multiple datazones. However, in the same way that many sites end up running the datazone and control zone (via NMC/GST) on the same backup server, there’s nothing preventing you from using LLM to manage licenses separately to the core NetWorker services.

Making this transition is relatively straight forward, and I’ll save doing an article on that aspect unless people would like to see one – instead, I’d like to discuss the pros and cons of having licenses outside of NetWorker but still referenced by a NetWorker server.

Advantages of LLM

It’s best to first understand what LLM brings to you. I’ll use the following keys beside the advantages to let you know where they’re applicable:

  • (D+) – Useful for multiple datazones.
  • (1D) – Useful for single datazones.
  • (M) – Marketing advantage; touted as a bonus by EMC/Legato, rarely if ever used.

So, some of the advantages are:

  • (D+) Licenses may be purchased and installed in a central location.
  • (D+) Licenses may be reallocated between NetWorker servers as resource requirements/capabilities change (i.e., released from one server, and snapped up by another).
  • (1D/D+) License authorisation codes are tied to the LLM server, not the NetWorker server. Therefore if you’ve got an environment where you’re planning on doing some NetWorker server migrations, you can move your licenses to LLM and not have to repeatedly do host transfers.
  • (M) You can buy “bulk” licenses. (E.g., 10 x 5 Client Connection Licenses). This advantage seems to be minimising the number of licenses you need to enter. While this sounds cute in theory, I think it actually adds a layer to license complexity.
  • (1D/D+) Licenses reported in NMC show the exact features they are being used for – e.g., instead of showing “NetWorker Server, Network Edition/125” to indicate that the server is licensed for 125 clients, you might see “NetWorker Server, Network Edition/93” to indicate that currently there are 93 client licenses being used.
  • (M) LLM is where the full version of NMC is licensed, so you only have to learn one licensing administration system.

Disadvantages of LLM

The advantages of LLM don’t come without some disadvantages too. These are:

  • Not all licenses work all the time with LLM. For example, historically there has been ongoing issues with creating dedicated storage nodes in an environment using LLM. Typically it’s been necessary to add both a dedicated and a full storage node license, create the dedicated storage node devices, then delete the full storage node license. (Messy.)
  • When working outside of NMC, the command line access to LLM licenses (via lgtolic) is even more esoteric and preposterous to use than nsrcap and nsradmin.

Should you use LLM?

It depends on your site. If you’re a fairly small environment, I can’t see any purpose for switching from NetWorker licensing to LLM; however, if your site has a larger number of clients and is reasonably dynamic in client allocations, LLM may give you that extra easy oversight to justify switching to it even if you’ve only got one datazone in your environment.

Alternatively, if you’re planning to transition your backup server a few times over a shorter period (e.g., migrating from current old hardware to interim hardware to full new hardware), then moving licenses out of NetWorker and into LLM may save you the hassle of getting them re-authorised at each step.

Should your LLM server be considered “production”?

If you’re using LLM, the obvious question that some will ask is – can this just be plonked on any old desktop PC? The answer is no. Well, to qualify that answer a bit – nothing prevents you from doing this except common sense. By all means have this running as a virtual machine somewhere, but it should still be considered a production machine in the same way that a backup server is a production machine: it’s part of support production rather than primary production, but it’s still production.

Posted in Licensing, NetWorker | Tagged: , , , , | 4 Comments »

Long term NetWare recovery

Posted by Preston on 2009-12-10

Are you still backing up Novell NetWare hosts? If you are, I hope you’re actively considering what you’re going to do in relation to NetWare recoveries in March 2010, when NetWare support ceases from both Novell and EMC.

I still have a lot of customers backing up NetWare hosts, and I’m sure my customer set isn’t unique. While Novell still tries to convince customers to switch from traditional NetWare services to NetWare on OES/SLES, a lot of companies are continuing to use NetWare until “the last minute”.

The “last minute” is of course, March 2010, when standard support for NetWare finishes.

Originally, NetWare support in NetWorker was scheduled to finish in March 2009, but partners and customers managed to convince EMC to extend the support to March 2010, to match Symantec and co-terminate with Novell’s end of standard support for NetWare as well.

Now it’s time we start considering what happens when that support finishes. Namely:

  1. How will you recover long term NetWare backups?
  2. How will you still run NetWare systems?
  3. How will you manage NetWorker upgrades?

These are all fairly important questions. While we’re hopeful we might get some options for recovering NetWare backups on OES systems (i.e., pseudo cross-platform recoveries), there’s obviously no guarantees of that as yet.

So the question is – if you’re still using NetWare, how do you go about guaranteeing you can recover NetWare backups once NetWare has been phased out of existence?

The initial recommendation from Novell on this topic is: keep a NetWare box around.

I think this is a short-sighted recommendation on their part, and shows that they haven’t properly managed (internally) the transition from traditional NetWare to NetWare on OES/SLES. This is perhaps why there isn’t a 100% transition from one NetWare platform to the other. Being faced with unpalatable transition options, some Novell customers are instead considering alternate transitionary options.

Unfortunately, in the short term, I don’t see there being many options. I’m therefore inclined to recommend that:

  1. Companies backing up traditional NetWare who only need to continue to recover a very small number of backups consider performing an old-school migration – recover the data to a host, and backup on an operating system that will continue to enjoy OS vendor and EMC support moving forward.
  2. Companies backing up larger amounts of traditional NetWare should consider virtualising at least one, preferably a few more NetWare systems before end of support, and keeping good archival VM backups (to avoid having to do a reinstall), using those systems as recovery points for older NetWare data.

The longer-term concern is that the NetWare client in NetWorker has always been … interesting. Once NetWare support vanishes, the primary consideration for newer versions of NetWorker will be whether those newer versions actually support the old 7.2 NetWare client for recovery purposes.

With this in mind, it will become even more important to carefully review release notes and conduct test upgrades when new releases of NetWorker come out to confirm whether newer versions of the server software actually support communicating with the increasingly older NetWare client until such time as recovery from those NetWare backups is no longer required.

You may think this is a bit extreme, but bear in mind we don’t often see entire operating systems get phased out of existence, so it’s not a common problem. To be sure, individual iterations or releases may drop out of support (e.g., Solaris 6), but the entire operating system platform (e.g., Solaris, or even more generally, Unix) tends to stay in some level of support. In fact, the last time I think I recall an entire OS platform slipping out of NetWorker support was Banyan Vines, and the last client version released for that was 3 point something. (Data General Unix (DGUX) may have ceased being supported more recently, but overall the Unix platform has remained in support.)

If you’re still backing up NetWare servers and you’re not yet considering how you’re going to recover NetWare backups post March 2010, it’s time to give serious consideration to it.

Posted in NetWorker | Tagged: , , , , , , , , , , | Comments Off on Long term NetWare recovery

Validcopies hazardous to your sanity

Posted by Preston on 2009-12-04

While much of NetWorker 7.6’s enhancements have been surrounding updates to virtualisation or (urgh) cloud, there remains a bunch of smaller updates that are of interest.

One of those new features is the validcopies flag, something I unfortunately failed to check out in beta testing. It looks like it could use some more work, but the theory is a good one. The idea behind validcopies is that we can use it in VTL style situations to determine not only whether we’ve got an appropriate number of copies, but they’re also valid – i.e., they’re usable by NetWorker for recovery purposes.

It’s a shame it’s too buggy to be used.

Here’s an example where I backup to an ADV_FILE type device:

[root@tara ~]# save -b Default -e "+3 weeks" -LL -q /usr/share
57777:save:Multiple client instances of tara.pmdg.lab, using the first entry
save: /usr/share  1244 MB 00:03:23  87843 files
completed savetime=1259366579

[root@tara ~]# mminfo -q "name=/usr/share,validcopies>1"
 volume        client       date      size   level  name
Default.001    tara.pmdg.lab 11/28/2009 1244 MB manual /usr/share
Default.001.RO tara.pmdg.lab 11/28/2009 1244 MB manual /usr/share

[root@tara ~]# mminfo -q "name=/usr/share,validcopies>1" -r validcopies
6095:mminfo: no matches found for the query

[root@tara ~]# mminfo -q "name=/usr/share,validcopies>1"
 volume        client       date      size   level  name
Default.001    tara.pmdg.lab 11/28/2009 1244 MB manual /usr/share
Default.001.RO tara.pmdg.lab 11/28/2009 1244 MB manual /usr/share

[root@tara ~]# mminfo -q "name=/usr/share,validcopies>1" -r validcopies
6095:mminfo: no matches found for the query

[root@tara ~]# mminfo -q "name=/usr/share,validcopies>1" -r validcopies,copies
 validcopies copies
 2     2
 2     2

I have a few problems with the above output, and am working through the bugs in validcopies with EMC. Let’s look at each of those items and see what I’m concerned about:

  1. We don’t have more than one valid copy just because it’s sitting on an ADV_FILE device. If the purpose of the “validcopies” flag is to count the number of unique recoverable copies, we do not have 2 copies for each instance on ADV_FILE. There should be some logic there to not count copies on ADV_FILE devices twice for valid copy counts.
  2. As you can see from the last two commands, the results found differ depending on report options. This is inappropriate, to say the least. We’re getting no validcopies reported at all if we only look for validcopies, or 2 validcopies reported if we search for both validcopies and copies.

Verdict from the above:

  • Don’t use validcopies for disk backup units.
  • Don’t report on validcopies only, or you’ll skew your results.

Let’s move on to VTLs though – we’ll clone the saveset I just generated to the ADV_FILE type over to the VTL:

[root@tara ~]# mminfo -q "volume=Default.001.RO" -r ssid,cloneid
 ssid         clone id
4279265459  1259366578

[root@tara ~]# nsrclone -b "Big Clone" -v -S 4279265459/1259366578
5874:nsrclone: Automatically copying save sets(s) to other volume(s)
6216:nsrclone:
Starting cloning operation...
Nov 28 11:29:42 tara logger: NetWorker media: (waiting) Waiting for 1 writable volume(s)
to backup pool 'Big Clone' tape(s) or disk(s) on tara.pmdg.lab
5884:nsrclone: Successfully cloned all requested save sets
5886:nsrclone: Clones were written to the following volume(s):
 BIG998S3

[root@tara ~]# mminfo -q "ssid=4279265459" -r validcopies
 0

[root@tara ~]# mminfo -q "ssid=4279265459" -r copies,validcopies
 copies validcopies
 3          3
 3          3
 3          3

In the above instance, if we query just by the saveset ID for the number of valid copies, NetWorker happily tells us “0”. If we query for copies and validcopies, we get 3 of each.

So, what does this say to me? Steer away from ‘validcopies’ until it’s fixed.

(On a side note, why does the offsite parameter remain Write Only? We can’t query it through mminfo, and I’ve had an RFE in since the day the offsite option was introduced into nsrmm. Why this is “hard” or taking so long is beyond me.)

Posted in Features, NetWorker, Scripting | Tagged: , , , | Comments Off on Validcopies hazardous to your sanity

Recovery reporting comes to NetWorker

Posted by Preston on 2009-12-02

One of the areas where administrators have been rightly able to criticise NetWorker has been the lack of reporting or auditing options to do with recoveries. While some information has always been retrievable from the daemon logs, it’s been only basic and depends on keeping the logs. (Which you should of course always do.)

NetWorker 7.6 however does bring in recovery reporting, which starts to rectify those criticisms. Now in the enterprise reporting section, you’ll find the following section:

  • NetWorker Recover
    • Server Summary
    • Client Summary
    • Recover Details
    • Recover Summary over Time

Of these reporting options, I think the average administrator will want the bottom two the most, unless they operate in an environment where clients are billed for recoveries.

Let’s look at the Recover Summary over Time report:

Recover summary over time

This presents a fairly simple summary of the recoveries that have been done on a per-client basis, including the number of files recovered, the amount of data recovered and the breakdown of successful vs failed recovery actions.

I particularly like the Recover Details report though:

Recover Details report

(Click the picture to see the entire width.)

As you can see there, we get a per user breakdown of recovery activities, when they were started, how long they took, how much data was recovered, etc.

These reports are a brilliant and much needed addition to NetWorker reporting capabilities, and I’m pleased to see EMC has finally put them into the product.

There’s probably one thing still missing that I can see administrators wanting to see – file lists of recovery sessions. Hopefully 7.(6+x) would see that report option though.

Posted in NetWorker | Tagged: , , , , , , | 2 Comments »

November’s top article

Posted by Preston on 2009-12-01

November saw the article, “Carry a jukebox with you (if you’re using Linux)” remain the top read story for another month. This details how to use the LinuxVTL open source software with NetWorker.

For those of you interested in setting this up for testing purposes, I’d also recommend reading the follow-up article I wrote this month, “NetWorker and LinuxVTL, redux“, which details recent advances Mark Harvey made in the code to allow NetWorker to use multiple virtual tape drives in the VTL. This makes LinuxVTL very capable as a supplement to a test or lab environment.

(As an aside, if you haven’t yet visited my new blog, I am the Anti-Cloud, you may want to flag it for reading. At Anti-Cloud, my goal is to point out the inadequacies of current attitudes by Public Cloud providers towards their customers, deflate some of the ridiculous hype that has grown out of Cloud Buzzword levels, and point out that not all of the revolutionary features are all that new, or revolutionary.)

Posted in Aside, NetWorker | Tagged: , , | Comments Off on November’s top article

EMC, Data Domain, VTLs and Disk Backup

Posted by Preston on 2009-11-30

With their recent acquisition of Data Domain, some people at EMC have become table thumping experts overnight on why you it’s absolutely imperative that you backup to Data Domain boxes as disk backup over NAS, rather than a fibre-channel connected VTL.

Their argument seems to come from the numbers – the wrong numbers.

The numbers constantly quoted are number of sales of disk backup Data Domain vs VTL Data Domain. That is, some EMC and Data Domain reps will confidently assert that by the numbers, a significantly higher percentage of Data Domain for Disk Backup has been sold than Data Domain with VTL. That’s like saying that Windows is superior to Mac OS X because it sells more. Or to perhaps pick a little less controversial topic, it’s like saying that DDS is better than LTO because there’s been more DDS drives and tapes sold than there’s ever been LTO drives and tapes.

I.e., an argument by those numbers doesn’t wash. It rarely has, it rarely will, and nor should it. (Otherwise we’d all be afraid of sailing too far from shore because that’s how it had always been done before…)

Let’s look at the reality of how disk backup currently stacks up in NetWorker. And let’s preface this by saying that if backup products actually started using disk backup properly tomorrow, I would be the first to shout “Don’t let the door hit your butt on the way out” to every VTL on the planet. As a concept, I wish VTLs didn’t have to exist, but in the practical real world, I recognise their need and their current ascendency over ADV_FILE. I have, almost literally at times, been dragged kicking and screaming to that conclusion.

Disk Backup, using ADV_FILE type devices in NetWorker:

  • Can’t move a saveset from a full disk backup unit to a non-full one; you have to clear the space first.
  • Can’t simultaneously clone from, stage from, backup to and recover from a disk backup unit. No, you can’t do that with tape either, but when disk backup units are typically in the order of several terabytes, and virtual tapes are in the order of maybe 50-200 GB, that’s a heck of a lot less contention time for any one backup.
  • Use tape/tape drive selection algorithms for deciding which disk backup unit gets used in which order, resulting in worst case capacity usage scenarios in almost all instances.
  • Can’t accept a saveset bigger than the disk backup unit. (It’s like, “Hello, AMANDA, I borrowed some ideas from you!”)
  • Can’t be part-replicated between sites. If you’ve got two VTLs and you really need to do back-end replication, you can replicate individual pieces of media between sites – again, significantly smaller than entire disk backup units. When you define disk backup units in NetWorker, that’s the “smallest” media you get.
  • Are traditionally space wasteful. NetWorker’s limited staging routines encourages clumps of disk backup space by destination pool – e.g., “here’s my daily disk backup units, I use them 30 days out of 31, and those over there that occupy the same amount of space (practically) are my monthly disk backup units, I use them 1 day out of 31. The rest of the time they sit idle.”
  • Have poor staging options (I’ll do another post this week on one way to improve on this).

If you get a table thumping sales person trying to tell you that you should buy Data Domain for Disk Backup for NetWorker, I’d suggest thumping the table back – you want the VTL option instead, and you want EMC to fix ADV_FILE.

Honestly EMC, I’ll lead the charge once ADV_FILE is fixed. I’ll champion it until I’m blue in the face, then suck from an oxygen tank and keep going – like I used to, before the inadequacies got too much. Until then though, I’ll keep skewering that argument of superiority by sales numbers.

Posted in Architecture, NetWorker | Tagged: , , , , , , | 3 Comments »

Quibbles – The maddening shortfall of ADV_FILE

Posted by Preston on 2009-11-25

Everyone who has worked with ADV_FILE devices knows this situation: a disk backup unit fills, and the saveset(s) being written hang until you clear up space, because as we know savesets in progress can’t be moved from one device to another:

Savesets hung on full ADV_FILE device until space is cleared

Honestly, what makes me really angry (I’m talking Marvin the Martian really angry here) is that if a tape device fills and another tape of the same pool is currently mounted, NetWorker will continue to write the saveset on the next available device:

Saveset moving from one tape device to another

What’s more, if it fills and there’s a drive that currently does have a tape mounted, NetWorker will mount a new tape in that drive and continue the backup in preference to dismounting the full tape and reloading a volume in the current drive.

There’s an expression for the behavioural discrepancy here: That sucks.

If anyone wonders why I say VTLs shouldn’t need to exist, but I still go and recommend them and use them, that’s your number one reason.

Posted in NetWorker, Quibbles | Tagged: , , , , , | 2 Comments »