NetWorker Blog

Commentary from a long term NetWorker consultant and Backup Theorist

  • This blog has moved!

This blog has now moved. Please jump across to the new site for the latest articles (and all old archived articles).



  • Enterprise Systems Backup and Recovery

    If you find this blog interesting, and either have an interest in or work in data protection/backup and recovery environments, you should check out my book, Enterprise Systems Backup and Recovery: A Corporate Insurance Policy. Designed for system administrators and managers alike, it focuses on the features, policies, procedures and human elements required to ensure that your company has a suitable, working backup system rather than just a bunch of copies made by unrelated software, hardware and processes.

Maybe I’m just an old Unix hack…

Posted by Preston on 2009-12-13

In the land of Dilbert, I’d probably be obligated to wear suspenders and have my socks pulled up past my knees, but ultimately I think I’m becoming an old Unix hack. Why?

  • Not because of my disdain for Windows. (Though that probably helps.)
  • Not because of my passion for Linux. (I have little in that regard.)
  • Not because of my rigid adherence to a particular Unix platform. (Used to be Solaris, now Mac OS X.)
  • Because of my ongoing use of vi.

I’ve been using Mac OS X now since 2005. The date is fairly well fixed in my head simply because it happened about a month after 10.4 (Tiger) was released. It’s also fixed in my head since I’ve never been as productive on a computer as I am on a Mac.

The Mac has changed a lot of my workflows, but the one thing it hasn't changed is the absolutely automatic way I lunge for vi whenever I need to edit text, source code, etc. Now, I'll admit I have the absolutely fantastic BBEdit program from Bare Bones Software. I even use it a lot of the time for in-depth coding across a lot of files. I'd certainly recommend that anyone doing lots of software development on the Mac outside of Xcode buy a license.

But it’s never what I open first when I need to edit a file. There’s something so spartan and uncomplicated about vi. (Which incidentally is probably why emacs just never appealed. It was never spartan or uncomplicated – at least in my opinion.)

I know it's arcane. The idea of an editor mode and a control mode freaks a lot of people out. The freaky control commands that make WordStar look like the paragon of user interface design take a lot of getting used to. Yet, whenever I'm in Word, or OpenOffice*, or even BBEdit, I still find myself automatically trying to type vi search and replace commands. (Hint to any Bare Bones product manager who stumbles across this. Please please, pretty please, can we get a "vi" mode in BBEdit?)

To me, and I know a lot of Mac users out there will probably have a conniption in response to what I’m going to say: vi is a lot like Mac OS X. It’s like a butler. It doesn’t jump up and down and pester you every 5 minutes (like Windows) about what you want to do, or that you’ve got an icon not being used on your desktop, or that a new network was found, or any other garbage like that. It just hangs back, lets you work, and jumps to your assistance when you want it.

Call me an old Unix hack if you want, but I can’t go a day without vi. Being able to do things such as the following:

(esc) :.,$s/^/insert into blah(x) values('/

(esc) :.,$s/$/');/

is, for some reason, vitally important to my ability to work productively. Heck, I even use vi in NetWorker, thanks to default editor settings and nsradmin's response to the keyword 'edit' on Unix platforms.
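For anyone wanting to try those substitutions outside vi, sed shares the same s/// substitution syntax. A quick sketch (note the difference in scope: sed applies to every input line, whereas the vi commands above run from the current line to the end of the file):

```shell
# Wrap each input line in an SQL INSERT, mirroring the two vi
# substitutions above: prefix every line, then suffix every line.
printf 'alpha\nbeta\n' | sed -e "s/^/insert into blah(x) values('/" -e "s/$/');/"
# → insert into blah(x) values('alpha');
# → insert into blah(x) values('beta');
```

The same muscle memory transfers both ways, which is part of why the vi idiom is so sticky.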

I think every technical person who works on heterogeneous systems should learn vi. It's pretty much the one interactive editor you can guarantee will be available on every Unix system. (Discounting 'ed', and disrespecting emacs ;-) ) I can also guarantee that anyone who has used vi for more than 5 minutes and successfully saved a document can navigate the user interface behaviours of the Windows default editor, 'notepad', or the Mac OS X default editor, 'Text Edit'. The same isn't true in reverse, and I find that a lot of, say, Windows admins who start doing bits and pieces of work on Unix systems are usually hampered by the entire vi experience. vi, it seems, is sufficiently foreign to people who grew up in GUI-only environments that it taints the entire Unix interactivity process. However, being an old Unix hack, I don't think this is vi's fault. Indeed, I'd suggest that anyone who can't type "vi quick reference card" into Google and then use the results productively is doing themselves a disservice.

If you're a Windows admin and you've just assumed I'm having a dig at you for not knowing vi, I'm not. As with knowing a cross-platform scripting language (e.g., perl), I merely suggest that administrators in heterogeneous environments will enjoy their job more, and do it more easily, if they know vi.

Oh, and as a final point, can someone please explain why almost everyone on the planet except me seems to save and quit in vi through multiple actions or more verbose commands (e.g., esc :wq) rather than just:

(esc) :x

* And if someone could explain the arrogance of having OpenOffice on the Mac take over all possible document types when it is first run, I'll be very interested in rebutting your arguments.

Posted in Aside, General Technology | Tagged: , , , | 3 Comments »

This holiday season, give your inner geek a gift

Posted by Preston on 2009-12-11

As it approaches that time for giving, it’s worth pointing out that with just a simple purchase, you can simultaneously give yourself and me a present. I’m assuming regular readers of the blog would like to thank me, and the best thanks I could get this year would be to get a nice spike in sales in my book before the end of the year.

"Enterprise Systems Backup and Recovery: A Corporate Insurance Policy" is a book aimed not just at companies only now starting to look at implementing a comprehensive backup system. It's equally aimed at companies who are already doing enterprise backup and need that extra direction to move from a collection of backup products to an actual backup system.

What’s a backup system? At the most simple, it’s an environment that is geared towards recovery. However, it’s not just having the right software and the right hardware – it’s also about having:

  • The right policies
  • The right procedures
  • The right people
  • The right attitude

Most organisations actually do pretty well in relation to getting the right software and the right hardware. However, that's only about 40% of achieving a backup system. It's the human components – the remaining 60% – that are far more challenging and important to get right. For instance, at your company:

  • Are backups seen as an “IT” function?
  • Are backups assigned to junior staff?
  • Are results not checked until there’s a recovery required?
  • Are backups only tested in an ad hoc manner?
  • Are recurring errors that aren't really errors tolerated?
  • Are procedures for requesting recoveries ad hoc?
  • Are backups thought of after systems are added or expanded?
  • Are backups highly limited to “save space”?
  • Is the backup server seen as a “non-production” server?

If the answer to even a single one of those questions is yes, then your company doesn’t have a backup system, and your ability to guarantee recoverability is considerably diminished.

Backup systems, by integrating the technical and the human aspect of a company, provide a much better guarantee of recoverability than a collection of untested random copies that have no formal procedures for their creation and use.

And if the answer to even a single one of those questions is yes, you’ll get something useful and important out of my book.

So, if you're interested in buying the book, you can grab it from Amazon using this link, or from the publisher, CRC Press, using this link.

Posted in Backup theory, General thoughts | Tagged: , , | 4 Comments »

Long term NetWare recovery

Posted by Preston on 2009-12-10

Are you still backing up Novell NetWare hosts? If you are, I hope you’re actively considering what you’re going to do in relation to NetWare recoveries in March 2010, when NetWare support ceases from both Novell and EMC.

I still have a lot of customers backing up NetWare hosts, and I’m sure my customer set isn’t unique. While Novell still tries to convince customers to switch from traditional NetWare services to NetWare on OES/SLES, a lot of companies are continuing to use NetWare until “the last minute”.

The “last minute” is of course, March 2010, when standard support for NetWare finishes.

Originally, NetWare support in NetWorker was scheduled to finish in March 2009, but partners and customers managed to convince EMC to extend the support to March 2010, to match Symantec and co-terminate with Novell’s end of standard support for NetWare as well.

Now it’s time we start considering what happens when that support finishes. Namely:

  1. How will you recover long term NetWare backups?
  2. How will you still run NetWare systems?
  3. How will you manage NetWorker upgrades?

These are all fairly important questions. While we’re hopeful we might get some options for recovering NetWare backups on OES systems (i.e., pseudo cross-platform recoveries), there’s obviously no guarantees of that as yet.

So the question is – if you’re still using NetWare, how do you go about guaranteeing you can recover NetWare backups once NetWare has been phased out of existence?

The initial recommendation from Novell on this topic is: keep a NetWare box around.

I think this is a short-sighted recommendation on their part, and shows that they haven't properly managed (internally) the transition from traditional NetWare to NetWare on OES/SLES. This is perhaps why there hasn't been a 100% transition from one NetWare platform to the other. Faced with unpalatable transition options, some Novell customers are instead considering alternative transitional options.

Unfortunately, in the short term, I don’t see there being many options. I’m therefore inclined to recommend that:

  1. Companies backing up traditional NetWare who only need to continue to recover a very small number of backups consider performing an old-school migration – recover the data to a host, and back it up on an operating system that will continue to enjoy OS vendor and EMC support moving forward.
  2. Companies backing up larger amounts of traditional NetWare should consider virtualising at least one, preferably a few more NetWare systems before end of support, and keeping good archival VM backups (to avoid having to do a reinstall), using those systems as recovery points for older NetWare data.

The longer-term concern is that the NetWare client in NetWorker has always been … interesting. Once NetWare support vanishes, the primary consideration for newer versions of NetWorker will be whether those newer versions actually support the old 7.2 NetWare client for recovery purposes.

With this in mind, it will become even more important to carefully review release notes and conduct test upgrades when new releases of NetWorker come out to confirm whether newer versions of the server software actually support communicating with the increasingly older NetWare client until such time as recovery from those NetWare backups is no longer required.

You may think this is a bit extreme, but bear in mind we don't often see entire operating systems get phased out of existence, so it's not a common problem. To be sure, individual iterations or releases may drop out of support (e.g., Solaris 6), but the entire operating system platform (e.g., Solaris, or even more generally, Unix) tends to stay in some level of support. In fact, the last time I recall an entire OS platform slipping out of NetWorker support was Banyan Vines, and the last client version released for it was 3 point something. (Data General Unix (DGUX) may have ceased being supported more recently, but overall the Unix platform has remained in support.)

If you’re still backing up NetWare servers and you’re not yet considering how you’re going to recover NetWare backups post March 2010, it’s time to give serious consideration to it.

Posted in NetWorker | Tagged: , , , , , , , , , , | Comments Off on Long term NetWare recovery

…And why I’ll stick to Parallels

Posted by Preston on 2009-12-09

So in an earlier post, I mentioned that I’d been looking at first comparisons between VMware Fusion 3.0 and Parallels Desktop 5 for Mac, and I thought it was time to follow-up with longer term impressions.

To be blunt, VMware Fusion 3 is unpolished and unpleasant to use on an almost continual basis. I'll keep it around for only two reasons: (a) so I can run ESX/vSphere within a VM for testing purposes, and (b) so I can periodically play with the demo/test images provided by EMC for particular products that won't convert into Parallels images.

So what’s there to dislike about Fusion?

  • Unity. It's like someone at VMware declared "Make it slow. Make it inefficient. Make it periodically take 10+ seconds to redraw windows. Make it work, but glitchy enough that it makes the user grind their teeth in frustration." Well, if someone did decree that as a product feature, they did a remarkably good job of achieving it. Here's a tip, folks at VMware: buy a copy of Parallels and see how professionals do an integrated windowing feature. Unity in Fusion v3 is worse than Coherence was when it was first introduced (and Coherence was fine even then) – i.e., you have a long, long way to go.
  • Import another VM. What VM would you like to import? Parallels? Forget it. Why offer to import VMs from Parallels if every VM comes in unusable? (I’m sure other people must have better experiences than this, but I’m certainly not impressed.)
  • Performance. OK, so VMware Fusion performance isn’t atrocious – it’s actually OK. However, I’d been led to believe that VMware Fusion kicked Parallels Desktop out of the ballpark when it came to performance. I’ve not seen anything to indicate that it exceeds the performance of Parallels, and so I see that as a negative.
  • Quit. Don’t pester me, just suspend my VM.

As I said, I’ll be keeping Fusion around, but only for those situations where I can’t use Parallels.

Posted in General Technology | Tagged: , , , , | 4 Comments »

Funny attitude adjustments

Posted by Preston on 2009-12-08

It’s funny sometimes seeing attitude adjustments that come from various companies as they’re acquired by others.

One could never say that EMC has been a big fan of tape (I’ve long since given up any hopes of them actually telling the 100% data protection story and buying a tape company), but at least they’ve tended to admit that tape is necessary over the years.

So this time the attitude adjustment now seems to be coming from Data Domain as they merge into the backup and recovery division at EMC following the acquisition. Over at SearchStorage, we have an article by Christine Cignoli called “Data deduplication goes mainstream, but tape lives on“, which has this insightful quote:

Even Shane Jackson, director of product marketing at Data Domain, agrees. “We’ve never gone to the extreme of ‘tape is dead,'” he said. “As an archive medium, keeping data for seven years for HIPAA compliance in a box on a shelf is still a reasonable thing to do.”

That’s interesting, I could have sworn I have a Data Domain bumper sticker that says this:

Data Domain Bumper Sticker

Now don't get me wrong, I'm not here to rub salt into Data Domain's wounds, but I would like to take the opportunity to point out that tape has been killed more times than the iPhone. So next time an up-and-coming company trumpets their "tape is dead" story, and some bright-eyed, eager and naïve journalist reports on it, remember that they always come around … eventually.

Posted in Backup theory, General Technology | Tagged: , | 2 Comments »

How complex is your backup environment?

Posted by Preston on 2009-12-07

Something I’ve periodically mentioned to various people over the years is that when it comes to data protection, simplicity is King. This can be best summed up with the following rule to follow when designing a backup system:

If you can’t summarise your backup solution on the back of a napkin, it’s too complicated.

Now, the first reaction a lot of people have to that is “but if I do X and Y and Z and then A and B on top, then it’s not going to fit, but we don’t have a complex environment”.

Well, there are two answers to that:

  1. We’re not talking a detailed technical summary of the environment, we’re talking a high level overview.
  2. If you still can’t give a high level overview on the back of a napkin, it is too complicated.

Another way to approach the complexity issue, if you happen to have a phobia about using the back of a napkin: if you can't give a 30 second elevator summary of your solution, it's too complicated.

If you’re struggling to think of why it’s important you can summarise your solution in such a short period of time, or such limited space, I’ll give you a few examples:

  1. You need to summarise it in a meeting with senior management.
  2. You need to summarise it in a meeting with your management and a vendor.
  3. You’ve got 5 minutes or less to pitch getting an upgrade budget.
  4. You’ve got a new assistant starting and you’re about to go into a meeting.
  5. You’ve got a new assistant starting and you’re about to go on holiday.
  6. You’ve got consultant(s) (or contractors) coming in to do some work and you’re going to have to leave them on their own.
  7. The CIO asks “so what is it?” as a follow-up question when (s)he accosts you in the hallway and asks, “Do we have a backup policy?”

I can think of a variety of other reasons, but the point remains – a backup system should not be so complex that it can’t be easily described. That’s not to ever say that it can’t either (a) do complex tasks or (b) have complex components, but if the backup administrator can’t readily describe the functioning whole, then the chances are that there is no functioning whole, just a whole lot of mess.

Posted in Backup theory, General thoughts, Policies | Tagged: , , , , , | Comments Off on How complex is your backup environment?

Aside – This time, 10 years ago

Posted by Preston on 2009-12-07

We’re now rapidly heading towards 2010. The world did not collapse at the start of the year 2000, thanks in no small part to the efforts of developers and system administrators across the globe in mitigating Y2K risks.

So where was I, 10 years ago?

Well, I was still working for the most part as a system/backup administrator for a large resources company. I was neck deep in Y2K mitigation projects, notably:

  • Major efforts on a Tru64 environment where the core application could not be upgraded to a supported Y2K compliant version, so the surrounding OS and underlying database had to be upgraded instead to mitigate the risk.
  • Ensuring all my NetWorker servers were running 5.1 so that I could rest easy, knowing that anything that might fail could still be recovered.

The first project was the most frustrating for me. Not because of the work, but because of the “technical project manager” assigned to it. I knew I was in for a long hard haul the first time I had a conversation with the TPM and it boiled down to a 1 hour discussion where I kept on trying to explain why you had to add disks to a machine if you needed additional storage. From that point on my colleagues always knew when I was on the phone to that TPM due to the look of exasperated frustration I would wear throughout the conversation. It was even more maddening that the TPM had a laptop so old and clunky that he could run MS Project or Outlook, but never both at once. That would mean significant delays in responses to emails…

The funny thing is – when I reflect back on that project these days, I realise that it helped to turn me into a consultant. The standard engineer/sysadmin approach to such challenging people usually doesn't work, so you have to learn to be a consultant to actually make any headway at all. So thank you, challenging TPM. 10 years on and I don't find myself silently screaming when I remember that project – instead I'm grateful that I was assigned such a complex project where self management became very important almost immediately out of the gates; it taught me that I was interested in far more than regular system administration. Instead of just being part of a team that did managed services and consulting, it made me want to actually be a consultant myself.

(As to Y2K itself, I spent from around 8pm through to around 1am at my desk for the crossover, waiting for the world to fall down around our ears if we'd got it wrong. In a beautiful case of irony, the only major system that fell over during the Y2K transition was the Microsoft Access database designed by some pseudo-admin in another division of the company at the last minute to record Y2K failures…)

Posted in Aside | Tagged: , , , | Comments Off on Aside – This time, 10 years ago

Validcopies hazardous to your sanity

Posted by Preston on 2009-12-04

While much of NetWorker 7.6’s enhancements have been surrounding updates to virtualisation or (urgh) cloud, there remains a bunch of smaller updates that are of interest.

One of those new features is the validcopies flag, something I unfortunately failed to check out in beta testing. It looks like it could use some more work, but the theory is a good one. The idea behind validcopies is that we can use it in VTL style situations to determine not only whether we've got an appropriate number of copies, but also whether those copies are valid – i.e., usable by NetWorker for recovery purposes.

It’s a shame it’s too buggy to be used.

Here's an example where I back up to an ADV_FILE type device:

[root@tara ~]# save -b Default -e "+3 weeks" -LL -q /usr/share
57777:save:Multiple client instances of tara.pmdg.lab, using the first entry
save: /usr/share  1244 MB 00:03:23  87843 files
completed savetime=1259366579

[root@tara ~]# mminfo -q "name=/usr/share,validcopies>1"
 volume        client       date      size   level  name
Default.001    tara.pmdg.lab 11/28/2009 1244 MB manual /usr/share
Default.001.RO tara.pmdg.lab 11/28/2009 1244 MB manual /usr/share

[root@tara ~]# mminfo -q "name=/usr/share,validcopies>1" -r validcopies
6095:mminfo: no matches found for the query

[root@tara ~]# mminfo -q "name=/usr/share,validcopies>1"
 volume        client       date      size   level  name
Default.001    tara.pmdg.lab 11/28/2009 1244 MB manual /usr/share
Default.001.RO tara.pmdg.lab 11/28/2009 1244 MB manual /usr/share

[root@tara ~]# mminfo -q "name=/usr/share,validcopies>1" -r validcopies
6095:mminfo: no matches found for the query

[root@tara ~]# mminfo -q "name=/usr/share,validcopies>1" -r validcopies,copies
 validcopies copies
 2     2
 2     2

I have a few problems with the above output, and am working through the bugs in validcopies with EMC. Let’s look at each of those items and see what I’m concerned about:

  1. We don't have more than one valid copy just because the save set is sitting on an ADV_FILE device. If the purpose of the validcopies flag is to count the number of unique recoverable copies, we do not have 2 valid copies for each instance on ADV_FILE – the read-write and .RO devices reference the same data. There should be some logic there to avoid counting copies on ADV_FILE devices twice for valid copy counts.
  2. As you can see from the last two commands, the results found differ depending on report options. This is inappropriate, to say the least. We’re getting no validcopies reported at all if we only look for validcopies, or 2 validcopies reported if we search for both validcopies and copies.

Verdict from the above:

  • Don’t use validcopies for disk backup units.
  • Don’t report on validcopies only, or you’ll skew your results.

Let’s move on to VTLs though – we’ll clone the saveset I just generated to the ADV_FILE type over to the VTL:

[root@tara ~]# mminfo -q "volume=Default.001.RO" -r ssid,cloneid
 ssid         clone id
4279265459  1259366578

[root@tara ~]# nsrclone -b "Big Clone" -v -S 4279265459/1259366578
5874:nsrclone: Automatically copying save sets(s) to other volume(s)
Starting cloning operation...
Nov 28 11:29:42 tara logger: NetWorker media: (waiting) Waiting for 1 writable volume(s)
to backup pool 'Big Clone' tape(s) or disk(s) on tara.pmdg.lab
5884:nsrclone: Successfully cloned all requested save sets
5886:nsrclone: Clones were written to the following volume(s):

[root@tara ~]# mminfo -q "ssid=4279265459" -r validcopies

[root@tara ~]# mminfo -q "ssid=4279265459" -r copies,validcopies
 copies validcopies
 3          3
 3          3
 3          3

In the above instance, if we query just by the saveset ID for the number of valid copies, NetWorker happily tells us “0”. If we query for copies and validcopies, we get 3 of each.

So, what does this say to me? Steer away from ‘validcopies’ until it’s fixed.
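Until it is fixed, one workaround sketch is to always request copies alongside validcopies – the only combination above that reports correctly – and post-filter the result. Here check_copies is a hypothetical helper of my own, and it assumes mminfo's whitespace-separated tabular output in the layout shown above:

```shell
# Hypothetical post-filter (not a NetWorker tool): flag save sets whose
# valid-copy count doesn't match their copy count. Real use would be:
#   mminfo -q "ssid=..." -r ssid,copies,validcopies | check_copies
check_copies() {
    # Skip the header row, then compare field 2 (copies) to field 3
    # (validcopies) and report any mismatch.
    awk 'NR > 1 && $2 != $3 { print "ssid " $1 ": " $3 " of " $2 " copies valid" }'
}

# Demonstration against sample tabular output in the same layout:
printf 'ssid copies validcopies\n4279265459 3 2\n' | check_copies
# → ssid 4279265459: 2 of 3 copies valid
```

Clunky, but it avoids relying on a validcopies-only query that currently returns nothing at all.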

(On a side note, why does the offsite parameter remain Write Only? We can’t query it through mminfo, and I’ve had an RFE in since the day the offsite option was introduced into nsrmm. Why this is “hard” or taking so long is beyond me.)

Posted in Features, NetWorker, Scripting | Tagged: , , , | Comments Off on Validcopies hazardous to your sanity

Recovery reporting comes to NetWorker

Posted by Preston on 2009-12-02

One of the areas where administrators have rightly been able to criticise NetWorker is the lack of reporting or auditing options for recoveries. While some information has always been retrievable from the daemon logs, it's only basic and depends on keeping the logs. (Which you should of course always do.)

NetWorker 7.6 however does bring in recovery reporting, which starts to rectify those criticisms. Now in the enterprise reporting section, you'll find the following options:

  • NetWorker Recover
    • Server Summary
    • Client Summary
    • Recover Details
    • Recover Summary over Time

Of these reporting options, I think the average administrator will want the bottom two the most, unless they operate in an environment where clients are billed for recoveries.

Let’s look at the Recover Summary over Time report:

Recover summary over time

This presents a fairly simple summary of the recoveries that have been done on a per-client basis, including the number of files recovered, the amount of data recovered and the breakdown of successful vs failed recovery actions.

I particularly like the Recover Details report though:

Recover Details report

As you can see there, we get a per user breakdown of recovery activities, when they were started, how long they took, how much data was recovered, etc.

These reports are a brilliant and much needed addition to NetWorker reporting capabilities, and I’m pleased to see EMC has finally put them into the product.

There's probably one thing still missing that I can see administrators wanting – file lists of recovery sessions. Hopefully 7.(6+x) will see that report option though.

Posted in NetWorker | Tagged: , , , , , , | 2 Comments »

November’s top article

Posted by Preston on 2009-12-01

November saw the article, "Carry a jukebox with you (if you're using Linux)", remain the top-read story for another month. This details how to use the LinuxVTL open source software with NetWorker.

For those of you interested in setting this up for testing purposes, I’d also recommend reading the follow-up article I wrote this month, “NetWorker and LinuxVTL, redux“, which details recent advances Mark Harvey made in the code to allow NetWorker to use multiple virtual tape drives in the VTL. This makes LinuxVTL very capable as a supplement to a test or lab environment.

(As an aside, if you haven’t yet visited my new blog, I am the Anti-Cloud, you may want to flag it for reading. At Anti-Cloud, my goal is to point out the inadequacies of current attitudes by Public Cloud providers towards their customers, deflate some of the ridiculous hype that has grown out of Cloud Buzzword levels, and point out that not all of the revolutionary features are all that new, or revolutionary.)

Posted in Aside, NetWorker | Tagged: , , | Comments Off on November’s top article