NetWorker Blog

Commentary from a long term NetWorker consultant and Backup Theorist

  • This blog has moved!

    This blog has now moved to nsrd.info/blog. Please jump across to the new site for the latest articles (and all old archived articles).

  • Enterprise Systems Backup and Recovery

    If you find this blog interesting, and either have an interest in or work in data protection/backup and recovery environments, you should check out my book, Enterprise Systems Backup and Recovery: A Corporate Insurance Policy. Designed for system administrators and managers alike, it focuses on the features, policies, procedures and human element of ensuring that your company has a suitable, working backup system rather than just a bunch of copies made by unrelated software, hardware and processes.

Archive for the ‘General Technology’ Category

Show me the man pages

Posted by Preston on 2009-12-21

As a long-term Unix admin, it's frustrating when there are commands on my systems for which there aren't man pages. As a long-term NetWorker user, it's equally frustrating when there aren't man pages for particular NetWorker commands.

When I’ve discussed this in the past, I’ve usually had a response of “that’s because you shouldn’t be running that command”. That’s a bad response. The correct response should be something along the lines of “oops, we’ll write a man page for the next release that states:

That command is for internal NetWorker use only. It does X. It should not be run manually.

Having undocumented commands that give no output, hang, or produce strange results is just inviting frustration. Of just the nsr-prefixed commands on my current 7.6 lab server, the following are undocumented:

  • nsravamar
  • nsravtar
  • nsrbmr
  • nsrcatconfig
  • nsr_cp_install
  • nsrdmpix
  • nsrdsa_recover
  • nsrdsa_save
  • nsrfile
  • nsrfsra
  • nsrlmc
  • nsrndmp_2fh
  • nsrrcopy
  • nsrrcopy2
  • nsrvcbserv_tool

So out of the 55 nsr-prefixed commands I have on my server, 15 (or 27%) are undocumented.
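If you want to check your own server, a rough and entirely unofficial way is to loop over the NetWorker binaries and ask man whether it knows about each one. A minimal sketch, assuming a Linux host with man-db and the NetWorker commands installed in /usr/sbin (adjust the path and the man invocation for other platforms):

# List nsr-prefixed binaries that have no man page installed
for cmd in /usr/sbin/nsr*; do
    name=$(basename "$cmd")
    man -w "$name" >/dev/null 2>&1 || echo "no man page: $name"
done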

Note to EMC: This does not produce a healthy level of trust. Please – get some documentation on these commands, even if that documentation gives us a one line overview of where they’re used and tells us not to run them ourselves.

Posted in General Technology, General thoughts, NetWorker | Comments Off on Show me the man pages

15 crazy things I never want to hear again

Posted by Preston on 2009-12-14

Over the years I've dealt with a lot of different environments, and a lot of different usage requirements for backup products. Most of these fall into the "appropriate business use" category. Some fall into the "hmmm, why would you do that?" category. Others fall into the "please excuse my brain – it's just scuttled off into the corner to hide – tell me again" category.

This is not about the people, or the companies, but about the crazy ideas that sometimes take hold within companies and should be watched for. While I could have expanded this list to cover a raft of other things outside of backups, I've forced myself to keep it to just the backup process.

In no particular order then, these are the crazy things I never want to hear again:

  1. After the backups, I delete all the indices, because I maintain a spreadsheet showing where files are, and that’s much more efficient than proprietary databases.
  2. We just backup /etc/passwd on that machine.
  3. But what about /etc/shadow? (My stupid response to the above statement, blurted out after my brain stalled in response to statement #2.)
  4. Oh, hadn’t thought about that (In response to #3).
  5. Can you fax me some cleaning cartridge barcodes?
  6. To save money on barcodes at the end of every week we take them off the tapes in the autochanger and put them on the new ones about to go in.
  7. We only put one tape in the autochanger each night. We don’t want <product> to pick the wrong tape.
  8. We need to upgrade our tape drives. All our backups don’t fit on a single tape any more. (By same company that said #7.)
  9. What do you mean if we don’t change the tape <product> won’t automatically overwrite it? (By same company that said #7 and #8.)
  10. Why would I want to match barcode labels to tape labels? That’s crazy!
  11. That’s being backed up. I emailed Jim a week ago and asked him to add it to the configuration. (Shouted out from across the room: “Jim left last month, remember?”)
  12. We put disk quotas on our academics, but due to government law we can’t do that to their mail. So when they fill up their home directories, they zip them up and email it to themselves then delete it all.
  13. If a user is dumb enough to delete their file, I don’t care about getting it back.
  14. Every now and then on a Friday afternoon my last boss used to delete a filesystem and tell us to have it back by Monday as a test of the backup system.
  15. What are you going to do to fix the problem? (Final question asked by an operations manager after explaining (a) robot was randomly dropping tapes when picking them from slots; (b) tapes were covered in a thin film of oily grime; (c) oh that was probably because their data centre was under the area of the flight path where planes are advised to dump excess fuel before landing; (d) fuel is not being scrubbed by air conditioning system fully and being sucked into data centre; (e) me reminding them we just supported the backup software.)

I will say that numbers 1 and 15 are my personal favourites for crazy statements.

Posted in Backup theory, General Technology, Policies, Quibbles | 1 Comment »

Maybe I’m just an old Unix hack…

Posted by Preston on 2009-12-13

In the land of Dilbert, I’d probably be obligated to wear suspenders and have my socks pulled up past my knees, but ultimately I think I’m becoming an old Unix hack. Why?

  • Not because of my disdain for Windows. (Though that probably helps.)
  • Not because of my passion for Linux. (I have little in that regard.)
  • Not because of my rigid adherence to a particular Unix platform. (Used to be Solaris, now Mac OS X.)
  • Because of my ongoing use of vi.

I’ve been using Mac OS X now since 2005. The date is fairly well fixed in my head simply because it happened about a month after 10.4 (Tiger) was released. It’s also fixed in my head since I’ve never been as productive on a computer as I am on a Mac.

The Mac has changed a lot of my workflows, but the one thing it hasn't changed is the utterly automatic way I lunge for vi whenever I need to edit text, source code, etc. Now, I'll admit I have the absolutely fantastic BBEdit program from Bare Bones Software. I even use it a lot of the time for in-depth coding across a lot of files, and I'd certainly recommend that anyone doing lots of software development on the Mac outside of Xcode buy a license.

But it’s never what I open first when I need to edit a file. There’s something so spartan and uncomplicated about vi. (Which incidentally is probably why emacs just never appealed. It was never spartan or uncomplicated – at least in my opinion.)

I know it's arcane. The idea of an editor mode and a control mode freaks a lot of people out. The use of freaky control commands that make WordStar look like the Paragon of User Interface Design takes a lot of getting used to. Yet, whenever I'm in Word, or OpenOffice*, or even BBEdit, I still find myself automatically trying to type vi search and replace commands. (Hint to any Bare Bones product manager who stumbles across this: please please, pretty please, can we get a "vi" mode in BBEdit?)

To me, and I know a lot of Mac users out there will probably have a conniption in response to what I’m going to say: vi is a lot like Mac OS X. It’s like a butler. It doesn’t jump up and down and pester you every 5 minutes (like Windows) about what you want to do, or that you’ve got an icon not being used on your desktop, or that a new network was found, or any other garbage like that. It just hangs back, lets you work, and jumps to your assistance when you want it.

Call me an old Unix hack if you want, but I can’t go a day without vi. Being able to do things such as the following:

(esc) :.,$s/^/insert into blah(x) values('/

(esc) :.,$s/$/');/

Is for some reason vitally important to my ability to work productively. Heck, I even use vi in NetWorker, thanks to default editor settings and nsradmin‘s response to the keyword ‘edit’ on Unix platforms.
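For anyone wondering what those two substitutions actually do: the first prepends text to every line from the cursor position to the end of the file, and the second appends to the same range. So, with some purely made-up data, a buffer containing:

red
green
blue

comes out the other side as:

insert into blah(x) values('red');
insert into blah(x) values('green');
insert into blah(x) values('blue');

Two keystroke-cheap commands, and a raw list becomes a ready-to-run SQL script.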

I think every technical person who works on heterogeneous systems should learn vi. It's pretty much the one interactive editor you can guarantee will be available on every Unix system. (Discounting 'ed', and disrespecting emacs ;-) ) I can also guarantee that anyone who has used vi for more than 5 minutes and successfully saved a document can navigate the user interface behaviours of the Windows default editor, 'notepad', or the Mac OS X default editor, 'Text Edit'. The same isn't true in reverse, and I find that a lot of, say, Windows admins who start doing bits and pieces of work on Unix systems are usually hampered by the entire vi experience. vi, it seems, is sufficiently foreign to people who grew up in GUI-only environments that it taints the entire Unix interactivity process. However, being an old Unix hack, I don't think this is vi's fault. Indeed, I'd suggest that anyone who can't type "vi quick reference card" into Google and then use the results productively is doing themselves a disservice.

If you're a Windows admin and you've just assumed I'm having a dig at you for not knowing vi, I'm not. As with knowing a cross-platform scripting language (e.g., perl), I merely suggest that administrators in heterogeneous environments will enjoy their job more, and do it more easily, if they know vi.

Oh, and as a final point, can someone please explain why almost everyone else on the planet except me seems to save and quit in vi through multiple actions or more obscure commands (e.g., esc :wq) rather than just:

(esc) :x


* And if someone could explain the arrogance of having OpenOffice on the Mac take over all possible document types whenever it is first run, I'll be very interested in rebutting your arguments.

Posted in Aside, General Technology | 3 Comments »

…And why I’ll stick to Parallels

Posted by Preston on 2009-12-09

So in an earlier post, I mentioned that I'd been looking at first comparisons between VMware Fusion 3.0 and Parallels Desktop 5 for Mac, and I thought it was time to follow up with longer-term impressions.

To be blunt, VMware Fusion 3 is unpolished and unpleasant to use on an almost continual basis. I'll keep it around for only two reasons: (a) so I can run ESX/vSphere within a VM for testing purposes, and (b) so I can periodically play with the demo/test images provided by EMC for particular products that won't convert into Parallels images.

So what’s there to dislike about Fusion?

  • Unity. It's like someone at VMware declared: "Make it slow. Make it inefficient. Make it periodically take 10+ seconds to redraw windows. Make it work, but glitchy enough that it makes the user grind their teeth in frustration." Well, if someone did decree that as a product feature, they did a remarkably good job of achieving it. Here's a tip, folks at VMware: buy a copy of Parallels and see how professionals do an integrated windowing feature. Unity in Fusion v3 is worse than Coherence was when it was first introduced (and Coherence was fine even then) – i.e., you have a long, long way to go.
  • Import another VM. What VM would you like to import? Parallels? Forget it. Why offer to import VMs from Parallels if every VM comes in unusable? (I’m sure other people must have better experiences than this, but I’m certainly not impressed.)
  • Performance. OK, so VMware Fusion performance isn’t atrocious – it’s actually OK. However, I’d been led to believe that VMware Fusion kicked Parallels Desktop out of the ballpark when it came to performance. I’ve not seen anything to indicate that it exceeds the performance of Parallels, and so I see that as a negative.
  • Quit. Don’t pester me, just suspend my VM.

As I said, I’ll be keeping Fusion around, but only for those situations where I can’t use Parallels.

Posted in General Technology | 4 Comments »

Funny attitude adjustments

Posted by Preston on 2009-12-08

It’s funny sometimes seeing attitude adjustments that come from various companies as they’re acquired by others.

One could never say that EMC has been a big fan of tape (I've long since given up any hope of them actually telling the 100% data protection story and buying a tape company), but at least over the years they've tended to admit that tape is necessary.

So this time the attitude adjustment seems to be coming from Data Domain as they merge into the backup and recovery division at EMC following the acquisition. Over at SearchStorage, we have an article by Christine Cignoli called "Data deduplication goes mainstream, but tape lives on", which has this insightful quote:

Even Shane Jackson, director of product marketing at Data Domain, agrees. “We’ve never gone to the extreme of ‘tape is dead,'” he said. “As an archive medium, keeping data for seven years for HIPAA compliance in a box on a shelf is still a reasonable thing to do.”

That's interesting – I could have sworn I have a Data Domain bumper sticker that says this:

[Image: Data Domain bumper sticker]

Now don't get me wrong – I'm not here to rub salt into Data Domain's wounds. But I would like to take the opportunity to point out that tape has been killed off more times than the iPhone. So next time an up-and-coming company trumpets their "tape is dead" story, and some bright-eyed, eager and naïve journalist reports on it, remember that they always come around … eventually.

Posted in Backup theory, General Technology | 2 Comments »

Introducing my new blog

Posted by Preston on 2009-11-28

As frequent visitors to my blog will know, I don’t buy into all the Cloud Hype that threatens to overwhelm the technology industry at the moment. While I’ve periodically written about the Cloud on this blog when something particularly unsettling has come up, I’ve decided that it’s time to fire up a new blog dedicated to providing an alternative view on Cloud Computing.

So, over at my new blog, you’ll find ongoing commentary about Cloud Computing that will be refreshingly free of the hype that we so often find ourselves exposed to on a daily basis.

I will strive to be as honest as possible, will willingly point out anything being done in Cloud initiatives that is fresh and new, and will be open to people trying to convince me that I’m wrong.

I’m a backup consultant. I don’t go for bleeding edge for the sake of it, I don’t buy into hype, and I don’t recommend or accept anything that jeopardises user data.

So, without further ado, please feel free to visit I Am The Anti-Cloud.

(Moving forward, unless something significantly overlaps NetWorker and The Cloud, I’ll not be posting about the Cloud on this blog.)

Posted in General Technology, General thoughts | Comments Off on Introducing my new blog

Storage Tiering vs ILM

Posted by Preston on 2009-11-24

Over at StorageNerve, and on Twitter, Devang Panchigar has been asking Is Storage Tiering ILM or a subset of ILM, but where is ILM? I think it’s an important question with some interesting answers.

Devang starts with defining ILM from a storage perspective:

1) A user or an application creates data and possibly over time that data is modified.
2) The data needs to be stored and possibly be protected through RAID, snaps, clones, replication and backups.
3) The data now needs to be archived as it gets old, and retention policies & laws kick in.
4) The data needs to be search-able and retrievable NOW.
5) Finally the data needs to be deleted.

I agree with items 1, 3, 4 and 5 – as per previous posts, for what it's worth, I believe that 2 belongs to a sister activity which I define as Information Lifecycle Protection (ILP) – something that Devang acknowledges as an alternative theory. (I liken the separation between ILM and ILP to that between operational production servers and support production servers.)

The above list is actually a fairly astute/accurate summary of the involvement of the storage industry thus far in ILM. Devang rightly points out that Storage Tiering (migrating data between different speed/capacity/cost storage based on usage, etc.) doesn't address all of the above points – in particular, data creation and data deletion. That's certainly true.

What's missing from ILM, from a storage perspective, are the components that storage can only peripherally control. Perhaps that's not entirely accurate – the storage industry can certainly participate in the remaining components (indeed, in NAS systems, as a prime example, it's absolutely necessary) – but it's about more than just the storage industry. It's operating system vendors. It's application vendors. It's database vendors. It is, quite frankly, the whole kit and caboodle.

What’s missing in the storage-centric approach to ILM is identity management – or to be more accurate in this context, identity management systems. The brief outline of identity management is that it’s about moving access control and content control out of the hands of the system, application and database administrators, and into the hands of human resources/corporate management. So a system administrator could have total systems access over an entire host and all its data but not be able to open files that (from a corporate management perspective) they have no right to access. A database administrator can fully control the corporate database, but can’t access commercially sensitive or staff salary details, etc.

Most typically though, it's about corporate roles, as defined in human resources, being reflected from the ground up in system access options. That is, when human resources set up a new employee as having a particular role within the organisation (e.g., "personal assistant"), that triggers the appropriate workflows to set up the person's accounts and access privileges for IT systems as well.

If you think that's insane, you probably don't appreciate the purpose of it. System/app/database administrators I talk to about identity management frequently raise the issue of trust (or the perceived lack thereof) in such systems. I.e., they think that if the company they work for wants to implement identity management, it doesn't trust the people who are tasked with protecting the systems. I won't lie – I think in a very small number of instances, this may be the case. Maybe 1%, maybe as high as 2%. But let's look at the bigger picture here: we, as system/application/database administrators, currently have access to such data not because we should have access to it, but because until recently there have been very few options in place to limit data access to only those who, from a corporate governance perspective, should have it. Most system/app/database administrators are highly ethical – they know that being able to access data doesn't equate to actually accessing that data. (Case in point: as the engineering manager and sysadmin at my last job, if I'd been less ethical, I would have seen the writing on the wall long before the company fell down around my ears under financial stresses!)

Trust doesn't wash in legal proceedings. Trust doesn't wash in financial auditing. Particularly in situations where accurate logs aren't maintained in an appropriately secured manner to prove that person A didn't access data X, the fact that the system was designed to permit A to access X (even as part of A's job) is, in some financial, legal and data-sensitivity areas, significant cause for concern.

Returning to the primary point though: it's about ensuring that the people who have authority over someone's role within a company (human resources/management) have control over the processes that configure the access permissions that person ends up with. It's also about making sure that those workflows are properly configured and automated so there's no room for error.

So what's missing – or what's only at the barest starting point – is the integration of identity/access control with ILM (including storage tiering) and ILP. This, as you can imagine, is not an easy task. Hell, it's not even a hard task – it's a monumentally difficult task. It involves a level of cooperation and coordination between different technical tiers (storage, backup, operating systems, applications) that we rarely, if ever, see beyond the basic "must all work together or else it will just spend all the time crashing" perspective.

That’s the bit that gives the extra components – control over content creation and destruction. The storage industry on its own does not have the correct levels of exposure to an organisation in order to provide this functionality of ILM. Nor do the operating system vendors. Nor do the database vendors or the application vendors – they all have to work together to provide a total solution on this front.

I think this answers (indirectly) Devang's question/comment on why storage vendors – and indeed, most of the storage industry – have stopped talking about ILM: the easy parts are well established, but the hard parts are only in their infancy. We are, after all, seeing some very early processes around integrating identity management and ILM/ILP. For instance, key management on backups, if handled correctly, can allow for situations where backup administrators can't by themselves perform the recovery of sensitive systems or data – it requires corporate permissions (e.g., the input of a data access key by someone in HR). Various operating systems and databases/applications are now providing hooks for identity management (to name just one, here are Oracle's details on it).

So no, I think we can confidently say that storage tiering in and of itself is not the answer to ILM. As to why the storage industry has for the most part stopped talking about ILM, we’re left with one of two choices – it’s hard enough that they don’t want to progress it further, or it’s sufficiently commercially sensitive that it’s not something discussed without the strongest of NDAs.

We've seen in the past that the storage industry can cooperate on shared formats and standards. We wouldn't be in the era of pervasive storage we currently are without that cooperation. Fibre Channel, SCSI, iSCSI, FCoE, NDMP, etc., are proof positive that cooperation is possible. What's different this time is that the cooperation extends over a much larger realm, to also encompass operating systems, applications, databases, etc., as well as all the storage components in ILM and ILP. (It makes backups seem to have a small footprint, and backups are amongst the most pervasive of technologies you can deploy within an enterprise environment.)

So we can hope that the reason we're not hearing a lot of talk about ILM any more is that all the interested parties are either working on this level of integration, or at least making the appropriate preparations to start working together on it.

Fingers crossed people, but don’t hold your breath – no matter how closely they’re talking, it’s a long way off.

Posted in Architecture, General Technology, General thoughts, Security | 2 Comments »

Can you trust Azure?

Posted by Preston on 2009-11-18

So The Register has a story about how Microsoft is edging closer to delivering its cloud-based system, Azure.

It seems inept that, through the entire article, there wasn't a single mention of the Sidekick Debacle. As you may remember, that debacle was sponsored by 'Danger', a Microsoft subsidiary. If you think Microsoft weren't involved because Danger was a subsidiary, think again.

If we can learn anything from this, it’s that too many people like to close one eye and half shut the other one to make sure they don’t see all those dark and dangerous storm clouds racing around their silver linings.

Based on Microsoft's track record, I wouldn't trust Azure for a minute with a KB of my data, even if they were paying me. Not until there's an industry-wide alliance for certifying cloud based solutions and ensuring vendors actually treat customer data as if it were their own most sensitive and important data. Not until Microsoft are a gold member of that alliance and have come out of their first two audits with flying colours.

Until then when it comes to Azure, all I see are dark Clouds with no silver linings.

Posted in Aside, General Technology, General thoughts | 2 Comments »

Virtualisation as an exercise in MTBF

Posted by Preston on 2009-11-08

When IT people discuss Mean Time Between Failure (MTBF), the most common focus is on disk drives. We all know the basics, for instance: the more drives you put in an array, the lower the cumulative MTBF, and so on.
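For anyone who hasn't seen the arithmetic spelled out: assuming identical components with independent failures, the expected time to the first failure in a group of N is roughly the single-component figure divided by N:

MTBF(group of N) ≈ MTBF(single component) / N

So (illustrative numbers only) ten drives rated at 500,000 hours each give you an expected first failure somewhere in the array at around 50,000 hours.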

What impact does virtualisation have on MTBF though? Are there any published studies? I suspect not yet.

I’ll be clear from the outset: I like virtualisation.

Liking it, though, doesn't stop me from questioning how many sites (particularly smaller ones) implement it, and the risk they carry of effectively decreased MTBF by putting too many eggs in one basket.

Consider for instance a small business that decides, as part of an infrastructure refresh, to replace their current fileserver, directory server, mail server, database server and internet gateway server with a single VMware ESX server. (We’ll assume of course that they do not virtualise their backup server – something you should never do.)

So, instead of having five primary production servers, each of which has some chance of experiencing a catastrophic failure, we now have one primary production server which can still experience catastrophic failure. I’m not talking at the OS layer here (though that’s still relevant), but at the hardware layer.
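To put some entirely invented numbers against that: give each of those five standalone servers an MTBF of five years, and across the fleet you'd expect roughly one hardware failure per year – but each failure takes out one service. Consolidate them onto a single host with the same five-year MTBF and failures drop to roughly one every five years, but each failure now takes out all five services at once. The total exposure hasn't really shrunk; it's just been concentrated into a single, far more painful event.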

Let’s be honest with ourselves – this is IT, and things can go wrong in IT just as they can anywhere else.

Now, in a small business such as the above, it can be argued that the loss of any one server is likely to cause a fair to serious inconvenience, but in each case, other functions are likely to still be carried out while the hardware is being repaired. If people can’t email, they may be able to catch up on some documentation or file related work. If people can’t access the database, they may be able to process things manually while still emailing, etc.

If all five servers go down at once, that’s a significantly more challenging proposition.

Anyone with exposure to virtualisation, high availability/redundancy or data protection should see what is needed here – a second server, shared storage and the ability to have guest systems moved from one virtualisation server to the other. (In smaller companies it may be achieved instead by just having a standby server with storage that can be accessed by the other host if necessary.)

However, it’s clear there’s more to running a virtualised environment than just whacking a big server in and virtualising the hosts that are already in the computer room.

Companies that are now just starting to adopt virtualisation may feel that it’s a mature enough industry that the time is ripe for jumping in – and they’re right. In fact, it’s been mature enough for long enough that virtualisation is practically old hat.

Regardless of the maturity of virtualisation though, it doesn't change the fact that you're still at the mercy of hardware failures (or other critical virtualisation-host failures), and you still have to design your systems to provide the appropriate level of protection that you (a) can afford and (b) actually need. When doing cost comparisons, it's not appropriate to compare, say, the cost of replacing 5 servers with another 5 servers vs replacing 5 servers with 1 beefier server – virtualised services should never be about putting all the eggs in just one basket.

Without that consideration, it’s too easy to see MTBF for your computing environment fall through the floor – and blame virtualisation technology instead of the real culprit: the practical implementation.

Posted in Aside, General Technology, General thoughts | Comments Off on Virtualisation as an exercise in MTBF

Google service and accountability in The Cloud

Posted by Preston on 2009-11-02

Over at The Register, there's a story, "Gmail users howl over Halloween Outage". As readers may remember, I discussed in The Scandalous Truth about Clouds that there need to be significant improvements in the realm of visibility and accountability from Cloud vendors if the Cloud is to achieve any form of significant trust.

The fact that there was a Gmail outage for some users wasn't what caught my attention in this article – it seems there are almost always some users experiencing problems with Google Mail. What really got my goat was this quote:

Some of the affected users say they’re actually paying to use the service. And one user says that although he represents an organization with a premier account – complete with a phone support option – no one is answering Google’s support line. Indeed, our call to Google’s support line indicates the company does not answer the phone after business hours. But the support does invite you leave a message and provide an account pin number. Google advertises 24/7 phone support for premier accounts, which cost about $50 per user per year.

Do No Evil, huh, Google? What would you call an unstaffed 24×7 support line for people who pay for 24×7 support?

It's time for the cloud hype to be replaced by some cold hard reality checks: big corporates, no matter how "nice" they claim to be, will as a matter of indifference trample on individual end users time and time again. Cloud is all about big corporates and individual end users. If we don't get some industry regulation/certification/compliance soon, then as people continue to buy into the cloud hype, we're going to keep seeing stories of data loss and data unavailability – and the frequency will continue to increase.

Shame Google, shame.

Posted in General Technology | Comments Off on Google service and accountability in The Cloud