NetWorker Blog

Commentary from a long term NetWorker consultant and Backup Theorist

  • This blog has moved!

    This blog has now moved to nsrd.info/blog. Please jump across to the new site for the latest articles (and all old archived articles).

  • Enterprise Systems Backup and Recovery

    If you find this blog interesting, and either have an interest in or work in data protection/backup and recovery environments, you should check out my book, Enterprise Systems Backup and Recovery: A Corporate Insurance Policy. Designed for system administrators and managers alike, it focuses on the features, policies, procedures and human element of ensuring that your company has a suitable, working backup system, rather than just a bunch of copies made by unrelated software, hardware and processes.

Posts Tagged ‘performance tuning’

NetWorker 7.6 Performance Tuning Guide – No longer an embarrassment

Posted by Preston on 2009-11-21

While NetWorker 7.6 is not available for download as of the time I write this, the documentation is available on PowerLink. For those of you chomping at the bit to at least read up on NetWorker 7.6, now is the time to wander over to PowerLink and delve into the documentation.

The last couple of releases of NetWorker have been interesting for me when it comes to beta testing. In particular, I’ve let colleagues delve into VCB functionality, etc., and I’ve stuck to “niggly” things – e.g., checking for bugs that have caused us and our customers problems in earlier versions, focusing on the command line, etc.

For 7.6 I also decided to revisit the documentation, particularly in light of some of the comments that regularly appear on the NetWorker mailing list about the sorry state of the Performance Tuning and Optimisation Guide.

It’s pleasing, now that the documentation is out, to read the revised and up-to-date version of the Performance Tuning Guide. Regular critics of the guide, for instance, will be pleased to note that FDDI does not appear once. Not once.

Does it contain every possible useful piece of information that you might use when trying to optimise your environment? No, of course not – nor should it. Everyone’s environment will differ in a multitude of ways. Any random system patch can affect performance. A single dodgy NIC can affect performance. A single misconfigured LUN or SAN port can affect performance.

Instead, the document now focuses on providing a high level overview of performance optimisation techniques.

Additionally, recommendations and figures have been updated to support current technology. For instance:

  • There’s a plethora of information on PCI-X vs PCI Express.
  • RAM guidelines for the server, based on the number of clients, have been updated.
  • NMC finally gets a mention as a resource hog! (Obviously, those aren’t the words used, but that’s the implication for larger environments. I’ve been increasingly encouraging larger customers to put NMC on a separate host for this reason.)
  • There’s a whole chunk on client parallelism optimisation, both for the clients and the backup server itself.

I don’t think this document is perfect, but if we’re looking at the old document vs the new, and the old document scored a 1 out of 10 on the relevancy front, this at least scores a 7 or so, which is a vast improvement.

Oh, one final point – with the documentation now explicitly stating:

The best approach for client parallelism values is:

– For regular clients, use the lowest possible parallelism settings to best balance between the number of save sets and throughput.

– For the backup server, set highest possible client parallelism to ensure that index backups are not delayed. This ensures that groups complete as they should.

Often backup delays occur when client parallelism is set too low for the NetWorker server. The best approach to optimize NetWorker client performance is to eliminate client parallelism, reduce it to 1, and increase the parallelism based on client hardware and data configuration.

(My emphasis)

Isn’t it time that the default client parallelism value were decreased from the ridiculously high 12 to 1, and we got everyone to actually think about performance tuning? I was overjoyed when I’d originally heard that the (previous) default parallelism value of 4 was going to be changed, then horrified when I found out it was being revised up, to 12, rather than down to 1.
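
If you do want to wind a client back to a parallelism of 1 and tune upwards from there, it can be done from the command line with nsradmin. Here’s a minimal sketch, assuming a hypothetical client named “mars” (the transcript is illustrative, not an exact capture):

    nsradmin
    nsradmin> . type: NSR client; name: mars
    nsradmin> update parallelism: 1
                    parallelism: 1;
    Update? y
    nsradmin> quit

Then watch the next few backups and step the value up until you find the balance between save set count and throughput that the guide describes.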

Anyway, if you’ve previously dismissed the Performance Tuning Guide as being hopelessly out of date, it’s time to go back and re-read it. You might like the changes.

Posted in NetWorker | 4 Comments »

Impact of high speed tape on backup

Posted by Preston on 2009-05-25

Back when I first started doing enterprise backup, DLT 7000 had just been introduced. There were a few systems I had to administer that still had DLT 4000 drives attached, but DLT 7000 was rapidly becoming the standard.

With DLT 7000 came a batch of additional headaches, most notably: how do I keep the damn thing streaming? With a 5MB/s native write speed, and at least half of the servers in my environment still connected by 10Mbit rather than 100Mbit ethernet, keeping a drive of that speed streaming was a challenge involving careful juggling of backup timings and parallelism.
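
To put some rough numbers on that (theoretical figures, ignoring protocol overheads, which only make matters worse):

    DLT 7000 native write speed:  ~5 MB/s
    10Mbit ethernet:              10 Mbit/s ÷ 8 = 1.25 MB/s, at best

In other words, even running flat out, a single 10Mbit client could supply at most about a quarter of what one drive wanted to consume.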

Fast forward 13 years, and we’ve come full circle. For a while systems and networks leapfrogged tape, or at least were able to mostly keep up with it, but now, with high speed tape like LTO-4, we’re back to a situation where the average site will struggle to keep tape streaming.

First, I guess I should qualify – what’s this streaming that I refer to? If you want to get down to the utter nuts and bolts of it, it refers to keeping the tape running through the drive mechanism at a consistent (and high) number of metres per second. (For instance, several LTO-4 drives are rated at 7 metres per second.) In backup terms, what we’re talking about is keeping a consistently high number of MB/s running to the drive.
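
If you’re wondering how a linear speed turns into a data rate, here’s a back-of-the-envelope conversion using approximate published LTO-4 figures (tape length and track geometry quoted from memory, so treat the numbers as indicative only):

    Tape length:               ~820 m
    Native capacity:           800 GB (800,000 MB)
    End-to-end passes to fill: ~56 (896 tracks, written 16 at a time)
    Data per metre, per pass:  800,000 MB ÷ (56 × 820 m) ≈ 17 MB/m
    At 7 m/s:                  17 MB/m × 7 m/s ≈ 120 MB/s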

When we’re unable to keep a consistently high number of MB/s running to the drive, one of two things will typically happen. First, if the drive is able to (and it depends entirely on the manufacturer and tape format), it may “step down” its streaming speed to a number more suitable to the environment. This has variable success. You might be able to argue it’s like only ever going up to 3rd gear in a Ferrari, but I don’t know cars, so that’s likely to be a terrible analogy for a whole suite of reasons I don’t understand … :-)

The second thing that may happen is that the tape will start to shoe-shine. Shoe-shining occurs when the minimum throughput required to keep the drive streaming can’t be achieved: the drive repeatedly stops and restarts as its buffers empty, which slows the backup down even further and creates additional wear and tear on both drives and media.
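
To illustrate why shoe-shining hurts so much, consider a crude model with invented numbers:

    Incoming data rate:    50 MB/s   (example only)
    Drive streaming rate:  120 MB/s
    Drive buffer:          256 MB    (example only)

    While writing, the buffer drains at 120 - 50 = 70 MB/s, so it
    empties in roughly 256 ÷ 70 ≈ 3.7 seconds. The drive must then
    stop, rewind and reposition – easily a second or more of dead
    time – before writing can resume.

Repeat that cycle over an entire backup window and both throughput and drive longevity suffer.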

To be blunt – the minimum goal of any backup administrator when it comes to performance tuning an environment should be to eliminate shoe-shining wherever possible.

So, back to that “full circle”: just as it was all those years ago, keeping media streaming is once again a real challenge.

One problem that frequently occurs at new sites is that, when evaluating tape formats for purchase, they look only at that magic “bang for buck” number – the size of the media, in GB. For this reason, LTO-4 looks appealing to a large number of sites – 800 GB native, 1.6TB compressed (assuming 2:1 compression) – it just seems like a great media format.

The problem is that the streaming speed frequently isn’t taken into consideration. LTO-4 on average has an uncompressed streaming speed of 120MB/s. That’s not easy to achieve, and as you can imagine, supplying data fast enough to keep a compressed stream going is even more challenging.

Now, there are undoubtedly big environments that can easily keep LTO-4 streaming with direct backups from client to tape. But these aren’t your average environments. Look at the speed – 120MB/s – that’s faster than gigabit ethernet. We’re immediately talking either large trunked environments at both the server and the clients, or stepping up to 10 gigabit ethernet. We’re talking lots of spindles on high speed disk. Or to be perhaps a little crass, we’re talking buckets of $$$.
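
The arithmetic is stark (theoretical figures; sustained real-world rates are lower still):

    Gigabit ethernet:          1000 Mbit/s ÷ 8 = 125 MB/s theoretical,
                               typically ~90-110 MB/s sustained
    LTO-4 native streaming:    ~120 MB/s
    LTO-4 at 2:1 compression:  ~240 MB/s of incoming data required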

To me, then, the primary impact of high speed tape is that it forces organisations to rethink how they back up. Using even LTO-3, it was possible for a gigabit based environment to achieve a modicum of tape streaming just by using higher levels of parallelism. However, once you reach the point where your average streaming speed for native/uncompressed backups exceeds your average network speed, you must adjust the backup architecture.

The most common, and most appropriate, way to achieve this is to move to a 2-tier storage system, comprising a layer of disk and then a layer of tape.

Within NetWorker, there are two ways to achieve this:

  • First backup to disk backup units (ADV_FILE devices), then clone/stage to tape.
  • First backup to virtual tape libraries (VTLs), then clone/stage to tape.

The purpose of either of these mechanisms is to put all the backups done overnight into a single location, from which they can then be streamed to tape without the network being a factor.

So, if we go down the disk backup unit option, this would mean attaching some high speed storage to the backup server (or a storage node – let’s assume in this instance that every time I say “backup server”, I could equally mean “storage node”), and also attaching the LTO-4 drives to the backup server. The backup itself is still run across the network, to the backup server’s disk backup units. Once the backup completes, the backup server runs cloning operations to write the tape copies – with the network no longer in play, and assuming suitable hardware connectivity, we should be able to easily keep LTO-4 streaming from one consistent and uninterrupted read off high speed disk. At a later point, we then stage that data – writing a second copy which, once complete, removes the original from the disk backup unit.
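
For the curious, both pieces of that configuration can be sketched in nsradmin. The device path, staging resource name and thresholds below are hypothetical, and attribute names can vary between NetWorker releases, so treat this as indicative rather than definitive – check the administration guide for your version:

    nsradmin> create type: NSR device; name: /d/nsr/d01;
              media type: adv_file
    nsradmin> create type: NSR stage; name: Stage to LTO-4;
              enabled: Yes; devices: /d/nsr/d01;
              destination pool: Default Clone;
              high water mark: 85; low water mark: 60

The water marks govern when staging kicks in and how far it drains the disk backup unit before stopping.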

(I should note, there’s a raft of other options that can be deployed to assist with getting high speed tape streaming, many of which I discuss in the performance tuning section of my book. I’ve just picked the most common scenario here.)

If we go down the VTL path, we’re still essentially relying on the same mechanism, just in a different format: once all the data we want to transfer out to physical tape sits on one “chunk” of high speed disk, we can do that transfer at streaming speed.

My first recommendation, then, to any site that is using LTO-4* in a direct-to-tape scheme and can’t get drives streaming is that they need to rethink their backup architecture. In the end it doesn’t matter how much time you spend tweaking software settings here and there: if the hardware can’t cut it, you won’t get it.


* More generally, as you may have imagined, this can apply to any tape format where, as I mentioned earlier in the article, the native streaming speed exceeds the native network speed.

Posted in Backup theory, NetWorker | Comments Off on Impact of high speed tape on backup