NetWorker Blog

Commentary from a long-term NetWorker consultant and Backup Theorist

  • This blog has moved!

    This blog has now moved to nsrd.info/blog. Please jump across to the new site for the latest articles (and all old archived articles).
  • Enterprise Systems Backup and Recovery

    If you find this blog interesting, and either have an interest in or work in data protection/backup and recovery environments, you should check out my book, Enterprise Systems Backup and Recovery: A Corporate Insurance Policy. Designed for system administrators and managers alike, it focuses on features, policies, procedures and the human element of ensuring that your company has a suitable and working backup system, rather than just a bunch of copies made by unrelated software, hardware and processes.

Things not to virtualise: backup servers and storage nodes

Posted by Preston on 2009-02-13

Introduction

When it comes to servers, I love virtualisation. No, not to the point where I’d want to marry virtualisation, but it is something I’m particularly keen about. I even use it at home – I’ve gone from 3 servers (one for databases, one as a fileserver, and one as an internet gateway) down to one, thanks to VMware Server.

Done right, I think the average datacentre should be able to achieve somewhere in the order of 75% to 90% virtualisation. I’m not talking high performance computing environments – just your standard server farms. Indeed, having recently seen a demo for VMware’s Site Recovery Manager (SRM), and having participated in many site failover tests, I’ve become a bigger fan of the time and efficiency savings available through virtualisation.

That being said, I think backup servers fall into that special category of “servers that shouldn’t be virtualised”. In fact, I’d go so far as to say that even if every other machine in your server environment is virtual, your backup server still shouldn’t be a virtual machine.

There are two key reasons why I think having a virtualised backup server is a Really Bad Idea, and I’ll outline them below:

Dependency

In the event of a site disaster, your backup server should be, at the very least, among the first servers rebuilt. That is, you may start the process of getting equipment ready for restoration of data, but the backup server needs to be up and running before any data recovery can be achieved.

If the backup server is configured as a guest within a virtual machine server, it’s hardly going to be the first machine to be configured, is it? The virtual machine server will need to be built and configured first, and only then the backup server.

In this scenario, there is a dependency that results in the build of the backup server becoming a bottleneck to recovery.

I realise that we try to avoid scenarios where the entire datacentre needs to be rebuilt, but this still has to be kept in mind – what do you want to be spending time on when you need to recover everything?
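
To put that dependency in concrete terms, here’s a rough sketch – plain Python, with entirely hypothetical rebuild steps and durations – that simply walks each rebuild plan and totals the time that elapses before data recovery can even begin:

    # Illustrative only: the rebuild steps and durations (in hours) are invented.
    # Each step maps to (prerequisite steps, its own duration).
    REBUILD_PLANS = {
        "physical backup server": {
            "rack and cable hardware": ([], 2),
            "install backup server OS": (["rack and cable hardware"], 2),
            "install and configure backup software": (["install backup server OS"], 1),
            "begin data recovery": (["install and configure backup software"], 0),
        },
        "virtual backup server": {
            "rack and cable hardware": ([], 2),
            "install and configure hypervisor": (["rack and cable hardware"], 2),
            "configure virtual networking and storage": (["install and configure hypervisor"], 1),
            "build backup server guest": (["configure virtual networking and storage"], 2),
            "install and configure backup software": (["build backup server guest"], 1),
            "begin data recovery": (["install and configure backup software"], 0),
        },
    }

    def finish_time(step, plan, memo=None):
        """Earliest finish time for a step: its longest prerequisite chain plus its own duration."""
        memo = {} if memo is None else memo
        if step not in memo:
            deps, duration = plan[step]
            memo[step] = duration + max((finish_time(d, plan, memo) for d in deps), default=0)
        return memo[step]

    for scenario, plan in REBUILD_PLANS.items():
        hours = finish_time("begin data recovery", plan)
        print(f"{scenario}: data recovery can start after ~{hours} hours")

The figures themselves don’t matter; the point is that virtualising the backup server inserts extra build steps ahead of it on the critical path to recovery.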

Performance

Most enterprise-class virtualisation systems offer the ability to set performance criteria on a per machine basis – that is, in addition to the basics you’d expect such as “this machine gets 1 CPU and 2GB of RAM”, you can also configure options such as limiting the number of MHz/GHz available to each presented CPU, or guaranteeing performance criteria.

Regardless though, when you’re a guest in a virtual environment, you’re still sharing resources. That might be memory, CPU, backplane performance, SAN paths, etc., but it’s still sharing.

That means at some point, you’re sharing performance. The backup server, which is trying to write data out to the backup medium (be that tape or disk), is potentially either competing for, or at least sharing, backplane throughput with the machines it is backing up.

This may not always have a tangible impact. However, when it does, debugging that impact becomes much more challenging. (For instance, in my book, I cover off some of the performance implications of having a lot of machines access storage from a single SAN, and how the performance of any one machine during backup is no longer affected just by that machine. The same non-trivial performance implications come into play when the backup server is virtual.)
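
To illustrate the sharing point with some back-of-the-envelope arithmetic – every figure below is made up, and real SAN and backplane behaviour is far more nuanced – here’s a trivial Python sketch of what happens to a backup window when the backup server’s writes contend for the same bandwidth as the client reads it is driving:

    # Illustrative arithmetic only: bandwidth and data sizes are hypothetical.
    SHARED_BANDWIDTH_MB_PER_SEC = 800   # backplane/SAN bandwidth of the virtual infrastructure
    DATA_TO_BACK_UP_GB = 2000           # data to move within the backup window

    def backup_window_hours(server_is_guest: bool) -> float:
        """Hours needed to move the data. If the backup server is a guest on the same
        infrastructure, assume its writes to the backup medium and the client reads it
        drives split the shared bandwidth roughly in half."""
        effective = SHARED_BANDWIDTH_MB_PER_SEC / (2 if server_is_guest else 1)
        seconds = (DATA_TO_BACK_UP_GB * 1024) / effective
        return seconds / 3600

    print(f"dedicated backup server:   ~{backup_window_hours(False):.1f} hours")
    print(f"virtualised backup server: ~{backup_window_hours(True):.1f} hours")

Halving the bandwidth is obviously a gross simplification, but it captures why a backup window that fits comfortably on dedicated hardware can blow out once the backup server starts sharing the same paths as the data it’s protecting.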

In Summary

One way or the other, there’s a good reason why you shouldn’t virtualise your backup environment. It may be that for a small environment the performance impact isn’t an issue and virtualising seems logical. However, if you are in a small environment, it’s likely that your failover to another site will be a very manual process, in which case you’ll be far more likely to hit the dependency issue when it comes time for a full site recovery.

Equally, if you’re a large company that has a full failover site, then while the dependency issue may not be as much of a problem (due to, say, replication, snapshots, etc.), there’s a very high chance that backup and recovery operations are very time critical, in which case the performance implications of having a backup server share resources with other machines will likely make a virtual backup server an unpalatable solution.

A final request

As someone who has done a lot of support, I’d make one special request if you do decide to virtualise your backup server*.

Please, please make sure that any time you log a support call with your service provider you let them know you’re running a virtual backup server. Please.


* Much as I’d like everyone to do as I suggest, I (a) recognise this would be a tad boring and (b) am unlikely at any point soon or in the future to become a world dictator, and thus wouldn’t be able to issue such an edict anyway, not to mention (c) can occasionally be fallible.


6 Responses to “Things not to virtualise: backup servers and storage nodes”

  1. What about having your backup system within a Solaris zone?

    • Preston said

      I’m away from my computer at the moment, so I can’t say whether running a NSR server in a non-global zone is supported or not…

      That being said, my personal preference would still be to avoid running a server or a storage node in a Solaris zone; it’s still effectively putting you into a position where resources are being shared for what is traditionally a performance-critical/driven host.

      If you are virtualising, or even running in Solaris zones, overall monitoring and control of IO throughput and performance in general becomes much trickier – particularly if you don’t have root/admin privileges on the virtual server/global zone.

  2. Alby said

    In NetWorker there are 3 classes of server as I understand it – the server console, Datazone servers and Storage nodes. I see your points working for the Datazone and Storage node. Can you comment on the need to keep the console server physical, or can it be virtualised?

    • Preston said

      I don’t believe there’s any driving reasons why you couldn’t virtualise either a management console or a license server (if you happened to use a license server). Neither of them are performance driven, and as such are actually ideal candidates for virtualisation.

      (Technically in a NetWorker datazone, there’s only one type of server – the backup server itself. Storage nodes aren’t referred to officially as servers in a NetWorker sense, and management console hosts are ‘servers’ but for a control zone – one control zone can administer multiple datazones. I’m not being nit-picky, I just thought I’d elaborate on how the terms are used, etc.)

  3. Alex said

    It’s been 10 months now since this article (good read btw), and I am wondering what your thoughts are on the introduction and support of a NetWorker server running in a Solaris LDOM.

    Specifically, two NetWorker servers running on each physical CMT system, with the domains physically separated by the system’s bus. Each LDOM would have direct access to an FC tape drive for recovery and supportability reasons.

    For DR considerations, all data is replicated to another site, and for performance reasons the NetWorker server only serves as the controller for a datazone and does not handle real data IOs.

    • Preston said

      So I take it your goal in this description is to run two separate datazones from the same physical server?

      I’ll admit that my experience with LDOM is insufficient to provide any specific yay or nay recommendations on the proposed configuration. My gut reaction, even in a zeroth-tier configuration, is to keep backup servers on physically independent hosts. However, I will agree that in the configuration you’re suggesting, where the actual backup processes will be handled by physically separate storage nodes, you have the best chance of running virtualised servers that don’t leech from each other’s resources – so long as you’re allocating a sufficient number of CPUs and an appropriate amount of RAM to the server for each datazone.
