NetWorker Blog

Commentary from a long-term NetWorker consultant and Backup Theorist

  • This blog has moved!

    This blog has now moved to Please jump across to the new site for the latest articles (and all old archived articles).



  • Enterprise Systems Backup and Recovery

    If you find this blog interesting, and either have an interest in or work in data protection/backup and recovery environments, you should check out my book, Enterprise Systems Backup and Recovery: A Corporate Insurance Policy. Designed for system administrators and managers alike, it focuses on features, policies, procedures and the human element to ensuring that your company has a suitable and working backup system rather than just a bunch of copies made by unrelated software, hardware and processes.

Is a “copy” a “backup”?

Posted by Preston on 2009-09-30

There have been a number of discussions on various storage blogs, both previously and again now, on whether a copy (e.g., a tarball or a snapshot) is a backup. There have been arguments on both sides of the fence, and I’m going to contribute equally to both sides now.

You see, a copy is a backup, and it’s not a backup.

It’s almost like Schrödinger’s Cat – it may be a backup, or it may not be a backup, and you won’t know for sure until you look more closely at it.

In my book, I set out early in the process to define a backup, and define it as follows:

A backup is a copy of any data that can be used to restore the data as/when required to its original form. That is, a backup is a valid copy of data, files, applications or operating systems that can be used for the purposes of recovery.

So it would seem then that I come down fairly heavily in favour of the notion that a copy is a backup. Well, yes – and no.

In the broadest sense of the term, a random copy of data such as a tarball, an rsync, a zip file or a read-only snapshot is indeed a “backup”, as it can be used, in a single instance, for the purposes of recovery. However, so too could a binary print-out/dump of the exact state of every bit on a LUN. Few would argue, though, that data requiring such an arduous, manual re-entry process is really recoverable, even though in theory it is.

The reason it’s not really recoverable is that recoveries must be completed in a timeframe that is useful to the business (or the end user) that needs the data back. Without that, we don’t really have a backup at all, just a random copy of the data.

If we look past the broad term “backup” though, and actually evaluate the term backup system, then I would suggest that a single “backup”, unless it’s an instantiation of protection from the backup system, is not a backup at all, but instead is just a random (or pseudo-random) copy.

To me this boils down to the need to work with the notion of Information Lifecycle Protection. As you may recall, in a previous blog entry I suggested that there’s a need to break off data protection activities from ILM and define a new process that revolves around keeping data available in order to be managed by ILM. It may seem a small distinction, but it’s one which helps in these sorts of discussions. At the time I suggested that conceptually, ILP may be represented as follows:

Components of ILP


Under this definition, we can cease to worry about whether a copy is a backup, because clearly, a copy will be part of an overall ILP strategy. It’s still data protection, but it doesn’t have to be backup in order to be data protection.

My personal opinion is that a single, isolated copy is technically a backup, but is logically not a backup. “Technically is” because it can be used to restore data. “Logically not” because it’s not in itself a guarantee of a correctly designed backup system. I.e., unless we can say that the copy came from the backup system, we can’t be guaranteed it’s a backup.

One last quote from my book – this time from the back page:

A well-designed backup system comes about only when several key factors coalesce: business involvement, IT acceptance, best practice designs, enterprise software and reliable hardware.

So the answer, I guess, to “is a copy a backup?” is another question: “did the copy come from a backup system?” If the answer to that question is yes, then the answer to the original question is the same. If the answer is no, we can’t reliably answer “yes” to the original question.
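For readers who think in code, the distinctions drawn in this post can be sketched as a small decision function. This is only an illustration of the argument, not a tool: the `Copy` fields, the `Verdict` names and the `useful_within_hours` parameter are all hypothetical, and in practice none of these values would be as easy to obtain as a boolean or a number.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    NOT_A_BACKUP = "just a random copy"
    TECHNICALLY_ONLY = "technically a backup, logically not"
    BACKUP = "a backup"


@dataclass(frozen=True)
class Copy:
    """A copy of some data. All fields are hypothetical, for illustration."""
    restorable: bool          # can it return the data to its original form?
    from_backup_system: bool  # was it produced by the backup system?
    restore_hours: float      # estimated time to complete a recovery


def classify(copy: Copy, useful_within_hours: float) -> Verdict:
    """Apply the post's reasoning to a single copy."""
    # A copy that cannot restore the data, or cannot do so in a timeframe
    # useful to the business, is not a backup in any meaningful sense.
    if not copy.restorable or copy.restore_hours > useful_within_hours:
        return Verdict.NOT_A_BACKUP
    # A usable copy that did not come from the backup system can restore
    # data, but carries no guarantee of a correctly designed system.
    if not copy.from_backup_system:
        return Verdict.TECHNICALLY_ONLY
    return Verdict.BACKUP
```

Under these assumptions, an ad hoc tarball that restores in two hours against an eight-hour window, `classify(Copy(True, False, 2.0), 8.0)`, comes out as `Verdict.TECHNICALLY_ONLY`, while the bit-by-bit print-out of a LUN fails the timeframe test and is classed as `Verdict.NOT_A_BACKUP`.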

