NetWorker Blog

Commentary from a long-term NetWorker consultant and Backup Theorist

  • This blog has moved!

    This blog has now moved to nsrd.info/blog. Please jump across to the new site for the latest articles (and all old archived articles).

  • Enterprise Systems Backup and Recovery

    If you find this blog interesting, and either have an interest in or work in data protection/backup and recovery environments, you should check out my book, Enterprise Systems Backup and Recovery: A Corporate Insurance Policy. Designed for system administrators and managers alike, it focuses on features, policies, procedures and the human element of ensuring that your company has a suitable, working backup system rather than just a bunch of copies made by unrelated software, hardware and processes.

Archive for the ‘Basics’ Category

Manually Staging? Don’t forget the Clone ID!

Posted by Preston on 2009-11-13

Something that continues to periodically come up is the need to remind people running manual staging to ensure they specify both the SSID and the Clone ID when they stage. I did some initial coverage of this when I first started the blog, but I wanted to revisit and demonstrate exactly why this is necessary.

The short version of why is simple: if you stage by SSID alone, NetWorker will delete/purge all instances of the saveset other than the one the stage operation just created. This is Not A Good Thing for 99.999% of what we do within NetWorker.

So to demonstrate, here’s a session where I:

  1. Generate a backup
  2. Clone the backup to tape
  3. Stage the saveset only to tape

In between each step, I’ll run mminfo to get a dump of what the media database says about saveset availability.

Part 1 – Generate the Backup

Here’s a very simple backup for the purposes of this demonstration, and the subsequent mminfo command to find out about the backup:

[root@tara ~]# save -b Default -LL -q /etc
save: /etc  106 MB 00:00:07   2122 files
completed savetime=1258093549

[root@tara ~]# mminfo -q "client=tara.pmdg.lab,name=/etc" -r volume,ssid,cloneid,savetime
 volume        ssid          clone id  date
Default.001    2600270829  1258093549 11/13/2009
Default.001.RO 2600270829  1258093548 11/13/2009

There’s nothing out of the ordinary here, so we’ll move onto the next step.

Part 2 – Clone the Backup

We’ll just do a manual clone to the Default Clone pool. Here we’ll specify the saveset ID alone, which is fine for cloning – but is often what leads people into the habit of not specifying a particular saveset instance. I’m using very small VTL tapes, so don’t be worried that in this case I’ve got a clone of /etc spanning 3 volumes:

[root@tara ~]# nsrclone -b "Default Clone" -S 2600270829
[root@tara ~]# mminfo -q "client=tara.pmdg.lab,name=/etc" -r volume,ssid,cloneid,savetime
 volume        ssid          clone id  date
800843S3       2600270829  1258094164 11/13/2009
800844S3       2600270829  1258094164 11/13/2009
800845S3       2600270829  1258094164 11/13/2009
Default.001    2600270829  1258093549 11/13/2009
Default.001.RO 2600270829  1258093548 11/13/2009

As you can see there, it’s all looking fairly ordinary at this point – nothing surprising is going on at all.

Part 3 – Stage by Saveset ID Only

In this next step, I’m going to stage by saveset ID alone rather than by the saveset ID/clone ID combination (the correct way of staging), to demonstrate what happens at the conclusion of the staging operation. I’ll be staging to a pool called “Big”:

[root@tara ~]# nsrstage -b Big -v -m -S 2600270829
Obtaining media database information on server tara.pmdg.lab
Parsing save set id(s)
Migrating the following save sets (ids):
 2600270829
5874:nsrstage: Automatically copying save sets(s) to other volume(s)

Starting migration operation...
Nov 13 17:34:00 tara logger: NetWorker media: (waiting) Waiting for 1 writable 
volume(s) to backup pool 'Big' disk(s) or tape(s) on tara.pmdg.lab
5884:nsrstage: Successfully cloned all requested save sets
5886:nsrstage: Clones were written to the following volume(s):
 BIG991S3
6359:nsrstage: Deleting the successfully cloned save set 2600270829
Successfully deleted original clone 1258093548 of save set 2600270829 
from media database.
Successfully deleted AFTD's companion clone 1258093549 of save set 2600270829 
from media database with 0 retries.
Successfully deleted original clone 1258094164 of save set 2600270829 
from media database.
Recovering space from volume 4294740163 failed with the error 
'Cannot access volume 800844S3, please mount the volume or verify its label.'.
Refer to the NetWorker log for details.
6330:nsrstage: Cannot access volume 800844S3, please mount the volume 
or verify its label.
Completed recover space operation for volume 4177299774
Refer to the NetWorker log for any failures.
Recovering space from volume 4277962971 failed with the error 
'Cannot access volume 800845S3, please mount the volume or verify its label.'.
Refer to the NetWorker log for details.
6330:nsrstage: Cannot access volume 800845S3, please mount the volume 
or verify its label.
Recovering space from volume 16550059 failed with the error 
'Cannot access volume 800843S3, please mount the volume or verify its label.'.
Refer to the NetWorker log for details.
6330:nsrstage: Cannot access volume 800843S3, please mount the volume 
or verify its label.

You’ll note there’s a bunch of output there about being unable to access the clone volumes the saveset was previously cloned to. When we then check mminfo, we see the consequences of the staging operation:

[root@tara ~]# mminfo -q "client=tara.pmdg.lab,name=/etc" -r volume,ssid,cloneid,savetime
 volume        ssid          clone id  date
BIG991S3       2600270829  1258095244 11/13/2009

As you can see – no reference to the clone volumes at all!

Now, has the clone data been erased? No, but it has been removed from the media database, meaning you’d have to manually scan the volumes back in before being able to use them again. Worse, if those volumes only contained clone data that was subsequently removed from the media database, they may become eligible for recycling and be re-used before you notice what has gone wrong!

Wrapping Up

Hopefully the above session will have demonstrated the danger of staging by saveset ID alone. If instead of staging by saveset ID we staged by saveset ID and clone ID, we’d have had a much more desirable outcome. Here’s a (short) example of that:

[root@tara ~]# save -b Default -LL -q /tmp
save: /tmp  2352 KB 00:00:01     67 files
completed savetime=1258094378
[root@tara ~]# mminfo -q "name=/tmp" -r volume,ssid,cloneid
 volume        ssid          clone id
Default.001    2583494442  1258094378
Default.001.RO 2583494442  1258094377
[root@tara ~]# nsrclone -b "Default Clone" -S 2583494442

[root@tara ~]# mminfo -q "name=/tmp" -r volume,ssid,cloneid
 volume        ssid          clone id
800845S3       2583494442  1258095244
Default.001    2583494442  1258094378
Default.001.RO 2583494442  1258094377
[root@tara ~]# nsrstage -b Big -v -m -S 2583494442/1258094377
Obtaining media database information on server tara.pmdg.lab
Parsing save set id(s)
Migrating the following save sets (ids):
 2583494442
5874:nsrstage: Automatically copying save sets(s) to other volume(s)

Starting migration operation...

5886:nsrstage: Clones were written to the following volume(s):
 BIG991S3
6359:nsrstage: Deleting the successfully cloned save set 2583494442
Successfully deleted original clone 1258094377 of save set 2583494442 from 
media database.
Successfully deleted AFTD's companion clone 1258094378 of save set 2583494442 
from media database with 0 retries.
Completed recover space operation for volume 4177299774
Refer to the NetWorker log for any failures.

[root@tara ~]# mminfo -q "name=/tmp" -r volume,ssid,cloneid
 volume        ssid          clone id
800845S3       2583494442  1258095244
BIG991S3       2583494442  1258096324

The recommendation I always make is to forget about using saveset IDs alone unless you absolutely have to. Instead, get into the habit of always specifying a particular instance of a saveset via the “ssid/cloneid” option. That way, if you do any manual staging, you won’t wipe out access to data!
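To make the safe habit easy to script, the ssid/cloneid pairs can be pulled straight out of mminfo and turned into nsrstage commands. This is a minimal sketch only – the heredoc stands in for real mminfo output (the sample IDs are taken from the sessions above), and the “Big” pool is the one used in this post:

```shell
# Minimal sketch: always stage by ssid/cloneid, never by ssid alone.
# The heredoc below stands in for output from a query such as:
#   mminfo -q "volume=Default.001" -r ssid,cloneid
mminfo_output() {
  cat <<'EOF'
2600270829 1258093548
2583494442 1258094377
EOF
}

# Join each ssid/cloneid pair into the safe form for nsrstage -S.
mminfo_output | awk '{ printf "nsrstage -b Big -v -m -S %s/%s\n", $1, $2 }'
```

Echoing the commands rather than executing them makes it easy to eyeball the instance selection before committing to the stage.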

Posted in Basics, NetWorker, Scripting | 2 Comments »

Lessons I’ve recently learned…

Posted by Preston on 2009-11-11

When I was at University, a philosophy lecturer remarked rather sagely that University is the last place people can go to learn for the sake of learning.

That’s sort of correct, but not always so. People can fumble through their jobs on a day to day basis learning only what they have to, but they can also work on the basis of trying to soak up as much information as they can along the way. I’m not always a knowledge sponge – particularly if my caffeine quota is on the light side for the day – but I like to think I learn the odd thing here and there.

In the spirit of knowledge acquisition, here are a few smaller things I’ve learned recently:

  • When simulating network connectivity problems, there’s a big difference between yanking the network cable and shutting down the network interface. (I was doing the interface shutdown, another person was doing the network cable unplug – and our results didn’t correlate.) Lesson: When escalating a case to vendor support, always spell out how you’re simulating the “comms failure” a customer is having.
  • The ‘bigasm’ utility starts to fall in a heap and becomes extremely unreliable once you exceed about 2100 GB of data generated for a single file. Lesson: When setting out to generate 2.3+ TB of backup data, create a bunch of files and have a bigasm directive to generate a smaller amount of data per file.
  • When setting up tests that will take a couple of days to run, always triple check what you’re about to do before you start it. Lesson: If you make a typo of 250 files at 100 GB each instead of 250 files at 10 GB each, bigasm/NetWorker won’t interpolate what you really meant.
  • There’s a hell of a difference between Solaris 10 AMD release 2 and release 8. Lesson: If wanting to get a Solaris 10 AMD 64-bit OS working in Parallels Desktop for Mac v5 with networking, go for release 8. It will save many forehead bruises.
  • ext3 is about as “modern” a filesystem as I am an elite sportsperson. Lesson: If wanting to achieve decent operational activities with backup to disk under Linux, use XFS instead of ext3.
  • All eSATA is not created equal. Lesson: When using a motherboard SATA -> eSATA converter, make sure the dual drive dock you order doesn’t work as a port multiplier.

Posted in Basics, General thoughts, NetWorker | Comments Off on Lessons I’ve recently learned…

Basics – NetWorker Data Lifecycle

Posted by Preston on 2009-10-26

Within NetWorker, data (savesets) can go through several stages in its lifecycle. Here’s a simple overview of those stages:

Basic data lifecycle

The first stage, obviously, is when data is initially being written – the “in progress” stage.

After the backup completes, data enters two stages – a browsable period and a retention period. These periods may have 100% overlap, or they may be distinctly different. For instance, the “standard” browse/retention policies chosen by NetWorker when you create a new client are:

  • Browse period – 1 month
  • Retention period – 1 year

A common mistake people make with NetWorker is to assume that the retention period starts when the browse period finishes; in actual fact, the retention and browse periods start at the same time, but the browse period can finish before the retention period. So using that standard setting as an example, the saveset is browsable for the first month of the 12 months that it is retained – it is not the case that the saveset is browsable for 1 month, then retained for another 12.
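To put concrete dates on that overlap, here’s a quick sketch using GNU date (the “-d … + 1 month” relative syntax assumed here is GNU-specific) with an arbitrary savetime; both windows are anchored at the same starting point, and only their lengths differ:

```shell
# Sketch of the browse/retention overlap: both periods start at the savetime.
savetime="2009-11-13"
browse_end=$(date -u -d "$savetime + 1 month" +%F)
retention_end=$(date -u -d "$savetime + 1 year" +%F)
echo "browsable: $savetime to $browse_end"
echo "retained:  $savetime to $retention_end"
```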

Once data is no longer within the retention period, and there are no backups that depend on it still within the retention period, data is considered to be recyclable.

When data is recyclable:

  • If it is on tape:
    • The data will remain available until the media is recycled. This will only happen once all the backups on the media are also recyclable, and either the administrator manually recycles the media or NetWorker re-uses it.
  • If it is on a disk backup unit (ADV_FILE) device:
    • The data will be erased from the disk backup unit the next time a volume clean operation is run, or nsrim is run (either as a standard overnight event by NetWorker, or manually via nsrim -X).

This isn’t the “whole picture” for data lifecycle within NetWorker, but it is a good brief overview to give you an idea of how data is managed within the environment.

Posted in Basics, NetWorker | Comments Off on Basics – NetWorker Data Lifecycle

Basics – Default pool debugging 101

Posted by Preston on 2009-10-15

Many of us with NetWorker have been in the situation where a backup has started (particularly when it’s for a newly configured group), and instead of going to the pool we want it to go to, it goes to the Default pool. For sites using multiple pools, it’s usually the case that no media will be in the Default pool, and hence the backup won’t go anywhere.

In those situations, determining why NetWorker is suddenly requesting media in the Default pool is quite easy. Sometimes however, the answer is not so easy. A media request may come out of the blue, with no server-initiated activities behind it, and nothing may be logged to indicate what is causing the request. It could be that an end-user is attempting to run a backup, or that a backup process that was server initiated has gone awry, restarted, and for some reason targeted the Default pool.

This leads me to what I’d call “Default pool debugging 101” … or “how to save yourself a lot of hair tearing”. I had a customer once who called me and expressed a level of exasperation over having already spent several days off and on chasing down what might be causing the persistent request for “1 writable volume in the Default pool”.

My solution in such situations is simple: if you can’t spot what is going wrong – why NetWorker is asking for media in the wrong pool – then label a volume into that pool and see what writes to it. In such cases one of three things will typically happen:

  1. The volume will be loaded but then not used because a process requested it, was aborted, and for some reason NetWorker didn’t detect the abort.
  2. The volume will be loaded and written to by a manual backup process, in which case the metadata for the backup can be used to identify who (or what) has sent the data to the wrong pool.
  3. The volume will be loaded and written to by an errant scheduled backup process that experienced some failure “a while ago”, in which case it can be staged, upon completion, to the correct pool.

I’m the first person to jump to the defense of elegant and well considered solutions. Doing the mundane thing of just labeling media into the “incorrect” pool that NetWorker is requesting media for smacks of inelegance or even a pseudo “brute force” approach. However, sometimes the easiest solution is also the best – instead of wasting considerable amounts of time chasing phantoms, why not just cut to the chase in such media situations where the solution isn’t obvious, and let NetWorker tell you where the request is coming from?
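As a loose illustration of the second outcome above: once something has written to the bait volume, the media database tells you who sent it. The sketch below uses a heredoc standing in for output from a query along the lines of `mminfo -q "pool=Default" -r client,name,savetime` – the client and saveset names are invented for illustration:

```shell
# Hypothetical sketch: identify what wrote to the bait volume in the Default
# pool. The heredoc stands in for real mminfo output, e.g. from:
#   mminfo -q "pool=Default" -r client,name,savetime
# (the client and saveset names below are invented)
mminfo_output() {
  cat <<'EOF'
archon.pmdg.lab /home/devuser 11/13/2009
EOF
}

mminfo_output | while read -r client saveset savetime; do
  echo "Default pool written to by $client ($saveset) at $savetime"
done
```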

Posted in Basics | Comments Off on Basics – Default pool debugging 101

Library sharing vs Dynamic Drive Sharing

Posted by Preston on 2009-10-12

Hi!

The text of this post is available at the NetWorker Information Blog. Click here to read it.

Posted in Backup theory, Basics, NetWorker | 4 Comments »

Basics – Standalone drive auto media management

Posted by Preston on 2009-09-23

I don’t have many customers with standalone tape drives. Usually when they do, it’s due to one of two reasons:

  • Purchased to support recovery from previous-format media during a format change.
  • Used in remote or satellite offices for local backups.

In the first instance, a company may, say, replace SDLT with LTO, but decide not to stage their long-term backups from SDLT to the replacement media. Instead, they may simply purchase a standalone SDLT drive so that future recovery requests can be met (albeit more slowly) through protein based autoloading.

In the second instance, a company may either run multiple NetWorker servers, or a WAN based datazone with storage nodes in satellite offices. In smaller offices, an autochanger may be either undesirable or represent too high a cost, and therefore one or more standalone tape drives may be deployed.

One of the questions that does inevitably come up whenever I do encounter people with standalone drives is “how can I make NetWorker just automatically load and use the tape that’s put in by the <janitor|secretary>?”

There are limits to what you can achieve when your tape operators have either (a) no technical skill or (b) no access to the hosts they are replacing media for, but there’s one thing that you can enable which will make your life slightly easier in these situations – standalone device auto media management.

When we normally think of auto media management, we think of tape libraries. In tape libraries, auto media management refers to one thing alone – having NetWorker automatically label previously unlabeled media when it reaches the point where no labeled media is available.

However, when auto media management is enabled for standalone tape drives, it fulfills two very useful functions. These are:

  • Recyclable volumes loaded into the drive are automatically recycled.
  • Unlabeled volumes loaded into the drive are automatically labeled. (From memory, this is to the Default pool, but in small satellite offices, that often ends up being used.)

These are done whenever the device is idle – i.e., when it’s not being used, NetWorker monitors the device for the above two situations and acts accordingly.

While this doesn’t solve all problems with tape management at satellite offices using standalone drives, it does at least help.

Posted in Basics, NetWorker | 2 Comments »

Basics – Directed Recoveries

Posted by Preston on 2009-09-21

Hi,

This blog post can now be read at the NetWorker Information Hub.

Posted in Basics, NetWorker | 2 Comments »

Basics – Peeking inside your jukebox without leaving your desk

Posted by Preston on 2009-09-10

In order to speed up jukebox operations, NetWorker maintains a cache, or a map, if you will, of the current expected jukebox state based on the operations that have happened since it was last fully queried. This avoids having to do (time) costly SCSI probes before every operation.

(This, for what it’s worth, is why you can’t have another process, or another person, playing with the jukebox as well as NetWorker. For instance, a customer once had their jukebox accessible to all the developers on-site. They found on average the jukebox got into a terrible state several times a day, and thought they had a lemon of a product (either NetWorker or the STK L700) until they found out that having developers open the library door, arbitrarily pull tapes out and put new tapes in was not a good idea.)

Coming back to jukeboxes though, there are times when the cache is out of sync with reality. A few of the more common scenarios where this will happen are:

  • In disaster recovery situations
  • In situations where someone has manually moved around media
  • In situations where NetWorker has lost track of state due to a lengthy timeout on an error

In situations such as these, there’s an invaluable tool called sjirdtag that can come to the rescue. Instead of checking with the NetWorker cached contents of the library, sjirdtag instead delves down into what the library describes as its own content. I.e., it’s like peeking inside the library without having to leave your desk.

In order to use sjirdtag, you need to know the SCSI control port of the library; this is reported in the library properties in NetWorker management console, or you can find it out relatively quickly via inquire:

[root@tara ~]# inquire -l

-l flag found: searching all LUNs, which may take over 10 minutes per adapter
 for some fibre channel adapters.  Please be patient.

scsidev@0.0.0:STK     L700            5500|Autochanger (Jukebox), /dev/sg1
                                           S/N:    XYZZY     
                                           ATNN=STK     L700            XYZZY     
                                           WWNN=5123456003030303
scsidev@0.1.0:QUANTUM SDLT600         5500|Tape, /dev/nst0
                                           S/N:    ZF7584364
                                           ATNN=QUANTUM SDLT600         ZF7584364
                                           WWNN=5123456003030303

In this case, our library (a VTL presenting itself as an STK L700) is on scsidev@0.0.0. So, when we want to check the contents of the library, we run the command sjirdtag 0.0.0 – which looks like the following:

[root@tara ~]# sjirdtag 0.0.0
Tag Data for 0.0.0, Element Type DATA TRANSPORT:
        Elem[001]: tag_val=0 pres_val=1 med_pres=0 med_side=0
Tag Data for 0.0.0, Element Type STORAGE:
        Elem[001]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800843S3                       >
        Elem[002]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800844S3                       >
        Elem[003]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800845S3                       >
        Elem[004]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800846S3                       >
        Elem[005]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800847S3                       >
        Elem[006]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800848S3                       >
        Elem[007]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800849S3                       >
        Elem[008]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800850S3                       >
        Elem[009]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800851S3                       >
        Elem[010]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800852S3                       >
        Elem[011]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800853S3                       >
        Elem[012]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800854S3                       >
        Elem[013]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800855S3                       >
        Elem[014]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800856S3                       >
        Elem[015]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800857S3                       >
        Elem[016]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800858S3                       >
        Elem[017]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800859S3                       >
        Elem[018]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800860S3                       >
        Elem[019]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800861S3                       >
        Elem[020]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800862S3                       >
        Elem[021]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<BIG990S3                       >
        Elem[022]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<BIG991S3                       >
        Elem[023]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<BIG992S3                       >
        Elem[024]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<BIG993S3                       >
        Elem[025]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<BIG994S3                       >
        Elem[026]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<BIG995S3                       >
        Elem[027]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<BIG996S3                       >
        Elem[028]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<BIG997S3                       >
        Elem[029]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<BIG998S3                       >
        Elem[030]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<BIG999S3                       >
        Elem[031]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<CLN001L1                       >
        Elem[032]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<CLN002L1                       >
Tag Data for 0.0.0, Element Type MEDIA TRANSPORT:
        Elem[001]: tag_val=0 pres_val=1 med_pres=0 med_side=0
Tag Data for 0.0.0, Element Type IMPORT/EXPORT:
        Elem[001]: tag_val=0 pres_val=1 inp_enab=1 exp_enab=1 access=1 full=0 imp_exp=1
        Elem[002]: tag_val=0 pres_val=1 inp_enab=1 exp_enab=1 access=1 full=0 imp_exp=1
        Elem[003]: tag_val=0 pres_val=1 inp_enab=1 exp_enab=1 access=1 full=0 imp_exp=1
        Elem[004]: tag_val=0 pres_val=1 inp_enab=1 exp_enab=1 access=1 full=0 imp_exp=1

For those who are unfamiliar with sjirdtag, let’s break this up into the four sections presented (using the capitalisation in the output – not shouting):

  • DATA TRANSPORT – Refers to the tape drives within the library – i.e., the units responsible for transporting the data.
  • STORAGE – The slots used by the library for storage of cartridges. This does not refer to the slot(s) in the CAP/MAS.
  • MEDIA TRANSPORT – The robot head(s); there will be one entry per robot head.
  • IMPORT/EXPORT – The contents of the slots in the CAP/MAS.

If you’re wondering about those element numbers, they’re essentially the positions or numbers of the units as assigned by the library. In particular, for the drives (DATA TRANSPORT) section, these refer to the drives in order as they are presented by the tape library; this means that if your operating system drive mappings don’t match the library sequence, the output here also won’t match the operating system sequence of devices.

Now for each element other than the CAP/MAS areas, we get the following selection of information:

tag_val=[0|1] pres_val=[0|1] med_pres=[0|1] med_side=[0|1]

Each of these items means:

  • tag_val – Indicates that there’s SCSI tag data for that element. 1 for yes, 0 for no.
  • med_pres – Jukebox state indicates that there is media present in this location. 1 for yes, 0 for no.
  • pres_val – A bit of an airy-fairy value; if set to 1, then it means that the med_pres value should be fairly believable. If set to 0 but the med_pres value is 1, then while there may be media present, there may also be an error condition. If set to 0, and med_pres is set to 0, then it also means that the med_pres value should be fairly believable.
  • med_side – For jukeboxes/media that supports double-sided media (e.g., older optical disk libraries), this indicates which side of the media is in use; for tape based libraries, this will always be 0.

For any element that has a volume with a barcode, this will be shown on the line underneath the element details with the format:

VolumeTag=<PCL                 >

For our import/export regions, the additional options – inp_enab, exp_enab, access, full and imp_exp – are effectively undocumented, but my assumptions on these items are:

  • inp_enab – Slot can be used for import.
  • exp_enab – Slot can be used for export.
  • access – Slot is accessible.
  • imp_exp – Slot is an import/export slot.

(The other option, “full”, most definitely indicates whether the slot is occupied or not.)
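Those field descriptions can be put to work with a little awk. The sketch below is purely illustrative – the heredoc is a cut-down excerpt of the sjirdtag session above – and reduces the STORAGE section to a list of occupied slots and their barcodes:

```shell
# Sketch: reduce sjirdtag STORAGE output to "slot: barcode" for occupied
# slots (med_pres=1). The heredoc is an excerpt of the session above.
sjirdtag_output() {
  cat <<'EOF'
        Elem[001]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800843S3                       >
        Elem[002]: tag_val=1 pres_val=1 med_pres=1 med_side=0
                   VolumeTag=<800844S3                       >
        Elem[003]: tag_val=0 pres_val=1 med_pres=0 med_side=0
EOF
}

sjirdtag_output | awk '
  /Elem\[/ {
    slot = $1
    sub(/^Elem\[/, "", slot); sub(/\]:$/, "", slot)
    occupied = /med_pres=1/        # remember whether this slot holds media
  }
  occupied && /VolumeTag=</ {
    tag = $0
    sub(/.*VolumeTag=</, "", tag); sub(/ *>.*/, "", tag)
    print "slot " slot ": " tag
  }'
```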

As can be evidenced by the “airy-fairy” nature of the pres_val tag, there’s no 100% guarantee that this information is physically accurate. However, it is an accurate reflection of the state that the library thinks it’s in, and thus is an accurate reflection of how the library will behave in response to requested operations. Furthermore, if the state shown by sjirdtag differs from the state shown by nsrjb, then it’s a good indication that it’s time to reset/reinventory the library. I.e., time to run:

# nsrjb -HEvvv
# nsrjb -II

(The reset instructs NetWorker to throw away its state information, tell the library to reinitialise itself, and then refresh the volume state. The inventory command specified assumes a barcode-supported library with barcoded volumes.)

Things that I routinely use (or get customers to use) sjirdtag for include:

  • Checking to see if there is a tape in a drive that NetWorker thinks is empty.
  • Checking to see if the tape NetWorker thinks is in a drive really is in the drive.
  • Checking to see if operators at a remote library have loaded media into the CAP/MAS.
  • Checking to see if there is a tape stuck in the robot gripper.
  • Finding the bootstrap volume when a disaster recovery (mmrecov) is required.

If you’ve not used sjirdtag before, it’s worthwhile scheduling a time where there’s minimal activity in the library so you can check it out.

Posted in Basics, NetWorker | Comments Off on Basics – Peeking inside your jukebox without leaving your desk

Basics – Staging

Posted by Preston on 2009-09-08

Hi!

The text of this post can be viewed over at the NetWorker Information Hub.

Posted in Basics | Comments Off on Basics – Staging

Basics – Perpetual overrides in schedules

Posted by Preston on 2009-09-07

In the dim dark days of NetWorker (e.g., v4 and v5), I used to have periodic cron jobs that would fire off on my local workstation – these would send me snappy messages, once every 3-6 months, along the lines of:

“Right you, time to update all the overrides in all the Daily/Monthly schedules!”

Of course, this referred to wanting to run backups which, say, had a monthly full on the last Friday of the month, and therefore skipped the daily backup on the final Friday of the month. NetWorker didn’t really support this other than by sitting in the schedule configuration and, one at a time, setting the required date at the end of each month to a Full or Skip depending on the schedule.

To say that it was tedious was a bit of an understatement.

Thankfully, in version 6 of NetWorker, a “bug” was introduced that sort of allowed this to be perpetually set; in v7 however, setting overrides perpetually became more readily available. Now, to me, it’s one of the most useful options that you can have within NetWorker’s schedules. I’ll show you how it works using the example above.

First, let’s create our Daily schedule; it will initially look like this:

Classic "Daily" Schedule

Now, in the old days (and using the old, hideous GUIs), if you wanted to set the last Friday of the month to skip, you’d have to do this for the last Friday of each month for as long as you were prepared to scan through:

Classic method for setting skip on final Friday of the month

As you can imagine, this would get very tedious, very quickly. Yes, you could script it using nsradmin, but quite frankly, at the time that I wanted to do this, I was a lazy scripter and hated working with dates. These days I’m a less lazy scripter, but I still try to avoid working with dates wherever possible.

That being said, there’s now a much easier way, and you do it by modifying the schedule view to turn off the calendar functionality. While that functionality is good for day to day browsing, it does obstruct some more powerful usage of the schedule system. To do so, right-click on the Daily schedule within the schedule list:

Turning off calendar view for schedules

Now, with that turned off, edit the Daily policy again – instead of coming up as a calendar, you’ll instead see:

Non-calendar view of Daily Schedule

In order to set a perpetual override, all we have to do is update the “Override” field with the following string:

skip last friday every month

Which will look like the following:

Setting the perpetual override for the Daily schedule

When we revert back to calendar view, we can then see the skip applied not only for the current month, but every month selected moving forward:

Daily schedule showing perpetual override

Now, here’s another trick by turning off calendar view. It’s easier to create monthly schedules. Remember the “action” field in the non-calendar view? That’s basically a list of levels, one per day, for the type of schedule we create – either daily or monthly. When it’s a daily schedule, it’s a list that starts on Sunday and finishes on Saturday. When it’s a monthly schedule, it starts on the first, and finishes on the thirty-first – any days that don’t exist in a month (e.g., the twenty-ninth through to the thirty-first in any regular February) are just ignored.

The trick to lazy schedule creation is knowing that if the list is shorter than the number of days specified by the schedule type, NetWorker will just loop the list again and again until it’s got the right number of entries. The net result is that you can create the Monthly schedule much more quickly. Rather than creating it as a Monthly schedule with “skip” set for every day of the month, or a Daily schedule with “skip” set for every day of the week, you can just do this:

Monthly schedule creation (non-calendar view)

When viewing this schedule as a calendar again, we can see that it works exactly as we want:

Perpetual monthly schedule in calendar view

It couldn’t be easier or simpler!
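The list-looping rule itself is easy to sketch outside NetWorker. The function below is purely illustrative, not NetWorker code – the action names and the 31-day month are my own assumptions – but it mimics how a short action list is recycled until every day of the month has a level:

```shell
# Illustrative sketch of the list-looping rule: a short action list is
# repeated until there is one level per day of the month.
expand_actions() {
  list=$1   # space-separated action list, e.g. "full incr incr incr incr incr incr"
  days=$2   # number of days to fill
  set -- $list
  n=$#
  day=1
  while [ "$day" -le "$days" ]; do
    idx=$(( (day - 1) % n + 1 ))
    eval "level=\${$idx}"          # pick the (day mod list-length)'th entry
    echo "day $day: $level"
    day=$((day + 1))
  done
}

# A 7-entry weekly pattern expanded over a 31-day month:
expand_actions "full incr incr incr incr incr incr" 31
```

For the Monthly schedule described above, the list can be as short as a single “skip”, with the perpetual override supplying the full.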

Posted in Basics, NetWorker | Comments Off on Basics – Perpetual overrides in schedules