Posted by Preston on 2009-11-11
When I was at University, a philosophy lecturer remarked rather sagely that University is the last place people can go to learn for the sake of learning.
That’s sort of correct, but not always so. People can fumble through their jobs on a day-to-day basis learning only what they have to, but they can also approach their work by trying to soak up as much information as they can along the way. I’m not always a knowledge sponge (particularly if my caffeine quota is on the light side for the day), but I like to think I learn the odd thing here and there.
In the spirit of knowledge acquisition, here are a few smaller things I’ve learned recently:
- When simulating network connectivity problems, there’s a big difference between yanking the network cable and shutting down the network interface. (I was doing the interface shutdown – along the lines of the first sketch after this list – while another person was doing the network cable unplug, and our results didn’t correlate.) Lesson: When escalating a case to vendor support, always spell out how you’re simulating the “comms failure” a customer is having.
- The ‘bigasm’ utility starts to fall in a heap and becomes extremely unreliable once you exceed about 2100 GB of data generated for a single file. Lesson: When setting out to generate 2.3+ TB of backup data, create a bunch of files and have a bigasm directive generate a smaller amount of data per file. (The second sketch after this list shows the same spread-the-data-out idea in plain shell.)
- When setting up tests that will take a couple of days to run, always triple-check what you’re about to do before you start. Lesson: If you make a typo of 250 files at 100 GB each instead of 250 files at 10 GB each – that’s 25 TB rather than 2.5 TB – bigasm/NetWorker won’t infer what you really meant.
- There’s a hell of a difference between Solaris 10 AMD release 2 and release 8. Lesson: If wanting to get a Solaris 10 AMD 64-bit OS working in Parallels Desktop for Mac v5 with networking, go for release 8. It will save many forehead bruises.
- ext3 is about as “modern” a filesystem as I am an elite sportsperson. Lesson: If wanting to achieve decent operational activities with backup to disk under Linux, use XFS instead of ext3.
- All eSATA is not created equal. Lesson: When using a motherboard SATA -> eSATA bracket converter, make sure the dual drive dock you order doesn’t operate as a port multiplier – many onboard SATA controllers simply can’t talk to one.
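For what it’s worth, here’s the style of interface shutdown I was doing – a minimal sketch only, assuming a Linux client whose backup traffic runs over eth0 (substitute your own interface name). Bear in mind this downs the interface in software; to the other end of the link it looks quite different from a physical cable pull, which is precisely the discrepancy we hit.

```bash
#!/bin/sh
# Simulate a "comms failure" in software only -- NOT equivalent to a cable pull.
# eth0 is a placeholder; use whichever interface carries the backup traffic.
IFACE=eth0

ifconfig $IFACE down    # start the outage
sleep 300               # a five minute "failure" window
ifconfig $IFACE up      # restore connectivity
```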
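And on the spread-the-data-out front: if you just want filler data on disk and don’t need bigasm involved at all, a plain shell loop does the same job. A rough sketch only – the target directory, file count and per-file size are all assumptions to adjust (and /dev/zero output compresses suspiciously well, so don’t use it anywhere compression or dedupe would skew your numbers):

```bash
#!/bin/sh
# Generate 250 x 10 GB files (2.5 TB total) rather than one enormous file.
# /d/fill is a hypothetical target directory -- pick your own.
DIR=/d/fill
mkdir -p $DIR

i=1
while [ $i -le 250 ]; do
    # 10 GB per file. Note that 100 GB here would mean 25 TB in total --
    # triple check your zeroes before you walk away for two days.
    dd if=/dev/zero of=$DIR/file.$i bs=1M count=10240
    i=`expr $i + 1`
done
```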
Posted in Basics, General thoughts, NetWorker | Tagged: eSATA, ext3, network, parallels, xfs | Comments Off
Posted by Preston on 2009-11-05
Recently when I made an exasperated posting about lengthy ext3 check times and looking forward to btrfs, Siobhán Ellis pointed out that there was already a filesystem available for Linux that met a lot of my needs – particularly in the backup space, where I’m after:
- Being able to create large filesystems that don’t take exorbitantly long to check
- Being able to avoid checks on abrupt system resets
- Speeding up the removal of files when staging completes or large backups abort
That filesystem of course is XFS.
I’ve recently spent some time shuffling data around and presenting XFS filesystems to my Linux lab servers in place of ext3, and I’ll fully admit that I’m horribly embarrassed I hadn’t thought to try this out earlier. If anything, I’m stuck looking for the right superlative to describe the changes.
Case in point – I was (and indeed still am) doing some testing where I need to generate >2.5 TB of backup data from a Windows 32-bit client for a single saveset. As you can imagine, not only does this take a while to generate, it also takes a while to clear from disk. I had got about 400 GB into the saveset the first time I was testing when I realised I’d made a mistake with the setup, so I needed to stop and start again. On an ext3 filesystem, it took more than 10 minutes after cancelling the backup before the saveset had been fully deleted. It may have taken longer – I gave up waiting at that point, went to another terminal to do something else, and lost track of how long it actually took.
It was around that point that I recalled having had XFS recommended to me for testing purposes, so I downloaded the extra packages required to use XFS within CentOS and reformatted the ~3TB filesystem as XFS – roughly the steps sketched below.
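For anyone wanting to do the same, it boils down to just a few commands. This is a sketch only – I’m assuming CentOS 5 here (package names differ between releases; on CentOS 5 the XFS kernel module lived in the extras repository), and /dev/sdb1 and /d/backup are hypothetical stand-ins for your own device and mount point:

```bash
# Userspace tools plus the kernel module for XFS.
yum install xfsprogs kmod-xfs

# mkfs.xfs won't clobber an existing filesystem without -f,
# which you'll need when converting a device over from ext3.
mkfs.xfs -f /dev/sdb1

# Mount it, and remember to update /etc/fstab for reboots.
mount -t xfs /dev/sdb1 /d/backup
```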
The next test that I ran aborted due to a (!!!) comms error 1.8 TB into the backup. Guess how long it took to clear the space? No, seriously, guess – because I couldn’t log onto the test server fast enough to actually see the space clearing. The backup aborted, and the space was suddenly back again. That’s a 1.8 TB file deleted in seconds.
That’s the way a filesystem should work.
I’ve since done some nasty mid-operation power-cycle tests (in VMs), and the XFS filesystems come back up practically instantaneously – no extended check sessions that make you want to cry in frustration.
If you’re backing up to disk on Linux, you’d be mad to use anything other than XFS as your filesystem. Quite frankly, I’m kicking myself that I didn’t do this years ago.
Posted in Linux, NetWorker | Tagged: ADV_FILE, backup to disk, ext3, filesystem, Linux, xfs | 8 Comments »
Posted by Preston on 2009-08-02
This morning I had to replace half of a mirror in my Linux server, which, being a home server, meant needing to reboot. (Of course I’d love a server with hot-swappable drives, but I suspect both my partner and I might find the noise of such a server somewhat overwhelming for a combination computer-room/office.)
So I shut down the virtual machines running on the HP ML110 G4 (running VMware Server), shut down the server itself, swapped the drives and rebooted. I promised my partner that internet access would only be down for about 15 minutes, so of course Murphy decided to pay a visit.
“Warning, /dev/sdc1 has not been checked in 191 days. Check forced.”
And wouldn’t you know it, /dev/sdc1 is 917 GB, so the check took quite a lot longer than 15 minutes. It’s running ext3, so periodic checks are forced less frequently than they would be on a non-journalled filesystem, but I’m still paranoid enough about the data housed on that particular filesystem that I’d rather not turn checking off altogether. (That filesystem isn’t mirrored, due to the transient data on it. The check schedule itself is tunable, as sketched below.)
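For the record, those periodic checks are tunable rather than all-or-nothing – a quick sketch of the knobs, using my /dev/sdc1 (this stretches the interval rather than disabling it):

```bash
# Show the current forced-check settings (mount count and interval).
tune2fs -l /dev/sdc1 | egrep -i 'mount count|check'

# Check every 50 mounts, or every 6 months, whichever comes first.
tune2fs -c 50 /dev/sdc1
tune2fs -i 6m /dev/sdc1
```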
Having only just read about btrfs, an upcoming filesystem for Linux, I felt the lengthy delay-on-boot caused by a “large” filesystem check acutely hammered the point home. Amongst other things, btrfs promises online checks of the filesystem, as well as fast offline checks – something every storage administrator and system administrator wants. We are at the point where filesystem capacities are routinely too large for conventional exhaustive checks. Many more modern filesystems (ZFS and VxFS, for instance) have already achieved this, but it’s a relief to know that such advances are coming, and coming with corporate sponsorship, to Linux.
For an excellent overview of btrfs, check out this short history published on Linux Weekly News.
Posted in Aside | Tagged: btrfs, ext3, fsck | 2 Comments »