The 12 Days of dd: Day Eleven
On the eleventh day of dd, we dd a tape (part two of two).
On the Tenth Day of dd we learned how to get our tape drive prepared for dd. Now we can begin the actual command line process.
With your tape in the drive, the first thing to do is rewind the tape and position it at the first file, file 0. Some tape drives, like my new Quantum SDLT 600, do this automatically when you put a new tape in, but others don't. Type the following:
# mt asf 0
The asf command tells the drive to position the tape at the beginning of the file whose number you give as the count argument (here 0). Positioning is done by first rewinding the tape and then spacing forward over that many filemarks. But let's check just to make sure. Type the following:
# mt status
This should verify that you are at the beginning of the tape and on file 0. The mt status command shows several codes that are useful to you:
BOT - Beginning Of Tape
EOT - End Of Tape
EOD - End Of Data
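As a rough sketch, this check can be scripted by looking for one of those codes in the status text. The helper name tape_flag_set is made up for this example, and it assumes your mt prints the codes as separate words and that the tape device is /dev/nst0; mt output differs between implementations, so look at yours first.

```shell
# Hypothetical helper: succeed if the given code (BOT, EOD, EOT)
# appears as a whole word in mt status text read from standard input.
tape_flag_set() {
    grep -qw -- "$1"
}

# Usage (assumes the tape device is /dev/nst0):
#   mt -f /dev/nst0 status | tape_flag_set BOT && echo "at beginning of tape"
```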
Next we start copying data and, more importantly, trying to guess what block size the backup software used when it wrote the tape. There is no real science to this part, just some logical guessing. First mount the target drive that you are going to dd the tape data out to; in our example we will assume that the target drive is /mnt/storage. Then start by trying the smallest block size first. Type the following:
# dcfldd if=/dev/nst0 of=/mnt/storage/file0.img bs=512 <---512 bytes; a trailing "b" would multiply the value by 512
If everything is good you will get a number of blocks copied from the tape to the /mnt/storage directory and dcfldd output that looks like:
20044080+0 records in
20044080+0 records out
20525137920 bytes transferred in 5665.925325 secs (3622557 bytes/sec)
Remember from my previous postings on dd that +0 is good and +1 or +2 is bad: the number after the plus sign is the count of partial records. You will also probably get some kind of error message from the tape drive if you didn't read any data. If the read failed you have to repeat the whole process with a larger block size. Type the following:
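One way to apply the +0 rule in a script is to pull the number after the plus sign out of the summary line. This is only a sketch: partial_records is a made-up helper name, and it assumes the "records in" line has the same format as the sample output. As a side note, 20525137920 bytes divided by 20044080 records is exactly 1024, so the sample figures correspond to a read with 1k blocks.

```shell
# Hypothetical helper: print the partial-record count (the number after "+")
# from a dd/dcfldd summary line such as "20044080+0 records in".
partial_records() {
    printf '%s\n' "$1" | sed -n 's/^[0-9][0-9]*+\([0-9][0-9]*\) records in$/\1/p'
}

# Sanity check on the sample figures: full records times block size gives bytes.
echo $((20044080 * 1024))    # prints 20525137920
```

A result of 0 from partial_records means every record was read at the full block size.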
# mt asf 0 <---to move the tape back to file 0
# mt status <---just to make sure we are back on file 0 and at BOT
# dcfldd if=/dev/nst0 of=/mnt/storage/file0.img bs=1k <---notice that we have incremented the block size to 1k
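The rewind-and-retry cycle can be sketched as a small shell loop. Treat this as an illustration rather than a finished tool: find_block_size is a made-up name, the list of candidate sizes is just a guess, the device is assumed to be /dev/nst0, and the loop assumes dcfldd exits with a non-zero status when the read fails. You should still look at the records in/out lines yourself before trusting the result.

```shell
# Sketch only: try each candidate block size until dcfldd exits successfully.
# Assumes the tape device is /dev/nst0 and that a failed read makes dcfldd
# exit non-zero. find_block_size is a made-up name, not a standard tool.
find_block_size() {
    file_no=$1    # tape file number to read, e.g. 0
    out=$2        # where to write the image, e.g. /mnt/storage/file0.img
    for bs in 512 1k 2k 4k 8k 16k 32k 64k; do
        mt -f /dev/nst0 asf "$file_no" || return 1   # reposition before each try
        if dcfldd if=/dev/nst0 of="$out" bs="$bs"; then
            echo "$bs"    # report the block size that worked
            return 0
        fi
    done
    return 1    # none of the candidates worked
}

# Usage:
#   find_block_size 0 /mnt/storage/file0.img
```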
Here is the bad news. You don't know how many files are going to be on the tape, and they may have different block sizes. I usually find that the blocks are 512 bytes, 1k, 4k, 16k, 32k or 64k. Tape backup software from different manufacturers can create files in a variety of block sizes on a tape. Some tape backup software creates tape lead-ins and lead-outs that are small, like two blocks of 512 bytes, and then writes all other files in 16k blocks.
Once you have successfully found the right block size and copied file0.img the tape will now be on file 1. It is important that you keep track of what file number you are working on so that when you do the "mt asf #" command you don't mess up and keep copying the same file over and over. For example after having successfully copied file 0 you would do the following:
# dcfldd if=/dev/nst0 of=/mnt/storage/file1.img bs=1k <---notice we changed the file name
If everything went well you just keep copying away but if it errors you have to go back to file 1 by typing:
# mt asf 1 <---rewinds the tape back to file 1
# dcfldd if=/dev/nst0 of=/mnt/storage/file1.img bs=2k <---notice we changed the block size
You want to just keep repeating this process until you get an EOD or EOT notice from mt status. When you are done copying all of the data off the tape you can rewind and eject the tape by typing:
# mt offline
Once you have finished copying all the data I recommend generating MD5 sums of all of the files. You could do this while doing the dcfldd but I like to write all of the hash values to a single file when I'm finished. The easiest way to do this is to use MD5deep. MD5deep allows you to recursively examine a directory and make MD5 hashes of every file in the directory and every subdirectory. Type the following:
# md5deep -r -e -l /mnt/storage/* > md5sums.txt
The "-r" tells md5deep to recurse, the "-e" provides an estimate of how long the md5 hash will take to compute for each file, and the "-l" tells md5deep to write relative file paths rather than absolute paths into the log file. Be sure to read the man page for md5deep for more options. Generating MD5 hashes of large files can take several hours; for example, a 100GB file takes about 3 1/2 hours to compute on a 3 GHz P4. It can take 10-15 hours to compute all of the MD5 hash values for an entire tape.
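If md5deep is not installed, a similar hash log can be sketched with md5sum from GNU coreutils, which is a different tool from the one used above. The function name hash_images and the log location are made up for this example. It hashes only files ending in .img and stores paths relative to the storage directory, so the log can be checked again later with md5sum -c.

```shell
# Sketch using GNU coreutils md5sum instead of md5deep.
# hash_images is a hypothetical helper: $1 is the image directory,
# $2 is the log file to write (keep it outside the image directory).
hash_images() {
    ( cd "$1" && find . -type f -name '*.img' -exec md5sum {} + ) > "$2"
}

# Usage:
#   hash_images /mnt/storage /root/md5sums.txt
#   ( cd /mnt/storage && md5sum -c /root/md5sums.txt )   # re-verify later
```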
Tomorrow, for the twelfth and final day of dd, we'll be examining how "losetup" can allow you to view your dd image files. Stay tuned!




