Cloning Worker Nodes with dd
This is the second page of a two-part tutorial on cloning worker nodes. The full tutorial includes:
- Cloning Worker Nodes
- Cloning Worker Nodes with dd
Data Dumping: dd
Performing a data dump, or "dd", is a quick and dirty way to image a hard drive. Unlike Udpcast, where udp-sender broadcasts its image over the network and all of the other nodes receive it at the same time, a data dump works point-to-point. The other main difference is that a dd runs on a single machine, with both the source and the destination hard drives connected to it. Multiple data dumps can run at the same time, but each one is a separate process, and running too many at once will bog down the system.
Preparing for the dd
The first step is to take the hard drives out of the worker nodes (unless they're already out) and put them all into another machine. This can be the machine holding the hard drive that's already set up, but it doesn't need to be. The operating system on the drive that's going to be cloned must not be running while the copy takes place. The easiest way to ensure this is to boot from a live CD, like Ubuntu's live CD or Knoppix. After all of the drives are hooked up, turn the machine on and boot from the CD.
Next, you'll need to become root. If you're using an Ubuntu CD, do this by issuing sudo su -. Otherwise, become root as you would normally. Then run
fdisk -l
If you don't run it as root, it will finish without listing any drives. What you want to see is something like this:
root@ubuntu:~$ fdisk -l

Disk /dev/sda: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1        9483    76172166   83  Linux
/dev/sda2            9484        9729     1975995   82  Linux swap / Solaris

Disk /dev/sdb: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        9484    76180198+  83  Linux
/dev/sdb2            9485        9728     1959930   83  Linux

Disk /dev/sdc: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1        9483    76172166   83  Linux
/dev/sdc2            9484        9729     1975995   82  Linux swap / Solaris
fdisk -l shows all of the hard drives currently plugged into the system. In the example above, I have three hard drives plugged in: /dev/sda, /dev/sdb, and /dev/sdc. The sd prefix comes from the kernel's SCSI disk driver, which also handles SATA drives; on older kernels, IDE drives show up as hd-something.
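With several drives attached, the fdisk listing gets long. As a minimal sketch, the drive names can be pulled out of it with awk. The function name list_disks is made up for this example, and it assumes each drive is introduced by a line starting with "Disk /dev/", as in the output above.

```shell
# Hypothetical helper: print the device name from every
# "Disk /dev/...:" header line read on standard input.
list_disks() {
    awk '/^Disk \/dev\// { sub(/:$/, "", $2); print $2 }'
}

# Typical use, as root:
#   fdisk -l | list_disks
```

For the three drives shown above, this would print /dev/sda, /dev/sdb, and /dev/sdc, one per line.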
Next, we need to figure out which hard drive is the one that's already set up. If you don't already have an operating system installed on the others, you'll be able to see the difference when using fdisk. But if you already do, and you can't tell which hard drive should be the source for the data dump, you can start mounting them one at a time and checking them. To mount a drive, first create a directory
mkdir -p /mnt
and then mount one of the hard drives
mount <your hard drive>1 /mnt
The 1 is important. Rather than mounting an entire hard drive (including the swap space), you want to mount only the partition that contains a filesystem. For instance, mounting one of mine would be
mount /dev/sda1 /mnt
instead of mount /dev/sda /mnt. To unmount, use
umount /mnt
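Checking several drives one at a time gets repetitive, so here is a rough sketch of a loop that does it. The names run and inspect_disks and the DRY_RUN variable are invented for this example. It assumes the filesystem is on partition 1 of each drive, as above, and it mounts read-only (-o ro) so that looking around can't change anything.

```shell
# Hypothetical helper: with DRY_RUN=1, print a command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount partition 1 of each given drive read-only, list its top-level
# directory, and unmount it again. Drives that fail to mount are skipped.
inspect_disks() {
    for disk in "$@"; do
        echo "== ${disk}1 =="
        run mount -o ro "${disk}1" /mnt || continue
        run ls /mnt
        run umount /mnt
    done
}

# Preview the commands first, then run them for real as root:
#   DRY_RUN=1 inspect_disks /dev/sda /dev/sdb /dev/sdc
#   inspect_disks /dev/sda /dev/sdb /dev/sdc
```

The drive that's already set up should be recognizable from its directory listing.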
Running dd
Make sure you know which hard drive will be the source before you run dd. It's very easy to accidentally overwrite the drive that you've put effort into setting up with the contents of another drive. Once you know which drive it is and you're ready to continue, unmount all of the drives (see above).
The dd command takes several arguments. By default, it takes input from standard input and writes it to standard output. (Try this by running dd by itself, typing a few lines, and then ending it with Ctrl-D.) Instead, we want it to take input from one hard drive and output it to another hard drive. This is the "dirty" part of "quick and dirty" - dd has no awareness of partitions or filesystems; it reads the source byte for byte and writes that exact sequence to the other hard drive. We could just as easily output it to a file and make a copy of the hard drive.
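To get a feel for dd before pointing it at real drives, it can be tried on ordinary files. The sketch below works in a temporary directory; the file names source.img and copy.img are made up for the example.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)

# With no arguments, dd copies standard input to standard output.
# Its transfer statistics go to standard error, which is hidden here.
echo "hello from dd" | dd 2>/dev/null

# With if= and of=, dd copies one file to another, byte for byte.
printf 'sample disk contents\n' > "$workdir/source.img"
dd if="$workdir/source.img" of="$workdir/copy.img" 2>/dev/null

# cmp exits with status 0 only when the two files are identical.
cmp "$workdir/source.img" "$workdir/copy.img" && echo "copies match"

# Clean up afterwards with: rm -r "$workdir"
```

Copying a drive works the same way, except that if= and of= name devices such as /dev/sda instead of regular files.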
The input and output are specified with if= (input file equals) and of= (output file equals). The full syntax is
dd if=<prepared hard drive> of=<hard drive to image> &
For instance, mine from above might be
dd if=/dev/sda of=/dev/sdb &
Notice this time that we're not specifying the partitions (there's no number) - we want the entire hard drive to be copied so that the new hard drive gets the partition table and a swap space, too, and not just the file system. The & makes the process run in the background; this is fine because it won't give any output as it progresses anyway.
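Since swapping if= and of= destroys the prepared drive, a few sanity checks before starting may be worthwhile. This is only a sketch: check_clone is a made-up name, the mount check relies on /proc/mounts (so Linux only), and passing these checks does not prove that the right drive was chosen as the source.

```shell
# Hypothetical pre-flight check: refuse obviously dangerous
# source/target pairs before any data is written.
check_clone() {
    src=$1
    dst=$2
    if [ "$src" = "$dst" ]; then
        echo "source and target are the same device" >&2
        return 1
    fi
    for dev in "$src" "$dst"; do
        if [ ! -b "$dev" ]; then
            echo "$dev is not a block device" >&2
            return 1
        fi
        # /proc/mounts lists mounted devices in its first column.
        if grep -q "^$dev" /proc/mounts 2>/dev/null; then
            echo "$dev (or one of its partitions) is still mounted" >&2
            return 1
        fi
    done
    return 0
}

# Example: only start the copy if the checks pass.
#   check_clone /dev/sda /dev/sdb && dd if=/dev/sda of=/dev/sdb &
```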
You can run multiple data dumps at the same time, but each one has to be started separately. You can use top to monitor how long the processes have been running and to tell when they have stopped. Each one takes a while. I wouldn't recommend running more than four at once; run fewer if the machine has a slow processor or is low on RAM.
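As an alternative to watching top, the shell itself can wait for the background copies to finish. The following sketch starts one dd per target and then blocks until all of them have exited. The function name clone_all is made up for this example, and it assumes the first argument is the prepared drive.

```shell
# Hypothetical helper: copy one source to several targets in parallel,
# then wait until every dd has exited.
# Returns 0 only if all of the copies succeeded.
clone_all() {
    src=$1
    shift
    pids=""
    for dst in "$@"; do
        dd if="$src" of="$dst" &
        pids="$pids $!"      # remember the process ID of this dd
    done
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=1
    done
    return "$failed"
}

# Example, as root and with every drive unmounted:
#   clone_all /dev/sda /dev/sdb /dev/sdc
```

With GNU dd, sending a running copy the USR1 signal (kill -USR1 followed by its process ID) makes it print how much it has transferred so far; other dd implementations may behave differently.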