Thursday, July 25, 2013

Multiple output files with dd utility


This is a note for personal reference and in case anybody finds this while searching some day.
Did you know that it is possible to redirect output not just to multiple files, but multiple commands in a shell? I speak in the context of Bash on Linux but this probably applies to some other environments too.
A few days ago I wondered if I could write an image file dumped with dd to two devices at once.
Copying to one is easy, and goes something like this:
# >   dd if=image.bin of=/dev/sdc
The “if” and “of” arguments specify the input file and the output file. /dev/sdc might point to a hard drive or USB stick (on my system today, it was a Compact Flash card). Please don’t try to run it if you don’t understand what it’s doing, or you might destroy some data :) .
So what about two outputs? This won’t work:
# >   dd if=image.bin of=/dev/sdc of=/dev/sdh
It will run, but not the way I want it to. dd will ignore all “of” arguments except for the last one.
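A harmless way to check this is to point dd at ordinary files instead of devices. The sketch below assumes GNU dd and uses made-up scratch files in a temporary directory; only the file named by the last “of” should show up in the listing.

```shell
# Scratch directory so nothing touches a real device (all paths are hypothetical).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

printf 'pretend disk image\n' > "$tmp/image.bin"

# Two "of" arguments: with GNU dd, only the last one is used.
dd if="$tmp/image.bin" of="$tmp/first.out" of="$tmp/second.out" 2>/dev/null

# List what was actually created.
ls "$tmp"
```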
I can do it like this, by just running two copies at the same time:
# >   dd if=image.bin of=/dev/sdc & dd if=image.bin of=/dev/sdh &
The ampersands put the commands into the background so that multiple commands can run at once. This works well, but it feels like a waste to read the input file from disk twice!
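If this goes into a script, the shell’s wait builtin blocks until both background copies have finished. A minimal sketch, with ordinary scratch files standing in for /dev/sdc and /dev/sdh (the file names are invented for illustration):

```shell
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# A small stand-in for image.bin.
head -c 65536 /dev/urandom > "$tmp/image.bin"

# Start both copies in the background; each one reads the input itself.
dd if="$tmp/image.bin" of="$tmp/copy1.img" 2>/dev/null &
dd if="$tmp/image.bin" of="$tmp/copy2.img" 2>/dev/null &

# Block until every background job has exited.
wait

# cmp returns success only when the two files are identical.
cmp "$tmp/image.bin" "$tmp/copy1.img" \
  && cmp "$tmp/image.bin" "$tmp/copy2.img" \
  && echo "both copies match"
```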
I am guessing that thanks to lots of RAM and some caching, it is fast enough, but it made me wonder… can I reuse the same disk reads by way of some output redirection in the shell?
Already having some understanding of standard in, standard out, and pipes, I knew I could do this much:
# >   dd if=image.bin | dd of=/dev/sdh
dd will write to standard output if it is not given an “of” argument, and it reads from standard input if it is not given an “if” argument. The pipe character, |, connects the standard output of one command to the standard input of another.
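The same pipe can be tried without touching a real device. In this sketch, scratch files stand in for image.bin and /dev/sdh (the names are placeholders, not anything from my system):

```shell
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# A small stand-in for image.bin.
head -c 65536 /dev/urandom > "$tmp/image.bin"

# The first dd has no "of", so it writes to standard output;
# the second dd has no "if", so it reads from standard input.
dd if="$tmp/image.bin" 2>/dev/null | dd of="$tmp/copy.img" 2>/dev/null

# cmp returns success only when the two files are identical.
cmp "$tmp/image.bin" "$tmp/copy.img" && echo "copy matches"
```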
I even knew that the tee command could be used to split standard input into multiple outputs.
# >   echo "hello" | tee file1.txt file2.txt
The above will write hello to file1.txt, file2.txt, and standard out. I might even be able to output from tee directly to a block device like /dev/sdc, but I’m not sure it will treat all those bits and blocks as nicely as dd will. I want to send the output to several more dd processes. So how do I do it?
The key is the >() construct, known as process substitution, which is a shell feature that allows an output command to be used instead of an output file. The shell starts the command inside the parentheses and substitutes a file name in place of the whole expression. That name is not a regular temporary file: on Linux it is typically a path such as /dev/fd/63, and on systems without /dev/fd it is a named pipe. The command sending the output writes to that name as if it were a file, and the data arrives on the standard input of whatever command is inside the parentheses. I knew this would be possible somehow!
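To see what the shell actually puts in place of a >() expression, you can hand one to echo, which simply prints its arguments. This is only a small sketch; it needs Bash (plain POSIX sh does not have this feature), and the variable name is my own.

```shell
#!/usr/bin/env bash
# Requires bash. echo prints its arguments, so this captures the word
# that the shell put in place of >(true).
substituted=$(echo >(true))

echo "The shell replaced >(true) with: $substituted"
```

On a typical Linux system this prints a path under /dev/fd; the exact number can vary.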
That explanation probably got a little confusing, so here is an attempt that I came up with:
# >   dd if=image.bin | tee >(dd of=/dev/sdc) >(dd of=/dev/sdh)
You could do it that way, but since tee writes to standard output in addition to any files it is given, you will dump an extra copy of all the bits right into your terminal output, and it will make a big mess. Better to catch tee’s standard output in a nice pipe for the last dd.
# >   dd if=image.bin | tee >(dd of=/dev/sdc) | dd of=/dev/sdh
And that’s it! The >() is something I just picked up today, so if you understand it better than I do, please correct me on any mistakes you noticed.
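For anyone who wants to try the final pipeline without risking a disk, here is a sketch that uses scratch files in place of /dev/sdc and /dev/sdh. The file names and the polling loop are my own additions. The loop is there because the dd inside >() runs asynchronously, so it may still be finishing when the pipeline returns.

```shell
#!/usr/bin/env bash
# Requires bash: >() is not available in plain POSIX sh.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# A 1 MiB stand-in for image.bin.
head -c 1048576 /dev/urandom > "$tmp/image.bin"

# One read of the input, two writers: tee feeds the first dd through >()
# and the second dd through the ordinary pipe.
dd if="$tmp/image.bin" 2>/dev/null \
  | tee >(dd of="$tmp/copy1.img" 2>/dev/null) \
  | dd of="$tmp/copy2.img" 2>/dev/null

# The dd inside >() may finish slightly after the pipeline returns,
# so poll for up to about five seconds before checking the result.
tries=0
until cmp -s "$tmp/image.bin" "$tmp/copy1.img" 2>/dev/null || [ "$tries" -ge 50 ]; do
  sleep 0.1
  tries=$((tries + 1))
done

cmp "$tmp/image.bin" "$tmp/copy1.img" \
  && cmp "$tmp/image.bin" "$tmp/copy2.img" \
  && echo "both copies match"
```

With real devices the same caveat applies: it is worth letting every dd finish (and running sync) before pulling a card out.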

One Response to “Multiple output files with dd utility”

  1. greg Says:
    You can also use the dcfldd command; it lets you use as many “of=” arguments as you want on the command line. I wish they had named it dddcfl or something else entirely, as I don’t use dd or dcfldd very often and it’s hard to remember dcfldd. I found this blog post because I did a Google search because I couldn’t remember how to spell it. I have no idea what kind of overhead tee causes, but I’d assume dcfldd is more efficient since it’s all one binary executable.
