Monday, June 11, 2007

You want What???!??

One of the joys of being director of archaeological computing on a small dig is the regular occurrence of moments like the one that happened yesterday. Our ceramics expert made a simple request: "Could you please print out a copy of each of the drawings of the registered pottery we've scanned so far? Oh, and it needs to be reduced 1:5. And they really should be labeled by the registry name. Oh, and I really need it by breakfast at 7:30."
Seems simple enough. But it isn't. For one thing, there are almost 400 images to print out.
Another thing: these images are of varying sizes, ranging from 453 pixels wide by 340 pixels tall to about 3000 pixels wide and 5000 pixels tall. And it really wouldn't do to blast through 400 pages of paper, considering that
  • we're low on toner
  • we only have about 450 sheets of blank paper on hand
  • many of the images would print out only to 2 or 3 inches square
What's a geek to do? Most people would say, "Just select all the pictures in an Explorer window, select 'Print', and off you go!" But if you do that (or use PhotoAlbum or a similar graphical front end for photographs), you can't guarantee the proper scaling of the resulting printout. Of course, I immediately thought to myself, "If only I had a Linux distribution handy!" But no, this excavation is all Windows based, and although I could use Ubuntu, I actually need the computers for other Windows tasks. I could use the VMware image of Linux I have available on the dig's main computer, but it would indubitably run too slowly for this number-crunching task. Maybe I could get Windows to act more like Linux, and allow me to do some shell scripting, a little awk here, a grep there, sed, vi ... and of course, ImageMagick.

Now, if you've not heard of ImageMagick, you've missed out on one of the treats of a geek's life.
ImageMagick (or IM, as it's known to aficionados) is a wonderful Swiss Army knife of tools you can apply to a graphical image to transform it in various and sundry ways. Any software that comes with a command 'mogrify' rates way high on the geek-must-have scale. I have used IM in the past to splice images together in a 'montage', and this sounded like just the tool for this job.


So how do I go about turning a Windows computer into something at least mostly usable from the command line? Install the Cygwin Unix command equivalents, GIMP, ImageMagick, and Vim, the improved vi editor.


Then, write a series of batch files that transform the data... which I will include below for the enjoyment of those who know what I'm talking about and in order to impress those who don't. :)

Run the batch files in order, then use GIMP to change the resolution and print the results. Easy shmeasy!

So, now that I've written the scripts, I am about ready to run them, which I estimate will take 8 hours of solid computing time. [Update: It turned out to take only 2 hours, plus another hour to print.]

Happy Mogrifying!

John

Example input drawing :


Example output file (made smaller for this blog):


Assumptions:

- All operations unless otherwise noted take place in

My Documents\Zeitah\pics\Registered Item Drawings\test

- Originals are one level above and are named P1.jpg through Pxxxxx.jpg

- Original scans are at 150 dpi

- We want to maximize the number of images printed per page
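
Since the scans are at 150 dpi, the print resolution needed for a given reduction is simple arithmetic: printing the same pixels at a higher dpi shrinks the drawing by that factor. Here is a minimal sketch of that reasoning in Python. The function names are mine and purely illustrative; they are not part of the batch files.

```python
SCAN_DPI = 150  # resolution of the original scans, per the assumptions above


def print_dpi(reduction, scan_dpi=SCAN_DPI):
    """Resolution to print at so a drawing comes out at 1:reduction.

    Cramming `reduction` times as many pixels into each printed inch
    shrinks the drawing by that factor without resampling it.
    """
    return scan_dpi * reduction


def printed_inches(pixels, reduction, scan_dpi=SCAN_DPI):
    """Printed length of `pixels` scanned pixels at 1:reduction."""
    return pixels / print_dpi(reduction, scan_dpi)


if __name__ == "__main__":
    for reduction in (4, 5):
        print("1:%d needs %d dpi" % (reduction, print_dpi(reduction)))
    # Widths of the smallest and largest scans mentioned in the post.
    for width in (453, 3000):
        print("%d px wide prints %.2f in wide at 1:5"
              % (width, printed_inches(width, 5)))
```

At 1:5 the smallest scans come out well under an inch wide, which is why packing many of them onto one page matters so much.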

Procedure:

  1. Open a command line (Start | Run | cmd) and run c:\unixutils.bat, which adds the Cygwin tools to your command path.
  2. cd "My Documents\Zeitah\pics\Registered Item Drawings"
  3. Create test if it doesn’t already exist:

a. mkdir test

  4. Create a list of file names and their sizes:

a. identify P*.jpg | sort -n +2 | sed "s/x/ /" | sed "s/\[.*\]//" >test\identify.txt

  5. cd test

  6. Sort the file names into size classes (1classify.bat):

a. awk "{ if ( $3 > 1 && $3 <= 1200 && $4 <= 800 ) { print $1 } }" identify.txt | sort -n +0.1 >fivebyfive.txt

b. awk "{ if ( $3 > 1200 && $3 <= 1500 && $4 <= 1000 ) { print $1 } }" identify.txt | sort -n +0.1 >fourbyfour.txt

c. awk "{ if ( $3 > 1500 && $3 <= 2000 && $4 <= 1300 ) { print $1 } }" identify.txt | sort -n +0.1 >threebythree.txt

d. awk "{ if ( $3 > 2000 && $3 <= 3000 && $4 < 2000 ) { print $1 } }" identify.txt | sort -n +0.1 >twobytwo.txt

e. awk "{ if ( $3 > 3000 ) { print $1 } }" identify.txt | sort -n +0.1 >onebyone.txt
  7. Create commands to extend the files to consistent sizes (2mkextent.bat):

a. sed "s/^/mogrify -extent 1200x800 /" fivebyfive.txt >fiveextent.bat

b. sed "s/^/mogrify -extent 1500x1000 /" fourbyfour.txt >fourextent.bat

c. sed "s/^/mogrify -extent 2000x1300 /" threebythree.txt >threeextent.bat

d. sed "s/^/mogrify -extent 3000x2000 /" twobytwo.txt >twoextent.bat
  8. Extend the canvas sizes (3doextent.bat)
  9. Create montage commands (4mkmontage.bat):

a. fmt -250 fivebyfive.txt | sed "s/^/montage -frame 10 -geometry 1200x800 -pointsize 48 -label %%f /" | sed "s/$/ montage5.jpg/" >domontage5.bat

b. fmt -160 fourbyfour.txt | sed "s/^/montage -frame 10 -geometry 1200x800 -pointsize 48 -label %%f /" | sed "s/$/ montage4.jpg/" >domontage4.bat

c. fmt -90 threebythree.txt | sed "s/^/montage -frame 10 -geometry 1200x800 -pointsize 48 -label %%f /" | sed "s/$/ montage3.jpg/" >domontage3.bat

d. fmt -40 twobytwo.txt | sed "s/^/montage -frame 10 -geometry 1200x800 -pointsize 48 -label %%f /" | sed "s/$/ montage2.jpg/" >domontage2.bat

e. sed domontage1.bat
  10. Run montage commands (5domontage.bat):

a. call domontage5.bat

b. call domontage4.bat

c. call domontage3.bat

d. call domontage2.bat

e. call domontage1.bat
  11. Open each montage file in GIMP.
  12. Change the page setup for maximum printing area.
  13. Change the resolution from 72 to 600 dpi (for 1:4) or 750 dpi (for 1:5), that is, the 150 dpi scan resolution times the reduction factor.
  14. Print the resulting pages.
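
For anyone who would rather follow the classification logic without wading through awk and sed, here is a rough Python sketch of what the classify and montage-grouping steps do. It is not what I ran on the dig. The thresholds are copied from the awk filters, but the images-per-page counts (25, 16, 9, 4, 1) are my reading of the class names, and the sample lines are made up to look like identify.txt after the sed cleanup.

```python
import re
from collections import defaultdict

# Images per montage page for each size class (assumed from the class names).
PER_PAGE = {"fivebyfive": 25, "fourbyfour": 16, "threebythree": 9,
            "twobytwo": 4, "onebyone": 1}


def parse_line(line):
    """Split one identify.txt line into (name, width, height).

    After the sed cleanup, field 1 is the file name, field 3 the width
    and field 4 the height, which is what the awk filters rely on.
    """
    fields = line.split()
    return fields[0], int(fields[2]), int(fields[3])


def classify(width, height):
    """Mirror the awk filters; return None if no filter matches."""
    if 1 < width <= 1200 and height <= 800:
        return "fivebyfive"
    if 1200 < width <= 1500 and height <= 1000:
        return "fourbyfour"
    if 1500 < width <= 2000 and height <= 1300:
        return "threebythree"
    if 2000 < width <= 3000 and height < 2000:
        return "twobytwo"
    if width > 3000:
        return "onebyone"
    return None


def registry_number(name):
    """Numeric part of a name such as P12.jpg, used as the sort key."""
    match = re.search(r"\d+", name)
    return int(match.group()) if match else 0


def group(lines):
    """Bucket file names by size class, sorted by registry number."""
    buckets = defaultdict(list)
    for line in lines:
        name, width, height = parse_line(line)
        buckets[classify(width, height)].append(name)
    for names in buckets.values():
        names.sort(key=registry_number)
    return dict(buckets)


def pages(names, per_page):
    """Cut a list of names into page-sized chunks, one montage each."""
    return [names[i:i + per_page] for i in range(0, len(names), per_page)]


if __name__ == "__main__":
    sample = [  # hypothetical lines, not real dig data
        "P10.jpg JPEG 453 340 DirectClass 8-bit",
        "P2.jpg JPEG 1100 700 DirectClass 8-bit",
        "P7.jpg JPEG 3000 5000 DirectClass 8-bit",
    ]
    for size_class, names in group(sample).items():
        per_page = PER_PAGE.get(size_class, 1)
        print(size_class, pages(names, per_page))
```

One thing the sketch makes visible: an image that is narrow but tall (3000 wide by 5000 tall, say) fails every filter as written and lands in no list at all, so it is worth checking for leftovers before printing.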
