The Joys of Shell Pipelines ... (Shallow Thoughts)

Akkana's Musings on Open Source, Science, and Nature.

Fri, 29 Dec 2006

The Joys of Shell Pipelines ...

A friend called me for help with a sysadmin problem they were having at work. The problem: find all files bigger than one gigabyte, print all the filenames, add up all the sizes and print the total. And for some reason (not explained to me) they needed to do this all in one command line.

This is Unix, so of course it's possible somehow! The obvious place to start is with the find command, and man find showed how to find all the 1G+ files:

find / -size +1G

(Turns out that's a GNU find syntax, and BSD find, on OS X, doesn't support it. I left it to my friend to check man find for the OS X equivalent of -size _1G.)

But for a problem like this, it's pretty clear we'd need to get find to execute a program that prints both the filename and the size. Initially I used ls -ls, but Saz (who was helping on IRC) pointed out that du on a file also does that, and looks a bit cleaner. With find's unfortunate syntax, that becomes:

find / -size +1G -exec du "{}" \;

But now we needed awk, to collect and add up all the sizes while printing just the filenames. A little googling (since I don't use awk very often) and experimenting led to the final solution:

find / -size +1G -exec du "{}" \; | awk '{print $2; total += $1} END { print "Total is", total}'

Ah, the joys of Unix shell pipelines!

Update: Ed Davies suggested an easier way to do the same thing. turns out du will handle it all by itself: du -hc `find . -size +1G` Thanks, Ed!

Tags: , , , ,
[ 16:53 Dec 29, 2006    More linux | permalink to this entry ]