I find find(1) to be useful

I recently shared Tom Limoncelli’s excellent critique of the BSD find(1) man page in the documentation channel at work. One of my coworkers responded with “that’s why I just use mlocate”, and that made me very sad. Sure, mlocate is a great tool if you know there’s a file somewhere that has a particular name (assuming it was created before the last time updatedb was run), but that’s about the best you can do.

There are plenty of examples out there of how to use find, but I haven’t written a “here’s a basic thing about Linux” post in a while, so I’ll add to the pile. find takes, at a minimum, a path to find things in. For example:

find /

will find (and print) every file on the system. Probably not all that useful. You can change the path argument to narrow things down a bit, but that’s still probably not all that useful to you. So let’s throw in some additional arguments to constrain it. Maybe you want to find all the JPEG files in your home directory?

find ~ -name '*jpg'

But wait! What if some of them have an uppercase extension?

find ~ -iname '*jpg'

Aw, but I bet some of the pictures have an extension of .jpeg because 8.3 is so 1985. Well, we can combine them in a slightly ugly fashion:

find ~ \( -iname '*jpeg' -o -iname '*jpg' \)

Oh, but you have some directories that end in jpg? (Why you named a directory “bucketofjpg” instead of “pictures” is beyond me.) We can modify it to only look for files!

find ~ \( -iname '*jpeg' -o -iname '*jpg' \) -type f

Or maybe you’d just like to find those directories so you can rename them later:

find ~ \( -iname '*jpeg' -o -iname '*jpg' \) -type d
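The escaped parentheses in those commands are doing real work: in find, the implicit "and" between tests binds more tightly than -o, so without grouping the -type test only applies to the second name pattern. Here is a small sketch of the difference, run in a throwaway directory (the sandbox and the file names in it are made up for illustration):

```shell
# Hypothetical sandbox so nothing in your real home directory is touched.
tmp=$(mktemp -d)
demo="$tmp/pics"
mkdir -p "$demo/bucketofjpg"
touch "$demo/cat.jpeg" "$demo/dog.jpg"

# Grouped: -type d applies to both name tests, so only the directory matches.
grouped=$(find "$demo" \( -iname '*jpeg' -o -iname '*jpg' \) -type d)

# Ungrouped: parsed as  -iname '*jpeg'  -o  ( -iname '*jpg' -type d ),
# so cat.jpeg slips through even though it is not a directory.
ungrouped=$(find "$demo" -iname '*jpeg' -o -iname '*jpg' -type d)

printf 'grouped:\n%s\n\nungrouped:\n%s\n' "$grouped" "$ungrouped"
rm -r "$tmp"
```

The grouped command should list only bucketofjpg, while the ungrouped one should list cat.jpeg as well.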

It turns out you’ve been taking a lot of pictures lately, so let’s narrow this down to ones whose status has changed in the last week.

find ~ \( -iname '*jpeg' -o -iname '*jpg' \) -type f -ctime -7

You can do time filters based on file status change time (ctime), modification time (mtime), or access time (atime). These are in days, so if you want finer-grained control, you can express it in minutes instead (cmin, mmin, and amin, respectively). Unless you know exactly the time you want, you’ll probably prefix the number with a + (more than) or - (less than). The time arguments are probably the ones I use most often.
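To see the + and - prefixes in action, here is a minimal sketch using modification times, since those are easy to fake with touch (ctime can't be set that way). The directory and file names are invented for the demo, and -mmin assumes a GNU or BSD find:

```shell
# Hypothetical sandbox with one fresh file and one very old one.
demo=$(mktemp -d)
touch "$demo/new.txt"                    # modified just now
touch -t 200001010000 "$demo/old.txt"    # mtime set to 2000-01-01 00:00

# -mmin -60: modified less than 60 minutes ago.
recent=$(find "$demo" -type f -mmin -60)

# -mtime +7: modified more than 7 days ago.
stale=$(find "$demo" -type f -mtime +7)

printf 'recent: %s\nstale:  %s\n' "$recent" "$stale"
rm -r "$demo"
```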

Maybe you’re running out of disk space, so you want to find all of the gigantic (let’s define that as greater than 1 gigabyte) files in the log directory:

find /var/log -size +1G
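If you'd like to try the size test without hunting for a real gigabyte-sized log, here is a scaled-down sketch that uses a 1 MiB threshold instead. The file names are hypothetical, and the M suffix assumes a GNU or BSD find:

```shell
# Hypothetical sandbox with an empty file and a 2 MiB file.
demo=$(mktemp -d)
: > "$demo/tiny.log"
dd if=/dev/zero of="$demo/huge.log" bs=1024 count=2048 2>/dev/null

# -size +1M: larger than 1 MiB, so only huge.log should match.
big=$(find "$demo" -type f -size +1M)

printf 'big files: %s\n' "$big"
rm -r "$demo"
```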

Or maybe you want to find all the files owned by bcotton in /data:

find /data -user bcotton

You can also look for files based on permissions. Perhaps you want to find all of the world-readable files in your home directory to make sure you’re not oversharing.

find ~ -perm -o=r
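The leading - on the mode means "at least these bits are set"; without it, find looks for an exact match on the whole mode. A quick sketch of the difference, again in a made-up sandbox:

```shell
# Hypothetical sandbox with one world-readable file and one private file.
demo=$(mktemp -d)
touch "$demo/public.txt" "$demo/private.txt"
chmod 644 "$demo/public.txt"
chmod 600 "$demo/private.txt"

# -perm -o=r: the "other" read bit is set; all other bits are ignored.
at_least=$(find "$demo" -type f -perm -o=r)

# -perm 600: the mode is exactly 600, nothing more and nothing less.
exact=$(find "$demo" -type f -perm 600)

printf 'world-readable: %s\nexactly 600:    %s\n' "$at_least" "$exact"
rm -r "$demo"
```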

So far, all we’ve done is print the file paths, which is useful, but sometimes you want to do more. find has a few built-in actions (like -delete), but its true power comes in giving input for other commands to act on. In the simplest case, you can pipe the output to something like xargs. There’s also the -exec action, which allows you to execute more complicated actions against the output. For example, if you wanted to get the md5sum of all of your Python scripts:

find ~ -type f -name '*.py' -exec md5sum {} \;

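(Yes, you could pipe to xargs here, too, but that’s not the point.) Note the \; at the end. That’s very important: it tells find where the command given to -exec ends, and the backslash keeps your shell from treating the semicolon as its own command separator.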
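There is also a second terminator for -exec worth knowing: a plus sign. With \; the command runs once per file; with + find packs as many file names as it can onto one command line, much like xargs would. Here is a rough sketch that swaps in echo for md5sum so the difference shows up as a line count (the sandbox and script names are invented):

```shell
# Hypothetical sandbox with two "scripts".
demo=$(mktemp -d)
touch "$demo/a.py" "$demo/b.py"

# \; runs echo once per file, so we get one output line per file.
per_file=$(find "$demo" -type f -name '*.py' -exec echo {} \; | wc -l | tr -d ' ')

# + runs echo once with both names as arguments, so we get a single line.
batched=$(find "$demo" -type f -name '*.py' -exec echo {} + | wc -l | tr -d ' ')

printf 'lines with \\; : %s\nlines with +  : %s\n' "$per_file" "$batched"
rm -r "$demo"
```

For something like md5sum, the batched form usually means far fewer processes get started, which adds up when there are thousands of files.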

Warning! You can really cause a world of hurt if you’re not careful with the output of find. Files that contain spaces or other special characters might cause unexpected behavior when passed to another command. Be very careful. One way to mitigate your risk is to use -ok instead of -exec. This prompts you before executing each command (but it might get tedious if you have a lot of files to process). The -ls action escapes unusual characters (in GNU find, at least), but its output is meant for reading rather than parsing. If you’re piping to xargs, pairing -print0 with xargs -0 is the safer choice.
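To make the danger concrete, here is a small sketch of what a single space in a file name does to a plain pipe into xargs, next to a NUL-delimited version. It assumes your find has -print0 and your xargs has -0 (GNU and BSD versions do), and the file name is made up:

```shell
# Hypothetical sandbox containing one file with a space in its name.
demo=$(mktemp -d)
touch "$demo/my file.txt"

# Whitespace-delimited: xargs splits the path at the space, so basename
# runs twice and we get two lines for one file.
naive=$(find "$demo" -type f | xargs -n 1 basename | wc -l | tr -d ' ')

# NUL-delimited: the name arrives intact, so basename runs exactly once.
safe=$(find "$demo" -type f -print0 | xargs -0 -n 1 basename | wc -l | tr -d ' ')

printf 'naive pipe saw %s name(s); -print0 pipe saw %s\n' "$naive" "$safe"
rm -r "$demo"
```

Now imagine that basename was rm. Note that -exec doesn't have this problem, because find hands each file name to the command as a single argument.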

This post only begins to scratch the surface of what find can do. Combining tests with boolean logic can give you incredible flexibility to find exactly the files you’re looking for. Have any favorite find expressions? Share them in the comments!