Tuesday, June 9, 2009

A method for totaling disk usage faster than du.


  1. Mount all of the areas holding your data at unique paths. E.g., /mnt/fileserver1/mystuff, /mnt/fileserver2/mystuff, etc.

  2. With cron, run periodic 'find -ls' jobs across all of your personal paths, saving the output to a file.

  3. To count a certain directory:

awk '/\/some\/sub\/path\// {sum += $7} END {gb = sum/1024/1024/1024; printf "%0.2f GB\n", gb}' find_ls_file

where find_ls_file holds the results of your find, and $7 is the size field of 'find -ls' output. (Note that 'find -ls' lines begin with the inode number, not the path, so don't anchor the path pattern to the start of the line.) Even more useful is adding up any pattern, e.g., all .xvid files:
awk '/\.xvid/ {sum += $7} END {gb = sum/1024/1024/1024; printf "%0.2f GB\n", gb}' find_ls_file

Also, be sure to exclude snapshot directories from your find if you have them (e.g., NetApp's .snapshot, ZFS's .zfs/snapshot), or the same data will be counted more than once.
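Here's a minimal sketch of what the cron'd find could look like with snapshot pruning, using a throwaway temp tree in place of a real mount and a NetApp-style .snapshot directory (both are stand-ins; substitute your actual mount points and snapshot names):

```shell
# Build a tiny tree with a fake ".snapshot" directory for illustration.
root=$(mktemp -d)
mkdir -p "$root/data" "$root/.snapshot"
echo "live copy" > "$root/data/file1"
echo "live copy" > "$root/.snapshot/file1"   # would be double-counted

# -prune skips the snapshot tree entirely; everything else gets -ls'd.
find "$root" -name .snapshot -prune -o -type f -ls > "$root/find_ls_file"

grep -c file1 "$root/find_ls_file"   # only the live copy is listed
```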

Another advantage of this method is parallelism. You can farm out the finds if you need to.
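A sketch of farming the finds out: one background find per mount, merged into a single listing afterward (the fileserver1/fileserver2 directories here are hypothetical stand-ins for your real mount points):

```shell
# Stand-in "mounts" under a temp dir, one file each, for illustration.
tmp=$(mktemp -d)
mkdir -p "$tmp/fileserver1" "$tmp/fileserver2"
echo a > "$tmp/fileserver1/a"
echo b > "$tmp/fileserver2/b"

for mnt in "$tmp/fileserver1" "$tmp/fileserver2"; do
    find "$mnt" -type f -ls > "$mnt.out" &   # each find runs concurrently
done
wait                                         # block until all finds finish
cat "$tmp"/*.out > "$tmp/find_ls_file"       # one merged listing to awk over
```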

Cheers,

Adam

Who am I?

I am a Senior DevOps Engineer.

I have a B.S. in Applied Mathematics and am a member of Alpha Sigma Lambda.

Outside of work, I study art, illustration, design, science, personal finance, engineering, and the math of processes.

Cheers,

-Adam