Tuesday, June 9, 2009

Method for finding disk space faster than du.


  1. Mount all of the areas holding your data at unique paths. E.g., /mnt/fileserver1/mystuff, /mnt/fileserver2/mystuff, etc.

  2. With cron, run periodic 'find -ls' jobs across all of your personal paths, writing the output to a file.

  3. To total up a certain directory (in 'find -ls' output, field 7 is the size in bytes and field 11 is the path; note the lines begin with the inode number, so the path pattern has to be matched against field 11, not the start of the line):

awk '$11 ~ /^some\/sub\/path/ {sum += $7} END {gb = sum/1024/1024/1024; printf "%0.2f GB\n", gb}' find_ls_file

where find_ls_file holds the results of your find. Even more useful is totaling up everything matching a pattern, e.g., all .xvid files:
awk '$11 ~ /\.xvid$/ {sum += $7} END {gb = sum/1024/1024/1024; printf "%0.2f GB\n", gb}' find_ls_file
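The indexing-and-summing steps above can be sketched end to end as follows. The cron schedule and the /mnt paths in the comment are hypothetical; the runnable part uses a scratch directory so it works anywhere:

```shell
# Step 2 as a cron entry (hypothetical paths; rebuild the index nightly at 03:00):
#   0 3 * * * find /mnt/fileserver1/mystuff /mnt/fileserver2/mystuff -ls > /var/tmp/find_ls_file 2>/dev/null

# Runnable demo against a scratch directory:
dir=$(mktemp -d)
dd if=/dev/zero of="$dir/a.bin" bs=1024 count=10 2>/dev/null   # a 10 KB file
find "$dir" -ls > /tmp/find_ls_file

# Sum field 7 (size in bytes) for everything in the index:
awk '{sum += $7} END {printf "%0.2f GB\n", sum/1024/1024/1024}' /tmp/find_ls_file
```

One caveat: the field positions assume no whitespace in owner, group, or path names; filenames with spaces will throw the awk sums off.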

Also, be sure to exclude snapshot directories from your find if you have them (e.g., NetApp, ZFS), or everything in them gets double-counted.
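One way to do that exclusion is find's -prune. The .snapshot name below is the NetApp convention (adjust for your filer), and a scratch directory stands in for a real mount so the sketch is runnable:

```shell
# Demo setup: a scratch directory with a fake snapshot tree inside it.
dir=$(mktemp -d)
mkdir -p "$dir/.snapshot"
touch "$dir/.snapshot/old.bin" "$dir/keep.bin"

# -prune stops descent into .snapshot; everything else still gets -ls'd.
find "$dir" -name .snapshot -prune -o -ls > /tmp/find_ls_nosnap
```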

Another advantage of this method is parallelism: you can farm the finds out across mounts or machines if you need to.
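A minimal sketch of that farming-out, with one backgrounded find per mount (scratch directories stand in for real mounts here):

```shell
# One find per mount, run in the background, then merge the per-mount indexes.
m1=$(mktemp -d); m2=$(mktemp -d)   # stand-ins for /mnt/fileserver1/..., /mnt/fileserver2/...
touch "$m1/one.bin" "$m2/two.bin"

find "$m1" -ls > /tmp/idx.1 &
find "$m2" -ls > /tmp/idx.2 &
wait                               # block until both finds finish

cat /tmp/idx.1 /tmp/idx.2 > /tmp/find_ls_file
```

The same pattern extends to remote hosts by swapping each backgrounded find for an ssh invocation.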

Cheers,

Adam