This is the BSDA Study Guide Book written via a
wiki collaboration.
This is a work in progress. You may contribute to or discuss this specific page at http://bsdwiki.reedmedia.net/wiki/Determine_disk_capacity_and_which_files_are_consuming_the_most_disk_space.html.

Determine disk capacity and which files are consuming the most disk space

Concept
Be able to combine common Unix command line utilities to quickly determine which files are consuming the most disk space.

Introduction

As disk sizes have increased over the years, so has the amount of data that we seem to want to keep on them.  At one time or another, you may be faced with the "too much data/not enough space" problem.  How can you quickly find the "disk hogs"?  Use the tools!

The BSD systems are full of tools that can assist with this problem, including:
df(1) - "disk free"
du(1) - "disk usage"
find(1) - "walk a file hierarchy"

If you're using NetBSD, you can also get a df-type reading from systat(1).  And with any BSD variant, using common "Unix-fu" (in particular, find and shell pipes), these commands can quickly produce useful information about disk usage.

df and du

For a quick summary of disk space, simply call df.  Using "-c" with df provides an "overall total"; using "-h" with either df or du produces "human readable" output: that is, sizes scaled to K, M, G (kilobytes, megabytes, gigabytes), etc., instead of "blocks" of the size given by the environment variable $BLOCKSIZE.

Unlike df, you probably don't want to simply call du.  Without arguments, du lists the size of every subdirectory (and its subdirectories, ad infinitum) of your CWD, in the order it walks them; add "-a" and it lists every file as well.  If you happen to be in "/", you'd be a long time reading the output of du.  Usually it's better to use du with "-s", possibly with a specific file or file "glob" argument, or with "-h" and maybe "-c", and pipe the output through sort(1); look for a rather convoluted (yet effective) example below.

du takes the files to measure as arguments rather than reading their names from standard input, so find (through its "-exec" primary or xargs(1)) makes a fairly useful "frontend" to du on occasion (but see the section on find below before you scratch your head too hard on this).
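As a minimal sketch of find acting as a frontend to du: the function below hands the files that find selects to du as arguments, then sorts the result by size.  The name big_files and the 1 MB threshold are made up for this illustration; they are not part of any BSD tool.

```shell
#!/bin/sh
# Sketch: let find(1) choose the files and du(1) report their sizes.
# big_files and the 1 MB threshold are illustrative assumptions.

# big_files DIR BYTES: print "size-in-KB path" for every regular file
# under DIR that is larger than BYTES bytes, largest first.
big_files() {
    dir=$1
    bytes=$2
    # "-exec ... {} +" passes the matching names to du as arguments.
    # "du -k" reports in 1024-byte units, so "sort -n" orders correctly.
    find "$dir" -type f -size +"${bytes}c" -exec du -k {} + | sort -rn
}

# Example: the ten largest files over about 1 MB below the current directory.
big_files . 1048576 | head
```

Using "-k" instead of "-h" here is a deliberate choice: every size is in the same unit, so a plain numeric sort gives the right order.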
Note: under certain conditions, df and du may disagree somewhat about how much space is in use on a filesystem.  Generally, this occurs when a program is holding an open file descriptor to a file that has been unlinked; in such a case, du wouldn't count the file's size, but the blocks are still unavailable as "free blocks" (df = "disk free", remember?).  In such cases, you can use fstat(1) to see currently open files.

find and the "size" primary
The complete use of find is beyond the scope of this section; please see Find a file with a given set of attributes for complete information.  However, using the "size" primary and an expression representing a given file size, you can quickly produce a list of "disk hogs".  See the Examples below.

Examples

Are any partitions nearing "full"?

    $ df
    /dev/ad0s1a      1978    977   842    54%    /
    /dev/ad0s1e     67765  49502 12841    79%    /usr
    /dev/ad0s1d      3962   2182  1463    60%    /var
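
To answer the "nearing full" question without eyeballing every line, a sketch along these lines can filter the df output.  It assumes the POSIX "-P" output format, in which the capacity is the fifth column and the mount point the sixth; the function name nearly_full and the 75% threshold are made up for illustration.

```shell
#!/bin/sh
# Sketch: flag filesystems at or above a capacity threshold.
# nearly_full and the 75% threshold are illustrative assumptions.

# nearly_full PERCENT: read "df -P" style output on standard input and
# print the mount point and capacity of each filesystem at or above PERCENT.
nearly_full() {
    awk -v limit="$1" '
        NR > 1 {                # skip the header line
            cap = $5            # capacity column, e.g. "79%"
            sub(/%/, "", cap)
            if (cap + 0 >= limit) print $6, $5
        }'
}

df -P | nearly_full 75
```

This simple column split would misread a mount point that contains spaces, so treat it as a starting point rather than a finished tool.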

Display all the *.mp3 files in my homedir, and their sizes with a total:

    $ du -sc $HOME/*.mp3

List everything under the current directory, in order of size (almost: "sort -n" compares only the leading number and ignores the K/M/G suffix that "-h" adds):

    $ du -h | sort -n | more
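
The "(almost)" comes from sorting human-readable sizes numerically: 2.0K sorts after 1.5G.  One way around it, sketched here under the assumption that names contain no newlines, is to have du report everything in kilobytes.  The function name by_size is made up for this illustration.

```shell
#!/bin/sh
# Sketch: list the entries of a directory in true size order.
# by_size is an illustrative name; assumes names contain no newlines.

# by_size DIR: print "size-in-KB name" for each non-hidden entry in DIR,
# smallest first.
by_size() {
    # Run in a subshell so the caller's working directory is unchanged.
    (
        cd "$1" || exit 1
        set -- *
        [ -e "$1" ] || exit 0   # empty directory: nothing to report
        # -s prints one total per argument; -k forces 1024-byte units so
        # that "sort -n" compares like with like.
        du -sk -- "$@" | sort -n
    )
}

# Example: the ten largest entries in the current directory.
by_size . | tail
```

Because the shell glob "*" skips dot files, hidden entries are left out of this listing.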

Here's a pretty wild set of pipes for du, showing the largest disk hogs (unless entries are larger than 999MB; if so, change "M" to "G" in the regular expression).  To see the smallest entries, use head rather than tail, or for a complete listing pipe it to $PAGER instead of either.  The "-n" option to sort(1) ensures that the sizes are sorted in numeric rather than alphabetical order:

    [root@server][/usr/src]
    # du -hc * | sort -n | grep "[0-9]M" | tail
    26M    crypto
    27M    contrib/binutils
    28M    release
    40M    sys/dev
    47M    contrib/gcc
    105M   sys
    204M   contrib
    458M   total

But this brings us to the relative power of find(1).  A similar report could be produced like this ("find all files in the cwd greater than approximately 900MB in size"):

    # find . -size +940000000c

The main difference between this statement's output and that of the "piped arrangement" above is that find doesn't report the actual sizes and the list isn't sorted.  Note that if you're using FreeBSD, you can add a scale suffix (k, M, G, T or P) to the size designation, thus: "find . -size +900M".

Practice Exercises
Use df to see if your hard drives are nearing "full".
Use find to find out who in /home/ is the biggest "disk hog".  (Optional: Use grep to see if any of these files are "mp3"s.)
Use du along with sort(1) and grep(1) to produce lists of files by size.

More information

du(1), df(1), find(1), sort(1), and, for NetBSD, systat(1)