If you want to find out how much disk space files and directories use in Linux, you can use the du command. The name stands for disk usage, and the command can display the size of files and folders in various formats. However, by default, the du command does not sort its output by size, which can make it hard to identify the largest or smallest items. In this article, you will learn how to sort the output of the du command by size using some simple tricks and options. You will also learn how to display sizes in human-readable format, limit the depth of the output, and filter it by file type.
Key Takeaways
- The du command can show the disk usage of files and directories in Linux.
- To sort the output of the du command by size, you can pipe it to the sort command with the -n option for numerical sorting.
- To make the output more readable, you can use the -h option for both du and sort commands, which will display the size in human-readable format (such as K, M, G, etc.).
- To limit the depth of the output, you can use the --max-depth option for the du command, which will specify how many levels of subdirectories to show.
- To filter the output by file type, you can use the -a option for the du command, which will show all files, and then use grep to match the file extension.
Sorting the Output of the du Command by Size
The du command has many options that can modify the output, but none of them can sort the output by size. To do that, you need to use another command, called sort, which can sort the input by various criteria. The sort command can take the output of another command as input, using the pipe symbol (|). For example, the following command will sort the output of the du command by size in ascending order:
du | sort -n
The -n option tells the sort command to sort the input numerically, rather than alphabetically. This is important because the du command prints the size first (in 1024-byte blocks by default with GNU du), followed by the file or directory name. If you sort the output alphabetically, you will get incorrect results, such as 10 being placed before 2.
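The difference between the two orderings is easy to see on a small made-up input. The sketch below feeds two hypothetical du-style lines (invented sizes and paths, not real du output) through sort with and without -n:

```shell
# Two made-up lines in du's "size<TAB>path" layout (not real du output).
sample=$(printf '10\t./big\n2\t./small\n')

# Plain sort compares character by character, so "10" lands before "2".
lexical=$(printf '%s\n' "$sample" | LC_ALL=C sort)

# sort -n compares the leading numbers, so 2 lands before 10.
numeric=$(printf '%s\n' "$sample" | sort -n)

printf 'sort:\n%s\n\nsort -n:\n%s\n' "$lexical" "$numeric"
```

LC_ALL=C is set on the plain sort only to make the character-by-character comparison independent of the local language settings.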
The output of the above command will look something like this:
16 ./dir1
20 ./dir2
24 ./dir3
60 .
The first column shows the size (in 1024-byte blocks by default with GNU du), and the second column shows the directory name. Without the -a option, du lists only directories, not individual files. The dot (.) represents the current directory, and its size includes everything below it. As you can see, the output is sorted by size in ascending order, meaning the smallest items are at the top and the largest items are at the bottom.
If you want to sort the output by size in descending order, you can use the -r option for the sort command, which will reverse the order of the output. For example:
du | sort -n -r
The output of the above command will look something like this:
60 .
24 ./dir3
20 ./dir2
16 ./dir1
As you can see, the output is sorted by size in descending order, meaning the largest items are at the top and the smallest items are at the bottom.
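In practice you often need only the largest few entries rather than the whole list. A minimal sketch, assuming a throwaway directory tree created with mktemp (the directory and file names are hypothetical) and the standard head command to keep the first lines of the sorted output:

```shell
# Build a small scratch tree so the example does not depend on real data.
scratch=$(mktemp -d)
mkdir -p "$scratch/dir1" "$scratch/dir2"
head -c 4096 /dev/urandom > "$scratch/dir1/small.bin"
head -c 1048576 /dev/urandom > "$scratch/dir2/large.bin"

# Largest entries first; head keeps only the top two lines.
top=$(cd "$scratch" && du | sort -n -r | head -n 2)
printf '%s\n' "$top"

# Clean up the scratch tree.
rm -rf "$scratch"
```

The first line is the starting directory itself (.), because its total includes everything below it, followed by the directory that holds the larger file.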
Making the Output More Readable
The output of the du command can be hard to read, especially when the size is shown as a raw block count. It can be difficult to compare the size of different items, or to understand how much space they are taking up. To make the output more readable, you can use the -h option for both the du and the sort commands. With -h, du prints sizes in human-readable units such as K (kilobytes), M (megabytes), or G (gigabytes), depending on the size of the item, and sort orders those values correctly. For example:
du -h | sort -h
The output of the above command will look something like this:
16K ./dir1
20K ./dir2
24K ./dir3
60K .
As you can see, the output is much more readable, and you can easily compare the size of different items. The -h option for the sort command is available in GNU coreutils, which ships with most Linux distributions. If you are using an older version of Mac OS X, you may need to install coreutils with brew install coreutils, and then use gsort instead of sort.
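To see why sort needs -h once du prints unit suffixes, the sketch below sorts three made-up lines (hypothetical sizes and paths) both ways. It assumes a sort that supports -h, such as the GNU coreutils version:

```shell
# Made-up human-readable sizes in du's "size<TAB>path" layout.
sample=$(printf '340K\t./docs\n15M\t./videos\n12K\t./notes\n')

# sort -n reads only the leading number and ignores the unit suffix,
# so 15M is wrongly placed before 340K.
by_number=$(printf '%s\n' "$sample" | sort -n)

# sort -h understands the K/M/G suffixes and orders by actual magnitude.
by_human=$(printf '%s\n' "$sample" | sort -h)

printf 'sort -n:\n%s\n\nsort -h:\n%s\n' "$by_number" "$by_human"
```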
Limiting the Depth of the Output
By default, the du command reports every directory below the specified directory, at every level of nesting. This can result in very long and cluttered output, especially if you have a deeply nested tree. To limit the depth of the output, you can use the --max-depth option for the du command, which specifies how many levels of subdirectories to show. For example, to show only the current directory and its immediate subdirectories, without anything nested deeper, you can use --max-depth=1:
du -h --max-depth=1 | sort -h
The output of the above command will look something like this:
16K ./dir1
20K ./dir2
24K ./dir3
60K .
As you can see, the output shows only the current directory and its immediate subdirectories. The sizes themselves are unaffected: each total still includes everything nested below, and only the listing is trimmed. The --max-depth option takes a non-negative integer (--max-depth=0 shows just the total for the starting directory). For example, to also show the subdirectories one level further down, you can use --max-depth=2:
du -h --max-depth=2 | sort -h
The output of the above command will look something like this:
4.0K ./dir1/subdir1
8.0K ./dir2/subdir2
12K ./dir3/subdir3
16K ./dir1
20K ./dir2
24K ./dir3
60K .
As you can see, the output now also lists the second level of subdirectories, while the totals for the higher-level directories stay the same. You can use any value for the --max-depth option, depending on how much detail you want to see.
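The effect of --max-depth is easiest to check on a small scratch tree. The sketch below assumes GNU du and uses hypothetical directory names. It compares the listings for depth 1 and depth 2:

```shell
# Scratch tree: dir1/subdir1 is two levels below the starting directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/dir1/subdir1" "$scratch/dir2"
head -c 8192 /dev/urandom > "$scratch/dir1/subdir1/data.bin"

# Depth 1 lists the starting directory and its immediate subdirectories.
depth1=$(cd "$scratch" && du -h --max-depth=1 | sort -h)

# Depth 2 additionally lists ./dir1/subdir1.
depth2=$(cd "$scratch" && du -h --max-depth=2 | sort -h)

printf 'max-depth=1:\n%s\n\nmax-depth=2:\n%s\n' "$depth1" "$depth2"

# Clean up the scratch tree.
rm -rf "$scratch"
```

The size reported for . is the same in both listings, since the option changes only which directories are printed, not what is counted.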
Filtering the Output by File Type
Sometimes, you may want to filter the output of the du command by file type, for example to show only the disk usage of text files. To do that, you can use the -a option for the du command, which makes it report individual files as well as directories. Then, you can use grep to match the file extension. For example, if you want to show only the disk usage of text files, you can use the following command:
du -ah | grep "\.txt$" | sort -h
The -a option tells the du command to show all files, not just directories. The grep command matches the pattern "\.txt$", which means any line that ends with .txt. The dollar sign ($) anchors the match to the end of the line. The backslash (\) escapes the dot (.), which would otherwise match any character in a regular expression. The sort command sorts the output by size in human-readable format. The output of the above command will look something like this:
4.0K ./dir1/file1.txt
8.0K ./dir2/file2.txt
12K ./dir3/file3.txt
28K ./dir1/subdir1/file4.txt
32K ./dir2/subdir2/file5.txt
36K ./dir3/subdir3/file6.txt
As you can see, the output only shows the disk usage of text files, sorted by size. You can use any file extension that you want, such as .sh for shell scripts, .py for Python scripts, .jpg for JPEG images, etc. Just make sure to escape the dot with a backslash, and use the dollar sign to indicate the end of the line.
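Both the escaped dot and the $ anchor matter in practice. The sketch below builds a scratch directory with hypothetical file names, chosen to trip up a looser pattern, and then applies the same filter:

```shell
# Scratch files: only report.txt has a real .txt extension.
scratch=$(mktemp -d)
for name in report.txt report.txt.bak mytxt script.sh; do
    head -c 2048 /dev/urandom > "$scratch/$name"
done

# "\.txt$" matches a literal ".txt" at the very end of the line, so
# report.txt.bak (not at the end) and mytxt (no dot) are filtered out.
matches=$(cd "$scratch" && du -ah | grep "\.txt$" | sort -h)
printf '%s\n' "$matches"

# Clean up the scratch directory.
rm -rf "$scratch"
```

Without the backslash, the pattern ".txt$" would also match ./mytxt, because an unescaped dot stands for any single character.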
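If you want to show only the disk usage of directories, the simplest approach is to omit the -a option (du -h | sort -h), because du lists only directories by default. Note that a filter such as the following does not achieve this, because du does not append a trailing slash to the directory names it prints: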
du -ah | grep "/$" | sort -h
The grep command matches the pattern “/$”, which means any string that ends with a slash (/). The slash indicates