Managing data effectively often requires organizing information in a specific order, and the Linux command line provides a straightforward way to do this for disk usage reports: sorting the output of `df` by column. The `df` command reports file system disk space usage, while `sort` rearranges lines of text. Combining the two lets system administrators display raw disk usage statistics in a more meaningful sequence, such as by usage percentage or by mount point. This turns a plain list of mounted file systems into an actionable overview that draws attention to the most critical volumes first.
Understanding the Default Behavior of df
Before manipulating the output, it is essential to understand what `df` produces by default. Running the command without arguments prints a table with one header row followed by one row per mounted file system. Each row lists the file system name, total size, used space, available space, usage percentage, and mount point, with sizes given in 1 KiB blocks on most Linux systems. The columns are padded with spaces for alignment, and the amount of padding varies with the system and the length of the device names. This irregular spacing matters when constructing parsing commands: the data is not delimited by a single, consistent character such as a comma, so tools must split on runs of whitespace rather than on a fixed delimiter.
Sorting by the Usage Percentage Column
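A minimal sketch of the basic sort, assuming the POSIX output format (`df -P`), in which the usage percentage is the fifth whitespace-separated field. The helper name `sort_by_usage` and the sample rows are illustrative only, not real measurements.

```shell
# Sort df-style data rows (header already removed) by the fifth field.
# -k5,5 restricts the key to that field; n compares it numerically,
# so "100%" sorts after "80%" (the trailing % is ignored).
sort_by_usage() {
    sort -k5,5n
}

# Made-up sample rows standing in for the output of: df -P | tail -n +2
sort_by_usage <<'EOF'
/dev/sda1   41152736 30862552  8176752  80% /
tmpfs        4012345        0  4012345   0% /dev/shm
/dev/sdb1  103081248 97926184        0 100% /data
/dev/sda2     999320   149900   780608  17% /boot
EOF
```

On a live system the same helper would be fed with `df -P | tail -n +2 | sort_by_usage`.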
Handling the Header Row
A challenge with sorting the output of `df` is that the first line is a header describing the columns. Numeric sorting treats the non-numeric header text as zero, so the header ends up among the lowest values in an ascending sort and at the very bottom in a reverse sort, where it disrupts readability. To solve this, extract the header separately and concatenate it with the sorted data. Use `head -n 1` to grab the first line and `tail -n +2` to skip it, so that only the data rows are passed to `sort`. The header then stays at the top of the report regardless of the sort order applied to the lines below it.
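The header handling described above can be sketched as follows. The helper name `sort_keep_header` is hypothetical, the sample rows are made up, and the fifth field is assumed to hold the usage percentage, as in `df -P` output.

```shell
# Print the header first, then the data rows sorted by the fifth field.
# Reads df-style text on stdin.
sort_keep_header() {
    input=$(cat)    # capture once so both passes see the same snapshot
    printf '%s\n' "$input" | head -n 1                  # header row, untouched
    printf '%s\n' "$input" | tail -n +2 | sort -k5,5n   # data rows, ascending
}

# Made-up sample standing in for the output of: df -P
sort_keep_header <<'EOF'
Filesystem  1024-blocks     Used Available Capacity Mounted on
/dev/sda1      41152736 30862552   8176752      80% /
tmpfs           4012345        0   4012345       0% /dev/shm
/dev/sdb1     103081248 97926184         0     100% /data
EOF
```

Capturing the output in a variable avoids running `df` twice, which could otherwise yield a header and rows from slightly different moments. On a live system the call would be `df -P | sort_keep_header`.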
Sorting in Reverse Order for Critical Alerts
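A short sketch of a reverse sort that surfaces the fullest file systems first, assuming header-free rows in `df -P` layout. The helper name `top_usage`, its default of five rows, and the sample data are illustrative choices rather than anything prescribed by `df` or `sort`.

```shell
# Print the N fullest file systems (default 5) from df-style data rows.
# The r modifier reverses the numeric comparison, so the highest
# percentage comes first; head then keeps only the top N rows.
top_usage() {
    sort -k5,5nr | head -n "${1:-5}"
}

# Made-up sample rows standing in for the output of: df -P | tail -n +2
top_usage 2 <<'EOF'
/dev/sda1   41152736 30862552  8176752  80% /
tmpfs        4012345        0  4012345   0% /dev/shm
/dev/sdb1  103081248 97926184        0 100% /data
/dev/sda2     999320   149900   780608  17% /boot
EOF
```

With the sample rows, the two lines printed are those for `/data` and `/`. A live invocation might look like `df -P | tail -n +2 | top_usage 3`.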
Sorting by Mount Point for Organization
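A sketch of an alphabetical sort on the mount point, assuming `df -P` layout with the mount point starting at the sixth field and no spaces in device names. The helper name `sort_by_mount` and the sample rows are hypothetical. `LC_ALL=C` is an added assumption that forces plain byte ordering, so the result does not depend on the user's locale.

```shell
# Sort df-style data rows by mount point.
# -k6 starts the key at field 6 and runs to the end of the line;
# -b ignores the leading blanks that precede the field.
sort_by_mount() {
    LC_ALL=C sort -b -k6
}

# Made-up sample rows standing in for the output of: df -P | tail -n +2
sort_by_mount <<'EOF'
/dev/sdb1  103081248 97926184        0 100% /data
tmpfs        4012345        0  4012345   0% /dev/shm
/dev/sda1   41152736 30862552  8176752  80% /
/dev/sda2     999320   149900   780608  17% /boot
EOF
```

With byte ordering, the sample rows come out as `/`, `/boot`, `/data`, `/dev/shm`, which groups nested mount points under their parents.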
Dealing with Variable Whitespace
One technical nuance when working with `df` output is the inconsistent whitespace between columns. The first column, the device name, varies in length, which throws off any parsing that relies on fixed character positions. Field-based tools are more robust: both `sort -k` and `awk` treat a run of blanks as a single separator by default, so padding alone does not change field numbers. Field numbers do shift when a device name itself contains spaces, as some network shares do, and some `df` implementations wrap a long device name onto its own line unless `-P` is given. For those cases, counting fields from the end helps. `sort` cannot address keys relative to the end of the line, but `awk` can: provided the mount point contains no spaces, `$NF` is the mount point and `$(NF-1)` is the usage percentage, however the device column is formatted.
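Counting from the end can be sketched with `awk`, whose built-in `NF` variable holds the number of fields on the current line. The helper name `usage_and_mount` and the sample data, including the share name containing a space, are invented for illustration. The sketch assumes mount points without spaces.

```shell
# Print "usage mountpoint" pairs, fullest first, addressing fields from the end.
# NR > 1 skips the header; $(NF-1) is the second-to-last field, $NF the last.
usage_and_mount() {
    awk 'NR > 1 { print $(NF-1), $NF }' | sort -k1,1nr
}

# Made-up sample standing in for the output of: df -P
# The first device name contains a space, which would break a fixed -k5.
usage_and_mount <<'EOF'
Filesystem        1024-blocks     Used Available Capacity Mounted on
//nas/team share    103081248 97926184         0     100% /data
/dev/sda1            41152736 30862552   8176752      80% /
tmpfs                 4012345        0   4012345       0% /dev/shm
EOF
```

On systems with GNU coreutils, `df --output=pcent,target` prints only the usage and mount point columns, which avoids the device column altogether.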