CEPH-DIFF-SORTED(8) Ceph CEPH-DIFF-SORTED(8)

NAME

ceph-diff-sorted - compare two sorted files line by line

SYNOPSIS

ceph-diff-sorted file1 file2

DESCRIPTION

ceph-diff-sorted is a simplified diff utility optimized for comparing two files whose lines are lexically sorted.

The output is simplified in comparison to that of the standard diff tool available in POSIX systems. Angle brackets ('<' and '>') are used to show lines that appear in one file but not the other. The output is not compatible with the patch tool.

This tool was created in order to perform diffs of large files (e.g., containing billions of lines) that the standard diff tool cannot handle efficiently. Knowing that the lines are sorted allows this to be done efficiently with minimal memory overhead.
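The idea behind this can be illustrated with a merge-style walk that holds only one line from each file in memory at a time. The following is a minimal sketch under stated assumptions, not the actual ceph-diff-sorted implementation: the function and helper names are hypothetical, the "< line" / "> line" layout is a guess at the output format, and the sketch does not validate sort order or reject empty lines as the real tool does.

```shell
# Sketch only, not the ceph-diff-sorted source. All names are hypothetical.
# Force the C locale so string comparison stays in byte order in shells
# that honour the locale.
LC_ALL=C; export LC_ALL

# Read the next line from fd 3 into $a (or from fd 4 into $b). The have_*
# flag records whether a line is currently held for that file.
next_a() { have_a=0; IFS= read -r a <&3 && have_a=1; return 0; }
next_b() { have_b=0; IFS= read -r b <&4 && have_b=1; return 0; }

sorted_diff_sketch() {
    differ=0
    exec 3<"$1" 4<"$2"
    next_a; next_b
    while [ "$have_a" -eq 1 ] || [ "$have_b" -eq 1 ]; do
        if [ "$have_b" -eq 0 ] || { [ "$have_a" -eq 1 ] && [ "$a" \< "$b" ]; }; then
            # Line exists only in the first file.
            printf '< %s\n' "$a"; differ=1; next_a
        elif [ "$have_a" -eq 0 ] || [ "$b" \< "$a" ]; then
            # Line exists only in the second file.
            printf '> %s\n' "$b"; differ=1; next_b
        else
            # Same line in both files: advance both.
            next_a; next_b
        fi
    done
    exec 3<&- 4<&-
    # 0 if no differences were printed, 1 otherwise.
    return "$differ"
}
```

Because each step only compares the two held lines and advances one or both files, memory use stays constant regardless of file size, which is what sorted input makes possible.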

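Each file must be sorted lexically, that is, in the order the C locale produces. On most POSIX systems the sort tool takes its sorting order from the locale environment variables, where LC_ALL overrides LC_COLLATE, which in turn overrides LANG. To sort lexically regardless of the current locale settings, use something such as:

$ LC_ALL=C sort some-file.txt >some-file-sorted.txt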
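Before comparing very large files, it may be worth confirming that an input is already in the expected order. A minimal sketch, assuming the standard POSIX sort -c option (check only, no output file written). The helper name is_lexically_sorted is hypothetical, and LC_ALL is used because it overrides LANG when both are set.

```shell
# Hypothetical helper: succeeds (exit 0) if the file is already sorted in
# C-locale order, fails (non-zero) otherwise. Diagnostics from sort are
# discarded so that only the exit status is reported.
is_lexically_sorted() {
    LC_ALL=C sort -c "$1" 2>/dev/null
}
```

For example:

$ is_lexically_sorted some-file-sorted.txt && echo "sorted"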


EXAMPLES

Compare two files:

$ ceph-diff-sorted fileA.txt fileB.txt


EXIT STATUS

When complete, the exit status will be set to one of the following:

0
    files same

1
    files different

2
    usage problem (e.g., wrong number of command-line arguments)

3
    problem opening input file

4
    bad file content (e.g., unsorted order or empty lines)
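
A wrapper script can branch on the exit codes listed above to tell "files differ" apart from real errors. The following is a minimal sketch. The function name describe_diff_status is hypothetical and the messages merely paraphrase the list above.

```shell
# Hypothetical helper: map a ceph-diff-sorted exit status to a short message.
describe_diff_status() {
    case "$1" in
        0) echo "files are the same" ;;
        1) echo "files differ" ;;
        2) echo "usage problem" ;;
        3) echo "problem opening an input file" ;;
        4) echo "bad file content (unsorted order or empty lines)" ;;
        *) echo "unexpected exit status: $1" ;;
    esac
}
```

For example:

$ ceph-diff-sorted fileA.txt fileB.txt >diff.out
$ describe_diff_status $?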

AVAILABILITY

ceph-diff-sorted is part of Ceph, a massively scalable, open-source, distributed storage system. Please refer to the Ceph documentation at http://ceph.com/docs for more information.

SEE ALSO

rgw-orphan-list(8)

COPYRIGHT

2010-2024, Inktank Storage, Inc. and contributors. Licensed under Creative Commons Attribution Share Alike 3.0 (CC-BY-SA-3.0)

April 10, 2024 dev