Jon Simpson

Find Content Differences Between Plain-Text Files

27 Mar 2015 — diff, plaintext

A handy one-liner for checking whether two plain-text files contain the same lines, regardless of order, using standard Unix tools. Swap A.txt and B.txt for your two source files.

sort A.txt > A.sorted; sort B.txt > B.sorted; diff -u A.sorted B.sorted; rm A.sorted B.sorted
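If you run this often, the same idea could be wrapped in a small shell function. The following is only a sketch: the name content_diff is made up for illustration, and it assumes mktemp is available (it is not part of POSIX, though most systems ship it). It uses temporary files instead of writing A.sorted and B.sorted into the current directory.

```shell
# content_diff: hypothetical helper name, not from the one-liner above.
# Sorts both files into temporary files, diffs them, then cleans up.
# Assumes mktemp is installed.
content_diff() {
    a_sorted=$(mktemp) || return 2
    b_sorted=$(mktemp) || { rm -f "$a_sorted"; return 2; }

    sort "$1" > "$a_sorted"
    sort "$2" > "$b_sorted"

    # diff exits 0 when the sorted files match and 1 when they differ
    status=0
    diff -u "$a_sorted" "$b_sorted" || status=$?

    rm -f "$a_sorted" "$b_sorted"
    return $status
}
```

Call it as content_diff A.txt B.txt. The function passes through diff's exit status, so it can also be used in a script to test whether two files hold the same set of lines.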

I’ve used this while programmatically restructuring large JSON files. When keys are reordered (especially when first sorting a JSON file containing many dictionaries), it can be difficult to establish whether any content was lost in parsing and re-serializing the file, given that many libraries lint and re-format the output.

Because both files are sorted before comparison, the diff output is independent of line position and shows only the lines present in one file but not the other. It will still pick up blank lines and differences in whitespace. With JSON it’s worth linting both files before trying to restructure, as this removes noise relating to whitespace differences and bracket/comma positioning from the diff.
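For JSON, the lint-then-compare step might look something like the sketch below. It rests on a few assumptions of mine rather than anything in the original workflow: python3 is on the PATH and its json.tool module is used as the linter, mktemp is available, and the function name json_content_diff is hypothetical. The sed step, also my addition, strips the trailing comma from each pretty-printed line so that a key moving to or from the end of an object does not show up as a difference.

```shell
# json_content_diff: hypothetical helper, a sketch only.
# Assumes python3 (for json.tool) and mktemp are installed.
json_content_diff() {
    workdir=$(mktemp -d) || return 2
    status=2  # stays 2 if either file fails to parse as JSON

    if python3 -m json.tool "$1" > "$workdir/a.json" &&
       python3 -m json.tool "$2" > "$workdir/b.json"; then
        # Strip trailing structural commas, then sort the lines
        sed 's/,$//' "$workdir/a.json" | sort > "$workdir/a.sorted"
        sed 's/,$//' "$workdir/b.json" | sort > "$workdir/b.sorted"

        status=0
        diff -u "$workdir/a.sorted" "$workdir/b.sorted" || status=$?
    fi

    rm -rf "$workdir"
    return $status
}
```

Call it as json_content_diff A.json B.json. Since the comparison is line-based, it reports whether the same key/value lines exist somewhere in both files; it will not notice a line that has moved from one object into another.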