To compare 2 files, we use the diff command. How do we compare 2 directories? Specifically, we want to know which files and subdirectories are common to both, and which exist in only 1 directory but not the other.
Unix old-timers may remember the dircmp command. Alas, that command is not available in Linux. In Linux, we use the same diff command to compare directories as well as files.
$ diff ~peter ~george
Only in /home/peter: announce.doc
diff /home/peter/.bashrc /home/george/.bashrc
76,83d72
<
< # Customization by Peter
< export LESS=-m
< export GREP_OPTIONS='--color=always'
< shopt -s histappend
< shopt -s cmdhist
< export PROMPT_COMMAND="history -a;$PROMPT_COMMAND"
< #echo keycode 58 = Escape |loadkeys -
Only in /home/george: .mcoprc
Only in /home/peter: .metacity
Only in /home/george: .newsticker-images
Only in /home/peter: .notifier.conf
Only in /home/george: targets.txt
Only in /home/peter: .xsession-errors
Without any option, diffing 2 directories will tell you which files exist in only 1 directory and not the other. Files that are common to both directories (e.g., .bashrc in the above listing) are diffed to see if and how their contents differ. Common files whose contents are identical produce no output.
If you are NOT interested in file differences, just add the -q (or --brief) option.
$ diff -q ~peter ~george | sort
Files /home/peter/.bashrc and /home/george/.bashrc differ
Only in /home/george: .mcoprc
Only in /home/george: .newsticker-images
Only in /home/george: targets.txt
Only in /home/peter: .metacity
Only in /home/peter: .notifier.conf
Only in /home/peter: .xsession-errors
Only in /home/peter: announce.doc
diff orders its output alphabetically by file/subdirectory name. I prefer to group the entries by whether they are common, and whether they exist only in the first or the second directory. That is why I piped the output of diff through sort in the above command.
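To try this without touching real home directories, here is a minimal sketch that builds two throwaway directories and runs the brief comparison through sort. The directory names (left, right) and the file names are made up for illustration.

```shell
# Build two small throwaway directories; names and contents are illustrative only.
work=$(mktemp -d)
mkdir "$work/left" "$work/right"
echo "same"      > "$work/left/common.txt"
echo "same"      > "$work/right/common.txt"
echo "version 1" > "$work/left/changed.txt"
echo "version 2" > "$work/right/changed.txt"
echo "x"         > "$work/left/only-left.txt"
echo "y"         > "$work/right/only-right.txt"

# Run from inside $work so the report shows short relative paths.
# diff exits with status 1 when it finds differences; `|| true` keeps
# strict shells (set -e with pipefail) from treating that as a failure.
report=$(cd "$work" && diff -q left right | sort) || true
printf '%s\n' "$report"

rm -rf "$work"
```

The sorted report groups the "Files ... differ" line ahead of the "Only in" lines. The identical common.txt is not mentioned at all.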
Note that by default diff does not reach into the subdirectories to compare the files and subdirectories at that level. To change its behavior to recursively go down subdirectories, add -r.
$ diff -qr ~peter ~george | sort
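The effect of -r is easy to see on a small test tree. The sketch below assumes GNU diff. The directories a and b and the file nested.txt are hypothetical names used only for this demonstration.

```shell
# Two directories that differ only in a file one level down.
work=$(mktemp -d)
mkdir -p "$work/a/sub" "$work/b/sub"
echo "one" > "$work/a/sub/nested.txt"
echo "two" > "$work/b/sub/nested.txt"

# Without -r, diff only notes that the subdirectory exists on both sides.
shallow=$(cd "$work" && diff -q a b) || true
# With -r, diff descends into sub/ and compares the files inside.
# diff exits with status 1 here because a difference is found.
deep=$(cd "$work" && diff -qr a b) || true

printf 'without -r: %s\n' "$shallow"
printf 'with -r:    %s\n' "$deep"

rm -rf "$work"
```

Without -r the nested difference goes unreported, so a backup check that needs to cover the whole tree should include -r.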
26 comments:
Any way to do this across an SSH tunnel?
phyzome
Assuming both directories reside on the remote machine:
ssh user@123.123.123.123 diff -rq dir1 dir2
Peter
I was hoping for something to compare remote to local.
I'll just mount the remote filesystem using SSH-fuse or whatever.
This is wrong on some systems.... For example on Ubuntu using diff (GNU diffutils) 2.8.1, you must use the -r switch to get the full comparison.
Thanks for the article...
I just backed up 30+ gig of data. I wanted to ensure I had a good copy.
I created an empty file in the middle of the directory structure of the backup.
I found it as a file on backup, but not original, and three files that were not the same.
Appreciate the guidance!
@phyzome (he won't need it anymore, but maybe someone else finds it useful):
If you want to compare files between local and remote machines, look into "rsync". As the name says, its purpose is to sync them, but you can use it without actually changing anything.
Especially if you are interested in missing/additional files AND changed files, rsync is cool, because it can compare the file contents without transferring the files over the wire, which makes it very fast.
In a situation where I had to compare remote directories on separate sites quickly, and where the time stamp was no good, I have used find with cksum to good effect.
find . -exec cksum {} \;
Directing the output to a text file for the two directories then left me two text files to compare.
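The find and cksum approach can be sketched end to end as follows. This is one possible arrangement, not necessarily what the commenter ran: the manifest helper, the site1/site2 directories, and the .cksum file names are all assumptions made for the example. It restricts find to regular files and sorts each manifest by path so the two lists line up.

```shell
# Two sample trees standing in for the two sites (names are illustrative).
work=$(mktemp -d)
mkdir -p "$work/site1/docs" "$work/site2/docs"
echo "alpha" > "$work/site1/docs/a.txt"
echo "alpha" > "$work/site2/docs/a.txt"
echo "beta"  > "$work/site1/docs/b.txt"
echo "BETA"  > "$work/site2/docs/b.txt"

# Hypothetical helper: checksum every regular file under a directory,
# using relative paths, sorted by path (the third cksum column).
manifest() {
    (cd "$1" && find . -type f -exec cksum {} + | sort -k 3)
}
manifest "$work/site1" > "$work/site1.cksum"
manifest "$work/site2" > "$work/site2.cksum"

# Compare the manifests and print the paths whose checksum lines differ.
# Note: taking the last field assumes file names contain no whitespace.
changed=$(diff "$work/site1.cksum" "$work/site2.cksum" \
    | awk '/^[<>]/ { print $NF }' | sort -u) || true
printf '%s\n' "$changed"

rm -rf "$work"
```

On real sites each manifest would be generated on its own machine, and only the small text files need to be copied for the comparison.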
Do you guys have a recommendation for what to do in the following case: I have two potentially different directory trees, say A and B, and I want to know which files are somewhere in A but not in B, and vice versa.
There are programs like "fdupes" that look for duplicate files across directory trees but they also look for duplicates within A and B. This can be rather painful if A and/or B contains a lot of files.
In my situation I have two directory trees with pictures which I want to unify.
@Tim
fdupes -f A B
Files from A will appear only if A itself contains more than one copy.
@Tim
You can use rsync too.
rsync -avn source-dir/ target-dir/
This will list the files that are different (or new) in source-dir compared to target-dir.
If you run it without the -n option, it will copy the missing (or different) files over to the target.
My directory has object files, but I need to diff only the files other than object files. Can someone help me?
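One way to handle this, assuming GNU diff, is the -x (--exclude) option, which skips files whose base name matches a shell pattern. The build1 and build2 directories below are made-up names for a small demonstration.

```shell
# Two sample build trees containing both sources and object files.
work=$(mktemp -d)
mkdir "$work/build1" "$work/build2"
echo "int main(void) { return 0; }" > "$work/build1/main.c"
echo "int main(void) { return 1; }" > "$work/build2/main.c"
echo "object-one" > "$work/build1/main.o"
echo "object-two" > "$work/build2/main.o"

# -x '*.o' makes diff skip every file whose base name ends in .o.
# Quote the pattern so the shell does not expand it first.
# diff exits with status 1 when differences remain, hence `|| true`.
result=$(cd "$work" && diff -qr -x '*.o' build1 build2) || true
printf '%s\n' "$result"

rm -rf "$work"
```

The option can be repeated to exclude several patterns in the same run.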
The rsync command I posted above compares files based on last modification date and size. If you want to compare the CONTENT of the files, you can make rsync use checksums (-c):
rsync -avcn source-dir/ target-dir
This will list the files whose content differs between the two directories. Again, if you want to replace the files that are different in target-dir with the files from source-dir, remove the -n option.
Very nice article. It very useful way for comparison. Thanks a lot.. :)
This is a very useful post and will give people peace of mind when backing up files.
I'm using the diff -r command before deleting the original copy of some backups that I've transferred to a new RAID array.
Keep in mind I did run rsync twice to do this, but this data is important to me so I want to be sure nothing is missing and that the file integrity is intact.
if you want the list of colliding filenames:
diff -sq dir1 dir2 | grep -v "Only in"
White text on black background sucks. But thanks for the post.
If you're looking to compare the contents of two directories, and want to see how they are the same (not how they are different), and getting cmpdir isn't an option, try this:
diff -q -y -s dir1 dir2 | grep "are identical$"
It's ugly and IO intensive, but works for me. The diff is running with the "-s" option to also output items that are identical. The grep then only picks out the stuff that is identical and doesn't display differences.
-A
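A small reproducible run of the same idea, using throwaway directories whose names and contents are invented for the example:

```shell
# One identical pair and one differing pair of files.
work=$(mktemp -d)
mkdir "$work/dir1" "$work/dir2"
echo "same" > "$work/dir1/keep.txt"
echo "same" > "$work/dir2/keep.txt"
echo "old"  > "$work/dir1/note.txt"
echo "new"  > "$work/dir2/note.txt"

# -s reports identical files; grep keeps only those report lines.
identical=$(cd "$work" && diff -qs dir1 dir2 | grep 'are identical$') || true
printf '%s\n' "$identical"

rm -rf "$work"
```

Adding -r to the diff options would extend the same check to subdirectories.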
Thank you, it helped a lot
I use vimdiff to diff a local file against a remote one. I know this is an old discussion, but eh...
vimdiff scp://user@host//file/path/to/file/file.txt ~/local/file/path/file.txt
Your simple and clear explanation was a great help, just what I needed while I was having nightmares in front of the threatening black shell.
I sincerely thank you :)
PS : maybe you could update it with an easy way to export to a text file the results of a diff, but that's a detail.
@Oliver
That's certainly an easy one, just redirect the stdout to a file with ">":
diff -rq dir1 dir2 > dir1-dir2-diffs
:)
I just ran across this today. I was trying to compare directories and got a "command not found" response. I appreciate the information.
I just wish they had just created a command alias with documentation that the command had been deprecated. Imagine the script errors that happen when people pull out an old script or try to build from old sources.
What should I do if I want to copy the files which are different to a third directory?
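One possible answer is to skip parsing diff output and walk the first directory with find and cmp instead. The sketch below is only one way to do it. It assumes that "different" means "missing from dir2 or with different contents", that copies should come from dir1, and that file names contain no newlines. The names dir1, dir2 and dir3 are placeholders.

```shell
# Sample trees: one identical file, one changed file, one file only in dir1.
work=$(mktemp -d)
mkdir -p "$work/dir1/sub" "$work/dir2/sub" "$work/dir3"
echo "same"  > "$work/dir1/same.txt"
echo "same"  > "$work/dir2/same.txt"
echo "new"   > "$work/dir1/sub/changed.txt"
echo "old"   > "$work/dir2/sub/changed.txt"
echo "extra" > "$work/dir1/only1.txt"

# Walk dir1 and copy each file that is missing from dir2 or whose
# contents differ, recreating the subdirectory layout under dir3.
(cd "$work/dir1" && find . -type f) | while IFS= read -r f; do
    if [ ! -f "$work/dir2/$f" ] || ! cmp -s "$work/dir1/$f" "$work/dir2/$f"; then
        mkdir -p "$work/dir3/$(dirname "$f")"
        cp -p "$work/dir1/$f" "$work/dir3/$f"
    fi
done

# List what ended up in the third directory.
copied=$(cd "$work/dir3" && find . -type f | sort)
printf '%s\n' "$copied"

rm -rf "$work"
```

Files that exist only in dir2 are not copied by this loop. Running it a second time with the two directories swapped would collect those as well.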
Thank you! Very useful!!
A useful trick I use a lot is:
diff -qr dir1 dir2 | grep -v '\.svn' | sort
That excludes all my .svn directories, which are really irrelevant if you're doing this..
Thanks man. This post helped me with comparing two directories with lots of files.
Sincerely,
Marc_Online_