Unix old-timers may remember the dircmp command. Alas, that command is not available in Linux. In Linux, we use the same diff command to compare directories as well as files.
$ diff ~peter ~george
Only in /home/peter: announce.doc
diff /home/peter/.bashrc /home/george/.bashrc
76,83d72
<
< # Customization by Peter
< export LESS=-m
< export GREP_OPTIONS='--color=always'
< shopt -s histappend
< shopt -s cmdhist
< export PROMPT_COMMAND="history -a;$PROMPT_COMMAND"
< #echo keycode 58 = Escape |loadkeys -
Only in /home/george: .mcoprc
Only in /home/peter: .metacity
Only in /home/george: .newsticker-images
Only in /home/peter: .notifier.conf
Only in /home/george: targets.txt
Only in /home/peter: .xsession-errors
Without any option, diffing two directories tells you which files exist in only one of the directories. Files that are common to both directories (e.g., .bashrc in the above listing) are diffed to see if and how their contents differ; common files that are identical produce no output.
If you are NOT interested in file differences, just add the -q (or --brief) option.
diff -q ~peter ~george |sort
Files /home/peter/.bashrc and /home/george/.bashrc differ
Only in /home/george: .mcoprc
Only in /home/george: .newsticker-images
Only in /home/george: targets.txt
Only in /home/peter: .metacity
Only in /home/peter: .notifier.conf
Only in /home/peter: .xsession-errors
Only in /home/peter: announce.doc
diff orders its output alphabetically by file/subdirectory name. I prefer to group the lines by whether the files are common, and whether they exist only in the first or the second directory. That is why I piped the output of diff through sort in the above command.
Note that by default diff does not reach into the subdirectories to compare the files and subdirectories at that level. To change its behavior to recursively go down subdirectories, add -r.
diff -qr ~peter ~george |sort
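To see the effect of -r on a small scale, here is a minimal sketch. It assumes GNU diff and a POSIX shell; the directory and file names (a, b, sub, deep.txt and so on) are made up for illustration and are not from the listings above.

```shell
# Build two small throwaway trees (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/a/sub" "$work/b/sub"
echo same > "$work/a/common.txt"
echo same > "$work/b/common.txt"
echo one  > "$work/a/sub/deep.txt"
echo two  > "$work/b/sub/deep.txt"
echo x    > "$work/a/only-a.txt"

# Non-recursive: sub/ is only mentioned as a common subdirectory.
# diff exits non-zero when it finds differences, hence the "|| true".
flat=$(LC_ALL=C diff -q "$work/a" "$work/b") || true

# Recursive: the differing file inside sub/ is reported as well.
deep=$(LC_ALL=C diff -qr "$work/a" "$work/b") || true

printf '%s\n' "--- diff -q ---" "$flat" "--- diff -qr ---" "$deep"
rm -rf "$work"
```

The first listing should mention sub only as a common subdirectory, while the second should name deep.txt as differing.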
36 comments:
Any way to do this across an SSH tunnel?
phyzome
Assuming both directories reside on the remote machine:
ssh user@123.123.123.123 diff -rq dir1 dir2
Peter
I was hoping for something to compare remote to local.
I'll just mount the remote filesystem using SSH-fuse or whatever.
This is wrong on some systems.... For example on Ubuntu using diff (GNU diffutils) 2.8.1, you must use the -r switch to get the full comparison.
Thanks for the article...
I just backed up 30+ gig of data. I wanted to ensure I had a good copy.
I created an empty file in the middle of the directory structure of the backup.
I found it as a file on backup, but not original, and three files that were not the same.
Appreciate the guidance!
@phyzome (he won't need it anymore, but maybe someone else finds it useful):
If you want to compare files between local and remote machines, look into "rsync". As the name says, its purpose is to sync them, but you can use it without actually changing anything.
Especially if you are interested in missing/additional files AND changed files, rsync is cool, because it can compare the file contents without transferring the files over the wire, which makes it very fast.
In a situation where I had to compare remote directories on separate sites quickly, and where time stamps were no good, I used find with cksum to good effect.
find . -exec cksum {} \;
Directing the output to a text file for the two directories then left me two text files to compare.
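A minimal sketch of that find-plus-cksum workflow, run here on two local throwaway trees so it is self-contained. The names siteA, siteB and the helper function sums are hypothetical; restricting find to regular files and sorting by file name are additions so that the two lists line up.

```shell
work=$(mktemp -d)
mkdir -p "$work/siteA/d" "$work/siteB/d"
echo hello > "$work/siteA/d/f.txt"
echo hello > "$work/siteB/d/f.txt"
echo old   > "$work/siteA/g.txt"
echo new   > "$work/siteB/g.txt"

# Print "checksum size path" for every regular file below $1,
# sorted by path so that both lists come out in the same order.
sums() { (cd "$1" && find . -type f -exec cksum {} \; | sort -k 3); }

sums "$work/siteA" > "$work/A.sums"
sums "$work/siteB" > "$work/B.sums"

# Lines that differ point at files with different content (or missing files).
changed=$(diff "$work/A.sums" "$work/B.sums") || true
printf '%s\n' "$changed"
rm -rf "$work"
```

In the real case each list would be produced on its own site and one of the two text files copied across before running diff.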
Do you guys have a recommendation for what to do in the following case: I have two potentially different directory trees, say A and B, and I want to know which files are somewhere in A but not in B, and vice versa.
There are programs like "fdupes" that look for duplicate files across directory trees but they also look for duplicates within A and B. This can be rather painful if A and/or B contains a lot of files.
In my situation I have two directory trees with pictures which I want to unify.
@Tim
fdupes -f A B
Files from A will appear only if more than one identical copy exists in A.
@Tim
You can use rsync too.
rsync -avn source-dir/ target-dir/
This will list the files that are different (or new) in source-dir compared to target-dir.
If you run it without the -n option, it will copy the missing (or different) files over to the target.
My directory has object files, but I need to diff only the files other than object files. Can someone help me?
The rsync command I posted above compares files based on last modification date and size. If you want to compare the CONTENT of the files, you can make rsync use checksums (-c):
rsync -avcn source-dir/ target-dir
This will list the files whose content is different in the two directories. Again, if you want to replace the files that are different in target-dir with the files from source-dir, remove the -n option.
Very nice article. It is a very useful way to do comparisons. Thanks a lot.. :)
This is a very useful post and will give people peace of mind when backing up files.
I'm using the diff -r command before deleting the original copy of some backups that I've transferred to a new RAID array.
Keep in mind I did run rsync twice to do this, but this data is important to me so I want to be sure nothing is missing and that the file integrity is intact.
if you want the list of colliding filenames:
diff -sq dir1 dir2 | grep -v "Only in"
White text on black background sucks. But thanks for the post.
If you're looking to compare the contents of two directories, and want to see how they are the same (not how they are different), and getting cmpdir isn't an option, try this:
diff -q -y -s dir1 dir2 | grep "are identical$"
It's ugly and IO intensive, but works for me. The diff is running with the "-s" option to also output items that are identical. The grep then only picks out the stuff that is identical and doesn't display differences.
-A
Thank you, it helped a lot
I use vimdiff to diff a local file against a remote one. I know this is an old discussion, but eh...
vimdiff scp://user@host//file/path/to/file/file.txt ~/local/file/path/file.txt
Your simple and clear explanation was a great help, just what I needed while I was having nightmares in front of the threatening black shell.
I sincerely thank you :)
PS : maybe you could update it with an easy way to export to a text file the results of a diff, but that's a detail.
@Oliver
That's certainly an easy one, just redirect the stdout to a file with ">":
diff -rq dir1 dir2 > dir1-dir2-diffs
:)
I just ran across this today. I was trying to compare directories and got a "command not found" response. I appreciate the information.
I just wish they had just created a command alias with documentation that the command had been deprecated. Imagine the script errors that happen when people pull out an old script or try to build from old sources.
What should I do if I want to copy the files which are different to a third directory?
Thank you! Very useful!!
A useful trick I use a lot is:
diff -qr dir1 dir2 | grep -v .svn | sort
That excludes all my .svn directories, which are really irrelevant if you're doing this..
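GNU diff can also skip entries itself through its -x (--exclude) option, which takes a shell pattern matched against base names. The sketch below assumes GNU diff; the trees and file names are invented for illustration, and the '*.o' pattern is one possible way to leave out object files.

```shell
work=$(mktemp -d)
mkdir -p "$work/dir1/.svn" "$work/dir2/.svn"
echo r1      > "$work/dir1/.svn/entries"
echo r2      > "$work/dir2/.svn/entries"
echo obj1    > "$work/dir1/main.o"
echo obj2    > "$work/dir2/main.o"
echo 'int x;' > "$work/dir1/main.c"
echo 'int y;' > "$work/dir2/main.c"

# -x .svn skips the .svn directories, -x '*.o' skips object files;
# only the differing source file should be left in the report.
filtered=$(LC_ALL=C diff -qr -x .svn -x '*.o' "$work/dir1" "$work/dir2" | sort)
printf '%s\n' "$filtered"
rm -rf "$work"
```

Unlike filtering with grep -v afterwards, the excluded directories are never descended into, so nothing inside them is compared at all.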
Thanks man. This post helped me with comparing two directories with lots files.
Sincerely,
Marc_Online_
Great productive tip. Saved me time comparing two directories.
Is there a way to get a machine-readable rather than human-readable output, particularly for "Files X and Y differ" lines? My problem with this format is that 'X' and 'Y' are not delimited in any way, so if 'X' happens to be 'foo and bar' then there is no way to parse these lines unambiguously.
So, how do I?
If you want to just compare what files and subdirectories are different in one directory from another, ignoring differences within files and common subdirectories/files, you could add the following to your .bashrc or .bash_profile:
dircmp() { diff -q "$@" | grep -v "^Files" | grep -v "^Common"; }
Lots of interesting tips here, but I am still confused. I tried the following commands under Ubuntu 12.04 to compare a directory (recursively) to a copy on a mounted SMB share:
rsync -avn
rsync -avnc
diff -rq
in increasing order of execution time.
The rsync commands produced identical lists containing several directories and files. The diff command, however, did not report any differences. How is this possible?
I normally use ls | sort (or ls -l if I want to include file permissions, ownership, sizes and modification/access times in my comparison, or ls -R if I want to do a recursive comparison). So I use it like this:
ls first_directory | sort > ~/tempfile1
ls second_directory | sort > ~/tempfile2
diff ~/tempfile1 ~/tempfile2
Of course you have to pipe it through sort as well because otherwise the lines sometimes come out in a different order even if they are the same. I have never had any problems with using this method but it is nice to have a slightly more elegant way of doing this!
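A possible variation on the same idea swaps the final diff for the standard comm utility, which splits two sorted lists into "only in the first", "only in the second" and "in both". The directory and file names below are made up, and like the ls approach this compares names only, not contents.

```shell
work=$(mktemp -d)
mkdir "$work/first_directory" "$work/second_directory"
touch "$work/first_directory/a"  "$work/first_directory/b"
touch "$work/second_directory/b" "$work/second_directory/c"

# comm expects sorted input, so keep the sort step from the original recipe.
ls "$work/first_directory"  | sort > "$work/tempfile1"
ls "$work/second_directory" | sort > "$work/tempfile2"

only_first=$(comm -23 "$work/tempfile1" "$work/tempfile2")   # names only in the first
only_second=$(comm -13 "$work/tempfile1" "$work/tempfile2")  # names only in the second
in_both=$(comm -12 "$work/tempfile1" "$work/tempfile2")      # names present in both

echo "only in first:  $only_first"
echo "only in second: $only_second"
echo "in both:        $in_both"
rm -rf "$work"
```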
It's a really useful, nice post. For those who are interested, I added two tools that are really powerful.
Tools like "rsync" and "diff" are really great. I would also mention "unison" and the new shining "syncany".
UNISON: To sync two folders with unison, simply run:
unison folderA folderB
In contrast to rsync, unison syncs in both directions. Furthermore, unison uses an rsync-like transfer algorithm and works on Windows, Linux and Mac. Unison is very mature and stable software.
SYNCANY: Use
sy status
to show differences,
sy up
to upload differences, and
sy down
to download differences.
Syncany also has version control capabilities. Nevertheless, syncany was still in alpha status as of 2016.
Both tools are for advanced usage; for a basic comparison, a simple "diff -rq folderA folderB" does the job. With rsync, unison and syncany you can automate tasks, share folders like Dropbox and sync them over the internet.
Greetings and hopefully it helped you.
Diff folders by checksums, between 'local' and SSH 'remote':
rsync --dry-run -aci --delete /local/path/ -e "ssh -i ~/.ssh/sshkey" user@domain.de:/remote/path/
This does a simulation only (-n, --dry-run), as if you were archiving (-a, --archive),
compares by checksum (-c, --checksum), outputs a change summary for all updates
(-i, --itemize-changes), and acts as if you were mirroring to the destination (--delete).
We can use diff -rq path1(dir)/ path2(dir) > list.txt
The whole comparison listing will be saved in list.txt.
Thanks, it works.
Thank you very much, you saved me a lot of time !