4.8.1 is the current release (though a beta is expected shortly), and it uses SHA-1 and THEX.
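
For anyone unsure what the SHA-1 part looks like in practice: Gnutella clients identify a file by a `urn:sha1:` string, which is the SHA-1 digest of the file contents encoded in Base32. Below is a minimal sketch of that calculation, not LimeWire's actual code; the function names `sha1_urn` and `sha1_urn_for_file` are made up for illustration. THEX (the Tiger tree hash) is left out because Python's standard library does not ship a Tiger implementation.

```python
import base64
import hashlib


def sha1_urn(data: bytes) -> str:
    """Return a urn:sha1: identifier for the given bytes (hypothetical helper)."""
    digest = hashlib.sha1(data).digest()  # 20 raw bytes
    # 20 bytes encode to exactly 32 Base32 characters, so there is no padding.
    return "urn:sha1:" + base64.b32encode(digest).decode("ascii")


def sha1_urn_for_file(path: str, chunk_size: int = 64 * 1024) -> str:
    """Same as sha1_urn, but reads the file in chunks to keep memory use small."""
    hasher = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return "urn:sha1:" + base64.b32encode(hasher.digest()).decode("ascii")


if __name__ == "__main__":
    print(sha1_urn(b"hello gnutella"))
```

Two files with the same bytes always give the same URN, whatever they are named, which is why the hash is more useful than the filename when you are trying to spot a particular file.
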
Definitive answers are hard to come by: the project is quite dynamic, so the answers change as the code improves.
If you can read code, the changelog at
http://limewire.org/fisheye/changelog/limecvs/ is a good way to keep up with the changes.
The Gnutella developers try to maintain the protocol documentation at
http://www.the-gdf.org/wiki/index.php?title=Main_Page
As for spotting those files, there are early attempts to develop filters, which I hope will make it into the beta.
Another keyword to search for is "credence".