Quote:
Originally posted by et voilà
Do I have to understand that you are more for the Kademlia approach?
I'm feeling too that it is a less quick and dirty way to deal with requeries...
There was no reference to Kademlia in my message. I was just explaining why 24-bit hashes (3 bytes) are very weak identifiers for files: some users may think that the 16 million possible values they allow should be enough to identify all files available on Gnutella, when in fact that is only enough to manage very small subsets of files.
OK, I had not understood that you added the verification step after finding a reference. That's a good idea, and it is statistically sound. For a 24-bit hash, the birthday bound means the chance of at least one collision is already about 40% after 4096 randomly found files (and passes 50% near 4,800), but the risk of a collision when querying any single source will be below 1%. It will not be null, though, and that's why the verification step is required!
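The birthday figures above can be checked with the standard approximation p ≈ 1 − exp(−n²/2N). A short sketch (the function name is mine, not anything from Gnutella code):

```python
import math

def collision_probability(n: int, space: int) -> float:
    """Probability that at least two of n uniformly random hashes
    collide in a space of `space` values, using the birthday
    approximation p ~= 1 - exp(-n^2 / (2 * space))."""
    return 1.0 - math.exp(-n * n / (2.0 * space))

N = 2 ** 24  # 24-bit hash space: 16,777,216 values

# At 4096 files (the square root of the space), collisions are
# already quite likely:
print(f"{collision_probability(4096, N):.3f}")  # ≈ 0.393

# The 50% mark is crossed near 1.1774 * sqrt(N) ≈ 4823 files:
print(f"{collision_probability(4823, N):.3f}")  # ≈ 0.500
```

This is why a short hash works as a cheap pre-filter but never as proof of identity: the collision odds across thousands of results are large, even though the odds for any one candidate source remain tiny.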
So the overhead of verifying the source with the complete hash will be extremely small compared to the bandwidth gained when locating candidate sources. This statistical optimization is similar to the one performed in QRP with 16-bit hash values (64K-slot tables), whose effective cryptographic strength is only about 6 bits (64 distinct values on average; still enough to cut traffic to about 2% for moderately filled QRP tables). If QRP tables are filled at 25%, the collision risk is evaluated at roughly 80%, and QRP loses its capacity to filter searches efficiently. However, for the more than 80% of leaf nodes whose QRP tables are filled below 10%, the collision risk for non-matching keywords falls under 25%, which means that QRP can filter 75% of the queries sent to shielded leaf nodes, saving much bandwidth on UltraPeers and allowing them to support more leaf nodes.
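The QRP numbers can be illustrated with a toy model. This is an assumption-laden sketch, not the real QRP hash or table format: I model the 64K-slot table as a randomly filled set of slots, and a query passes only if every one of its keywords lands on an occupied slot. Under that model the false-positive rate for non-matching queries is roughly fill^k for k independent keywords; the figures in the post presumably also account for real keyword distributions.

```python
import random

TABLE_SLOTS = 64 * 1024  # 16-bit hash values -> 64K-slot QRP table

def make_table(fill_fraction: float, rng: random.Random) -> set:
    """Toy QRP table: a set of occupied slots filled at random to the
    given fraction (a stand-in for hashing a leaf's real keyword list;
    this is an assumption, not the actual QRP hashing scheme)."""
    count = int(TABLE_SLOTS * fill_fraction)
    return set(rng.sample(range(TABLE_SLOTS), count))

def false_positive_rate(fill_fraction: float, keywords_per_query: int,
                        trials: int = 20_000, seed: int = 1) -> float:
    """Fraction of queries for keywords the leaf does NOT share that
    still pass the table (every keyword slot happens to be set)."""
    rng = random.Random(seed)
    table = make_table(fill_fraction, rng)
    hits = 0
    for _ in range(trials):
        # random slots model the hashes of non-matching keywords
        if all(rng.randrange(TABLE_SLOTS) in table
               for _ in range(keywords_per_query)):
            hits += 1
    return hits / trials

# A table filled at 10% forwards only ~10% of single-keyword misses:
print(false_positive_rate(0.10, keywords_per_query=1))
# Multi-keyword queries are filtered even more aggressively (~1%):
print(false_positive_rate(0.10, keywords_per_query=2))
```

The takeaway matches the post: the denser a leaf's table, the worse QRP filters, but since most leaves keep sparse tables, UltraPeers still drop the large majority of non-matching queries.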