General Gnutella Development Discussion For general discussion about Gnutella development. |
Partial File Sharing Protocol Development

This thread is for discussing and writing an open protocol for sharing partial files across the gnutella network. This feature will greatly increase the amount of resources available to the network. Everyone's input is welcome. Here are two great threads at the_gdf to get started:
http://groups.yahoo.com/group/the_gdf/message/6807
http://groups.yahoo.com/group/the_gdf/message/6918

Thanks everyone, GF
| |||
PFSP 0.1 - Very Very Rough

Here is a very rough draft:

Partial File Sharing Protocol 0.1

1 Introduction

1.1 Purpose

Creation of an open protocol to allow the sharing of partially downloaded files between gnutella clients. There are many benefits to this, and it can be done with the same (or even greater) confidence as the sharing of complete, hashed files.

The best way to share partial files is to use tree hashes. This approach actually provides greater confidence than the current full file hash because the client can confirm individual segments of the file. The current full file hash can only confirm that the file is the desired file after it has been fully downloaded. If there is an error, there is no way to tell where it is, and the full file must be discarded. This could be a huge waste of bandwidth, especially when throwing out a 700 MB file.

An added bonus of the tree hash method is the ability to resume from nearly identical files. This solves the problem we are having with files that are the same but have different metadata. For example, as soon as I play many video files, something in them changes and the file cannot be swarmed any longer. The same problem exists when the same song has different meta tags at the end. The current proposed solution is to do a second hash of the file without the portion containing the metadata. This is very file type specific and offers little additional benefit compared with the tree hash, which would offer true swarming, partial file sharing, and the ability to share nearly identical files.

For example: I want to download songx, so I search for it and find it. There are quite a few versions with the same size, bitrate, etc., but they have different metadata, so the hash is different. Well, with the tree hash you could use all of those sources to swarm from for the parts of the file that are the same!
This would greatly increase swarming speeds while providing the same security and confidence we currently have with hashed files!

2 Protocol Definition

2.1 Tiger Tree Hash

A Tiger Tree hash MUST be generated for each file shared by the client. It is calculated by applying the Tiger hash to 1024-byte blocks of a stream, then combining the interim values through a binary hash tree. Clients MUST NOT share partial files that have not had a Tiger Tree Hash value calculated. See:
http://sourceforge.net/projects/tigertree/
http://www.cs.technion.ac.il/~biham/Reports/Tiger/
http://bitzi.com/developer/bitprint

2.2 Sub Hash Ranges

Clients MUST store sub hashes for 1 MB ranges and may choose to store them for smaller ranges.

2.3 Transmissions and Display of Tiger Tree Hashes

3.1 Queries and Replies

Queries by clients that can handle Partial File Share MUST indicate this in the query (GGEP?). All replies of partial files MUST indicate the ranges available and any X-Alternate-Locations for any parts of the file. Clients SHOULD NOT display partial file results to the user UNLESS the location of a full file is found or the ranges returned cover the range of the full file.

3.1.2 Searching by Sub Hash

Clients MAY search by any 1 MB sub hash.

3.2 Requesting Ranges

HTTP Range GETs are the best standard multivendor way to request parts of a larger file. Clients SHOULD request all required ranges.

3.3 Uploading Partial Files

A client with a complete file SHOULD randomly upload ranges of the file. If ranges of the file are requested, then the client SHOULD randomly choose which ranges to supply first. The intention is to propagate the whole file across the network as rapidly as possible. Clients MAY use X-Alternate-Locations to decide which ranges are rarest and preferentially upload those ranges.

3.4 Sharing Partial Files

Clients that are capable of sharing partial files MUST share partial files by default. Clients MAY allow users to deactivate the sharing of partial files.
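To make section 2.1 a bit more concrete, here is a minimal sketch of a binary hash tree over 1024-byte blocks, and of how leaf hashes could show which blocks two nearly identical files have in common. Several things here are assumptions and not part of the draft: Python's hashlib has no Tiger, so SHA-256 stands in for it; the 0x00/0x01 prefixes for leaves and inner nodes and the rule that an unpaired node is promoted unchanged are borrowed conventions the draft does not specify; and the names `tree_levels`, `root_hash` and `matching_blocks` are made up for illustration.

```python
from __future__ import annotations

import hashlib

BLOCK_SIZE = 1024  # leaf block size named in section 2.1


def _h(data: bytes) -> bytes:
    # ASSUMPTION: SHA-256 is only a stand-in, the draft calls for Tiger.
    return hashlib.sha256(data).digest()


def tree_levels(data: bytes, block_size: int = BLOCK_SIZE) -> list[list[bytes]]:
    """Return every level of the hash tree, leaf hashes first, root last."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if not blocks:
        blocks = [b""]  # an empty file still gets a single leaf
    # ASSUMPTION: leaves and inner nodes are hashed with distinct prefixes.
    level = [_h(b"\x00" + block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                parents.append(_h(b"\x01" + level[i] + level[i + 1]))
            else:
                # ASSUMPTION: an unpaired node moves up unchanged.
                parents.append(level[i])
        level = parents
        levels.append(level)
    return levels


def root_hash(data: bytes) -> bytes:
    """The single hash at the top of the tree."""
    return tree_levels(data)[-1][0]


def matching_blocks(a: bytes, b: bytes) -> list[int]:
    """Indexes of leaf blocks whose hashes are equal in both files."""
    leaves_a = tree_levels(a)[0]
    leaves_b = tree_levels(b)[0]
    return [i for i, (x, y) in enumerate(zip(leaves_a, leaves_b)) if x == y]
```

With two copies of a song that differ only in a tag appended at the end, the root hashes differ but `matching_blocks` reports every block before the tag, which are the blocks both sources could serve in a swarm.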
To Do:

- Add a new GGEP extension to queries that specifies that you want to see partial files.
- Add a new GGEP extension to queryhits that simply specifies the percentage complete for partial hits (a simple partial/full flag would do as well).
- In an HTTP GET request, add an X-Gnutella-Partial-File header that lists the IP/ports of servers thought to have at least X percent of a file. Do not list the percentage here.
- Servers that support partial file gets should also support a new CGI-type request of a style like GET /uri-res/N2PR?urn:sha1:PLSTHIPQGSSZTS5FJUPAKUZWUGYQYPFB where the returned payload is a csv file of the tuple (start, stop, active).
- Clients need to be able to indicate the smallest increment of hash they will provide. I'm not sure it makes sense to store all of the per-block hashes (1024-byte blocks, per section 2.1). Anyway, there should be a header indicating that the client stores hashes as small as X. Data would be the best way to decide, of course, but 1 MB "feels" like a logical size.
- There should be a mechanism for searching by a sub-hash. I guess if clients store 1 MB hashes it would be possible to search by any of the 1 MB hashes, but it would probably be less computationally intensive if there was an agreed-on sub-hash to search by. I'm not sure how the metadata at the start of a file works. If it is always the same size regardless of content (and I doubt this), then it would be possible to search for sub-hashes after the first part of the file to do sub-hash matches between nearly identical files with different "early" metadata. I guess it would make sense to use either the first 1 MB or the second 1 MB for searching by sub-hash.
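The N2PR item above and the display rule in section 3.1 could fit together roughly as follows. This is only a sketch under assumptions the draft leaves open: start and stop are taken to be byte offsets with stop inclusive, active is treated as an opaque 0/1 flag, there is no header row, and `parse_ranges` and `covers_whole_file` are hypothetical names.

```python
from __future__ import annotations

import csv
import io


def parse_ranges(payload: str) -> list[tuple[int, int, bool]]:
    """Parse csv rows of (start, stop, active) into typed tuples.

    ASSUMPTION: byte offsets, stop inclusive, active written as 0 or 1.
    """
    ranges = []
    for row in csv.reader(io.StringIO(payload)):
        if not row:
            continue  # skip blank lines
        start, stop, active = row
        ranges.append((int(start), int(stop), active.strip() == "1"))
    return ranges


def covers_whole_file(ranges: list[tuple[int, int, bool]], file_size: int) -> bool:
    """True if the ranges together cover bytes 0 .. file_size - 1."""
    next_needed = 0  # first byte not yet known to be available
    for start, stop, _active in sorted(ranges):
        if start > next_needed:
            return False  # gap before this range
        next_needed = max(next_needed, stop + 1)
    return next_needed >= file_size
```

A client could merge the ranges reported by all known sources and only show the result to the user once `covers_whole_file` is true, which is one way to read the SHOULD NOT in section 3.1.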
| ||||||||
Re: PFSP 0.1 - Very Very Rough
And that's how I would do it: Remote clients return queryreplies for partial files if the file is requested by hash. The local client tries to establish an HTTP/1.1 connection to them and requests the ranges of the file it needs. The remote client (if no other error occurs) answers with a 206 (Partial Content) if it has a subsection of the requested range and with a 416 (Requested range not satisfiable) if it doesn't. Then the downloading is done and everybody is happy.

The main advantages I see with this way of handling partial uploads are that it's easy to code and relatively secure (do we need absolute security?). It needs fewer changes to the protocol and has most of the benefits your proposal has. I'd rather try using the hammer I have at home before I go buy a bigger one.
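The 206/416 decision described above can be sketched as a small pure function, with no real HTTP server involved. The assumptions: held ranges are inclusive byte offsets, only the simple `bytes=first-last` form of the Range header is handled (open-ended and suffix forms raise an error here), and `parse_range_header`, `answer_range_request` and `content_range` are illustrative names, not any client's API.

```python
from __future__ import annotations


def parse_range_header(value: str) -> tuple[int, int]:
    """Parse a simple 'bytes=first-last' Range header value."""
    unit, _, spec = value.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("only byte ranges are handled in this sketch")
    first, _, last = spec.partition("-")
    return int(first), int(last)  # raises ValueError for open-ended forms


def answer_range_request(held, first: int, last: int):
    """Return (206, (start, stop)) for the first held subsection of the
    requested range, or (416, None) if nothing held overlaps it."""
    for start, stop in sorted(held):
        lo, hi = max(start, first), min(stop, last)
        if lo <= hi:
            return 206, (lo, hi)
    return 416, None


def content_range(start: int, stop: int, total: int) -> str:
    """Value of the Content-Range header that goes with a 206 reply."""
    return f"bytes {start}-{stop}/{total}"
```

For example, a client holding bytes 0-999 and 5000-5999 of a 6000-byte file would answer a request for 500-1500 with a 206 covering 500-999, and a request for 1000-4999 with a 416.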
| |||
security

It is actually rather easy to fake a full file hash: just lie! There is no way for one client to know that the other client lied until it has downloaded the whole file and rehashed it itself. Then, if there is a problem, it doesn't know whether it was just a mistake in transferring data. If the file was multi-sourced, there is no way to know which of the many clients it downloaded from lied. This is a MAJOR vulnerability in the current gnutella network. One rogue client could search the net, find the size and hash of files, and then use the same file size and hash to respond to ALL the queries it can, send garbage data as just a small part of a swarm, and destroy thousands or possibly millions of file transfers with minimal bandwidth usage.
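For contrast, here is a rough sketch of what per-segment hashes would allow in a multi-source download: each received segment is checked on arrival, so a bad segment can be pinned on the source that sent it. It assumes the downloader already holds segment hashes it trusts (for instance ones verified against a tree hash root), which this post does not cover. SHA-256 is a stand-in for Tiger, and the "host:port" strings and the names `segment_digest` and `find_bad_sources` are made up for illustration.

```python
from __future__ import annotations

import hashlib


def segment_digest(data: bytes) -> bytes:
    # ASSUMPTION: SHA-256 stands in for the Tiger hash of the draft.
    return hashlib.sha256(data).digest()


def find_bad_sources(segments, trusted_hashes) -> list[str]:
    """Return the sources that sent a segment failing its hash check.

    segments: iterable of (segment_index, source, data)
    trusted_hashes: dict mapping segment_index to the expected digest
    """
    bad = set()
    for index, source, data in segments:
        if segment_digest(data) != trusted_hashes[index]:
            bad.add(source)  # only this segment needs to be fetched again
    return sorted(bad)
```

With only a full file hash, the same download could not tell which of the sources sent the garbage, and nothing could be checked until the whole file had arrived.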
| ||||
Re: security
I say just give simple partial sharing a try, to see if tree hashes are really necessary. This simple kind of partial sharing could be ready in a month without tedious discussions in the GDF. Your kind of partial sharing will need half a year at least until it's implemented. |
| |||
Yes, I have no doubt that there will be major attacks on the gnutella network at some point. The argument that there are already so many holes in gnutella security, so why not a few more, isn't a very good approach. We need to be working to patch up those holes, not create new ones.
| |||
In fact, corruption of a file segment is more common in partial file transfer than a faked file or file segment. Without a tree hash (or some equivalent mechanism), a single corrupted chunk will not be detected until the entire file has been downloaded, at which point the whole file has to be dumped.

People should at least try out some of the p2p software that has implemented partial file transfer (e.g. eDonkey) and get some first-hand user experience before jumping in to argue about whether 'partial file transfer' is necessary and how it should work. It's fair to say that once you've tasted p2p software based on 'partial file transfer', you will never want to go back.
| |||
Taliban, the Java source for Tiger is available here: http://www.cryptix.org/products/jce/index.html It does not have the "Tree" functionality, of course.