Posted June 26th, 2002 by gnutellafan
PFSP 0.2 - Programmers needed!

Well, time for you programmers to add this amazing feature to all the open source clients out there.


PFSP 0.2

Many thanks go out to Tor for the PFSP 0.2 update. The PFSP is now usable, and hopefully someone is willing to implement it.
http://groups.yahoo.com/group/the_gdf/message/7984


____________________________

Partial File Sharing Protocol 0.2

Here, the server is the host that is providing the file, and client is the
host that requests the file.


1. Partial File Transfer

The server assigns file indexes to partial files and allows HTTP requests
for them. Only partial requests (with a Range header) are accepted. Servers
that support uri-res file requests should also allow such requests for
partial files. Servers should keep the file index when the file is completed
and moved to the incoming files folder.

The X-Available-Ranges header is used by the server to inform the client
about what ranges are available.

X-Available-Ranges: bytes=0-10,20-30
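
A minimal sketch of how a client might read this header. The function name is made up for illustration, and it assumes the ranges are inclusive byte offsets, as in ordinary HTTP byte ranges.

```python
def parse_available_ranges(value):
    """Parse 'bytes=0-10,20-30' into a sorted list of inclusive (start, end) tuples."""
    unit, _, spec = value.strip().partition("=")
    if unit.strip().lower() != "bytes" or not spec:
        raise ValueError("expected 'bytes=<start>-<end>,...'")
    ranges = []
    for part in spec.split(","):
        start_text, _, end_text = part.strip().partition("-")
        start, end = int(start_text), int(end_text)  # int() rejects empty or junk values
        if start > end:
            raise ValueError(f"inverted range: {part!r}")
        ranges.append((start, end))
    return sorted(ranges)


print(parse_available_ranges("bytes=0-10,20-30"))  # [(0, 10), (20, 30)]
```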

The client requests the ranges it wants using the Range header.

Range: bytes=0-
means the client wants any ranges the server can provide.
The server then provides the range it wants to upload using a 206 Partial
Content response. This allows the server to upload different ranges to
different hosts, and save bandwidth by allowing them to get the other parts
from each other.
The 206 response contains a Content-Range header of the form

Content-Range: bytes <start>-<end>/<total_size>

Note that <total_size> is the size of the FULL file.

If the server is unable to provide any part of the requested ranges, it
returns a 416 Requested Range Not Satisfiable response.
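
The server side of this could be sketched roughly as below. The proposal leaves the choice of range entirely up to the server, so the policy here (first overlap, capped at a chunk size) is only an assumption, and choose_upload_range and max_chunk are hypothetical names.

```python
def choose_upload_range(requested, available, total_size, max_chunk=10000):
    """Pick a range to upload, or answer 416 if nothing requested is available.

    requested: inclusive (start, end); end is None for an open range like 'bytes=0-'.
    available: list of inclusive (start, end) ranges the server already has.
    Returns (status_code, headers).
    """
    req_start, req_end = requested
    if req_end is None:
        req_end = total_size - 1
    for avail_start, avail_end in sorted(available):
        start = max(req_start, avail_start)
        end = min(req_end, avail_end)
        if start <= end:
            # Assumed policy: serve the first overlap, at most max_chunk bytes.
            end = min(end, start + max_chunk - 1)
            return 206, {
                "Content-Range": f"bytes {start}-{end}/{total_size}",
                "Content-Length": str(end - start + 1),
            }
    return 416, {}


print(choose_upload_range((73826, None), [(0, 285749)], 533273))
print(choose_upload_range((300000, None), [(0, 285749)], 533273))
```

The first call reproduces the Content-Range from the sample negotiation in section 4; the second asks for bytes the server does not have yet and gets a 416.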


2. Tree Hashes

Tree hashes are not absolutely required for Partial File Sharing, so you
don't have to implement this part at first. TigerTree can be implemented
if/when corrupt files become a problem. It is in this proposal because
Partial File Sharing might cause corrupt files to spread faster.

TigerTree hashes are computed using a 1024-byte base size. It is then up to
each vendor to decide how many sub-hashes to actually store. Storing (and
advertising) the top 10 levels of the tree might be a good decision. It would
allow a resolution of about 2 MB on a 1 GB file, and requires only about
25 kB of hash data per file.
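
The numbers can be checked with a little arithmetic. This is just a sketch: it assumes a binary tree with the root counted as level 0, 24-byte (192-bit) Tiger digests, and a 1 GB file of 2^30 bytes. tree_summary is a made-up helper name.

```python
LEAF_SIZE = 1024    # base size given in the proposal
DIGEST_SIZE = 24    # a Tiger digest is 192 bits = 24 bytes


def tree_summary(file_size, levels_kept):
    """Return (max_hash_bytes, resolution_bytes) for the top levels_kept levels."""
    leaves = max(1, -(-file_size // LEAF_SIZE))   # ceiling division
    depth = (leaves - 1).bit_length()             # root is level 0, leaves at level depth
    kept = min(levels_kept, depth + 1)
    max_nodes = 2 ** kept - 1                     # upper bound: 1 + 2 + 4 + ...
    resolution = LEAF_SIZE * 2 ** (depth - (kept - 1))
    return max_nodes * DIGEST_SIZE, resolution


hash_bytes, resolution = tree_summary(2 ** 30, 10)
print(hash_bytes, resolution)   # 24552 2097152
```

So ten levels means at most 1023 hashes, a bit under 25 kB, and each hash on the lowest stored level covers 2 MB of the file.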

The tree is provided as specified in the Tree Hash EXchange format (THEX) at
http://www.open-content.net/specs/dr...e-thex-01.html
It basically says that the hash tree is provided as a long stream of binary
data starting with the root hash, then the two hashes it is computed from,
and so on.

To inform the client about where the hash tree can be retrieved, the server
includes an X-Thex-URI header of this form

X-Thex-URI: <URI> ; <ROOT>

<URI> is any valid URI. It can be to a uri-res translator, and can even
point to another host. The client can then retrieve desired parts of the
hash tree by doing range requests for the specified URI.

<ROOT> is the root TigerTree hash in base32 format.
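
A sketch of how a client might take this header apart. It assumes the root hash follows the last semicolon and is an unpadded base32 Tiger digest (24 bytes, which is 39 characters). parse_thex_uri is a hypothetical name, and the input is the sample value from section 4 with the line breaks and spaces removed.

```python
import base64


def parse_thex_uri(value):
    """Split '<URI> ; <ROOT>' and decode the base32 root hash to raw bytes."""
    uri, sep, root = value.rpartition(";")   # assumption: ROOT follows the last ';'
    if not sep or not uri.strip():
        raise ValueError("expected '<URI> ; <ROOT>'")
    root = root.strip().upper()
    # b32decode needs the input padded to a multiple of 8 characters.
    digest = base64.b32decode(root + "=" * (-len(root) % 8))
    if len(digest) != 24:
        raise ValueError("root hash is not a 24-byte Tiger digest")
    return uri.strip(), digest


uri, digest = parse_thex_uri(
    "/uri-res/n2x?urn:sha1:QLFYWY2RI5WZCTEP6MJKR5CAFGP7FQ5X;"
    "VEKXTRSJPTZJLY2IKG5FQ2TCXK26SECFPP4DX7I"
)
print(uri)
print(len(digest))   # 24
```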


3. How to find the location of partial files.

This proposal does not affect Gnutella messages in any way. The only
available means of spreading the location of a partial file is through the
download mesh in X-Gnutella-Alternate-Location headers. I think this should
work very well. Since those who share a partial file are also downloading
the same file, they will be able to send alt-loc headers to other hosts
sharing the full file.

It would be good if the available ranges could be specified in the
X-Gnutella-Alternate-Location headers, but I don't really know how to do
that most efficiently. The information would quickly become outdated, and is
not very important anyway.



Spreading partial files in the download mesh will cause servents that do
not support partial file sharing to receive addresses to partial sources. I
don't think that is a problem. The worst thing that can happen is that they
won't be able to use those sources.


4. Sample negotiation

Here is a sample negotiation. I don't think it will look exactly like this,
but it should show the headers in action. Clients might want to request a
small range first, to get the list of available ranges. There are some
line breaks in long headers below.

Client:
GET /get/1234/my_song.mp3 HTTP/1.1
User-Agent: FooBar/1.0
Host: 123.123.123.123:6346
Connection: Keep-Alive
Range: bytes=73826-
X-Gnutella-Content-URN: urn:sha1:QLFYWY2RI5WZCTEP6MJKR5CAFGP7FQ5X
X-Gnutella-Alternate-Location:
http://theclient.com:6346/get/2468/my_song.mp3

Server:
HTTP/1.1 206 Partial Content
Server: FooBar/1.0
Content-Type: audio/mpeg
Content-Range: bytes 73826-83825/533273
Content-Length: 10000
Connection: Keep-Alive
X-Available-Ranges: bytes=0-285749
X-Gnutella-Content-URN: urn:sha1:QLFYWY2RI5WZCTEP6MJKR5CAFGP7FQ5X
X-Thex-URI:
/uri-res/n2x?urn:sha1:QLFYWY2RI5WZCTEP6MJKR5CAFGP7FQ5X;VEKXTRSJPTZJLY2IKG5FQ2TCXK26SECFPP4DX7I

<10000 bytes of data>

"n2x" above is an example. Someone should comment on what should be used.
Since the URI is provided in the X-Thex-URI header, each vendor can choose
how to provide the THEX data.
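
Since the server picks the range, the client has to trust Content-Range rather than its own Range request. A rough sketch of a sanity check on the 206 headers follows. check_partial_response is a made-up name, and the headers are assumed to be already parsed into a dict.

```python
import re

CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+)$")


def check_partial_response(headers):
    """Return (start, end, total) if Content-Range and Content-Length agree."""
    match = CONTENT_RANGE.match(headers["Content-Range"].strip())
    if match is None:
        raise ValueError("malformed Content-Range")
    start, end, total = (int(group) for group in match.groups())
    if not start <= end < total:
        raise ValueError("range does not fit inside the full file")
    if end - start + 1 != int(headers["Content-Length"]):
        raise ValueError("Content-Length does not match the range")
    return start, end, total


sample = {
    "Content-Range": "bytes 73826-83825/533273",
    "Content-Length": "10000",
}
print(check_partial_response(sample))   # (73826, 83825, 533273)
```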


5. Issues

* A server can decide to upload only a part of the requested range. This
means that clients cannot be sure to get the file in sequential order.
* Also clients cannot decide how many bytes to download per request. Perhaps
the server should be required to upload a range that begins with the first
requested byte.
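
On the client side, one way to live with out-of-order ranges is to keep a list of what has arrived and work out the gaps before each new request. This is only a sketch with inclusive byte offsets; missing_ranges is a hypothetical helper, not part of the proposal.

```python
def missing_ranges(total_size, received):
    """Return the inclusive (start, end) gaps not covered by the received ranges."""
    gaps = []
    cursor = 0   # first byte not yet known to be covered
    for start, end in sorted(received):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor < total_size:
        gaps.append((cursor, total_size - 1))
    return gaps


# Two chunks arrived out of order for a 533273-byte file.
print(missing_ranges(533273, [(73826, 83825), (0, 9999)]))
# [(10000, 73825), (83826, 533272)]
```

Each gap can then be turned into the next Range request, for example Range: bytes=10000-73825.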

Last edited by gnutellafan; July 1st, 2002 at 05:35 PM.