New filesharing model to outwit censors and improve reliability

This idea may have been given some airing and been dismissed already - if so, please tell me and forgive me.
Currently, gnutella as a P2P system consists of discrete, individual files, residing on dispersed machines.
How about a model in which register files are distributed around the system on a pseudo-arbitrary basis, corresponding to an equally pseudo-arbitrary distribution of the actual data in small chunks of, say, 200 KB?
The register file would consist of a list of Currently Known Good sources, on different systems, for each portion of the data.
Clients would INITIALLY request register files, and would subsequently request portions of the data from the different systems that the register file says should contain them.
The client would attempt to stick to the same machine for subsequent chunks of data, but would actively switch to the next Known Good source in the register on failure.
If a client finds a server machine to be unresponsive, it sets a status flag in the register file to "Failed Once".
If a subsequent client picks up that register file and finds that server machine to be unresponsive, it sets the status flag to "Failed Twice", etc, etc, and updates the timestamp.
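To make this concrete, here is a minimal sketch (in Python) of what a register file and its failure-flag bookkeeping might look like. All of the names here - Source, ChunkEntry, RegisterFile, mark_failure - are my own illustration, not an existing format, and the checksum field is an assumption, since a client would presumably need some way to verify the chunks it receives.

import time
from dataclasses import dataclass, field

# Status values a client can stamp onto a source entry.
STATUS_OK = "OK"
STATUS_FAILED_ONCE = "Failed Once"
STATUS_FAILED_TWICE = "Failed Twice"

@dataclass
class Source:
    """One Currently Known Good location for a chunk."""
    host: str                  # peer believed to hold the chunk
    status: str = STATUS_OK    # "OK", "Failed Once", "Failed Twice", ...
    last_checked: float = 0.0  # timestamp left by the last client that touched it

@dataclass
class ChunkEntry:
    """Register information for one ~200 KB chunk of the shared file."""
    chunk_index: int
    checksum: str              # assumption: lets a client verify the chunk it receives
    sources: list[Source] = field(default_factory=list)

@dataclass
class RegisterFile:
    """The register file itself: one ChunkEntry per chunk of the file."""
    file_name: str
    created: float             # lets clients pick the oldest register first
    chunks: list[ChunkEntry] = field(default_factory=list)

def mark_failure(source: Source) -> None:
    """Escalate the status flag when a client finds a source unresponsive."""
    if source.status == STATUS_OK:
        source.status = STATUS_FAILED_ONCE
    elif source.status == STATUS_FAILED_ONCE:
        source.status = STATUS_FAILED_TWICE
    # Further levels ("Failed Thrice", etc.) are omitted for brevity.
    source.last_checked = time.time()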
========================
On the client side, when you 'share' a file for the first time, you actually do more than that. Yes, you do share the whole file, but you also publish and broadcast the file in small chunks amongst a large number of peers.
(It would be a condition of participating in gnutella that you offer a certain amount of free space for this trusted process.)
This process builds the initial register file, which is then broadcast amongst another large number of peers.
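A rough sketch, assuming the structures from the sketch above, of what that initial 'share' step might do: split the file into ~200 KB chunks, push each chunk to a handful of pseudo-arbitrarily chosen peers, and record the recipients in a fresh register file. push_chunk and the choice of three copies per chunk are purely illustrative placeholders, not anything gnutella actually provides.

import hashlib
import random
import time

CHUNK_SIZE = 200 * 1024  # the ~200 KB chunk size suggested above

def push_chunk(host: str, file_name: str, index: int, data: bytes) -> None:
    """Placeholder: a real client would send the chunk to the peer here."""

def publish_file(path: str, known_peers: list[str], copies_per_chunk: int = 3) -> RegisterFile:
    """Share a file: keep the whole file locally, but also scatter its chunks."""
    register = RegisterFile(file_name=path, created=time.time(), chunks=[])
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(CHUNK_SIZE)
            if not data:
                break
            entry = ChunkEntry(chunk_index=index,
                               checksum=hashlib.sha1(data).hexdigest())
            # Pseudo-arbitrary choice of peers to hold this chunk.
            for host in random.sample(known_peers, min(copies_per_chunk, len(known_peers))):
                push_chunk(host, path, index, data)
                entry.sources.append(Source(host=host))
            register.chunks.append(entry)
            index += 1
    # The resulting register file is what then gets broadcast to another set of peers.
    return register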
===================================
So what happens in a search? For example, for 'Song1'? (A rough code sketch of the whole loop follows the list.)
1: A conventional search for Song1 retrieves the first THREE register files it finds for that file (if available).
2: Before starting anything, the requesting peer creates a temporary register file.
3: Starting with the OLDEST register file, the requesting peer starts building Song1 with the data referred to in the register file.
4: It sets the "Failed Once/Twice" flags, etc., if and when data is unavailable, and moves on to the later register file to see if that file contains more current information, before reverting to the older file and continuing. Equally, if it finds that a source marked 'Failed' actually works, it resets the flag to "OK".
5: If it exhausts step 4, it polls previously contacted peers for that chunk of data and updates the register file accordingly.
6: If it exhausts step 5, it begins a brand new nested gnutella search for more register files.
7: The result is (hopefully) a complete 'Song1' on the requesting peer.
8: Last, but not least, the requesting peer now has a new register file. This register file supersedes any failed register information, and is redistributed.
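A condensed sketch of steps 1 to 8, again using the structures from the first sketch. search_registers, fetch_chunk and broadcast_register are hypothetical stand-ins for the real network calls, and the preference for sticking to one machine, plus the polling and nested-search fallbacks of steps 5 and 6, are only indicated in comments; the point is just the order of operations - oldest register first, flag failures, fall back to newer registers, then redistribute the merged register.

import time

# Hypothetical network helpers; a real client would implement these over gnutella.
def search_registers(name: str, limit: int) -> list[RegisterFile]: ...
def fetch_chunk(host: str, name: str, index: int) -> bytes | None: ...
def broadcast_register(register: RegisterFile) -> None: ...

def rebuild_file(name: str) -> bytes | None:
    registers = search_registers(name, limit=3)            # step 1
    if not registers:
        return None
    merged = RegisterFile(file_name=name, created=time.time(), chunks=[])  # step 2
    registers.sort(key=lambda r: r.created)                 # oldest first (step 3)
    rebuilt: dict[int, bytes] = {}

    for register in registers:                              # later registers fill the gaps (step 4)
        for entry in register.chunks:
            if entry.chunk_index in rebuilt:
                continue
            for source in entry.sources:
                data = fetch_chunk(source.host, name, entry.chunk_index)
                if data is None:
                    mark_failure(source)                     # step 4: flag the failure
                    continue
                source.status = STATUS_OK                    # step 4: reset a stale 'Failed' flag
                source.last_checked = time.time()
                rebuilt[entry.chunk_index] = data
                merged.chunks.append(entry)
                break

    # Steps 5 and 6 (polling previous peers, then a fresh nested search for more
    # register files) would repeat the inner loop with new sources; omitted here.
    total = max(len(r.chunks) for r in registers)
    if len(rebuilt) < total:
        return None

    broadcast_register(merged)                               # step 8: redistribute the new register
    return b"".join(rebuilt[i] for i in sorted(rebuilt))     # step 7: the complete file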
===============================================
To stop the system from going mad and consuming space, if a client has not had a request for a certain chunk of data for, say, 3 months, it deletes the chunk and frees up the space. (This does not apply to the complete files that the client itself is broadcasting.)
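A small sketch of that space-reclamation rule, assuming each stored foreign chunk records when it was last requested; the record layout and the 90-day approximation of "3 months" are only illustrative.

import time

EXPIRY_SECONDS = 90 * 24 * 60 * 60  # roughly the suggested 3-month window

def expire_chunks(stored: dict[str, dict], now: float | None = None) -> None:
    """Drop foreign chunks nobody has asked for in ~3 months.

    `stored` maps a chunk key to a record like
    {"last_requested": <timestamp>, "own_file": <bool>, "data": <bytes>}.
    Chunks of files this client is itself sharing are never dropped.
    """
    now = time.time() if now is None else now
    for key in list(stored):
        record = stored[key]
        if record["own_file"]:
            continue
        if now - record["last_requested"] > EXPIRY_SECONDS:
            del stored[key]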
==============================================
I don't know what the legal implications of this process would be. This is effectively a kind of new file system. But it may reduce the single-point-of-failure (SPOF) aspect of so much in gnutella, distributing not only the load but also the responsibility for the availability of individual files.
There is also the responsibility aspect: many people wouldn't want to be an unwitting distributor of adult material, for example.