Gnutella Forums  


General Gnutella Development Discussion For general discussion about Gnutella development.


#1 (permalink)
November 28th, 2001
guido (White Magician)
Join Date: November 20th, 2001
Location: Hannover, Germany
Posts: 25
High quality host lists

At present, every Gnutella client keeps a huge list of hosts, most of which will refuse connection attempts. This is not very practical.
A better approach would be a small host list of about 20 hosts that are very likely to accept your connection request.

Here are my ideas about how to achieve this:

Every node will keep the following information about every node in its host list:
-IP
-port
-latitude
-longitude
-number of files reachable through this host at a TTL of 5
-number of incoming connections from Superpeers this node would have accepted at the time it provided this information
-the same for connections from clientpeers
-the average bandwidth caused by broadcast messages that a connecting Superpeer could expect when connecting to this node
-the time when this information was last updated by this node, in seconds since 01.01.1970, 00:00h, GMT
-this node's uptime in seconds

-a host evaluation number (HEN, explained below)

The difference between the HEN and the other fields is that all the other fields are provided directly by the node they describe, whereas the HEN is never passed over the network. It is calculated every time a node receives this information, based on the receiving node's individual needs. It is a kind of indicator of a node's usefulness.
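
To make the field list concrete, here is a minimal sketch of such a record in Python. The field names are my own, and the scoring function is purely hypothetical: the post does not define how the HEN is computed, only that each receiving node derives it locally from its own needs. The weights below are made up for illustration.

```python
from dataclasses import dataclass, asdict

@dataclass
class NodeDescriptor:
    ip: str
    port: int
    latitude: float
    longitude: float
    files_reachable_ttl5: int
    free_superpeer_slots: int      # incoming Superpeer connections it would accept
    free_client_slots: int         # incoming clientpeer connections it would accept
    avg_broadcast_bandwidth: float # unit is not specified in the post
    last_updated: int              # seconds since 1970-01-01 00:00 GMT
    uptime: int                    # seconds
    hen: float = 0.0               # local only, never sent over the network

def wire_fields(d: NodeDescriptor) -> dict:
    """Fields a node would advertise: everything except the local HEN."""
    fields = asdict(d)
    del fields["hen"]
    return fields

def example_hen(d: NodeDescriptor, want_superpeer_slot: bool, now: int) -> float:
    """Hypothetical HEN: free slots of the wanted kind, scaled by freshness and uptime."""
    slots = d.free_superpeer_slots if want_superpeer_slot else d.free_client_slots
    if slots <= 0:
        return 0.0
    age = max(0, now - d.last_updated)
    freshness = max(0.0, 1.0 - age / 3600.0)       # reaches 0 after one hour
    uptime_hours = min(d.uptime / 3600.0, 24.0)    # cap so old nodes don't dominate
    return slots * freshness * (1.0 + uptime_hours)
```

A node looking for a clientpeer slot and a node looking for a Superpeer slot would get different HENs for the same descriptor, which is the point of keeping the value local.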

A note about latitude/longitude:
These can be very rough figures. They should only be there to avoid too many Gnutella connections over the (very expensive) transcontinental WAN-lines. If some user lives in a country where the freedom of information (and thus the usage of a Gnutella node) is restricted, he may decide to fake these values. They aren't that important.
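
As a sketch of how a node might use these rough coordinates, the standard haversine formula gives an approximate great-circle distance between two hosts. The function name and the idea of feeding this distance into the HEN are my assumptions; the post only says the values should help avoid transcontinental links.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, good enough for rough figures

def great_circle_km(lat1, lon1, lat2, lon2):
    """Approximate distance in km between two points given in degrees (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp against tiny floating point overshoot before taking asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

A receiving node could, for example, lower the HEN of hosts that are many thousands of kilometres away. Since the values may be faked or imprecise, they should only ever be a weak hint.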

Now, every node maintains two host lists: a high quality host list with about 20 entries and a 'raw' host list with about 500 entries (and maybe also a 'classical' list of the kind common now).
The entries of both lists are ordered by their HEN.

When a node receives information about another node, it passes this information to Algorithm A, which may in turn pass it on to Algorithm B.

Algorithm A:
*Check whether this node's IP and port number already appear in the high quality host list
**If Yes, check whether any of the information fields have changed since the entry was stored
***If Yes, delete the old entry
***If No, stop here
*Check whether the time indicated by the 'last updated' value was more than an hour ago
**If Yes, stop here
*Check whether the number of accepted incoming connections, of the kind that the node running this algorithm would like to request, is 0
**If Yes, pass the received information to Algorithm B and stop here
*Calculate the HEN of this node
*Check whether the high quality host list is already full
**If No, add the received information as a new entry to the high quality host list, sort the high quality host list and stop here
*Check whether the HEN that was just calculated is higher than the lowest HEN in the high quality host list
**If Yes, delete the entry with the lowest HEN from the high quality host list, add the new data as a new entry and sort the high quality host list
**If No, pass the new data to Algorithm B
*Stop here
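
The steps of Algorithm A can be sketched in Python roughly as follows. This is only an illustration under my own assumptions: descriptors are plain dicts with "ip", "port" and "last_updated" keys, the list holds (HEN, descriptor) pairs sorted from highest to lowest HEN, and `hen_of` and `algorithm_b` are callbacks supplied by the caller. The return strings exist only to make the control flow visible.

```python
HQ_CAPACITY = 20          # "about 20 entries"
MAX_AGE_SECONDS = 3600    # "more than an hour ago"

def algorithm_a(hq_list, info, wanted_slots_key, hen_of, now, algorithm_b,
                capacity=HQ_CAPACITY):
    """Try to place `info` in the high quality list; fall through to Algorithm B."""
    key = (info["ip"], info["port"])
    for i, (_, old) in enumerate(hq_list):
        if (old["ip"], old["port"]) == key:
            if old == info:
                return "unchanged"       # nothing changed, stop here
            del hq_list[i]               # fields changed, drop the old entry
            break
    if now - info["last_updated"] > MAX_AGE_SECONDS:
        return "stale"
    if info[wanted_slots_key] == 0:
        algorithm_b(info, None)          # no slot of the kind we want
        return "passed_to_b"
    hen = hen_of(info)
    if len(hq_list) < capacity:
        hq_list.append((hen, info))
        hq_list.sort(key=lambda e: e[0], reverse=True)
        return "added"
    if hen > hq_list[-1][0]:             # last entry has the lowest HEN
        hq_list[-1] = (hen, info)
        hq_list.sort(key=lambda e: e[0], reverse=True)
        return "replaced"
    algorithm_b(info, hen)               # HEN already known, hand it over
    return "passed_to_b"
```

Note that, as in the steps above, an entry pushed out of the high quality list is simply deleted; it is not handed to Algorithm B.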

Algorithm B:
*Check whether the IP and port number already appear in the raw host list
**If Yes, check whether any of the information fields have changed since the entry was stored
***If Yes, delete the old entry
***If No, stop here
*Check whether this node's HEN has already been calculated by Algorithm A
**If No, do that now
*Check whether the raw host list is already full
**If No, add the received information as a new entry to the raw host list, sort the raw host list and stop here
*Check whether the new HEN is higher than the lowest HEN in the raw host list
**If Yes, delete the entry with the lowest HEN from the raw host list, add the new data as a new entry and sort the raw host list
*Stop here
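
Algorithm B can be sketched the same way. Again, this is an illustration under my own assumptions: descriptors are dicts with "ip" and "port" keys, the raw list holds (HEN, descriptor) pairs sorted from highest to lowest HEN, and `hen` is `None` when Algorithm A has not calculated it yet.

```python
RAW_CAPACITY = 500  # "about 500 entries"

def algorithm_b(raw_list, info, hen_of, hen=None, capacity=RAW_CAPACITY):
    """Try to place `info` in the raw host list, evicting the lowest HEN if full."""
    key = (info["ip"], info["port"])
    for i, (_, old) in enumerate(raw_list):
        if (old["ip"], old["port"]) == key:
            if old == info:
                return "unchanged"       # nothing changed, stop here
            del raw_list[i]              # fields changed, drop the old entry
            break
    if hen is None:
        hen = hen_of(info)               # Algorithm A did not calculate it
    if len(raw_list) < capacity:
        raw_list.append((hen, info))
        raw_list.sort(key=lambda e: e[0], reverse=True)
        return "added"
    if hen > raw_list[-1][0]:            # last entry has the lowest HEN
        raw_list[-1] = (hen, info)
        raw_list.sort(key=lambda e: e[0], reverse=True)
        return "replaced"
    return "dropped"
```

With lists this small, re-sorting after every insert is cheap. A sorted insert would work as well if the lists were ever made much larger.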

Then there is still the question of how the nodes will exchange this information.
I recently read a proposal on the GDF mailing list about how to make it possible to search the Gnutella network not only for filenames but also for file hashes. Their trick was to append additional information to search requests or replies, which hosts that don't know what it means would simply ignore.
The same trick could be used here. A Supernode which still has some incoming connection slots to offer might append its node descriptor field to 2 or 3 search requests/replies every 10 minutes or so.
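
The "2 or 3 every 10 minutes" rule could be sketched as a small counter like the one below. The class and method names are hypothetical, and how the descriptor is actually appended to a query or reply is left out, since that depends on the extension format.

```python
class DescriptorPiggyback:
    """Decide whether the next outgoing query/reply should carry our descriptor."""

    def __init__(self, max_per_window=3, window_seconds=600):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._window_start = None
        self._sent = 0

    def should_attach(self, now, free_slots):
        if free_slots <= 0:
            return False                 # nothing to offer, stay quiet
        if (self._window_start is None
                or now - self._window_start >= self.window_seconds):
            self._window_start = now     # start a new 10-minute window
            self._sent = 0
        if self._sent >= self.max_per_window:
            return False
        self._sent += 1
        return True
```

A node would call `should_attach(time.time(), free_slots)` just before sending a query or reply, and append its descriptor only when it returns `True`.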

Guido
#2 (permalink)
November 28th, 2001
Unregistered (Guest)
Re: High quality host lists

Quote:
At present, every Gnutella client keeps a huge list of hosts, most of which will refuse connection attempts. This is not very practical.

A better approach would be a small host list of about 20 hosts that are very likely to accept your connection request.
Why treat symptoms instead of the cause?
More superpeers must be available, and more servents must accept incoming connections... via NAT or a Gnutella proxy, see http://www.gnutellaforums.com/showth...&threadid=4163
#3 (permalink)
November 29th, 2001
guido (White Magician)
Join Date: November 20th, 2001
Location: Hannover, Germany
Posts: 25

Quote:
Why treat symptoms instead of the cause?
Because I believe these oversized host lists are one of the causes. Because I think that the share of hosts accepting incoming connections, out of all hosts on the net, won't change any time soon.
And because I think that, as this method distinguishes between connection requests from Clientnodes and those from other Supernodes, it could be a very useful addition to the Superpeer concept.
(Clientnodes don't need information about nodes which will only accept connections from other Supernodes.)

And about NAT/Gnutella proxy: If we really bring this Superpeer architecture to life, we'll be fine without the ability to connect to hosts behind a firewall or IP-masquerading router.

Guido
#4 (permalink)
November 29th, 2001
Moak (Guest)
Join Date: September 7th, 2001
Location: Europe
Posts: 816

Quote:
And about NAT/Gnutella proxy: If we really bring this Superpeer architecture to life, we'll be fine without the ability to connect to hosts behind a firewall or IP-masquerading router.
And how do you download? The problem is: without more servents that accept incoming connections, you have to rely on Pushes very often. But Pushes won't work between two firewalled hosts (or, better said, between two servents that cannot handle incoming connections). If you look at the current Limewire host statistics [1], only a few servents accept incoming connections, and IMHO the network's file transfers rely on those.

For sure it's good to find alternative ideas. I didn't get the 'useful addition to the Superpeer concept'; could you explain, please? How about the "superpong" mentioned in another thread, do you think that's a good idea?

[1] Rolling Host Count http://www.limewire.com/index.jsp/size
#5 (permalink)
November 30th, 2001
guido (White Magician)
Join Date: November 20th, 2001
Location: Hannover, Germany
Posts: 25

Quote:
Originally posted by Moak

And how do you download? The problem is: without more servents that accept incoming connections, you have to rely on Pushes very often. But Pushes won't work between two firewalled hosts (or, better said, between two servents that cannot handle incoming connections). If you look at the current Limewire host statistics [1], only a few servents accept incoming connections, and IMHO the network's file transfers rely on those.

For sure it's good to find alternative ideas. I didn't get the 'useful addition to the Superpeer concept'; could you explain, please? How about the "superpong" mentioned in another thread, do you think that's a good idea?

[1] Rolling Host Count http://www.limewire.com/index.jsp/size
Well, okay, maybe that last point was not really justified. The problem of downloading something from a firewalled host will still be there.
This whole thing raises one question: What the heck is meant by 'incoming connections'? Do they (Limewire) mean attempted HTTP connections (download requests) or do they mean attempted Gnutella connections?
But no matter which they mean, I still think that the big difference between the number of unique hosts and the number of hosts accepting incoming connections in their host count is not mainly because so many hosts are firewalled, but rather because every node has a maximum number of connections (for Gnutella as well as for HTTP connections), and most of the time these are simply full.

About 'useful addition to the Supernode concept':
If you tried to connect to Gnutella as a Clientnode, you'd have a large list of IPs which you try to connect to one after the other, only to find that behind each IP/port is one of the following:
a: No response because this node has already gone offline
b: No response because this node is an older Gnutella servent which doesn't even know about the existence of the Clientnode->Supernode protocol (btw, does such a thing already exist?)
c: A Clientnode
d: A Supernode which has already reached its maximum number of incoming Clientnode connections

or, finally

e: A Supernode which will happily accept your incoming connection request

Finding a host of type e might take up to half an hour, especially in the early days, if you only have a raw host list with no further information about the indexed hosts. If, however, you have more descriptive information, you will find a suitable host a lot sooner. The whole thing could already be useful in the Gnutella net as it exists now (without Supernodes/Clientnodes).
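
Some rough arithmetic behind that estimate, as a sketch. It assumes hosts are tried in random order and that each attempt independently hits a type e host with probability p, so the expected number of attempts is 1/p. The example figures (1 in 100 hosts, 15 seconds per attempt) are made up for illustration and are not measurements.

```python
def expected_attempts(fraction_accepting):
    """Mean number of tries until the first success (geometric distribution)."""
    if not 0.0 < fraction_accepting <= 1.0:
        raise ValueError("fraction_accepting must be in (0, 1]")
    return 1.0 / fraction_accepting

def expected_seconds(fraction_accepting, seconds_per_attempt):
    """Rough expected time until a host accepts, treating every try as equally slow."""
    return expected_attempts(fraction_accepting) * seconds_per_attempt

# Illustrative only: if 1 host in 100 is of type e and a try costs about 15 s,
# the expected search takes about 1500 s, i.e. 25 minutes.
print(round(expected_seconds(0.01, 15) / 60), "minutes")
```

If the descriptive information raises the hit rate to, say, 1 in 2, the same calculation gives about two attempts, which is why a short, well-chosen list helps so much.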

About Superpong:
There were a lot of misunderstandings in that thread. When I first started it, I thought the main purpose of ping-pong was to give the nodes a rough figure for the number and size of the available files, with the acquisition of new IPs being only a side effect, so I posted a method which would serve this purpose without providing the nodes with even a single new IP.
However, the main difference between the superpong idea, which emerged from that other discussion, and the idea of the high quality host lists is this: the host lists idea focuses mainly on which information should be available about individual hosts and how it should be used, while superpong focuses on how this information should be spread across the network.
My solution to the spreading problem was to append this information to some queries or query replies every now and then, as described in the 'HUGE' RFC by the GDF. I think this is a better idea, because we wouldn't have to introduce a new message type for it.

Guido
#6 (permalink)
November 30th, 2001
Moak (Guest)
Join Date: September 7th, 2001
Location: Europe
Posts: 816

Quote:
This whole thing raises one question: What the heck is meant by 'incoming connections'? [...] I still think that the big difference between the number of unique hosts and the number of hosts accepting incoming connections in their host count is not mainly because so many hosts are firewalled, but rather because every node has a maximum number of connections
Hi Guido, 'incoming connection' means that your host is able to handle incoming TCP connections... it has nothing to do with a peer's busy state or number of connections. So the Limewire statistics give a true reflection of the small percentage of fully operative peers.

There are three possible reasons why a servent cannot accept incoming connections: a) it's blocked by a firewall or packet filter, b) it's running on a LAN behind a NAT router or masquerading (a problem due to the nature of TCP, ask me if you need more details), c) your client has a (senseless) switch to disallow incoming connections.

About a: you can add forwarding rules to your firewall (one port must be assigned to each and every servent running inside the LAN), b: you can use a NAT module or proxy (neither is available yet), c: just switch it off and then check for a+b.

Together with Dun3, I am preparing a document on how to use Gnutella behind a firewall, which describes all the technical details (TCP/IP, Gnutella protocol, firewalls, NAT, proxies)... sorry, it's not finished yet.

I will post more later, have to hurry and go shopping......

Last edited by Moak; November 30th, 2001 at 12:07 PM.
#7 (permalink)
November 30th, 2001
Morgwen (lazy dragon - retired mod)
Join Date: October 14th, 2001
Location: Germany
Posts: 2,927

Hi Guido!

If you have time...

meet us in the IRC!

http://www.gnutellaforums.com/showth...&threadid=5917

Morgwen
#8 (permalink)
November 30th, 2001
Moak (Guest)
Join Date: September 7th, 2001
Location: Europe
Posts: 816

Hi again, Guido.
It's good to have some more people around who are interested in development and spend time on new ideas. =)

Quote:
No response because this node is an older Gnutella servent which doesn't even know about the existence of the Clientnode->Supernode protocol (btw, does such a thing already exist?)
Not yet. To integrate superpeers, IMHO we only need:
* a kind of connect handshaking (fall back to v0.4 if the other side is old and doesn't understand the new version)
* and especially a flag (e.g. in xping/xpong) saying whether a node is a servent or a superpeer, which is very important for the host caches' connect scheme. That way the host lists gain more "quality": they provide IPs plus whether the peer is a servent or a superpeer (plus more horizon information when using the full XPONG proposal).

With these small changes the network can still connect as before (providing very fast startup when using local host lists together with host caches), including full functionality for old servents which only understand v0.4.
However, it might also be important to make the best use of superpeers and improve horizons... I like the XPING/SUPERPONG idea. When connecting to an old v0.4 peer we can still use the old PING/PONG scheme; when connecting to a new one we could use the new descriptors. I expect that we will need more additions in the future. What about chat, swarming, specialized horizons, anti-leeching mechanisms, content provider anonymity... I think many new ideas are important for Gnutella's survival.
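
A minimal sketch of the fallback idea, seen from the side that accepts a connection. The v0.4 connect string is "GNUTELLA CONNECT/0.4" followed by two newlines. Everything else here is an assumption for illustration: the function name, and (0, 6) as a stand-in for whatever the new version would be called.

```python
import re

# Matches the start of a connect string such as b"GNUTELLA CONNECT/0.4\n\n".
_CONNECT_RE = re.compile(rb"^GNUTELLA CONNECT/(\d+)\.(\d+)")

OLD_VERSION = (0, 4)
NEW_VERSION = (0, 6)   # hypothetical stand-in for the new protocol version

def choose_protocol(connect_bytes):
    """Return the version to speak with this peer, or None if it is not Gnutella."""
    m = _CONNECT_RE.match(connect_bytes)
    if m is None:
        return None
    peer = (int(m.group(1)), int(m.group(2)))
    # A peer that announces the new version (or later) gets the new descriptors;
    # anything older falls back to plain v0.4 with the old PING/PONG scheme.
    return NEW_VERSION if peer >= NEW_VERSION else OLD_VERSION
```

With something like this, an old v0.4 servent keeps working unchanged, and only peers that announce the newer version ever see the new descriptors.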

Quote:
append this [new] information to some queries or query replies every now and then, as described in the 'HUGE' RFC by the GDF. I think this is a better idea, because we wouldn't have to introduce a new message type for it.
Old clients wouldn't understand this new information either, so why not encapsulate the new type of data in a new descriptor? That has the advantage that the protocol remains logical and strict. 'HUGE' is about file metadata and hashes (which doesn't mean it couldn't be used freestyle).
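
To illustrate what a new descriptor would mean on the wire: every v0.4 descriptor starts with a 23-byte header (16-byte descriptor ID, payload descriptor, TTL, hops, little-endian payload length). The sketch below packs and parses that header. The type code 0x90 for host information is purely hypothetical, as are the function names.

```python
import struct

# Descriptor ID (16 bytes), payload descriptor, TTL, hops, payload length.
HEADER = struct.Struct("<16sBBBI")   # 23 bytes, little-endian, no padding

KNOWN_TYPES = {0x00: "Ping", 0x01: "Pong", 0x40: "Push",
               0x80: "Query", 0x81: "QueryHit"}
HOST_INFO = 0x90                     # hypothetical code for a new descriptor

def build_descriptor(descriptor_id, payload_type, ttl, hops, payload):
    """Prefix `payload` with a v0.4-style descriptor header."""
    return HEADER.pack(descriptor_id, payload_type, ttl, hops, len(payload)) + payload

def parse_descriptor(data):
    """Split one descriptor into (type name or None, ttl, hops, payload)."""
    _, ptype, ttl, hops, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    return KNOWN_TYPES.get(ptype), ttl, hops, payload
```

A client that only knows the five original types gets `None` for the new code. Because the header carries the payload length, it can at least tell where the unknown descriptor ends. Whether it then skips it or drops the connection is up to the client.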

Greets, Moak

Last edited by Moak; November 30th, 2001 at 02:19 PM.
All times are GMT -7.