Gnutella Forums  

Go Back   Gnutella Forums > Current Gnutella Client Forums > Phex (Cross-platform) > Development & Coding > Development Open Discussion
Register FAQ The Twelve Commandments Members List Calendar Arcade Find the Best VPN Today's Posts

Development Open Discussion Anything else about the Phex development


Reply
 
LinkBack Thread Tools Display Modes
  #11 (permalink)  
Old October 14th, 2008
arne_bab's Avatar
Draketo, small dragon.
 
Join Date: May 31st, 2002
Location: Heidelberg, Germany
Posts: 1,881
arne_bab is a great assister to others; your light through the dark tunnel
Default

Directed as in "don't just disconnect the one where we have the highest overload, but disconnect to move into a region of the network where we won't get overloaded".

Currently if all outgoings are overloaded , they will get disconnected, which can be contraproductive, if the reason for them being overloaded is a greedy incoming client which we don't detect as greedy, because our inbound bandwidth is far higher than our outbound bandwidth (and because we replicate its requests, so each inbound requests creates up to 31 outbound requests).
__________________

-> put this banner into your own signature! <-
--
Erst im Spiel lebt der Mensch.
Nur ludantaj homoj vivas.
GnuFU.net - Gnutella For Users
Draketo.de - Shortstories, Poems, Music and strange Ideas.
Reply With Quote
  #12 (permalink)  
Old October 14th, 2008
Phex Developer
 
Join Date: May 9th, 2001
Location: Stuttgart, Germany
Posts: 988
GregorK is flying high
Default

That means you like to disconnect connections which are not overloaded because other are overloaded and you are unsure if the not overloaded connections cause that the other connections are overloaded because they could generate to much traffic?
__________________
Reply With Quote
  #13 (permalink)  
Old October 14th, 2008
arne_bab's Avatar
Draketo, small dragon.
 
Join Date: May 31st, 2002
Location: Heidelberg, Germany
Posts: 1,881
arne_bab is a great assister to others; your light through the dark tunnel
Default

No.

It means, I want to disconnect those not overloaded hosts if I am fairly sure that they cause the others to be overloaded.

I can't be perfectly sure (I could be generating messages myself), but if

* most of the outgoing are overloaded and
* one of the incoming ones creates much more traffic than the others

I can be fairly sure that the one incoming connection causes the outgoing ones to be overloaded.

Note on the words: With "incoming" I mean "incoming stream of data" and with "outgoing" "outgoing stream of data". Since all connections are both-way I don't differenciate between connections initiated by me or by others.
__________________

-> put this banner into your own signature! <-
--
Erst im Spiel lebt der Mensch.
Nur ludantaj homoj vivas.
GnuFU.net - Gnutella For Users
Draketo.de - Shortstories, Poems, Music and strange Ideas.
Reply With Quote
  #14 (permalink)  
Old October 15th, 2008
Phex Developer
 
Join Date: May 9th, 2001
Location: Stuttgart, Germany
Posts: 988
GregorK is flying high
Default

Sure detecting spamming connections is always desirable. I guess the challenge would be to decide what is the better alternative. Disconnect the faster better connected host that send to much data (no spam) for others but maybe not for you or to disconnect the slower host on the edge of the network.

Its always good to have a mix of almost equally weaker and stronger connections around you, to achieve this, looking at message drops because of congestions in a certain time frame might not be such a bad solution and likely a much more easier solution to implement then a complicated algorithm. Why not check out first how the time frame approach will improve things already?
__________________
Reply With Quote
  #15 (permalink)  
Old October 16th, 2008
arne_bab's Avatar
Draketo, small dragon.
 
Join Date: May 31st, 2002
Location: Heidelberg, Germany
Posts: 1,881
arne_bab is a great assister to others; your light through the dark tunnel
Default

It would in every case be a good first step.
__________________

-> put this banner into your own signature! <-
--
Erst im Spiel lebt der Mensch.
Nur ludantaj homoj vivas.
GnuFU.net - Gnutella For Users
Draketo.de - Shortstories, Poems, Music and strange Ideas.
Reply With Quote
  #16 (permalink)  
Old October 17th, 2008
arne_bab's Avatar
Draketo, small dragon.
 
Join Date: May 31st, 2002
Location: Heidelberg, Germany
Posts: 1,881
arne_bab is a great assister to others; your light through the dark tunnel
Default

Here's a simple way to get a window (borrowed from TCP):

When receiving

new_connection_quality = old_connection_quality*X + current_measurement * (1-X)

current_measurement is 1 (packet successfully transfered) or 0 (packet dropped).

TCP uses 0.8 for X.

The connection_quality gets initialized with 1 (good).

Since a disconnect is stronger than the "reduce" of TCP, I'd suggest using 0.9 for X and disconnecting when the connection quality drops below 0.3 (I added simulation code to SVN; needs pyxplot so i attached an image of the algo in operation).

These settings disconnect people who have an average drop rate of 30% when they have a burst of drops, anything below is fine. 60% drop rate gets dropped quickly.

The attached images are

- 30% drop rate with X = 0.9
- 60% drop rate with X = 0.9
- 10% drop rate with X = 0.9
Attached Thumbnails
[Concept] Disconnect Policies-plot-0.3-0.9.png   [Concept] Disconnect Policies-plot-0.6-0.9.png   [Concept] Disconnect Policies-plot-0.1-0.9.png  
__________________

-> put this banner into your own signature! <-
--
Erst im Spiel lebt der Mensch.
Nur ludantaj homoj vivas.
GnuFU.net - Gnutella For Users
Draketo.de - Shortstories, Poems, Music and strange Ideas.
Reply With Quote
  #17 (permalink)  
Old October 18th, 2008
Phex Developer
 
Join Date: May 9th, 2001
Location: Stuttgart, Germany
Posts: 988
GregorK is flying high
Default

I did a test implementation. The following problems exists:
We don't count successful messages. This is hard since a message can be dropped anywhere else much later in the call chain. We count the total messages and the dropped. The good messages would be =total-dropped.

We drop received and sent queue messages. Both works quite differently. Received messages dropped are mostly because of parsing problems, spam or unsupported messages. They come once in a while between good messages.
Sent messages are most of the time dropped in large chunks. This is because of the send queue and message priority. First important messages are transferred, low priority messages are queued for the next rotation. This continues until the live time of queued messages is over, then they are dropped.
This might cause large chunks of messages (lets say 50-200) be dropped at once. And will immediately let your algorithm drop down below the threshold, even though its only a short few seconds lasting low bandwidth period.
__________________
Reply With Quote
  #18 (permalink)  
Old October 20th, 2008
arne_bab's Avatar
Draketo, small dragon.
 
Join Date: May 31st, 2002
Location: Heidelberg, Germany
Posts: 1,881
arne_bab is a great assister to others; your light through the dark tunnel
Default

The code could be adapted to use chunks which begin and end at places, where a dropped message gets followed by a sent packet.

For received messages it would work as it is.

For sent messages, we can just make X larger (for example 0.95).

The math for calculation of a fitting X is pretty simple:

- We take the base value (for example 90% good packets - 10% drop -> base connection quality: 0.9)
- We want this quality to stay above 0.3 as long as the size of the chunk of dropped messages doesn't exceed a certain threshold (for example 200).

Math: base quality * X**(max length of drop chunk) > 0.3

Example:
- base quality: 0.9
- X: 0.995
- max drop chunk length: 200
0.9 * 0.995**(200) = 0.33026203955355032

This will make the outgoing window larger, but it has the advantage of still being extremly simple, and of suplpying a connection quality which can mean something to users.

As you can see in the attached plots (which already use "sent" = "success"), it takes about 300 to 600 packets for the algorithm to stabilize with X = 0.995

With chunks it will take about the same time, but it will reach that lower quality at the end of a drop chunk and fluctuate a bit more.

Using "sent packets" and "dropped" instead "successful" and "dropped" should be no problem. It just increases the connection quality given back by the algorithm, so we can just increase the bar of what is "good".

"Sent" is just taken as "successful", and we have a bit more "successful" packages in the algorithm (success + drop rate) than in the ideal algorithm, but that shouldn't do harm. The mean quality now can't go below 0.5, but the mean quality isn't the parameter for disconnect - the parameter is how far the drop chunk will decrease the connection quallity. Maybe it would also be possible, though, to take the mean of the quality by storing the extrema of the quality and using the mean of those (with expiration time).

pseudocode:
Code:
if current_quality > previous_quality: 
  set the latest maximum to current_quality. 
  if the latest minimum is not None: // if the latest minimum has been set in the last step
    add a minimum with the value None // we add a new one without value - we set the minimum to change to the next one in line. 
else if current_quality < previous quality: 
  set the latest minimum to current_quality. 
  if the latest maximum is not None: // if the latest maximum has been set in the last step
    add a maximum with the value None // we add a new one without value
This gets all local extrema which can be used for the mean calculation. With drop chunks the number of extrema shouldn't rise too fast (the list of extrema could also just have a length limit - when it is reached, the new maxima are added at the beginning again, overwriting the old ones).


The advantage of this algorithm is that it is very easy to code (and should be very fast).

The "base quality" of a 30% drop connection which drops in chunks (see 0.9*0.995**200) should be well above the 0.7 which we get from a randomly dropping connection, as it will have long phases in which no packets get dropped.

Example:
Code:
>>> a = 0.9
>>> for i in range(200):
...  a *= 0.995
...
>>> a
0.33026203955355027
>>> for i in range(600):
...  a = a*0.995 + 0.005
...
>>> a
0.96690568756215878
>>> for i in range(200):
...  a *= 0.995
...
>>> a
0.35481360492245118
>>> for i in range(600):
...  a = a*0.995 + 0.005
...
>>> a
0.96811887424582133
>>> (a + 0.35481360492245118 ) / 2.0
0.6614662395841362
(that's a quality 0.5 connection: 200 drops, 400 successes, 600 total packets - 1/3rd of the packets get dropped -> 0.66).
Attached Thumbnails
[Concept] Disconnect Policies-plot-0.3-0.995.png   [Concept] Disconnect Policies-plot-0.1-0.995.png   [Concept] Disconnect Policies-plot-0.6-0.995.png  
__________________

-> put this banner into your own signature! <-
--
Erst im Spiel lebt der Mensch.
Nur ludantaj homoj vivas.
GnuFU.net - Gnutella For Users
Draketo.de - Shortstories, Poems, Music and strange Ideas.

Last edited by arne_bab; October 20th, 2008 at 01:20 AM.
Reply With Quote
  #19 (permalink)  
Old October 20th, 2008
arne_bab's Avatar
Draketo, small dragon.
 
Join Date: May 31st, 2002
Location: Heidelberg, Germany
Posts: 1,881
arne_bab is a great assister to others; your light through the dark tunnel
Default

I thought some more about the above, and I think it grows too complex.

I think it would be more useful to just take (packets - dropped / packets) as quality in a time window equal to the lifetime of a low-priority query.

To not just disconnect the node for one bad value, using this value as base for the TCP algorithm with X=0.9 should do the job (which will also give just connecting nodes a grace period).

Together these keep it simple and don't try to fix one algorithm with esoteric adjustments

The smallest single value for the calculation of the quality then isn't one packet but the mean successful/packets ratio the window of one liftetime (successful = packets - dropped).

For incoming packets the simple algorithm from TCP can be used with X=0.9 which will lead to blocking incoming abuse far faster than outgoing drops. That fits the reasoning that incoming abuse is worse than a too weak outgoing link, because incoming abuse gets replicated.
__________________

-> put this banner into your own signature! <-
--
Erst im Spiel lebt der Mensch.
Nur ludantaj homoj vivas.
GnuFU.net - Gnutella For Users
Draketo.de - Shortstories, Poems, Music and strange Ideas.
Reply With Quote
  #20 (permalink)  
Old October 20th, 2008
Phex Developer
 
Join Date: May 9th, 2001
Location: Stuttgart, Germany
Posts: 988
GregorK is flying high
Default

Why not just count total count and drops for a 10 minute window and then check if values are to high, reset counts and wait again 10 minutes until next recheck.
__________________
Reply With Quote
Reply


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On



All times are GMT -7. The time now is 10:02 AM.


Powered by vBulletin® Version 3.8.7
Copyright ©2000 - 2024, vBulletin Solutions, Inc.
SEO by vBSEO 3.6.0 ©2011, Crawlability, Inc.

Copyright © 2020 Gnutella Forums.
All Rights Reserved.