Quote:
Originally posted by sberlin
Just FYI, the UDP attempts before GWebCaches aren't on the mainline yet. Once that's all completed we'll merge it in & likely put out a beta. Since GWebCaches are so overloaded right now, we don't want to put out another release till we can make sure they won't get further overloaded (and also make sure that clients can connect faster initially).
Thanks for this marvelous change. LimeWire alone (across all its versions) accounts for more than 60% of all traffic to GWebCaches, so it would be wonderful to reduce this traffic with alternative host discovery methods such as the new UDP caches.
I have seen the traffic on my cache multiplied by 5 in the last few weeks, so I have rewritten major parts of its code for my new version 1.3, tripling its performance.
This version also checks URLs more rigorously and now contains a very basic active checker for the submitted IP:port of hosts, plus some other checks for known IP addresses or port numbers that cause problems.
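To illustrate the kind of checks meant here, below is a minimal sketch in Python of the static part only (syntax, port range, non-routable addresses). It is not the cache's actual code: the function name `parse_host` and the `BLOCKED_PORTS` set are assumptions made up for this example, and the active check (actually contacting the host) is left out.

```python
import ipaddress

# Hypothetical blocklist: the post does not say which ports are rejected.
BLOCKED_PORTS = {80}


def parse_host(entry):
    """Return (ip, port) if entry looks like a usable public IPv4 host, else None."""
    ip_text, sep, port_text = entry.rpartition(":")
    # Require an explicit "ip:port" form with a purely numeric port.
    if not sep or not (port_text.isascii() and port_text.isdigit()):
        return None
    port = int(port_text)
    if not 1 <= port <= 65535 or port in BLOCKED_PORTS:
        return None
    try:
        ip = ipaddress.IPv4Address(ip_text)
    except ipaddress.AddressValueError:
        return None
    # Reject private, loopback, link-local and other non-global addresses.
    if not ip.is_global:
        return None
    return str(ip), port
```

A host that passes these checks would still need the active check before being stored, since a well-formed address says nothing about whether a servent is really listening there.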
Also, for LimeWire's active search of hosts by locale (which may require many connections for some locales before a matching one is found), my cache now starts filtering the host addresses it returns, so that at least a minimum number of the returned hosts match the querying country, region or continent. However, my cache will still return a significant number of hosts from any region. I may enhance the filter to also take into account linguistic data about each country or territory, but this requires some tuning of threshold parameters.
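One possible shape for such a filter is sketched below in Python. It is only an illustration under assumptions: the tuple layout `(address, country, region)`, the function name `select_hosts` and the `min_local` threshold are invented for the example, and the real threshold values are exactly the tuning parameters mentioned above.

```python
import random


def select_hosts(hosts, country, region, total=50, min_local=10, rng=random):
    """Pick up to `total` hosts, reserving up to `min_local` slots for local ones.

    hosts: list of (address, country_code, region_code) tuples.
    """
    same_country = [h for h in hosts if h[1] == country]
    same_region = [h for h in hosts if h[1] != country and h[2] == region]
    others = [h for h in hosts if h[1] != country and h[2] != region]

    # Hosts in the querying country come first, then hosts in the same region.
    local = same_country + same_region
    picked = local[:min(min_local, total)]

    # Fill the remaining slots at random from everything not yet picked,
    # so the answer still contains hosts from any region.
    rest = local[len(picked):] + others
    rng.shuffle(rest)
    picked += rest[:total - len(picked)]
    return picked
```

With this layout a client in a poorly represented country still gets a full answer, because the reserved slots fall back to the wider region and then to the rest of the world.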
You can see how I detect countries simply by looking at the web pages of my cache on rodage.net; the detection is based on the statistics files published daily by each of the 4 RIRs, which I preprocess to allow a fast lookup of the querying IP. For now I update the preprocessed IP-to-country maps manually, every few days, but they are generated automatically by a script that I will enhance to do everything in one pass. I will soon implement a more automated updater that will refresh these maps at least once a day. Today, 176 countries or territories seem active on Gnutella (some territories don't have IP delegations, and use assignments from the main country to which they belong).
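As an illustration of this preprocessing, here is a minimal Python sketch. It assumes the pipe-separated record layout of the RIR delegation statistics files (`registry|cc|type|start|value|date|status`, where `value` is the number of addresses for IPv4 records) and handles IPv4 only. The function names `build_index` and `lookup` are invented for this example and are not taken from the cache.

```python
import bisect
import ipaddress


def build_index(lines):
    """Turn RIR statistics records into a sorted list of (start, end, country)."""
    ranges = []
    for line in lines:
        fields = line.strip().split("|")
        # Skip comments, the version header, summary lines and non-IPv4 records.
        if len(fields) < 7 or fields[2] != "ipv4" or fields[1] in ("", "*"):
            continue
        start = int(ipaddress.IPv4Address(fields[3]))
        count = int(fields[4])  # number of addresses in the block
        ranges.append((start, start + count - 1, fields[1]))
    ranges.sort()
    starts = [r[0] for r in ranges]
    return starts, ranges


def lookup(index, ip, default="__"):
    """Return the country code for ip, or the default code if unregistered."""
    starts, ranges = index
    n = int(ipaddress.IPv4Address(ip))
    # Find the last block starting at or before n, then check it contains n.
    i = bisect.bisect_right(starts, n) - 1
    if i >= 0 and ranges[i][0] <= n <= ranges[i][1]:
        return ranges[i][2]
    return default
```

Building the index once and answering each query with a binary search keeps the per-request cost small, which matters when the lookup runs on every access to the cache.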
Note that some hosts belong to the 'EU' region, which is any country in the Europe/Africa region managed by RIPE NCC, and some others belong to the 'AP' region, which is Asia/Pacific managed by APNIC. Finally, very few hosts use IP addresses that are still not registered for use by any RIR; I assign these to a special region code '__' representing the Earth. These most often come from new IP blocks recently delegated by RIRs to LIRs or ISPs, for which the LIR or ISP has not yet published usage information. With the new RIR policies on IP assignments, this should happen much less often than before, and in the near future such IPs will not be usable at all on the Internet as long as they have not been registered by a RIR. Some of these addresses also come from changes of delegation between RIRs, notably transfers of legacy European blocks from ARIN to RIPE NCC.
My cache also now keeps 1000 hosts instead of 400, and returns 50 addresses instead of 30, to reduce the number of requests and the number of occurrences of repeated requests. About 1000 hosts now send valid updates every 20 minutes or so. I have also slightly reduced the minimum delay allowed between repeated accesses, from 30 minutes to 20 minutes.
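The limits above could be wired together roughly as in the following Python sketch. Only the three numbers come from this post; the class name `HostCache`, its methods and the "keep the most recently updated hosts" eviction policy are assumptions made for the example.

```python
import time
from collections import OrderedDict

MAX_HOSTS = 1000        # hosts kept in the cache
RETURNED_HOSTS = 50     # addresses returned per request
MIN_DELAY = 20 * 60     # seconds between accepted accesses from one client


class HostCache:
    def __init__(self, clock=time.time):
        self.hosts = OrderedDict()   # address -> time of last update
        self.last_access = {}        # client IP -> time of last accepted access
        self.clock = clock

    def allow(self, client_ip):
        """Accept the access unless the client came back too early."""
        now = self.clock()
        last = self.last_access.get(client_ip)
        if last is not None and now - last < MIN_DELAY:
            return False
        self.last_access[client_ip] = now
        return True

    def update(self, address):
        """Store or refresh a host, dropping the stalest ones when full."""
        self.hosts.pop(address, None)
        self.hosts[address] = self.clock()
        while len(self.hosts) > MAX_HOSTS:
            self.hosts.popitem(last=False)

    def get(self):
        """Return the most recently updated hosts, oldest of them first."""
        return list(self.hosts)[-RETURNED_HOSTS:]
```

A real cache would also have to expire old entries in `last_access`, otherwise that table grows without bound.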
The next major version, 1.4, of my cache is coming with faster operations but more complex code. For now, version 1.3 of GWebCleaner contains only the most important and basic changes to speed it up, but its code is becoming tricky. I expect to be able to handle more than 5000 accesses per hour during peak hours (for now I can only manage about 3000 during peak hours, and 16000 during off-peak hours).