|
Register | FAQ | The Twelve Commandments | Members List | Calendar | Arcade | Find the Best VPN | Today's Posts | Search |
New Feature Requests Your idea for a cool new feature. Or, a LimeWire annoyance that has to get changed. |
Welcome To Gnutella Forums You are currently viewing our boards as a guest which gives you limited access to view most discussions and access our other features. By joining our free community you will have access to post topics, communicate privately with other members (PM), respond to polls, upload content, fun aspects such as the image caption contest and play in the arcade, and access many other special features after your registration and email confirmation. Registration is fast, simple and absolutely free so please, join our community today! (click here) (Note: we use Yandex mail server so make sure yandex is not on your email filter or blocklist.) Confirmation emails might be found in your Junk folder, especially for Yahoo or GMail. If you have any problems with the Gnutella Forum registration process or your Gnutella Forum account login, please contact us (this is not for program use questions.) Your email address must be legitimate and verified before becoming a full member of the forums. Please be sure to disable any spam filters you may have for our website, so that email messages can reach you. Note: Any other issue with registration, etc., send a Personal Message (PM) to one of the active Administrators: Lord of the Rings or Birdy. Once registered but before posting, members MUST READ the FORUM RULES (click here) and members should include System details - help us to help you (click on blue link) in their posts if their problem relates to using the program. Whilst forum helpers are happy to help where they can, without these system details your post might be ignored. And wise to read How to create a New Thread Thank you If you are a Spammer click here. This is not a business advertising forum, all member profiles with business advertising will be banned, all their posts removed. Spamming is illegal in many countries of the world. Guests and search engines cannot view member profiles. Deutsch? Español? Français? Nederlands? Hilfe in Deutsch, Ayuda en español, Aide en français et LimeWire en français, Hulp in het Nederlands Forum Rules Support Forums Before you post to one of the specific Client Help and Support Conferences in Gnutella Client Forums please look through other threads and Stickies that may answer your questions. Most problems are not new. The Search function is most useful. Also the red Stickies have answers to the most commonly asked questions. (over 90 percent). If your problem is not resolved by a search of the forums, please take the next step and post in the appropriate forum. There are many members who will be glad to help. If you are new to the world of file sharing please do not be shy! Everyone was ‘new’ when they first started. When posting, please include details for: Your Operating System ....... Your version of your Gnutella Client (* this is important for helping solve problems) ....... Your Internet connection (56K, Cable, DSL) ....... The exact error message, if one pops up Any other relevant information that you think may help ....... Try to make your post descriptive, specific, and clear so members can quickly and efficiently help you. To aid helpers in solving download/upload problems, LimeWire and Frostwire users must specify whether they are downloading a torrent file or a file from the Gnutella network. Members need to supply these details >>> System details - help us to help you (click on blue link) Moderators There are senior members on the forums who serve as Moderators. These volunteers keep the board organized and moving. Moderators are authorized to: (in order of increasing severity) Move posts to the correct forums. Many times, members post in the wrong forum. These off-topic posts may impede the normal operation of the forum. Edit posts. Moderators will edit posts that are offensive or break any of the House Rules. Delete posts. Posts that cannot be edited to comply with the House Rules will be deleted. Restrict members. This is one of the last punishments before a member is banned. Restrictions may include placing all new posts in a moderation queue or temporarily banning the offender. Ban members. The most severe punishment. Three or more moderators or administrators must agree to the ban for this action to occur. Banning is reserved for very severe offenses and members who, after many warnings, fail to comply with the House Rules. Banning is permanent. Bans cannot be removed by the moderators and probably won't be removed by the administration. The Rules 1. Warez, copyright violation, or any other illegal activity may NOT be linked or expressed in any form. Topics discussing techniques for violating these laws and messages containing locations of web sites or other servers hosting illegal content will be silently removed. Multiple offenses will result in consequences. File names are not required to discuss your issues. If filenames are copyright then do not belong on these forums & will be edited out or post removed. Picture sample attachments in posts must not include copyright infringement. 2. Spamming and excessive advertising will not be tolerated. Commercial advertising is not allowed in any form, including using in signatures. 3. There will be no excessive use of profanity in any forum. 4. There will be no racial, ethnic, or gender based insults, or any other personal attacks. 5. Pictures may be attached to posts and signatures if they are not sexually explicit or offensive. Picture sample attachments in posts must not include copyright infringement. 6. Remember to post in the correct forum. Take your time to look at other threads and see where your post will go. If your post is placed in the wrong forum it will be moved by a moderator. There are specific Gnutella Client sections for LimeWire, Phex, FrostWire, BearShare, Gnucleus, Morpheus, and many more. Please choose the correct section for your problem. 7. If you see a post in the wrong forum or in violation of the House Rules, please contact a moderator via Private Message or the "Report this post to a moderator" link at the bottom of every post. Please do not respond directly to the member - a moderator will do what is required. 8. Any impersonation of a forum member in any mode of communication is strictly prohibited and will result in banning. 9. Multiple copies of the same post will not be tolerated. Post your question, comment, or complaint only once. There is no need to express yourself more than once. Duplicate posts will be deleted with little or no warning. Keep in mind a forum censor may temporarily automatically hold up your post, if you do not see your post, do not post again, it will be dealt with by a moderator within a reasonable time. Authors of multiple copies of same post may be dealt with by moderators within their discrete judgment at the time which may result in warning or infraction points, depending on severity as adjudged by the moderators online. 10. Posts should have descriptive topics. Vague titles such as "Help!", "Why?", and the like may not get enough attention to the contents. 11. Do not divulge anyone's personal information in the forum, not even your own. This includes e-mail addresses, IP addresses, age, house address, and any other distinguishing information. Don´t use eMail addresses in your nick. Reiterating, do not post your email address in posts. This is for your own protection. 12. Signatures may be used as long as they are not offensive or sexually explicit or used for commercial advertising. Commercial weblinks cannot be used under any circumstances and will result in an immediate ban. 13. Dual accounts are not allowed. Cannot explain this more simply. Attempts to set up dual accounts will most likely result in a banning of all forum accounts. 14. Video links may only be posted after you have a tally of two forum posts. Video link posting with less than a 2 post tally are considered as spam. Video link posting with less than a 2 post tally are considered as spam. 15. Failure to show that you have read the forum rules may result in forum rules breach infraction points or warnings awarded against you which may later total up to an automatic temporary or permanent ban. Supplying system details is a prerequisite in most cases, particularly with connection or installation issues. Violation of any of these rules will bring consequences, determined on a case-by-case basis. Thank You! Thanks for taking the time to read these forum guidelines. We hope your visit is helpful and mutually beneficial to the entire community. |
View Poll Results: Do you want this feature to be available? | |||
Yes | 51 | 92.73% | |
No way! | 3 | 5.45% | |
I don't know | 1 | 1.82% | |
Voters: 55. You may not vote on this poll |
| LinkBack | Thread Tools | Display Modes |
| |||
and yes i read the thread and thier 3 topics... but i do have the right to focus on just one for a sec..... also (im that that bright.... whats does BTW mean)..... it took me 6 years to figure out whats lol means.... where can i get an "internet dictionary?) |
| |||
??? "real programmers know whats a REAL hash filter does" and how it works.... it just that internet revolution changes opps you DID do a funny ...and what the word hash really means..... maybe you shoud pm me to you "brain war" so we wont take up precious space in this forum.... and there wont be any public embarasments on both sides.... oh but maybe...... WERE ON THE SAME SIDE!!!!!!! HELLO?? Last edited by macho6868; September 3rd, 2006 at 11:11 PM. Reason: mispelled words.... the keyboard cant type...lol |
| |||
Hashing is the transformation of a string of characters into a usually shorter fixed-length value or key that represents the original string. Hashing is used to index and retrieve items in a database because it is faster to find the item using the shorter hashed key than to find it using the original value. It is also used in many encryption algorithms. As a simple example of the using of hashing in databases, a group of people could be arranged in a database like this: Abernathy, Sara Epperdingle, Roscoe Moore, Wilfred Smith, David (and many more sorted into alphabetical order) Each of these names would be the key in the database for that person's data. A database search mechanism would first have to start looking character-by-character across the name for matches until it found the match (or ruled the other entries out). But if each of the names were hashed, it might be possible (depending on the number of names in the database) to generate a unique four-digit key for each name. For example: 7864 Abernathy, Sara 9802 Epperdingle, Roscoe 1990 Moore, Wilfred 8822 Smith, David (and so forth) A search for any name would first consist of computing the hash value (using the same hash function used to store the item) and then comparing for a match using that value. It would, in general, be much faster to find a match across four digits, each having only 10 possibilities, than across an unpredictable value length where each character had 26 possibilities. The hashing algorithm is called the hash function (and probably the term is derived from the idea that the resulting hash value can be thought of as a "mixed up" version of the represented value). In addition to faster data retrieval, hashing is also used to encrypt and decrypt digital signatures (used to authenticate message senders and receivers). The digital signature is transformed with the hash function and then both the hashed value (known as a message-digest) and the signature are sent in separate transmissions to the receiver. Using the same hash function as the sender, the receiver derives a message-digest from the signature and compares it with the message-digest it also received. They should be the same. The hash function is used to index the original value or key and then used later each time the data associated with the value or key is to be retrieved. Thus, hashing is always a one-way operation. There's no need to "reverse engineer" the hash function by analyzing the hashed values. In fact, the ideal hash function can't be derived by such analysis. A good hash function also should not produce the same hash value from two different inputs. If it does, this is known as a collision. A hash function that offers an extremely low risk of collision may be considered acceptable. Here are some relatively simple hash functions that have been used: The division-remainder method: The size of the number of items in the table is estimated. That number is then used as a divisor into each original value or key to extract a quotient and a remainder. The remainder is the hashed value. (Since this method is liable to produce a number of collisions, any search mechanism would have to be able to recognize a collision and offer an alternate search mechanism.) Folding: This method divides the original value (digits in this case) into several parts, adds the parts together, and then uses the last four digits (or some other arbitrary number of digits that will work ) as the hashed value or key. Radix transformation: Where the value or key is digital, the number base (or radix) can be changed resulting in a different sequence of digits. (For example, a decimal numbered key could be transformed into a hexadecimal numbered key.) High-order digits could be discarded to fit a hash value of uniform length. Digit rearrangement: This is simply taking part of the original value or key such as digits in positions 3 through 6, reversing their order, and then using that sequence of digits as the hash value or key. A hash function that works well for database storage and retrieval might not work as for cryptographic or error-checking purposes. There are several well-known hash functions used in cryptography. These include the message-digest hash functions MD2, MD4, and MD5, used for hashing digital signatures into a shorter value called a message-digest, and the Secure Hash Algorithm (SHA), a standard algorithm, that makes a larger (60-bit) message digest and is similar to MD4 |
| |||
The purpose of this one-way-hash filter is to have large scale hashing in a small subset of data. It is similar to a bloom filter except that instead of using a binary system the filter uses an a-z system where each point can hold more that one value and is not limited to binary values. View source code v.01 going to add in bitwise shifting soon. View example of a spellchecker done entirely in php with a 300,000 word dictionary (2.5M), the hash table is a 50x50 (1 in 3,250,000 probabilty) at that size the hash table is 419k serialized and uncompressed, it could be done in a 25x25 hash table which would be around 200k however the probabilty of false positives is much much higher (Compare 25x25 matrix example against the above example). -------------------------------------------------------------------------------- Why use these? A 250x250 matrix (large) has a unique factor of over 40 million and is around 16 megabytes blank. The structure is designed for large hashes. It would be easily possible to hold the entire domain name system hashed in around 20 megs. What this means is you could have very fast lookups on if a name is in the system. -------------------------------------------------------------------------------- How does this filter work? Take an item and hash it four times, modulus each value to the following values. A marker for the X axis of the matrix A marker for the Y axis of the matrix A marker for how deep into the hash at the matrix point to go into. A marker for the letter to use, Insert it twice by flipping the X and Y values. Each item hashed in the database will have a unique value set to each of the above markers. For lookups, the letter must be in two exact places withing the entire filter. In the above example of the spelling checker, there is a 1/1.625M probabily of a duplicate at any given point (due to doubling data inserting). These filters can take a while to generate, they are very processor intensive to create. These filters use a matrix format for speed in lookups, each matrix point holds a small filter. -------------------------------------------------------------------------------- How is that this filter can hold so much information? Example small filter 0000000000 Insert "d" at position '2' (note:Counting starts at zero) 00d0000000 Insert "z" at position '5' (note:Counting starts at zero) 00d00z0000 Now it becomes interesting. insert "f" at position '2' (on top of another letter in place) 00Df00z0000 If another needs to be positioned at the same point. insert "g" at position '2' (on top of another two letters in place) 00DFg00z0000 The array maintains all positions and if it encounters an uppercase letter it concats until a non-uppercase letter is found. Array ( [0] => 0 [1] => 0 [2] => DFg [3] => 0 [4] => 0 [5] => z [6] => 0 [7] => 0 [8] => 0 [9] => 0 ) Multiple hashing algorithms are used for placements. TO DO: Non-XY-flip version that uses a unique secondary value or a combination of other values, however I felt that at this time the number of hashes is approaching maximum. More documentation and list of papers cited. List of links to projects/experiments using bloom filters for large datasets. |
| |||
just in case i dinnt type all of this...... A hash function (or hash algorithm) is a way of creating a small digital "fingerprint" from any kind of data. The function chops and mixes (i.e., substitutes or transposes) the data to create the fingerprint, often called a hash value. The hash value is commonly represented as a short string of random-looking letters and numbers (Binary data written in hexadecimal notation). A good hash function is one that yields few hash collisions in expected input domains. In hash tables and data processing, collisions inhibit the distinguishing of data, making records more costly to find. A typical hash function at workContents [hide] 1 Properties of hash function 2 Applications of hash function 2.1 Cryptography 2.2 Hash tables 2.3 Error correction 2.4 Audio identification 2.5 Rabin-Karp string search algorithm 3 Origins of the term 4 See also 5 References 6 External links [edit] Properties of hash function A fundamental property of all hash functions is that if two hashes (according to the same function) are different, then the two inputs are different in some way. This property is a consequence of hash functions being deterministic. On the other hand, a hash function is not injective, i.e. the equality of two hash values ideally strongly suggests, but does not guarantee, the equality of the two inputs. If a hash value is calculated for a piece of data, and then one bit of that data is changed, a hash function with strong mixing property usually produces a completely different hash value. Typical hash functions have an infinite domain, such as byte strings of arbitrary length, and a finite range, such as bit sequences of some fixed length. In certain cases, hash functions can be designed with one-to-one mapping between identically sized domain and range. Hash functions that are one-to-one are also called permutations. Reversibility is achieved by using a series of reversible "mixing" operations on the function input. [edit] Applications of hash function Because of the variety of applications for hash functions (details below), they are often tailored to the application. For example, cryptographic hash functions assume the existence of an adversary who can deliberately try to find inputs with the same hash value. A well designed cryptographic hash function is a "one-way" operation: there is no practical way to calculate a particular data input that will result in a desired hash value, so it is also very difficult to forge. Functions intended for cryptographic hashing, such as MD5, are commonly used as stock hash functions. Functions for error detection and correction focus on distinguishing cases in which data has been disturbed by random processes. When hash functions are used for checksums, the relatively small hash value can be used to verify that a data file of any size has not been altered. [edit] Cryptography Main article: cryptographic hash function A typical cryptographic one-way function is not one-to-one and makes an effective hash function; a typical cryptographic trapdoor function is one-to-one and makes an effective randomization function. [edit] Hash tables Main article: Hash table Hash tables, a major application for hash functions, enable fast lookup of a data record given its key. (Note: Keys are not usually secret as in cryptography, but both are used to "unlock" or access information.) For example, keys in an English dictionary would be English words, and their associated records would contain definitions. In this case, the hash function must map alphabetic strings to indexes for the hash table's internal array. The generally impossible/impractical ideal for a hash table's hash function is to map each key to a unique index (see perfect hashing), because this guarantees access to each data record in the first probe into the table. Hash functions that are truly random with uniform output (including most cryptographic hash functions) are good in that, on average, only one or two probes will be needed (depending on the load factor). Perhaps as important is that excessive collision rates with random hash functions are highly improbable—if not computationally infeasible for an adversary. However, a small, predictable number of collisions is virtually inevitable (see birthday paradox). In many cases, a heuristic hash function can yield many fewer collisions than a random hash function. Heuristic functions take advantage of regularities in likely sets of keys. For example, one could design a heuristic hash function such that file names such as FILE0000.CHK, FILE0001.CHK, FILE0002.CHK, etc. map to successive indices of the table, meaning that such sequences will not collide. Beating a random hash function on "good" sets of keys usually means performing much worse on "bad" sets of keys, which can arise naturally—not just through attacks. Bad performance of a hash table's hash function means that lookup can degrade to a costly linear search. Aside from minimizing collisions, the hash function for a hash table should also be fast relative to the cost of retrieving a record in the table, as the goal of minimizing collisions is minimizing the time needed to retrieve a desired record. Consequently, the optimal balance of performance characteristics depends on the application. One of the most respected hash functions for use in typical hash tables is Bob Jenkins' LOOKUP2 hash function, published in an article in Dr. Dobb's Journal. The hash function performs well as long as there is no adversary, for it is trivially reversible and useless as a cryptographic hash function. [edit] Error correction Main article: Error correction and detection Using a hash function to detect errors in transmission is straightforward. The hash function is computed for the data at the sender, and the value of this hash is sent with the data. The hash function is performed again at the receiving end, and if the hash values do not match, an error has occurred at some point during the transmission. This is called a redundancy check. For error correction, a distribution of likely perturbations is assumed at least approximately. Perturbations to a string are then classified into large (improbable) and small (probable) errors. The second criterion is then restated so that if we are given H(x) and x+s, then we can compute x efficiently if s is small. Such hash functions are known as error correction codes. Important sub-class of these correction codes are cyclic redundancy checks and Reed-Solomon codes. [edit] Audio identification For audio identification such as finding out whether an MP3 file matches one of a list of known items, one could use a conventional hash function such as MD5, but this would be very sensitive to highly likely perturbations such as time-shifting, CD read errors, different compression algorithms or implementations or changes in volume. Using something like MD5 is useful as a first pass to find exactly identical files, but another more advanced algorithm is required to find all items that would nonetheless be interpreted as identical to a human listener. Though they are not common[citation needed], hashing algorithms do exist that are robust to these minor differences. Most of the algorithms available are not extremely robust, but some are so robust that they can identify music played on loud-speakers in a noisy room.[citation needed] One example of this in practical use is the service Shazam. Customers call a number and place their telephone near a speaker. The service then analyses the music, and compares it to known hash values in its database. The name of the music is sent to the user (for a charge!). An open source alternative to this service is MusicBrainz which creates a fingerprint for an audio file and matches it to its online community driven database. [edit] Rabin-Karp string search algorithm Rabin-Karp string search algorithm is a relatively fast string searching algorithm that works in O(n) time on average. It is based on the use of hashing to compare strings. [edit] Origins of the term The term "hash" apparently comes by way of analogy with its standard meaning in the physical world, to "chop and mix." Knuth notes that Hans Peter Luhn of IBM appears to have been the first to use the concept, in a memo dated January 1953; the term hash came into use some ten years later. In the SHA-1 algorithm, for example, the domain is "flattened" and "chopped" into "words" which are then "mixed" with one another using carefully chosen mathematical functions. The range ("hash value") is made to be a definite size, 160 bits (which may be either smaller or larger than the domain), through the use of modular division. |
| |
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
File size filter a must! | kis46 | New Feature Requests | 37 | January 21st, 2007 08:21 AM |
File Size Filter??? | brianosaur | Open Discussion topics | 0 | May 28th, 2006 01:38 PM |
File Size Filter | Trip LD | New Feature Requests | 16 | November 8th, 2005 08:56 PM |
File size filter | Nvi | New Feature Requests | 6 | November 5th, 2005 09:10 AM |
Can you filter by file size??? | motter28218 | Open Discussion topics | 0 | October 10th, 2005 06:52 PM |