The limit of 3 characters was designed at a time when only ASCII searches were reliable.
But since we now support Unicode to handle any language, this rule should be rewritten to require a minimum of 3 UTF-8 encoded bytes for a search.
This won't change anything for ASCII searches: it will still be 3 characters.
But for general European Latin/Greek searches, it will mean that 2 characters are enough if at least one of them is not ASCII (note, however, that searches ignore and drop accents, even though combining accents are still returned in the results).
For Asian languages, 3 UTF-8 bytes encode a single ideograph or a single Hiragana or Katakana character. Maybe this limit of 3 bytes is too low.
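To make the byte counts above concrete, here is a minimal sketch of the 3-byte rule in Python. The function names and the min_bytes parameter are hypothetical, chosen only for illustration; they are not the search engine's actual code. It also assumes the query is measured as typed, before any accent folding.

```python
def utf8_length(query: str) -> int:
    """Return the number of bytes the query occupies once encoded as UTF-8."""
    return len(query.encode("utf-8"))


def meets_byte_minimum(query: str, min_bytes: int = 3) -> bool:
    """Hypothetical check: accept a query only if it has at least min_bytes UTF-8 bytes."""
    return utf8_length(query) >= min_bytes


# ASCII letters take 1 byte each, so the rule is unchanged for them.
print(utf8_length("abc"))   # 3
# An accented Latin letter or a Greek letter takes 2 bytes.
print(utf8_length("é"))     # 2
print(utf8_length("aé"))    # 3
# A CJK ideograph or a kana takes 3 bytes, so one alone passes a 3-byte limit.
print(utf8_length("日"))    # 3
print(utf8_length("か"))    # 3
```

Under this rule a single ideograph or kana already passes, which is the case that suggests 3 bytes may be too permissive.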
So as a prudent alternative, I would say that a search should require either 3 ASCII-only characters or at least 4 bytes of UTF-8 encoding. For European languages, this means 3 ASCII characters, or 2 ASCII and 1 extended character, or 2 extended characters; for Asian texts, this means a minimum of 2 ideographs or 2 hiragana/katakana, ignoring the combining voice or tone marks.
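The alternative rule could be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, combining marks are dropped only when they appear as separate code points (precomposed letters such as "é" are left intact and counted as 2 bytes), and nothing here reflects the actual search implementation.

```python
import unicodedata


def strip_combining_marks(query: str) -> str:
    """Drop standalone combining marks (accents, voicing marks) before counting."""
    return "".join(ch for ch in query if unicodedata.combining(ch) == 0)


def is_long_enough(query: str) -> bool:
    """Hypothetical check: 3 ASCII characters, or at least 4 UTF-8 bytes otherwise."""
    core = strip_combining_marks(query)
    if core.isascii():
        # Pure ASCII: the historical 3-character limit still applies.
        return len(core) >= 3
    # At least one non-ASCII character: require 4 or more UTF-8 bytes.
    return len(core.encode("utf-8")) >= 4


for sample in ["abc", "ab", "abé", "éè", "aé", "日", "日本", "かな"]:
    print(sample, is_long_enough(sample))
```

With this check, "abé" and "éè" pass (4 bytes each), "aé" is rejected (3 bytes), a single ideograph is rejected, and two ideographs or two kana pass (6 bytes).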