Metaphone
Metaphone is a phonetic algorithm, published by Lawrence Philips in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about variations and inconsistencies in English spelling and pronunciation to produce a more accurate encoding, which does a better job of matching words and names that sound similar. As with Soundex, similar-sounding words should share the same keys. Metaphone is available as a built-in operator in a number of systems. Philips later produced a new version of the algorithm, which he named Double Metaphone. Unlike the original algorithm, whose application is limited to English, this version takes into account spelling peculiarities of a number of other languages. In 2009 Philips released a third version, called Metaphone 3, which achieves an accuracy of approximately 99% for English words, non-English words familiar to Americans, and first names and family nam ...


Phonetic Algorithm
A phonetic algorithm is an algorithm for indexing words by their pronunciation. If the algorithm is based on orthography, it depends crucially on the spelling system of the language it is designed for: as most phonetic algorithms were developed for English, they are less useful for indexing words in other languages. Because English spelling varies significantly depending on multiple factors, such as the word's origin, usage over time, and borrowings from other languages, phonetic algorithms necessarily take into account numerous rules and exceptions. More general phonetic matching algorithms take articulatory features into account. Phonetic search has many applications, and one of the early use cases was trademark search, to ensure that newly registered trademarks do not risk infringing on existing trademarks by virtue of their pronunciation. Fall, Caspar J., and Christophe Giraud-Carrier. "Searching trademark databases for verbal similarities." World Patent Info ...


Soundex
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for homophones to be encoded to the same representation so that they can be matched despite minor differences in spelling. The algorithm mainly encodes consonants; a vowel will not be encoded unless it is the first letter. Soundex is the most widely known of all phonetic algorithms, in part because it is a standard feature of popular database software such as IBM Db2, PostgreSQL, MySQL, SQLite, Ingres, MS SQL Server, Oracle, ClickHouse, Snowflake and SAP ASE. Improvements to Soundex are the basis for many modern phonetic algorithms. History: Soundex was developed by Robert C. Russell and Margaret King Odell and patented in 1918 and 1922. A variation, American Soundex, was used in the 1930s for a retrospective analysis of the US censuses from 1890 through 1920. The Soundex code came to prominence in the 1960s when it was the subject of several articles in the Communications an ...
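The excerpt above is cut off before it gives the encoding rules, so the following is only a minimal sketch of the commonly documented American Soundex variant. The digit table, the treatment of H and W, and the function name `soundex` are assumptions drawn from that general description, not from the text above.

```python
# Assumed American Soundex digit table (not given in the excerpt above).
CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex(name: str) -> str:
    """Return a 4-character code: first letter plus three digits."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]                    # the first letter is kept as-is
    prev = CODES.get(letters[0], "")     # digit of the previous coded letter
    for c in letters[1:]:
        if c in "HW":
            continue                     # H and W do not separate equal digits
        digit = CODES.get(c, "")         # vowels (and Y) map to ""
        if digit and digit != prev:
            code += digit                # skip runs of the same digit
        prev = digit                     # a vowel resets the run
    return (code + "000")[:4]            # pad or truncate to 4 characters


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

Under these assumed rules, "Robert" and "Rupert" share a key, which is the matching behaviour the description aims for.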


Caverphone
Caverphone, within linguistics and computing, is a phonetic matching algorithm invented to identify English names by their sounds, originally built to process a custom dataset compiled between 1893 and 1938 in southern Dunedin, New Zealand. Starting from a concept similar to Metaphone, it has since been developed to accommodate and process general English. Etymology: Caverphone was created by David Hood in the Caversham Project at the University of Otago in New Zealand in 2002 and revised in 2004. It was created to assist in data matching between late 19th century and early 20th century electoral rolls, where the name only needed to be in a "commonly recognisable form". The algorithm was intended to apply to those names that could not easily be matched between electoral rolls, after the exact matches were removed from the pool of potential matches. The algorithm is optimised for accents present in the study area (southern part of the city of Dunedin, New Zealand). Proc ...




Approximate String Matching
In computer science, approximate string matching (often colloquially referred to as fuzzy string searching) is the technique of finding strings that match a pattern approximately (rather than exactly). The problem of approximate string matching is typically divided into two sub-problems: finding approximate substring matches inside a given string and finding dictionary strings that match the pattern approximately. Overview: The closeness of a match is measured in terms of the number of primitive operations necessary to convert the string into an exact match. This number is called the edit distance between the string and the pattern. The usual primitive operations are:
* insertion: cot → coat
* deletion: coat → cot
* substitution: coat → cost
These three operations may be generalized as forms of substitution by adding a NULL character (here symbolized by *) wherever a character has been deleted or inserted:
* insertion: co*t → coat
* del ...
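The edit distance described above, counting insertions, deletions and substitutions at a cost of one each, can be sketched with the standard dynamic-programming recurrence. The function name `edit_distance` and the unit costs are illustrative assumptions; the excerpt does not prescribe an implementation.

```python
def edit_distance(source: str, target: str) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn `source` into `target` (unit cost per operation)."""
    # previous[j] = distance between the processed prefix of source
    # and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, t_char in enumerate(target, start=1):
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            substitution = previous[j - 1] + (s_char != t_char)
            current.append(min(deletion, insertion, substitution))
        previous = current
    return previous[-1]


print(edit_distance("cot", "coat"))   # 1 (insertion)
print(edit_distance("coat", "cot"))   # 1 (deletion)
print(edit_distance("coat", "cost"))  # 1 (substitution)
```

Only two rows of the table are kept at a time, so memory use grows with the length of `target` rather than with the product of both lengths.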


ASCII
ASCII, an acronym for American Standard Code for Information Interchange, is a character encoding standard for representing a particular set of 95 (English-language-focused) printable and 33 control characters, a total of 128 code points. The set of available punctuation had significant impact on the syntax of computer languages and text markup. ASCII hugely influenced the design of character sets used by modern computers; for example, the first 128 code points of Unicode are the same as ASCII. ASCII encodes each code point as a value from 0 to 127, storable as a seven-bit integer. Ninety-five code points are printable, including digits 0 to 9, lowercase letters a to z, uppercase letters A to Z, and commonly used punctuation symbols. For example, the letter i is represented as 105 (decimal). ASCII also specifies 33 non-printing control codes, most of which are now obsolete. The control cha ...
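The counts quoted above can be checked with a short sketch. It assumes the conventional split of the 128 code points into printable values 32 to 126 (including the space) and control values 0 to 31 plus 127; the variable names are illustrative.

```python
# Conventional partition of the 128 ASCII code points (an assumption
# consistent with the 95 + 33 split quoted above).
printable = [chr(code) for code in range(32, 127)]                     # space .. '~'
control = [code for code in range(128) if code < 32 or code == 127]   # NUL .. US, DEL

print(len(printable), len(control))  # 95 33
print(ord("i"))                      # 105
print("i".encode("ascii"))           # b'i', a single byte with value 105

# Every code point fits in seven bits.
assert all(code < 2 ** 7 for code in range(128))
```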


French Language
French is a Romance language of the Indo-European family. Like all other Romance languages, it descended from the Vulgar Latin of the Roman Empire. French evolved from Northern Old Gallo-Romance, a descendant of the Latin spoken in Northern Gaul. Its closest relatives are the other langues d'oïl, languages historically spoken in northern France and in southern Belgium, which French (Francien) largely supplanted. It was also influenced by native Celtic languages of Northern Roman Gaul and by the Germanic Frankish language of the post-Roman Frankish invaders. As a result of French and Belgian colonialism from the 16th century onward, it was introduced to new territories in the Americas, Africa, and Asia, and numerous French-based creole languages, most notably Haitian Creole, were established. A French-speaking person or nation may be referred to as Fra ...


New York State Identification And Intelligence System
The New York State Identification and Intelligence System Phonetic Code, commonly known as NYSIIS, is a phonetic algorithm devised in 1970 as part of the New York State Identification and Intelligence System (now a part of the New York State Division of Criminal Justice Services). It features an accuracy increase of 2.7% over the traditional Soundex algorithm. Procedure: The algorithm, as described in Name Search Techniques, is:
1. If the first letters of the name are
   * 'MAC' then change these letters to 'MCC'
   * 'KN' then change these letters to 'NN'
   * 'K' then change this letter to 'C'
   * 'PH' then change these letters to 'FF'
   * 'PF' then change these letters to 'FF'
   * 'SCH' then change these letters to 'SSS'
2. If the last l ...
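The first step listed above, the prefix substitutions, can be sketched as follows. This covers only that step, since the remaining steps are truncated in the excerpt. The names `PREFIX_RULES` and `apply_prefix_rules` are hypothetical, and upper-casing the input is an assumption made for convenience.

```python
# Prefix rules from step 1, in the order listed; 'KN' must be tested
# before 'K' so the longer prefix wins.
PREFIX_RULES = [
    ("MAC", "MCC"),
    ("KN", "NN"),
    ("K", "C"),
    ("PH", "FF"),
    ("PF", "FF"),
    ("SCH", "SSS"),
]


def apply_prefix_rules(name: str) -> str:
    """Apply only the first NYSIIS step (prefix rewriting) to a name."""
    upper = name.upper()  # assumption: work on upper-case letters
    for prefix, replacement in PREFIX_RULES:
        if upper.startswith(prefix):
            # At most one rule applies; replace the prefix and stop.
            return replacement + upper[len(prefix):]
    return upper


print(apply_prefix_rules("Knight"))   # NNIGHT
print(apply_prefix_rules("Schmidt"))  # SSSMIDT
```

This output is not a NYSIIS code; it is only the intermediate result after the first step.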


Consonant
In articulatory phonetics, a consonant is a speech sound that is articulated with complete or partial closure of the vocal tract, except for the h sound, which is pronounced without any stricture in the vocal tract. Examples are [p] and [b], pronounced with the lips; [t] and [d], pronounced with the front of the tongue; [k] and [g], pronounced with the back of the tongue; [h], pronounced throughout the vocal tract; [f], [v], [s], and [z], pronounced by forcing air through a narrow channel (fricatives); and [m] and [n], which have air flowing through the nose (nasals). Most consonants are pulmonic, using air pressure from the lungs to generate a sound. Very few natural languages are non-pulmonic, making use of ejectives, implosives, and clicks. Contrasting with consonants are vowels. Since the number of speech sounds in the world's languages is much greater than the number of letters in any one alphabet, Linguis ...




Várzea Paulista
Várzea Paulista is a municipality in the state of São Paulo in Brazil. The population is 123,071 (2020 est.) in an area of 35.1 km². The elevation is 745 m. It is part of the agglomeration of Jundiaí. Media: In telecommunications, the city was served by Telecomunicações de São Paulo. In July 1998, this company was acquired by Telefónica, which adopted the Vivo brand in 2012. The company is currently an operator of cell phones, fixed lines, internet (fiber optics/4G) and television (satellite and cable). Geography: It is located at latitude 23°12'41" South and longitude 46°49'42" West, at an altitude of 745 meters. The municipality's urban area forms a conurbation with Jundiaí. Metropolitan Region: The city is part of the Jundiaí Metropolitan Region. Várzea Paulista has ceased to be a dormitory town and has become an industrial city. According to the IBGE, it is among the cities that have been generating the most jobs, surpassing the average of other citie ...


Brazilian Portuguese
Brazilian Portuguese (also known as pt-BR) is the set of varieties of the Portuguese language native to Brazil. It is spoken by almost all of the 203 million inhabitants of Brazil and widely across the Brazilian diaspora, today consisting of about two million Brazilians who have emigrated to other countries. With a population of over 203 million, Brazil is by far the world's largest Portuguese-speaking nation and the only one in the Americas where Portuguese, of which Brazilian Portuguese is a variety, is the official language under Article 13 of the Constitution. The Academia Brasileira de Letras (ABL) plays a significant cultural role in its development but has no legal regulatory authority over the language, which is shaped primarily by usage and educational norms in Brazil. Brazilian Portuguese differs notably from European Portuguese in phonetics, vocabulary, and grammar, though it remains a variety of Portuguese ...