Non-native Speech Database
A non-native speech database is a speech corpus of non-native pronunciations of English. Such databases are used in the development of multilingual automatic speech recognition systems, text-to-speech systems, pronunciation trainers, and second language learning systems. The table with information about the different databases is shown in Table 2; abbreviations used there for language names are listed in Table 1. Table 2 gives the following information about each corpus: the name of the corpus, the institution where the corpus can be obtained (or where further information is available), the language actually spoken by the speakers, the number of speakers, the native language of the speakers, the total number of non-native utterances the corpus contains, the duration in hours of the non-native part, the date ...
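The per-corpus fields enumerated above amount to a small record type. Below is a minimal sketch in Python of one such record; the field names are illustrative choices, not the actual column labels of Table 2.

    # One row of the corpus table described above; field names are
    # illustrative, not the table's actual column labels.
    from dataclasses import dataclass

    @dataclass
    class NonNativeCorpus:
        name: str                    # name of the corpus
        institution: str             # where the corpus or further info is available
        spoken_language: str         # language actually spoken by the speakers
        num_speakers: int            # number of speakers
        native_languages: list[str]  # native language(s) of the speakers
        num_utterances: int          # total non-native utterances in the corpus
        duration_hours: float        # duration of the non-native part, in hours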


Speech Corpus
A speech corpus (or spoken corpus) is a database of speech audio files and text transcriptions. In speech technology, speech corpora are used, among other things, to create acoustic models (which can then be used with a speech recognition or speaker identification engine). In linguistics, spoken corpora are used for research in phonetics, conversation analysis, dialectology and other fields. (Corpora is the plural of corpus.) There are two types of speech corpora:

1. Read speech, which includes:
   - Book excerpts
   - Broadcast news
   - Lists of words
   - Sequences of numbers
2. Spontaneous speech, which includes:
   - Dialogs – between two or more people (includes meetings; one such corpus is the KEC)
   - Narratives – a person telling a story (one such corpus is the Buckeye Corpus)
   - Map-tasks – one person explains a route on a map to another
   - Appointment-tasks – two people try t ...
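The read/spontaneous distinction above is a simple two-level taxonomy, and the core of any speech corpus is the pairing of audio with its transcription. A minimal sketch of how one might encode both, with illustrative names and paths only:

    # A sketch of the audio/transcription pairing and the read/spontaneous
    # taxonomy described above; names and paths are illustrative.
    from dataclasses import dataclass
    from enum import Enum

    class Style(Enum):
        READ = "read"                # book excerpts, broadcast news, word lists, digits
        SPONTANEOUS = "spontaneous"  # dialogs, narratives, map-tasks, appointment-tasks

    @dataclass
    class Utterance:
        audio_path: str  # speech audio file
        transcript: str  # text transcription
        style: Style

    corpus = [Utterance("rec001.wav", "she had your dark suit in greasy wash water", Style.READ)]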


Non-native Pronunciations Of English
Non-native pronunciations of English result from the common linguistic phenomenon in which non-native speakers of any language tend to transfer the intonation, phonological processes and pronunciation rules of their first language into their English speech. They may also create innovative pronunciations not found in the speaker's native language. Overview Non-native English speakers may pronounce words differently than native speakers either because they apply the speech rules of their mother tongue to English ("interference") or through implementing strategies similar to those used in first language acquisition. They may also create innovative pronunciations for English sounds not found in the speaker's first language. The extent to which native speakers can identify a non-native accent is linked to the age at which individuals begin to immerse themselves in a language. Scholars disagree on the precise nature of this link, which might be influenced by a combination of fact ...


Automatic Speech Recognition
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. It is also known as automatic speech recognition (ASR), computer speech recognition or speech-to-text (STT). It incorporates knowledge and research in the computer science, linguistics and computer engineering fields. The reverse process is speech synthesis. Some speech recognition systems require "training" (also called "enrollment"), where an individual speaker reads text or isolated vocabulary into the system. The system analyzes the person's specific voice and uses it to fine-tune the recognition of that person's speech, resulting in increased accuracy. Systems that do not use training are called "speaker-independent" systems. Systems that use training are called "speaker-dependent". Speech recognition applications include voice user interfaces su ...
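As a concrete illustration of speaker-independent ("no enrollment") speech-to-text, here is a minimal sketch using the third-party SpeechRecognition package for Python; the file name is a placeholder, and the cloud engine behind recognize_google() is assumed to be reachable.

    # A minimal sketch of speaker-independent speech-to-text using the
    # third-party SpeechRecognition package (pip install SpeechRecognition).
    # "utterance.wav" is a placeholder file name.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile("utterance.wav") as source:
        audio = recognizer.record(source)  # read the whole file into memory

    # recognize_google() sends the audio to a cloud engine trained on many
    # speakers, so no per-speaker enrollment ("training") is required.
    try:
        print(recognizer.recognize_google(audio))
    except sr.UnknownValueError:
        print("speech was unintelligible")
    except sr.RequestError as e:
        print(f"engine unreachable: {e}")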


Text-to-speech
Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech. The reverse process is speech recognition. Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely "synthetic" voice output. The ...
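A toy sketch of the concatenative approach described above: recorded units are stored in a database (here just a dict mapping words to wave files) and stitched together. All file names and the word-sized unit inventory are hypothetical; a real system would store phones or diphones and smooth the joins.

    # Toy concatenative synthesis: look up pre-recorded units in a database
    # and join them end to end. Paths and unit inventory are hypothetical.
    import wave
    import numpy as np

    UNIT_DB = {"hello": "units/hello.wav", "world": "units/world.wav"}

    def load_pcm(path):
        with wave.open(path, "rb") as w:
            frames = w.readframes(w.getnframes())
            return np.frombuffer(frames, dtype=np.int16), w.getframerate()

    def synthesize(text, out_path="out.wav"):
        pieces, rate = [], None
        for word in text.lower().split():
            samples, rate = load_pcm(UNIT_DB[word])
            pieces.append(samples)
        speech = np.concatenate(pieces)  # naive joins; real systems smooth the seams
        with wave.open(out_path, "wb") as w:
            w.setnchannels(1)
            w.setsampwidth(2)  # 16-bit PCM
            w.setframerate(rate)
            w.writeframes(speech.tobytes())

    synthesize("hello world")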


Computer-assisted Language Learning
Computer-assisted language learning (CALL), also known as computer-aided instruction (CAI) or computer-aided language instruction (CALI), is briefly defined by Levy (1997: p. 1) as "the exploration and study of computer applications in language teaching and learning" (Levy, M. (1997). CALL: Context and Conceptualisation. Oxford: Oxford University Press). CALL embraces a wide range of information and communications technology applications and approaches to teaching and learning foreign languages, ranging from the traditional drill-and-practice programs that characterized CALL in the 1960s and 1970s to more recent manifestations, such as those utilizing virtual learning environments and Web-based distance learning. It also extends to the use of corpora and concordancers, interactive whiteboards (Schmid, Euline Cutrim. (2009). Interactive Whiteboard Technology in the Language Classroom: Exploring New Pedagogical Opportunities. Saarbrücken, Germa ...


ICASSP
ICASSP, the International Conference on Acoustics, Speech, and Signal Processing, is an annual flagship conference organized by the IEEE Signal Processing Society. Ei Compendex has indexed all papers included in its proceedings. The first ICASSP was held in 1976 in Philadelphia, Pennsylvania, based on the success of a conference in Massachusetts four years earlier that had focused specifically on speech signals. As ranked by Google Scholar's h-index metric in 2016, ICASSP has the highest h-index of any conference in the signal processing field. (The h-index is an author-level metric that measures both the productivity and citation impact of a body of publications.) The Brazilian ministry of education gave the conference an 'A1' rating based on its h-index. ...
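To make the h-index mentioned above concrete: it is the largest h such that at least h publications have at least h citations each. A quick illustration (the citation counts are made up):

    # h-index as described above: the largest h such that at least
    # h items have at least h citations each.
    def h_index(citations):
        cites = sorted(citations, reverse=True)
        h = 0
        for i, c in enumerate(cites, start=1):
            if c >= i:
                h = i  # the i-th most-cited item still has >= i citations
            else:
                break
        return h

    print(h_index([10, 8, 5, 4, 3]))  # -> 4 (four items with >= 4 citations each)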


Karen Livescu
Karen Livescu is an American computer scientist specializing in speech processing and natural language processing, and applications of deep learning to these topics. She is a professor at the Toyota Technological Institute at Chicago and a part-time associate professor of computer science at the University of Chicago. Education and career Livescu majored in physics at Princeton University, graduating in 1996 with an honors thesis on signal processing in speech supervised by computer scientist Kenneth Steiglitz. After visiting the Technion – Israel Institute of Technology, she received a master's degree in 1999 and a Ph.D. in 2005 from the Massachusetts Institute of Technology (MIT). Her doctoral dissertation, ''Feature-Based Pronunciation Modeling for Automatic Speech Recognition'', was supervised by James Glass. She became a Clare Boothe Luce Postdoctoral Lecturer at MIT, and then a research assistant professor at the Toyota Technological Institute, before becoming a regular ...


Lori Lamel
Lori Faith Lamel is a speech processing researcher known for her work with the TIMIT corpus of American English speech and for her work on voice activity detection, speaker recognition, and other non-linguistic inferences from speech signals. She works for the French National Centre for Scientific Research (CNRS) as a senior research scientist in the Spoken Language Processing Group of the Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur. Education and career Lamel was a student at the Massachusetts Institute of Technology (MIT), where she earned bachelor's and master's degrees in electrical engineering and computer science in 1980 as a co-op student with Bell Labs. She earned her Ph.D. at MIT in 1988, with the dissertation ''Formalizing Knowledge used in Spectrogram Reading: Acoustic and perceptual evidence from stops'' supervised by Victor Zue. She completed a habilitation in 2004 at Paris-Sud University. She was a visiting researcher at CNRS in ...


English As A Second Or Foreign Language
English as a second or foreign language refers to the use of English by individuals whose native language is different, commonly among students learning to speak and write English. Variously known as English as a foreign language (EFL), English as a second language (ESL), English for speakers of other languages (ESOL), English as an additional language (EAL), or English as a new language (ENL), these terms denote the study of English in environments where it is not the dominant language. Programs such as ESL are designed as academic courses to instruct non-native speakers in English proficiency, encompassing both learning in English-speaking nations and abroad. Teaching methodologies include teaching English as a foreign language (TEFL) in non-English-speaking countries, teaching English as a second language (TESL) in English-speaking nations, and teaching English to speakers of other languages (TESOL) worldwide. These terms, while distinct in scope, are often used intercha ...

