Concordancer
A concordancer is a computer program that automatically constructs a concordance. The output of a concordancer may serve as input to a translation memory system for computer-assisted translation, or as an early step in machine translation. Concordancers are also used in corpus linguistics to retrieve alphabetically or otherwise sorted lists of linguistic data from the corpus in question, which the corpus linguist then analyzes. A number of concordancers have been published, notably the Oxford Concordance Program (OCP), a concordancer first released in 1981 by Oxford University Computing Services, which was claimed to be in use in over 200 organisations worldwide.
S. Hockey and J. Martin, "The Oxford Concordance Program Version 2", Literary and Linguistic Computing, Volume 2, Issue 2, January 1987, pp. 125–131, https://doi.org/10.1093/llc/2 ...
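The retrieval step at the heart of any concordancer, locating every occurrence of a query term together with its surrounding context, can be sketched in a few lines of Python. The function name and the fixed character window are illustrative choices, not taken from any of the tools discussed here:

```python
import re

def concordance_lines(text, word, width=30):
    """Return every occurrence of `word` with up to `width` characters
    of context on either side, in document order."""
    pattern = re.compile(r"\b%s\b" % re.escape(word), re.IGNORECASE)
    hits = []
    for match in pattern.finditer(text):
        start = max(match.start() - width, 0)
        end = min(match.end() + width, len(text))
        hits.append(text[start:end].replace("\n", " "))
    return hits

corpus = "To be, or not to be, that is the question."
hits = concordance_lines(corpus, "be")
```

A real concordancer would additionally sort the hits (by keyword or by context) and stream over a large corpus rather than holding it all in memory.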

Concordance (publishing)
A concordance is an alphabetical list of the principal words used in a book or body of work, listing every instance of each word with its immediate verbal context. Historically, concordances have been compiled only for works of special importance, such as the Vedas, the Bible, the Qur'an or the works of Shakespeare, James Joyce or classical Latin and Greek authors, because of the time, difficulty, and expense involved in creating a concordance in the pre-computer era. A concordance is more than an index, with additional material such as commentary, definitions and topical cross-indexing which makes producing one a labor-intensive process even when assisted by computers. In the precomputing era, search technology was unavailable, and a concordance offered readers of long works such as the Bible something comparable to search results for every word that they would have been likely to search for ...
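The structure described above, an alphabetical list of principal words with every instance and its context, can be modelled with a short Python sketch. The sentence splitting and the stop-word list here are deliberately naive placeholders:

```python
from collections import defaultdict

def build_concordance(text, stop_words=frozenset({"a", "an", "and", "the", "of"})):
    """Map each principal word (stop words excluded) to every
    sentence-level context in which it occurs, sorted alphabetically."""
    entries = defaultdict(list)
    for sentence in text.split("."):
        for word in sentence.split():
            key = word.strip(",;:").lower()
            if key and key not in stop_words:
                entries[key].append(sentence.strip())
    return dict(sorted(entries.items()))

conc = build_concordance("The dog barked. The cat and the dog slept.")
```

A historical concordance would add commentary and cross-indexing on top of this raw word-to-contexts mapping, which is what made compilation so labor-intensive.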


Computer-assisted Translation
Computer-aided translation (CAT), also referred to as computer-assisted translation or computer-aided human translation (CAHT), is the use of software to assist a human translator in the translation process. The translation is created by a human, and certain aspects of the process are facilitated by software; this is in contrast with machine translation (MT), in which the translation is created by a computer, optionally with some human intervention (e.g. pre-editing and post-editing). CAT tools are typically understood to mean programs that specifically facilitate the actual translation process. Most CAT tools have (a) the ability to translate a variety of source file formats in a single editing environment without needing to use the file format's associated software for most or all of the translation process, (b) translation memory, and (c) integration of various utilities or processes that increase productivity and consistency in translation. ...



Computer Program
A computer program is a sequence or set of instructions in a programming language for a computer to execute. It is one component of software, which also includes documentation and other intangible components. A ''computer program'' in its human-readable form is called source code. Source code needs another computer program in order to execute, because computers can only execute their native machine instructions. Therefore, source code may be translated to machine instructions using a compiler written for the language. (Assembly language programs are translated using an assembler.) The resulting file is called an executable. Alternatively, source code may execute within an interpreter written for the language. If the executable is requested for execution, then the operating system loads it into memory and ...


Translation Memory
A translation memory (TM) is a database that stores "segments", which can be sentences, paragraphs or sentence-like units (headings, titles or elements in a list) that have previously been translated, in order to aid human translators. The translation memory stores the source text and its corresponding translation in language pairs called "translation units". Individual words are handled by terminology bases and are not within the domain of TM. Software programs that use translation memories are sometimes known as translation memory managers (TMM) or translation memory systems (TM systems, not to be confused with a translation management system (TMS), which is another type of software focused on managing the process of translation). Translation memories are typically used in conjunction with a dedicated computer-assisted translation (CAT) tool, word processing program, terminology management systems, multilingual dictionary, or even raw machine translation output. Research ind ...
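A minimal sketch of the lookup behaviour described above, assuming exact source segments as keys and difflib's similarity ratio as a stand-in for the fuzzy-matching algorithms real TM systems use; the class and method names are invented for illustration:

```python
import difflib

class TranslationMemory:
    """Toy TM: stores source/target "translation units" and returns the
    best fuzzy match at or above a similarity threshold."""

    def __init__(self):
        self.units = {}  # source segment -> target segment

    def add(self, source, target):
        self.units[source] = target

    def lookup(self, segment, threshold=0.75):
        best, score = None, 0.0
        for source, target in self.units.items():
            ratio = difflib.SequenceMatcher(None, segment, source).ratio()
            if ratio > score:
                best, score = (source, target), ratio
        return (best if score >= threshold else None), score

tm = TranslationMemory()
tm.add("Close the door.", "Ferme la porte.")
match, score = tm.lookup("Close the doors.")  # near-exact ("fuzzy") hit
```

Production systems also attach metadata to each unit (domain, author, date) and segment text with language-aware rules rather than treating segments as opaque strings.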



Machine Translation
Machine translation is the use of computational techniques to translate text or speech from one language to another, including the contextual, idiomatic and pragmatic nuances of both languages. Early approaches were mostly rule-based or statistical. These methods have since been superseded by neural machine translation and large language models. History Origins The origins of machine translation can be traced back to the work of Al-Kindi, a ninth-century Arabic cryptographer who developed techniques for systemic language translation, including cryptanalysis, frequency analysis, and probability and statistics, which are used in modern machine translation. The idea of machine translation later appeared in the 17th century. In 1629, René Descartes proposed a universal language, with equivalent ideas in different tongues sharing one symbol. The idea of using digital computers for translation of natural languages was proposed as early as 1947 by England's A. D. Booth and Warr ...


Corpus Linguistics
Corpus linguistics is an empirical method for the study of language by way of a text corpus (plural ''corpora''). Corpora are balanced, often stratified collections of authentic, "real world" texts of speech or writing that aim to represent a given linguistic variety. Today, corpora are generally machine-readable data collections. Corpus linguistics proposes that a reliable analysis of a language is more feasible with corpora collected in the field, in the natural context ("realia") of that language, with minimal experimental interference. Large collections of text (though corpora may also be small in terms of running words) allow linguists to run quantitative analyses on linguistic concepts that may be difficult to test in a qualitative manner. The text-corpus method uses the body of texts in any natural language to derive the set of abstract rules which govern that language. Those results can be used to explore the relationships between that subject language and other language ...
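One of the simplest quantitative analyses run over a corpus is normalised word frequency, conventionally reported per million words so that corpora of different sizes can be compared. A sketch, with a toy token list standing in for a real tokenised corpus:

```python
from collections import Counter

def per_million_frequencies(tokens):
    """Relative frequency of each token, normalised per million words."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count * 1_000_000 / total for word, count in counts.items()}

tokens = "the cat sat on the mat".split()
freqs = per_million_frequencies(tokens)
```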




Oxford Concordance Program
The Oxford Concordance Program (OCP) was first released in 1981 and was the result of a project started in 1978 by Oxford University Computing Services (OUCS) to create a machine-independent text analysis program for producing word lists, indexes and concordances in a variety of languages and alphabets. In the 1980s it was claimed to have been licensed to around 240 institutions in 23 countries. History OCP was designed and written in FORTRAN by Susan Hockey and Ian Marriott of Oxford University Computing Services in the period 1979–1980; its authors acknowledged that it owed much to the earlier COCOA and CLOC (University of Birmingham) concordance systems. During 1985–86 OCP was completely rewritten as version 2 to increase the efficiency of the program; a version called Micro-OCP was also produced for the IBM PC.
The ...


Oxford University Computing Services
Oxford University Computing Services (OUCS) until 2012 provided the central Information Technology services for the University of Oxford. The service was based at 7–19 Banbury Road in central north Oxford, England, near the junction with Keble Road. OUCS became part of IT Services when the new department was created at the University of Oxford on 1 August 2012 through a merger of the three previous central IT departments: Oxford University Computing Services (OUCS), Business Services and Projects (BSP) and ICT Support Team (ICTST). At the time when Oxford University Computing Services ceased to operate as an independent department, it offered facilities, training and advice to members of the university in all aspects of academic computing. OUCS was responsible for the core networks reaching all departments and colleges of Oxford University. OUCS was made up of five technical groups and one administration group. Each group had responsibility for different aspects of OUCS services supplied t ...


COCOA (digital Humanities)
COCOA (an acronym derived from COunt and COncordance Generation on Atlas) was an early text file utility and associated file format for digital humanities, then known as humanities computing. It comprised approximately 4,000 punched cards of FORTRAN and was created in the late 1960s and early 1970s at University College London and the Atlas Computer Laboratory in Harwell, Oxfordshire. Functionality included word-counting and concordance building. Oxford Concordance Program The Oxford Concordance Program format was a direct descendant of COCOA, developed at Oxford University Computing Services. The Oxford Text Archive holds items in this format. Later developments The COCOA file format bears at least a passing similarity to later markup languages such as SGML and XML. A noticeable difference from its successors is that COCOA tags are flat rather than tree-structured. In that format, every information type and value encoded by a tag should be considered true until the same tag chang ...
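The flat scoping rule described above (a tag's value holds until the same tag is set again, with no nesting) is easy to sketch. The angle-bracket reference-line syntax follows the COCOA convention, but the tag letters and sample text here are invented for illustration:

```python
import re

def parse_cocoa(lines):
    """Yield (active_tags, text_line) pairs from COCOA-style input.
    A reference line like <A SHAKESPEARE> sets tag A, which stays in
    force until the next <A ...> line; tags are flat, never nested."""
    tags = {}
    for line in lines:
        ref = re.match(r"<(\S+)\s+(.*)>\s*$", line)
        if ref:
            tags[ref.group(1)] = ref.group(2)
        else:
            yield dict(tags), line

sample = ["<A SHAKESPEARE>", "<T HAMLET>", "To be, or not to be",
          "<T MACBETH>", "Out, out, brief candle"]
result = list(parse_cocoa(sample))
```

Note how the second text line still carries A=SHAKESPEARE: only the T tag was reset, which is exactly the "true until the same tag changes" behaviour that distinguishes COCOA from tree-structured markup like SGML or XML.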


Cross-reference
The term cross-reference (abbreviation: xref) can refer to either: * An instance within a document which refers to related information elsewhere in the same document. In both printed and online dictionaries cross-references are important because they form a network structure of relations existing between different parts of data, dictionary-internal as well as dictionary-external. * In an index, a cross-reference is often denoted by ''See also''. For example, under the term ''Albert Einstein'' in the index of a book about Nobel Laureates, there may be the cross-reference ''See also: Einstein, Albert''. * In hypertext, cross-references take the form of "live" references within the text that, when activated by mouse click, touch, ...




Ctags
Ctags is a programming tool that generates an index file (or tag file) of names found in source and header files of various programming languages to aid code comprehension. Depending on the language, functions, variables, class members, macros and so on may be indexed. These tags allow definitions to be quickly and easily located by a text editor, a code search engine, or other utility. Alternatively, there is also an output mode that generates a cross-reference file, listing information about various names found in a set of language files in human-readable form. The original Ctags was introduced in BSD Unix 2.0 and was written by Ken Arnold, with Fortran support by Jim Kleckner and Pascal support by Bill Joy. It is part of the initial release of the Single UNIX Specification and XPG4 of 1992. Editors that support ctags ''Tag index files'' are supported by many source code editors, including: * Atom * BBEdit 8+ * CodeLite (via built-in ctagsd language server) * Cloud9 IDE (u ...
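Each line of the tags file that ctags emits has three tab-separated core fields (tag name, source file, and an Ex search command or line number); extended-format lines append `;"` plus extra attribute fields. A minimal Python reader for one such line, using a hypothetical entry as input (see the ctags documentation for the authoritative format):

```python
def parse_tags_line(line):
    """Split one tags-file line into its three core fields: tag name,
    source file, and the Ex command/line number used to locate it."""
    name, filename, rest = line.split("\t", 2)
    # Extended-format lines append ;" plus attributes; plain lines don't,
    # in which case the split below leaves `rest` untouched.
    address = rest.split(';"')[0]
    return name, filename, address

entry = 'main\tsrc/main.c\t/^int main(void)$/;"\tf'
fields = parse_tags_line(entry)
```

This is how an editor jumps to a definition: look up the name, open the file, then run the stored search command.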


Key Word In Context
Key Word In Context (KWIC) is the most common format for concordance lines. The term KWIC was coined by Hans Peter Luhn. The system was based on a concept called ''keyword in titles'', which was first proposed for Manchester libraries in 1864 by Andrea Crestadoro. A KWIC index is formed by sorting and aligning the words within an article title to allow each word (except the stop words) in titles to be searchable alphabetically in the index. It was a useful indexing method for technical manuals before computerized full text search became common. For example, a search query including all of the words in an example definition ("KWIC is an acronym for Key Word In Context, the most common format for concordance lines") and the Wikipedia slogan in English ("the free encyclopedia"), searched against a Wikipedia page, might yield a KWIC index as follows. A KWIC index usually uses a wide layout to allow the display of the maximum amount of 'in context' information (not shown in the following example). ...
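The construction described above, every non-stop word promoted to a sort key and printed in a fixed column with its left and right context aligned around it, can be sketched as follows; the stop-word list and column widths are illustrative choices:

```python
def kwic_index(text, width=18,
               stop_words=frozenset({"a", "an", "the", "is", "in", "for"})):
    """Return KWIC lines sorted by keyword: each non-stop word appears
    in a fixed column with aligned left and right context."""
    words = text.split()
    entries = []
    for i, word in enumerate(words):
        key = word.lower().strip(",.")
        if key in stop_words:
            continue
        left = " ".join(words[:i])[-width:]     # trailing slice of left context
        right = " ".join(words[i + 1:])[:width]  # leading slice of right context
        entries.append((key, f"{left:>{width}}  {word}  {right}"))
    return [line for _, line in sorted(entries)]

index = kwic_index("KWIC is the most common format for concordance lines")
for line in index:
    print(line)
```

Sorting on the lowercased keyword is what makes every content word findable alphabetically, while the fixed-width padding keeps the keyword column vertically aligned.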