Culturomics
Culturomics is a form of computational lexicology that studies human behavior and cultural trends through the quantitative analysis of digitized texts. Researchers data-mine large digital archives to investigate cultural phenomena reflected in language and word usage. The term is an American neologism first described in a 2010 Science article, "Quantitative Analysis of Culture Using Millions of Digitized Books", co-authored by Harvard researchers Jean-Baptiste Michel and Erez Lieberman Aiden. Michel and Aiden helped create the Google Labs project Google Ngram Viewer, which uses n-grams to analyze the Google Books digital library for cultural patterns in language use over time. Because the Google Ngram data set is not an unbiased sample and does not include metadata, there are several pitfalls when using it to study language or the popularity of terms. Medical literature accounts for a large, but shifting, share of the corpus, which does not take into account how o ...




Google Ngram Viewer
The Google Ngram Viewer or Google Books Ngram Viewer is an online search engine that charts the frequencies of any set of search strings using a yearly count of n-grams found in printed sources published between 1500 and 2019 in Google's text corpora in English, Chinese (simplified), French, German, Hebrew, Italian, Russian, or Spanish. There are also some specialized English corpora, such as American English, British English, and English Fiction. The program can search for a word or a phrase, including misspellings or gibberish. The n-grams are matched with the text within the selected corpus, optionally using case-sensitive spelling (which compares the exact use of uppercase letters), and, if found in 40 or more books, are then displayed as a graph. The Google Ngram Viewer supports searches for parts of speech and wildcards. It is routinely used in research. History: The program was developed by Jon Orwant and Will Brockman and released in mid-December 2010. It was inspired ...
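The counting step described above can be illustrated with a minimal Python sketch. It is not Google's implementation: the function names, the whitespace tokenizer, the `(year, text)` input format, and the normalization by the total number of n-grams per year are all assumptions made for illustration. Only the yearly counting, the optional case sensitivity, and the 40-book threshold come from the description.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every n-gram (as a tuple) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def yearly_frequencies(books, phrase, min_books=40, case_sensitive=False):
    """Relative frequency of `phrase` per publication year.

    `books` is an iterable of (year, text) pairs (a hypothetical format).
    Returns an empty dict when the phrase occurs in fewer than
    `min_books` books, mirroring the threshold described in the text.
    """
    normalize = (lambda s: s) if case_sensitive else str.lower
    target = tuple(normalize(phrase).split())
    n = len(target)

    hits = Counter()    # occurrences of the phrase, per year
    totals = Counter()  # all n-grams of the same length, per year
    books_with_hit = 0

    for year, text in books:
        grams = ngrams(normalize(text).split(), n)
        count = grams.count(target)
        hits[year] += count
        totals[year] += len(grams)
        if count:
            books_with_hit += 1

    if books_with_hit < min_books:
        return {}
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}
```

Plotting the returned mapping with years on the x-axis gives a curve of the kind the viewer displays. A real corpus would also need proper tokenization and precomputed counts rather than a scan of every book per query.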


Erez Lieberman Aiden
Erez Lieberman Aiden (born 1980, né Erez Lieberman) is an American research scientist active in multiple fields related to applied mathematics. He is an assistant professor at the Baylor College of Medicine, and formerly a fellow at the Harvard Society of Fellows and visiting faculty member at Google. He is an adjunct assistant professor of computer science at Rice University. Using mathematical and computational approaches, he has studied evolution in a range of contexts, including that of networks through evolutionary graph theory and languages in the field of culturomics. He has published scientific articles in a variety of disciplines. Lieberman Aiden has won awards including the Lemelson–MIT Student Prize and the American Physical Society's Award for Outstanding Doctoral Thesis Research in Biological Physics. In 2009, Lieberman Aiden was named as one of 35 top innovators under 35 by Technology Review, and in 2011 he was one of the recipients of the Presidential E ...


Computational Lexicology
Computational lexicology is a branch of computational linguistics concerned with the use of computers in the study of the lexicon. It has been more narrowly described by some scholars (Amsler, 1980) as the use of computers in the study of machine-readable dictionaries. It is distinguished from computational lexicography, which more properly would be the use of computers in the construction of dictionaries, though some researchers have used computational lexicography as a synonym. History: Computational lexicology emerged as a separate discipline within computational linguistics with the appearance of machine-readable dictionaries, starting with the creation of the machine-readable tapes of the Merriam-Webster Seventh Collegiate Dictionary and the Merriam-Webster New Pocket Dictionary in the 1960s by John Olney et al. at System Development Corporation. Today, computational lexicology is best known through the creation and applications of WordNet. As the comput ...


Arab Spring
The Arab Spring (Arabic: الربيع العربي) was a series of anti-government protests, uprisings and armed rebellions that spread across much of the Arab world in the early 2010s. It began in Tunisia in response to corruption and economic stagnation. From Tunisia, the protests then spread to five other countries: Libya, Egypt, Yemen, Syria and Bahrain. Rulers were deposed (Zine El Abidine Ben Ali, Muammar Gaddafi, Hosni Mubarak, Ali Abdullah Saleh) or major uprisings and social violence occurred, including riots, civil wars, or insurgencies. Sustained street demonstrations took place in Morocco, Iraq, Algeria, Lebanon, Jordan, Kuwait, Oman and Sudan. Minor protests took place in Djibouti, Mauritania, Palestine, Saudi Arabia and the Moroccan-occupied Western Sahara. A major slogan of the demonstrators in the Arab world is "ash-shaʻb yurīd isqāṭ an-niẓām!". The importance of external factors versus internal factors to the protests' spread and success ...


Fukushima Daiichi Nuclear Disaster
The Fukushima Daiichi nuclear disaster was a nuclear accident in 2011 at the Fukushima Daiichi Nuclear Power Plant in Ōkuma, Fukushima, Japan. The proximate cause of the disaster was the 2011 Tōhoku earthquake and tsunami, which occurred on the afternoon of 11 March 2011 and remains the most powerful earthquake ever recorded in Japan. The earthquake triggered a powerful tsunami, with 13–14-meter-high waves damaging the nuclear power plant's emergency diesel generators, leading to a loss of electric power. The result was the most severe nuclear accident since the Chernobyl disaster in 1986, classified as level seven on the International Nuclear Event Scale (INES) after initially being classified as level five, and thus joining Chernobyl as the only other accident to receive such classification. While the 1957 explosion at the Mayak facility was the second worst by radioactivity released, the INES ranks incidents by impact on population, so Chernobyl (335,000 people evacuated) and Fukushima (154,000 evacu ...


Twitter
Twitter is an online social media and social networking service owned and operated by American company Twitter, Inc., on which users post and interact with 280-character-long messages known as "tweets". Registered users can post, like, and "retweet" tweets, while unregistered users only have the ability to read public tweets. Users interact with Twitter through browser or mobile frontend software, or programmatically via its APIs. Twitter was created by Jack Dorsey, Noah Glass, Biz Stone, and Evan Williams in March 2006 and launched in July of that year. Twitter, Inc. is based in San Francisco, California and has more than 25 offices around the world. More than 100 million users posted 340 million tweets a day, and the service handled an average of 1.6 billion search queries per day. In 2013, it was one of the ten most-visited websites and has been described as "the SMS of the Internet". Twitter had more than 330 million monthly active users. In practice, the va ...


Public Opinion
Public opinion is the collective opinion on a specific topic or voting intention relevant to a society. It is the people's views on matters affecting them. Etymology: The term "public opinion" was derived from the French term, which was first used in 1588 by Michel de Montaigne in the second edition of his Essays (ch. XXII). The French term also appears in the 1761 work Julie, or the New Heloise by Jean-Jacques Rousseau. Precursors of the phrase in English include William Temple's "general opinion" (appearing in his 1672 work On the Original and Nature of Government) and John Locke's "law of opinion" (appearing in his 1689 work An Essay Concerning Human Understanding). History: The emergence of public opinion as a significant force in the political realm dates to the late 17th century, but opinion had been regarded as having singular importance much earlier. Medieval fama publica or vox et fama communis had great legal and social importance from the ...


Information Extraction
Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. In most cases this activity concerns processing human language texts by means of natural language processing (NLP). Recent activities in multimedia document processing, such as automatic annotation and content extraction from images, audio, video, and documents, could also be seen as information extraction. Due to the difficulty of the problem, current approaches to IE (as of 2010) focus on narrowly restricted domains. An example is the extraction from newswire reports of corporate mergers, such as denoted by the formal relation $\mathrm{MergerBetween}(\mathit{company}_1, \mathit{company}_2, \mathit{date})$, from an online news sentence such as: "Yesterday, New York based Foo Inc. announced their acquisition of Bar Corp." A broad goal of IE is to allow computation to be done on the previously unstructured data. A more sp ...
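The merger example above can be made concrete with a small Python sketch. It assumes a single hand-written regular expression tuned to the quoted sentence. The `Merger` record, the `PATTERN` constant, and the `extract_merger` function are hypothetical names introduced here, and practical IE systems generally rely on much richer NLP pipelines than one pattern.

```python
import re
from typing import NamedTuple, Optional


class Merger(NamedTuple):
    """Structured form of the merger relation (hypothetical record type)."""
    company_1: str
    company_2: str
    date: str


# One narrow pattern for sentences shaped like the example in the text.
# The optional "<place> based" phrase is skipped rather than captured.
PATTERN = re.compile(
    r"(?P<date>\w+), (?:.+? based )?(?P<buyer>.+?) announced (?:their|its) "
    r"acquisition of (?P<target>.+)$"
)


def extract_merger(sentence: str) -> Optional[Merger]:
    """Return the merger relation found in `sentence`, or None if absent."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    # The date is kept as the raw expression (e.g. "Yesterday"); resolving
    # it to a calendar date would need the article's publication date.
    # The final period is kept because it cannot be told apart from an
    # abbreviation period such as the one in "Corp.".
    return Merger(match.group("buyer"), match.group("target"), match.group("date"))


result = extract_merger(
    "Yesterday, New York based Foo Inc. announced their acquisition of Bar Corp."
)
print(result)
```

Once sentences are reduced to records like this, they can be filtered, joined, or counted like any other structured data, which is the broad goal the passage describes. The narrowness of the pattern also illustrates why such approaches stay within restricted domains.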


Text Categorisation
Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be done "manually" (or "intellectually") or algorithmically. The intellectual classification of documents has mostly been the province of library science, while the algorithmic classification of documents is mainly in information science and computer science. The problems are overlapping, however, and there is therefore interdisciplinary research on document classification. The documents to be classified may be texts, images, music, etc. Each kind of document possesses its special classification problems. When not otherwise specified, text classification is implied. Documents may be classified according to their subjects or according to other attributes (such as document type, author, printing year etc.). In the rest of this article only subject classification is considered. T ...
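As an illustration of algorithmic subject classification, the sketch below assigns a text to exactly one class using a multinomial naive Bayes model with add-one smoothing. The choice of naive Bayes is an assumption made here, since the passage does not name a method. The `train` and `classify` functions and the toy training data are hypothetical, and the multi-label case mentioned above is not covered.

```python
import math
from collections import Counter, defaultdict


def train(labelled_docs):
    """Collect per-class word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    doc_counts = Counter()              # label -> number of documents
    vocab = set()
    for text, label in labelled_docs:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest posterior log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior: share of training documents carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train([
    ("goal match team striker", "sport"),
    ("election vote parliament", "politics"),
    ("team wins cup final", "sport"),
    ("minister vote budget", "politics"),
])
print(classify("the team scored a late goal", model))
```

Classifying by other attributes, such as document type or author, would use the same structure with different labels.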




Statistical Machine Translation
Statistical machine translation (SMT) is a machine translation paradigm where translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. The statistical approach contrasts with the rule-based approaches to machine translation as well as with example-based machine translation, and has more recently been superseded by neural machine translation in many applications (see this article's final section). The first ideas of statistical machine translation were introduced by Warren Weaver in 1949, including the ideas of applying Claude Shannon's information theory. Statistical machine translation was re-introduced in the late 1980s and early 1990s by researchers at IBM's Thomas J. Watson Research Center and has contributed to the significant resurgence in interest in machine translation in recent years. Before the introduction of neural machine translation, it was by far the most widely studied machine transl ...


Eurovision Song Contest
The Eurovision Song Contest, sometimes abbreviated to ESC and often known simply as Eurovision, is an international songwriting competition organised annually by the European Broadcasting Union (EBU), featuring participants representing primarily European countries. Each participating country submits an original song to be performed on live television and radio, transmitted to national broadcasters via the EBU's Eurovision and Euroradio networks, with competing countries then casting votes for the other countries' songs to determine a winner. Based on the Sanremo Music Festival held in Italy since 1951, Eurovision has been held annually since 1956 (with one exception), making it the longest-running annual international televised music competition and one of the world's longest-running television programmes. Active members of the EBU, as well as invited associate members, are eligible to compete, and 52 countries have participated at least once. Each participating broadcaster s ...


News Coverage
News is information about current events. This may be provided through many different media: word of mouth, printing, postal systems, broadcasting, electronic communication, or through the testimony of observers and witnesses to events. News is sometimes called "hard news" to differentiate it from soft media. Common topics for news reports include war, government, politics, education, health, the environment, economy, business, fashion, entertainment, and sport, as well as quirky or unusual events. Government proclamations, concerning royal ceremonies, laws, taxes, public health, and criminals, have been dubbed news since ancient times. Technological and social developments, often driven by government communication and espionage networks, have increased the speed with which news can spread, as well as influenced its content. Throughout history, people have transported new information through oral means. Having developed in China over centuries, newspapers became est ...