
Collostructional analysis is a family of methods developed by (in alphabetical order) Stefan Th. Gries (University of California, Santa Barbara) and Anatol Stefanowitsch (Free University of Berlin). Collostructional analysis aims at measuring the degree of attraction or repulsion that words exhibit to constructions, where the notion of construction has so far been that of Goldberg's construction grammar.


Collostructional methods

Collostructional analysis so far comprises three different methods:
* collexeme analysis, to measure the degree of attraction/repulsion of a lemma to a slot in one particular construction;
* distinctive collexeme analysis, to measure the preference of a lemma for one particular construction over another, functionally similar construction; multiple distinctive collexeme analysis extends this approach to more than two alternative constructions;
* covarying collexeme analysis, to measure the degree of attraction of lemmas in one slot of a construction to lemmas in another slot of the same construction.
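The three methods differ mainly in what is cross-tabulated. As a rough sketch, each can be reduced to a 2×2 contingency table that is then passed to an association test. The function names, argument names and counts below are hypothetical and chosen only for illustration; in particular, what counts as the corpus total in collexeme analysis (for example, all tokens that could in principle fill the slot) is an assumption the analyst has to make.

```python
from collections import namedtuple

# 2x2 table layout (all names here are illustrative, not from the literature):
#   a = target row, target column     b = target row, other column
#   c = other rows, target column     d = other rows, other column
Table = namedtuple("Table", "a b c d")


def collexeme_table(lemma_in_cx, lemma_total, cx_total, corpus_total):
    """Lemma vs. all other lemmas, crossed with the construction vs. everything else."""
    a = lemma_in_cx
    b = lemma_total - lemma_in_cx
    c = cx_total - lemma_in_cx
    d = corpus_total - lemma_total - cx_total + lemma_in_cx
    return Table(a, b, c, d)


def distinctive_collexeme_table(lemma_in_cx1, lemma_in_cx2, cx1_total, cx2_total):
    """Lemma vs. all other lemmas, crossed with construction 1 vs. construction 2."""
    a = lemma_in_cx1
    b = lemma_in_cx2
    c = cx1_total - lemma_in_cx1
    d = cx2_total - lemma_in_cx2
    return Table(a, b, c, d)


def covarying_collexeme_table(pair_freq, word1_in_slot1, word2_in_slot2, cx_total):
    """Word 1 in slot 1 vs. other fillers, crossed with word 2 in slot 2 vs. other fillers.

    All counts are taken within the tokens of a single construction.
    """
    a = pair_freq
    b = word1_in_slot1 - pair_freq
    c = word2_in_slot2 - pair_freq
    d = cx_total - word1_in_slot1 - word2_in_slot2 + pair_freq
    return Table(a, b, c, d)


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(collexeme_table(10, 100, 50, 10000))
    print(distinctive_collexeme_table(30, 5, 200, 300))
    print(covarying_collexeme_table(4, 20, 10, 500))
```

In each case the four cells sum to the relevant total (the corpus total, the two constructions combined, or the tokens of the one construction), which is a quick sanity check on the input counts.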


Input frequencies

Collostructional analysis requires frequencies of words and constructions and is similar to a wide variety of collocation statistics. It differs from raw frequency counts by providing not only observed co-occurrence frequencies of words and constructions, but also (i) a comparison of the observed frequency to the one expected by chance, which allows the analysis to distinguish attraction from repulsion; and (ii) a measure of the strength of the attraction or repulsion, usually the log-transformed p-value of a Fisher-Yates exact test.
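The two quantities just described can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, the use of a one-tailed p-value in the direction of the observed deviation is an assumption, and so is the choice of the negative base-10 logarithm as the transformation. The cell labels a, b, c, d follow the usual 2×2 layout (a is the frequency of the word in the construction).

```python
import math


def collostruction_strength(a, b, c, d):
    """Return (expected frequency, direction, -log10 of a one-tailed exact p-value).

    Assumptions (not taken from the source): the p-value is one-tailed in the
    direction of the observed deviation, and it is transformed with -log10.
    """
    total = a + b + c + d
    row1 = a + b  # all occurrences of the word
    col1 = a + c  # all occurrences of the construction
    expected = row1 * col1 / total

    # Compare observed and expected with integers to avoid rounding issues.
    if a * total > row1 * col1:
        direction = "attraction"
    elif a * total < row1 * col1:
        direction = "repulsion"
    else:
        direction = "neither"

    # Range of values the top-left cell can take with the margins held fixed.
    lowest = max(0, row1 + col1 - total)
    highest = min(row1, col1)
    if direction == "repulsion":
        tail = range(lowest, a + 1)   # tables at least as low as observed
    else:
        tail = range(a, highest + 1)  # tables at least as high as observed

    # Hypergeometric probabilities share the denominator C(total, row1),
    # so sum the integer numerators and take the logarithm of the ratio.
    numerator = sum(
        math.comb(col1, k) * math.comb(total - col1, row1 - k) for k in tail
    )
    denominator = math.comb(total, row1)
    strength = math.log10(denominator) - math.log10(numerator)
    return expected, direction, strength


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(collostruction_strength(10, 90, 40, 9860))
```

Working with integer binomial coefficients and subtracting logarithms avoids floating-point underflow when the p-value is extremely small, which is common for strongly attracted words in large corpora.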


Versus other collocation statistics

Collostructional analysis differs from most collocation statistics in that (i) it measures not the association of words to words, but of words to syntactic patterns or constructions; thus, it takes syntactic structure more seriously than most collocation-based analyses; (ii) it has so far only used the most precise statistics, namely the Fisher-Yates exact test based on the hypergeometric distribution; thus, unlike ''t''-scores, ''z''-scores, chi-square tests etc., the analysis is not based on, and does not violate, any distributional assumptions.
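The practical difference between an exact test and an approximate one shows up most clearly in sparse tables. The sketch below compares an uncorrected Pearson chi-square p-value (one degree of freedom) with a two-sided Fisher-Yates exact p-value on a small invented table. The function names are hypothetical, the two-sided definition used (summing all tables that are no more probable than the observed one) is one common convention and an assumption here, and the code assumes that no row or column total is zero.

```python
import math


def chi_square_p(a, b, c, d):
    """Uncorrected Pearson chi-square p-value for a 2x2 table (1 degree of freedom).

    Assumes every row and column total is non-zero.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def fisher_exact_p(a, b, c, d):
    """Two-sided exact p-value from the hypergeometric distribution.

    Sums the probabilities of all tables with the same margins that are
    no more probable than the observed table (an assumed convention).
    """
    total = a + b + c + d
    row1, col1 = a + b, a + c
    lowest = max(0, row1 + col1 - total)
    highest = min(row1, col1)

    def weight(k):
        # Integer numerator of the hypergeometric probability of cell value k.
        return math.comb(col1, k) * math.comb(total - col1, row1 - k)

    observed = weight(a)
    numerator = sum(
        weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed
    )
    return numerator / math.comb(total, row1)


if __name__ == "__main__":
    # Invented sparse table: every expected cell frequency is 2.
    table = (3, 1, 1, 3)
    print("chi-square p:", chi_square_p(*table))
    print("exact p:     ", fisher_exact_p(*table))
```

On this table the approximate p-value is considerably smaller than the exact one, which is the kind of discrepancy that motivates the exact test for low-frequency words.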


See also

* Collocation extraction


References


General references

* Gries, Stefan Th. & Anatol Stefanowitsch. 2004a. Extending collostructional analysis: A corpus-based perspective on 'alternations'. ''International Journal of Corpus Linguistics'' 9.1:97–129.
* Gries, Stefan Th. & Anatol Stefanowitsch. 2004b. Co-varying collexemes in the into-causative. In: Achard, Michel & Suzanne Kemmer (eds.). ''Language, Culture, and Mind''. Stanford, CA: CSLI, p. 225–36.
* Gries, Stefan Th. & Anatol Stefanowitsch. 2011. Cluster analysis and the identification of collexeme classes. In: Newman, John & Sally Rice (eds.). ''Empirical and Experimental Methods in Cognitive/Functional Research''. Stanford, CA: CSLI.
* Stefanowitsch, Anatol & Stefan Th. Gries. 2003. Collostructions: Investigating the interaction between words and constructions. ''International Journal of Corpus Linguistics'' 8.2:209–43.
* Stefanowitsch, Anatol & Stefan Th. Gries. 2005. Co-varying collexemes. ''Corpus Linguistics and Linguistic Theory'' 1.1:1–43.
* Stefanowitsch, Anatol. 2006. Negative evidence and the raw frequency fallacy. ''Corpus Linguistics and Linguistic Theory'' 2.1:61–77.


Applications

* Gries, Stefan Th. 2005. Syntactic priming: A corpus-based approach. ''Journal of Psycholinguistic Research'' 34.4:365–99.
* Gries, Stefan Th. & Stefanie Wulff. 2005. Do foreign language learners also have constructions? Evidence from priming, sorting, and corpora. ''Annual Review of Cognitive Linguistics'' 3:182–200.
* Hilpert, Martin. 2006. Distinctive collexeme analysis and diachrony. ''Corpus Linguistics and Linguistic Theory'' 2.2:243–57.
* Jensen, Kim Ebensgaard. 2012. Fatal attraction: inheritance and collostruction in the ''ihjel''-construction. ''Skandinaviske Sprogstudier'' 3.2:1–30.
* Stefanowitsch, Anatol. 2005. The function of metaphor: developing a corpus-based perspective. ''International Journal of Corpus Linguistics'' 10.2:161–198.
* Wiechmann, Daniel. 2008. Sense-contingent lexical preferences and early parsing decisions .. ''Cognitive Linguistics'' 19.3:439–455.


Papers that document its predictive superiority over raw frequency counts

* Gries, Stefan Th., Beate Hampe, & Doris Schönefeld. 2005. Converging evidence: .. ''Cognitive Linguistics'' 16.4:635–76.
* Gries, Stefan Th., Beate Hampe, & Doris Schönefeld. to appear. Converging evidence II: .. In: Newman, John & Sally Rice (eds.). ''Experimental and Empirical Methods in Cognitive/Functional Research''. Stanford, CA: CSLI. (working title)
* Wiechmann, Daniel. 2008. On the Computation of Collostruction Strength: .. ''Corpus Linguistics and Linguistic Theory'' 4.2:253–290.