Aims
Corpus-assisted discourse studies aim to uncover ''non-obvious meaning'', that is, meaning which might not be readily available to naked-eye perusal. Much of what carries meaning in texts is not open to direct observation: “you cannot understand the world just by looking at it” (Stubbs [after Gellner 1959] 1996: 92). We use language “semi-automatically”, in the sense that speakers and writers make semi-conscious choices within the various complex overlapping systems of which language is composed, including those of transitivity, modality (Halliday 1994), lexical sets (e.g. ''freedom'', ''liberty'', ''deliverance''), modification, and so on. Authors themselves are, famously, generally unaware of all the meanings their texts convey. By combining the
In different countries
* In German-speaking countries: Pioneering work in corpus-based discourse analysis was conducted in Europe, in particular by Hardt-Mautner/Mautner (1995, 2000) and Stubbs (1996, 2001). CADS and other types of corpus-based discourse analysis are inspired by this important early work.
* In Italy: A considerable body of research has been conducted in Italy either by individual researchers or under the aegis of combined inter-university projects such as ''Newspool'' (Partington et al. 2004) and ''CorDis'' (Morley and Bayley eds, 2009). It has concentrated on political and media language, mainly because a nucleus of linguists in Italian universities work in Political Science faculties and are increasingly interested in the use of corpus techniques to conduct a particular type of sociopolitical discourse analysis, including the unearthing of noteworthy ideological metaphors and motifs in the language of political figures and institutions. Italian researchers also developed Modern diachronic corpus-assisted discourse studies (MD-CADS). This approach contrasts the language contained in comparable corpora from different but recent points in time in order to track changes in modern language usage but also social, cultural and political changes over modern times, as reflected - and shared among people - in language. It is this Italian body of research that makes most use of the label CADS.
* In the UK: Linguists in the UK tend to undertake corpus-based critical discourse analysis (CDA). CDA generally adopts a leftist political stance, focusing on the ways that social and political domination is reproduced by text and talk. This type of corpus-based research was originally associated with
Comparison with traditional corpus linguistics
Traditional corpus linguistics has, quite naturally, tended to privilege the quantitative approach. In the drive to produce more authentic dictionaries and grammars of a language, it has been characterised by the compilation of some very large corpora of heterogeneric discourse types in the desire to obtain an overview of the greatest quantity and variety of discourse types possible, in other words, of the chimerical but useful fiction called the “general language” (“general English”, “general Italian”, and so on). This has led to the construction of immensely valuable research tools such as the Bank of English and the British National Corpus. Some branches of corpus linguistics have also promoted an approach that is "corpus-driven", in which we need, grammatically speaking, a mental ''tabula rasa'' to free ourselves of the baleful prejudice exerted by traditional models and allow the data to speak entirely for itself.
The aim of corpus-assisted discourse studies and related approaches is radically different. Here the aim of the exercise is to acquaint oneself as much as possible with the discourse type(s) in hand. Researchers typically engage with their corpus in a variety of ways. As well as via wordlists and concordancing, intuitions for further research can also arise from reading or watching or listening to parts of the data-set, a process which can help provide a feel for how things are done linguistically in the discourse-type being studied. Corpus-assisted discourse analysis is also typically characterised by the compilation of ad hoc specialised corpora, since very frequently there exists no previously available collection of the discourse type in question. Often, other corpora are utilized in the course of a study for purposes of comparison. These may include pre-existing corpora or may themselves need to be compiled by the researcher. In some sense, all work with corpora, just as all work with discourse, is properly comparative.
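As a rough illustration of the wordlists and concordancing mentioned above, the following Python sketch builds a frequency list and a keyword-in-context (KWIC) display from a small text. It is only a minimal sketch: the function names `tokenize`, `wordlist` and `concordance`, the crude tokenisation rule and the sample text are all invented for illustration and are not taken from any CADS project or software package.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def wordlist(tokens, top=10):
    """Return the `top` most frequent tokens with their raw counts."""
    return Counter(tokens).most_common(top)


def concordance(tokens, node, width=4):
    """Return (left, node, right) tuples with `width` tokens of co-text per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented toy text standing in for a specialised corpus (assumption).
sample = (
    "The spokesperson said the policy was clear. "
    "Reporters asked whether the policy had changed. "
    "The policy, the spokesperson repeated, had not changed."
)
tokens = tokenize(sample)

print(wordlist(tokens, top=3))
for left, node, right in concordance(tokens, "policy", width=3):
    print(f"{left:>30} [{node}] {right}")
```

Varying the `width` argument corresponds loosely to reading concordance lines with differing quantities of co-text; real studies would normally rely on dedicated concordancing software and far larger corpora.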
Even when a single corpus is employed, it is used to test the data it contains against another body of data. This may consist of the researcher's intuitions, or the data found in reference works such as dictionaries and grammars, or it may be statements made by previous authors in the field.
CADS as a specific type of corpus-based discourse analysis
Researchers in Italy have developed CADS as a specific type of corpus-based discourse analysis, creating a standard set of methods:
''A basic, standard methodology in CADS may resemble the following:''
# Step 1: Decide upon the research question;
# Step 2: Choose, compile or edit an appropriate corpus;
# Step 3: Choose, compile or edit an appropriate reference corpus / corpora;
# Step 4: Make frequency lists and run a keywords comparison of the corpora;
# Step 5: Determine the existence of sets of key items;
# Step 6: Concordance interesting key items (with differing quantities of co-text);
# Step 7: (Possibly) refine the research question and return to Step 2.
''This basic procedure can of course vary according to individual research circumstances and requirements.''
A particular way of conceptualising research questions has also been proposed in such CADS projects:
* Given that P is a discourse participant (or possibly an institution) and G is a goal, often a political goal:
# How does P achieve G with language?
# What does this tell us about P?
# Comparative studies: how do P1 and P2 differ in their use of language? Does this tell us anything about their different principles and objectives?
A second general type of CADS research question, which might be asked of interactive discourse data, has been conceptualised as follows:
* Given that P(x) is a particular participant or set of participants, DT is the discourse type, and R is an observed relationship between or among participants:
* How do P(x) achieve / maintain R in DT using language?
Another common type of research question has been conceptualised thus:
* Given that A is an author, Ph(x) is a phenomenon or practice or behaviour, and DT(x) is a particular discourse type:
* A has said that Ph(x) is the case in DT(a).
* Is Ph(x) the case in DT(b)?
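Step 4 of the procedure listed above (frequency lists and a keywords comparison) can be sketched in Python as follows. The source does not say which keyness statistic is used; this sketch assumes the two-term log-likelihood measure that is common in corpus linguistics. The function names and the toy token lists are invented for illustration.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Two-term log-likelihood for a word occurring `a` times in a corpus of
    `c` tokens and `b` times in a corpus of `d` tokens."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the first corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the second corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(study_tokens, reference_tokens, top=5):
    """Rank words that are relatively more frequent in the study corpus."""
    study, ref = Counter(study_tokens), Counter(reference_tokens)
    c, d = len(study_tokens), len(reference_tokens)
    scored = []
    for word, a in study.items():
        b = ref[word]  # Counter returns 0 for unseen words
        if a / c > b / d:  # keep positive keywords only
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top]


# Invented toy corpora (assumption): a study corpus and a reference corpus.
study_tokens = ["tax"] * 6 + ["the"] * 4 + ["plan"] * 2
reference_tokens = ["the"] * 8 + ["plan"] * 2 + ["tax"] + ["weather"]

for word, score in keywords(study_tokens, reference_tokens):
    print(word, round(score, 2))
```

The ranked list is only a starting point: in the procedure above, the analyst would go on to group key items into sets (Step 5) and inspect them in concordance lines (Step 6) rather than interpret the scores on their own.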
This third type is a classic “hypothesis-testing” research question: we test the hypothesis that whatever practice has been observed by a previous author in some discourse type will be observable in another. It is a process we might call ''para-replication'', that is, the replication of an experiment with either a fresh set of texts of the same discourse type or of a related discourse type, “in order to see whether findings were an artefact of one single data set” (Stubbs 2001: 124).
A final example of conceptualising a CADS research question is the following:
* Given that P(x) is a participant or category thereof, and LF(x) is a particular language feature:
* Do P(a) and P(b) use LF(x) in the same way?
Such research aims to ascertain whether different participants use a particular linguistic feature in the same or different ways. The research may proceed to attempt to explain why this is the case.
Some research to date
Studies that bring together corpus linguistics and discourse analysis include the following:
* How ideas about groups of people and race are constructed and disseminated through repeated language use (Krishnamurthy 1996).
* A study of German loan words in English and their connection to cultural stereotyping (Stubbs 1998).
* A study of the religious rhetoric of U.S. presidential candidates (Vincent 2020).
* Analyses of the language of the Euro-sceptic debate in the UK (Teubert 2000; see also Mautner 2000).
* The typical language strategies, metaphors and motifs used by journalists and spokespersons in US press conferences, and how these reflect their respective world-views (Partington 2003, 2007).
* How prediction is effected in economic texts, that is, how economic forecasts are presented and hedged (Walsh 2004).
* How government witnesses in the Hutton Inquiry constructed their professional identity (Duguid 2007, 2008).
* The typical language features of US television series and how they are similar or different to unscripted conversation (Bednarek 2010, 2018).
* How speakers use
Bibliography
* Baker, P. (2006) ''Using Corpora in Discourse Analysis''. London: Continuum.
* Baker, P., Gabrielatos, C., Khosravinik, M., Krzyzanowski, M., McEnery, T. & Wodak, R. (2008) A useful methodological synergy? Combining critical discourse analysis and corpus linguistics to examine discourses of refugees and asylum seekers in the UK Press. ''Discourse and Society'' 19(3), 273–306.
* Baker, P., Gabrielatos, C. and McEnery, T. (2013) ''Discourse Analysis and Media Attitudes: The Representation of Islam in the British Press''. Cambridge: Cambridge University Press.
* Bednarek, M. (2010) ''The Language of Fictional Television. Drama and Identity''. London: Continuum.
* Bednarek, M. (2018) ''Language and Television Series. A Linguistic Approach to TV Dialogue''. Cambridge: Cambridge University Press.
* Duguid, A. (2007) Men at Work: how those at Number 10 Construct their working Identity. In Garzone, G. and Sarangi, S. (eds) ''Discourse, Ideology and Specialized Communication''. Bern: Peter Lang, pp. 453–484.
* Duguid, A. (2007) Soundbiters Bit. Contracted dialogistic space and textual relations through Corpus Assisted Discourse Studies. In Fairclough, N., Cortese, G. and Ardizzone, P. (eds) ''Discourse and Contemporary Social Change''. Bern: Peter Lang, pp. 73–93.
* Gabrielatos, C. & Baker, P. (2008) Fleeing, sneaking, flooding: A corpus analysis of discursive constructions of refugees and asylum seekers in the UK Press 1996-2005. ''Journal of English Linguistics'' 36(1), 5–38.
* Gellner, E. (1959) ''Words and Things''. London: Gollancz.
* Krishnamurthy, R. (1996) Ethnic, racial and tribal: the language of racism? In R. Caldas-Coulthard and M. Coulthard (eds), ''Texts and Practices: Readings in Critical Discourse Analysis''. London: Routledge, pp. 129–49.
* Haarman, L. and Lombardo, L. (eds) (2008) ''Evaluation and stance in war news: A linguistic analysis of American, British and Italian television news reporting of the 2003 Iraqi war''. London: Continuum.
* Halliday, M. (1994) ''An Introduction to Functional Grammar'', 2nd edn. London: Edward Arnold.
* Hardt-Mautner, G. (1995) “Only Connect.” Critical discourse analysis and corpus linguistics, University of Lancaster. Online. Available HTTP:
References
{{Reflist}}