
Text annotation is the practice and the result of adding a note or gloss to a text, which may include highlights or underlining, comments, footnotes, tags, and links. Text annotations can include notes written for a reader's private purposes, as well as shared annotations written for the purposes of collaborative writing and
editing
, commentary, or social reading and sharing. In some fields, text annotation is comparable to metadata insofar as it is added post hoc and provides information about a text without fundamentally altering that original text. Text annotations are sometimes referred to as
marginalia
, though some reserve this term specifically for hand-written notes made in the margins of books or manuscripts. Annotations have been found to be useful and to help develop knowledge of English literature. Annotations can be private or socially shared, and can be hand-written or made with information technology. Annotation differs from note-taking in that an annotation must be written on or attached to the original piece itself: in print, by writing on a page of a book or highlighting a line; in a digital document, by adding a comment or a saved highlight or underline within the document. For information on annotation of Web content, including images and other non-textual content, see also Web annotation.


History

Text annotation may be as old as writing itself, at least on media where an additional copy could be produced with reasonable effort. It became a prominent activity around 1000 AD in Talmudic commentaries and Arabic treatises on rhetoric. In the medieval era, scribes who copied manuscripts often made marginal annotations that then circulated with the manuscripts and were thus shared with the community; sometimes annotations were copied over to new versions when such manuscripts were later recopied. With the rise of the
printing press
and the relative ease of circulating and purchasing individual (rather than shared) copies of texts, the prevalence of socially shared annotations declined and text annotation became a more private activity consisting of a reader interacting with a text. Annotations made on shared copies of texts (such as library books) are sometimes seen as devaluing the text, or as an act of defacement. Thus, print technologies support the circulation of annotations primarily as formal scholarly commentary or textual footnotes or endnotes rather than marginal, handwritten comments made by private readers, though handwritten comments or annotations were common in collaborative writing or editing. Computer-based technologies have provided new opportunities for individual and socially shared text annotations that support multiple purposes, including readers' individual reading goals, learning, social
reading, writing and editing, and other practices. Text annotation in Information Technology (IT) systems raises technical issues of access, linkage, and storage
that are generally not relevant to paper-based text annotation, and thus research and development of such systems often addresses these areas.


Functions and applications

Text annotations can serve a variety of functions for both private and public reading and communication practices. In their article "From the Margins to the Center: The Future of Annotation," scholars Joanna Wolfe and Christine Neuwirth identify four primary functions that text annotations commonly serve in the modern era: (1) "facilitat[ing] reading and later writing tasks," which includes annotations that support reading for both personal and professional purposes; (2) "eavesdrop[ping] on the insights of other readers," which involves sharing of annotations; (3) "provid[ing] feedback to writers or promote communication with collaborators," which can include personal, professional, and education-related feedback; and (4) "call[ing] attention to topics and important passages," for which scholarly annotations, footnotes, and call-outs often function. Regarding the ways that annotations can support individual reading tasks, Catherine Marshall points out that the ways that readers annotate texts depend on the purpose, motivation, and context of reading. Readers may annotate to help interpret a text, to call attention to a section for future reference or reading, to support memory and recall, to help focus attention on the text as they read, to work out a problem related to the text, or to create annotations not specifically related to the text at all.


Educational applications

Educational research in text annotation has examined the role that both private and shared text annotations can play in supporting learning goals and
communication
. Much educational research examines how students' private annotation of texts supports
comprehension
and memory; for example, research indicates that annotating texts causes more in-depth processing of information, which results in greater recall of information. Because readers annotate while reading, with a writing utensil in hand, they are thought to be more aware of their own thoughts as they read. Beyond producing notes that help them remember or better understand the content, annotating readers are actively engaged with the text and are therefore more receptive to the information in it. Other areas of educational research investigate the benefits of socially shared text annotations for collaborative learning, both for paper-based and IT-based annotation sharing. For example, studies by Joanna Wolfe have investigated the benefits of exposure to others' annotations on student readers and writers. In a 2000 study, Wolfe found that exposing students to others' annotations influenced their perceptions of the annotators, which in turn shaped their responses to the material and their written products. In a later study, Wolfe found that viewing others' written comments on a paper text, especially pairs of annotations that present opposing responses to the text, can help students engage in the type of critical reading and stance-taking necessary for effective argumentative writing. While shared annotations can benefit individual readers, it is important to note that, "since the 1920s,
literacy
theory has increasingly emphasized the importance of social factors in the development of literacy." Thus, shared annotations can not only help one to better understand the content of a particular text, but may also aid in the acquisition of literacy skills. For example, a mother may leave marks inside a book to draw the attention of her child to a particular theme or concept; thanks to the development of audio annotations, parents may now leave notes for children who are just starting to read and may struggle with textual annotations. More recent research into the effects of shared text annotations has focused on the learning applications for web-based annotation systems, some of which were developed based on design recommendations from the studies outlined above. For example, Ananda Gunawardena, Aaron Tan, and David Kaufer conducted a pilot study to examine whether annotating documents in Classroom Salon, a web-based annotation and social reading platform, encouraged active reading, error detection, and collaboration in a computer science course at
Carnegie Mellon University
. This study suggested a correlation between students' overall performance in the course and their ability to identify errors in a text that they annotated in Classroom Salon; it also found that students were likely to change their annotations in response to annotations made by others in the course. Similarly, the web-based annotation tool HyLighter was used in a first-year writing course and shown to improve the development of students' mental models of texts, including supporting reading comprehension,
critical thinking
, and the ability to develop a thesis. The collaboration with peers and experts around a shared text improved these skills and brought the communities' understanding closer together. A meta-analysis of empirical studies into the higher-education uses of social annotation (SA) tools indicates such tools have been tested in several courses, among them English,
sport psychology
, and
hypermedia
. Studies have indicated that social annotation functions, including commenting, information sharing, and highlighting, can support instruction designed to foster collaborative learning and communication, as well as reading comprehension, metacognition, and critical analysis. Several studies indicated that students enjoyed using social annotation tools and that the tools improved their motivation in the course. "Multi-sensory" annotations have also been found to help students retain information in the classroom and to help people who are learning a new language. An image can be placed next to, or linked to, a word so that readers can better understand what the word means by looking at it. The same can be done with an audio clip giving the word's pronunciation and meaning. This is easier to do with technology, and to count as an annotation the image or clip must be embedded within the referenced document. In a physical copy of a text, however, a picture drawn next to a word is still a sensory annotation. This form of annotation furthers comprehension, particularly in the classroom, because it engages more of students' brains in retaining the information being given.


Writing and text-centered collaboration

Text annotations have long been used in writing and revision processes as a way for reviewers to suggest changes and communicate about a text. In book publishing, for example, the collaboration of authors and editors to develop and revise a manuscript frequently involves exchanges of both in-line revisions or notes as well as marginal annotations. Similarly, copyeditors often make marginal annotations or notes that explain or suggest revisions or are directed at the author as questions or suggestions (commonly called "queries"). Asynchronous collaborative writing and document development often depend on text annotations as a way not only to suggest revisions but also to exchange ideas during document development or to facilitate group decision making, though such processes are often complicated by the use of different communication technologies (such as phone calls or emails as well as document sharing) for distinct tasks. Text annotations can also function to allow group or community members to communicate about a shared text, such as a doctor annotating a patient's chart. Much research into the functionality and design of collaborative IT-based writing systems, which often support text annotation, has occurred in the area of computer-supported cooperative work.


Linguistic annotation

In corpus linguistics, digital philology and natural language processing, annotations are used to explicate linguistic, textual or other features of a text (or other digital representations of natural language). In linguistics, annotations include comments and metadata; non-transcriptional annotations are also non-linguistic. In these disciplines, annotations are the basis for quantitative research, empirical studies and the application of machine learning. Unlike annotations in the uses described above, which appear only sparsely, linguistic annotation usually requires that every element (token) within a text carries one or multiple annotations, and that complex relations between different annotations exist. A number of specialized formats (and tools) exist for this purpose; the following illustrates an annotation as used in the Universal Dependencies project. For clarity, the tab-separated values normally used have been replaced by an HTML table. A visualization of the example is given in Fig. 2. In addition to word-level annotations, the word (and the sentence, etc.) in this format can carry metadata. Various other annotation formats exist, often coupled with particular pieces of software for their creation, processing or querying; see Ide et al. (2017) for an overview. The Linguistic Annotation Wiki describes tools and formats for creating and managing linguistic annotations. Selected problems and applications are also discussed under Overlapping markup and Web annotation. Aside from tab-separated values and other text formats, formats for linguistic annotations are often based on markup languages such as XML (and formerly, SGML); more complex annotations may also employ graph-based data models and formats such as JSON-LD, e.g., in accordance with the Web Annotation standard. Linguistic annotation comes with an independent research tradition and its own terminology: the target of an annotation is usually referred to as a 'markable' and the body of the annotation as 'annotation'; the relation between annotation and markable is usually expressed in the annotation format itself (e.g., by having annotations and text side by side), so that explicit anchors are not necessary.
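To make the token-level, tab-separated style of annotation concrete, here is a minimal sketch in Python that reads one sentence in CoNLL-U, the ten-column format in which Universal Dependencies treebanks are distributed. The sample sentence, its `sent_id`, and the helper name `parse_sentence` are illustrative assumptions, not taken from the article or from any treebank; the sketch also ignores multiword-token and empty-node lines that real CoNLL-U files may contain.

```python
# Column names of the ten tab-separated CoNLL-U fields (lowercased here).
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")

# Hypothetical example sentence; lines starting with "#" carry
# sentence-level metadata, every other line annotates one token.
SAMPLE = (
    "# sent_id = example-1\n"
    "# text = The dog barks.\n"
    "1\tThe\tthe\tDET\t_\tDefinite=Def|PronType=Art\t2\tdet\t_\t_\n"
    "2\tdog\tdog\tNOUN\t_\tNumber=Sing\t3\tnsubj\t_\t_\n"
    "3\tbarks\tbark\tVERB\t_\tNumber=Sing|Person=3|Tense=Pres\t0\troot\t_\tSpaceAfter=No\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
)


def parse_sentence(block):
    """Split one CoNLL-U sentence into (metadata dict, list of token dicts)."""
    metadata, tokens = {}, []
    for line in block.strip().splitlines():
        if line.startswith("#"):
            key, _, value = line[1:].partition("=")
            metadata[key.strip()] = value.strip()
        else:
            tokens.append(dict(zip(FIELDS, line.split("\t"))))
    return metadata, tokens


metadata, tokens = parse_sentence(SAMPLE)
for tok in tokens:
    # Each token points at its syntactic head by ID; 0 marks the root.
    print(tok["form"], tok["upos"], tok["deprel"], "-> head", tok["head"])
```

Every token carries several annotations at once (lemma, part of speech, features, dependency relation), and the `head` column links tokens to one another, which is the kind of dense, relational annotation the paragraph above contrasts with sparse marginal notes.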


Structure and design

Research in the design and development of annotation systems uses specific terminology to refer to distinct structural components of annotations and also distinguishes among options for digital annotation displays.


Annotation structure

The structural components of any annotation can be roughly divided into three primary elements: a ''body'', an ''anchor'', and a ''marker''. The body of an annotation includes reader-generated
symbols
and text, such as handwritten commentary or stars in the margin. The anchor is what indicates the extent of the original text to which the body of the annotation refers; it may include circles around sections, brackets, highlights, underlines, and so on. Annotations may be anchored to very broad stretches of text (such as an entire document) or very narrow sections (such as a specific letter, word, or phrase). The marker is the visual appearance of the anchor, such as whether it is a grey underline or a yellow highlight. An annotation that has a body (such as a comment in the margin) but no specific anchor has no marker.
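The body/anchor/marker division can be sketched as a small data structure. The following Python is only one possible modelling, under the assumption that an anchor is a character range in a plain-text document; the class and field names are hypothetical and do not come from any particular annotation system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Anchor:
    """Extent of the source text that an annotation refers to."""
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive

    def extract(self, text: str) -> str:
        return text[self.start:self.end]


@dataclass(frozen=True)
class Marker:
    """Visual appearance of an anchor, e.g. a yellow highlight."""
    style: str  # "highlight", "underline", ...
    color: str


@dataclass
class Annotation:
    body: str                        # reader-generated text or symbol
    anchor: Optional[Anchor] = None  # None: no specific extent of text
    marker: Optional[Marker] = None

    def __post_init__(self) -> None:
        # A marker is the appearance of an anchor, so it cannot exist alone.
        if self.marker is not None and self.anchor is None:
            raise ValueError("a marker requires an anchor")


text = "Text annotation may be as old as writing."
note = Annotation(
    body="Check the dating of this claim.",
    anchor=Anchor(start=0, end=15),
    marker=Marker(style="highlight", color="yellow"),
)
unanchored = Annotation(body="*")  # e.g. a star in the margin

print(note.anchor.extract(text))
```

The `__post_init__` check encodes the last sentence of the paragraph above: an annotation may have a body without an anchor, but then it has no marker either.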


Annotation display types

IT-based annotation systems utilize a variety of display options for annotations, including:
* Footnote interfaces that display annotations below the corresponding text
* Aligned annotations that display comments and notes vertically in the text margins, sometimes in multiple columns or as a "sidebar" layer
* Interlinear annotations that attach annotations directly into a text
* Sticky note interfaces, where annotations appear in popup dialogs over the source text
* Voice annotations, in which reviewers record annotations and embed them within a document
* Pen or digital-ink based interfaces that allow writing directly on a document or screen

Annotation interfaces may also allow highlighting or underlining, as well as threaded discussions. Sharing and communicating through annotations anchored to specific documents is sometimes referred to as ''anchored discussion''.


IT-based text annotation systems

IT-based annotation systems include standalone and client-server systems. In the 1980s and 1990s, a number of such systems were built in the context of libraries, patent offices, and legal text processing. Their design led researchers to produce taxonomies of annotation forms. Text annotation research has taken place at several institutions, including Xerox research centers in Palo Alto and Grenoble (France), the Hitachi Central Research Lab (in particular for annotation of patents), and, in connection with the construction of the new French National Library between 1989 and 1995, the Institut de Recherche en Informatique de Toulouse and the company AIS (Advanced Innovation Systems). Annotation functionality has been present in text processing software for many years through inline notes displayed as pop-ups, footnotes, and endnotes; however, it is only recently that functionality for displaying annotations as marginalia has appeared in programs such as OpenOffice.org/LibreOffice Writer and Microsoft Word. Personal or standalone annotation systems include word processing software that supports embedded or anchored text annotations as well as Adobe Acrobat, which in addition to commenting allows highlights, stamps, and other types of markup.


Web-based text annotation systems

Tim Berners-Lee had already implemented the concept of directly editing web documents in 1990 in WorldWideWeb, the first web browser, but later ported versions removed this collaborative ability. An early version of NCSA Mosaic in 1993 also included a collaborative annotation capability, though it was quickly removed. Collaborative authoring was later reintroduced as an extension through Web Distributed Authoring and Versioning (WebDAV). A different approach to distributed authoring consists of first gathering many annotations from a wide public and then integrating them all to produce a further version of a document. This approach was pioneered by Stet, the system put in place to gather comments on drafts of version 3 of the GNU General Public License. The system was built for that specific requirement, which it served very well, but it was not configurable enough to be convenient for annotating other documents on the web. The co-ment system uses annotation interface concepts similar to Stet's, but it is based on an entirely new implementation, using Django/Python on the server side and various AJAX libraries such as jQuery on the client side. Both Stet and co-ment are licensed under the GNU Affero General Public License. Since 2011, the non-profit Hypothes.is Project has offered the free, open web annotation service Hypothes.is. The service features annotation via a Chrome extension, bookmarklet or proxy server, as well as integration into an LMS or CMS. Both webpages and PDFs can be annotated. Other web-based text annotation systems are collaborative software for distributed text editing and versioning, which also feature annotation and commenting interfaces. Specialized Web-based text annotations exist in the context of scientific publication, either for refereeing or post-publication. The online journal PLoS ONE, published by the Public Library of Science, has developed its own Web-based system where scientists and the public can comment on published articles. The annotations are displayed as pop-ups with an anchor in the text.
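A web-based system has to store each annotation separately from the page it comments on and find the anchored passage again when the page is displayed. As a rough illustration, the Python sketch below builds an annotation shaped like the W3C Web Annotation Data Model (a JSON-LD object with a `body` and a `target` carrying a `TextQuoteSelector`) and re-locates the quoted passage in the page text. The URLs, the function names `make_annotation` and `locate`, and the sample text are assumptions made for this example; the article does not say which of the systems above, if any, store annotations this way.

```python
import json


def make_annotation(anno_id, source_url, exact, prefix, suffix, comment):
    """Build a dict shaped like a Web Annotation with a text-quote anchor."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": source_url,
            "selector": {
                "type": "TextQuoteSelector",
                "exact": exact,    # the annotated passage itself
                "prefix": prefix,  # text just before it, for disambiguation
                "suffix": suffix,  # text just after it
            },
        },
    }


def locate(selector, text):
    """Return (start, end) offsets of the quoted passage, or None if absent."""
    needle = selector["prefix"] + selector["exact"] + selector["suffix"]
    pos = text.find(needle)
    if pos == -1:
        return None
    start = pos + len(selector["prefix"])
    return start, start + len(selector["exact"])


page_text = "Both webpages and PDFs can be annotated."
anno = make_annotation(
    anno_id="http://example.org/anno/1",     # hypothetical identifier
    source_url="http://example.org/page1",   # hypothetical page
    exact="PDFs",
    prefix="webpages and ",
    suffix=" can be",
    comment="Does this include scanned PDFs?",
)

print(json.dumps(anno, indent=2))
span = locate(anno["target"]["selector"], page_text)
```

Anchoring by quoted text plus surrounding context, rather than by raw character offsets alone, lets the passage be found again after unrelated parts of the page change; if the quoted passage itself is edited, `locate` returns `None` and the annotation becomes an orphan that the interface must handle.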


See also

* Annotation
* Web annotation
* Gloss (annotation)
* Interlinear gloss
* Footnote
* PDF annotation
* Marginalia
* Social bookmarking
* Comment (computer programming)


External links


* Effects of annotations on student readers and writers
* Annotations and the Collaborative Digital Library: Effects of an Aligned Annotation Interface on Student Argumentation and Reading Strategies
* From the Margins to the Center: The Future of Annotation
* Bringing Social Media to the Writing Classroom: Classroom Salon, which discusses how social media can facilitate collaboration in writing classrooms
* Asynchronous Collaborative Writing through Annotations, which describes how the benefits of physical annotations can be brought into a digital environment