A screen reader is a form of

assistive technology Assistive technology (AT) is a term for assistive, adaptive, and rehabilitative devices for Disability, people with disabilities and the elderly. Disabled people often have difficulty performing activities of daily living (ADLs) independently, ...

(AT) that renders text and image content as speech or braille output. Screen readers are essential to blind people, and are useful to visually impaired people, illiterate, or have a

learning disability Learning disability, learning disorder, or learning difficulty (British English) is a condition in the brain that causes difficulties comprehending or processing information and can be caused by several different factors. Given the "difficulty ...

. Screen readers are software applications that attempt to convey what people with normal eyesight see on a display to their users via non-visual means, like

text-to-speech Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or Computer hardware, hardware products. A text-to-speech (TTS) system conv ...

, sound icons, or a braille device. They do this by applying a wide variety of techniques that include, for example, interacting with dedicated accessibility APIs, using various

operating system An operating system (OS) is system software that manages computer hardware and software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ...

features (like

inter-process communication In computer science, interprocess communication (IPC) is the sharing of data between running Process (computing), processes in a computer system. Mechanisms for IPC may be provided by an operating system. Applications which use IPC are often cat ...

and querying

user interface In the industrial design field of human–computer interaction, a user interface (UI) is the space where interactions between humans and machines occur. The goal of this interaction is to allow effective operation and control of the machine fro ...

properties), and employing

hooking In computer programming, the term hooking covers a range of techniques used to alter or augment the behaviour of an operating system, of applications, or of other software components by intercepting function calls or messages or events passed ...

techniques.

Microsoft Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...

operating systems An operating system (OS) is system software that manages computer hardware and software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ...

have included the Microsoft Narrator screen reader since

Windows 2000 Windows 2000 is a major release of the Windows NT operating system developed by Microsoft, targeting the server and business markets. It is the direct successor to Windows NT 4.0, and was Software release life cycle#Release to manufacturing (RT ...

, though separate products such as Freedom Scientific's commercially available JAWS screen reader and ZoomText screen magnifier and the

free and open source Free and open-source software (FOSS) is software available under a license that grants users the right to use, modify, and distribute the software modified or not to everyone free of charge. FOSS is an inclusive umbrella term encompassing free ...

screen reader NVDA by NV Access are more popular for that operating system.

Apple Inc. Apple Inc. is an American multinational corporation and technology company headquartered in Cupertino, California, in Silicon Valley. It is best known for its consumer electronics, software, and services. Founded in 1976 as Apple Comput ...

macOS macOS, previously OS X and originally Mac OS X, is a Unix, Unix-based operating system developed and marketed by Apple Inc., Apple since 2001. It is the current operating system for Apple's Mac (computer), Mac computers. With ...

iOS Ios, Io or Nio (, ; ; locally Nios, Νιός) is a Greek island in the Cyclades group in the Aegean Sea. Ios is a hilly island with cliffs down to the sea on most sides. It is situated halfway between Naxos and Santorini. It is about long an ...

, and

tvOS tvOS (formerly Apple TV Software) is an operating system developed by Apple for the Apple TV, a digital media player. In the first-generation Apple TV, Apple TV Software was based on Mac OS X. The software for the second-generation and later ...

include VoiceOver as a built-in screen reader, while

Google Google LLC (, ) is an American multinational corporation and technology company focusing on online advertising, search engine technology, cloud computing, computer software, quantum computing, e-commerce, consumer electronics, and artificial ...

's Android provides the Talkback screen reader and its

ChromeOS ChromeOS, sometimes styled as chromeOS and formerly styled as Chrome OS, is an operating system designed and developed by Google. It is derived from the open-source operating system and uses the Google Chrome web browser as its principal user ...

can use ChromeVox. Similarly, Android-based devices from Amazon provide the VoiceView screen reader. There are also free and open source screen readers for

Linux Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, pac ...

and

Unix-like A Unix-like (sometimes referred to as UN*X, *nix or *NIX) operating system is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification. A Uni ...

systems, such as Speakup and

Orca The orca (''Orcinus orca''), or killer whale, is a toothed whale and the largest member of the oceanic dolphin family. The only extant species in the genus '' Orcinus'', it is recognizable by its black-and-white-patterned body. A cosmopol ...

History

Around 1978, Al Overby of IBM Raleigh developed a prototype of a talking terminal, known as SAID (for Synthetic Audio Interface Driver), for the IBM 3270 terminal. SAID read the ASCII values of the display in a stream and spoke them through a large vocal track synthesizer the size of a suitcase, and it cost around $10,000. Dr. Jesse Wright, a blind research mathematician, and Jim Thatcher, formerly his graduate student from the University of Michigan, working as mathematicians for IBM, adapted this as an internal IBM tool for use by blind people. After the early IBM Personal Computer (PC) was released in 1981, Thatcher and Wright developed a software equivalent to SAID, called PC-SAID, or ''Personal Computer Synthetic Audio Interface Driver''. This was renamed and released in 1984 as IBM Screen Reader, which became the proprietary eponym for that general class of assistive technology.

Types

Command-line (text)

In early

s, such as

MS-DOS MS-DOS ( ; acronym for Microsoft Disk Operating System, also known as Microsoft DOS) is an operating system for x86-based personal computers mostly developed by Microsoft. Collectively, MS-DOS, its rebranding as IBM PC DOS, and a few op ...

, which employed

command-line interface A command-line interface (CLI) is a means of interacting with software via command (computing), commands each formatted as a line of text. Command-line interfaces emerged in the mid-1960s, on computer terminals, as an interactive and more user ...

s (CLIs), the screen display consisted of characters mapping directly to a

screen buffer A framebuffer (frame buffer, or sometimes framestore) is a portion of random-access memory (RAM) containing a bitmap that drives a video display. It is a memory buffer containing data representing all the pixels in a complete video frame. Moder ...

memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembe ...

and a cursor position. Input was by keyboard. All this information could therefore be obtained from the system either by

the flow of information around the system and reading the screen buffer or by using a standard hardware output socket and communicating the results to the user. In the 1980s, the Research Centre for the Education of the Visually Handicapped (RCEVH) at the

University of Birmingham The University of Birmingham (informally Birmingham University) is a Public university, public research university in Birmingham, England. It received its royal charter in 1900 as a successor to Queen's College, Birmingham (founded in 1825 as ...

developed a Screen Reader for the

BBC Micro The BBC Microcomputer System, or BBC Micro, is a family of microcomputers developed and manufactured by Acorn Computers in the early 1980s as part of the BBC's Computer Literacy Project. Launched in December 1981, it was showcased across severa ...

and NEC Portable.

Graphical

Off-screen models

With the arrival of

graphical user interface A graphical user interface, or GUI, is a form of user interface that allows user (computing), users to human–computer interaction, interact with electronic devices through Graphics, graphical icon (computing), icons and visual indicators such ...

s (GUIs), the situation became more complicated. A GUI has characters and graphics drawn on the screen at particular positions, and therefore there is no purely textual representation of the graphical contents of the display. Screen readers were therefore forced to employ new low-level techniques, gathering messages from the

and using these to build up an "off-screen model", a representation of the display in which the required text content is stored. For example, the operating system might send messages to draw a command button and its caption. These messages are intercepted and used to construct the off-screen model. The user can switch between controls (such as buttons) available on the screen and the captions and control contents will be read aloud and/or shown on a refreshable braille display. Screen readers can also communicate information on menus, controls, and other visual constructs to permit blind users to interact with these constructs. However, maintaining an off-screen model is a significant technical challenge; hooking the low-level messages and maintaining an accurate model are both difficult tasks.

Accessibility APIs

Operating system and application designers have attempted to address these problems by providing ways for screen readers to access the display contents without having to maintain an off-screen model. These involve the provision of alternative and accessible representations of what is being displayed on the screen accessed through an

API An application programming interface (API) is a connection between computers or between computer programs. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how to build ...

. Existing APIs include: * Android Accessibility Framework * Apple Accessibility API * AT-SPI * IAccessible2 * Microsoft Active Accessibility (MSAA) * Microsoft UI Automation * Java Access Bridge Screen readers can query the operating system or application for what is currently being displayed and receive updates when the display changes. For example, a screen reader can be told that the current focus is on a button and the button caption to be communicated to the user. This approach is considerably easier for the developers of screen readers, but fails when applications do not comply with the accessibility API: for example,

Microsoft Word Microsoft Word is a word processor program, word processing program developed by Microsoft. It was first released on October 25, 1983, under the name Multi-Tool Word for Xenix systems. Subsequent versions were later written for several other platf ...

does not comply with the MSAA API, so screen readers must still maintain an off-screen model for Word or find another way to access its contents. One approach is to use available operating system messages and application object models to supplement accessibility APIs. Screen readers can be assumed to be able to access all display content that is not intrinsically inaccessible. Web browsers, word processors, icons and windows and email programs are just some of the applications used successfully by screen reader users. However, according to some users, using a screen reader is considerably more difficult than using a GUI, and many applications have specific problems resulting from the nature of the application (e.g. animations) or failure to comply with accessibility standards for the platform (e.g. Microsoft Word and Active Accessibility).

Self-voicing programs and applications

Some programs and applications have voicing technology built in alongside their primary functionality. These programs are termed self-voicing and can be a form of

if they are designed to remove the need to use a screen reader.

Cloud-based

Some telephone services allow users to interact with the internet remotely. For example, TeleTender can read web pages over the phone and does not require special programs or devices on the user side.

Virtual assistant A virtual assistant (VA) is a software agent that can perform a range of tasks or services for a user based on user input such as commands or questions, including verbal ones. Such technologies often incorporate chatbot capabilities to streaml ...

s can sometimes read out written documents (textual web content,

PDF Portable document format (PDF), standardized as ISO 32000, is a file format developed by Adobe Inc., Adobe in 1992 to present documents, including text formatting and images, in a manner independent of application software, computer hardware, ...

documents, e-mails etc.) The best-known examples are Apple's

Siri Siri ( , backronym: Speech Interpretation and Recognition Interface) is a digital assistant purchased, developed, and popularized by Apple Inc., which is included in the iOS, iPadOS, watchOS, macOS, Apple TV, audioOS, and visionOS operating sys ...

Google Assistant Google Assistant is a virtual assistant software application developed by Google that is primarily available on home automation and mobile devices. Based on artificial intelligence, Google Assistant can engage in two-way conversations, unlike ...

, and

Amazon Alexa Amazon Alexa is a virtual assistant technology marketed by Amazon and implemented in software applications for smart phones, tablets, wireless smart speakers, and other electronic appliances. Alexa was largely developed from a Polish speech s ...

Web-based

A relatively new development in the field is web-based applications like Spoken-Web that act as web portals, managing content like news updates, weather, science and business articles for visually-impaired or blind computer users. Other examples are ReadSpeaker or BrowseAloud that add

functionality to web content. The primary audience for such applications is those who have difficulty reading because of learning disabilities or language barriers. Although functionality remains limited compared to equivalent desktop applications, the major benefit is to increase the accessibility of said websites when viewed on public machines where users do not have permission to install custom software, giving people greater "freedom to roam". This functionality depends on the quality of the software but also on a logical structure of the text. Use of headings, punctuation, presence of alternate attributes for images, etc. is crucial for a good vocalization. Also a web site may have a nice look because of the use of appropriate two dimensional positioning with CSS but its standard linearization, for example, by suppressing any CSS and Javascript in the browser may not be comprehensible.

Customization

Most screen readers allow the user to select whether most

punctuation Punctuation marks are marks indicating how a piece of writing, written text should be read (silently or aloud) and, consequently, understood. The oldest known examples of punctuation marks were found in the Mesha Stele from the 9th century BC, c ...

is announced or silently ignored. Some screen readers can be tailored to a particular application through scripting. One advantage of scripting is that it allows customizations to be shared among users, increasing accessibility for all. JAWS enjoys an active script-sharing community, for example.

Verbosity

Verbosity is a feature of screen reading software that supports vision-impaired computer users. Speech verbosity controls enable users to choose how much speech feedback they wish to hear. Specifically, verbosity settings allow users to construct a mental model of web pages displayed on their computer screen. Based on verbosity settings, a screen-reading program informs users of certain formatting changes, such as when a frame or table begins and ends, where graphics have been inserted into the text, or when a list appears in the document. The verbosity settings can also control the level of descriptiveness of elements, such as lists, tables, and regions. For example, JAWS provides low, medium, and high web verbosity preset levels. The high web verbosity level provides more detail about the contents of a webpage.

Language

Some screen readers can read text in more than one

language Language is a structured system of communication that consists of grammar and vocabulary. It is the primary means by which humans convey meaning, both in spoken and signed language, signed forms, and may also be conveyed through writing syste ...

, provided that the language of the material is encoded in its

metadata Metadata (or metainformation) is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive ...

. Screen reading programs like JAWS, NVDA, and VoiceOver also include language verbosity, which automatically detects verbosity settings related to speech output language. For example, if a user navigated to a website based in the United Kingdom, the text would be read with an English accent.

References

{{authority control Assistive technology