Uniterm is a
subject indexing
system introduced by
Mortimer Taube
in 1951. The name is a contraction of "unit" and "term", referring to its use of single words as the basis of the index, the "uniterms". Taube referred to the overall concept as "Coordinate Indexing", but today the entire concept is generally referred to as Uniterm as well.
Uniterm is designed to allow rapid lookups on topic keywords and then
cross-reference
those keywords across multiple topics in order to find documents that match all of the terms. The result of a uniterm search is a set of
accession numbers that can then be used to retrieve the matching documents. Uniterm is based on existing accession numbers, so it is technically a ''post-coordinate'' system. This is opposed to a pre-coordinate system, where the subject of the document results in it being given a particular number, as in the
Dewey Decimal Classification
. Uniterm was among the most popular post-coordinate indexing systems, although some of its success was due to Taube's company winning contracts to index huge technical libraries.
History
The development of Uniterm, and other new indexing systems, ultimately traces its history to the late
World War II
period. Aware of the advanced aircraft and rocket technologies developed in Germany, the US formed
Operation Lusty and the UK the similar
Fedden Mission in order to gather as much of these materials as possible. Along with examples of the aircraft and various weapons, these efforts returned millions of pages of technical documentation. The desire to ease access into these enormous collections led to a great expansion in the field of
information retrieval
.
In the US, the aeronautical collection was first sent to the
US Army Air Force
at
Wright Field
, but over time it was merged with similar caches of US research to form an ever-growing collection of technical papers. The collection grew so large and varied that a new operational group, the
Armed Services Technical Information Agency
(ASTIA), was formed in 1951 to manage it. This group eventually came under the management of the
Atomic Energy Commission. ASTIA began running experiments in indexing the collection, and it was from this work that Uniterm emerged.
Taube introduced the Uniterm concept in a 1951 paper, "Coordinate Indexing of Scientific Fields", part of the Symposium on Mechanical Aids to Chemical Documentation. The next year, in partnership with Gerald Sophar, Taube formed Documentation, Inc. The company offered commercial retrieval and indexing services. Among their largest efforts was a 1958 contract with the newly formed
NASA
to index its entire technical library and, later, to make
microfilm
copies of it.
Taube's original paper indicates that a significant advantage of the Uniterm concept is its ability to be automated. In essence, the uniterm lookup process is looking for the
intersection
of several terms, or as Taube referred to them, the "coordinates". To this end, Taube's company partnered with
IBM
to develop the "Continuous Multiple Access Collator", or COMAC. Users would make search term selections on a
punch card
writer and then feed them into the COMAC, also known as the IBM 9900. The COMAC pulled those uniterm cards and then used optical systems to find matching items. It then returned a new card carrying those numbers, which was sent into the
IBM 305 RAMAC
, the first computer with a
hard drive
, which returned the complete document information for those numbers.
Concept
Uniterm is based on the concept of making a separate
card catalog
that refers to the documents in the collection by their
accession numbers. The accession numbers have no meaning in the Uniterm index, so they may use any of the common systems like the
Dewey Decimal Classification
or
Universal Decimal Classification
, or in many cases, simply an incrementing
serial number
.
As new works are added to the collection, the librarian will make a normal
index card
for the primary card index as they would for any work. Additionally, they will select a small number of keywords from the title or body of the work that can be used to look it up, and these are also written on the card. For instance, a document on icing of air ducts in aircraft might be filed under "air", "ducts" and "icing", but perhaps not "aircraft" which would be found on too many documents.
The librarian then looks in the Uniterm catalog for cards with those terms on them. If a card is not found, one is created by writing the keyword at the top of the card and dividing the lower portion into ten vertical columns, labeled 0 to 9. The accession number is then written in the column that matches its last digit; for instance, if the accession number ends in 5, the entire number is written in column 5. If a card for the term already exists, the new accession number is simply added to the correct column of that card.
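The filing procedure can be sketched in a few lines of Python. This is only an illustration: the names `new_card` and `file_document` are hypothetical, accession numbers are assumed to be plain integers, and keeping each column sorted is a convenience that the description does not require.

```python
def new_card():
    """Return a blank uniterm card: ten empty columns labeled 0 to 9."""
    return {digit: [] for digit in range(10)}


def file_document(catalog, accession, uniterms):
    """Post one accession number on the card for each of its uniterms."""
    for term in uniterms:
        # Create the card for this term if the catalog does not have one yet.
        card = catalog.setdefault(term, new_card())
        # The column is chosen by the last digit of the accession number.
        column = card[accession % 10]
        if accession not in column:
            column.append(accession)
            column.sort()  # assumption: ordered columns are easier to scan


catalog = {}
file_document(catalog, 1045, ["air", "ducts", "icing"])
file_document(catalog, 2315, ["air", "intakes"])
```

After these two calls the "air" card holds both accession numbers in column 5, while the "icing" card holds only the first.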
To retrieve a document, the user selects potentially useful key terms and extracts those cards from the uniterm index. To find this article, the user might select "indexing" and "library" and retrieve those cards from the uniterm catalog. These cards will have numbers for many different documents; for instance, the "library" card might contain a listing for a book on the
Library of Alexandria
. However, only those documents on "library indexing" will appear on ''both'' cards.
The user then scans the cards to see which accession numbers appear on both; splitting the cards into 10 columns is intended to make the visual scanning process simpler, since a given number can only appear in the same column on each card. Numbers that appear on both cards are likely relevant to the search, and can then be retrieved directly, or looked up in the main
card catalog
if partial accession numbers are used.
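The lookup amounts to a set intersection carried out column by column. The following Python sketch illustrates the idea under some assumptions: accession numbers are integers, the helper names `card_from` and `coordinate` are hypothetical, and the accession numbers are made up for the example.

```python
def card_from(accessions):
    """Build a ten-column card (hypothetical helper) from accession numbers."""
    card = {digit: set() for digit in range(10)}
    for number in accessions:
        card[number % 10].add(number)
    return card


def coordinate(cards):
    """Return the accession numbers found on every card, comparing column by column."""
    if not cards:
        return []
    matches = []
    for digit in range(10):
        # A number ending in `digit` can only sit in column `digit` of each card.
        common = set.intersection(*(card[digit] for card in cards))
        matches.extend(common)
    return sorted(matches)


indexing = card_from([102, 317, 455, 981])
library = card_from([64, 317, 455, 720])
print(coordinate([indexing, library]))
```

Only the numbers posted on both the "indexing" and "library" cards survive the comparison; adding a third card to the list narrows the result further.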
The cards in the main catalog also contain the uniterms used to file that entry, forming a cross-index. A user who selects the cards for "propeller" and "aeroplane" may find many intersecting works on the cards. Returning to the main index, they can look at the uniterms recorded on the main index cards and find other terms that commonly appear, perhaps "aerodynamics". These might suggest additional terms that could be used to narrow the search. They can then return to the uniterm catalog and apply these new terms to find additional documents or further focus the search.
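This refinement step can be approximated by counting the uniterms recorded on the main-catalog cards of the matching documents. The sketch below is an assumption about how one might do this, not a documented procedure; `suggest_terms` and the sample catalog are hypothetical.

```python
from collections import Counter


def suggest_terms(main_catalog, hits, used_terms):
    """Rank the uniterms that co-occur on the matching documents' main cards."""
    counts = Counter()
    for accession in hits:
        for term in main_catalog[accession]:
            if term not in used_terms:  # skip terms already in the search
                counts[term] += 1
    # Most frequent first; ties are broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Main catalog: accession number -> uniterms written on that document's card.
main_catalog = {
    101: ["propeller", "aeroplane", "aerodynamics"],
    102: ["propeller", "aeroplane", "aerodynamics", "noise"],
    103: ["propeller", "aeroplane", "materials"],
}
suggestions = suggest_terms(main_catalog, [101, 102, 103], {"propeller", "aeroplane"})
```

Here "aerodynamics" ranks first because it appears on two of the three matching cards, which makes it a natural candidate for narrowing the search.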
Advantages and criticisms
Uniterm was popular in the United States for large technical collections, which led to considerable study of the system. One particularly useful study came from the
National Security Agency
's effort to catalog its 70,000-work collection.
They found one major advantage of the Uniterm system was that the librarians did not have to have an understanding of the material in order to correctly catalog it. Simply selecting terms that appeared in the title or were obviously important within the text would often result in a useful uniterm entry. This contrasted with traditional hierarchical approaches, where selecting the proper spot within the hierarchy often required some, or considerable, knowledge of the underlying field.
The same effort also revealed a number of problems and suggested solutions. One was that
synonyms
presented a problem: was a paper on "air ducts" the same as, or different from, one on "air intakes"? They suggested this could be addressed by splitting the works into sets of about 1,000 entries and building the catalog out in sections. The first set of 1,000 documents might produce 1,000 uniterms, which were then studied to weed out synonyms. When synonyms were found, they added "see also" headings to those cards. The second set would then be added, using those synonyms. They found that the addition of new terms started to flatten out at about 4,000 entries, and after 10,000 only very specific technical terms were being added.
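The effect of "see also" headings on a search can be illustrated with a small Python sketch. It assumes that a searcher follows every see-also heading and pools the numbers from the linked cards before intersecting; that behavior, the function names, and the sample data are assumptions made for illustration.

```python
# Uniterm cards reduced to plain sets of accession numbers (columns omitted).
catalog = {
    "air": {11, 12, 13},
    "ducts": {11},
    "intakes": {12},
    "icing": {11, 12, 14},
}
# "See also" headings linking terms judged to be synonyms.
see_also = {"ducts": {"intakes"}, "intakes": {"ducts"}}


def postings(term):
    """Pool a term's card with the cards named in its see-also heading."""
    numbers = set(catalog.get(term, set()))
    for other in see_also.get(term, ()):
        numbers |= catalog.get(other, set())
    return numbers


def search(terms):
    """Intersect the pooled postings of every search term."""
    sets = [postings(term) for term in terms]
    return sorted(set.intersection(*sets)) if sets else []


result = search(["air", "ducts", "icing"])
```

Without the see-also link, the search for "air", "ducts" and "icing" would miss document 12, which was filed under "intakes" instead of "ducts".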
A concern that was raised when the concept was first introduced was that the terms might return a large number of false positives due to terms being used to describe completely different concepts. In particular, terms that might mean different things depending on their order were believed to be an issue. If one was looking for "American exports to Canada", the terms "Canada", "US" and "exports" would also return a large number of documents on Canadian exports into the US, perhaps overwhelming the result set.
However, this was found not to be a serious problem in practice, and the few examples that did crop up were solved by adding "delta cards", see-also entries that incorporated a direction. In this case, the "US" card would have a see-also entry for "USΔ", and that card would contain only the entries for exports ''from'' the US.
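A minimal Python sketch shows how a delta card removes the wrong-direction matches. The accession numbers and the `search` helper are hypothetical, and the cards are reduced to plain sets for brevity.

```python
catalog = {
    "US": {201, 202, 203, 204},
    "USΔ": {201, 203},  # delta card: only documents on exports *from* the US
    "Canada": {201, 202, 203, 205},
    "exports": {201, 202, 203, 206},
}


def search(terms):
    """Return the accession numbers that appear on every named card."""
    cards = [catalog[term] for term in terms]
    return sorted(set.intersection(*cards)) if cards else []


broad = search(["US", "Canada", "exports"])    # exports in both directions
narrow = search(["USΔ", "Canada", "exports"])  # US exports to Canada only
```

In this made-up data, document 202 concerns Canadian exports into the US; it matches the broad search but drops out once the "USΔ" card is substituted for the "US" card.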
References
Bibliography
* Sanford, John; Theriault, Frederick (January 1956). "Problems in the Application of Uniterm Coordinate Indexing". ''College and Research Libraries''. 17: 19–23. doi:10.5860/crl_17_01_19. hdl:2142/36851. https://crl.acrl.org/index.php/crl/article/download/10962/12408