The postings list is a data structure commonly used in information retrieval (IR) systems to store indexing information about a corpus. It is central to the design and efficiency of search engines and database management systems that need to retrieve information rapidly. At a minimum, a postings list is associated with a single term and records the documents in which that term appears. Each term found in the corpus is mapped to a corresponding postings list containing information such as the documents the term appears in and, often, the positions within those documents.
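The mapping from terms to postings lists can be illustrated with a minimal Python sketch. The function name `build_index`, the whitespace tokenizer, and the toy corpus are assumptions made for illustration; real systems use far more careful tokenization and storage.

```python
from collections import defaultdict

def build_index(corpus):
    """Map each term to a postings list of (doc_id, positions) pairs."""
    index = defaultdict(dict)
    for doc_id, text in sorted(corpus.items()):
        # Naive tokenization: lowercase and split on whitespace (an assumption).
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    # Emit each postings list sorted by document identifier.
    return {term: sorted(postings.items()) for term, postings in index.items()}

# Hypothetical three-document corpus keyed by document identifier.
corpus = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "the quick dog jumps over the lazy fox",
}
index = build_index(corpus)
print(index["quick"])  # [(1, [1]), (3, [1])]
```

Keeping each list sorted by document identifier is a common design choice because it allows lists for different terms to be merged in a single pass.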


Structure

A postings list consists of posting elements, sometimes referred to as postings. Each posting typically contains:

* A document identifier (DocID), which uniquely identifies a document in the corpus.
* Frequency information (term frequency), indicating how often the term appears within the document.
* Position information, indicating where in the text the term appears.
* Additional metadata, which may include fields such as document titles, headings, or other relevant document-specific information.

The exact structure of a postings list varies with its application: some implementations use linked lists or arrays, while others use more complex data structures such as skip lists to optimize for different types of searches. During a search query, the IR system retrieves the postings list for each term in the query to determine which documents contain the terms and how relevant those documents are likely to be, based on the frequency and positions of the terms.
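As a rough sketch of how postings might be represented and combined at query time, the Python example below defines a hypothetical `Posting` record and intersects two lists sorted by DocID with a two-pointer merge. The field names, the `intersect` helper, and the scoring by summed term frequency are illustrative assumptions, not a description of any particular system.

```python
from dataclasses import dataclass, field

@dataclass
class Posting:
    doc_id: int                # document identifier (DocID)
    term_frequency: int        # occurrences of the term in the document
    positions: list = field(default_factory=list)  # token offsets

def intersect(list_a, list_b):
    """Two-pointer merge of two postings lists sorted by doc_id."""
    i, j, matches = 0, 0, []
    while i < len(list_a) and j < len(list_b):
        a, b = list_a[i], list_b[j]
        if a.doc_id == b.doc_id:
            # Both terms occur in this document; record a combined score.
            matches.append((a.doc_id, a.term_frequency + b.term_frequency))
            i += 1
            j += 1
        elif a.doc_id < b.doc_id:
            i += 1
        else:
            j += 1
    # Rank by summed term frequency, a deliberately crude relevance score.
    return sorted(matches, key=lambda m: m[1], reverse=True)

quick = [Posting(1, 1, [1]), Posting(3, 2, [1, 9]), Posting(7, 1, [4])]
fox = [Posting(1, 1, [3]), Posting(3, 1, [7]), Posting(5, 3, [0, 2, 8])]
results = intersect(quick, fox)
print(results)  # [(3, 3), (1, 2)]
```

The merge visits each posting at most once, which is why sorted arrays are a popular layout; skip lists serve the same access pattern while letting the merge jump over long runs of non-matching documents.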


Variants

Variants of, and structures built on, postings lists include:

* Inverted index: An index that maps each term to its postings list, pointing from terms to documents.
* Impact-ordered postings: Lists where postings are ordered by the weight or "impact" of the term in the document.
* Positional postings lists: Enhanced postings lists that include position information for phrase queries and proximity searches.
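Two of these variants can be sketched briefly in Python. The dictionary layout (DocID mapped to a sorted list of token positions), the helper names `phrase_match` and `impact_order`, and the sample weights are all assumptions chosen for illustration.

```python
def phrase_match(first, second):
    """Return doc IDs where the second term directly follows the first.

    Each argument is a positional postings list, modeled here as a dict
    mapping doc_id -> sorted list of token positions (hypothetical layout).
    """
    hits = []
    for doc_id in sorted(first.keys() & second.keys()):
        following = set(second[doc_id])
        # A phrase match needs the second term at position + 1.
        if any(pos + 1 in following for pos in first[doc_id]):
            hits.append(doc_id)
    return hits

def impact_order(postings):
    """Reorder (doc_id, weight) postings so the highest weights come first."""
    return sorted(postings, key=lambda p: p[1], reverse=True)

lazy = {2: [1], 3: [6]}
fox_positions = {1: [3], 3: [7]}
print(phrase_match(lazy, fox_positions))  # [3]

weighted = [(1, 0.2), (2, 0.9), (3, 0.5)]
print(impact_order(weighted))  # [(2, 0.9), (3, 0.5), (1, 0.2)]
```

With impact ordering, a query processor can stop reading a list early once the remaining weights are too small to change the top results, at the cost of losing the DocID ordering that makes merge-style intersection cheap.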

