
The probabilistic relevance model was devised by Stephen E. Robertson and Karen Spärck Jones as a framework for later probabilistic models. It is a formalism of information retrieval used to derive the ranking functions with which search engines, including web search engines, rank matching documents according to their relevance to a given search query. It is a theoretical model that estimates the probability that a document ''d<sub>j</sub>'' is relevant to a query ''q''. The model assumes that this probability of relevance depends on the query and document representations. Furthermore, it assumes that there is a subset of all documents that the user prefers as the answer set for query ''q''. This ideal answer set is called ''R'' and should maximize the overall probability of relevance to that user. Documents in ''R'' are predicted to be relevant to the query, while documents outside it are predicted to be non-relevant. Documents are ranked by the odds of relevance, where <math>\bar{R}</math> denotes the complement of ''R'':

<math>sim(d_j, q) = \frac{P(R \mid d_j)}{P(\bar{R} \mid d_j)}</math>
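
The odds of relevance can be rewritten with Bayes' rule, which is the usual first step when a concrete ranking function is derived from this framework. The sketch below is a standard manipulation rather than something stated in the text above; it writes <math>\bar{R}</math> for the set of non-relevant documents and uses the fact that <math>P(R)</math> and <math>P(\bar{R})</math> are the same for every document under a fixed query, so dropping them does not change the ranking.

```latex
\begin{align*}
sim(d_j, q)
  &= \frac{P(R \mid d_j)}{P(\bar{R} \mid d_j)}
   = \frac{P(d_j \mid R)\, P(R)}{P(d_j \mid \bar{R})\, P(\bar{R})} \\
  &\propto \frac{P(d_j \mid R)}{P(d_j \mid \bar{R})}
\end{align*}
```

The remaining quantities, <math>P(d_j \mid R)</math> and <math>P(d_j \mid \bar{R})</math>, are what a specific model has to estimate from the document representation.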


Related models

There are some limitations to this framework that need to be addressed by further development:
* There is no accurate estimate for the first-run probabilities
* Index terms are not weighted
* Terms are assumed to be mutually independent

To address these and other concerns, other models have been developed from the probabilistic relevance framework, among them the Binary Independence Model by the same authors. The best-known derivatives of this framework are the Okapi (BM25) weighting scheme and its multifield refinement, BM25F.
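
As an illustration of a ranking function derived from this framework, the following is a minimal Python sketch of BM25 scoring over a small tokenized collection. It rests on several assumptions: the function name `bm25_scores` is hypothetical, `k1=1.2` and `b=0.75` are commonly used default values rather than values prescribed by the framework, and the IDF term uses the non-negative variant with `+ 1` inside the logarithm, which is one of several formulations in use.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`.

    `docs` is a non-empty list of token lists. `k1` controls term-frequency
    saturation and `b` controls document-length normalization
    (both are assumed defaults, not fixed by the framework).
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Longer-than-average documents get a larger normalization factor.
        length_norm = 1 - b + b * len(d) / avg_len
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue  # terms absent from the document contribute nothing
            n_t = doc_freq[term]
            # Non-negative IDF variant (assumption, see lead-in).
            idf = math.log((n_docs - n_t + 0.5) / (n_t + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


# Toy collection (illustrative data only).
docs = [
    "probabilistic relevance model for information retrieval".split(),
    "search engines rank documents by relevance".split(),
    "cooking recipes for pasta".split(),
]
scores = bm25_scores(["relevance", "retrieval"], docs)
# Document indices ordered from highest to lowest score.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this sketch, the term-frequency and length-normalization factors give index terms graded weights, which speaks to the second limitation listed above. Terms are still scored independently and summed, so the independence assumption remains.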

