Skyline operator
   HOME

TheInfoList



OR:

The skyline operator is the subject of an
optimization problem In mathematics, computer science and economics, an optimization problem is the problem of finding the ''best'' solution from all feasible solutions. Optimization problems can be divided into two categories, depending on whether the variables ...
, used in a query to filter results from a database to keep only those objects that are not worse than any other. This operator is an extension to SQL proposed by Börzsönyi et al. A classic example of application of the skyline operator involves selecting a hotel for a holiday. The user wants the hotel to be both cheap and close to the beach. However, hotels that are close to the beach may also be expensive. In this case, the skyline operator would only present those hotels that are not worse than any other hotel in both price and distance to the beach.


Proposed syntax

To give an example in SQL: Börzsönyi et al. proposed the following syntax for the skyline operator: SELECT ... FROM ... WHERE ... GROUP BY ... HAVING ... SKYLINE OF ISTINCTd1 MAX , DIFF ..., dm MAX , DIFFORDER BY ... where d1, ... dm denote the dimensions of the skyline and MIN, MAX and DIFF specify whether the value in that dimension should be minimised, maximised or simply be different.


Implementation

The skyline operator can be implemented directly in SQL using current SQL constructs, but this has been shown to be very slow. Other algorithms have been proposed that make use of divide and conquer, indices,
MapReduce MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster. A MapReduce program is composed of a ''map'' procedure, which performs filtering ...
and general-purpose computing on graphics cards. Skyline queries on data streams (i.e. continuous skyline queries) have been studied in the context of parallel query processing on multicores, owing to their wide diffusion in real-time decision making problems and data streaming analytics.


See also

*
Pareto efficiency Pareto efficiency or Pareto optimality is a situation where no action or allocation is available that makes one individual better off without making another worse off. The concept is named after Vilfredo Pareto (1848–1923), Italian civil engi ...
*
Multi-objective optimization Multi-objective optimization (also known as multi-objective programming, vector optimization, multicriteria optimization, multiattribute optimization or Pareto optimization) is an area of multiple criteria decision making that is concerned with ...
* Convex hull *
Nearest neighbor search Nearest neighbor search (NNS), as a form of proximity search, is the optimization problem of finding the point in a given set that is closest (or most similar) to a given point. Closeness is typically expressed in terms of a dissimilarity function ...
*
Selection algorithm In computer science, a selection algorithm is an algorithm for finding the ''k''th smallest number in a list or array; such a number is called the ''k''th ''order statistic''. This includes the cases of finding the minimum, maximum, and median e ...


References

Data management Query languages Relational database management systems SQL {{database-software-stub