Data drilling (also drilldown) refers to any of various operations and transformations on tabular, relational, and multidimensional data. The term is used in various contexts, but is primarily associated with specialized software designed specifically for data analysis.
Common data drilling operations
There are certain operations that are common to applications that allow data drilling. Among them are:
Query operations:
* tabular query
* pivot query
Tabular query
Tabular query operations consist of standard operations on data tables.
Among these operations are:
* search
* sort
* filter (by value)
* filter (by extended function or condition)
* transform (e.g., by adding or removing columns)
Consider the following example:
Fred and Wilma table (Fig 001):
gender, fname, lname, home
male, fred, chopin, Poland
male, fred, flintstone, bedrock
male, fred, durst, usa
female, wilma, flintstone, bedrock
female, wilma, rudolph, usa
female, wilma, webb, usa
male, fred, johnson, usa
The preceding is an example of a simple flat file table formatted as comma-separated values. The table lists the gender, first name, last name, and home of various people named fred or wilma. Although the example is formatted this way, tabular query operations (like all data drilling operations) can be applied to any conceivable data type, regardless of the underlying formatting. The only requirement is that the data be readable by the software application in use.
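As a rough illustration of the tabular query operations listed earlier, the following Python sketch applies each of them to the Fred and Wilma table using only the standard library. The helper name load_rows, the variable names, and the specific search, sort, and filter criteria are assumptions chosen for this example; they are not part of any particular data drilling application.

```python
import csv
import io

# The Fred and Wilma table (Fig 001), copied as comma-separated text.
RAW = """gender, fname, lname, home
male, fred, chopin, Poland
male, fred, flintstone, bedrock
male, fred, durst, usa
female, wilma, flintstone, bedrock
female, wilma, rudolph, usa
female, wilma, webb, usa
male, fred, johnson, usa
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row (hypothetical helper)."""
    # skipinitialspace drops the blank that follows each comma in the table.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return list(reader)


rows = load_rows(RAW)

# search: the first row whose last name is "durst" (None if absent)
found = next((r for r in rows if r["lname"] == "durst"), None)

# sort: order rows by last name, then by first name
by_name = sorted(rows, key=lambda r: (r["lname"], r["fname"]))

# filter (by value): rows whose home is exactly "bedrock"
in_bedrock = [r for r in rows if r["home"] == "bedrock"]

# filter (by condition): rows whose last name starts with a letter after "f"
late_names = [r for r in rows if r["lname"][0] > "f"]

# transform: remove the gender column and add a derived full_name column
transformed = [
    {
        "fname": r["fname"],
        "lname": r["lname"],
        "home": r["home"],
        "full_name": f'{r["fname"]} {r["lname"]}',
    }
    for r in rows
]
```

Each operation returns a new list and leaves the original rows untouched, so the operations can be combined in any order.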
Pivot query
A pivot query allows multiple representations of data according to different dimensions. This query type is similar to tabular query, except it also allows data to be represented in summary format, according to a flexible user-selected hierarchy. This class of data drilling operation is formally (and loosely) known by different names, including crosstab query, pivot table, data pilot, selective hierarchy, intertwingularity and others.
To illustrate the basics of pivot query operations, consider the Fred and Wilma table (Fig 001). A quick scan of the data reveals that the table has redundant information. This redundancy could be consolidated using an outline, a tree structure, or in some other way. Moreover, once consolidated, the data could have many different alternate layouts.
Using a simple text outline as output, the following alternate layouts are all possible with a pivot query:
Summarize by gender (Fig 001):
female
    flintstone, wilma
    rudolph, wilma
    webb, wilma
male
    chopin, fred
    flintstone, fred
    durst, fred
    johnson, fred
(Dimensions = gender; Tabular fields = lname, fname;)
Summarize by home, lname (Fig 001):
bedrock
    flintstone
        fred
        wilma
Poland
    chopin
        fred
usa
    ...
(Dimensions = home, lname; Tabular fields = fname;)
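The outline layouts above can be produced by one general routine that takes the list of dimensions as a parameter. The following Python sketch is one possible way to do this, not a description of how any particular product implements pivot queries. The function name pivot_outline, its parameters, the four-space indent, and the case-insensitive ordering of group labels are all assumptions made for this example.

```python
# The Fred and Wilma table (Fig 001) as (gender, fname, lname, home) records.
COLUMNS = ("gender", "fname", "lname", "home")
RECORDS = [
    ("male", "fred", "chopin", "Poland"),
    ("male", "fred", "flintstone", "bedrock"),
    ("male", "fred", "durst", "usa"),
    ("female", "wilma", "flintstone", "bedrock"),
    ("female", "wilma", "rudolph", "usa"),
    ("female", "wilma", "webb", "usa"),
    ("male", "fred", "johnson", "usa"),
]
ROWS = [dict(zip(COLUMNS, rec)) for rec in RECORDS]


def pivot_outline(rows, dimensions, fields, indent="    "):
    """Return outline lines that group rows by each dimension in turn.

    Hypothetical helper: `dimensions` are the grouping columns, outermost
    first, and `fields` are the tabular columns shown at the leaf level.
    """
    lines = []

    def walk(subset, dims, depth):
        if not dims:
            # Leaf level: one line per row, in original table order.
            for r in subset:
                lines.append(indent * depth + ", ".join(r[f] for f in fields))
            return
        head, rest = dims[0], dims[1:]
        # Group labels are ordered case-insensitively (an assumption).
        for value in sorted({r[head] for r in subset}, key=str.lower):
            lines.append(indent * depth + value)
            walk([r for r in subset if r[head] == value], rest, depth + 1)

    walk(rows, list(dimensions), 0)
    return lines


# Dimensions = gender; Tabular fields = lname, fname
by_gender = pivot_outline(ROWS, ["gender"], ["lname", "fname"])

# Dimensions = home, lname; Tabular fields = fname
by_home = pivot_outline(ROWS, ["home", "lname"], ["fname"])

print("\n".join(by_gender))
```

Because the hierarchy is just an argument, switching from one layout to another requires no change to the data or to the routine itself, only a different list of dimensions.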
Uses
Pivot query operations are useful for summarizing a corpus of data in multiple ways, thereby illustrating different representations of the same basic information. Although this type of operation appears prominently in spreadsheets and desktop database software, its flexibility is arguably under-utilized. Many applications allow only a 'fixed' hierarchy for representing data, which is a substantial limitation.
Drillup
Drillup is the opposite of drilldown. For example, after drilling down to see the revenue of one product, a user might drill up to see the revenue of all products.
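The revenue example can be sketched in a few lines of Python. The sales records, product names, and the function total_revenue below are entirely hypothetical and serve only to show the two directions of movement between a detailed view and a summary view.

```python
# Hypothetical sales records: (product, region, revenue). Not real data.
SALES = [
    ("widget", "north", 120),
    ("widget", "south", 80),
    ("gadget", "north", 200),
    ("gadget", "south", 50),
]


def total_revenue(records, product=None):
    """Sum revenue for one product (drilled down) or for all products (drilled up)."""
    return sum(
        revenue
        for name, _region, revenue in records
        if product is None or name == product
    )


# Drill down: revenue of a single product.
widget_revenue = total_revenue(SALES, product="widget")

# Drill up: revenue of all products.
all_revenue = total_revenue(SALES)
```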