''The single most dramatic way to affect performance in a large data warehouse is to provide a proper set of aggregate (summary) records that coexist with the primary base records. Aggregates can have a very significant effect on performance, in some cases speeding queries by a factor of one hundred or even one thousand. No other means exist to harvest such spectacular gains.''

Maintaining aggregates alongside atomic data increases the complexity of the dimensional model. This complexity should be transparent to the users of the data warehouse: when a request is made, the warehouse should answer it from the table with the appropriate grain. Aggregate navigator functionality is therefore implemented to route each request to the table with the correct grain. The number of possible aggregations is determined by every possible combination of dimension granularities. Because building every possible aggregation would produce substantial storage and maintenance overhead, only a subset of aggregations is normally built. The best way to choose this subset is to monitor queries and design aggregations that match observed query patterns.
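To make the combinatorics concrete, the following sketch enumerates every candidate aggregation for a small hypothetical schema; the dimension names and granularity levels are invented for this illustration:

```python
from itertools import product

# Hypothetical dimensions, each listed from finest to coarsest grain.
# A level of None means the dimension is rolled up entirely (dropped).
dimensions = {
    "date":    ["day", "month", "year", None],
    "product": ["sku", "brand", None],
    "store":   ["store", "region", None],
}

# Every combination of one grain per dimension is a candidate aggregate;
# the fully detailed combination is the base fact table itself.
combinations = list(product(*dimensions.values()))
print(len(combinations))  # 4 * 3 * 3 = 36 candidate grains

for grain in combinations[:3]:
    print(dict(zip(dimensions, grain)))
```

Even this toy schema yields 36 candidate grains (one of which is the base table itself), which illustrates why building all possible aggregations is impractical and why only those matching observed query patterns are materialized.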
Aggregate navigator
Having aggregate data in the dimensional model makes the environment more complex. To keep this extra complexity transparent to the user, functionality known as aggregate navigation is used to query the dimension and fact tables at the correct grain level. The aggregate navigator essentially examines each query to see whether it can be answered using a smaller, aggregate table; a sketch of this decision logic appears after the list below. Implementations of aggregate navigators can be found in a range of technologies:
* OLAP engines
* Materialized views
* Relational OLAP (ROLAP) services
* BI application servers or query tools
It is generally recommended to use one of the first three technologies, since the benefit of the last approach is restricted to a single front-end BI tool.
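As a minimal sketch of the idea (not any particular product's implementation), the following Python routine picks the smallest table whose retained attributes cover what a query asks for. The table names, row counts, and attribute sets are invented, and flat attribute sets stand in for the hierarchy metadata a real navigator would use:

```python
# Candidate tables, each described by the dimension attributes it can
# serve (including attributes derivable by rolling up its stored grain)
# and an approximate row count. All names and figures are hypothetical.
TABLES = [
    {"name": "sales_fact",  # base table: can serve any attribute
     "grain": {"day", "month", "year", "sku", "brand", "store", "region"},
     "rows": 1_000_000_000},
    {"name": "sales_by_month_brand",
     "grain": {"month", "year", "brand"},
     "rows": 5_000_000},
    {"name": "sales_by_year_brand",
     "grain": {"year", "brand"},
     "rows": 200_000},
]

def navigate(requested: set[str]) -> str:
    """Return the smallest table that covers the requested attributes."""
    candidates = [t for t in TABLES if requested <= t["grain"]]
    if not candidates:
        raise ValueError(f"no table can answer {requested}")
    return min(candidates, key=lambda t: t["rows"])["name"]

# A query grouped by year and brand is routed to the smallest aggregate;
# one needing daily, per-store detail falls back to the base fact table.
print(navigate({"year", "brand"}))  # -> sales_by_year_brand
print(navigate({"day", "store"}))   # -> sales_fact
```

The selection rule is the essence of aggregate navigation: among all tables that can answer the query, prefer the one with the fewest rows.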
Problems/challenges
* Since dimensional models only gain from aggregates on large data sets, at what data set size should one start considering aggregates?
* Similarly, is a data warehouse always handling data sets that are too large for direct queries, or is it sometimes a good idea to omit the aggregate tables when starting a new data warehouse project? Will omitting aggregates in the first iteration of building a new data warehouse make the structure of the dimensional model simpler?