Partitioning Data

Filed under: General — admin at 3:30 pm on Saturday, June 28, 2008

In anything other than a small data warehouse the fact data tables can become extremely large. This poses a number of problems for management, backup, and daily processing. These problems are avoided by partitioning each fact table into multiple separate partitions. To do this effectively you need to ask the question: How active is the information? If we consider how many organizations use decision support data, we shall expect to find that a majority of the questions are asked concerning recent history. This represents the period of time that is relevant to the decision-making process. This statement is particularly true for canned queries, where the queries themselves tend to be fairly run-of-the-mill and repetitive. For example, in a retail operation, a merchant will tend to ask the same kinds of question day in, day out, such as queries relating the sales performance of the products that he or she is responsible for.

Queries of this nature tend to operate on the. basis of year-on-year comparisons of sales transactions over a relatively short historical period. Week to date, month to date, six weeks to date, are all examples of time periods relevant to the business. Bear in mind that the actual week/month will vary at the start of a new week/month. In other words, at the start of every week, “week to date” represents a different set of data.