6 data warehouse design mistakes to avoid

Although difficult, flawless data warehouse design is a must for a successful BI system. Avoid these six mistakes to make your data warehouse perfect.

A data warehouse lies at the foundation of any business intelligence (BI) system. But if this foundation is flawed, the towering BI system cannot possibly be stable. Given BI’s importance as a decision enabler today, such flaws are undesirable. For a flawless data warehouse design and process, avoid the following common mistakes.  

Mistake 1: Basing data warehouse design entirely on current business needs  

Investments in data warehouse design and process rarely bear fruit in the short term, and hence it would be inappropriate to be guided purely by current business requirements. Ideally, a three to five year information management roadmap of the organization should be factored in when creating the data warehouse design, paying as much attention to business strategy as to technical aspects.

Mistake 2: Negligence while creating the metadata layer

While designing a data warehouse, poor design of the metadata has far-reaching implications. Metadata is the integrator between data models, extract, transform, load (ETL) and BI. However, the metadata layer often is created only to fit short-sighted data criteria and its documentation is haphazard.

It is necessary to add descriptions to the tables or columns at the data warehouse design stage itself. When business users reject BI reports they cannot decipher, the problem can usually be traced to poorly designed data models that lack easy-to-understand descriptions, and have inconsistent naming conventions. Prevent this by setting the appropriate metadata strategy at the data modeling stage of data warehouse design.

Mistake 3: Underestimating the value of ad hoc querying and self-service BI

Generating a simple report can sometimes expend considerable bandwidth and be a drain on productivity for the IT team. However, a self-service BI can simplify the task by relying on the metadata layer to generate the reports, without affecting the sanctity of the underlying data model. For example, to produce a report on top-performing branches, the user simply selects hierarchical columns with the titles “region,” “branch,” “number of customers” and “relationship packages.” Thus, incorporation of a self-service or ad hoc query layer in the data warehouse process can help you gain user acceptance.

Mistake 4: Preferring visual appeal to speed

When developing the reporting layer of a data warehouse, the design should focus on ease of use and speedy action Although business users tend to be swayed by fancy charts and reports, do not succumb to the temptation of sacrificing speed at the altar of beautification. Indeed, in the course of data warehouse project implementation, IT teams have noted that speedy response time in report generation is of prime importance for the BI system to gain popularity amongst users. To give a comparison, if a simple report takes a couple of seconds to load, the chart may take three minutes. Speed should therefore be given priority over visual appeal when designing data warehouse process.

Mistake 5: Disregarding data quality prior to finalization of data warehouse design

A large amount of aggregation takes place at the data mart level. The data warehouse is the source of data, and the data contained therein should be clean and accurate. If not, the output from the system is likely to show discrepancies, and the data warehouse design as well as process, is unfairly thought of as the culprit.

It must be understood that discrepancies usually creep in at transactional data levels. For instance, consider a bank customer whose transaction details are captured correctly, but whose mobile number on record is incorrect. A change in the mobile number could have occurred subsequent to recording the original form. This example relates to dimensional data. In such cases, it may not be feasible to stop the data warehouse process and cleanse the data.

Rather, active monitoring of dimensional data should be incorporated right at the data warehouse design stage. Quality control is governed by usage too, and the business could point out faults that the technology may forego; hence, the data warehouse design must provide for strong data governance processes in order to maintain clean data.

Mistake 6: Data warehousing is considered to be an IT initiative.

Focusing on data warehouse implementation as a pure IT project can amount to diluting its essence. Ideally, the benefits of a data warehouse design and process have to be quantified; if not as monetary returns on investments, then at least in terms of growth in business (defined by key performance indicators). This growth can be viewed and measured in terms of increased usage of the BI system and efficient data management, leading to smoother operations, and increased efficiency for top-level management.


About the author: Amit Agarwal is a seasoned BI professional, heading the APAC and MENA businesses at iCreate, a specialist provider of packaged data warehousing and analytical solutions to the global banking industry. 

(As told to Sharon D’Souza)



<< EDW Business Case

DW best practices >>


Read more on Data warehousing

Data Center
Data Management