Data Warehousing 101

Most of you already know why you need a warehouse. But for those of you that are new to warehousing let’s cover the highpoints.

First, a caution:

As a business leader, your focus with a BI/DW project is business insight – the Business Intelligence part of the BI/DW project. Your primary concern: How easily can I get at my data and how will it be presented to me? And rightfully so. That is the part of the system that most people will use.

Unfortunately, no reasonable BI system can be built without an accurate, reliable, and timely Data Warehouse to use as its data source. If you don’t get the database right, you are building on an unstable foundation. Be very wary of BI Tool vendors that claim that they can use whatever data you have wherever it is. That may be true in a purely technical sense, but we have never, ever seen that approach work well.

What is a Data Warehouse?

The simplest answer is that a Data Warehouse is a database. But all databases are not Data Warehouses. What is different about a Data Warehouse? The data model design is optimized for reporting and analysis.

Most businesses have some sort of transaction processing system. The data model of the database behind that system is optimized for transaction flow - processing large numbers of transactions as quickly as possible. All of the transactional data is there but running queries against that database for reporting or analysis purposes can be very challenging. Typical queries against a transactional system will require joins across many tables and very challenging logic in order to bring the data together in a meaningful way.

As a result, many businesses build a Data Warehouse to extract data from their transaction processing system and bring it together into a model that is easier to navigate with reporting and query tools.

Why would I need a Data Warehouse?

Business leaders need facts in order to make informed decisions. As Peter Drucker famously observed: "You can't manage what you can't measure."  What facts are needed to determine the health of your specific business will vary, but undoubtedly there is some set of meaningful metrics that can be used to help you make the best possible short and long term decisions.

Once you understand what these key metrics are, you need a way to collect the underlying data and calculate the metrics in a reliable and timely manner.

If your business is small enough, you may be able to do this without a warehouse. But as a business grows, data tends to get spread across multiple systems at multiple locations. Data collection across these systems is often a manual process. Different people collecting and gathering data can sometimes use different methodologies. Another common problem is that the resource usage of reporting queries can overwhelm your processing systems. All of these challenges can affect the reliability and timeliness of your business metrics.

How do I know when I need a Data Warehouse?

If you find that you are having reliability and timeliness challenges collecting your key business metrics you are a prime candidate for a warehouse.

One of the great benefits of building a warehouse is that the extraction of data and calculation of metrics is streamlined and automated. An equally important benefit is that the data can be cleansed, standardized, and quality controlled. Reliability. Timeliness.

What are some common uses of a Data Warehouse?

  • Understanding your customers.
  • Finding new customers.
  • Measuring the effectiveness of marketing campaigns.  And driving those campaigns.
  • Understanding your operational effectiveness.
  • Data mining – i.e. finding relationships in your business data that you did not know existed.

What do I need to build a Data Warehouse?

  • An initial attainable goal for the warehouse that will allow you to put some scope around the project and obtain funding.
  • A hardware platform – Server, Data Storage, and Network along with the infrastructure to properly manage and secure it.
  • An appropriate database platform that can handle the load.
  • An ETL (Extract, Transform, Load) Tool. ETL tools allow you to move data from your various data sources into the warehouse in a very efficient manner.
  • One or more tools to access the data in the warehouse –query tools, reporting tools, cube builders, statistical analysis packages, campaign management tools, data mining tools… What you will need will vary depending on what your business goals are.
  • Expertise to design, build, and maintain the warehouse.

How do I get started?

If you have never built a warehouse before it is unlikely that you will be able to tackle a warehouse project on your own. Warehouse Business Analysis, Warehouse Data Modeling, ETL Development, and Warehouse Database Administration are all specialty areas. Whether you hire your own full-time employees or bring in consultants like us, definitely get some expertise on board before you get started.

Warehouse projects are a significant investment and quickly become a critical piece of your business infrastructure. Many warehouse project fail for a variety of reasons. Reduce your risk. Bring warehouse expertise in that has a proven track record and knows how to guide you around the potential pitfalls. At Birch Hill we only do warehouses. We are the Data Warehousing Experts.