Saturday, October 25, 2008

The Case Against Data Warehousing

The literature is full of testimonials for data warehousing. There is almost nothing about the arguments against data warehousing. In this paper I attempt to slightly fill that void by shedding light on business and cultural factors that greatly lessen the value of data warehousing for certain organizations. By the way, when I refer to data warehousing, I refer to both centralized data warehousing systems and data marts.

Some of the reasons data warehousing efforts may not be appropriate for certain organizations are:
Data warehousing systems, for the most part, store historical data that have been generated in internal transaction processing systems. This is a small part of the universe of data available to manage a business. Sometimes this part has limited value.

That is, sometimes the business end user community does not have a strong interest in old transaction processing system data beyond what is available in basic reports generated in transaction processing systems. This lack of interest often stems from the fact that the markets in which a business competes are in great flux or that the internal structure of the organization is in perpetual transition. If these conditions exist, there may not be a solid historical base to compare current performance with. Also, sometimes there is a lack of interest in looking at this data in any in-depth way because a business is so simple that a data warehouse is overkill.
Data warehousing systems can complicate business processes significantly.

Though the interest in business process reengineering seems to have waned, some of the appreciation of how complicated processes can slowly strangle a business has remained. Data warehousing, if unchecked, can foster the "institutionalization" of easily created reports whose reason for being is quickly forgotten while people still toil to process these reports. If your organization does not know how to throw out processes (pardon my calling producing, distributing, and reading a report a "process"), data warehousing can quickly add clutter to the business environment.
If most of your business needs are to report on data in one transaction processing system and/or all the historical data you need are in that system and/or the data in the system are clean and/or your hardware can support reporting against the live system data and/or the structure of the system data is relatively simple and/or your firm does not have much interest in end user ad hoc query/report tools, data warehousing may not be for your business.

Whew! You can say that again. Anyway, you may find that the more of these conditions are met, the less value data warehousing adds to your firm. And once you get away from the big "Fortune 500, centralized IS" type shops that most of the data warehousing vendors slant their marketing to, these conditions describe the reporting needs of many firms.
Data warehousing can have a learning curve that may be too long for impatient firms.

Despite the speed of the data warehousing development effort, it takes time for an organization to figure out how it can change its business practices to get a substantial return on its data warehousing investment. I speculate that rigorous analysis of the return on most of the major data warehousing implementers' investments would find a much longer average payback period than you would surmise from reading the trade press.
Data warehousing can become an exercise in data for the sake of the data.

Organizations find that there are unlimited opportunities to add data to their data warehouse. Data warehouses, like most other complex systems, take on a life of their own. Unfortunately, adding data without questioning the business value of the data can lessen the business value of the data warehouse and quickly increase the cost of maintaining the data warehouse.
In certain organizations ad hoc end user query/reporting tools do not "take".

This is of concern to organizations that believe they can get their return on investment by having users write many of their own queries and reports. In some firms there are profound cultural barriers in the business organization to the acceptance of a tool that allows a person to ask questions on his own. Trying to promote the use of such a tool in these organizations is setting yourself up for failure. Or, sometimes these tools do not take because a business is so complicated that only relatively simple reports with little business value can be written by end users.
Many "strategic applications" of data warehousing have a short life span and require the developers to put together a technically inelegant system quickly. Some developers are reluctant to work this way.

Again, the importance of the culture should not be underestimated. This time, though, the issue is in the IS organization. If the selling point of your data warehousing project is the ability to do this strategic work (which is probably now being done by your users with large and complex spreadsheets) as opposed to the usual development of canned and semi-canned reports and queries, ask yourself if the IS culture can accept this mode of working. For many organizations this approach to systems work is much harder to accept than most people realize.
There is a limited number of people available who have worked with the full data warehousing system project "life cycle".

I refer to the availability of both employees and consultants. Systems of some depth require a considerable amount of time to develop fully. In other words, it takes a long time to gain experience with the usual problems that develop at different phases of a data warehousing effort. You should be wary of a consultant who says he has experience implementing scores of data warehouses in a couple of years. Usually this experience will be with a well-defined part of a data warehousing project that was amenable to outsourcing or with minor projects.
Data warehousing systems can require a great deal of "maintenance" which many organizations cannot or will not support.

Despite the best efforts to architect a system so "maintenance" demands are minimized (I use quotation marks because there often seems never to be the closure to the initial data warehousing effort that the term "maintenance" implies), many systems by their very nature require a great deal of care and feeding once they are in "production". It is important to note that the more successful a warehouse is with the users, the more maintenance it may require. Organizations that cannot or will not staff to meet these maintenance demands should think twice before they jump into the data warehousing business. By the way, it's very easy for the users to quickly go sour on a system they were enthusiastic about at roll-out time if the system personnel do not support the maturing of the system.
Sometimes the cost to capture data, clean it up, and deliver it in a format and time frame that is useful for the end users is too much of a cost to bear.


The percentage of time that must be devoted to extracting, cleaning, and loading data has been well discussed in the literature. It should be pointed out that there are some potential "show-stoppers" in these efforts. Loading data from previous years can require the knowledge of transaction processing system developers who have long since moved on. Cleaning data so they are in a form that is acceptable to users from different functional areas may require arbitration skills the typical data warehousing developer may not possess. Finally, data may have to be loaded into a data warehousing system in a processing window that just isn't big enough. Sometimes compromises are acceptable get-arounds. Often, though, compromises end up substantially compromising the value of the information in the data warehouse.
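To make the processing window problem concrete, here is a rough back-of-the-envelope sketch. The function names, the row count, and the throughput figure are all made up for illustration; your own numbers would have to come from measuring your actual load jobs.

```python
def required_load_hours(rows: int, rows_per_second: float) -> float:
    """Estimate how many hours a load takes at a steady throughput.

    Both inputs are hypothetical planning figures, not measurements.
    """
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return rows / rows_per_second / 3600


def load_fits_window(rows: int, rows_per_second: float, window_hours: float) -> bool:
    """Return True if the estimated load time fits inside the processing window."""
    return required_load_hours(rows, rows_per_second) <= window_hours


# Illustrative numbers only: 36 million rows at 2,000 rows per second.
hours = required_load_hours(36_000_000, 2_000)
print(f"Estimated load time: {hours:.1f} hours")
print("Fits a 6-hour window:", load_fits_window(36_000_000, 2_000, 6))
print("Fits a 4-hour window:", load_fits_window(36_000_000, 2_000, 4))
```

A steady throughput is an optimistic assumption; index rebuilds, cleaning steps, and reruns after failures all eat into the window, so a load that only barely fits on paper probably does not fit in practice.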



You may have gotten the impression from reading the trade press that data warehousing is only for large organizations because it requires huge staffs and huge budgets. Well, most of the trade press is dominated by vendors/consultants/publications trying to market to large organizations with huge staffs and huge budgets. Though I have no way to prove this, in terms of numbers, I think most data warehousing efforts are done by small staffs with modest budgets. In fact, smaller organizations are probably much more "into" data warehousing than larger organizations. It is only recently that practical technology for huge organizations that lust for multi-terabyte databases has become available. The technology for more modestly sized data warehouses, on the other hand, has been available for many years.

Finally, you may have seen articles that state that data warehousing failure rates are between 10% and 90%. Though how these failure rates are determined is suspect, there is no denying that data warehousing is risky. Now the fact that these efforts are risky does not bolster the case against data warehousing. Data warehousing has not repealed the positive relationship between risk and expected return in capital projects. However, if your organization does not know how to manage risky projects, then data warehousing may not be for you.
