The following are some suggestions for the warehouse builder. These are points I rarely see discussed, or not discussed enough, in the barrage of articles about data warehousing.
From day one establish that warehousing is a joint user/builder project
Warehouse projects will fail if the builders get specs from the users, go off for 6 months, and then come back with the 'finished' project. Warehouses are iterative! (I think the word iterative means there are lots of mistakes in the projects.) Having builders and users work with each other will not reduce the number of iterations, but it will reduce their size. By the way, see Peter Block's Flawless Consulting for a great discussion of how to bring about 'joint' projects.
Establish that maintaining data quality will be an ONGOING joint user/builder responsibility
Organizations undertaking warehousing efforts almost continually discover data problems. Best to establish right up front that this project is going to entail some additional ongoing responsibility.
Train the users one step at a time
Typically users are trained once. In several days they learn the basics, the intermediate features, and sometimes the advanced aspects of using a tool. Slow down! Consider initially providing training only in the minimum the user needs to get something useful from the tool. Then let the user use the tool for a while (meaning several days, weeks, or months). Having had basic training and some hands-on experience, the user will have a much better context in which to grasp the next level. Also, once the basics and the next level are learned, keep training the users! After a year of using the tool, schedule advanced training.
Train the users about the data stored in the data warehouse
Users often need more training about the stored data than about the tools used to access the data. Do not assume the data are self-explanatory or that any metadata you may provide will answer every question. Note that users are often used to seeing data in canned reports, and seeing data in its "raw" form can be confusing.
Consider doing a high level corporate data model / data warehouse architecture "exercise" in three weeks
Actually, the key point regarding time is to "time-box" the exercise into a relatively short time. After about three weeks, the marginal benefits from additional time devoted to these types of exercises rapidly decrease. The corporate model is going to identify, at a high level, subjects and relationships and, most importantly, which chunks of information it makes sense to deliver in different projects. The architecture part of the exercise is to determine the dimensions, definitions of derived data, attribute names, and information sources that you will attempt to use consistently in your data warehousing efforts. The exercise also consists of coming to an agreement as to how to keep the corporate model up-to-date and how to make sure future data warehousing efforts pay attention to the architectural principles.
Implement a user accessible automated directory to information stored in the warehouse
The majority of successful warehousing efforts I have seen included providing some means for the warehouse user to locate stored information. Most of the time this involved building a separate database with directory information. And most of the time, a pretty simple database sufficed for initial use.
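To make the "pretty simple database" idea concrete, here is a minimal sketch of such a directory using Python's built-in sqlite3 module. The table name warehouse_directory, its columns, and the helper functions are my own assumptions for illustration; a real directory would follow whatever naming your shop already uses.

```python
import sqlite3


def build_directory(conn):
    # Hypothetical layout: one row per warehouse column, with a plain-English description.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS warehouse_directory (
               table_name    TEXT NOT NULL,
               column_name   TEXT NOT NULL,
               description   TEXT NOT NULL,
               source_system TEXT,
               PRIMARY KEY (table_name, column_name)
           )"""
    )


def register(conn, table_name, column_name, description, source_system=None):
    # Add or update one directory entry.
    conn.execute(
        "INSERT OR REPLACE INTO warehouse_directory VALUES (?, ?, ?, ?)",
        (table_name, column_name, description, source_system),
    )


def search(conn, keyword):
    # Let a user find where a piece of information lives by keyword.
    pattern = f"%{keyword}%"
    return conn.execute(
        """SELECT table_name, column_name, description
             FROM warehouse_directory
            WHERE description LIKE ? OR column_name LIKE ?
            ORDER BY table_name, column_name""",
        (pattern, pattern),
    ).fetchall()
```

A user looking for revenue figures would call search(conn, "revenue") and get back the tables and columns whose name or description mentions it.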
Once you know what raw data you want to feed into the data warehouse, request that data
If you have done some reading on data warehouse development you have probably read that figuring out the process of extracting, transforming, and loading (ETL) usually takes the majority of the time in initial data warehouse development. In project management lingo, figuring out ETL is usually on the critical path. If you know what raw data you need, request it as soon as you know it. You are probably going to have to ask one of the programmers of the legacy feeder systems to initially get this data for you. For reasons of politics, overwork, and just plain lack of knowledge of how data are physically stored in a system, the feeder system programmer often can take a while to get you that data.
Determine a plan to test the integrity of the data in the warehouse
Do not underestimate the importance of user faith in the integrity of the warehouse data. Huge warehouse efforts quickly go sour if users find multiple mistakes after system roll-out. A good investment of time in the initial stages of a warehouse project is for the builder and user to jointly determine what checks will be made on the warehouse data during development and what checks need to be made on an ongoing basis. The checks include tying warehouse data controls back to controls in feeder systems, checking the correctness of aggregation logic, and testing whether classification codes were assigned correctly.
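As a rough illustration of tying warehouse controls back to feeder controls, the sketch below compares control totals keyed by period. The function name, the dictionary inputs, and the tolerance value are assumptions made for the example; in practice the totals would come from queries against the feeder system and the warehouse.

```python
def reconcile_control_totals(feeder_totals, warehouse_totals, tolerance=0.005):
    """Return (key, feeder_total, warehouse_total) for every key that does not tie out."""
    discrepancies = []
    # Look at every key seen on either side, so missing periods are reported too.
    for key in sorted(set(feeder_totals) | set(warehouse_totals)):
        feeder = feeder_totals.get(key)
        warehouse = warehouse_totals.get(key)
        if feeder is None or warehouse is None or abs(feeder - warehouse) > tolerance:
            discrepancies.append((key, feeder, warehouse))
    return discrepancies
```

An empty list means every control total tied out. Anything else is a list of periods for the builder and user to investigate together.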
From the start get warehouse users in the habit of 'testing' complex queries
Many people will assume that the query result is correct. At the very least, get the user in the habit of eyeballing the query or report to check if several records that should be included are, in fact, included and that several records that should not be included are, in fact, not included.
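The eyeballing habit described above can also be written down as a tiny check. This is only a sketch: the function name and the idea of passing record keys as plain lists are my assumptions, and the user still has to pick a few records they know should and should not appear.

```python
def spot_check(result_keys, must_include, must_exclude):
    """Compare a query result against a few records the user already knows about."""
    found = set(result_keys)
    # Records that should have been returned but were not.
    missing = sorted(set(must_include) - found)
    # Records that should not have been returned but were.
    unexpected = sorted(set(must_exclude) & found)
    return missing, unexpected
```

If both returned lists are empty, the query passed the spot check. That does not prove the query is correct, but it catches many wrong filters and joins.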
Coordinate system roll-out with network administration personnel
Use of data warehousing systems can bring about some strange spikes in network activity. If you keep network administration people informed of the roll-out schedule, chances are they will monitor network activity for you and be ready to make adjustments to the network as necessary.
Have a good grasp of desktop databases and spreadsheets
Even if you are dealing with a 100 TB database, there are so many little tasks to be done in a data warehousing project where knowledge of these tools will be helpful. Skillful use of these tools during development can be a huge productivity enhancer.
Understand that the spreadsheet is your users' primary analytical tool
That is the analytical tool most users are most familiar with. Be prepared to build in capabilities that amplify the power of spreadsheets.
Be prepared to support beginning users immediately and at any time
We developers often greatly underestimate users' hesitation to begin using the data warehouse. This hesitation could be because of user fear of technology or user fear that they will not get IS support. So, the first point is to be available to help when the user wants to try the data warehouse for the first time. Users also may want to use the data warehouse for the first time during the weekend or at 6:00 in the morning or 8:00 at night. There are fewer distractions at those times. If you want to make that beginning user a committed customer of your data warehouse, you had better be available to support the user when he starts out, whatever the day or the hour.
Maintain the audit trail to the feeder systems
That is, make it as easy as possible to tie the data in the data warehouse to the feeder systems. Your users have to trust the numbers in the data warehouse. You owe this to the users in order to maintain their trust.
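One common way to keep this audit trail is to carry source identifiers on every warehouse row. The sketch below uses Python's built-in sqlite3 module; the sales_fact table and the source_system, source_record_id, and load_batch_id columns are hypothetical names chosen for the example, not a prescribed design.

```python
import sqlite3


def create_fact_table(conn):
    # Each warehouse row records which feeder system and record it came from.
    conn.execute(
        """CREATE TABLE sales_fact (
               sale_id          INTEGER PRIMARY KEY,
               amount           REAL NOT NULL,
               source_system    TEXT NOT NULL,
               source_record_id TEXT NOT NULL,
               load_batch_id    TEXT NOT NULL
           )"""
    )


def trace_to_feeder(conn, sale_id):
    # Return (source_system, source_record_id, load_batch_id), or None if the row is unknown.
    return conn.execute(
        """SELECT source_system, source_record_id, load_batch_id
             FROM sales_fact
            WHERE sale_id = ?""",
        (sale_id,),
    ).fetchone()
```

With columns like these, a user who questions a number can be pointed to the exact feeder record and load that produced it.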
Market and sell your data warehousing systems
For the most part, use of data warehousing systems is optional. This means you have to identify the potential users of the systems, help them understand what the benefits of the system are, and then make them want to keep coming back to use the system.
Saturday, October 25, 2008
Actions for Data Warehouse Success
Posted by thirupal at 10:59 PM
Labels: datawarehousing