Database Technologies - Data Warehousing:
Building, Using, and Managing a Data Warehouse
Unit aim and objectives
To describe how to plan the development of a data warehouse within an
After completing this unit you should be able to:
- Describe how to select your own data wharhouse project development
- Explain the process of obtaining a sponser...
- Define the objectives of the orgaization...
- Group your users into four categories...
- Understand each group of users and their information needs...
- Describe datamarts and enterprise models and briefly outline their
- Describe the factors involved when choosing a storage system and user
query tool the data warehouse platform...
When a data warehouse becomes an important means by which an organization's
information is accumulated, you know it is a success.
It is relied upon as the primary source for data for all users within the
The key to a successful data warehouse is to define the key objectives before it
This can be a difficult task.
Do you think the following statement is true or false?
The main objective of the data warehouse is to provide consitent, clear, and
The main objective of the data warehouse is to provide consistent, clear, and
See if you can choose the correct option to fill the blank.
The system's users will require "different" kinds of information.
The system's users will require diffenent kinds of information.
These information needs change over time and more information will be demanded as
the data warehouse becomes more successful.
Do you think that decision support requirements are static? NO! Of course not.
This makes defining decision support requirements more complicated because they
During the initial plainning stage, the first task is to establish the project team
that will implement the data warehouuse.
The team members may change over the course of the project - some people may be used
in the initial stages and others may be drafted at later stages.
Ideally, your project team should consist of 12 to 15 people, and have representatives
from each of the following categories:
- Data warehouse administrators (DWAs)
- Database Designers
- Database Archeologiest
- System Programmers
- Users and Trainers
The DWA is the project team leader responsible for coordinating the development team
and the entire data warehouse project.
The administrator can be a computer technician, a business mangager, or a mixture
Database designers are responsible for good physical and logical database design.
You need database designers who can communicate with both business individuals and
Database designers frequently become metadata managers, so they must be able to
communicate in both business and technical terms.
Database designers must be able to understand the organization so that they can
present its business in the data structures of the warehouse.
In many projects, over half of the team's time and effort is spent understanding the
Database archeologiests know the legacy systems inside and out, and you need them
to extract the data from these systems.
It is highly recommended that these people are included on your team to avoid delays
and increased costs.
System programmers will work with the database archeologists to ensure that the data
from the legacy systems moves smoothly into your warehouse.
Like database archeologists, system programmers are a necessary, benficial resource
for your team.
Users and trainers should come on board together as early as possible, enabling them
to establish a strong communication channel.
Therefore, their business knowledge (users) will complement the technical expertise
of the rest of your team.
The project team is responsible for gaining acceptance of the overall data
This is achieved by defining the technical architecture and approach for the warehouse,
defining a high-level outline, and initiating the pilot warehouse project.
It is very important that the data warehouse project obtain sponsorship. This is vital
to the long-term success of the warehouse, and is often overlooked in the initial
Allocate a long-term budget for the data warehouse.
Offset costs against benefits for the process.
The sponser is also reposible for setting up and chairing a steering committe to
prioritize needs. Resoveing conflicts until the steering committee is set up.
Apart from seeing a clearly defined business need for the data warehouse, sponsors
must also understand how data is created and distributed in the company.
They can then appreciate how this need can best be satisfied through the data
The effective sponsor of the data warehouse should reside at the highest levels of
the organization, where cross-divisional and longer-term investment views are
When planning the data warehouse, you need to be familiar with the data, databases,
object classes, and business items and rules represented in the system.
You need to understand all the major functions and processes that need to be
supported within the system.
You need to know where components of the system may reside or be used.
The pertinent users need to be identified so that interaction and security can be
In building a system, time is a key factor and yu need to know what events, along
with their timescales, are pertinent.
You need to understand the motivations behind the new system.
Do you think the following staement is true or false? Motivations must not affect
the design of rules and constraints for he system, of course this is false. Rules
or constraints for system design and operation are derived from motivations.
If motivations are ignored, the result may be a system that addresses the wrong
When analyzing the prospects of the data warehouse, you should ask yourself the
following questions. How easily can the data warehouse be created? Is there a degree
of resistance involved? How much support does the data warehouse require?
There are also some specific questions relating to the organization itself that you
should consider: Is the organization slow or quick to change and can you effect
change in the existing systems?
Is the organization an early or late adopter of technology?
Are the end-users empowered and encouraged to ask questions? The answers to these
questions will determine how you plan your project.
You should involve the managers when defining performance measures.
Do you think the following staement is true or false? You need to identify
quantifiable performance measers to show whether the organization is achieving
its goals. Of cousrs this is true. You need to identify quantifialbe
performance measures to show whether the organization is archieving its goals.
These performance measures are calculated using data available within the
For example, increaseing sales is a goal of any retail store.
Total monthly sales is a quantitative performance measure of progress towards
Monitoring the progress of data is a basic requirement in a decision support
system. You should document thoroughly and precisely the input data and
algorithms for each performance measure.
Performance measures should be provided reliably and quickly by the data
See if you can identify the correct approach for designing a data warehouse.
Talk to as many different users as possible, or limit the number of users you
intervie? Yes, limit the number of users you interview. So, when designing the
data warehouse, you should limit the number of users you talk to.
The more users you interview, the more different information needs you get, and
scope increases accordingly.
In time, it will be essential that everyone's needs are met by the data
Initially however, the development team should find out what key users need to
know. They can then identify which of the most important information needs can be
satisfied with the data available and build the system to deliver that informaton.
The proposal for the data warehouse should outline parameters of the data warehouse,
provided information relative to ist constraints (cost and benefits), and request
permission to proceed. Keep the proposal at a general level, as too much detail is
unnecessary and can be misleading.
The proposal for the data warehouse will be brought forward for review and comment
to the key business and IS team members. Particular business areas identified for
the pilot warehouse will bring about the inclusion in the team of key people in
After defining the objectives and passing the pilot data warehouse, your team can
develop a data model to support your findings.
Do you think the following statement is true or false? Designing a data warehouse
is an open-ended process, therefore you do not have to follow a set of predetermined
rules. This of course is a true statement. Designing a data warehouse is an
open-ended process - you do not have to follow a set of predetermined rules.
Your resources are your team, your users, the data, a server, and the tools.
This means you can design the schema that makes sense to your users.
It is best to design and build a data warehouse using incremental steps.
Try to decide which option goes in which blank. A "small" project is more
likely to succeed than a large project. This of occurs is correct. Small projects
tend to succeed where large ones fail because small projects are easier to manipulate
and produce results more quickly.
Do you think the development of a data warehouse is ever likely to be fully
completed? Of course not. Data warehouses are dynamic - therefore, you should take
this into account at the design stage. You need to make the warehouse flexible, by
picking tools and technologies that are modular, and using open standards as much
The success of a data warehouse means that the system will grow as users find new
needs for information.
The data warehouse team needs to encourage users to suggest changes. Users
requesting changes reflect a successful, dynamic system.