Alpha School Supplies Information:
Alpha is an
organization that buys and sells school supplies, books, teaching
aids, games, furniture, etc. to school districts, teachers,
educators, parents, and students. Alpha School Supplies
sells supplies in all fifty states via brick and mortar
store fronts and sells on the Web World Wide, with its
e-commerce site.
|
|
|
Alpha Lab Project:
At a minimum,
- The team will develop a dimensional data model,
- Translate the dimensional model into a dimensional
data warehouse,
- Populate the data warehouse with data and
- Demonstrate a drill down,
- Drill up, and
- Pivot capability within the data warehouse.
Drill DOWN -- means nothing more then give more
detail by, for instance, adding a row header (attribute)
to an existing SQL request.
Drill UP -- if drilling down is adding row headers
from the dimension tables, then drilling up is subtracing
row headers (attributes) from the SQL requests. Of course,
it is not necessary to subtract the row headers in the
same order that they were added. In general, each time
a user adds or subtracts a row header, a new multi-table
join query must be launched.
Pivoting -- Pivoting is the basis of multidimensional
analysis. The row-based "results set" that SQL generates
almost always end up being presented in a format with one
or more dimensions displayed across the top of the report
and one or more down the side. The report title usually
gives it away(e.g., monthly sales report by region, or
monthly sales by sales rep by product.
A crosstab query, sometimes known as a pivot query,
groups the data in a slightly different way from those
we have seen with standard SQL queries. A crosstab query
can be used to get a result with three rows (in this
case one for each project shown below), with each row
having three columns (the first listing the projects and
then one column for each year) -- like this:
Project 2001 2002
ID CHF CHF
----------------------------------
100 123.00 234.50
200 543.00 230.00
300 238.00 120.50
Example:
Let's say you want to show the top 3 salary earners
in each department as columns. The query needs to
return exactly 1 row per department and the row would
have 4 columns. The DEPTNO, the name of the highest
paid employee in the department, the name of the next
highest paid, and so on. Using Oracle analytic
functions this is almost easy, without analytic
functions this would be virtually impossible.
SELECT DEPTNO,
MAX(DECODE(seq, 1, ENAME, null)) FIRST,
MAX(DECODE(seq, 2, ENAME, null)) SECOND,
MAX(DECODE(seq, 3, ENAME, null)) THIRD
FROM (SELECT DEPNO, ENAME, row_number()
OVER (PARTITION BY DEPTNO
ORDER BY SAL desc NULLS LAST) seq
FROM emp)
WHERE seq <= 3
GROUP BY DEPTNO
/
Results:
DEPTNO FIRST SECOND THIRD
------ ----- ------ -----
10 KING CLARK MILLER
20 SCOTT FORD JONES
30 BLAKE ALLEN TURNER
|
In order to receive the grade for the lab everyone must
complete a group project. The objective of the project is
to design a star schema data model, implement the model,
and prepare a final report. Additional details of the
requirement are documented in a separate posting. The
final report is due at 11PM on Thursday, May 6, 2004.
The objective of the project is to develop a VERY SIMPLE data
warehouse schema, which consists of one FACT table and four
DIMENSION tables.
Group Project Requirements:
At the minimum, the final report should contain the
following:
COVER PAGE
ABSTRACT
INTRODUCTION
- Write a problem statement and requirements for your
data warehouse (DW), including analyses that the DW
generates.
BODY
- Project management. This section should document a
project plan which includes work breakdown structure
(WBS) that shows how your team completed the project.
Identify roles and responsibilities of all team
members. Use Microsoft Project or a tool of your
choosing.
- Requirement analysis. This section should include
description of the data sources. Demonstrate how you
analyzed the requirements documented in the
introduction.
- Data model. This section should include the
logical modeling and physical modeling processes. The
DW should have 4-6 dimensions and 1 fact table. You
can use ERwin or a tool of your choosing. Identify
the DDL scripts, and include them in the
Appendix.
- ETL. This section should include identification of the
data sources, data map, and the methodology used to
load the data. Identify and describe the DML,
SQL*Loader scripts. Include the actual scripts in
the Appendix.
- Report analysis. This section should include all the
report scenarios that demonstrate how your team met the
requirements identified in the introduction. At least
4 scenarios should be done. Identify the SQL scripts,
and include them in the Appendix.
- Lesson learned. Include any problems and
issues.
CONCLUSION
APPENDIX
- Document all the necessary DDL, DML, SQL*Loader, and
queries scripts that support the report. Make sure that
the order of the scripts is in proper sequence so that
they will load without generating errors. For example,
load the parent table before load the child table. We
will execute the scripts to verify their accuracy. In
an event that your scripts expand beyond 50 pages, you
may choose to only list the descriptions of the scripts
in the Appendix, and include the actual scripts in a
separate zipped file.
|
|