Friday, March 4, 2016

Types of Dimention table in DWH

We know Dimension tables contain descriptive information about a subject or matter. Here "ID" column in each dimention table is very important.
Because "Id" has used inside Fact table to express full Information of Each row in Dimention table. That is way the name of "ID" column is Primary Key Column.

There are many more types of dimention table exists in DWH.
Among them the main 5 types are:

1. Conformed Dimention
2. Degenerate Dimention
3. Junk Dimention
4. Role Playing Dimention
5. Slowly Chenging Dimention

Now we will discuss each topic one by one.

Conformed Dimention

It is dimension table that relates to multiple fact tables within the same data warehouse.
It has the same meaning to every fact with which it relates.

Example :
For HR Core (Employee management) DWH, the Employee Dimention table is a conformed dimension because it have the same meaning when joined to any fact table in a HR Core DWH.
and similar to this Dimention_Date table is also a conformed dimension (day, week, month, quarter, year, etc.) because it have the same meaning when joined to any fact table in a DWH.


Degenerate Dimention


A degenerate dimension is a attribute, which stored as part of fact table, but donot have separate dimension table.It is standalone attribute.


Example: For sales related DWH, the "Invoice Number" attribute is degenerate dimension
because there are no dimention table for that. But it plays vital role in fact table.

Junk Dimention

It is a dimension that exists in fact table and measures to answer business questions.
It is just a collection of random transnational codes, Identifiers, flags or text attributes.



Example: For sales related DWH, "Is_Food" & "Is_offer" is the Junk Dimention.

Thursday, March 3, 2016

Dimension Table VS Fact Table in Kimball DWH

Dimension tables contain descriptive information about a subject or matter. 
Dimension tables contain hierarchies of attributes that aid in summarization. 

For example, 
A "Student" dimension table contains descriptive information about Student -> (Name, Age, Gender, Address, Phone_No etc.)



A class dimension table contains descriptive information about class -> (Class_Name)



and a Subject dimension table contains detail information about subject -> (Subject_Name).



Note: Dimension tables contain attributes that describe fact records in the fact table.

Fact table contain summarized data to provide useful information to the analyst. 
The fact data is a relation/ Referential integration between various Dimention(Subject) Table data.

Here we can see it is a collection of only ids of other dimension Tables and forms useful information to the analyst.

So, from this table we understood following details given below:




  1.  Pritam Das is studying in class "Spoken English" and "Hindi" with subject "ABC Tutor" and "Hindi Tutor".
  2. Rina Das is studying in class "Hindi" with subject "Hindi Tutor"
  3. Ramaprasad Das is studying in class "Spoken English" and "Bengali" with subject "ABC Tutor" and "Bengali Tutor".





Wednesday, March 2, 2016

Data Warehouse VS Data Mart

A Data ware house is a logical collection of information gathered from many different operational subject area used to create business intelligence that supports business analysis and top level decision making purpose.

It can holds multiple subject areas with very detailed (Historical) information.
We created Data ware house to integrate all homogeneous/ heterogeneous data sources.

Example: A "HRCORE" Data ware house is collection of 4 subject areas

1. Employee Assignment
2. Employee request
3. Employee Time sheet
4. Employee Expenses

A Data mart is a logical collection of only one subject area information.It is 
a subset of the data warehouse which is usually oriented to a specific business line or team.

Example: In "HRCORE" Data ware house, each subject area is one data mart.
Employee Assignment, Employee request, Employee Time sheet and Employee Expenses.



MORAL: Data warehouse can contain many subject areas, and a data mart can contain just one of those subject areas. A Data Mart is subset of a Data ware house.

Data warehouse design concepts Inmon vs. Kimball

There are two most commonly approaches in DWH architecture introduced by Bill Inmon and Ralph Kimball.

Bill Inmon -> Top-Down Approach



According to Bill Inmon, first make a normalized data model (ware house) then split into data marts.
Dimensional data marts are created only after the complete data warehouse has been created. 
To built data ware house it will take more time compare to Ralph Kim-ball approach. Client requirement should be fixed before start of development.




Ralph Kimball -> Bottom-Up Approach



Ralph Kimball says "The data warehouse is nothing more than the union of all the data marts"


It means, first create data marts and then make a schema relation.
Dimensional data marts (Di mention and Fact table) provide a narrow view into the organizational data, then they related by start or snow flex schema based on requirement.
To built data ware house it will take less time compare to Bill Inmon approach. Client requirement need not be fixed before start of development.





What is operational database ?

It is collection of current information or data that is required to run the business.
It is also referred to as OLTP On Line Transaction Processing databases.
Operational databases are used to store, manage and track real-time business information, not historical data. Operational databases are just part of the entire enterprise data management 
For example, an operational database might contain fact and dimension data describing transactions, data on customer complaints, employee information, etc.