Data Warehouse | Concept & Applications

When translational data is no longer of value to the operational environment, it is removed from the database. If a business is without decision support environment, the data ican be archive and eventually destroy. However, if there is a decision support system, the data is transport to some type of interactive medium commonly refer to as a data warehouse.

In short, we define it as a historical centralized database designed for decision support.

Understanding OLAP and OLTP:-

OLAP – Online Analytical Processing (OLAP) is a query-based methodology that supports data analysis in a multidimensional environment. It is a valuable tool for verifying or refuting human-generated hypothesis and for performing manual data mining.

OLTP – Online Transactional Processing (OLTP) database are process-orient and organize so as to maximize data update and retrieval. OLTP system supports data processing, collection and management and deals with a large number of transactions by a large set of users.

What is a Data Warehouse?

A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision-making process.

Subject oriented – The data warehouse is structure in terms of subjects eg for a university scenario “ course, students” etc.

Integrated – Data are frequently collect from different systems and in different formats with a view to providing a unified picture.

Time variant – Data in the warehouse is historic or valid for a given time.

Non-volatile – Data are not update in the data warehouse in real-time, but regularly refresh from the operational system.

Functions:-

1.     Data Extraction – In this process, the data from various sources are gathered which then goes for data transformation.

2.     Data Transformation – Data transformation resolves granularity issues, correct data inconsistences between multiple operational databases, and to timestamp individual data records.

3.     Data Loading – Once data is transform and then clean up the data enters the warehouse, where it is store in as relational or multidimensional format.

                                                   Data Warehouse process model

Schemas in Data warehouse

Star schema – A popular relational modelling technique – the start schema is in use to represent a data warehouse. This schema stores the warehouse data using the relational model and invokes a relational database engine to present data.

Snowflake schema – The snowflake schema is a variation of the star schema where some of the dimension table linked directly to the fact table is further divided. This permits the dimension tables to be normalize which leads to less storage. It has less redundancy and storage efficiency is increase up. But due to the increase number of tables, the complexity of the data queries for information extraction is extend.

Constellational schema – The second variation of star schema that accounts for models requiring more than one central fact table is constellational schema. A warehouse supporting several subjects benefits most from the constellational schema.

Decision support: Analyzing the Warehouse data

The primary function of a data warehouse is to house data for decision support. The data is copied from the warehouse for the analysis by the Decision Support System (DSS).

We also find that data entering the data warehouse from DSS. Any data entering the it in the form of metadata create from information gain through one or more decision support process.

Three categories of decision are:-

 1.Reporting data – Reporting is consider as the lowest level of decision support. However, a reporting facility capable of generalizing informative reports for a variety of clients is of utmost importance for any of the business.

2.Analyzing data – Data analysis is usually accomplish with some form of the multidimensional data analysis tool.

3.Knowledge discovery – Knowledge discovery typically takes place through data mining.

Application of Data Warehouse

Banking

In banking, it is to identify the potential risk of default and manage and control collections and also to track the performance of accounts and user data. Its also helps to do performance analysis of each product, service, interchange, exchange rates and also to provide feedback to bankers regarding customer relationship and profitability.

Manufacturing

It is to predict market changes and analyze current business trends and to analyze previous and current market trends. It is also to identify profitable product lines and require product features.

Education

warehouse helps to store and analyse information about faculty and students and to maintain student portal to facilitate student activities. It integrates information from different sources into a single repository for analysis and strategic decision-making which can be later extracted for research grants and assess student demographics.

Summary

A data warehouse is a relational database that is design for query and analysis rather than for transaction processing which helps to process the data accordingly. It usually contains data extracted from various sources. Transaction workload separate from analysis workload and enables an organization to collect data from various sources into one unit.

thus, In addition to the relational database a warehouse environmental include extraction, transformation and loading solution (ETL), online analytical processing (OLAP) which is an engine analyst tool for client and other applications which are to manage the process or gathering data & developing it to business users.

Written By: Prateek Sharma

Reviewed By: Savya Sachi

If you are Interested In Machine Learning You Can Check Machine Learning Internship Program
Also Check Other Technical And Non Technical Internship Programs

Leave a Comment

Your email address will not be published. Required fields are marked *