extending their average report-oriented data warehouse environments by adding new standalone data
platforms that better help with discovery analytics—such as columnar databases, data appliances, NoSQL
databases, and Hadoop.
Big Data isn’t about the “bigness,” but rather about business analytics: The ideal way to get
business value out of Big Data is through analytics. Therefore, satisfying data requirements of business
analytics (either with Big Data or traditional enterprise data) is the “leading driver for change in data
warehouse architectures today.”
Because each department has different requirements, departments usually build their own “shadow
programs” for BI and analytics: To prevent the systems in each department from becoming data silos, data
warehouse architectures are becoming more federated. In a federated architecture, several databases
appear to function as a single entity, and all the data from multiple sources is presented as if it were
stored in one place. This enables the architectural plan to extend across different systems in different
departments.
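The federated idea can be sketched with Python's built-in sqlite3 module. This is only an illustration: the department names, table layout, and the helper federated_customers are assumptions made for the example, and a real federation layer would span separate database systems rather than two in-memory SQLite databases.

```python
import sqlite3

# Hypothetical departmental stores: each department keeps its own database.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS sales")
conn.execute("ATTACH DATABASE ':memory:' AS marketing")

conn.execute("CREATE TABLE sales.customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE marketing.customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO sales.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO marketing.customers VALUES (?, ?)",
    [(2, "Globex"), (3, "Initech")],
)


def federated_customers(connection):
    """Return customers from both departments as if they were stored in one place."""
    return connection.execute(
        "SELECT id, name FROM sales.customers "
        "UNION "  # UNION also removes rows that appear in both departments
        "SELECT id, name FROM marketing.customers "
        "ORDER BY id"
    ).fetchall()


all_customers = federated_customers(conn)
print(all_customers)
```

The caller sees one result set and does not need to know which department holds which row.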
Increasingly, businesses need access to real-time data: The leading edge now is event processing.
Instead of storing data to find out what happened or what could have been, businesses need to act on
events as they occur. Event processing allows businesses to act proactively on risk rather than react to
it, and to create opportunities rather than chase them. Although traditional data warehouse architectures
are designed for “data-at-rest,” real-time capabilities for “data-in-motion” can be retrofitted into the
architecture.
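To make the contrast between data-at-rest and data-in-motion concrete, here is a minimal sketch of acting on events as they arrive instead of storing them first. The event fields (id, amount), the window size, and the threshold are all assumptions chosen for illustration.

```python
from collections import deque


def detect_spikes(events, window=3, threshold=100.0):
    """Yield an alert as soon as the moving average of recent amounts exceeds the threshold."""
    recent = deque(maxlen=window)  # only the last `window` amounts are kept
    for event in events:
        recent.append(event["amount"])
        average = sum(recent) / len(recent)
        # Alert only once a full window has been observed.
        if len(recent) == window and average > threshold:
            yield {"at": event["id"], "moving_average": average}


# Hypothetical event stream; in practice this would be a message queue or socket.
stream = iter([
    {"id": 1, "amount": 40.0},
    {"id": 2, "amount": 60.0},
    {"id": 3, "amount": 80.0},
    {"id": 4, "amount": 170.0},
    {"id": 5, "amount": 200.0},
])

alerts = list(detect_spikes(stream))
print([alert["at"] for alert in alerts])
```

Because detect_spikes is a generator, each alert is produced when its triggering event is read, without waiting for the stream to end.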
Data Warehousing Concepts
What is Data Warehousing?
Data warehousing is the process of constructing and using a data warehouse. A data
warehouse is constructed by integrating data from multiple heterogeneous sources, and it
supports analytical reporting, structured and/or ad hoc queries, and decision making. Data
warehousing involves data cleaning, data integration, and data consolidation.
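The three steps named above (cleaning, integration, consolidation) can be sketched as follows. The two source extracts, their field names, and the helper functions are hypothetical and exist only to show the shape of each step.

```python
from collections import defaultdict

# Hypothetical source extracts with inconsistent schemas and formats.
crm_rows = [
    {"customer": " Acme ", "amount": "120.50"},
    {"customer": "globex", "amount": "80"},
    {"customer": "Acme", "amount": None},  # unusable row, dropped during cleaning
]
billing_rows = [
    {"cust_name": "ACME", "total": 30.0},
    {"cust_name": "Globex", "total": 20.0},
]


def clean(name, amount):
    """Data cleaning: normalize the name and amount; return None for unusable rows."""
    if amount is None:
        return None
    return name.strip().lower(), float(amount)


def integrate(crm, billing):
    """Data integration: map both source schemas onto one (customer, amount) schema."""
    unified = [clean(row["customer"], row["amount"]) for row in crm]
    unified += [clean(row["cust_name"], row["total"]) for row in billing]
    return [row for row in unified if row is not None]


def consolidate(rows):
    """Data consolidation: combine the integrated rows into one total per customer."""
    totals = defaultdict(float)
    for customer, amount in rows:
        totals[customer] += amount
    return dict(totals)


warehouse = consolidate(integrate(crm_rows, billing_rows))
print(warehouse)
```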
Using Data Warehouse Information
Decision support technologies help utilize the data available in a data
warehouse. These technologies help executives use the warehouse quickly and effectively.
They can gather data, analyze it, and make decisions based on the information present in the
warehouse. The information gathered in a warehouse can be used in any of the following
domains:
Tuning Production Strategies - Product strategies can be fine-tuned by
repositioning products and managing product portfolios based on quarterly or yearly
sales comparisons.
Customer Analysis - Customer analysis is done by analyzing the customer's buying
preferences, buying time, budget cycles, etc.
Operations Analysis - Data warehousing also helps in customer relationship
management and in making environmental corrections. The information also allows us to
analyze business operations.
Integrating Heterogeneous Databases
To integrate heterogeneous databases, we have two approaches: the query-driven approach and
the update-driven approach.
Query-Driven Approach
This is the traditional approach to integrating heterogeneous databases. It builds
wrappers and integrators on top of multiple heterogeneous databases. These
integrators are also known as mediators.
Process of Query-Driven Approach
When a query is issued at the client side, a metadata dictionary translates the query into
an appropriate form for each heterogeneous site involved.
These queries are then mapped and sent to the local query processor at each site.
The results from the heterogeneous sites are integrated into a global answer set.
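The process above can be sketched as a small mediator. Everything here is an assumption made for illustration: the site names, the METADATA dictionary, and the in-memory lists standing in for the heterogeneous databases. A real mediator would dispatch translated queries to actual database engines.

```python
# Hypothetical metadata dictionary: global field name -> local field name, per site.
METADATA = {
    "site_a": {"customer": "cust_name", "amount": "total"},
    "site_b": {"customer": "client", "amount": "value"},
}

# Hypothetical heterogeneous sites, each with its own local schema.
SITES = {
    "site_a": [{"cust_name": "Acme", "total": 100}, {"cust_name": "Globex", "total": 50}],
    "site_b": [{"client": "Acme", "value": 25}, {"client": "Initech", "value": 75}],
}


def translate(global_query, site):
    """Rewrite a global equality query using the field names of one site."""
    mapping = METADATA[site]
    return {mapping[field]: value for field, value in global_query.items()}


def local_query_processor(site, local_query):
    """Stand-in for a site's own query engine: filter rows by equality."""
    return [
        row for row in SITES[site]
        if all(row.get(field) == value for field, value in local_query.items())
    ]


def mediator(global_query):
    """Translate, dispatch to every site, and integrate results into a global answer set."""
    answer = []
    for site, mapping in METADATA.items():
        local_query = translate(global_query, site)
        to_global = {local: global_name for global_name, local in mapping.items()}
        for row in local_query_processor(site, local_query):
            unified = {to_global[field]: value for field, value in row.items()}
            unified["site"] = site  # record where each row came from
            answer.append(unified)
    return answer


result = mediator({"customer": "Acme"})
print(result)
```

Note that all translation and integration work happens at query time, on every query.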
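Disadvantages of Query-Driven Approach
The query-driven approach needs complex integration and filtering processes.
It is inefficient, because every query must be translated, dispatched to the sources, and
integrated at query time.
It is very expensive for frequent queries.
It is also very expensive for queries that require aggregations.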
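A rough way to see why frequent and aggregate queries are costly is to count how often the sources are contacted. The CountingSource class and its data are hypothetical, and the sketch counts round trips only; it is not a performance measurement.

```python
class CountingSource:
    """Hypothetical source that records how many times it is scanned."""

    def __init__(self, rows):
        self.rows = rows
        self.calls = 0

    def fetch_all(self):
        self.calls += 1
        return list(self.rows)


sources = [CountingSource([10, 20, 30]), CountingSource([5, 15])]


def total_amount(all_sources):
    """Aggregate at query time: every source is scanned on every call."""
    return sum(sum(source.fetch_all()) for source in all_sources)


# Asking the same aggregate question four times repeats all of the work.
totals = [total_amount(sources) for _ in range(4)]
round_trips = sum(source.calls for source in sources)
print(totals, round_trips)
```

Four identical queries against two sources cause eight full scans, since nothing is stored between queries.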