SQL Server Analysis Services (SSAS) is
the technology from the Microsoft Business Intelligence stack, to develop
Online Analytical Processing (OLAP) solutions. In simple terms, you can use SSAS to
create cubes using data from data marts / data warehouse for deeper and faster
data analysis.
Microsoft SQL Server Analysis Services (SSAS) delivers both
online analytical processing (OLAP) and data mining functionality for business
intelligence applications.
OLAP
Online analytical processing (OLAP)
allows the user to access aggregated and organized data from business data
sources, such as data warehouses, in a multidimensional structure called a
cube. Microsoft provides tools and features for OLAP that user can use to
design, deploy, and maintain cubes and other supporting objects.
Data Mining
Data mining gives the user access to
the information that is needed to make intelligent decisions about difficult
business problems. Microsoft provides tools for data mining with which user can
identify rules and patterns in the data, so that the user can determine why
things happen and predict what will happen in the future.
Cubes are multi-dimensional data
sources which have dimensions and facts (also known as measures) as its basic
constituents. From a relational perspective dimensions can be thought of as
master tables and facts can be thought of as measureable details. These details
are generally stored in a pre-aggregated proprietary format and users can
analyze huge amounts of data and slice this data by dimensions very
easily. Multi-dimensional expression (MDX) is the query language used to query
a cube, similar to the way T-SQL is used to query a table in SQL Server.
Simple examples of dimensions can be
product / geography / time / customer, and similar simple examples of facts can
be orders / sales. A typical analysis could be to analyze sales in Asia-pacific
geography during the past 5 years. You can think of this data as a pivot
table where geography is the column-axis and years is the row axis, and
sales can be seen as the values. Geography can also have its own hierarchy like
Country->City->State. Time can also have its own hierarchy
like Year->Semester->Quarter. Sales could then be analyzed using any
of these hierarchies for effective data analysis.
A typical higher level cube development
process using SSAS involves the following steps:
1) Reading data from a dimensional model
2) Configuring a schema in BIDS (Business Intelligence Development Studio)
3) Creating dimensions, measures and cubes from this schema
4) Fine tuning the cube as per the requirements
5) Deploying the cube
1) Reading data from a dimensional model
2) Configuring a schema in BIDS (Business Intelligence Development Studio)
3) Creating dimensions, measures and cubes from this schema
4) Fine tuning the cube as per the requirements
5) Deploying the cube
The basic concepts of OLAP include:
- Cube
- Dimension table
- Dimension
- Hierarchy
- Level
- Fact table
- Measure
- Schema
Cube
The basic unit of storage and analysis
in Analysis Services is the cube. A cube is a collection of data that's
been aggregated to allow queries to return data quickly. For example, a cube of
order data might be aggregated by time period and by title, making the cube
fast when you ask questions concerning orders by week or orders by title.
Cubes are ordered into dimensions
and measures.
The data for a cube comes from a set of
staging tables, sometimes called a star-schema database. Dimensions in the cube
come from dimension tables in the staging database, while measures come
from fact tables in the staging database.
Dimension
table
A dimension table lives in the
staging database and contains data that you'd like to use to group the values
you are summarizing. Dimension tables contain a primary key and any other
attributes that describe the entities stored in the table. Examples would be a
Customers table that contains city, state and postal code information to be
able to analyze sales geographically, or a Products table that contains
categories and product lines to break down sales figures.
Dimension
Each cube has one or more dimensions,
each based on one or more dimension tables. A dimension represents a category
for analyzing business data: time or category in the examples above. Typically,
a dimension has a natural hierarchy so that lower results can be "rolled
up" into higher results. For example, in a geographical level you might
have city totals aggregated into state totals, or state totals into country
totals.
Hierarchy
A hierarchy can be best
visualized as a node tree. A company's organizational chart is an example of a
hierarchy. Each dimension can contain multiple hierarchies; some of them are natural
hierarchies (the parent-child relationship between attribute values occur
naturally in the data), others are navigational hierarchies (the
parent-child relationship is established by developers.)
Level
Each layer in a hierarchy is called a level.
For example, you can speak of a week level or a month level in a fiscal time
hierarchy, and a city level or a country level in a geography hierarchy.
Fact
table
A fact table lives in the
staging database and contains the basic information that you wish to summarize.
This might be order detail information, payroll records, drug effectiveness
information, or anything else that's amenable to summing and averaging. Any
table that you've used with a Sum or Avg function in a totals query is a good
bet to be a fact table. The fact tables contain fields for the individual facts
as well as foreign key fields relating the facts to the dimension tables.
Measure
Every cube will contain one or more measures,
each based on a column in a fact table that you';d like to analyze. In the cube
of book order information, for example, the measures would be things such as
unit sales and profit.
Schema
Fact tables and dimension tables are
related, which is hardly surprising, given that you use the dimension tables to
group information from the fact table. The relations within a cube form a schema.
There are two basic OLAP schemas: star and snowflake. In a star schema,
every dimension table is related directly to the fact table. In a snowflake
schema, some dimension tables are related indirectly to the fact table. For
example, if your cube includes OrderDetails as a fact table, with Customers and
Orders as dimension tables, and Customers is related to Orders, which in turn
is related to OrderDetails, then you're dealing with a snowflake schema.
A nice article here, i think that people who have grown up with the idea of using computers are showing more responsibility towards writing posts that are thoughtful, do not have grammar mistakes and pertinent to the post..
ReplyDeleteMSBI Training in Chennai
Informatica Training in Chennai
very informative blog and useful article thank you for sharing with us , keep
ReplyDeleteposting learn more about BI Tools Tableau Online Training
Through this post, I know that your good knowledge in playing with all the pieces was very helpful. I notify that this is the first place where I find issues I've been searching for. You have a clever yet attractive way of writing on Msbi online training
ReplyDelete