Monday, 15 December 2014

Understanding SQL Server Analysis Services (SSAS) and Components

SQL Server Analysis Services (SSAS) is the technology from the Microsoft Business Intelligence stack, to develop Online Analytical Processing (OLAP) solutions. In simple terms, you can use SSAS to create cubes using data from data marts / data warehouse for deeper and faster data analysis.
Microsoft SQL Server Analysis Services (SSAS) delivers both online analytical processing (OLAP) and data mining functionality for business intelligence applications.

OLAP

Online analytical processing (OLAP) allows the user to access aggregated and organized data from business data sources, such as data warehouses, in a multidimensional structure called a cube. Microsoft provides tools and features for OLAP that user can use to design, deploy, and maintain cubes and other supporting objects.

Data Mining

Data mining gives the user access to the information that is needed to make intelligent decisions about difficult business problems. Microsoft provides tools for data mining with which user can identify rules and patterns in the data, so that the user can determine why things happen and predict what will happen in the future.
Cubes are multi-dimensional data sources which have dimensions and facts (also known as measures) as its basic constituents. From a relational perspective dimensions can be thought of as master tables and facts can be thought of as measureable details. These details are generally stored in a pre-aggregated proprietary format and users can analyze huge amounts of data and slice this data by dimensions very easily. Multi-dimensional expression (MDX) is the query language used to query a cube, similar to the way T-SQL is used to query a table in SQL Server.
Simple examples of dimensions can be product / geography / time / customer, and similar simple examples of facts can be orders / sales. A typical analysis could be to analyze sales in Asia-pacific geography during the past 5 years. You can think of this data as a pivot table where geography is the column-axis and years is the row axis, and sales can be seen as the values. Geography can also have its own hierarchy like Country->City->State.  Time can also have its own hierarchy like Year->Semester->Quarter. Sales could then be analyzed using any of these hierarchies for effective data analysis.
  
A typical higher level cube development process using SSAS involves the following steps:

1) Reading data from a dimensional model
2) Configuring a schema in BIDS (Business Intelligence Development Studio)
3) Creating dimensions, measures and cubes from this schema
4) Fine tuning the cube as per the requirements
5) Deploying the cube
The basic concepts of OLAP include:
  • Cube
  • Dimension table
  • Dimension
  • Hierarchy
  • Level
  • Fact table
  • Measure
  • Schema

Cube

The basic unit of storage and analysis in Analysis Services is the cube. A cube is a collection of data that's been aggregated to allow queries to return data quickly. For example, a cube of order data might be aggregated by time period and by title, making the cube fast when you ask questions concerning orders by week or orders by title.
Cubes are ordered into dimensions and measures.
The data for a cube comes from a set of staging tables, sometimes called a star-schema database. Dimensions in the cube come from dimension tables in the staging database, while measures come from fact tables in the staging database.

Dimension table

A dimension table lives in the staging database and contains data that you'd like to use to group the values you are summarizing. Dimension tables contain a primary key and any other attributes that describe the entities stored in the table. Examples would be a Customers table that contains city, state and postal code information to be able to analyze sales geographically, or a Products table that contains categories and product lines to break down sales figures.

Dimension

Each cube has one or more dimensions, each based on one or more dimension tables. A dimension represents a category for analyzing business data: time or category in the examples above. Typically, a dimension has a natural hierarchy so that lower results can be "rolled up" into higher results. For example, in a geographical level you might have city totals aggregated into state totals, or state totals into country totals.

Hierarchy

A hierarchy can be best visualized as a node tree. A company's organizational chart is an example of a hierarchy. Each dimension can contain multiple hierarchies; some of them are natural hierarchies (the parent-child relationship between attribute values occur naturally in the data), others are navigational hierarchies (the parent-child relationship is established by developers.)

Level

Each layer in a hierarchy is called a level. For example, you can speak of a week level or a month level in a fiscal time hierarchy, and a city level or a country level in a geography hierarchy.

Fact table

A fact table lives in the staging database and contains the basic information that you wish to summarize. This might be order detail information, payroll records, drug effectiveness information, or anything else that's amenable to summing and averaging. Any table that you've used with a Sum or Avg function in a totals query is a good bet to be a fact table. The fact tables contain fields for the individual facts as well as foreign key fields relating the facts to the dimension tables.

Measure

Every cube will contain one or more measures, each based on a column in a fact table that you';d like to analyze. In the cube of book order information, for example, the measures would be things such as unit sales and profit.

Schema


Fact tables and dimension tables are related, which is hardly surprising, given that you use the dimension tables to group information from the fact table. The relations within a cube form a schema. There are two basic OLAP schemas: star and snowflake. In a star schema, every dimension table is related directly to the fact table. In a snowflake schema, some dimension tables are related indirectly to the fact table. For example, if your cube includes OrderDetails as a fact table, with Customers and Orders as dimension tables, and Customers is related to Orders, which in turn is related to OrderDetails, then you're dealing with a snowflake schema.

3 comments:

  1. A nice article here, i think that people who have grown up with the idea of using computers are showing more responsibility towards writing posts that are thoughtful, do not have grammar mistakes and pertinent to the post..

    MSBI Training in Chennai

    Informatica Training in Chennai

    ReplyDelete
  2. very informative blog and useful article thank you for sharing with us , keep

    posting learn more about BI Tools Tableau Online Training

    ReplyDelete
  3. Through this post, I know that your good knowledge in playing with all the pieces was very helpful. I notify that this is the first place where I find issues I've been searching for. You have a clever yet attractive way of writing on Msbi online training

    ReplyDelete

If you have any doubt, please let me know.

Popular Posts