Sunday, August 7, 2016

Introduction to Star Schema, Attributes, Measures, Dimensions


Attributes and Measures in SAP HANA

Columns of modeling view can be classified as Attribute or Measure.
Measure: A numeric value such as revenue, quantity sold, price on which you can also process arithmetic operations such as sum, average, top n value and calculation.
Attribute: Non-measurable, Descriptive data is used to describe a measure such as customer ID, city, and country.

Sounds confusing? Let me try to simplify it.
Let us take an example of an organization’s sales tables. The table contains columns like SALES, PROFIT, PRODUCT, CUSTOMER NAME, YEAR, COUNTRY etc.
Most important business use cases will be like:
  • What is the sales value for country India for year 2013?
  • What is the profit for product ‘XYZ’ for country USA?
The SALES and PROFIT columns are numeric values. These columns are measurable and can be aggregated.
While the columns PRODUCT, CUSTOMER NAME, YEAR and COUNTRY are descriptive and non-measurable.
Here PRODUCT, CUSTOMER NAME, YEAR and COUNTRY are attributes, while SALES and PROFIT are measures.

Types of Attributes and Measures:
SAP HANA supports three types of Attributes.
Simple Attributes are attributes derived from the data foundation.
Calculated Attributes are attributes derived from one or more existing attributes or constants. For example, deriving the full name of a customer (first and last name), or assigning a constant value to an attribute which can be used for arithmetic calculations.
Private Attributes are attributes used inside a modeling views and cannot be used outside the view.

SAP HANA supports three types of Measures.

Simple Measures are measurable analytical elements derived from the data foundation.

Calculated Measures are created using some calculation on top of an existing measures or attributes.

Private Measures are measures used inside a modeling views and cannot be used outside the view.

Dimensions
In the number of cases, analyzing the measures by grouping the attributes together is easier.
For example, we can group Sales organization as a dimension with two associated attributes Country, Region. Similarly, a Product ID can be a dimension associated to several attributes such as Product Name, Product Category, Supplier.

Star Schema

Star schema is not a new concept. This is a very famous design approach which was there even before SAP HANA came. The understanding of star schema will help you to understand HANA analytic view.
What is Star Schema?
The star schema (also called multi-dimensional schema) is the simplest style of data warehouse schema.
Star Schema – a single object (the fact table) sits in the middle and is connected to other surrounding objects (dimension tables) like a star.
Each dimension is represented as a single table. The primary key in each dimension table is related to a foreign key in the fact table.

Facts and Dimension Tables:
The fact table holds the main data i.e. the actual data. It includes a large amount of aggregated data, such as sales, profit etc.
Dimension tables include the attributes that describe the data in different perspective. For example: If sales is the fact then Sales per Year, Sales per Country are the different perspective of seeing data. Here Country and Year would be 2 different dimensions.
Each dimension table is joined to the fact table using a primary key to foreign key join, but the dimension tables are not joined to each other.


Fact table:
  • Contain key numeric measures of the business
  • Connected to dimensions
  • Foreign keys to dimension tables
  • Primary key is formed by combining foreign keys pointing to dimension tables
Dimension Tables:
  • Contain information that represents attributes or aspects of the business
  • Contain relatively static data
  • Are joined to fact table through a foreign key reference


Advantages of Star Schema:
Performance
  • Faster query operation
Data model is easy to understand
  • Based on business process
  • Support multi-dimensional analysis
Easy to define hierarchies
  • Continent –> Country –> City
  • Shift – Day – month – Year
Easy to navigate
  • Number of table joins reduced
Maintainable and Scalable
  • Can add new fact tables which use existing dimensions
  • Supported by many front-end tools

No comments:

Post a Comment

SAP giới thiệu mã hỗ trợ AI trong ngôn ngữ ABAP riêng của mình

SAP đã ra mắt một loạt tính năng mã hỗ trợ AI trong môi trường phát triển ứng dụng dựa trên đám mây của mình, đồng thời tham gia vào danh sá...