Thursday, August 4, 2016

Key technologies in SAP HANA

What is in-memory database?

An in-memory database means all the data is stored in the memory (RAM). This is no time wasted in loading the data from hard-disk to RAM or while processing keeping some data in RAM and temporary some data on disk. Everything is in-memory all the time, which gives the CPUs quick access to data for processing.
The speed advantages offered by this RAM storage system are further accelerated by the use of multi-core CPUs, and multiple CPUs per board, and multiple boards per server appliance.

Key hardware technology Innovations in SAP HANA
Multi-core architecture (8 CPU x 10 cores per blade). Massive parallel scaling with many blades.
Ram Memory space is extended - 2 TB of memory - with dramatic decline in price.

Key software technology innovations in SAP HANA

To understand the impact SAP HANA has on the ABAPer, it can be helpful to revisit some of the key innovations introduced with the in-memory database technology.
i1_in_memory.png
The first important innovation is (I am sure that you have heard about it) that SAP HANA is capable of storing all relevant business data in main memory. That does not mean that SAP HANA does not support storing data on disk. It does mean that SAP HANA (typically) does not need to access the disk during query execution.

i2_multi_core.png

SAP HANA supports multi-core architectures by distributing queries across multiple CPU cores (and across multiple server nodes). Applications running on SAP HANA hence can benefit from massive parallelization.

i3_column_store.pngWithin SAP HANA data can be organized either in row store (like in ‘traditional’ databases) or in column store. While both stores have specific advantages and disadvantages, the bulk of business data resides in the column store where it can be quickly searched and aggregated.

i4_compression.pngSAP HANA can compress business data (in row store). This not only reduces the amount of main memory needed, but also the amount of data to be transferred between main memory and CPU. The latter is important to handle the bottleneck of in-memory database technology: CPU waiting for data to be loaded from main memory into cache (in contrast to disk I/O which used to be the bottleneck in the past).

i5_partitioning.png
Last not least, I want to mention SAP HANA’s capability to partition datasets. Partitioning is of particular interest for large database tables. It supports the parallelization of queries.
For example, if a database table is spread across multiple partitions (on one server node), the aggregation of one column can be done in two steps. In the first step all partitions are aggregated simultaneously. In a second step the results of the first step are added up. This makes use of parallelization to avoid the CPU idle time.

A new paradigm for ABAP emerges: “code-2-data”

To leverage the outlined innovations introduced with SAP HANA from ABAP, the database access becomes very important. And it is crucial – not so say compulsory – for ABAP applications to move data intense operations (i.e. costly calculations on large datasets) to the database layer. This is in some respects a paradigm shift, which is often referred to as “code pushdown” or “code-2-data” approach (in contrast to the “data-2-code” approach ABAP programs followed in the past).
code_pushdown.png
The above diagram illustrates the two approaches: the “data-2-code” approach on the left and the “code-2-data” approach on the right.

What is the difference between the two?

In the past (based on traditional database technology) ABAP applications considered the database to be the bottleneck. Hence the complete business logic – including costly calculations – was implemented on the application layer. Very often a large number of records was transferred from the database to the application server to calculate only a few results there (for example to aggregate many line items to only a handful of key figures).

Now and in the future (based on SAP HANA) the database is not the bottleneck anymore. ABAP applications need to move – at least – parts of the business logic to the database layer. This not only reduces the amount of data to be transferred between database and application server, but it also ensures that the part of the business logic moved to the database (implicitly) benefits from the innovations built into SAP HANA.

However – especially when looking at existing ABAP programs – rewriting the business logic completely in the database might not be reasonable due to the efforts involved. Instead the “code pushdown” should mainly focus on costly calculations on large datasets. Orchestration, process and display logic should stay on the AS ABAP.

What does all that mean for the ABAPer?

The new capabilities of SAP HANA offers a huge variety of opportunities for ABAP developers:

  • Accelerate: by optimizing the code, the run-time for background jobs can be reduced.
  • Extend: turn background jobs into interactive applications.
  • Innovate: With SAP HANA's analysis and calculation capabilities, ABAP developer can design new applications that would not have been possible in the past.
The arrival of in-memory technology on the database layer also requires a change in the way we design and implement applications.


No comments:

Post a Comment

SAP giới thiệu mã hỗ trợ AI trong ngôn ngữ ABAP riêng của mình

SAP đã ra mắt một loạt tính năng mã hỗ trợ AI trong môi trường phát triển ứng dụng dựa trên đám mây của mình, đồng thời tham gia vào danh sá...