A comparison of OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) takes place in the context of non-analytic versus analytic data processing.
OLTP and OLAP describe two different types of databases. OLTP and OLAP are two different ways of processing data in a database and therefore how to structure the data in the database. OLTP and OLAP are made for different purposes.
OLTP is made to process operational requests in a database. ‘Operational requests’ mean requests to the database from applications that are used in the day-to-day-business. For example, a day-to-day-business application is an application to maintain sales orders or an e-commerce shop.
OLAP is made to process analytical requests in a database. ‘Analytical requests’ means requests to the database from applications that are used for business intelligence and data mining. For example, a business intelligence application is an application that provides information about sales orders, such as what the aggregated net value for a specific timeframe is.
To understand the difference between OLTP and OLAP, plus what the places of OLTP and OLAP in SAP are, it is first necessary to understand the terms:
- DB (Database),
- DBMS (Database Management System),
- RDBMS (Relational Database Management System),
- DW (Data Warehouse),
- OLTP (Online Transaction Processing), and
- OLAP (Online Analytical Processing).
What a DB Is
A database (DB) – or also called database system – is a collection of structured data (data basis) and software to access that data (DBMS). Typically, the data is stored electronically. A database system serves users and applications to:
- save high volumes of data
- provide data subsets in
- different presentation forms
- needs-based presentation forms
A database system consists of two parts:
- an actual database in a narrow sense, and
- a database management system (DBMS).
The actual database is also called data basis and is the data in a database system which is managed by the DBMS.
For example, you have a zoo, and you want to save in a database system all the data about the zoo. The animal names, zoo areas, zookeeper wages, etc. are the actual data that reside in the data basis. To store and retrieve the data, a DBMS is necessary.
What a DBMS Is
A DBMS (Database Management System) is the administration software of a database system which manages the data basis that contains the actual data.
Whenever someone wants anything from the data basis, then he asks the DBMS for it. Then, the DBMS takes the request, translates it, and passes it to the data basis and gets or puts there the data.
The DBMS organizes the structured storage of the data in the data basis. The DBMS controls all reading and writing access to the data basis.
To manage and request data, a database system provides a database language, such as SQL (Structured Query Language). Therefore, if someone wants anything from the data basis, he needs to put his request into the language of the DBMS.
For example, ten DBMS are:
- SAP Sybase ASE
- SAP HANA
- Microsoft SQL Server
- Microsoft Access
- Oracle RDBMS
- IBM DB2
- Amazon RDS
For example, you have an application to manage the zoo animals, and you create a new animal in the application. The application takes the provided animal data and sends it to the application server. The application server translates the request to save the data into the database system into the DBMS language and then sends it to the DBMS. Finally, the DBMS puts the animal data into the data basis.
What an RDBMS Is
An RDBMS (Relational Database Management System) is a DBMS that stores data in a relational structure in the data basis. This means that the data is grouped in logical units in tables, and those units link to each other using keys. The tables have relations with each other through the keys.
DBMSs are made to retrieve and store data in a data basis with the presence of redundant data. RDBMSs are made to retrieve and store data in a data basis without the presence of redundant data.
For example, you would like to save the zoo animals according to their zoo area.
In a so-called flat and non-relational database, you would save all animals and their zoo area altogether in one table, like this:
|ID||Animal Type||Zoo Area|
But in a relational database, you would save the animals and the zoo areas as logical units in separate tables.
Each entry in the animals table would have a key that points to an entry in the zoo areas table. Therefore, the keys relate each animal entry with an entry in the zoo areas table.
By the way, the process of structuring data that belongs logically together in a data basis into separated tables is called normalization.
The animals table with relations via the zoo area ids to the zoo areas table:
|ID||Zoo Area ID||Animal|
And the zoo areas table:
There are 2 key advantages of a relational database over a flat database which saves all data in one table:
- reduced data redundancy, and
- improved data integrity.
For example, you would like to add to the animals and zoo areas the zookeepers, including who is responsible for which animal and therefore works in which zoo area.
In a flat zoo database which is not relational, you would have to extend your one table with the data about the zookeepers. The table already holds the data about your animals and zoo areas. The table gets more and more packed, complex, and redundant.
In a relational database, you would just add another zookeeper table to the relational database and use a key to relate the entries of the zoo area tables to the zookeeper table.
The key relates each entry in the zoo area table with an entry in the zookeeper table:
|ID||Zoo Area ID||Zoo Keeper|
Now you know which zookeeper is responsible for which zoo area through the relation of the zookeeper table to the zoo area table. Plus, you know which zookeeper is responsible for which animal through the relation of the zookeeper table over the zoo area table to the animal table.
In comparison to a DBMS and one single table, an RDBMS and its multiple related tables is less packed, less complex, and does not have any redundancies.
Tables in an RDBMS are well structured and do make sense for a human through their logical groupings, such as animal, zoo areas, or zookeepers.
Relational database systems are the most common database systems.
What a DW Is
A DW (Data Warehouse) is at its core just another relational database. It is a relational database that is structured to fit the needs of analytical requests.
A data warehouse extracts, transforms, structures, and finally stores data from operational databases and external sources.
A data warehouse is made for analytical applications. A data warehouse supports analytical applications in data consolidation, analysis, and reporting, including the aggregation of data. In turn, analytical applications are made for decision support.
For example, you are thinking about closing your zoo for a month because you want to go on vacation to Hawaii. However, you are not sure which month would be the best month to close your zoo in terms of revenue.
You use an analytical application that is connected to a DW to get the net value of all sold zoo tickets for each month. Then, you pick the month with the least net value of sold tickets for your vacation to Hawaii.
What OLTP Is
OLTP (Online Transaction Processing) describes databases that are optimized for operational requests. ‘Operational requests’ means requests from applications that are used in the day-to-day-business, such as an application to manage sales orders or an e-commerce shop.
Operational requests are not requests from applications that are used for analytics such as decision support. An OLTP database is not a DW.
An OLTP database is made to
- process high volumes of request,
- respond without delay to requests,
- process requests as fast as possible, and
- maintain the data integrity despite multiple simultaneous requests.
The effectiveness of an OLTP database is measured in processed transactions per seconds. That says it all about the main purpose of an OLTP database.
Therefore, an OLTP database is made to serve the best operational applications that access the database by fast and consistent processing of the applications’ requests.
As the name says, the requests to an OLTP database are
- transactions, and
- processed online.
The ‘Transaction’ in OLTP
A transaction is a request to a relational database and is a logical unit. ‘Logical unit’ means that, after the request is processed successfully, the request leaves the data basis consistent.
‘Consistent’ means that the data is changed only in a way allowed by the data basis. Otherwise, the data integrity in the database is not given.
Data integrity within a relational database is given if the
- attributes are within their value ranges,
- primary keys are unique,
- primary keys are not empty,
- foreign keys are either empty or an equivalent primary exists, and
- custom integrity rules are fulfilled, such as every zoo area must have a zookeeper.
Either a transaction successfully executes or it is rolled back until the database has the same status as if the transaction had never been executed. Therefore, if a transaction is aborted during its processing, then all the transaction changes in the database until its abortion must be undone.
For example, the zoo has vacations for the animals. At each zoo area’s entry is a note with the vacation times of the zoo area. Each zoo area and the animals inside of it get four weeks of vacation in a year, during which the area is closed for visitors. The zoo area is opened again when the vacation ends.
The animals can decide on their own when they want their zoo area closed and take vacations. Now, the Hawk from the desert zoo area decides to take two weeks of vacation from January 01, 2019 to January 14, 2019.
The hawk communicates its vacation to Michael, the zookeeper for the zoo’s mountain area. Micheal then records the vacation in the mountain area’s vacation note.
The transaction could look formal like this:
start of the transaction read the field remaining vacations for Chad the hawk write January 01, 2019 into the field vacation from for Chad the hawk write January 14, 2019 into the field vacation to for Chad the hawk end of the transaction
That is a valid transaction which leaves the vacation note as the database consistent.
But imagine that the above transaction gets aborted before Michael has written to the field ‘vacation to’ – the zoo area gets closed but never reopened again. Michael just does as the note says.
Therefore, the aborted transaction would leave the note inconsistent. The transaction actually would be no transaction, because of its inconsistent effects to the note. The invalid transaction would be a mere request to the note.
Hence, when a transaction is aborted, all its changes done in the database must be rolled back to retain the consistency of the database.
Or imagine zookeeper Michael did not read the field ‘remaining vacation’ before entering any vacation requests. Chad the hawk could take as much vacation as he wants, even though he has only four weeks of vacation per year.
The transaction again would not be a transaction but a mere request, because the request would not be consistent. The transaction would violate the custom integrity rules of the note, which say that a zoo area has overall only four weeks vacation in a year.
The ‘Online’ in OLTP
The online in OLTP means that the transactions in an OLTP system are processed in real-time without any delay. The data is processed as soon as the request is made and gives an answer as fast as possible.
For example, an ATM: as soon as you confirm your PIN code, the ATM checks immediately if the entered PIN code is correct.
The opposite of real-time is batch processing. In a batch processing system, the requests are processed in bulk at a later time. That means that the answer to the batch processing is provided in a later time as well.
For example, a program to optimize images: during the daytime, new images get uploaded by users to a server, and at nighttime, when the workload of the server is lower, the program optimizes the new images on the server in a batch way.
Real-time processing requires constant user interaction: The user interacts with an application, the application fires the request up to the database, the database processes the request, and it fires a response back to the application.
Batch processing requires user interaction once or no user interaction at all. The user starts the batch process, and the system processes the batch until it is finished. The other option is that the user schedules a batch request, and the system processes the batch automatically.
The naming ‘batch processing’ comes from the 60s/70s. In this time, data and programs were stored on punch cards, and those punch cards were read in and processed in batches, often at night time.
What OLAP Is
OLAP (Online Analytical Processing) describes databases that are optimized for analytical requests. ‘Analytical requests’ means requests from applications that are used for decision support, such as an application to analyze the aggregated net value of sales orders per month or the average empty seats on flights overall.
‘Analytical requests’ does not mean requests from applications that are used for the day-to-day-business. An OLAP database is not an operative database but a database structured for analytics that extracts data from operative databases.
And, as the name says, the requests to an OLAP system are
- analytical, and
- processed online.
The ‘Analytical’ in OLAP
An OLAP database is made to analyze data and thus to answers questions about the data. For example, you, as the zoo owner, would like to know all about your zoo, such as:
- What is the net value of all sold tickets for last year?
- What is the average amount of food that each zoo animal has eaten last month?
- Who is the zookeeper with the most overtime overall?
An OLAP database is a DW. A DW extracts, transforms, and stores data from operational databases or external sources. Imagine a DW as an OLAP database providing an additional layer on top of the operational databases.
An advantage of an OLAP database is that it is separated from the operational databases. Therefore, it does not occupy any computation power of an operational database to answer analytical requests and thus does not slow down the day-to-day-business applications.
The ‘Online’ in OLAP
The online in OLAP means the same as online in OLTP.
Again, the ‘online’ in OLTP, and as well in OLAP, means that the requests to the database are processed in real-time without any delay. The data is processed as soon as the request is made, and an answer is given as fast as possible.
Differences between OLTP and OLAP
|Describes||Structure of an RDBMS database for operational requests||Structure of an RDBMS database for analytical requests|
|Data Source||Original data source||Extracts data from OLTP databases and external sources|
|Purpose||Day-to-day-business requests to run the fundamental business||Analytical requests for decision support, planning, and problem solving|
|Request Types||Create, read, update, and delete||Read|
|Request Aims||Operational such as update the price of the item with the ID 1||Analytical such as what is the net value of all sold items in 2018|
|Performance Scope||High volume of transactions||High volume of data|
|Data Structure||Normalized data||Denormalized data|
|Data Set Age||6 - 18 months||2 - 7 years|
|Request Complexity||Mostly short and simple requests that are standardized||Mostly long and complex requests that contain aggregations|
OLTP and OLAP in SAP
For example, OLTP databases are used in SAP ERP systems, such as the R/3 and ECC.
For example, OLAP databases are used in SAP NetWeaver Business Warehouses.
But the future is bright in the world of SAP: SAP’s own database called HANA combines OLTP and OLAP in a single database.
HANA is an in-memory database. In-memory means that the data does not reside on a hard drive and get loaded into the memory only when it is needed for processing or analysis. Rather, it resides all the time in the memory.
Because the data resides permanently in the memory, and therefore no hard drive access is necessary, a HANA database is super fast. Therefore, in a nutshell, OLTP and OLAP requests can be processed altogether in a single HANA database.