The transaction is a concept that was developed thousands of years ago to ensure that business could be conducted safely, so that buyer and seller could each trust that they would receive what they expected from the exchange.
In databases, transactions have been an important concept since the first DBMS was developed. Using transactions makes it easy to reason about changes to your data.
Basic transaction theory
An important part of the transaction theory is ensuring the ACID properties.
Atomic means that a transaction is either fully done or not done at all. If the transaction is committed, all operations of the transaction are performed; if it is aborted, no part of the transaction survives.
RonDB: Fulfilled using two-phase commit protocol (2PC).
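The all-or-nothing behaviour of two-phase commit can be illustrated with a minimal sketch. This is a toy model, not RonDB's actual protocol: the `Participant` class, its `can_prepare` flag and the `two_phase_commit` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """One node holding part of the data (hypothetical, for illustration)."""
    name: str
    can_prepare: bool = True  # set to False to simulate a failed prepare
    data: dict = field(default_factory=dict)    # committed, visible state
    staged: dict = field(default_factory=dict)  # prepared, not yet visible

    def prepare(self, writes: dict) -> bool:
        # Phase 1: stage the writes and vote, without making them visible.
        if not self.can_prepare:
            return False
        self.staged = dict(writes)
        return True

    def commit(self) -> None:
        # Phase 2 (commit): make the staged writes visible.
        self.data.update(self.staged)
        self.staged = {}

    def abort(self) -> None:
        # Phase 2 (abort): discard the staged writes.
        self.staged = {}


def two_phase_commit(participants, writes_per_node) -> bool:
    """Return True if every participant committed, False if all aborted."""
    votes = [p.prepare(writes_per_node.get(p.name, {})) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


# Both participants vote yes, so both apply their writes.
a, b = Participant("a"), Participant("b")
ok = two_phase_commit([a, b], {"a": {"x": 1}, "b": {"y": 2}})

# One participant votes no, so nothing survives on either node.
c, d = Participant("c"), Participant("d", can_prepare=False)
failed = two_phase_commit([c, d], {"c": {"x": 1}, "d": {"y": 2}})
```

The sketch ignores coordinator failure, timeouts and recovery, which are the hard parts of a real implementation.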
Consistency in databases can relate to two properties. Firstly, it can mean that any transaction will leave the database in a consistent state. This disallows any corruption of data in terms of constraints, foreign keys, triggers, etc. This is the property meant primarily in ACID.
However, in distributed databases, consistency can also relate to the equivalency of data between replicas. This is also relevant in RonDB. BASE databases support eventual consistency, which means equivalency will be reached at some point in time. Examples of databases with eventual consistency are InnoDB Cluster, Galera Cluster and MongoDB. ACID however also requires strong consistency, which means that the data is always consistent between replicas.
BASE (opposite of acid in chemistry): Basically Available, Soft state, Eventual consistency
Isolation measures the degree to which different transactions are isolated from one another, that is, the extent to which changes in one transaction can affect another transaction. In fully isolated (serializable) transactions, one can reason about transactions as if they happened instantly at commit time and therefore in serial order.
Broadly, there are four different levels of isolation, where the trade-off is between concurrency and read consistency.
Read Uncommitted: Dirty reads allowed
Read Committed: Prevents dirty reads
Repeatable Read: Prevents dirty reads and non-repeatable reads
Serializable: Prevents dirty reads, non-repeatable reads and phantom reads
RonDB: Read committed isolation level fulfilled using strict two-phase locking (S2PL) on a row level
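The four levels listed above form a ladder in which each level prevents one more read anomaly than the one before it. The following sketch encodes just that relationship; the names `Isolation`, `prevented` and `possible` are chosen for this example and do not come from any RonDB API.

```python
from enum import IntEnum


class Isolation(IntEnum):
    READ_UNCOMMITTED = 0
    READ_COMMITTED = 1
    REPEATABLE_READ = 2
    SERIALIZABLE = 3


# Ordered so that level N prevents the first N anomalies.
ANOMALIES = ("dirty read", "non-repeatable read", "phantom read")


def prevented(level: Isolation) -> set:
    """Anomalies that cannot occur at the given isolation level."""
    return set(ANOMALIES[:int(level)])


def possible(level: Isolation) -> set:
    """Anomalies that can still occur at the given isolation level."""
    return set(ANOMALIES[int(level):])


# At read committed, only dirty reads are ruled out.
rc_prevented = prevented(Isolation.READ_COMMITTED)
rc_possible = possible(Isolation.READ_COMMITTED)
```

For read committed, `rc_possible` contains non-repeatable reads and phantom reads, which is the trade-off an application accepts at this level.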
Durability describes the property that a committed transaction is safely stored, even in the presence of crashes and similar events. Most early database users were banks, which used databases for monetary transfers, where this guarantee mattered most.
RonDB: Fulfilled using Network Durability
RonDB Transaction Model
RonDB is designed for ACID but has a few compromises. In terms of isolation, it supports the isolation level read committed and in terms of durability, it uses the model network durability.
RonDB achieves its isolation by the use of row locks. Using the isolation level read committed, RonDB however ensures that reads can be performed without using locks. Thus reads in this mode have a very predictable latency since they are independent of other transactions.
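A rough way to picture lock-free reads alongside row locks is a row that keeps its last committed value next to the uncommitted value of the current lock holder. The sketch below is a simplified single-process model under that assumption, not RonDB's implementation; `Row`, `Table` and `LockConflict` are hypothetical names, and a real system would queue a conflicting writer rather than raise an error.

```python
class LockConflict(Exception):
    """Raised when a second transaction tries to lock an already locked row."""


class Row:
    def __init__(self, value):
        self.committed = value   # last committed value, visible to readers
        self.pending = None      # uncommitted value written by the lock holder
        self.lock_owner = None   # id of the transaction holding the row lock


class Table:
    def __init__(self):
        self.rows = {}

    def read_committed(self, key):
        # Takes no lock and never waits: returns the last committed value.
        return self.rows[key].committed

    def write(self, txn, key, value):
        row = self.rows.setdefault(key, Row(None))
        if row.lock_owner not in (None, txn):
            raise LockConflict(f"row {key!r} is locked by {row.lock_owner}")
        row.lock_owner = txn     # exclusive row lock, held until commit
        row.pending = value

    def commit(self, txn):
        # Publish pending values and release all locks held by txn.
        for row in self.rows.values():
            if row.lock_owner == txn:
                row.committed = row.pending
                row.pending = None
                row.lock_owner = None


t = Table()
t.rows["k"] = Row(10)

t.write("T1", "k", 20)           # T1 locks the row, value not yet visible
before = t.read_committed("k")   # reader is not blocked by T1's lock

try:
    t.write("T2", "k", 30)       # a second writer conflicts on the row lock
    conflict = False
except LockConflict:
    conflict = True

t.commit("T1")
after = t.read_committed("k")    # the committed change is now visible
```

Reading `"k"` before and after T1 commits gives different values, which is exactly the non-repeatable read that read committed permits.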
The isolation level read committed is also suitable for RonDB since it is designed for real-time data. Thus it is important to consider any changes happening while the query is running. Supporting repeatable read would mean extra memory overhead, extra processing overhead and even the risk of running out of memory due to long-running queries.
Many DBMSs also ensure that ranges of rows can be protected. RonDB does not support this, since the intended set of applications had no immediate need for these range locks. A range lock would make it very hard to meet response time requirements, and applications could easily be made less scalable by misuse of them. But even with correct application usage, range locks in a distributed architecture will limit the scalability of the DBMS.
Durability in the strict sense requires the transaction to be committed on disk. This is hard to achieve while still meeting the response time requirements. RonDB therefore uses an alternative model: network durability. The transaction is considered safe if at least two computers have received the transaction at commit time. If multiple nodes crash simultaneously, a transaction could be lost. The restored system will nevertheless always be a consistent database. If it is important not to report a transaction as successful before it is durable on disk in all nodes, there are API calls that make it possible to wait for this durability level.
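The difference between network durability and disk durability can be sketched with a toy model in which each node has volatile memory and a disk. Everything here is an assumption made for illustration: `Node`, `commit`, `commit_and_wait_durable` and `survives` are hypothetical names and do not correspond to RonDB API calls.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.memory = []   # transactions received, lost on a crash
        self.disk = []     # transactions flushed, survive a crash
        self.alive = True

    def receive(self, txn) -> bool:
        if not self.alive:
            return False
        self.memory.append(txn)
        return True

    def flush(self) -> None:
        # Persist everything received so far.
        if self.alive:
            self.disk = list(self.memory)

    def crash(self) -> None:
        self.alive = False
        self.memory = []


def commit(nodes, txn, min_copies=2) -> bool:
    """Network durability: acknowledge once enough nodes hold the transaction."""
    acks = sum(node.receive(txn) for node in nodes)
    return acks >= min_copies


def commit_and_wait_durable(nodes, txn) -> bool:
    """Stricter variant: acknowledge only after all nodes have flushed to disk."""
    ok = commit(nodes, txn)
    if ok:
        for node in nodes:
            node.flush()
    return ok


def survives(nodes, txn) -> bool:
    return any(txn in n.memory or txn in n.disk for n in nodes)


# Network durability only: one crash is fine, a simultaneous crash loses data.
n1, n2 = Node("n1"), Node("n2")
acked = commit([n1, n2], "t1")
n1.crash()
after_one_crash = survives([n1, n2], "t1")
n2.crash()
after_both_crash = survives([n1, n2], "t1")

# Waiting for disk durability: the transaction survives even if both crash.
m1, m2 = Node("m1"), Node("m2")
commit_and_wait_durable([m1, m2], "t2")
m1.crash()
m2.crash()
durable_after_both_crash = survives([m1, m2], "t2")
```

In this model the second variant pays for its stronger guarantee with a synchronous flush on every node before the commit is acknowledged, which is the latency cost that the default model avoids.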