BLOB handling#
BLOBs are treated in a special manner in RonDB. The reason is that RonDB is a real-time engine, which means that working with large BLOBs is undesirable for the RonDB data nodes. Working with large objects would create challenges around buffer management, and a single read of a row would have to be scheduled in several steps.
To avoid these problems, we split the BLOB into a number of parts and work with one part at a time.
Currently the RonDB data nodes do not support storing an entire BLOB as part of the original row. A large BLOB must be split into multiple parts, and these parts are stored in separate rows.
A BLOB (or TEXT) field creates a VARCHAR column of up to 256 bytes in the table itself. If the BLOB fits in these 256 bytes, no extra parts are needed. If the BLOB is larger, it is split: the first 256 bytes are stored in the table itself and the rest is stored in a separate table dedicated to that BLOB column. Each row in this parts table contains a VARCHAR that can store up to 2 kBytes of the BLOB. For example, a BLOB of 12.8 kBytes is stored partly in the table itself (256 bytes), followed by 6 part rows that each use the full 2 kBytes and a final part row that stores the remainder of about 0.55 kBytes. These 7 part rows are stored in a special table that is only used for this BLOB column. Each BLOB column thus adds one more table in RonDB. It is important to remember this when considering the amount of metadata RonDB has to store.
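As a concrete illustration, consider the following sketch. The parts table is created and managed internally by RonDB, so its real name and columns differ; everything below the CREATE TABLE is a conceptual picture, not something the user sees.

```sql
-- User-visible table with a BLOB column:
CREATE TABLE t1 (
  id  INT PRIMARY KEY,
  doc BLOB
) ENGINE=NDBCLUSTER;

-- For a 12.8 kByte value in doc, RonDB stores:
--   256 bytes         inline in t1 (the hidden VARCHAR head column)
--   6 x 2 kBytes      in six part rows in the hidden parts table
--   ~0.55 kBytes      in a seventh, final part row
```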
One reason for storing the extra rows in a separate table is to make it easier to add and drop BLOB columns. Adding a BLOB column can be performed as an online operation by adding the VARCHAR column to the base table using the dynamic row format and creating a new table for the BLOB parts.
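At the SQL level such an online operation could look like the statement below. This is a hedged sketch: whether ALGORITHM=INPLACE is accepted for a BLOB or TEXT column depends on the server version and table settings.

```sql
-- Attempt an in-place (online) add of a TEXT column; with
-- ALGORITHM=INPLACE the server rejects the statement, rather than
-- silently copying the table, if it cannot be done online.
ALTER TABLE t1 ALGORITHM=INPLACE, ADD COLUMN notes TEXT;
```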
The part rows use the same partitioning key as the base table and share part of the base table's primary key, with an added column for the part id. Since the BLOB tables use the same partitioning scheme, the BLOB part rows are stored in the same places as their base rows.
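Expressed as hypothetical DDL (the real parts table is hidden, and the names and types here are illustrative), the parts table for the doc column above would look roughly like this:

```sql
CREATE TABLE t1_doc_parts (
  id        INT NOT NULL,            -- base table's primary key column(s)
  part_id   INT UNSIGNED NOT NULL,   -- which 2 kByte chunk of the BLOB this is
  part_data VARBINARY(2000),         -- up to 2 kBytes of BLOB data
  PRIMARY KEY (id, part_id)
) ENGINE=NDBCLUSTER
  PARTITION BY KEY (id);             -- same partitioning key as the base
                                     -- table, so each part is co-located
                                     -- with its base row
```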
Some of the problems with BLOBs come from load regulation. If we store a BLOB that is 100 MBytes in size and someone tries to write or read the entire BLOB as one operation, this puts an enormous load on the network for a short time, most likely stopping many other operations from progressing. To avoid this we definitely want to send a large BLOB over the network in parts, even if, at some point, the whole BLOB were stored inside the table row.
Using BLOBs for fields that normally aren't very large is not the optimal solution in RonDB. For example, if the majority of the values are around 1 kByte in size, it makes sense to use a VARBINARY(2048) or something similar instead.
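A sketch of this alternative, assuming the values are known to stay below 2 kBytes:

```sql
-- Stored fully in-row: no hidden parts table, and no extra
-- operation records per part when reading or writing the value.
CREATE TABLE small_docs (
  id  INT PRIMARY KEY,
  doc VARBINARY(2048)
) ENGINE=NDBCLUSTER;
```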
However, for storing objects that can be fairly large, the BLOB implementation works well. Reading and writing of BLOBs is split into many operations to ensure that the network isn't overloaded by a single BLOB operation.
Concurrency control of BLOB columns is a bit special due to this splitting of the column; this is covered in the coming chapter on Concurrency Control.
When executing a transaction that changes a BLOB, one operation record is needed per BLOB part. In the example above, with a BLOB of 12.8 kBytes in size, writing the BLOB requires 7 additional operation records. In RonDB, all BLOB operations issued from SQL either write or read the entire object.
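A back-of-the-envelope check of that part count, assuming the 256-byte inline head and 2 kByte parts described above:

```sql
-- Part rows needed for a 12 800-byte BLOB beyond the 256-byte head:
SELECT CEILING((12800 - 256) / 2000.0) AS extra_operation_records;
-- -> 7 (six full 2 kByte parts plus one ~0.55 kByte tail part)
```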
A natural choice for many BLOB columns is to store them as disk data columns. The main consideration here is that each BLOB part row still has to store its primary key part in main memory; thus the main memory used per BLOB part is about 50-60 bytes, depending on the size of the primary key. If we need to store about 1 TByte of BLOBs, we still need about 30 GBytes of main memory storage.
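The arithmetic behind that estimate, together with a sketch of a disk data BLOB column (assuming a tablespace ts1 has already been created; the table and column names are illustrative):

```sql
-- 1 TByte of BLOB data in 2000-byte parts is ~500 million part rows;
-- at ~60 bytes of main memory each, that is ~30 GBytes:
SELECT 1e12 / 2000 * 60 / 1e9 AS main_memory_gb;   -- -> 30

-- BLOB data kept on disk, per-part key info still in main memory:
CREATE TABLE big_docs (
  id  INT PRIMARY KEY,
  doc LONGBLOB STORAGE DISK
) TABLESPACE ts1 ENGINE=NDBCLUSTER;
```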