RonDB Generic interpreter#
RonDB data nodes contain an interpreter that supports various common operations. Prior to RonDB 24.10 this interpreter was mainly used to support filters on scans and key lookups and to provide autoincrement support.
In 24.10 the interpreter has been extended to support a lot more: more arithmetic operations, a memory, the ability to read memory into registers, and reading column data into memory. For variable size columns using the binary character set, even parts of a column can be read.
This has found use cases in situations where RonDB is used in other data servers such as graph databases, key-value stores and so forth. An interpreter in the data node is a way of avoiding data shipping, which incurs CPU costs, network costs and latency. Instead we can sometimes use function shipping to optimise. However, it comes at a cost in complexity, so it should mostly be used in situations where it really pays off.
The interpreter is a virtual machine that has 8 registers and a 64 kByte memory. The 64 kByte memory was added in RonDB 24.10. The 8 registers store 64-bit signed integers, but can also be loaded with 8-bit, 16-bit and 32-bit unsigned integers. A register can also hold the NULL value, which is the value every register starts out with.
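To make the register model concrete, here is a minimal sketch of the state described above. `ToyInterpreterState` and its methods are hypothetical names used only for illustration. They are not RonDB types, and the real interpreter state inside the data node may be laid out quite differently.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the interpreter state; not a RonDB type.
struct ToyInterpreterState {
  // 8 registers, each either NULL (std::nullopt) or a 64-bit signed value.
  // Value-initialisation leaves every register NULL, as in the text.
  std::array<std::optional<std::int64_t>, 8> regs{};

  // 64 kByte of byte-addressable scratch memory, zero-filled here.
  std::vector<std::uint8_t> memory = std::vector<std::uint8_t>(64 * 1024);

  // Loading a 16-bit unsigned value widens it to 64 bits (no sign extension).
  void loadU16(unsigned reg, std::uint16_t value) { regs.at(reg) = value; }

  // Loading NULL puts the register back into its initial state.
  void loadNull(unsigned reg) { regs.at(reg).reset(); }
};
```

In this model an unsigned 16-bit value such as 50000 stays positive after being stored in the 64-bit signed register.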
The interpreter always operates on a single row, in either a scan operation or a key lookup. The user of the NDB API can send the program and data as input to the interpreter. The interpreter has access to all columns of the row and can also send results back to the NDB API user.
Interpreter instructions#
The interpreter program is created in the NDB API using a class called NdbInterpretedCode. The example below shows the steps used to define an interpreted program. First create an object of the class NdbInterpretedCode. If the program reads or writes columns, it needs to have a table object mapped to it. Next define the set of instructions. The last instruction should always be an exit instruction: interpret_exit_ok signals success, while interpret_exit_nok signals a failure and causes the operation to be aborted.
The final statement is a call to the finalise method. This ensures that the code is consistent and maps labels used in the program appropriately.
The NdbInterpretedCode object is passed to the NDB API through an OperationOptions variable, which needs to have the OO_INTERPRETED flag set and the object mapped to it. This data structure is then passed in a call to readTuple, updateTuple, writeTuple or deleteTuple. It cannot be used in an insertTuple call. In the case of writeTuple, the interpreted program is only executed if the write operation is translated into an update; it is ignored if the write is translated into an insert operation.
The other parameters to the readTuple call are explained in another chapter.
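// Buffer that holds the interpreted program (1024 words).
Uint32 buffer[1024];
NdbInterpretedCode code(pTab, &buffer[0], 1024);
// Load the constant 50000 into register 0.
code.load_const_u16(0, 50000);
// End the program by signalling success.
code.interpret_exit_ok();
// Check the program and map the labels it uses.
code.finalise();
// Attach the program to the operation through OperationOptions.
NdbOperation::OperationOptions opts;
std::memset(&opts, 0, sizeof(opts));
opts.optionsPresent = NdbOperation::OperationOptions::OO_INTERPRETED;
opts.interpretedCode = &code;
pOp = pTrans->readTuple(pRowRecord, (char*)pRow,
                        pRowRecord, (char*)pRow,
                        NdbOperation::LM_Read,
                        0,
                        &opts,
                        sizeof(opts));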
Arithmetic instructions#
The following arithmetic operations are available in the interpreter.
Operation | Output | First Input | Second input |
---|---|---|---|
Add | Register | Register | Register |
Add | Register | Register | Constant |
Sub | Register | Register | Register |
Sub | Register | Register | Constant |
Left shift | Register | Register | Register |
Left shift | Register | Register | Constant |
Right shift | Register | Register | Register |
Right shift | Register | Register | Constant |
Mul | Register | Register | Register |
Mul | Register | Register | Constant |
Div | Register | Register | Register |
Div | Register | Register | Constant |
And | Register | Register | Register |
And | Register | Register | Constant |
Or | Register | Register | Register |
Or | Register | Register | Constant |
Xor | Register | Register | Register |
Xor | Register | Register | Constant |
Mod | Register | Register | Register |
Mod | Register | Register | Constant |
Not | Register | Register | N/A |
Move | Register | Register | N/A |
Here are the declarations of these methods in the header file:
int add_reg(Uint32 RegDest, Uint32 RegSource1, Uint32 RegSource2);
int sub_reg(Uint32 RegDest, Uint32 RegSource1, Uint32 RegSource2);
int lshift_reg(Uint32 RegDest, Uint32 RegSource1, Uint32 RegSource2);
int rshift_reg(Uint32 RegDest, Uint32 RegSource1, Uint32 RegSource2);
int mul_reg(Uint32 RegDest, Uint32 RegSource1, Uint32 RegSource2);
int div_reg(Uint32 RegDest, Uint32 RegSource1, Uint32 RegSource2);
int and_reg(Uint32 RegDest, Uint32 RegSource1, Uint32 RegSource2);
int or_reg(Uint32 RegDest, Uint32 RegSource1, Uint32 RegSource2);
int xor_reg(Uint32 RegDest, Uint32 RegSource1, Uint32 RegSource2);
int mod_reg(Uint32 RegDest, Uint32 RegSource1, Uint32 RegSource2);
int not_reg(Uint32 RegDest, Uint32 RegSource1);
int move_reg(Uint32 RegDest, Uint32 RegSource);
int add_const_reg(Uint32 RegDest, Uint32 RegSource1, Uint16 Constant);
int sub_const_reg(Uint32 RegDest, Uint32 RegSource1, Uint16 Constant);
int lshift_const_reg(Uint32 RegDest, Uint32 RegSource1, Uint16 Constant);
int rshift_const_reg(Uint32 RegDest, Uint32 RegSource1, Uint16 Constant);
int mul_const_reg(Uint32 RegDest, Uint32 RegSource1, Uint16 Constant);
int div_const_reg(Uint32 RegDest, Uint32 RegSource1, Uint16 Constant);
int and_const_reg(Uint32 RegDest, Uint32 RegSource1, Uint16 Constant);
int or_const_reg(Uint32 RegDest, Uint32 RegSource1, Uint16 Constant);
int xor_const_reg(Uint32 RegDest, Uint32 RegSource1, Uint16 Constant);
int mod_const_reg(Uint32 RegDest, Uint32 RegSource1, Uint16 Constant);
Load constant instructions#
Another set of instructions loads constants into registers, where the constant can also be the NULL value.
Operation | Output | First Input |
---|---|---|
Load Constant NULL | Register | N/A |
Load Constant Uint16 | Register | Constant |
Load Constant Uint32 | Register | Constant |
Load Constant Uint64 | Register | Constant |
Another load constant instruction copies variable sized constant data into the memory used by the interpreter. One register supplies the memory offset and the size is a constant embedded in the instruction. After the instruction has executed, the size is stored in a second register. The actual data is passed to the method through a pointer.
Operation | OutputSize | Memory offset | Size | Memory address |
---|---|---|---|---|
Load Memory | Register | Register | Constant | Pointer |
Here are the declarations in the header file.
int load_const_null(Uint32 RegDest);
int load_const_u16(Uint32 RegDest, Uint32 Constant);
int load_const_u32(Uint32 RegDest, Uint32 Constant);
int load_const_u64(Uint32 RegDest, Uint64 Constant);
int load_const_mem(Uint32 RegMemoryOffset,
Uint32 RegDestSize,
Uint16 SizeConstant,
Uint32 *const_memory);
Read column instructions#
There are several instructions that read an entire column. Those that read into a register only apply to columns that fit within the 64 bits of the register. The read into memory applies to any column.
Operation | Column | Destination | Destination size |
---|---|---|---|
Read Into Register | Column id | Register | N/A |
Read Into Register | Column object | Register | N/A |
Read Into Memory | Column id | Register Offset | Register |
Read Into Memory | Column object | Register Offset | Register |
One can also read only parts of a column into memory. This only applies to columns of variable size with the binary character set, since it would be complicated to verify the correctness of a partial read of a multi-byte character set. This feature is mainly intended for use cases where the user has a data model stored within a large array of binary data. RonDB uses this as a method to store file data inside RonDB, or other generic data such as JSON objects and so forth.
Operation | Column | Destination offset | Position | Size | Destination size |
---|---|---|---|---|---|
Read Part | Column id | Register | Register | Register | Register |
Read Part | Column object | Register | Register | Register | Register |
Reading a column into memory always uses the first 4 bytes of the memory internally in the interpreter as input to the column reader. This is called the AttributeHeader in the RonDB code. After the read instruction it contains the size in bytes read (including the length bytes, if read) in bits 0-14, a flag in bit 15 that is 1 if a partial read was performed, and the column id in bits 16-31. Normally the interpreted program can ignore these 4 bytes.
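The bit layout above can be unpacked with a few shifts and masks. The following is only a sketch: `DecodedHeader` and `decodeHeader` are hypothetical names, and the sketch assumes the 4 bytes have already been loaded into a 32-bit word in which bit 0 is the least significant bit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type; the field names are illustrative only.
struct DecodedHeader {
  std::uint32_t sizeInBytes;  // bits 0-14: bytes read, including length bytes
  bool partialRead;           // bit 15: set if a partial read was performed
  std::uint32_t columnId;     // bits 16-31: column id
};

// Assumption: `word` holds the 4 header bytes as one 32-bit value,
// with bit 0 being the least significant bit.
DecodedHeader decodeHeader(std::uint32_t word) {
  DecodedHeader h;
  h.sizeInBytes = word & 0x7FFFu;              // keep the low 15 bits
  h.partialRead = ((word >> 15) & 0x1u) != 0;  // single flag bit
  h.columnId = word >> 16;                     // upper 16 bits
  return h;
}
```

For instance, a word built as `(7u << 16) | (1u << 15) | 3002u` decodes to column id 7, a partial read, and 3002 bytes.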
When a full column with fixed size is read, the header is followed directly by the column data. For a VARBINARY column, the first byte after the header holds the length of the data, so the total length read is the length of the data plus one. For LONGVARBINARY there are instead 2 length bytes, and the total length read is the length of the data plus two.
Thus, when reading a variable sized column that uses 2 length bytes, e.g. VARBINARY(3000), the actual data starts at position 6 from the memory offset: 4 header bytes followed by 2 length bytes.
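The offsets described above reduce to simple arithmetic. Here is a small sketch, assuming that the 4-byte header starts at the memory offset given to the read instruction. `kHeaderBytes` and `payloadStart` are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstdint>

// Size of the header that precedes the column data in memory.
constexpr std::uint32_t kHeaderBytes = 4;

// Position of the first data byte of a column read into memory.
// lengthBytes is 0 for a fixed size column, 1 for VARBINARY and
// 2 for LONGVARBINARY, following the description in the text.
constexpr std::uint32_t payloadStart(std::uint32_t memoryOffset,
                                     std::uint32_t lengthBytes) {
  return memoryOffset + kHeaderBytes + lengthBytes;
}
```

With a memory offset of 0 and 2 length bytes, `payloadStart` gives 6, which matches the position stated for VARBINARY(3000).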
Here are the declarations in the header file.
int read_attr(Uint32 RegDest, Uint32 attrId);
int read_attr(Uint32 RegDest, const NdbDictionary::Column *column);
int read_full(Uint32 attrId,
Uint32 RegMemoryOffset,
Uint32 RegDestSize);
int read_full(const NdbDictionary::Column *column,
Uint32 RegMemoryOffset,
Uint32 RegDestSize);
int read_partial(Uint32 attrId,
Uint32 RegMemoryOffset,
Uint32 RegPos,
Uint32 RegSize,
Uint32 RegDestSize);
int read_partial(const NdbDictionary::Column *column,
Uint32 RegMemoryOffset,
Uint32 RegPos,
Uint32 RegSize,
Uint32 RegDestSize);
Write column instructions#
Writing columns is a feature that is useful in update operations. It does not really make sense for write operations, since a write operation will not execute the interpreter if it becomes an insert operation, and it overwrites the row with new values even if the row already existed. Write operations can require some interpreter logic, but not to write columns.
Operation | Column | Destination | Destination size |
---|---|---|---|
Write From Register | Column id | Register | N/A |
Write From Register | Column object | Register | N/A |
Write From Memory | Column id | Register Offset | Register |
Write From Memory | Column object | Register Offset | Register |
Append From Memory | Column id | Register Offset | Register |
Append From Memory | Column object | Register Offset | Register |
Here are the declarations in the header file.
int write_attr(Uint32 attrId, Uint32 RegSource);
int write_attr(const NdbDictionary::Column *column, Uint32 RegSource);
int write_from_mem(Uint32 attrId,
Uint32 RegMemoryOffset,
Uint32 RegSize);
int write_from_mem(const NdbDictionary::Column *column,
Uint32 RegMemoryOffset,
Uint32 RegSize);
int append_from_mem(Uint32 attrId,
Uint32 RegMemoryOffset,
Uint32 RegSize);
int append_from_mem(const NdbDictionary::Column *column,
Uint32 RegMemoryOffset,
Uint32 RegSize);
Reading from memory to register#
In cases where RonDB stores binary data, it is useful to read the binary data into memory and retrieve the actual data from it. This obviously requires an understanding of what the binary data represents. For example, the data could contain a sorted list on which one could perform a binary search to find where to go next. This could be a use case for data engines that use RonDB to store complex indices.
Reading and writing to memory from registers can also be useful to spill registers in complex calculations.
Operation | Destination Register | Memory offset |
---|---|---|
Read Uint8 Mem to Reg | Register | Register |
Read Uint8 Mem to Reg | Register | Constant |
Read Uint16 Mem to Reg | Register | Register |
Read Uint16 Mem to Reg | Register | Constant |
Read Uint32 Mem to Reg | Register | Register |
Read Uint32 Mem to Reg | Register | Constant |
Read Int64 Mem to Reg | Register | Register |
Read Int64 Mem to Reg | Register | Constant |
Operation | Source Register | Memory offset |
---|---|---|
Write Uint8 Reg to Mem | Register | Register |
Write Uint8 Reg to Mem | Register | Constant |
Write Uint16 Reg to Mem | Register | Register |
Write Uint16 Reg to Mem | Register | Constant |
Write Uint32 Reg to Mem | Register | Register |
Write Uint32 Reg to Mem | Register | Constant |
Write Int64 Reg to Mem | Register | Register |
Write Int64 Reg to Mem | Register | Constant |
Here are the declarations in the header file.
int read_uint8_to_reg_const(Uint32 RegDest, Uint32 memory_offset);
int read_uint16_to_reg_const(Uint32 RegDest, Uint32 memory_offset);
int read_uint32_to_reg_const(Uint32 RegDest, Uint32 memory_offset);
int read_int64_to_reg_const(Uint32 RegDest, Uint32 memory_offset);
int read_uint8_to_reg_reg(Uint32 RegDest, Uint32 RegOffset);
int read_uint16_to_reg_reg(Uint32 RegDest, Uint32 RegOffset);
int read_uint32_to_reg_reg(Uint32 RegDest, Uint32 RegOffset);
int read_int64_to_reg_reg(Uint32 RegDest, Uint32 RegOffset);
int write_uint8_reg_to_mem_const(Uint32 RegSource, Uint16 memory_offset);
int write_uint16_reg_to_mem_const(Uint32 RegSource, Uint16 memory_offset);
int write_uint32_reg_to_mem_const(Uint32 RegSource, Uint16 memory_offset);
int write_int64_reg_to_mem_const(Uint32 RegSource, Uint16 memory_offset);
int write_uint8_reg_to_mem_reg(Uint32 RegSource, Uint32 RegOffset);
int write_uint16_reg_to_mem_reg(Uint32 RegSource, Uint32 RegOffset);
int write_uint32_reg_to_mem_reg(Uint32 RegSource, Uint32 RegOffset);
int write_int64_reg_to_mem_reg(Uint32 RegSource, Uint32 RegOffset);
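The binary search use case mentioned above can be modelled outside the interpreter. This sketch runs on a plain byte buffer rather than on the interpreter itself. It assumes the sorted list is stored as consecutive 32-bit little-endian values, which is an assumption of this example and not something stated about the read instructions. `readU32LE`, `appendU32LE` and `findSorted` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption for this sketch: entries are stored little-endian.
std::uint32_t readU32LE(const std::vector<std::uint8_t>& mem, std::size_t offset) {
  return static_cast<std::uint32_t>(mem[offset]) |
         (static_cast<std::uint32_t>(mem[offset + 1]) << 8) |
         (static_cast<std::uint32_t>(mem[offset + 2]) << 16) |
         (static_cast<std::uint32_t>(mem[offset + 3]) << 24);
}

// Helper to build example data: appends one little-endian 32-bit value.
void appendU32LE(std::vector<std::uint8_t>& mem, std::uint32_t v) {
  for (int i = 0; i < 4; ++i) {
    mem.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
  }
}

// Binary search over `count` ascending 32-bit entries starting at `base`.
// Returns the index of `key`, or -1 if it is not present.
long findSorted(const std::vector<std::uint8_t>& mem, std::size_t base,
                std::size_t count, std::uint32_t key) {
  std::size_t lo = 0;
  std::size_t hi = count;
  while (lo < hi) {
    const std::size_t mid = lo + (hi - lo) / 2;
    const std::uint32_t v = readU32LE(mem, base + 4 * mid);
    if (v == key) return static_cast<long>(mid);
    if (v < key) {
      lo = mid + 1;  // key can only be in the upper half
    } else {
      hi = mid;      // key can only be in the lower half
    }
  }
  return -1;
}
```

An interpreted program would express the same loop with the memory read instructions, the arithmetic instructions and the branch instructions. Each read at a computed offset corresponds to one of the register-offset variants declared above.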
Handling variable size length conversions#
Variable sized columns that use 2 length bytes, e.g. VARBINARY(3000), always start with 2 bytes containing the length of the column. This length is always stored in little-endian format, so to use it safely one has to convert it to the local format used by the computer. Two instructions are available for this: one converts the size stored in memory and places it in a register, and the other stores a 2-byte length in memory based on the value in a register.
Operation | Source/Dest Register | Memory offset |
---|---|---|
Convert size in memory | Register | Register |
Write size into memory | Register | Register |
Here are the declarations in the header file.
int convert_size(Uint32 RegSizeDest, Uint32 RegOffset);
int write_size_mem(Uint32 RegSize, Uint32 RegOffset);
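To illustrate what a little-endian 2-byte length looks like in memory, here is a sketch of the two conversions on a plain byte buffer. `readSizeLE` and `writeSizeLE` are hypothetical helpers that only model the byte order. They are not the implementation behind convert_size and write_size_mem.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decode a 2-byte little-endian length: low byte first, then high byte.
// Works the same way on little-endian and big-endian hosts.
std::uint16_t readSizeLE(const std::uint8_t* mem, std::size_t offset) {
  return static_cast<std::uint16_t>(mem[offset] | (mem[offset + 1] << 8));
}

// Encode a length as 2 little-endian bytes at the given offset.
void writeSizeLE(std::uint8_t* mem, std::size_t offset, std::uint16_t size) {
  mem[offset] = static_cast<std::uint8_t>(size & 0xFF);  // low byte
  mem[offset + 1] = static_cast<std::uint8_t>(size >> 8);  // high byte
}
```

A length of 3000 (0x0BB8) is stored as the bytes 0xB8 followed by 0x0B.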
Sending generic data back to NDB API#
In some cases it might be a requirement to send information back to the NDB API. For example, if a write operation appends to a variable sized column, the user needs to know how much was actually written to the column. If we want to write 2048 bytes into a VARBINARY(3000), we could end up with not all of the data being written, and this can be vital knowledge for the NDB API user. To gain this information one can read the size of the column before the update and after the update, and thus see how much of the write was applied and also the final length of the column.
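The calculation described above is a subtraction of the two sizes. The sketch below models it under the assumption that an append is cut off once the column reaches its maximum size. The text implies a partial write is possible but does not spell out the exact rule, so treat the truncation as an assumption. `AppendOutcome` and `modelAppend` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical result type for this example.
struct AppendOutcome {
  std::uint32_t sizeAfter;     // column length after the append
  std::uint32_t bytesApplied;  // how much of the append was applied
};

// Assumption: the append is truncated when the column reaches maxSize.
// Precondition: sizeBefore <= maxSize.
AppendOutcome modelAppend(std::uint32_t sizeBefore, std::uint32_t requested,
                          std::uint32_t maxSize) {
  const std::uint32_t sizeAfter = std::min(maxSize, sizeBefore + requested);
  // The applied part is the difference between the two observed sizes.
  return AppendOutcome{sizeAfter, sizeAfter - sizeBefore};
}
```

Under this assumption, appending 2048 bytes to a VARBINARY(3000) that already holds 1500 bytes gives a final size of 3000 and 1500 applied bytes.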
The interpreter can write any data into 16 output registers. These are all unsigned 32-bit integers. These registers can later be read by the final read operations in the interpreted execution. If the operation was an insert, both the size before and after will be NULL since the insert operation will not execute anything apart from the update part.
Operation | Value | Output index |
---|---|---|
Write to output register | Register | Constant |
Here are the declarations in the header file.
Conditional branch instructions#
There is a range of branch instructions that compare two values. The left value is always a register; the right value can be a register or a constant. The branch is taken if the condition is true, and it goes to a label that is defined by an instruction called def_label, with the label number as the only input.
Normally the labels would be defined in ascending order without gaps.
Operation | Left Value | Right Value | Label |
---|---|---|---|
Branch Greater or Equal | Register | Register | Constant |
Branch Greater or Equal | Register | Constant | Constant |
Branch Greater than | Register | Register | Constant |
Branch Greater than | Register | Constant | Constant |
Branch Less or Equal | Register | Register | Constant |
Branch Less or Equal | Register | Constant | Constant |
Branch Less than | Register | Register | Constant |
Branch Less than | Register | Constant | Constant |
Branch Equal | Register | Register | Constant |
Branch Equal | Register | Constant | Constant |
Branch Not Equal | Register | Register | Constant |
Branch Not Equal | Register | Constant | Constant |
There are also branch instructions that compare the register value with NULL.
Operation | Left Value | Label |
---|---|---|
Branch Equal NULL | Register | Constant |
Branch Not Equal NULL | Register | Constant |
Finally, there is a branch instruction, branch_label, that jumps unconditionally to a label.
Here are the declarations in the header file.
int branch_ge(Uint32 RegLvalue, Uint32 RegRvalue, Uint32 label);
int branch_gt(Uint32 RegLvalue, Uint32 RegRvalue, Uint32 label);
int branch_le(Uint32 RegLvalue, Uint32 RegRvalue, Uint32 label);
int branch_lt(Uint32 RegLvalue, Uint32 RegRvalue, Uint32 label);
int branch_eq(Uint32 RegLvalue, Uint32 RegRvalue, Uint32 label);
int branch_ne(Uint32 RegLvalue, Uint32 RegRvalue, Uint32 label);
int branch_ne_null(Uint32 RegLvalue, Uint32 label);
int branch_eq_null(Uint32 RegLvalue, Uint32 label);
int branch_ge_const(Uint32 RegLvalue, Uint16 Constant, Uint32 label);
int branch_gt_const(Uint32 RegLvalue, Uint16 Constant, Uint32 label);
int branch_le_const(Uint32 RegLvalue, Uint16 Constant, Uint32 label);
int branch_lt_const(Uint32 RegLvalue, Uint16 Constant, Uint32 label);
int branch_eq_const(Uint32 RegLvalue, Uint16 Constant, Uint32 label);
int branch_ne_const(Uint32 RegLvalue, Uint16 Constant, Uint32 label);
int branch_label(Uint32 label);
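Since branches refer to label numbers rather than instruction positions, the labels have to be mapped to positions before the program can run. The sketch below is a conceptual model of that mapping only. It is not how finalise is implemented in RonDB, and `Kind`, `Instr` and `resolveLabels` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical, simplified instruction model for this example.
enum class Kind { Other, DefLabel, Branch };

struct Instr {
  Kind kind;
  std::uint32_t label;  // only meaningful for DefLabel and Branch
};

// Records the position of every label definition and checks that each
// branch targets a defined label. Returns false on a duplicate label
// definition or on a branch to an undefined label.
bool resolveLabels(const std::vector<Instr>& program,
                   std::map<std::uint32_t, std::size_t>& positions) {
  positions.clear();
  for (std::size_t i = 0; i < program.size(); ++i) {
    if (program[i].kind == Kind::DefLabel) {
      if (!positions.emplace(program[i].label, i).second) return false;
    }
  }
  for (const Instr& in : program) {
    if (in.kind == Kind::Branch && positions.count(in.label) == 0) return false;
  }
  return true;
}
```

Because all definitions are collected before the branches are checked, a branch in this model may target a label that is defined later in the program.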
There is also a whole range of complex branch instructions, which are documented in the header file NdbInterpretedCode.hpp. They are mainly used for scan filters and implement all sorts of filters used by the MySQL Server.
Exit instructions#
As mentioned, the last instruction executed in an interpreted program must always be either interpret_exit_ok or interpret_exit_nok.