Automated CPU spinning

In RonDB, low latency and high throughput are two major requirements. RonDB is built around a thread pipeline that has a major positive impact on throughput, and as a result latency at high load decreases as well.

However, latency at low load can suffer with a thread pipeline, since executing a database query means waiting for more threads to wake up.

One solution to this is CPU spinning. Most CPUs have special instructions intended for spin loops. For example, x86 CPUs have an instruction (PAUSE) that takes around one hundred nanoseconds to execute. While this instruction executes, the CPU can save power, since the application has signalled that it is waiting for an event. On hyperthreaded CPUs it also means that the other CPU thread in the CPU core gets access to all of the core's resources.

If the thread goes to sleep, the operating system is likely to execute instructions that put the CPU core into a lower-power state. However, waking up from such power-saving states requires energy as well and increases latency. The special instruction also lowers power usage, although not to the level of the deeper sleep modes that the OS has access to. Thus a reasonable level of CPU spinning is unlikely to increase power usage, while a higher level of CPU spinning is likely to consume somewhat more power but has a very positive impact on data access latency.

If the CPUs are shared with other processes, CPU spinning is not such a good idea, since it removes the chance for those processes to execute while RonDB is inactive. However, in most real use cases the RonDB data nodes execute on their own VMs or even on their own bare-metal servers.

Thus CPU spinning can be fairly efficient when the thread has no work to do. CPU spinning avoids wake-ups, and waking up a sleeping thread can take up to 25 microseconds. Substantial latency savings can therefore be achieved with CPU spinning.
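The spin-then-sleep idea described above can be illustrated with a small sketch. This is not RonDB code: the function name `wait_for_event` and the parameter `spin_budget_us` are hypothetical, and a real implementation would issue the CPU's spin-wait instruction inside the loop rather than polling from Python. The sketch only shows the control flow: poll for a bounded time, and fall back to a blocking wait (which incurs the wake-up cost) if nothing arrives.

```python
import threading
import time


def wait_for_event(event: threading.Event, spin_budget_us: float) -> str:
    """Poll `event` for at most `spin_budget_us` microseconds, then block.

    Returns "spun" if the event arrived during the spin phase and "slept"
    if the thread had to fall back to a blocking wait.
    Both the name and the return values are illustrative assumptions.
    """
    deadline = time.perf_counter() + spin_budget_us / 1_000_000.0
    # Spin phase: stay on the CPU and keep checking the flag.
    while time.perf_counter() < deadline:
        if event.is_set():
            return "spun"
    # Sleep phase: hand the CPU back to the OS until the event is set.
    event.wait()
    return "slept"
```

A caller would pick `spin_budget_us` in the same order of magnitude as the wake-up cost it is trying to avoid; a budget of zero degenerates to a plain blocking wait.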

However, always spinning on the CPU, even when a new event is unlikely to arrive soon, is not such a good idea. Thus we have implemented adaptive CPU spinning. We gather statistics on how often a thread is woken up and on the expected time before the next event arrives. Using these statistics we can avoid CPU spinning in cases where it is likely to be useless. Adaptive spinning thereby achieves a good balance between power savings and latency savings.
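As a rough illustration of how such statistics could drive the decision, the sketch below keeps a sliding window of recent idle gaps and spins only when the average gap is short enough. The class `AdaptiveSpinner`, its parameters, and the use of a simple mean are assumptions made for this example; the text does not specify which statistics or thresholds RonDB actually uses.

```python
from collections import deque


class AdaptiveSpinner:
    """Hypothetical helper that decides whether spinning is worthwhile."""

    def __init__(self, max_spin_us: float = 100.0, window: int = 16):
        # max_spin_us: longest expected wait we are willing to spin through
        # (an assumed tuning value, not a RonDB parameter).
        self.max_spin_us = max_spin_us
        # Only the most recent `window` idle gaps are remembered.
        self.gaps_us = deque(maxlen=window)

    def record_gap(self, gap_us: float) -> None:
        """Record how long the thread was idle before the next event arrived."""
        self.gaps_us.append(gap_us)

    def expected_gap_us(self) -> float:
        """Mean of the recorded gaps; infinity when nothing is known yet."""
        if not self.gaps_us:
            return float("inf")
        return sum(self.gaps_us) / len(self.gaps_us)

    def should_spin(self) -> bool:
        """Spin only if the next event is expected within the spin budget."""
        return self.expected_gap_us() <= self.max_spin_us
```

With no history the sketch refuses to spin, which is the conservative choice; once short gaps dominate the window it starts spinning, and it stops again when the gaps grow.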

The user can choose between four levels of CPU spinning:

The lowest level is no CPU spinning at all.
The second level is to spin only when we decrease CPU usage by spinning, thus optimising for power savings.
The next level is a balanced effort between power savings and latency savings where we spend some extra effort in the CPUs to achieve an improved latency, but we are not overly aggressive in the CPU spinning. This is the default level.
The highest level provides the optimal latency savings, but at a higher cost of power usage.
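One way to picture the four levels is as increasingly generous spin budgets relative to the cost of a wake-up. The sketch below is an interpretation, not RonDB's implementation: the names in `SpinLevel`, the multipliers, and the 25 microsecond default (taken from the wake-up figure mentioned earlier) are all assumptions for illustration, and they are not RonDB configuration values.

```python
from enum import Enum


class SpinLevel(Enum):
    """Hypothetical names for the four levels described above."""
    NONE = 0          # never spin
    POWER_SAVING = 1  # spin only when it costs less CPU than sleep + wake-up
    BALANCED = 2      # spend some extra CPU for better latency (default)
    LATENCY = 3       # spin aggressively for the best latency


# Assumed multipliers: how many wake-up costs worth of spinning each level
# tolerates. These numbers are invented for the example.
_BUDGET_FACTOR = {
    SpinLevel.NONE: 0.0,
    SpinLevel.POWER_SAVING: 1.0,
    SpinLevel.BALANCED: 4.0,
    SpinLevel.LATENCY: 20.0,
}


def spin_budget_us(level: SpinLevel, wakeup_cost_us: float = 25.0) -> float:
    """Longest expected wait (in microseconds) the level will spin through."""
    return _BUDGET_FACTOR[level] * wakeup_cost_us


def should_spin(level: SpinLevel, expected_gap_us: float,
                wakeup_cost_us: float = 25.0) -> bool:
    """Spin if the next event is expected before the level's budget runs out."""
    return expected_gap_us < spin_budget_us(level, wakeup_cost_us)
```

Under these assumed numbers, an expected gap of 50 microseconds would be slept through at the power-saving level but spun through at the balanced level.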