Reporting issues in RonDB#
RonDB is maintained on GitHub. The source tree is hosted at https://github.com/logicalclocks/rondb. Any issues can be reported in the Issues section of this repository.
Write a description of how to reproduce the issue, together with some notes about the execution environment.
Next we describe how to put together a tarball of the important log files that are required to find the cause of the issue. We describe how to do this for Hopsworks users, with some notes on how to handle it for generic installations.
Preparing a crash tarball for a RonDB data node#
RonDB data nodes produce very detailed log reports that assist in understanding how a crash came about. These include node logs and trace files that show the messages executed right before the crash (mostly the message headers, not the message data). RonDB also records the last jumps made in the code, a very important addition to the stack trace that is also reported in the node log. The error log reports which thread crashed, and there is one trace file for each executing block thread.
All these files are in the same directory. In Hopsworks this directory is /srv/hops/mysql-cluster/log. To create a tarball of it, simply run:
```shell
tar cfz log.tar.gz log
```
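For generic installations the log directory may differ. The following sketch uses a scratch directory so the commands can be tried anywhere; on a real Hopsworks installation you would run the tar command from /srv/hops/mysql-cluster instead, and the file names below are only illustrative:

```shell
# Illustrative sketch: package a RonDB log directory into a tarball.
# A scratch directory stands in for /srv/hops/mysql-cluster here.
BASE=$(mktemp -d)
mkdir -p "$BASE/log"
echo "node log"  > "$BASE/log/ndb_1_out.log"
echo "error log" > "$BASE/log/ndb_1_error.log"
cd "$BASE"
tar cfz log.tar.gz log      # the tarball to attach to the GitHub issue
tar tfz log.tar.gz          # list the contents to verify before uploading
```

Listing the tarball contents before uploading is a cheap way to confirm that the node log, error log and trace files were actually captured.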
Now upload this tarball to the GitHub issue, and chances are good that we will be able to help you find the problem. Often the fix requires upgrading RonDB, which is described in the chapter on upgrading RonDB.
Generally there are three file types:

- The node log, usually named ndb_1_out.log, where 1 is the node id of the node.
- The error log, usually named ndb_1_error.log. Several crashes can be reported in the error log. We keep a maximum number of error reports to avoid filling the file system in cases of repeated failures (RonDB clusters can often continue working even with repeated failures of one data node). When the maximum number of crash reports is reached (configurable, defaulting to 25), the next error report is given number 1 again and overwrites the previous error report with number 1.
- The trace files, named something like ndb_1_trace.log.3_t2. The first number is the node id, the second number is the number of the crash, and t2 denotes thread number 2. The trace file for thread 0 is named ndb_1_trace.log.3.
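The trace-file naming scheme can be illustrated with dummy files; the node id, crash number and thread numbers below are examples, not taken from a real cluster:

```shell
# Illustrative: dummy files following the trace-file naming scheme, then
# selecting all trace files belonging to one crash (crash 3 on node 1).
LOGDIR=$(mktemp -d)
touch "$LOGDIR/ndb_1_trace.log.3"      # crash 3, thread 0
touch "$LOGDIR/ndb_1_trace.log.3_t1"   # crash 3, thread 1
touch "$LOGDIR/ndb_1_trace.log.3_t2"   # crash 3, thread 2
touch "$LOGDIR/ndb_1_trace.log.2_t1"   # an earlier crash, excluded below
ls "$LOGDIR"/ndb_1_trace.log.3*        # all trace files for crash 3
```

Matching on the crash number like this is useful when you want to inspect a single crash without wading through trace files from earlier failures.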
RonDB management server logs#
The log files of the RonDB management server are placed in the same directory as those of the RonDB data nodes. Two file types are of value here. The first is usually named ndb_65_cluster.log, where 65 is the node id of the management server. This is the cluster log (named cluster.log in Hopsworks), and it contains information about events in all the nodes of the cluster. There is also a node log for the management server in the same directory.
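As a quick sketch, the cluster log can be searched for alert-level events around the time of a crash. The file name and the log lines below are illustrative stand-ins, not output from a real cluster:

```shell
# Illustrative: scan a cluster log for alert-level events.
LOGDIR=$(mktemp -d)
cat > "$LOGDIR/ndb_65_cluster.log" <<'EOF'
2024-01-01 12:00:00 [MgmtSrvr] INFO  -- Node 1: Started
2024-01-01 12:05:00 [MgmtSrvr] ALERT -- Node 2: Forced node shutdown completed
EOF
grep 'ALERT' "$LOGDIR/ndb_65_cluster.log"
```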
MySQL Server logs#
The MySQL server logs often contain much less information, but can still provide useful clues about what happened in the cluster. In Hopsworks the MySQL error log is placed in the same directory as the other logs and is usually called mysql_67.log, where 67 is a node id; this is not always the node id you expect, since a MySQL server often uses multiple node ids to scale to a larger number of CPUs.
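Because the node id in the file name can vary, it is safer to collect MySQL error logs by pattern rather than by a specific id. The directory and file names below are illustrative:

```shell
# Illustrative: pick up all MySQL error logs by pattern instead of by a
# specific node id, since the id in the name may not be the one expected.
LOGDIR=$(mktemp -d)                  # stand-in for /srv/hops/mysql-cluster/log
touch "$LOGDIR/mysql_67.log" "$LOGDIR/ndb_1_out.log"
ls "$LOGDIR"/mysql_*.log
```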