The NDBCLUSTER storage engine used by MySQL Cluster is a relational database engine that stores records in tables, as do other relational database systems. Table rows represent records as tuples of relational data. When a new table is created, its attribute schema is specified for the table as a whole, so every row of the table has the same structure. Again, this is typical of relational databases, and NDB is no different in this regard.
Primary Keys. Each record has between 1 and 32 attributes that belong to the primary key of the table.
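As an illustration of how such a table and its primary key might be defined through the NDB API, the following sketch uses the NdbDictionary interface to create a table whose primary key spans two attributes. The connect string, database name, table name, and column names are all hypothetical, and error handling is kept to a minimum.

#include <NdbApi.hpp>
#include <cstdlib>
#include <iostream>

int main()
{
  ndb_init();
  {
    /* Hypothetical connect string and database name. */
    Ndb_cluster_connection conn("localhost:1186");
    if (conn.connect(4, 5, 1) != 0 || conn.wait_until_ready(30, 0) < 0)
      return EXIT_FAILURE;

    Ndb myNdb(&conn, "test_db");
    if (myNdb.init() != 0)
      return EXIT_FAILURE;

    /* Define a table whose primary key is made up of two attributes. */
    NdbDictionary::Table tab("t_example");

    NdbDictionary::Column colRegion("region_id");
    colRegion.setType(NdbDictionary::Column::Unsigned);
    colRegion.setPrimaryKey(true);
    tab.addColumn(colRegion);

    NdbDictionary::Column colUser("user_id");
    colUser.setType(NdbDictionary::Column::Unsigned);
    colUser.setPrimaryKey(true);
    tab.addColumn(colUser);

    NdbDictionary::Column colBalance("balance");
    colBalance.setType(NdbDictionary::Column::Int);
    tab.addColumn(colBalance);

    NdbDictionary::Dictionary *dict = myNdb.getDictionary();
    if (dict->createTable(tab) == -1)
      std::cerr << "createTable failed: "
                << dict->getNdbError().message << std::endl;
  }
  ndb_end(0);
  return EXIT_SUCCESS;
}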
Transactions. Transactions are committed first to main memory, and then to disk, after a global checkpoint (GCP) is issued. Since all data are (in most NDB Cluster configurations) synchronously replicated and stored on multiple data nodes, the system can handle processor failures without loss of data. However, in the case of a system-wide failure, all transactions (committed or not) occurring since the most recent GCP are lost.
Concurrency Control. NDBCLUSTER uses pessimistic concurrency control based on locking. The lock taken is implicit and depends on the database operation performed; if a requested lock cannot be attained within a specified time, a timeout error results.
Concurrent transactions, as requested by parallel application programs and thread-based applications, can sometimes deadlock when they attempt to access the same information at the same time. Applications must therefore be written so that timeout errors arising from such deadlocks are handled gracefully. This generally means that the transaction encountering a timeout should be rolled back and restarted.
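The following sketch shows one way such handling might look in an NDB API application. It assumes an already-initialized Ndb object (myNdb) and a hypothetical application function applyUpdate() that defines the operations of one transaction; any temporary error (which includes lock-wait timeouts) is treated as a signal to roll back and retry.

#include <NdbApi.hpp>

/* Hypothetical helper supplied by the application: defines the
   operations of one transaction on the given NdbTransaction. */
extern int applyUpdate(NdbTransaction *trans);

/* Execute a transaction, rolling back and retrying when a
   temporary error (such as a lock-wait timeout) occurs. */
int runWithRetry(Ndb *myNdb, int maxRetries)
{
  for (int attempt = 0; attempt < maxRetries; attempt++)
  {
    NdbTransaction *trans = myNdb->startTransaction();
    if (trans == NULL)
      return -1;                           /* could not start a transaction */

    int rc = -1;
    if (applyUpdate(trans) == 0 &&
        trans->execute(NdbTransaction::Commit) == 0)
      rc = 0;                              /* committed successfully */

    NdbError err = trans->getNdbError();   /* copy before releasing */
    myNdb->closeTransaction(trans);        /* releases (and rolls back an
                                              uncommitted) transaction */
    if (rc == 0)
      return 0;

    if (err.status != NdbError::TemporaryError)
      return -1;                           /* permanent error: give up */

    /* Temporary error (for example, a deadlock timeout):
       retry the whole transaction from the beginning. */
  }
  return -1;                               /* retries exhausted */
}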
Hints and Performance. Placing the transaction coordinator in close proximity to the actual data used in the transaction can in many cases improve performance significantly. This is particularly true for systems using TCP/IP. For example, a Solaris system using a single 500 MHz processor has a cost model for TCP/IP communication which can be represented by the formula
[30 microseconds] + ([100 nanoseconds] * [number of bytes])
This means that the fixed per-message cost dominates for small messages; if we can ensure that we use “popular” links, many small messages can be buffered together into larger ones, drastically reducing the cost of communication. The same system using SCI has a different cost model:
[5 microseconds] + ([10 nanoseconds] * [number of bytes])
This means that the efficiency of an SCI system is much less dependent on selection of transaction coordinators. Typically, TCP/IP systems spend 30 to 60% of their working time on communication, whereas for SCI systems this figure is in the range of 5 to 10%. Thus, employing SCI for data transport means that less effort from the NDB API programmer is required and greater scalability can be achieved, even for applications using data from many different parts of the database.
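To illustrate the difference, consider a hypothetical 100-byte message. Under the TCP/IP cost model above, this costs

[30 microseconds] + ([100 nanoseconds] * 100) = 40 microseconds

whereas under the SCI cost model it costs

[5 microseconds] + ([10 nanoseconds] * 100) = 6 microseconds

In both cases the fixed per-message cost dominates, but it is six times larger for TCP/IP, which is why coordinator placement, and the buffering it makes possible, matters so much more there.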
A simple example would be an application that performs many simple updates, where each transaction needs to update one record. This record has a 32-bit primary key, which also serves as the partitioning key. In that case, keyData is used as the address of the 32-bit integer holding the primary key, and keyLen is 4.