In the following section, we provide answers to questions that are most frequently asked about MySQL Cluster and the NDB storage engine.
Questions
26.11.1: What does "NDB" mean?
26.11.2: What's the difference in using Cluster vs using replication?
26.11.3: Do I need to do any special networking to run Cluster? How do computers in a cluster communicate?
26.11.4: How many computers do I need to run a cluster, and why?
26.11.5: What do the different computers do in a MySQL Cluster?
26.11.6: With which operating systems can I use Cluster?
26.11.7: What are the hardware requirements for running MySQL Cluster?
26.11.8: How much RAM do I need? Is it possible to use disk memory at all?
26.11.9: What filesystems can I use with MySQL Cluster? What about network filesystems or network shares?
26.11.10: I'm trying to populate a Cluster database. The loading process terminates prematurely and I get an error message like this one:
ERROR 1114: The table 'my_cluster_table' is full
Why is this happening?
26.11.11: MySQL Cluster uses TCP/IP. Does this mean that I can run it over the Internet, with one or more nodes in remote locations?
26.11.12: Do I have to learn a new programming or query language to use MySQL Cluster?
26.11.13: How do I find out what an error or warning message means when using MySQL Cluster?
26.11.14: Is MySQL Cluster transaction-safe? What isolation levels are supported?
26.11.15: What storage engines are supported by MySQL Cluster?
26.11.16: Which versions of the MySQL software support Cluster? Do I have to compile from source?
26.11.17: In the event of a catastrophic failure — say, for instance, the whole city loses power and my UPS fails — would I lose all my data?
26.11.18: Is it possible to use FULLTEXT indexes with Cluster?
26.11.19: Can I run multiple nodes on a single computer?
26.11.20: Can I add nodes to a cluster without restarting it?
26.11.21: Are there any limitations that I should be aware of when using MySQL Cluster?
26.11.22: How do I import an existing MySQL database into a cluster?
26.11.23: How do cluster nodes communicate with one another?
26.11.24: What is an arbitrator?
26.11.25: What data types are supported by MySQL Cluster?
26.11.26: How do I start and stop MySQL Cluster?
26.11.27: What happens to cluster data when the cluster is shut down?
26.11.28: Is it helpful to have more than one management node for a cluster?
26.11.29: Can I mix different kinds of hardware and operating systems in one MySQL Cluster?
26.11.30: Can I run two data nodes on a single host? Two SQL nodes?
26.11.31: Can I use hostnames with MySQL Cluster?
26.11.32: How do I handle MySQL users in a Cluster having multiple MySQL servers?
Questions and Answers
26.11.1: What does "NDB" mean?
This stands for "Network Database". NDB (also known as NDB Cluster or NDBCLUSTER) is the storage engine that enables clustering in MySQL.
26.11.2: What's the difference in using Cluster vs using replication?
In a replication setup, a master MySQL server updates one or
more slaves. Transactions are committed sequentially, and a
slow transaction can cause the slave to lag behind the
master. This means that if the master fails, it is possible
that the slave might not have recorded the last few
transactions. If a transaction-safe engine such as InnoDB is being used, a transaction will
either be complete on the slave or not applied at all, but
replication does not guarantee that all data on the master
and the slave will be consistent at all times. In MySQL
Cluster, all data nodes are kept in synchrony, and a
transaction committed by any one data node is committed for
all data nodes. In the event of a data node failure, all
remaining data nodes remain in a consistent state.
In short, whereas standard MySQL replication is asynchronous, MySQL Cluster is synchronous.
We have implemented (asynchronous) replication for Cluster in MySQL 5.1. This includes the capability to replicate both between two clusters, and from a MySQL cluster to a non-Cluster MySQL server. See Section 14.10, "MySQL Cluster Replication".
26.11.3: Do I need to do any special networking to run Cluster? How do computers in a cluster communicate?
MySQL Cluster is intended to be used in a high-bandwidth environment, with computers connecting via TCP/IP. Its performance depends directly upon the connection speed between the cluster's computers. The minimum connectivity requirements for Cluster include a typical 100-megabit Ethernet network or the equivalent. We recommend you use gigabit Ethernet whenever available.
The faster SCI protocol is also supported, but requires special hardware. See Section 14.12, "Using High-Speed Interconnects with MySQL Cluster", for more information about SCI.
26.11.4: How many computers do I need to run a cluster, and why?
A minimum of three computers is required to run a viable cluster. However, the minimum recommended number of computers in a MySQL Cluster is four: one each to run the management and SQL nodes, and two computers to serve as data nodes. The purpose of the two data nodes is to provide redundancy; the management node must run on a separate machine to guarantee continued arbitration services in the event that one of the data nodes fails.
To provide increased throughput and high availability, you should use multiple SQL nodes (MySQL Servers connected to the cluster). It is also possible (although not strictly necessary) to run multiple management servers.
26.11.5: What do the different computers do in a MySQL Cluster?
A MySQL Cluster has both a physical and logical organization, with computers being the physical elements. The logical or functional elements of a cluster are referred to as nodes, and a computer housing a cluster node is sometimes referred to as a cluster host. There are three types of nodes, each corresponding to a specific role within the cluster. These are:
Management node (MGM node): Provides management services for the cluster as a whole, including startup, shutdown, backups, and configuration data for the other nodes. The management node server is implemented as the application ndb_mgmd; the management client used to control MySQL Cluster via the MGM node is ndb_mgm.
Data node: Stores and replicates data. Data node functionality is handled by an instance of the NDB data node process ndbd.
SQL node: This is simply an instance of MySQL Server (mysqld) that is built with support for the NDB Cluster storage engine and started with the --ndb-cluster option to enable the engine.
26.11.6: With which operating systems can I use Cluster?
MySQL Cluster is supported on most Unix-like operating systems, including Linux, Mac OS X, Solaris, BSD, HP-UX, AIX, and IRIX, among others, as well as Novell Netware. Cluster is not supported for Windows at this time. However, we are working to add Cluster support for other platforms, including Windows, and our goal is to offer MySQL Cluster on all platforms for which MySQL itself is supported.
For more detailed information concerning the level of support which is offered for MySQL Cluster on various operating system versions, OS distributions, and hardware platforms, please refer to http://www.mysql.com/support/supportedplatforms.html.
26.11.7: What are the hardware requirements for running MySQL Cluster?
Cluster should run on any platform for which NDB-enabled binaries are available. Naturally, faster CPUs and more memory will improve performance, and 64-bit CPUs will likely be more effective than 32-bit processors. There must be sufficient memory on machines used for data nodes to hold each node's share of the database (see question 26.11.8, "How much RAM do I need?", for more information). Nodes can communicate via a standard TCP/IP network and hardware. For SCI support, special networking hardware is required.
26.11.8: How much RAM do I need? Is it possible to use disk memory at all?
Prior to MySQL 5.1, Cluster was in-memory only. This meant that all table data (including indexes) was stored in RAM. If your data took up 1 GB of space and you wanted to replicate it once in the cluster, you needed 2 GB of memory to do so (1 GB per replica). This was in addition to the memory required by the operating system and any applications running on the cluster computers. This is still true of in-memory tables.
If a data node's memory usage exceeds what is available in RAM, then the system will attempt to use swap space up to the limit set for DataMemory. However, this will at best result in severely degraded performance, and may cause the node to be dropped due to slow response time (missed heartbeats). For this reason, we do not recommend relying on disk swapping in a production environment. In any case, once the DataMemory limit is reached, any operations requiring additional memory (such as inserts) will fail.
NDB Cluster in MySQL 5.1 includes support for Disk Data, which helps to alleviate these issues. See Section 14.11, "MySQL Cluster Disk Data Storage", for more information.
You can use the following formula for obtaining a rough estimate of how much RAM is needed for each data node in the cluster:
(SizeofDatabase × NumberOfReplicas × 1.1 ) / NumberOfDataNodes
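As a worked illustration of the formula above, the following minimal Python sketch computes the rough per-node estimate. The function name and the example figures (a 1 GB database, two replicas, two data nodes) are assumptions chosen for illustration only; the result is no more precise than the formula itself.

```python
def estimate_ram_per_data_node(size_of_database, number_of_replicas, number_of_data_nodes):
    """Rough estimate: (SizeofDatabase x NumberOfReplicas x 1.1) / NumberOfDataNodes.

    The result is in the same unit as size_of_database.
    """
    if number_of_data_nodes <= 0:
        raise ValueError("number_of_data_nodes must be positive")
    return size_of_database * number_of_replicas * 1.1 / number_of_data_nodes


# Illustrative figures only: 1 GB of data, 2 replicas, 2 data nodes.
estimate_gb = estimate_ram_per_data_node(1.0, 2, 2)
print(f"Roughly {estimate_gb:.2f} GB of RAM per data node")
```

This figure is in addition to the memory needed by the operating system and any other applications running on the host.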
To calculate the memory requirements more exactly requires determining, for each table in the cluster database, the storage space required per row (see Section 10.5, "Data Type Storage Requirements", for details), and multiplying this by the number of rows. You must also remember to account for any column indexes as follows:
Each primary key or hash index created for an NDBCluster table requires 21 to 25 bytes per record. These indexes use IndexMemory.
Each ordered index requires 10 bytes of storage per record, using DataMemory.
Creating a primary key or unique index also creates an ordered index, unless this index is created with USING HASH. In other words:
A primary key or unique index on a Cluster table normally takes up 31 to 35 bytes per record.
However, if the primary key or unique index is created with USING HASH, then it requires only 21 to 25 bytes per record.
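The per-record figures above can be combined into a quick estimate of key overhead. The following Python sketch is only an illustration: the constant and function names are hypothetical, and it covers nothing beyond the byte counts quoted above (it does not account for row data or any other overhead).

```python
# Per-record byte counts quoted in this answer.
HASH_INDEX_BYTES = (21, 25)   # primary key or hash index, taken from IndexMemory
ORDERED_INDEX_BYTES = 10      # ordered index, taken from DataMemory


def key_overhead_per_record(using_hash):
    """Return (low, high) bytes per record for one primary key or unique index."""
    low, high = HASH_INDEX_BYTES
    if using_hash:
        # USING HASH: no ordered index is created alongside the hash index.
        return (low, high)
    return (low + ORDERED_INDEX_BYTES, high + ORDERED_INDEX_BYTES)


def key_overhead_total(rows, using_hash=False):
    """Return (low, high) total bytes for one key over the given number of rows."""
    low, high = key_overhead_per_record(using_hash)
    return (rows * low, rows * high)


low, high = key_overhead_total(1_000_000)
print(f"One key without USING HASH on 1,000,000 rows: {low} to {high} bytes")
```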
Note that creating MySQL Cluster tables with USING HASH for all primary keys and unique indexes will generally cause table updates to run more quickly, in some cases by as much as 20 to 30 percent faster than updates on tables where USING HASH was not used in creating primary and unique keys. This is because less memory is required (no ordered indexes are created) and less CPU is used (fewer indexes must be read and possibly updated). However, it also means that queries that could otherwise use range scans must be satisfied by other means, which can result in slower selects.
When calculating Cluster memory requirements, you may find the ndb_size.pl utility useful; it is available in recent MySQL 5.1 releases. This Perl script connects to a current (non-Cluster) MySQL database and creates a report on how much space that database would require if it used the NDBCluster storage engine. For more information, see Section 14.9.13, "ndb_size.pl: NDBCluster Size Requirement Estimator".
It is especially important to keep in mind that every MySQL Cluster table must have a primary key. The NDB storage engine creates a primary key automatically if none is defined, and this primary key is created without USING HASH.
There is no easy way to determine exactly how much memory is being used for storage of Cluster indexes at any given time; however, warnings are written to the Cluster log when 80% of available DataMemory or IndexMemory is in use, and again when use reaches 85%, 90%, and so on.
26.11.9: What filesystems can I use with MySQL Cluster? What about network filesystems or network shares?
Generally, any filesystem that is native to the host operating system should work well with MySQL Cluster. If you find that a given filesystem works particularly well (or particularly badly) with MySQL Cluster, we invite you to discuss your findings in the MySQL Cluster Forums.
We do not test MySQL Cluster with FAT or VFAT filesystems on Linux. Because of this, and because these are not very useful for any purpose other than sharing disk partitions between Linux and Windows operating systems on multi-boot computers, we do not recommend their use with MySQL Cluster.
MySQL Cluster is implemented as a shared-nothing solution; the idea behind this is that the failure of a single piece of hardware should not cause the failure of multiple cluster nodes, or possibly even the failure of the cluster as a whole. For this reason, the use of network shares or network filesystems is not supported for MySQL Cluster. This also applies to shared storage devices such as SANs.
26.11.10: I'm trying to populate a Cluster database. The loading process terminates prematurely and I get an error message like this one:
ERROR 1114: The table 'my_cluster_table' is full
Why is this happening?
The cause is very likely to be that your setup does not provide sufficient RAM for all table data and all indexes, including the primary key required by the NDB storage engine and automatically created in the event that the table definition does not include the definition of a primary key.
It is also worth noting that all data nodes should have the same amount of RAM, since no data node in a cluster can use more memory than the least amount available to any individual data node. In other words, if there are four computers hosting Cluster data nodes, and three of these have 3GB of RAM available to store Cluster data while the remaining data node has only 1GB RAM, then each data node can devote only 1GB to clustering.
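The following short Python sketch restates the example above: the memory each data node can devote to clustering is bounded by the smallest amount available on any one of them. The host names and the function name are hypothetical and used only for illustration.

```python
def usable_memory_per_node(available_gb_by_node):
    """Each data node can use no more than the least amount available to any node."""
    if not available_gb_by_node:
        raise ValueError("at least one data node is required")
    return min(available_gb_by_node.values())


# Figures from the example above: three hosts with 3 GB, one host with 1 GB.
available = {"host_a": 3, "host_b": 3, "host_c": 3, "host_d": 1}
print(f"Each data node can devote {usable_memory_per_node(available)} GB to clustering")
```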
26.11.11: MySQL Cluster uses TCP/IP. Does this mean that I can run it over the Internet, with one or more nodes in remote locations?
It is very unlikely that a cluster would perform reliably under such conditions, as MySQL Cluster was designed and implemented with the assumption that it would be run under conditions guaranteeing dedicated high-speed connectivity such as that found in a LAN setting using 100 Mbps or gigabit Ethernet — preferably the latter. We neither test nor warrant its performance using anything slower than this.
Also, it is extremely important to keep in mind that communications between the nodes in a MySQL Cluster are not secure; they are neither encrypted nor safeguarded by any other protective mechanism. The most secure configuration for a cluster is in a private network behind a firewall, with no direct access to any Cluster data or management nodes from outside. (For SQL nodes, you should take the same precautions as you would with any other instance of the MySQL server.)
26.11.12: Do I have to learn a new programming or query language to use MySQL Cluster?
No. Although some specialized commands are used to manage and configure the cluster itself, only standard (My)SQL queries and commands are required for the following operations:
Creating, altering, and dropping tables (including Disk Data tables and related objects)
Inserting, updating, and deleting table data
Creating, changing, and dropping primary and unique indexes
Some specialized configuration parameters and files are required to set up a MySQL Cluster; see Section 14.4.4, "Configuration File", for information about these.
A few simple commands are used in the MySQL Cluster management client for tasks such as starting and stopping cluster nodes. See Section 14.7.2, "Commands in the Management Client".
26.11.13: How do I find out what an error or warning message means when using MySQL Cluster?
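There are two ways in which this can be done:
From within the mysql client, use SHOW ERRORS or SHOW WARNINGS immediately upon being notified of the error or warning condition.
From a system shell prompt, use perror --ndb error_code.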
26.11.14: Is MySQL Cluster transaction-safe? What isolation levels are supported?
Yes: For tables created with the NDB storage engine, transactions are supported. In MySQL 5.1, Cluster supports only the READ COMMITTED transaction isolation level.
26.11.15: What storage engines are supported by MySQL Cluster?
Clustering in MySQL is supported only by the NDB storage engine. That is, in order for a table to be shared between nodes in a cluster, it must be created using ENGINE=NDB (or ENGINE=NDBCLUSTER, which is equivalent).
It is possible to create tables using other storage engines (such as MyISAM or InnoDB) on a MySQL server being used for clustering, but these non-NDB tables will not participate in the cluster; they are local to the individual MySQL server instance on which they are created.
26.11.16: Which versions of the MySQL software support Cluster? Do I have to compile from source?
Cluster is supported in all server binaries in the 5.1 release series for operating systems on which MySQL Cluster is available. See Section 4.2, "mysqld - The MySQL Server". You can determine whether your server has NDB support using either the SHOW VARIABLES LIKE 'have_%' or SHOW ENGINES statement.
You can also obtain NDB support by compiling MySQL from source, but it is not necessary to do so simply to use MySQL Cluster. To download the latest binary, RPM, or source distribution in the MySQL 5.1 series, visit http://dev.mysql.com/downloads/mysql/5.1.html.
26.11.17: In the event of a catastrophic failure — say, for instance, the whole city loses power and my UPS fails — would I lose all my data?
All committed transactions are logged. Therefore, although it is possible that some data could be lost in the event of a catastrophe, this should be quite limited. Data loss can be further reduced by minimizing the number of operations per transaction. (It is not a good idea to perform large numbers of operations per transaction in any case.)
26.11.18: Is it possible to use FULLTEXT indexes with Cluster?
FULLTEXT indexing is not supported by the NDB storage engine in MySQL 5.1, or by any storage engine other than MyISAM. We are working to add this capability in a future release.
26.11.19: Can I run multiple nodes on a single computer?
It is possible but not advisable. One of the chief reasons to run a cluster is to provide redundancy. To enjoy the full benefits of this redundancy, each node should reside on a separate machine. If you place multiple nodes on a single machine and that machine fails, you lose all of those nodes. Given that MySQL Cluster can be run on commodity hardware loaded with a low-cost (or even no-cost) operating system, the expense of an extra machine or two is well worth it to safeguard mission-critical data. It is also worth noting that the requirements for a cluster host running a management node are minimal. This task can be accomplished with a 200 MHz Pentium CPU and sufficient RAM for the operating system plus a small amount of overhead for the ndb_mgmd and ndb_mgm processes.
It is acceptable to run multiple cluster data nodes on a single host for learning about MySQL Cluster, or for testing purposes; however, this is not supported for production use.
26.11.20: Can I add nodes to a cluster without restarting it?
Not at present. A simple restart is all that is required for adding new MGM or SQL nodes to a Cluster. When adding data nodes the process is more complex, and requires the following steps:
Make a complete backup of all Cluster data.
Completely shut down the cluster and all cluster node processes.
Restart the cluster, using the --initial startup option.
Restore all cluster data from the backup.
In a future MySQL Cluster release series, we hope to implement a "hot" reconfiguration capability for MySQL Cluster to minimize (if not eliminate) the requirement for restarting the cluster when adding new nodes. However, this is not planned for MySQL 5.1.
26.11.21: Are there any limitations that I should be aware of when using MySQL Cluster?
Limitations on NDB tables in MySQL 5.1 include:
Temporary tables are not supported; a CREATE TEMPORARY TABLE statement using ENGINE=NDB or ENGINE=NDBCLUSTER fails with an error.
The only types of user-defined partitioning supported for NDB tables are KEY and LINEAR KEY. (Beginning with MySQL 5.1.12, attempting to create an NDB table using any other partitioning type fails with an error.)
FULLTEXT indexes and index prefixes are not supported. Only complete columns may be indexed.
Spatial data types are not supported. See Chapter 16, Spatial Extensions.
Only complete rollbacks for transactions are supported. Partial rollbacks and rollbacks to save points are not supported.
The maximum number of attributes allowed per table is 128, and attribute names cannot be any longer than 31 characters. For each table, the maximum combined length of the table and database names is 122 characters.
The maximum size for a table row is 8 kilobytes, not counting BLOB values. There is no set limit for the number of rows per table. Table size limits depend on a number of factors, in particular on the amount of RAM available to each data node.
The NDB engine does not support foreign key constraints. As with MyISAM tables, these are ignored.
For a complete listing of limitations in MySQL Cluster, see Section 14.13, "Known Limitations of MySQL Cluster".
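Some of the numeric limits listed above are easy to check before creating a table. The following Python sketch is a hypothetical pre-flight check, not part of MySQL Cluster: the function name and its inputs are assumptions, the row size must be supplied by the caller, and "8 kilobytes" is assumed here to mean 8192 bytes.

```python
MAX_ATTRIBUTES = 128
MAX_ATTRIBUTE_NAME_LENGTH = 31
MAX_DB_PLUS_TABLE_NAME_LENGTH = 122
MAX_ROW_BYTES = 8 * 1024  # assumption: 8 kilobytes taken as 8192 bytes, BLOBs excluded


def check_ndb_limits(database, table, column_names, row_bytes):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    if len(column_names) > MAX_ATTRIBUTES:
        problems.append(f"{len(column_names)} attributes exceeds {MAX_ATTRIBUTES}")
    for name in column_names:
        if len(name) > MAX_ATTRIBUTE_NAME_LENGTH:
            problems.append(f"attribute name '{name}' is longer than "
                            f"{MAX_ATTRIBUTE_NAME_LENGTH} characters")
    if len(database) + len(table) > MAX_DB_PLUS_TABLE_NAME_LENGTH:
        problems.append("combined database and table name length exceeds "
                        f"{MAX_DB_PLUS_TABLE_NAME_LENGTH} characters")
    if row_bytes > MAX_ROW_BYTES:
        problems.append(f"row size {row_bytes} bytes exceeds {MAX_ROW_BYTES}")
    return problems


# Hypothetical table definition used only to exercise the checks.
for problem in check_ndb_limits("shop", "orders", ["id", "customer_id", "x" * 40], 9000):
    print(problem)
```

The checks cover only the limits quoted in this answer; the complete list of restrictions is in the section referenced above.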
26.11.22: How do I import an existing MySQL database into a cluster?
You can import databases into MySQL Cluster much as you would with any other version of MySQL. Other than the limitations mentioned elsewhere in this FAQ and in Section 14.13, "Known Limitations of MySQL Cluster", the only other special requirement is that any tables to be included in the cluster must use the NDB storage engine. This means that the tables must be created with ENGINE=NDB or ENGINE=NDBCLUSTER.
It is also possible to convert existing tables using other storage engines to NDB Cluster using one or more ALTER TABLE statements, but this requires an additional workaround. See Section 14.13, "Known Limitations of MySQL Cluster", for details.
26.11.23: How do cluster nodes communicate with one another?
Cluster nodes can communicate via any of three different protocols: TCP/IP, SHM (shared memory), and SCI (Scalable Coherent Interface). Where available, SHM is used by default between nodes residing on the same cluster host; however, this is considered experimental in MySQL 5.1. SCI is a high-speed (1 gigabit per second and higher), high-availability protocol used in building scalable multi-processor systems; it requires special hardware and drivers. See Section 14.12, "Using High-Speed Interconnects with MySQL Cluster", for more about using SCI as a transport mechanism in MySQL Cluster.
26.11.24: What is an arbitrator?
If one or more nodes in a cluster fail, it is possible that not all cluster nodes will be able to "see" one another. In fact, it is possible that two sets of nodes might become isolated from one another in a network partitioning, also known as a "split brain" scenario. This type of situation is undesirable because each set of nodes tries to behave as though it is the entire cluster.
When cluster nodes go down, there are two possibilities. If more than 50% of the remaining nodes can communicate with each other, we have what is sometimes called a "majority rules" situation, and this set of nodes is considered to be the cluster. The arbitrator comes into play when there is an even number of nodes: in such cases, the set of nodes to which the arbitrator belongs is considered to be the cluster, and nodes not belonging to this set are shut down.
The preceding information is somewhat simplified. A more complete explanation taking into account node groups follows:
When all nodes in at least one node group are alive, network partitioning is not an issue, because no other portion of the cluster can form a functional cluster. The real problem arises when no single node group has all its nodes alive, in which case network partitioning (the "split-brain" scenario) becomes possible. Then an arbitrator is required. All cluster nodes recognize the same node as the arbitrator, which is normally the management server; however, it is possible to configure any of the MySQL Servers in the cluster to act as the arbitrator instead. The arbitrator accepts the first set of cluster nodes to contact it, and tells the remaining set to shut down. Arbitrator selection is controlled by the ArbitrationRank configuration parameter for MySQL Server and management server nodes. (See Section 14.4.4.4, "Defining the Management Server", for details.) It should also be noted that the role of arbitrator does not in and of itself impose any heavy demands upon the host so designated, and thus the arbitrator host does not need to be particularly fast or to have extra memory especially for this purpose.
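The simplified "majority rules" description given earlier in this answer can be sketched in a few lines of Python. This is a toy model for illustration only, not how NDB implements arbitration: it ignores node groups entirely, and the function name, the use of sets of node IDs, and the arbitrator_choice parameter (the index of the set that the arbitrator accepts) are all assumptions.

```python
def choose_cluster(partitions, arbitrator_choice=None):
    """Pick which isolated set of nodes continues as the cluster.

    partitions: list of sets of node IDs; nodes in one set can see each other.
    arbitrator_choice: index of the set accepted by the arbitrator, if any.
    Returns the index of the surviving set, or None if no set can be chosen.
    """
    alive = sum(len(nodes) for nodes in partitions)
    for index, nodes in enumerate(partitions):
        # "Majority rules": more than 50% of the remaining nodes.
        if 2 * len(nodes) > alive:
            return index
    # No majority (for example, an even split): defer to the arbitrator.
    return arbitrator_choice


print(choose_cluster([{1, 2, 3}, {4}]))                      # majority decides
print(choose_cluster([{1, 2}, {3, 4}], arbitrator_choice=1))  # arbitrator decides
```

Nodes outside the chosen set would be shut down, as described above.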
26.11.25: What data types are supported by MySQL Cluster?
MySQL Cluster supports all of the usual MySQL data types, with the exception of those associated with MySQL's spatial extensions. (See Chapter 16, Spatial Extensions.) In addition, there are some differences with regard to indexes when used with NDB tables.
Note: MySQL Cluster Disk Data tables (that is, tables created with TABLESPACE ... STORAGE DISK ENGINE=NDBCLUSTER) have only fixed-width rows. This means that (for example) each Disk Data table record containing a VARCHAR(255) column requires space for 255 characters (as required for the character set and collation being used for the table), regardless of the actual number of characters stored therein.
See Section 14.13, "Known Limitations of MySQL Cluster", for more information about these issues.
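To illustrate the fixed-width behavior described in the note above, the following Python sketch estimates the space reserved for the character data of a VARCHAR column in a Disk Data table. The function name is hypothetical, the bytes-per-character figure must be supplied by the caller for the character set in use, and any per-column or per-row overhead is deliberately ignored.

```python
def disk_data_varchar_bytes(declared_chars, max_bytes_per_char):
    """Space reserved per record for the characters of a fixed-width VARCHAR column.

    The actual length of the stored value does not matter; the declared
    length is always reserved.
    """
    return declared_chars * max_bytes_per_char


# Assumed bytes per character: 1 for a single-byte character set,
# 3 for a character set that needs up to three bytes per character.
print(disk_data_varchar_bytes(255, 1))
print(disk_data_varchar_bytes(255, 3))
```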
26.11.26: How do I start and stop MySQL Cluster?
It is necessary to start each node in the cluster separately, in the following order:
Start the management node with the ndb_mgmd command.
Start each data node with the ndbd command.
Start each MySQL server (SQL node) using mysqld_safe --user=mysql &.
Each of these commands must be run from a system shell on the machine housing the affected node. (You do not have to be physically present at the machine; a remote login shell can be used for this purpose.) You can verify that the cluster is running by starting the MGM management client ndb_mgm on the machine housing the MGM node and issuing the SHOW or ALL STATUS command.
To shut down a running cluster, issue the command SHUTDOWN in the MGM client. Alternatively, you may enter the following command in a system shell on the machine hosting the MGM node:
shell> ndb_mgm -e "SHUTDOWN"
(Note that the quotation marks are optional here; the SHUTDOWN command itself is not case-sensitive.)
Either of these commands causes the ndb_mgm, ndb_mgmd, and any ndbd processes to terminate gracefully. MySQL servers running as Cluster SQL nodes can be stopped using mysqladmin shutdown.
For more information, see Section 14.7.2, "Commands in the Management Client", and Section 14.3.6, "Safe Shutdown and Restart".
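The start order and the shutdown command described in this answer can be written down as a simple plan. The Python sketch below only builds the ordered list of commands and does not execute anything; the host names and the function name are hypothetical, and only the commands named in this answer are included (any configuration options your installation needs are omitted).

```python
def startup_plan(management_host, data_hosts, sql_hosts):
    """Return (host, command) pairs in the order the nodes should be started."""
    plan = [(management_host, ["ndb_mgmd"])]               # 1. management node
    plan += [(host, ["ndbd"]) for host in data_hosts]      # 2. each data node
    plan += [(host, ["mysqld_safe", "--user=mysql"])       # 3. each SQL node
             for host in sql_hosts]
    return plan


# Run in a system shell on the machine hosting the MGM node to stop the cluster.
SHUTDOWN_COMMAND = ["ndb_mgm", "-e", "SHUTDOWN"]

# Hypothetical host names for illustration.
for host, command in startup_plan("mgm1", ["data1", "data2"], ["sql1"]):
    print(f"{host}: {' '.join(command)}")
print("to stop:", " ".join(SHUTDOWN_COMMAND))
```

Each command still has to be run from a shell on the host in question, and SQL nodes are stopped separately with mysqladmin shutdown.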
26.11.27: What happens to cluster data when the cluster is shut down?
The data that was held in memory by the cluster's data nodes is written to disk, and is reloaded into memory the next time that the cluster is started.
26.11.28: Is it helpful to have more than one management node for a cluster?
It can be helpful as a fail-safe. Only one MGM node controls the cluster at any given time, but it is possible to configure one MGM as primary, and one or more additional management nodes to take over in the event that the primary MGM node fails.
See Section 14.4.4, "Configuration File", for information on how to configure MySQL Cluster management nodes.
26.11.29: Can I mix different kinds of hardware and operating systems in one MySQL Cluster?
Yes, so long as all machines and operating systems have the same "endianness" (all big-endian or all little-endian). It is also possible to use different MySQL Cluster releases on different nodes. However, we recommend this be done only as part of a rolling upgrade procedure (see Section 14.5.1, "Performing a Rolling Restart of the Cluster").
26.11.30: Can I run two data nodes on a single host? Two SQL nodes?
Yes, it is possible to do this. In the case of multiple data nodes, it is advisable (but not required) for each node to use a different data directory. If you want to run multiple SQL nodes on one machine, each instance of mysqld must use a different TCP/IP port. However, running more than one cluster node of a given type per machine is not supported for production use.
26.11.31: Can I use hostnames with MySQL Cluster?
Yes, it is possible to use DNS and DHCP for cluster hosts. However, if your application requires "five nines" availability, we recommend using fixed IP addresses. Making communication between Cluster hosts dependent on services such as DNS and DHCP introduces additional points of failure, and the fewer of these, the better.
26.11.32: How do I handle MySQL users in a Cluster having multiple MySQL servers?
MySQL user accounts and privileges are not automatically propagated between different MySQL servers accessing the same MySQL Cluster. Therefore, you must make sure that these are copied between the SQL nodes yourself.