This is one of the phases in which most kernel blocks
participate (see the table in
Section 6.5.3, “STTOR
Phase 0”).
Otherwise, most blocks are involved primarily in the
initialization of data — for example, this is all that
DBTC
does.
Many blocks initialize references to other blocks in Phase 1.
DBLQH initializes block references to DBTUP, and DBACC
initializes block references to DBTUP and DBLQH. DBTUP
initializes references to the blocks DBLQH, TSMAN, and LGMAN.
NDBCNTR initializes some variables and sets up block references
to DBTUP, DBLQH, DBACC, DBTC, DBDIH, and DBDICT; these are needed
in the special start phase handling of these blocks using
NDB_STTOR signals, where the bulk of the node startup process
actually takes place.
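The idea of a block caching references to its peers can be illustrated with a minimal sketch. It assumes that a block reference is a 32-bit word holding a block number in the upper 16 bits and a node ID in the lower 16 bits; that layout, the helper names (makeBlockRef, blockOf, nodeOf, initAccPeerRefs), and the block numbers are hypothetical and are not taken from the NDB sources.

```cpp
#include <cstdint>

// ASSUMPTION: a block reference is modeled as a 32-bit word with the block
// number in the upper 16 bits and the node ID in the lower 16 bits.
using BlockRef = std::uint32_t;

// Hypothetical helpers; these names do not come from the NDB source tree.
inline BlockRef makeBlockRef(std::uint16_t blockNo, std::uint16_t nodeId) {
  return (static_cast<BlockRef>(blockNo) << 16) | nodeId;
}

inline std::uint16_t blockOf(BlockRef ref) {
  return static_cast<std::uint16_t>(ref >> 16);
}

inline std::uint16_t nodeOf(BlockRef ref) {
  return static_cast<std::uint16_t>(ref & 0xFFFFu);
}

// Placeholder block numbers, chosen only for this example.
enum : std::uint16_t { kDbtup = 1, kDblqh = 2 };

// References that a block such as DBACC might cache during Phase 1:
// one for DBTUP and one for DBLQH, both on the local node.
struct AccPeerRefs {
  BlockRef tup;
  BlockRef lqh;
};

inline AccPeerRefs initAccPeerRefs(std::uint16_t ownNodeId) {
  return AccPeerRefs{makeBlockRef(kDbtup, ownNodeId),
                     makeBlockRef(kDblqh, ownNodeId)};
}
```

Computing the references once during startup means that later signal sends can use the cached value directly rather than rebuilding it each time.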
If the cluster is configured to lock pages (that is, if the
LockPagesInMainMemory
configuration parameter
has been set), CMVMI
handles this locking.
The QMGR block calls the initData() method (defined in
storage/ndb/src/kernel/blocks/qmgr/QmgrMain.cpp), whose output
is handled by all other blocks in the READ_CONFIG_REQ phase (see
Section 6.5.1, “Initialization Phase (Phase -1)”).
Following these initializations, QMGR sends the DIH_RESTARTREQ
signal to DBDIH, which determines whether a proper system file
exists; if it does, an initial start is not being performed.
Once this signal has been received, the node is integrated with
the other data nodes in the cluster; data nodes enter the
cluster one at a time. The first one to enter becomes the
master; whenever the master dies, the new master is always the
node that has been running the longest among those remaining.
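The succession rule just described can be sketched as follows. This is an illustration only, not QMGR code: the NodeState record, its joinSeq field (a counter that is lower for nodes that entered the cluster earlier), and pickNewMaster() are hypothetical names introduced for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical bookkeeping record; not an NDB kernel type.
struct NodeState {
  std::uint32_t nodeId;
  std::uint64_t joinSeq;  // lower value = entered the cluster earlier
  bool alive;
};

// Returns the ID of the surviving node that entered the cluster first,
// or -1 if no node is alive.
inline int pickNewMaster(const std::vector<NodeState>& nodes) {
  const NodeState* best = nullptr;
  for (const NodeState& n : nodes) {
    if (!n.alive) continue;  // failed nodes cannot take over
    if (best == nullptr || n.joinSeq < best->joinSeq) best = &n;
  }
  return best ? static_cast<int>(best->nodeId) : -1;
}
```

Because nodes enter one at a time, the entry order is a total order, so every surviving node that applies this rule to the same membership list arrives at the same answer.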
QMGR
sets up timers to ensure that inclusion
in the cluster does not take longer than what the cluster's
configuration is set to allow (see
Controlling
Timeouts, Intervals, and Disk Paging for the relevant
configuration parameters), after which communication to all
other data nodes is established. At this point, a
CM_REGREQ
signal is sent to all data nodes.
Only the president of the cluster responds to this signal; the
president allows one node at a time to enter the cluster. If no
node responds within 3 seconds then the president becomes the
master. If several nodes start up simultaneously, then the node
with the lowest node ID becomes president. The president sends
CM_REGCONF
in response to this signal, but
also sends a CM_ADD
signal to all nodes that
are currently alive.
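Two of the rules above, that the lowest node ID wins when several nodes start together and that only the president answers CM_REGREQ, can be expressed in a short sketch. The function names choosePresident() and answersRegReq() are hypothetical, and the value 0 is used here merely as a "no candidate" marker for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Among nodes that start simultaneously, the one with the lowest node ID
// becomes president. Returns 0 if the list is empty (sketch-only sentinel).
inline std::uint32_t choosePresident(
    const std::vector<std::uint32_t>& startingIds) {
  if (startingIds.empty()) return 0;
  return *std::min_element(startingIds.begin(), startingIds.end());
}

// Only the president replies to a registration request; every other node
// that receives the request stays silent.
inline bool answersRegReq(std::uint32_t selfId, std::uint32_t presidentId) {
  return selfId == presidentId;
}
```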
Next, the starting node sends a
CM_NODEINFOREQ
signal to all current
“live” data nodes. When these nodes receive that
signal they send a NODE_VERSION_REP
signal to
all API nodes that have connected to them. Each data node also
sends a CM_ACKADD
to the president to inform
the president that it has heard the
CM_NODEINFOREQ
signal from the new node.
Finally, each of the current data nodes sends the
CM_NODEINFOCONF signal to the starting node in response. When
the starting node has received all of these signals, it also
sends the CM_ACKADD signal to the president.
When the president has received all of the expected
CM_ACKADD
signals, it knows that all data
nodes (including the newest one to start) have replied to the
CM_NODEINFOREQ
signal. When the president
receives the final CM_ACKADD
, it sends a
CM_ADD
signal to all current data nodes (that
is, except for the node that just started). Upon receiving this
signal, the existing data nodes enable communication with the
new node; they begin sending heartbeats to it and including it
in the list of neighbors used by the heartbeat protocol.
The start
struct is reset, so that it can
handle new starting nodes, and then each data node sends a
CM_ACKADD
to the president, which then sends
a CM_ADD
to the starting node after all such
CM_ACKADD
signals have been received. The new
node then opens all of its communication channels to the data
nodes that were already connected to the cluster; it also sets
up its own heartbeat structures and starts sending heartbeats.
It also sends a CM_ACKADD
message in response
to the president.
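In each round of this exchange the president waits until every expected CM_ACKADD has arrived before it sends the next CM_ADD. A minimal sketch of that bookkeeping follows; the AckCollector class is hypothetical and only models the counting, not the signal transport or the actual QMGR data structures.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical helper that tracks which nodes still owe an acknowledgement.
class AckCollector {
 public:
  explicit AckCollector(const std::set<std::uint32_t>& expected)
      : pending_(expected) {}

  // Records an acknowledgement from nodeId. Returns true only when this
  // acknowledgement was the last one outstanding; duplicates and
  // acknowledgements from unexpected nodes are ignored.
  bool onAck(std::uint32_t nodeId) {
    if (pending_.erase(nodeId) == 0) return false;
    return pending_.empty();
  }

  bool complete() const { return pending_.empty(); }
  std::size_t outstanding() const { return pending_.size(); }

 private:
  std::set<std::uint32_t> pending_;
};
```

In this model the president would create one collector per round, seeded with the node IDs it expects to hear from, and send the next CM_ADD when onAck() returns true.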
The signalling between the starting data node, the already “live” data nodes, the president, and any API nodes attached to the cluster during this phase is shown in the following diagram:
As a final step, QMGR
also starts the timer
handling for which it is responsible. This means that it
generates a signal to blocks that have requested it. This signal
is sent 100 times per second even if any one instance of the
signal is delayed.
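One way to keep an average rate of 100 signals per second when individual signals are delayed is to schedule each tick from the previous due time rather than from the time at which the tick was actually handled. The following sketch shows that approach; it is an assumption about how such a timer could work, not a description of the QMGR implementation, and the CatchUpTicker class is hypothetical.

```cpp
#include <cstdint>

// Hypothetical fixed-rate ticker that catches up after a delay.
class CatchUpTicker {
 public:
  // periodMs defaults to 10 ms, that is, 100 ticks per second.
  explicit CatchUpTicker(std::uint64_t startMs, std::uint64_t periodMs = 10)
      : periodMs_(periodMs), nextDueMs_(startMs + periodMs) {}

  // Returns the number of ticks that have become due by nowMs and advances
  // the schedule. After a delay, more than one tick may be returned.
  std::uint32_t poll(std::uint64_t nowMs) {
    std::uint32_t due = 0;
    while (nextDueMs_ <= nowMs) {
      ++due;
      nextDueMs_ += periodMs_;  // based on the due time, not on nowMs
    }
    return due;
  }

 private:
  std::uint64_t periodMs_;
  std::uint64_t nextDueMs_;
};
```

Because the next due time never depends on when poll() was called, a late poll reports the ticks that were missed and the total over any full second remains 100.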
The BACKUP
kernel block also begins sending a
signal periodically. This is to ensure that excessive amounts of
data are not written to disk, and that data writes are kept
within the limits of what has been specified in the cluster
configuration file during and after restarts. The
DBUTIL
block initializes the transaction
identity, and DBTUX
creates a reference to
the DBTUP
block, while
PGMAN
initializes pointers to the
LGMAN
and DBTUP
blocks.
The RESTORE
kernel block creates references
to the DBLQH
and DBTUP
blocks to enable quick access to those blocks when needed.