• June 25, 2022

Oh no, I broke my node… DMINZNODE to the rescue!

We all know Murphy’s Law, which states that anything that can go wrong will go wrong. Numerous websites discuss the origins of this law, which could make for an entertaining blog all on its own. The iCluster corollary to Murphy is that your backup node will always fail at the same time as your primary!

If you do experience a hard failure or unexpected interruption on one of your iCluster nodes and replication just won’t come up again, phone support first! They can diagnose the cause and, hopefully, prevent it from happening again. In the past, if you broke a node, support would ask you to go through a number of manual recovery steps, including clearing user spaces, clearing the stage and store, and resetting the replication starting positions. Now, as of 7.1 TR1, the DMINZNODE command comes to the rescue.

DMINZNODE should be run on all nodes in the cluster. The command prompts for two choices: whether the stage and store libraries should be cleared for any group using the current node as a backup, and whether the user spaces and user queues that iCluster creates for internal use on this node should be cleared.


Some important notes about DMINZNODE:

  1. It MUST be run on all nodes in the cluster – not just the node that experienced the failure
  2. The work done by this command is submitted under a different job description, QDFTJOBD, so make sure the job queue used by this job description is available. Use DSPJOBD QDFTJOBD to determine which job queue is used in your shop.
  3. Because the command performs a DMSETPOS for all groups and journals that use this node as the primary as part of its recovery process, replication may take considerably longer than normal to restart.
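
As a quick pre-check for note 2, the job queue lookup might look something like the sketch below. The library QGPL and the job queue QBATCH are assumptions based on the defaults IBM i normally ships with; substitute whatever DSPJOBD actually reports on your system.

```
/* Show the job description; note the "Job queue" and its library.   */
/* QGPL is assumed here as the library holding QDFTJOBD.             */
DSPJOBD JOBD(QGPL/QDFTJOBD)

/* Check the job queue reported above. QGPL/QBATCH is only an        */
/* assumed example. The queue should be released (not held) and      */
/* attached to an active subsystem.                                  */
WRKJOBQ JOBQ(QGPL/QBATCH)
```

If the queue turns out to be held, or is not attached to an active subsystem, the work submitted by DMINZNODE will likely sit waiting until that is corrected, so it is worth checking on every node before running the command.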

