

MQ cluster


Concepts

Purpose: workload balancing, simplified administration, and scalability.

Problem: stuck message(s) on the transmission queue(s)!

Requirement: use "define ql(name) DEFBIND(NOTFIXED)" or MQOO_BIND_NOT_FIXED, instead of "DEFBIND(OPEN)"
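The effect of that requirement can be illustrated with a toy model (plain Python, not MQ itself): with bind-on-open (DEFBIND(OPEN)) the destination instance is fixed at MQOPEN time, so messages for a failed destination sit on the transmission queue; with DEFBIND(NOTFIXED) each put can pick any available instance. The queue-manager names are made up.

```python
# Toy model (not MQ) of DEFBIND(OPEN) vs DEFBIND(NOTFIXED) routing.
import itertools

class ClusterQueue:
    def __init__(self, instances, bind_not_fixed):
        self.instances = instances          # qmgrs hosting the queue
        self.bind_not_fixed = bind_not_fixed
        self.rr = itertools.cycle(instances)
        # with bind-on-open the destination is chosen once, at "MQOPEN"
        self.fixed = None if bind_not_fixed else next(self.rr)

    def put(self, available):
        """Return the qmgr the message is routed to, or None if it is stuck."""
        if self.bind_not_fixed:
            # choose per message, skipping unavailable destinations
            for _ in self.instances:
                qm = next(self.rr)
                if qm in available:
                    return qm
            return None
        # bind-on-open: destination stays fixed even if its channel is down
        return self.fixed if self.fixed in available else None

q_open = ClusterQueue(["QM1", "QM2"], bind_not_fixed=False)  # DEFBIND(OPEN)
q_nf   = ClusterQueue(["QM1", "QM2"], bind_not_fixed=True)   # DEFBIND(NOTFIXED)

# QM1 (the bound instance) goes down:
print(q_open.put(available={"QM2"}))  # None -> message stuck on the xmit queue
print(q_nf.put(available={"QM2"}))    # QM2  -> rerouted to the surviving instance
```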

Architecture

An MQ cluster is ... You don't define a cluster as such; you define cluster attributes in each queue manager, and each queue manager becomes a member of a logical entity that is referred to as a "queue manager cluster".

It has 2 full repositories (FRs) and many partial repositories (PRs).

An FR holds information about the cluster topology: the participant qmgrs and the shared queues.

How do you add a queue manager as a partial repository, other than by creating a cluster-sender and a cluster-receiver channel? A qmgr becomes a partial repository (PR) when an object (queue, channel, etc.) is defined with the CLUSTER() attribute naming the cluster in which the object is to be known.

Cluster routing and operation

At the SMQ qmgr we can display the cluster-sender channels, all listening on the one transmission queue:

display qstatus(SYSTEM.CLUSTER.TRANSMIT.QUEUE) type(handle) APPLTAG CHANNEL
     6 : display qstatus(SYSTEM.CLUSTER.TRANSMIT.QUEUE) type(handle) APPLTAG CHANNEL
AMQ8450: Display queue status details.
   QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE)   TYPE(HANDLE)
   APPLTAG(C:\MQ\bin\amqrmppa.exe)        CHANNEL(TO.SMQ2)
AMQ8450: Display queue status details.
   QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE)   TYPE(HANDLE)
   APPLTAG(C:\MQ\bin\amqrmppa.exe)        CHANNEL(TO.SMQ3)
AMQ8450: Display queue status details.
   QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE)   TYPE(HANDLE)
   APPLTAG(C:\MQ\bin\amqrmppa.exe)        CHANNEL(TO.IB9QMGR)
SYSTEM.CLUSTER.TRANSMIT.QUEUE

To optimize its draining, you can add the parameter PipeLineLength=2 to qm.ini.
PipeLineLength=2 enables overlapping: messages are put onto TCP while waiting for acknowledgment of the previous batch, i.e. sending continues while waiting for batch synchronization at the remote system.

URL : To allow an MCA to transfer messages using multiple threads, type the number of concurrent threads that the channel will use. The default is 1; if you type a value greater than 1, it is treated as 2. Make sure that you configure the queue manager at both ends of the channel to have a Pipeline length that is greater than 1. Pipelining is effective only for TCP/IP channels.

Curious:

dis qstatus(SYSTEM.CLUSTER.TRANSMIT.QUEUE) type(handle) all
     5 : dis qstatus(SYSTEM.CLUSTER.TRANSMIT.QUEUE) type(handle) all
AMQ8101: WebSphere MQ error (18EBF0) has occurred.

We have to start the channels, and then we get:

dis qstatus(SYSTEM.CLUSTER.TRANSMIT.QUEUE) type(handle) all
    11 : dis qstatus(SYSTEM.CLUSTER.TRANSMIT.QUEUE) type(handle) all
AMQ8450: Display queue status details.
   QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE)   TYPE(HANDLE)
   APPLDESC(WebSphere MQ Channel)         APPLTAG(C:\MQ\bin\amqrmppa.exe)
   APPLTYPE(SYSTEM)                       BROWSE(YES)
   CHANNEL(TO.SMQ2)                       CONNAME(127.0.0.1(2417))
   ASTATE(NONE)                           HSTATE(ACTIVE)
   INPUT(SHARED)                          INQUIRE(YES)
   OUTPUT(YES)                            PID(9060)
   QMURID(0.155)                          SET(YES)
   TID(9)                                 URID(XA_FORMATID[] XA_GTRID[] XA_BQUAL[])
   URTYPE(QMGR)

What data do I need to join a cluster
DEFINE CHANNEL('CLUSTER-NAME.MY-QM-NAME') +
       CHLTYPE(CLUSRCVR) +
       TRPTYPE(TCP) +
       CLUSTER('OUR-CLUSTER') +               [1]

DEFINE CHANNEL('CLUSTER-NAME.FR-QM-NAME') +   [3]
       CHLTYPE(CLUSSDR) +
       TRPTYPE(TCP) +
       CLUSTER('OUR-CLUSTER') +               [1]
       CONNAME('remotehost.domain(1482)')     [2]

Real sample :

*** Create the CLUSTER RCVR channel ***
def channel(TO.QM01) +
    chltype(CLUSRCVR) +
    trptype(TCP) +
    conname('my.hostname(1414)') +
    cluster(CLUSTERNAME) +                 [1]
    maxmsgl(104857600) +
    replace

*** Create the CLUSTER SDR channel ***
def channel(CLUSTERNAME.QMFR) +            [3]
    chltype(CLUSSDR) +
    trptype(TCP) +
    conname('host.remoto.fr(1415)') +      [2]
    cluster(CLUSTERNAME) +                 [1]
    maxmsgl(104857600) +
    replace

To receive the cluster configuration information - the queues offered / visible to the cluster - we only need [1], the cluster name.
But to activate the CLUSRCVR channel, the CLUSSDR channel must be activated, i.e. we need the remote ip/port [2], and therefore the channel name [3].
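As a sketch of the above, a small Python helper (hypothetical, not an IBM tool) can generate the two definitions from exactly those three data items - cluster name [1], remote ip/port [2], channel name [3] - following the <cluster name>.<qmgr name> convention used later in these notes:

```python
# Sketch: build the two MQSC definitions needed to join a cluster,
# using the <cluster name>.<qmgr name> channel naming convention.
# All names used below (OURCLUS, QM01, hosts) are illustrative.

def join_cluster_mqsc(cluster, qmgr, my_conname, fr_qmgr, fr_conname):
    clusrcvr = (f"DEFINE CHANNEL({cluster}.{qmgr}) CHLTYPE(CLUSRCVR) "
                f"TRPTYPE(TCP) CONNAME('{my_conname}') CLUSTER({cluster})")
    clussdr  = (f"DEFINE CHANNEL({cluster}.{fr_qmgr}) CHLTYPE(CLUSSDR) "
                f"TRPTYPE(TCP) CONNAME('{fr_conname}') CLUSTER({cluster})")
    return [clusrcvr, clussdr]   # CLUSRCVR first, then CLUSSDR

cmds = join_cluster_mqsc("OURCLUS", "QM01",
                         "my.hostname(1414)",
                         "QMFR", "host.remote.fr(1415)")
for cmd in cmds:
    print(cmd)
```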

Preventing queue managers joining a cluster

It is difficult to stop a queue manager that is a member of a cluster from defining a queue. Therefore, there is a danger that a rogue queue manager can join a cluster, learn what queues are in it, define its own instance of one of those queues, and so receive messages that it should not be authorized to receive.

To prevent a queue manager from receiving messages that it should not, you can write a channel security exit:

MQ v7 queue manager clusters, csqzah09.pdf, pg 77


Naming / nomenclatura

The first idea is to name channels TO.<qmgr name>, as in the samples above,

but when it comes to overlapping clusters, that is not good enough {see PDF}
Advanced naming convention

[TR] The naming convention I try to use instead is <cluster name>.<qmgr name>, as in CLUSNAME.QMGRNAME, meaning "only one cluster per channel"

MQ cluster best practices {sagpdf}


Clustering demo

See commands in "\\MQ\Eines\Clustering_Demo\"

  1. MQ installation - product as is
  2. cluster configuration [administration] - easy to add new queues - no QREMOTE needed
  3. basic cluster operation - as MQ
  4. adding a new QM [scalability] - only 2 channels (+ the queues it offers)
  5. load test - 200 msg/second, of 1 KB
  6. workload balancing test [workload balancing] - without / with coefficients - CLWLWGHT/CLWLPRTY/CLWLRANK
  7. server failure [high availability] - QM down or queue unavailable (Put disabled)
  8. cluster access from an MQ Client, external to the cluster [MQ client] - QALIAS at entry node
Minimum actions - create a cluster
LO qm  ALTER QMGR REPOS(INVENTORY)
NY qm  ALTER QMGR REPOS(INVENTORY)
LO qm  DEFINE CHANNEL(TO.LONDON)  CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME(LONDON.CHSTORE.COM)  CLUSTER(INVENTORY)
NY qm  DEFINE CHANNEL(TO.NEWYORK) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME(NEWYORK.CHSTORE.COM) CLUSTER(INVENTORY)
LO qm  DEFINE CHANNEL(TO.NEWYORK) CHLTYPE(CLUSSDR)  TRPTYPE(TCP) CONNAME(NEWYORK.CHSTORE.COM) CLUSTER(INVENTORY)
NY qm  DEFINE CHANNEL(TO.LONDON)  CHLTYPE(CLUSSDR)  TRPTYPE(TCP) CONNAME(LONDON.CHSTORE.COM)  CLUSTER(INVENTORY)
NY qm  DEFINE QLOCAL(INVENTQ) CLUSTER(INVENTORY)

Add a QM (Paris) to the cluster.
PA qm  DEFINE CHANNEL(TO.PARIS)  CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME(PARIS.CHSTORE.COM)  CLUSTER(INVENTORY)  // CLUSRCVR must go first
PA qm  DEFINE CHANNEL(TO.LONDON) CHLTYPE(CLUSSDR)  TRPTYPE(TCP) CONNAME(LONDON.CHSTORE.COM) CLUSTER(INVENTORY)  // CLUSSDR must go second

Add a QM+Q (Toronto + INVENTQ) to the cluster.
TO qm  DEFINE CHANNEL(TO.TORONTO) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME(TORONTO.CHSTORE.COM) CLUSTER(INVENTORY)
TO qm  DEFINE CHANNEL(TO.NEWYORK) CHLTYPE(CLUSSDR)  TRPTYPE(TCP) CONNAME(NEWYORK.CHSTORE.COM) CLUSTER(INVENTORY)
TO qm  DEFINE QLOCAL(INVENTQ) CLUSTER(INVENTORY)

Verify
NY qm  DIS QCLUSTER(*) CLUSTER(INVENTORY)
NY qm  DIS CLUSQMGR(*) CLUSTER(INVENTORY)
TO qm  DIS QCLUSTER(*) CLUSTER(INVENTORY)
TO qm  DIS CLUSQMGR(*) CLUSTER(INVENTORY)

Load balance (LA gets twice as many messages as NY)
LA qm  DEFINE CHANNEL(TO.LA) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME(LA.CHSTORE.COM) CLUSTER(INVENTORY) CLWLWGHT(2)
NY qm  ALTER CHANNEL(TO.NEWYORK) CHLTYPE(CLUSRCVR) CLWLWGHT(1)
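A toy model (illustrative Python, not MQ) of what the INVENTORY command sequence above produces - which qmgrs end up as full repositories, and which host instances of INVENTQ:

```python
# Toy model of the INVENTORY sample: track repository role and queue
# instances as the ALTER/DEFINE commands above would create them.

cluster = {"repos": set(), "queues": {}}

def alter_repos(qm):
    # ALTER QMGR REPOS(INVENTORY): the qmgr becomes a full repository
    cluster["repos"].add(qm)

def define_qlocal(qm, qname):
    # DEFINE QLOCAL(...) CLUSTER(INVENTORY): another instance of the queue
    cluster["queues"].setdefault(qname, set()).add(qm)

alter_repos("LONDON")
alter_repos("NEWYORK")
define_qlocal("NEWYORK", "INVENTQ")
define_qlocal("TORONTO", "INVENTQ")

print(sorted(cluster["repos"]))             # ['LONDON', 'NEWYORK']
print(sorted(cluster["queues"]["INVENTQ"])) # ['NEWYORK', 'TORONTO']
```

DIS QCLUSTER(*) on any member would correspondingly show two instances of INVENTQ, one per hosting qmgr.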
Minimum actions - display the cluster
[fr/pr?] dis clusqmgr(*) conname qmtype status
Minimum actions - delete a cluster

You don't, as such. A cluster isn't an "entity" that can be deleted.
Once you have altered the clustered objects to remove the cluster attribute, you can issue the REFRESH CLUSTER command on that qmgr.

Minimum actions - Remove a queue from a cluster

Sample cluster : 2xFR, 1 GW, 1 external qmgr
echo "DEFINE QLOCAL(INVENTQ) CLUSTER(INVENTORY) DEFBIND(NOTFIXED)" | runmqsc TQM4
echo "DEFINE QLOCAL(INVENTQ) CLUSTER(INVENTORY) DEFBIND(NOTFIXED)" | runmqsc TQM2
echo "DEFINE QREMOTE(ANY.INVENTQ) RNAME(' ') RQMNAME(' ')" | runmqsc TQM1
echo "ALTER QREMOTE(INVENTQ) RNAME(INVENTQ) RQMNAME(ANY.INVENTQ) XMITQ(TQM1)" | runmqsc TQM3

You have to be able to deduce that TQM2 and TQM4 are the FRs, TQM1 is the gateway, and TQM3 is external to the cluster ...


Complete cluster : 2xFR, Nx ENT, Nx MB

See GNF

cluster GNF

Cluster resources' availability

Cluster commands

Available commands are :

DISPLAY QCLUSTER(*) CLUSQMGR - displays queues in cluster
     1 : display qcluster(*) clusqmgr
AMQ8409: Ver detalles de la cola.
   QUEUE(QL.CLSAG.CLFR1.SEBAS)   TYPE(QCLUSTER)   CLUSQMGR(CLFR1)
   QUEUE(QL.CLSAG.CLFR2.SEBAS)   TYPE(QCLUSTER)   CLUSQMGR(CLFR2)
   QUEUE(QL.DELPHI.GRAW.IN)      TYPE(QCLUSTER)   CLUSQMGR(QMAS)
   QUEUE(QL.DELPHI.GRAW.IN)      TYPE(QCLUSTER)   CLUSQMGR(CLFR1)
   QUEUE(QL.DELPHI.GRAW.IN)      TYPE(QCLUSTER)   CLUSQMGR(CLFR2)
DISPLAY CLUSQMGR(*) CONNAME QMTYPE STATUS - display queue managers in cluster
     1 : DISPLAY CLUSQMGR(*) CONNAME QMTYPE STATUS
AMQ8441: Ver detalles del gestor de colas de clúster.
   CLUSQMGR(CLFR1)   CHANNEL(SAGCLUSTER.CLFR1)   CLUSTER(SAGCLUSTER)
   CONNAME(99.137.164.25(2401))    QMTYPE(REPOS)    STATUS(RUNNING)
AMQ8441: Ver detalles del gestor de colas de clúster.
   CLUSQMGR(CLFR2)   CHANNEL(SAGCLUSTER.CLFR2)   CLUSTER(SAGCLUSTER)
   CONNAME(99.137.164.153(2401))   QMTYPE(REPOS)    STATUS(RUNNING)
AMQ8441: Ver detalles del gestor de colas de clúster.
   CLUSQMGR(QMAS)    CHANNEL(SAGCLUSTER.QMAS)    CLUSTER(SAGCLUSTER)
   CONNAME(6q(1491))               QMTYPE(NORMAL)   STATUS(RUNNING)
SUSPEND QMGR - use the SUSPEND QMGR command to remove a queue manager from a cluster temporarily, for example for maintenance

Syntax is : SUSPEND QMGR CLUSTER (cluster_name) [ MODE( QUIESCE | FORCE ) ]

I always finish removing a QM from a cluster by issuing the REFRESH CLUSTER command on the QM that is leaving the cluster.

RESUME QMGR - use the RESUME QMGR command to reinstate a queue manager to a cluster, after temporarily having removed it

Syntax is : RESUME QMGR CLUSTER (cluster_name)

REFRESH CLUSTER

Issue the REFRESH CLUSTER command from a queue manager to discard all locally held information about a cluster.
Using REFRESH CLUSTER(clustername) REPOS(YES) specifies that in addition to the default behavior, objects representing full repository cluster queue managers are also refreshed. This option may not be used if the queue manager is itself a full repository.

Issuing REFRESH CLUSTER is disruptive to the cluster.
It is strongly recommended that all cluster sender channels for the cluster are stopped before the REFRESH CLUSTER command is issued.

RESET CLUSTER - used to forcibly remove a queue manager from a cluster. You can do this from a full repository queue manager by issuing either the command:

RESET CLUSTER(clustername) QMNAME(qmname) ACTION(FORCEREMOVE) QUEUES(NO)

or the command

RESET CLUSTER(clustername) QMID(qmid) ACTION(FORCEREMOVE) QUEUES(NO)

  publib RESET CLUSTER

Using the RESET CLUSTER command is the only way to delete auto-defined cluster-sender channels

Chapter 6, "Queue Manager Clusters", SC34-6589-00.

UK 2013 :

RESET CLUSTER(PC.ECOMM) QMID(QMN.USR.2_2009-08-24_13.20.39) ACTION(FORCEREMOVE) QUEUES(YES)   // on FR
REFRESH CLUSTER(PC.ECOMM) REPOS(YES)                                                          // on QMN
Summary by Mr Saper

RESET is used to forcibly remove the information from the full repository.
SUSPEND leaves the information in the full repository intact and merely signals that you no longer want to be included in the load balancing.

In the same way: REFRESH checks the information in the FR and tries to add you if you are not there.
RESUME tells the FR that you are ready to rejoin the load balancing.

Cluster specific actions

How do you convert an FR into a PR?

alter qmgr repos(' ')

How do you convert a PR into an FR?

alter qmgr repos('mycluster')

Perform a "cold start" of the cluster, that is, refresh the cluster config (use on a PR qmgr):

REFRESH CLUSTER REPOS(YES)

Cluster troubleshooting

(One of the) Full Repository QMs fails. When it comes back, it does not see the remote cluster queues.

Solution: SUSPEND qmgr + RESUME qmgr

How do you know which qmgr(s) are the Full Repositories of a cluster?

Use the DISPLAY CLUSQMGR command to display cluster information about queue managers in a cluster. If you issue this command from a queue manager with a full repository, the information returned pertains to every queue manager in the cluster. If you issue this command from a queue manager that does not have a full repository, the information returned pertains only to the queue managers in which it has an interest. That is, every queue manager to which it has tried to send a message and every queue manager that holds a full repository.

Use the SUSPEND QMGR command and RESUME QMGR command to remove a queue manager from a cluster temporarily, for example for maintenance, and then to reinstate it.

In an emergency where a queue manager is temporarily damaged, you might want to inform the rest of the cluster before the other queue managers try to send it messages. RESET CLUSTER can be used to remove the damaged queue manager. Later when the damaged queue manager is working again, you can use the REFRESH CLUSTER command to reverse the effect of RESET CLUSTER and put it back in the cluster again.

Use the DISPLAY QCLUSTER(*) command to display all queues visible from a given cluster queue manager.

The DISPLAY QUEUE or DISPLAY QCLUSTER command returns the name of the queue manager that hosts the queue (or the names of all queue managers if there is more than one instance of the queue). It also returns the system name for each queue manager that hosts the queue, the queue type represented, and the date and time at which the definition became available to the local queue manager.

Cluster symptoms and solutions :

"Queue Manager Clusters", SC34-6589-00, csqzah07.pdf, apendix A.

Cluster change propagation - RESET CLUSTER

Quite often what we have seen occurring is that a queue manager is removed without first deleting its cluster resources. This leaves a situation where the rest of the cluster thinks the queue manager still exists. If you find this has occurred, you will need to use the RESET CLUSTER command to force the removed queue manager's definitions out of the cluster.

TMM10 - Introduction to WMQ Clustering.

2189 MQRC_CLUSTER_RESOLUTION_ERROR

url

The queue is being opened for the first time and the queue manager cannot make contact with any full repositories. Make sure that the CLUSSDR channels to the full repositories are not in retry state.

     1 : display clusqmgr(*) qmtype status
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(QM1)   CLUSTER(DEMO)   CHANNEL(TO.QM1)   QMTYPE(NORMAL)
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(QM2)   CLUSTER(DEMO)   CHANNEL(TO.QM2)   QMTYPE(REPOS)   STATUS(RUNNING)
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(QM3)   CLUSTER(DEMO)   CHANNEL(TO.QM3)   QMTYPE(REPOS)   STATUS(RUNNING)

url

Qmgr (new) values not updated in Cluster

If a queue manager has some values (such as the listener port) at the moment the cluster is created, a change in those values will not be propagated to the cluster (repository) unless the following procedure is used:

  1. alter QM1 to be a Partial Repository (supposing it was an FR)
  2. REFRESH CLUSTER with the REPOS(YES) option
  3. make QM1 a Full Repository again (if needed)

Repeat with QM2, the other FR.

Problems with clustering when changing IP

pending to expand

Problem : display CLUSQMGR shows SYSTEM.TEMPQMGR.*

This is a temporary situation: the temporary name goes away once the repositories are brought into sync with each other. This is documented in the MQ Queue Manager Clusters manual.

url, url

     1 : DISPLAY CLUSQMGR(*) CONNAME QMTYPE STATUS
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(P7029)   CHANNEL(SAGCLUSTER.P7029)   CLUSTER(SAGCLUSTER)
   CONNAME(9.137.166.87(2415))   QMTYPE(NORMAL)   STATUS(INACTIVE)
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(SYSTEM.TEMPQMGR.9.137.164.25(2401))   CHANNEL(SAGCLUSTER.CLFR1)   CLUSTER(SAGCLUSTER)
   CONNAME(9.137.164.25(2401))   QMTYPE(REPOS)    STATUS(RUNNING)
One MQSC command read.

Let the cluster settle down while you verify the status of all cluster channels.


Heartbeat and Keep Alive

When you are defining cluster-sender and cluster-receiver channels, choose a value for HBINT or KAINT that will detect a network or queue manager failure in a useful amount of time, but will not burden the network with too many heartbeat or keepalive flows.

MQ v 5.3, "Clustering", SC34-6061-02, page 79 [95/183]

On platforms other than z/OS, if you need the functionality provided by the KAINT parameter (Keep Alive), use the Heartbeat Interval (HBINT) parameter instead.

MQ v 6.0, "MQSC Reference", SC34-6597-00, page 130 [150/501]


What about my applications?

You need not alter any of your applications if you are going to set up a simple MQ cluster. The applications name the target queue on the MQOPEN(queue_name) call as usual, and need not be concerned about the location of the queue manager [MQCONN(qmgr_name)].


Using clusters for workload management + more than one instance of a queue

Clustering, SC34-6061-02, page 63/183

You can organize your cluster such that the queue managers in it are clones of each other, able to run the same applications and have local definitions of the same queues.

The advantages of using clusters in this way are:

Any one of the queue managers that hosts an instance of a particular queue can handle messages destined for that queue. This means that applications need not explicitly name the queue manager when sending messages. A workload management algorithm determines which queue manager should handle the message.


Workload balancing

When you have clusters containing more than one instance of the same queue, MQ uses a workload management algorithm to determine the best queue manager to route a message to. The workload management algorithm selects the local queue manager as the destination whenever possible. If there is no instance of the queue on the local queue manager, the algorithm determines which destinations are suitable. Suitability is based on the state of the channel (including any priority you might have assigned to the channel), and also the availability of the queue manager and queue. The algorithm uses a round-robin approach to finalize its choice between the suitable queue managers.
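The documented behaviour (local instance preferred, otherwise round-robin over suitable destinations, biased by CLWLWGHT) can be sketched in Python. This is a simplification: the real algorithm weighs more factors (CLWLPRTY, CLWLRANK, channel state).

```python
# Toy sketch of cluster workload selection: prefer the local instance,
# otherwise round-robin over suitable instances, weighted by CLWLWGHT.
import itertools

class Chooser:
    def __init__(self, local_qm, instances, weights):
        self.local = local_qm
        # expand each instance by its CLWLWGHT so weight 2 gets twice the slots
        ring = [qm for qm in instances for _ in range(weights.get(qm, 1))]
        self.rr = itertools.cycle(ring)
        self.size = len(ring)

    def choose(self, suitable):
        # 'suitable' is assumed to contain only queue-hosting, reachable qmgrs
        if self.local in suitable:
            return self.local            # local instance always preferred
        for _ in range(self.size):
            qm = next(self.rr)
            if qm in suitable:
                return qm
        return None                      # nothing suitable: the message waits

# LA has CLWLWGHT(2), NY has CLWLWGHT(1); the putting qmgr QM0 hosts no instance
c = Chooser(local_qm="QM0", instances=["LA", "NY"], weights={"LA": 2, "NY": 1})
picks = [c.choose(suitable={"LA", "NY"}) for _ in range(6)]
print(picks)  # ['LA', 'LA', 'NY', 'LA', 'LA', 'NY'] -> LA gets twice as many
```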

If an application opens a target queue so that it can write messages to it, the MQOPEN call chooses between all available instances of the queue. Any local version of the queue is chosen in preference to other instances. This might limit the ability of your applications to exploit clustering.

If it is not appropriate to modify your applications to remove message affinities, there are a number of other possible solutions to the problem. For example, you can

Clustering, SC34-6061-02, page 65 to 70/183
v6, pg 60 [78/201]


The cluster workload management algorithm

If a local queue within the cluster becomes unavailable while a message is in transit, the message is forwarded to another instance of the queue, but only if the queue was opened (MQOPEN) with the MQOO_BIND_NOT_FIXED open option, or if MQOPEN specified MQOO_BIND_AS_Q_DEF and the queue's DEFBIND attribute is NOTFIXED.

MQ 6.0 Queue Manager Clusters, csqzah07.pdf, SC34-6589-00, page 51

To route all messages put to a queue using MQPUT to the same queue manager by the same route, use the MQOO_BIND_ON_OPEN option on the MQOPEN call. To specify that a destination is to be selected at MQPUT time, that is, on a message-by-message basis, use the MQOO_BIND_NOT_FIXED option on the MQOPEN call.

MQ 6.0 Programming Guide, page 96 [116/601]

The workload management algorithm selects the local queue manager as the destination whenever possible.

from MQ 5.3 Clustering, SC34-6061-02, page 49

On v6 you can change the workload balancing algorithm so that it does not use a preferred-local strategy.

On v5.x, you can use a cluster workload exit, or you can use a different queue manager for your PUTs than you do for your GETs; this other qmgr would be in the cluster but would not have a local instance of queue X.

CLWLUSEQ := ANY ; { Local, Any, Queue Manager }
The queue manager treats the local queue as another instance of the cluster queue for the purposes of workload distribution.

MQ v6 "MQSC" SC34-6587-00, pg 50 [70/501]

WorkLoad algorithm detailed

WorkLoad algorithm


Client access to a cluster

SET MQSERVER=QMS3.SVRCONN/tcp/localhost(1423)
DEFINE QALIAS(QSAGCLU) TARGQ(QSEBAS)
amqsputc QSAGCLU

(diagram: [ client ] <---> [ gw ] at the cluster edge, routing to [ server_1 ] and [ server_2 ] inside the cluster)

How to code it


Straight WLB

Let's make it run !

Let's use Alias

If TQM3 has an alias queue WLMAQ, whose target queue is WLMQ1, then WLG.EXE can write to it, and the messages still get to the (split) queues.

External access (fail)

If another qm TQM4 (external to the cluster) writes into RMQ99, a remote queue pointing to queue WLMAQ and manager TQM3, the messages go into TQM3DLQ, TQM3's dead-letter queue, with Reason 2082 = MQRC_UNKNOWN_ALIAS_BASE_Q in the dead-letter header, because the message carries the destination queue manager field ... and there is no such queue there!

Solution : in the Gateway queue manager (TQM3), set a queue manager alias

See Put & Destination !


SAGCLUSTER

The cluster I have for testing is like this

hostname   IP        Port  Op Sys              MQ version  Qmgr Name  MB version  MB name    FR/PR
---------  --------  ----  ------------------  ----------  ---------  ----------  ---------  ----------------------
patan      .164.249  2401  wxp                 7.5.0.1     CLSPATAN   -           -          PR (server)
lab005     .164.25   2401  wxp                 7.5.0.1     CLFR1      -           -          FR (main)
6Q         .164.234  1491  w2008 SR2 {64-bit}  7.5.0.1     QMAS       7.0.0.1     BKAS       PR (moves to BISC net)
p9111      .166.86   2416  SLES 10 (ppc)       7.5         P9111      -           -          PR
p7029      .166.87   2415  SLES 10 (ppc)       7.5         P7029      -           -          PR
labss2     .         .     RH v4               7.5         CLSSS2     -           -          PR
rhv6-64b   .164.32   2401  RH v6.1 {64-bit}    7.5.0.1     CLFR2      8.0.0.2     MB64B      FR (mix 32/64 bits)
t400       .165.248  1491  wxp                 7.5.0.1     (MB7QMGR)  7.0.0.1     MB7BROKER  PR (or Client)

Shared queues are QL.DELPHI.GRAW.IN & QL.DELPHI.GRAW.OUT, user is MQ_USER_RAW of group MQ_GROUP_RAW

(ASCII diagram of SAGCLUSTER: MQ client T400 and CLSPATAN (patan) at the edge; p9111 and p7029 each hold QL.IN, with QR.MH as the reply route; 6q/QMAS holds QL.RSP; CLFR1 (lab005) and CLFR2 (rh64b) are the full repositories)

Some definitions I have

p7029 : DEFINE QREMOTE(QR.MH) RNAME(QL.RSP) RQMNAME(' ')
p7029 : define ql(QL.IN) CLUSTER(SAGCLUSTER)
QMAS  : define ql(QL.RSP) CLUSTER(SAGCLUSTER)
p9111 : define qalias(QL.DADES.TRIG) target(QL.P9111) cluster(SAGCLUSTER) defbind(notfixed) replace

A message sent by T400 into the cluster is addressed to queue QL.IN, so it gets to p7029 via the cluster. It has ReplyToQueue(QR.MH) and ReplyToQmgr(PATAN), so we get "mqrc = 2087" (MQRC_UNKNOWN_REMOTE_Q_MGR), as there is no QL.RSP at the PATAN qmgr.

Peter's idea:
Have the putting application, the pseudo-requester, specify the real reply queue name in the ReplyToQueue field of the MQMD of the 'request' message, and fill in the ReplyToQMgr field with a value called VITOR_WUZ_HERE, or any other value you like. Just don't leave it blank, and don't fill it in with the name of a real QM.
The message will arrive at the 'replying' app with the reply-to-queue field filled in with the real reply queue name, and the ReplyToQMgr filled in with VITOR_WUZ_HERE. When the app 'replies', it opens the reply queue specifying both the destination queue (the real reply queue) and the destination QM (VITOR_WUZ_HERE).
Ensure there is a QM alias called VITOR_WUZ_HERE that routes messages to an XMITQ that gets you back to a queue manager in the cluster. I'm assuming the replying app is connected to a QM outside the cluster. On the QM in the cluster that has the RCVR channel from the QM outside the cluster, create a QM alias called VITOR_WUZ_HERE that has a blank Remote Q, blank Remote QM Name and blank XMITQ attribute. As messages arrive destined for a QM called VITOR_WUZ_HERE, this alias will blank out the destination QM; MQ name resolution then kicks in looking for that reply queue without a specific QM, and the message will load-balance inside the cluster.
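A toy sketch (Python, names taken from the discussion above) of why the blank QM alias works: blanking the destination qmgr lets normal cluster name resolution, and thus workload balancing, take over for the reply.

```python
# Toy model of the VITOR_WUZ_HERE trick: a queue-manager alias with a blank
# RQMNAME blanks out the destination qmgr, so normal cluster name resolution
# (workload balancing) applies to the reply message.

def resolve(dest_q, dest_qm, qm_aliases, cluster_instances):
    """Very simplified name resolution on the cluster gateway."""
    if qm_aliases.get(dest_qm) == "":
        dest_qm = ""                  # the alias blanks the destination qmgr
    if dest_qm:
        return [dest_qm]              # explicitly addressed: one destination
    return cluster_instances[dest_q]  # balanced over all hosting instances

aliases = {"VITOR_WUZ_HERE": ""}      # DEFINE QREMOTE(...) RQMNAME(' ') style
instances = {"REPLY.Q": ["QM1", "QM2"]}

print(resolve("REPLY.Q", "VITOR_WUZ_HERE", aliases, instances))  # ['QM1', 'QM2']
print(resolve("REPLY.Q", "QM1", aliases, instances))             # ['QM1']
```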


Cluster Scripting - discovering cluster params

On any qmgr:

set QMN=QMAS
echo 1
echo DISPLAY CHANNEL(*) CHLTYPE(CLUSSDR) ALL | runmqsc %QMN%
echo 2
echo DISPLAY CHANNEL(*) CHLTYPE(CLUSSDR) ALL | runmqsc %QMN% | find "CLUSTER("
echo 3
echo DISPLAY CHANNEL(*) CHLTYPE(CLUSSDR) ALL | runmqsc %QMN% | find "CONNAME("

On FR:

set QMN=QMFR1
echo DISPLAY CLUSQMGR(*) CONNAME QMTYPE STATUS | runmqsc %QMN%   - display queue managers in cluster
echo DISPLAY QCLUSTER(*) CLUSQMGR | runmqsc %QMN%                - display queues in cluster

Cluster monitoring

MQ cluster best practices {sagpdf}, publib

Cluster Health monitoring tool (Delphi, of course)

Input - a file with

SET CNAME=      // cluster name
SET FR1NAME=    // full repository (1) qmgr name
SET FR2NAME=    // full repository (2) qmgr name
SET NUMQMCL=    // number of queue managers in cluster (apart from the FRs)
:: "N" times
SET QM01NM=     // 1-st queue manager - name
SET QM01LS=     // 1-st queue manager - listener port
SET QM02NM=     // 2-nd queue manager - name
SET QM02LS=     // 2-nd queue manager - listener port
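A minimal Python sketch for reading that batch-style input file into a dictionary (variable names as in the sample; the parsing rules are an assumption):

```python
# Sketch: parse "SET NAME=value // comment" lines from the monitoring
# tool's input file into a dict. ":: ..." lines are treated as comments.

def parse_config(text):
    cfg = {}
    for line in text.splitlines():
        line = line.split("//")[0].strip()      # drop trailing comments
        if line.upper().startswith("SET ") and "=" in line:
            name, _, value = line[4:].partition("=")
            cfg[name.strip()] = value.strip()
    return cfg

sample = """
SET CNAME=SAGCLUSTER   // cluster name
SET FR1NAME=CLFR1      // full repository (1) qmgr name
SET NUMQMCL=2          // number of queue managers in cluster
"""
cfg = parse_config(sample)
print(cfg["CNAME"], cfg["FR1NAME"], int(cfg["NUMQMCL"]))  # SAGCLUSTER CLFR1 2
```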

Output shall be

Cluster connectivity & availability monitoring tool

In all PRs of the cluster, we can install a "responder" waiting on a specific queue.

The monitor program will have a list of queue managers and will send a message to each of them, verifying that a message can reach it and come back.

This asserts the availability of the cluster's shared objects, to some level.

AMQSCLM - the cluster queue monitoring sample program

*** publib ***


Multiple cluster XMITQ

All you do to use multiple cluster transmission queues is change the default cluster transmission queue type on the gateway queue manager: change the value of the queue manager attribute DEFCLXQ.

Changing the default to separate cluster transmission queues to isolate message traffic

The default cluster transmission queue is set as a queue manager attribute, DEFCLXQ. Its value is either SCTQ or CHANNEL. New and migrated queue managers are set to SCTQ. You can alter the value to CHANNEL.

Cluster transmission queues and cluster-sender channels

The values of DefClusterXmitQueueType are MQCLXQ_SCTQ or MQCLXQ_CHANNEL.

DefClusterXmitQueueType (MQLONG)

You have some choices to make when you are planning how to configure a queue manager to select a cluster transmission queue.

Clustering: Planning how to configure cluster transmission queues

If you set the queue manager attribute DEFCLXQ to CHANNEL, a different cluster transmission queue is created automatically from SYSTEM.CLUSTER.TRANSMIT.MODEL.QUEUE for each cluster-sender channel.

Cluster queues

display qmgr
     1 : display qmgr
AMQ8408: Display Queue Manager details.
   QMNAME(SMQ)     ACCTCONO(DISABLED)
   DEADQ(QL.DLQ)   DEFCLXQ(SCTQ)
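The selection logic described above can be sketched as follows; the SYSTEM.CLUSTER.TRANSMIT.&lt;channel&gt; naming for the per-channel queues created from SYSTEM.CLUSTER.TRANSMIT.MODEL.QUEUE is my understanding and worth verifying against your MQ level:

```python
# Sketch of cluster transmission queue selection by DEFCLXQ.
# With SCTQ every cluster-sender channel shares one queue; with CHANNEL
# each channel gets its own queue, created from the model queue
# SYSTEM.CLUSTER.TRANSMIT.MODEL.QUEUE (naming assumed, verify on your level).

def cluster_xmitq(defclxq, channel):
    if defclxq == "CHANNEL":
        return "SYSTEM.CLUSTER.TRANSMIT." + channel
    return "SYSTEM.CLUSTER.TRANSMIT.QUEUE"   # default: one shared SCTQ

print(cluster_xmitq("SCTQ", "TO.SMQ2"))      # SYSTEM.CLUSTER.TRANSMIT.QUEUE
print(cluster_xmitq("CHANNEL", "TO.SMQ2"))   # SYSTEM.CLUSTER.TRANSMIT.TO.SMQ2
```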

Enabling SSL in an existing WebSphere MQ cluster, developerWorks, Ian Vanstone : runmqckm commands, complete sample

About cluster security on a tricky configuration


{bestp}
Clustering best practices, hints, etc
Repository Query command

If you want to have a look into cluster repository, use this command:

c:\> amqrfdm /?
WebSphere MQ Repository Query Program written by Paul Clarke
Usage : amqrfdm [-m QMgrName] [-d]
Interesting summary

#1 Regardless of how many FRs you have, each FR should have a manual CLUSSNDR defined to every other FR.

#2 If every FR has a CLUSSNDR to every other FR, each FR will know about every cluster attribute on every QM in the cluster.

#3 A PR will only ever publish info to 2 FRs. A PR will only ever subscribe to 2 FRs. Period. It doesn't matter how many manual CLUSSNDRs you define on that PR. A PR will only ever send its info (publish) to 2 FRs and will only get updates (subscribe) from 2 FRs.

#4 You should only define one CLUSSNDR to one FR from a PR.

#5 If 2 FRs go down in your cluster, your cluster will still be able to send messages just fine. But any changes to cluster definitions become a problem. Any PRs that used both of these down FRs will still function for messaging, but they will not be made aware of any changes in the cluster, because both of their FRs are unavailable.

#6 If two of your FRs are down, and you still have other FRs, you could go to your PRs and delete the CLUSSNDR to the down FR, define a CLUSSNDR to an available FR and issue REFRESH CLUSTER(*) REPOS(YES). This would cause your PR to register with an available FR and thus pick up cluster changes.

#7 In a properly designed system the likelihood of 2 FRs being down is next to zero, so the need for more than 2 FRs is next to zero. And even if both FRs are down it doesn't mean your cluster will come to a screeching halt.

Just use 2 FRs.

Replace FR steps

If you want to keep the IP or QMGR name, keep in mind that the QMID (which includes CRDATE and CRTIME) will certainly be different.

On the local qmgr, use DISPLAY Q(*) WHERE (CLUSTER NE ' ') to see which queues are shared in the cluster.

  1. stop sharing objects by setting CLUSTER(' ')
  2. make sure all queues are empty (as probably this qmgr will never come back)
  3. (pend) stop CLUSSDR channel (?)
  4. remove qmgr from cluster, using SUSPEND command
  5. take down qmgr
  6. add new qmgr (with same name) to cluster
  7. set CLUSTER('cluster-name') property on objects that need to be shared
Garbage collector problems

When objects in the cluster repository cache are modified (for example, changing an attribute on a cluster queue), the details for that object are republished to the cluster. Previous records for the object may persist for some time in the cluster cache, so that applications currently using them (for instance having opened the queue for output) can continue processing without interruption.
Periodically, the repository process attempts to 'garbage collect' these older records, checking whether they are still in use. Where multiple such records exist for a particular cluster queue manager object (the record in the cache which stores information about the channel definition used to reach a remote queue manager), and these are held in use for a prolonged period, an error in the logic makes it possible for the storage for parts of these queue manager records to be reused (for example, overwritten to hold another object) while actually still required.

Solution:

REFRESH CLUSTER(*) REPOS(YES)

wwqa

A few Q&A
My clustering doubts

Links
Books

Updated 20180829