Purpose : workload balancing, simplified administration, scalability
Problem : message(s) stuck on the transmission queue(s)!
Requirement : "DEFINE QL(name) DEFBIND(NOTFIXED)" or MQOO_BIND_NOT_FIXED, instead of "DEFBIND(OPEN)"
An MQ cluster is not something you define as such; you define cluster attributes in each queue manager, and each queue manager becomes a member of a logical entity that is referred to as a "queue manager cluster".
A cluster typically has two full repositories (FRs) and many partial repositories (PRs).
An FR holds information about the cluster topology: the participating qmgrs and the shared queues.
How do you add a queue manager as a partial repository, other than by creating a cluster-sender and a cluster-receiver channel? You don't. A qmgr becomes a partial repository (PR) when an object (queue, channel, etc.) is defined with a CLUSTER() attribute naming the cluster in which the object is to be known.
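The join just described can be sketched in MQSC. This is a minimal sketch: the host names, port numbers and the queue name QL.DEMO are placeholders; SAGCLUSTER and CLFR1 are the cluster and FR names used elsewhere in these notes.

```mqsc
* On the joining queue manager (here called QMPR1): a cluster-receiver
* channel that tells the rest of the cluster how to reach us ...
DEFINE CHANNEL('SAGCLUSTER.QMPR1') CHLTYPE(CLUSRCVR) +
       TRPTYPE(TCP) CONNAME('qmpr1.host(1414)') +
       CLUSTER('SAGCLUSTER') REPLACE

* ... and one manual cluster-sender channel pointing at a full repository.
DEFINE CHANNEL('SAGCLUSTER.CLFR1') CHLTYPE(CLUSSDR) +
       TRPTYPE(TCP) CONNAME('clfr1.host(2401)') +
       CLUSTER('SAGCLUSTER') REPLACE

* Defining any object with the CLUSTER() attribute makes QMPR1 a PR
* of SAGCLUSTER and advertises the queue to the whole cluster.
DEFINE QLOCAL('QL.DEMO') CLUSTER('SAGCLUSTER') DEFBIND(NOTFIXED)
```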
On the SMQ qmgr we can display the cluster-sender channels, all of which serve one transmission queue:
To optimize the draining of that queue, the parameter PipeLineLength=2 can be added to qm.ini.
PipeLineLength=2 enables overlap: messages are put onto TCP while waiting for acknowledgment of the previous batch.
This enables overlap of sending messages while waiting for batch synchronization at the remote system.
URL : To allow an MCA to transfer messages using multiple threads, type the number of concurrent threads that the channel will use. The default is 1; if you type a value greater than 1, it is treated as 2. Make sure that you configure the queue manager at both ends of the channel to have a Pipeline length that is greater than 1. Pipelining is effective only for TCP/IP channels.
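In qm.ini the parameter lives in the CHANNELS stanza; a minimal fragment (remember to set it on the queue managers at both ends of the channel):

```ini
CHANNELS:
   PipeLineLength=2
```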
Curious:
We have to start the channels, and we obtain:
Real sample :
To receive information about the cluster configuration
- the queues offered / visible to the cluster -
we only need [1], the cluster name.
But to activate the CLUSRCVR channel, the CLUSSDR channel must be activated,
that is, we need the remote ip/port [2], and therefore the channel name [3].
It is difficult to stop a queue manager that is a member of a cluster from defining a queue. Therefore, there is a danger that a rogue queue manager can join a cluster, learn what queues are in it, define its own instance of one of those queues, and so receive messages that it should not be authorized to receive.
To prevent a queue manager from receiving messages that it should not, you can write:
MQ v7 queue manager clusters, csqzah09.pdf, pg 77
First idea is to name
[TR] The naming convention I try to use instead is <cluster name>.<qmgr name>, as in CLUSNAME.QMGRNAME, meaning "only one cluster per channel".
MQ cluster best practices {sagpdf}
See commands in "\\MQ\Eines\Clustering_Demo\"
You don't, as such. A cluster isn't an "entity" that can be deleted.
Once you have altered the clustered objects to remove the cluster attribute,
you can issue the REFRESH CLUSTER command on that qmgr.
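A sketch of the sequence on a queue manager leaving the cluster (object and cluster names are illustrative, taken from elsewhere in these notes):

```mqsc
* Stop attracting new work, then strip the CLUSTER attribute
* from every clustered object this qmgr owns.
SUSPEND QMGR CLUSTER('SAGCLUSTER')
ALTER QLOCAL('QL.DEMO') CLUSTER(' ')
ALTER CHANNEL('SAGCLUSTER.QMPR1') CHLTYPE(CLUSRCVR) CLUSTER(' ')
ALTER CHANNEL('SAGCLUSTER.CLFR1') CHLTYPE(CLUSSDR) CLUSTER(' ')

* Finally, discard locally held cluster information.
REFRESH CLUSTER('SAGCLUSTER') REPOS(YES)
```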
You have to be able to deduce that QM2 and QM4 are the FRs, QM1 is the gateway, and QM3 is external to the cluster ...
See GNF
Available commands are :
DISPLAY QCLUSTER(*) CLUSQMGR - displays the queues in the cluster :

    display qcluster(*) clusqmgr
    AMQ8409: Display Queue details.
       QUEUE(QL.CLSAG.CLFR1.SEBAS)  TYPE(QCLUSTER)  CLUSQMGR(CLFR1)
       QUEUE(QL.CLSAG.CLFR2.SEBAS)  TYPE(QCLUSTER)  CLUSQMGR(CLFR2)
       QUEUE(QL.DELPHI.GRAW.IN)     TYPE(QCLUSTER)  CLUSQMGR(QMAS)
       QUEUE(QL.DELPHI.GRAW.IN)     TYPE(QCLUSTER)  CLUSQMGR(CLFR1)
       QUEUE(QL.DELPHI.GRAW.IN)     TYPE(QCLUSTER)  CLUSQMGR(CLFR2)

DISPLAY CLUSQMGR(*) CONNAME QMTYPE STATUS - displays the queue managers in the cluster :

    DISPLAY CLUSQMGR(*) CONNAME QMTYPE STATUS
    AMQ8441: Display Cluster Queue Manager details.
       CLUSQMGR(CLFR1)  CHANNEL(SAGCLUSTER.CLFR1)  CLUSTER(SAGCLUSTER)
       CONNAME(99.137.164.25(2401))   QMTYPE(REPOS)   STATUS(RUNNING)
    AMQ8441: Display Cluster Queue Manager details.
       CLUSQMGR(CLFR2)  CHANNEL(SAGCLUSTER.CLFR2)  CLUSTER(SAGCLUSTER)
       CONNAME(99.137.164.153(2401))  QMTYPE(REPOS)   STATUS(RUNNING)
    AMQ8441: Display Cluster Queue Manager details.
       CLUSQMGR(QMAS)   CHANNEL(SAGCLUSTER.QMAS)   CLUSTER(SAGCLUSTER)
       CONNAME(6q(1491))              QMTYPE(NORMAL)  STATUS(RUNNING)

SUSPEND QMGR - use the SUSPEND QMGR command to remove a queue manager from a cluster temporarily, for example for maintenance.
Syntax : SUSPEND QMGR CLUSTER(cluster_name) [ MODE( QUIESCE | FORCE ) ]
When removing a QM from a cluster, I always conclude by issuing the REFRESH CLUSTER command on the QM leaving the cluster.
RESUME QMGR - use the RESUME QMGR command to reinstate a queue manager into a cluster, after having temporarily removed it.
Syntax : RESUME QMGR CLUSTER(cluster_name)

REFRESH CLUSTER - issue the REFRESH CLUSTER command from a queue manager to discard all locally held information about a cluster.
Using REFRESH CLUSTER(clustername) REPOS(YES) specifies that, in addition to the default behavior, objects representing full repository cluster queue managers are also refreshed. This option may not be used if the queue manager is itself a full repository.
Issuing REFRESH CLUSTER is disruptive to the cluster.
It is strongly recommended that all cluster-sender channels for the cluster are stopped before the REFRESH CLUSTER command is issued.

RESET CLUSTER - used to forcibly remove a queue manager from a cluster. You can do this from a full repository queue manager by issuing either the command:
RESET CLUSTER(clustername) QMNAME(qmname) ACTION(FORCEREMOVE) QUEUES(NO)
or the command:
RESET CLUSTER(clustername) QMID(qmid) ACTION(FORCEREMOVE) QUEUES(NO)
Using the RESET CLUSTER command is the only way to delete auto-defined cluster-sender channels.
Chapter 6, "Queue Manager Clusters", SC34-6589-00.
UK 2013 :
RESET is used to forcibly remove the information from the full repository.
SUSPEND leaves the information in the full repository intact and merely signals that you no longer want to be included in the load balancing.
The same way:
REFRESH checks the information in the FR and tries to add you if you are not there.
RESUME tells the FR that you are ready to rejoin the load balancing.
How do you convert an FR into a PR?
How do you convert a PR into an FR?
Perform a "cold start" of the cluster, that is, refresh the cluster configuration (use on the PR qmgr).
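Both conversions hinge on the REPOS attribute of the queue manager; a sketch (SAGCLUSTER is illustrative):

```mqsc
* PR -> FR : make this qmgr host a full repository for the cluster.
ALTER QMGR REPOS('SAGCLUSTER')

* FR -> PR : clear the repository attribute, then refresh so the
* qmgr rebuilds its view of the cluster as a partial repository.
ALTER QMGR REPOS(' ')
REFRESH CLUSTER('SAGCLUSTER') REPOS(YES)
```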
(one of the) full repository QMs fails. When it comes back, it does not see the remote cluster queues.
Solution : SUSPEND QMGR + RESUME QMGR
Use the DISPLAY CLUSQMGR command to display cluster information about queue managers in a cluster. If you issue this command from a queue manager with a full repository, the information returned pertains to every queue manager in the cluster. If you issue this command from a queue manager that does not have a full repository, the information returned pertains only to the queue managers in which it has an interest. That is, every queue manager to which it has tried to send a message and every queue manager that holds a full repository.
Use the SUSPEND QMGR command and RESUME QMGR command to remove a queue manager from a cluster temporarily, for example for maintenance, and then to reinstate it.
In an emergency where a queue manager is temporarily damaged, you might want to inform the rest of the cluster before the other queue managers try to send it messages. RESET CLUSTER can be used to remove the damaged queue manager. Later, when the damaged queue manager is working again, you can use the REFRESH CLUSTER command to reverse the effect of RESET CLUSTER and put it back into the cluster.
Use the DISPLAY QCLUSTER(*) command to display all queues visible from a given cluster queue manager.
The DISPLAY QUEUE or DISPLAY QCLUSTER command returns the name of the queue manager that hosts the queue (or the names of all queue managers if there is more than one instance of the queue). It also returns the system name for each queue manager that hosts the queue, the queue type represented, and the date and time at which the definition became available to the local queue manager.
Cluster symptoms and solutions :
"Queue Manager Clusters", SC34-6589-00, csqzah07.pdf, apendix A.
Quite often we have seen a queue manager removed without first deleting its cluster resources. This leaves the rest of the cluster thinking the queue manager still exists. If you find this has occurred, you will need to use the RESET CLUSTER command to force the removed queue manager's definitions out of the cluster.
TMM10 - Introduction to WMQ Clustering.
The queue is being opened for the first time and the queue manager cannot make contact with any full repositories. Make sure that the CLUSSDR channels to the full repositories are not in retry state.
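A quick way to check for that condition from runmqsc (the channel name pattern SAGCLUSTER.* is an assumption based on the naming used in these notes):

```mqsc
* Show only channels currently in retry state.
DISPLAY CHSTATUS('SAGCLUSTER.*') WHERE(STATUS EQ RETRYING)

* Or list every cluster queue manager with its channel state.
DISPLAY CLUSQMGR(*) CHANNEL STATUS
```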
If a queue manager has certain values (such as the listener port) at the moment the cluster is created, a change to those values will not be propagated to the cluster (repository) unless the following procedure is used:
Repeat with QM2, the other FR.
pending to expand
This is a temporary situation: the temporary name goes away once the repositories are brought into sync with each other. This is documented in the MQ Queue Manager Clusters manual.
Let the cluster settle down while you verify all cluster channel status
When defining cluster-sender and cluster-receiver channels, choose a value for HBINT or KAINT that will detect a network or queue manager failure in a useful amount of time, but will not burden the network with too many heartbeat or keep-alive flows.
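For example, a CLUSRCVR with a 60-second heartbeat (channel name, host and port are placeholders):

```mqsc
DEFINE CHANNEL('SAGCLUSTER.QMPR1') CHLTYPE(CLUSRCVR) +
       TRPTYPE(TCP) CONNAME('qmpr1.host(1414)') +
       CLUSTER('SAGCLUSTER') HBINT(60) REPLACE
```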
MQ v 5.3, "Clustering", SC34-6061-02, page 79 [95/183]
On platforms other than z/OS, if you need the functionality provided by the KAINT (Keep Alive) parameter, use the Heartbeat Interval (HBINT) parameter instead.
MQ v 6.0, "MQSC Reference", SC34-6597-00, page 130 [150/501]
You need not alter any of your applications to set up a simple MQ cluster. Applications name the target queue on the MQOPEN(queue_name) call as usual and need not be concerned about the location of the queue manager [MQCONN(qmgr_name)].
Clustering, SC34-6061-02, page 63/183
You can organize your cluster such that the queue managers in it are clones of each other, able to run the same applications and have local definitions of the same queues.
The advantages of using clusters in this way are:
Any one of the queue managers that hosts an instance of a particular queue can handle messages destined for that queue. This means that applications need not explicitly name the queue manager when sending messages. A workload management algorithm determines which queue manager should handle the message.
When you have clusters containing more than one instance of the same queue, MQ uses a workload management algorithm to determine the best queue manager to route a message to. The workload management algorithm selects the local queue manager as the destination whenever possible. If there is no instance of the queue on the local queue manager, the algorithm determines which destinations are suitable. Suitability is based on the state of the channel (including any priority you might have assigned to the channel), and also the availability of the queue manager and queue. The algorithm uses a round-robin approach to finalize its choice between the suitable queue managers.
If an application opens a target queue so that it can write messages to it, the MQOPEN call chooses between all available instances of the queue. Any local version of the queue is chosen in preference to other instances. This might limit the ability of your applications to exploit clustering.
If it is not appropriate to modify your applications to remove message affinities, there are a number of other possible solutions to the problem. For example, you can
Clustering, SC34-6061-02, page 65 to 70/183
v6, pg 60 [78/201]
If a local queue within the cluster becomes unavailable while a message is in transit, the message is forwarded to another instance of the queue, but only if the queue was opened (MQOPEN) with the MQOO_BIND_NOT_FIXED open option, or if MQOPEN specified MQOO_BIND_AS_Q_DEF and the queue's DEFBIND attribute is NOTFIXED.
MQ 6.0 Queue Manager Clusters, csqzah07.pdf, SC34-6589-00, page 51
To route all messages put to a queue using MQPUT to the same queue manager by the same route, use the MQOO_BIND_ON_OPEN option on the MQOPEN call. To specify that a destination is to be selected at MQPUT time, that is, on a message-by-message basis, use the MQOO_BIND_NOT_FIXED option on the MQOPEN call.
MQ 6.0 Programming Guide, page 96 [116/601]
The workload management algorithm selects the local queue manager as the destination whenever possible.
from MQ 5.3 Clustering, SC34-6061-02, page 49
On v6 you can change the workload balancing algorithm so that it does not use a prefer-local strategy.
On v5.x, you can use a cluster workload exit, or you can use a different queue manager for your PUTs than for your GETs; this other qmgr would be in the cluster but not have a local queue X.
CLWLUSEQ(ANY)   { possible values: Local, Any, Queue Manager }
With ANY, the queue manager treats the local queue as just another instance of the cluster queue for the purposes of workload distribution.
MQ v6 "MQSC" SC34-6587-00, pg 50 [70/501]
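The attribute can be set queue-manager-wide or per queue; a sketch (QL.DEMO is a placeholder queue name):

```mqsc
* Queue-manager-wide default:
ALTER QMGR CLWLUSEQ(ANY)

* Or per queue (the value QMGR means "inherit the qmgr setting"):
ALTER QLOCAL('QL.DEMO') CLWLUSEQ(ANY)
```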
How to code it
Let's make it run!
If TQM3 has an alias queue WLMAQ, whose TARGETQ is WLMQ1, then WLG.EXE can write to it, and the messages still reach the (split) queues.
If another qm TQM4, external to the cluster, writes into RMQ99, a remote queue pointing to queue WLMAQ and manager TQM3, the messages go into TQM3DLQ, TQM3's dead-letter queue, with Reason 2082 (MQRC_UNKNOWN_ALIAS_BASE_Q) in the dead-letter header, because the message carries the destination queue manager field ... and there is no such queue there!
Solution : in the Gateway queue manager (TQM3), set a queue manager alias
See Put & Destination !
The cluster I have for testing is like this
Shared queues are QL.DELPHI.GRAW.IN & QL.DELPHI.GRAW.OUT, user is MQ_USER_RAW of group MQ_GROUP_RAW
Some definitions I have
A message sent by T400 into the cluster is addressed to queue QL.IN, so it reaches p7029 via the cluster. It has ReplyToQueue(QR.MH) and ReplyToQmgr(PATAN), so we get "mqrc = 2087", as there is no QL.RSP at the PATAN qmgr.
Peter idea :
Have the putting application, the pseudo-requester,
specify the real reply queue name in the ReplyToQueue field of the MQMD of the 'request' message,
and fill in the ReplyToQMgr field with a value called VITOR_WUZ_HERE, or any other value you like.
Just don't leave it blank, and don't fill it in with the name of a real QM.
The message will arrive at the 'replying' app with the reply to queue field filled in with the real reply q name,
and the Reply To QM filled in with VITOR_WUZ_HERE.
When the app 'replies', it opens the reply queue specifying both the destination queue (the real reply queue) and the destination QM (VITOR_WUZ_HERE).
Ensure there is a QM alias called VITOR_WUZ_HERE that routes messages to an XMITQ that gets you back to a queue manager in the cluster.
I'm assuming the replying app is connected to a QM outside the cluster.
On the QM in the cluster that has the RCVR channel from the QM outside the cluster
create a QM Alias called VITOR_WUZ_HERE that has a blank Remote Q, blank Remote QM Name and blank XMITQ attribute.
As messages arrive destined for a QM called VITOR_WUZ_HERE,
this alias blanks out the destination QM,
MQ name resolution kicks in looking for that reply queue without a specific QM,
and the message is workload-balanced inside the cluster.
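The blank-everything alias described above can be sketched in MQSC (defined on the cluster QM that hosts the RCVR channel from the outside QM):

```mqsc
* A queue-manager alias with blank remote queue, remote qmgr and XMITQ:
* the destination QM name is stripped and normal cluster name
* resolution (and workload balancing) takes over.
DEFINE QREMOTE('VITOR_WUZ_HERE') RNAME(' ') RQMNAME(' ') XMITQ(' ')
```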
On any qmgr:
On FR:
MQ cluster best practices {sagpdf}, publib
Input - a file with
Output shall be
In all PRs of the cluster, we can install a "responder" waiting on a specific queue.
The monitor program has a list of queue managers and sends a message to each of them, verifying that a message can reach it and come back.
This asserts, to some level, the availability of the cluster's shared objects.
*** publib ***
All you do to use multiple cluster transmission queues is change the default cluster transmission queue type on the gateway queue manager, by changing the value of the queue manager attribute DEFCLXQ.
Changing the default to separate cluster transmission queues to isolate message traffic
The default cluster transmission queue is set as a queue manager attribute, DEFCLXQ. Its value is either SCTQ or CHANNEL. New and migrated queue managers are set to SCTQ. You can alter the value to CHANNEL.
Cluster transmission queues and cluster-sender channels
The values of DefClusterXmitQueueType are MQCLXQ_SCTQ or MQCLXQ_CHANNEL.
DefClusterXmitQueueType (MQLONG)
You have some choices to make when you are planning how to configure a queue manager to select a cluster transmission queue.
Clustering: Planning how to configure cluster transmission queues
If you set the queue manager attribute DEFCLXQ to CHANNEL, a different cluster transmission queue is created automatically from SYSTEM.CLUSTER.TRANSMIT.MODEL.QUEUE for each cluster-sender channel.
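The switch itself is one command:

```mqsc
* Give each cluster-sender channel its own transmission queue,
* created automatically from SYSTEM.CLUSTER.TRANSMIT.MODEL.QUEUE.
ALTER QMGR DEFCLXQ(CHANNEL)
```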
Enabling SSL in an existing WebSphere MQ cluster, developerWorks, Ian Vanstone : runmqckm commands, complete sample
About cluster security on a tricky configuration
Also "moving full repositories"
Read this:
Never pretend that two different installations are the same queue manager
(by trying to give a new installation the same QMGR name, IP address etc)
If you want to have a look into the cluster repository, use this command:
#1 Regardless of how many FRs you have, each FR should have a manual CLUSSNDR defined to every other FR.
#2 If every FR has a CLUSSNDR to every other FR, each FR will know about every cluster attribute on every QM in the cluster.
#3 A PR will only ever publish info to 2 FRs. A PR will only ever subscribe to 2 FRs. Period. It doesn't matter how many manual CLUSSNDRs you define on that PR. A PR will only ever send its info (publish) to 2 FRs and will only get updates (subscribe) from 2 FRs.
#4 You should only define one CLUSSNDR to one FR from a PR.
#5 If 2 FRs go down in your cluster, your cluster will still be able to send messages just fine, but any changes to cluster definitions become a problem. Any PRs that used both of those down FRs will still function for messaging, but they will not be made aware of any changes in the cluster, because both of their FRs are unavailable.
#6 If two of your FRs are down, and you still have other FRs, you could go to your PRs and delete the CLUSSNDR to the down FR, define a CLUSSNDR to an available FR and issue REFRESH CLUSTER(*) REPOS(YES). This would cause your PR to register with an available FR and thus pick up cluster changes.
#7 In a properly designed system the likelihood of 2 FRs being down is next to zero, so the need for more than 2 FRs is next to zero. And even if both FRs are down it doesn't mean your cluster will come to a screeching halt.
Just use 2 FRs.
If you want to keep the IP or QMGR name, keep in mind that the QMID (which includes CRDATE and CRTIME) will certainly be different.
On the local qmgr, use DISPLAY Q(*) WHERE (CLUSTER NE ' ') to see which queues are shared in the cluster.
When objects in the cluster repository cache are modified (for example, changing an attribute on a cluster queue), the details for that object are republished to the cluster. Previous records for the object may persist for some time in the cluster cache, so that applications currently using them (for instance, having opened the queue for output) can continue processing without interruption. Periodically, the repository process attempts to 'garbage collect' these older records, checking whether they are still in use.

Where multiple such records exist for a particular cluster queue manager object (the record in the cache which stores information about the channel definition to reach a remote queue manager), and these are held in use for a prolonged period, an error in the logic leads to the possibility that the storage for parts of these queue manager records can be reused (for example, overwritten to hold another object) while actually still required.
Solution:
Updated 20180829 (a)