Problem:
- Product Name: VisiBroker
- Product Version: VisiBroker 7 Service Pack 2 & beyond
- Product Component: Smart Agent (osagent)
- Product Platform/OS: All supported platforms of VisiBroker
In some setups, more than one osagent is started to provide high availability, so that when one osagent goes down, the other osagents in the same domain can continue to provide directory services for VisiBroker object implementations. The failover is not immediate, though.
With its default settings, an osagent checks the availability of other osagents every 5 minutes and the availability of its clients every 2 minutes.
When one of the osagents has been stopped abruptly, osfind may take a few minutes to return its results. This is because the other osagents still consider the stopped osagent alive and wait for its responses. This delay can be shortened by configuring the properties described in the Resolution section below.
Resolution:
Checking the existence of other Smart Agents
A Smart Agent sends an "Are you alive" message (also known as a heartbeat message) at regular intervals to the other Smart Agents within its reach. The interval is controlled by the property vbroker.agent.timer (in seconds).
If the Smart Agent does not receive an acknowledgement from the other Smart Agent within a configurable time interval, it sends another "Are you alive" message (a verification message) to that agent. This interval is the sum of vbroker.agent.timer and vbroker.agent.threshold (both in seconds).
If no acknowledgement is received even after the verification message has been sent, the Smart Agent decides whether to send further verification messages. The number of times a verification message can be resent is controlled by the property vbroker.agent.maxRetries.
If no acknowledgement has been received after that number of retries, the Smart Agent removes the other Smart Agent from its list of Smart Agents.
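The check sequence described above can be sketched as a small shell function. This is only an illustrative model, not osagent's implementation: the function name check_peer and the probe command passed to it are hypothetical, the waiting between messages is left out, and it assumes that the first verification message does not count as a retry.

```shell
# Illustrative model only; this is not osagent code.
# check_peer MAX_RETRIES PROBE [ARGS...]
#   PROBE is a hypothetical command that exits 0 when the peer acknowledges.
# Assumption: the first verification message is not counted as a retry, so
# up to 1 heartbeat + 1 verification + MAX_RETRIES resends go unanswered
# before the peer is removed. Waiting between messages is omitted.
check_peer() {
    max_retries=$1
    shift

    # Heartbeat message.
    if "$@"; then
        echo alive
        return 0
    fi

    # Verification message, then up to max_retries resends.
    resend=0
    while [ "$resend" -le "$max_retries" ]; do
        if "$@"; then
            echo alive
            return 0
        fi
        resend=$((resend + 1))
    done

    # No acknowledgement at all: drop the peer from the agent list.
    echo removed
    return 1
}
```

For example, `check_peer 1 false` prints `removed`, while `check_peer 1 true` prints `alive`.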
Checking the existence of clients to the Smart Agent
Both VisiBroker CORBA servers and CORBA clients are clients of the Smart Agent. A different set of properties controls how the Smart Agent checks the existence of its clients, but the process is similar.
At regular intervals (as determined by vbroker.agent.keepAliveTimer), the Smart Agent sends an "Are you alive" message to its clients.
If an acknowledgement is not received from a client within (vbroker.agent.keepAliveTimer + vbroker.agent.keepAliveThreshold) seconds, the Smart Agent sends another "Are you alive" message to that client.
The number of resends is determined by the same property, vbroker.agent.maxRetries.
Sample usage of the above properties
$ osagent -Dvbroker.agent.timer 8 -Dvbroker.agent.threshold 2 \
    -Dvbroker.agent.keepAliveTimer 8 -Dvbroker.agent.keepAliveThreshold 2 \
    -Dvbroker.agent.maxRetries 1
Notes
1. The threshold allows some lag time for the reply under high-traffic conditions. It should not be set so large that, when added to the timer value, it spans the period of the previous heartbeat interval. The timer can be thought of as a mean value, and the threshold as the variance that can be tolerated around that mean.
2. Default values of the above properties are:
vbroker.agent.timer=300 (seconds)
vbroker.agent.threshold=40 (seconds)
vbroker.agent.keepAliveTimer=120 (seconds)
vbroker.agent.keepAliveThreshold=40 (seconds)
vbroker.agent.maxRetries=4 (attempts)
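As a rough illustration, the time that passes without an acknowledgement before the first verification message is sent can be computed from these values. The helper name verify_after is hypothetical; it only adds a timer to its threshold, as the Resolution section describes, and does not model the retries, whose timing is not specified here.

```shell
# verify_after TIMER THRESHOLD
# Seconds without an acknowledgement before the first verification message
# is sent (timer + threshold). Hypothetical helper, not part of VisiBroker.
verify_after() {
    echo $(( $1 + $2 ))
}

echo "Other agents, defaults: $(verify_after 300 40) s"
echo "Clients, defaults:      $(verify_after 120 40) s"
echo "Sample usage values:    $(verify_after 8 2) s"
```

With the defaults this gives 340 s for other Smart Agents and 160 s for clients; the sample usage values reduce both to 10 s.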