Problem

  • Product Name: VisiBroker for C++
  • Product Version: 7.0 & later
  • Platform/OS Version: All
A VisiBroker for C++ application using the Messaging::NO_RECONNECT rebind policy makes a CORBA request that fails with CORBA::COMM_FAILURE. After that, the Round Trip Timer (RTT) is not reset for the next CORBA request. The issue also occurs with the Messaging::TRANSPARENT, Messaging::NO_REBIND and QoSExt::VB_NOTIFY_REBIND rebind policies.

Normally the RTT is reset after every CORBA invocation, as can be observed in the log. For example:
"Pid# 3211 Tim# Tue Jan 18 15:30:58 2011 563746ms Tid# 1 Src# client Msg# Elapsed timer is reset."
 
When CORBA::COMM_FAILURE is encountered, no such log entry appears, which indicates that the timer is not reset.
If the RTT is then allowed to run out (by making no other CORBA requests for longer than the timeout), the next CORBA invocation on the same thread fails immediately with CORBA::TIMEOUT.

A test case is attached to demonstrate the issue. Brief description of the test case:

a. Server1 has a BankManager object instance
b. Server2 has an Account object instance
c. The Account class has balance() and throw_comm_fail() methods
d. Account::throw_comm_fail() causes the server where the Account object resides to kill itself via system("kill `pgrep Server1`")
e. Client will do the following:
 1. Call bankmanager->open(). This returns a reference to account1, an Account object residing in Server1.
 2. Call account2->balance() on an Account object residing in Server2.
 3. Call account1->throw_comm_fail(), causing Server1 to kill itself.
 4. Server1 fails to send a proper reply. The client catches CORBA::COMM_FAILURE.
 5. Sleep for 45 seconds. The timeout is set to 30 seconds, so this demonstrates that the timer is not reset.
 6. Call account2->balance() again. CORBA::TIMEOUT is raised immediately.
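
The timeline of client steps 2 to 6 can be pictured with a small toy model. This is only a sketch of the observed behaviour, not VisiBroker's implementation: the class RttModel, the Outcome enum and the reset_on_comm_failure flag are hypothetical names introduced here for illustration, and times are plain seconds rather than real clock readings.

```cpp
// Possible results of one simulated invocation.
enum class Outcome { Ok, CommFailure, Timeout };

// Toy model of a per-thread round-trip timer (an assumption made for
// illustration; it does not reflect VisiBroker internals).
class RttModel {
public:
    RttModel(long timeout_s, bool reset_on_comm_failure)
        : timeout_s_(timeout_s),
          reset_on_comm_failure_(reset_on_comm_failure) {}

    // Simulate one invocation made at time now_s (seconds).
    // server_dies makes the invocation end in COMM_FAILURE.
    Outcome invoke(long now_s, bool server_dies) {
        if (!running_) {
            // A timer that was reset restarts with the new request.
            running_ = true;
            started_s_ = now_s;
        }
        if (now_s - started_s_ >= timeout_s_) {
            // A stale timer has already run out: fail outright.
            // The article does not say what happens to the timer after
            // TIMEOUT, so this model leaves it untouched.
            return Outcome::Timeout;
        }
        if (server_dies) {
            // Reported behaviour: the timer is left running here.
            if (reset_on_comm_failure_) {
                running_ = false;
            }
            return Outcome::CommFailure;
        }
        running_ = false;  // corresponds to "Elapsed timer is reset."
        return Outcome::Ok;
    }

private:
    long timeout_s_;
    bool reset_on_comm_failure_;
    bool running_ = false;
    long started_s_ = 0;
};
```

With a 30 second timeout and reset_on_comm_failure set to false, a failed call at t = 1 followed by a call at t = 46 yields Outcome::Timeout, matching step 6. With the flag set to true, the same sequence yields Outcome::Ok.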

Resolution

RPI 1077775 was raised for this issue, and it is fixed in VB 8.5 SP1. A possible workaround is to use a different rebind policy. Consider your application's requirements when choosing between the following rebind policies:

    QoSExt::VB_TRANSPARENT (default)
    QoSExt::VB_NO_REBIND

Another workaround is to retry the connection when a CORBA::COMM_FAILURE exception is caught. In this case the client gets a CORBA::REBIND exception, which properly resets the timeout timer.
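
The retry workaround might be structured as in the following sketch. It assumes that "retrying" means issuing the same invocation once more on the same object reference. The exception structs CommFailure and Rebind are stand-ins so that the sketch compiles without the VisiBroker headers; a real client would catch CORBA::COMM_FAILURE and CORBA::REBIND instead. The helper invoke_with_retry and the CallResult enum are hypothetical names, not part of any VisiBroker API.

```cpp
#include <functional>
#include <stdexcept>

// Stand-ins for the ORB's system exceptions (hypothetical; real code
// would use CORBA::COMM_FAILURE and CORBA::REBIND).
struct CommFailure : std::runtime_error {
    CommFailure() : std::runtime_error("COMM_FAILURE") {}
};
struct Rebind : std::runtime_error {
    Rebind() : std::runtime_error("REBIND") {}
};

enum class CallResult { Ok, FailedAfterRetry };

// Run `call` once. If it throws CommFailure, run it a second time right
// away. Per the article, that retry is expected to end in REBIND, which
// resets the timeout timer; the retry is made for that side effect.
CallResult invoke_with_retry(const std::function<void()>& call) {
    try {
        call();
        return CallResult::Ok;
    } catch (const CommFailure&) {
        // Fall through to the single retry below.
    }
    try {
        call();
        return CallResult::Ok;  // the target became reachable again
    } catch (const Rebind&) {
        return CallResult::FailedAfterRetry;  // expected outcome
    } catch (const CommFailure&) {
        return CallResult::FailedAfterRetry;  // still unreachable
    }
}
```

A client would wrap the failing invocation, for example the account1->throw_comm_fail() call from the test case, in a lambda passed to invoke_with_retry, and then continue with its other requests. Exceptions other than the two handled here propagate to the caller unchanged.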

 
Incident #2494571