[SOLVED] Unusual Urouter error on RedHat Enterprise 6.5
Author: gianni.sandigliano@unifacesolutions.com (gianni)
A Uappl was ported from RedHat 4.2 32-bit to RedHat 6.5 64-bit.
Oldest Uversion was: 9.2
Newest Uversion is: 9.6.04.X402

This Uappl delivers Uniface Services through SOAP as WebServices; the frontend application is Java-based and its server is configured on the same node where Uniface is installed.

Everything is working; here is an example of a working session:

61:32.222.34 t=1823463168: accepted new connection on TCP:+13001
61:32.222.48 t=1031780096: From Client:chn=6;len=135: CLTCON;
61:32.222.53 t=1031780096: clt=(hst=172.16.3.134,trt-lnx-app03.interna.regio.it;pid=0;tid=0;sid=0;usr=nobody;ust=)
61:32.222.55 t=1031780096: log=(hst=TCP:trt-lnx-app03.interna.regio.it+13001;usr=usys;ust=LISRV)
61:32.222.57 t=1031780096: reguser: nid=172.16.3.134, node=trt-lnx-app03.interna.regio.it, pid=0, ust=
61:32.264.11 t=1031780096: To Client:chn=6;len=2: CONANS; continue:

From time to time, usually in the middle of the morning when more users are using those services, an error -21 (it should be a <NetworkLoginError>) is generated at urouter level.
This is what the urouter log file reports:

61:34.954.38 t=1823463168: accepted new connection on TCP:+13001
61:34.954.52 t=1031780096: From Client:chn=6;len=135: CLTCON;
61:34.954.56 t=1031780096: clt=(hst=172.16.3.134,trt-lnx-app03.interna.regio.it;pid=0;tid=0;sid=0;usr=nobody;ust=)
61:34.954.58 t=1031780096: log=(hst=TCP:trt-lnx-app03.interna.regio.it+13001;usr=usys;ust=LISRV)
61:34.954.62 t=1031780096: reguser: nid=172.16.3.134, node=trt-lnx-app03.interna.regio.it, pid=0, ust=
61:36.967.82 t=1031780096: [Mon Nov 3 14:09:34 2014] err=-21: cretpsv: Authentication of user/password failed for user usys
61:36.967.87 t=1031780096: To Client:chn=6;len=70: CONANS; Error=-21:

From then on every client gets the same answer:

61:39.632.95 t=1823463168: accepted new connection on TCP:+13001
61:43.795.92 t=1823463168: accepted new connection on TCP:+13001
61:46.884.77 t=1823463168: accepted new connection on TCP:+13001
61:58.710.17 t=1823463168: accepted new connection on TCP:+13001
62:02.138.85 t=1823463168: accepted new connection on TCP:+13001
62:20.956.70 t=1823463168: accepted new connection on TCP:+13001
62:33.163.03 t=1823463168: accepted new connection on TCP:+13001
63:18.532.76 t=1823463168: accepted new connection on TCP:+13001
63:18.564.72 t=1823463168: accepted new connection on TCP:+13001
63:19.880.47 t=1823463168: accepted new connection on TCP:+13001
63:25.753.78 t=1823463168: accepted new connection on TCP:+13001
63:25.782.29 t=1823463168: accepted new connection on TCP:+13001
63:28.425.74 t=1823463168: accepted new connection on TCP:+13001
63:35.509.25 t=1823463168: accepted new connection on TCP:+13001
64:19.685.94 t=1823463168: accepted new connection on TCP:+13001

This goes on until all active servers are restarted (in this case because they were idle, in other cases because of a manual restart by a human):

64:34.607.88 t=1823463168: accepted new connection on TCP:+13001
64:34.608.05 t=1031780096: From Client:chn=6;len=135: CLTCON;
64:34.608.09 t=1031780096: clt=(hst=172.16.3.134,trt-lnx-app03.interna.regio.it;pid=0;tid=0;sid=0;usr=nobody;ust=)
64:34.608.11 t=1031780096: log=(hst=TCP:trt-lnx-app03.interna.regio.it+13001;usr=usys;ust=LISRV)
64:34.608.13 t=1031780096: reguser: nid=172.16.3.134, node=trt-lnx-app03.interna.regio.it, pid=0, ust=
64:34.608.23 t=1031780096: [Mon Nov 3 14:12:32 2014] err=-21: cretpsv: Authentication of user/password failed for user usys
64:34.608.25 t=1031780096: To Client:chn=6;len=70: CONANS; Error=-21:
65:18.051.44 t=1835534112: Stopping server sid=20; shut=1 mode=normal
65:18.051.50 t=1835534112: Reason for stop: Server timed out after 300 seconds (max=300)
65:18.051.52 t=1835534112: To Server:chn=1000;len=6: SRVSHUT;

It seems urouter is no longer able either to talk to the uservers that are still alive or to start new ones until the last userver is closed; after that point everything works again...

While we are applying the latest 9.6.05 patches (up to X505) and collecting more info, does anyone have suggestions, hints or any clue about the reasons for this BAD behaviour?

Thanks for any answers!
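For anyone who wants to collect the same kind of info from their own urouter log, here is a minimal Python sketch that counts accepted connections, CLTCON handshakes and err=-21 lines, and reports the timestamp of the first failure. It is not a Uniface tool: the line layout is only assumed from the excerpts quoted in this post, and the names LINE_RE and summarize are hypothetical.

```python
import re

# Assumed layout, inferred from the excerpts in this post:
#   "<mm:ss.mmm.uu> t=<thread id>: <message>"
LINE_RE = re.compile(r"^\s*(\d+:\d+\.\d+\.\d+) t=(\d+): (.*)$")


def summarize(lines):
    """Count accepted connections, CLTCON handshakes and err=-21 lines.

    Returns (counts, first_err), where first_err is the timestamp of the
    first err=-21 line, or None if there is none.
    """
    counts = {"accepted": 0, "cltcon": 0, "err21": 0}
    first_err = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that does not look like a log line
        stamp, _thread, msg = m.groups()
        if msg.startswith("accepted new connection"):
            counts["accepted"] += 1
        elif "CLTCON;" in msg:
            counts["cltcon"] += 1
        elif "err=-21" in msg:
            counts["err21"] += 1
            if first_err is None:
                first_err = stamp
    return counts, first_err


if __name__ == "__main__":
    import sys

    # Usage: python scan_urouter_log.py <path to urouter log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        counts, first_err = summarize(fh)
    print(counts, "first err=-21 at:", first_err)
```

A large gap between the "accepted" and "cltcon" counts after the first err=-21 would match the pattern described above, where connections are accepted but no handshake follows.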




