CASKC0004W Error attempting to start SEP

March 18, 2015

Dominique Sacre
Rocketeer

Problem Description

When starting an Enterprise Server region the following error message appears in the console:

CASKC0004W Error attempting to start SEP

Resolution

The CASKC0004W messages in the console can be ignored.

These messages are only a warning. The Enterprise Server manager process handles the issue by checking the number of requested SEPs against the number that have been started.

If fewer SEPs were started than were requested, the manager attempts to start a new SEP. It repeats this until the requested number of SEPs is reached.
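The top-up behaviour described above can be pictured with a small Python sketch. This is a toy model only, not the Enterprise Server implementation: the function name, the try_start callback and the max_attempts safety limit are all hypothetical.

```python
def top_up_seps(requested, try_start, max_attempts=100):
    """Call try_start() until `requested` SEPs are running.

    Returns (started, failures). max_attempts is a hypothetical guard
    so that this toy loop always terminates.
    """
    started = 0
    failures = 0
    attempts = 0
    while started < requested and attempts < max_attempts:
        attempts += 1
        if try_start():
            started += 1
        else:
            # In the real product, a failed start is where a warning
            # such as CASKC0004W would be reported.
            failures += 1
    return started, failures


# Example: the second start attempt fails, the others succeed.
outcomes = iter([True, False, True, True])
print(top_up_seps(3, lambda: next(outcomes)))
```

In this model a failed start costs one extra attempt but the requested count is still reached, which is why the warning on its own is harmless.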

Although the messages can be safely ignored, further investigation is required if you want to prevent them from appearing in the console.

The first step is to enable auxiliary tracing. Examining the trace for MgrCreate trace entries will show the return code when creating SEPs. For example:

MgrCreate(Admin SEP)              437         2      26109 01 8d0183 8532172 <...! ....> 00000021 00000001 MGR dfhgglbl(dfhcpcrp)
MgrReqRC(    2 )                  438         2      26109 01 8d0284 8532175 <.... -rEA> 00000002 2d724541 MGR dfhgglbl(dfhcpcrp)

The trace entries above show that creation of the admin SEP failed with return code 2 (the value in the MgrReqRC entry). This code indicates an inter-process message send failure. The other possible return codes from the MgrCreate function are as follows:

            02 ipc-Result       pic s9(8) comp-5.
               88 ipc-no-response-88            value -3.
               88 ipc-Startup-BAD-88            value -2.
               88 ipc-CD-not-avail-88           value -1.
               88 ipc-Startup-OK-88             value 0.
               88 ipc-msgget-fail-88            value 1.
               88 ipc-msgsnd-fail-88            value 2.
               88 ipc-fgets-fail-88             value 3.
               88 ipc-fopen-fail-88             value 4.
               88 ipc-mknod-fail-88             value 5.
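When scanning a long auxiliary trace it can help to extract the MgrReqRC value and translate it using the definitions above. The following is a minimal Python sketch, not part of Enterprise Server. The function name is hypothetical, and the regular expression assumes that trace lines look like the MgrReqRC example shown earlier (how negative codes are printed in the trace is an assumption).

```python
import re

# Names and values copied from the ipc-Result 88-level definitions above.
IPC_RESULT = {
    -3: "ipc-no-response",
    -2: "ipc-Startup-BAD",
    -1: "ipc-CD-not-avail",
    0: "ipc-Startup-OK",
    1: "ipc-msgget-fail",
    2: "ipc-msgsnd-fail",
    3: "ipc-fgets-fail",
    4: "ipc-fopen-fail",
    5: "ipc-mknod-fail",
}

# Assumed layout: "MgrReqRC(", optional spaces, an integer, optional spaces, ")".
MGR_REQ_RC = re.compile(r"MgrReqRC\(\s*(-?\d+)\s*\)")


def decode_mgr_req_rc(trace_line):
    """Return (code, name) for a MgrReqRC trace line, or None if it is not one."""
    match = MGR_REQ_RC.search(trace_line)
    if match is None:
        return None
    code = int(match.group(1))
    return code, IPC_RESULT.get(code, "unknown")


line = "MgrReqRC(    2 )                  438         2      26109 01 8d0284 8532175"
print(decode_mgr_req_rc(line))
```

For the sample line this returns code 2 with the name ipc-msgsnd-fail, matching the failure discussed here.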

More information on the cause of error 2 can be found by setting the ES_DBGIPC environment variable to ERROR, which writes additional trace messages to the syslog. The *.info messages must also be enabled in syslog.conf.
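A quick way to confirm the syslog side of this setup is to check whether the configuration contains an active *.info selector. The Python sketch below assumes the classic "selector action" layout of syslog.conf. The function name and the sample configuration line are hypothetical, and the check only looks for a literal *.info selector (a lower threshold such as *.debug would also capture these messages but is not detected here).

```python
def has_info_selector(conf_text):
    """Return True if any active (non-comment) line has a literal *.info selector."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 1)  # selector field, then the action
        if len(fields) < 2:
            continue
        if "*.info" in fields[0].split(";"):
            return True
    return False


# Hypothetical configuration content, for illustration only.
sample = """\
# log informational messages and above
*.info;mail.none    /var/log/messages
"""
print(has_info_selector(sample))
```

To check a real system, pass the contents of the syslog configuration file in use on that machine instead of the sample text.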

In this instance the syslog showed that every process that resulted in a CASKC0004W console message output the following message:

blocking msgsnd Error - errno = 11

(where errno 11 is EAGAIN)
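Errno numbers are platform specific, so it is worth translating them on the machine that produced the log. The following Python sketch extracts the number from a line like the one above and looks up its symbolic name and description. The function name is hypothetical and the pattern assumes the "errno = N" wording shown here.

```python
import errno
import os
import re

# Assumed wording: "errno = <number>" somewhere in the syslog line.
ERRNO_LINE = re.compile(r"errno\s*=\s*(\d+)")


def describe_errno_line(line):
    """Return (number, symbolic name, OS description), or None if no errno is found."""
    match = ERRNO_LINE.search(line)
    if match is None:
        return None
    number = int(match.group(1))
    name = errno.errorcode.get(number, "unknown")
    return number, name, os.strerror(number)


print(describe_errno_line("blocking msgsnd Error - errno = 11"))
```

On Linux, where EAGAIN has the value 11, the symbolic name reported for this line is expected to be EAGAIN.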

The documentation for the operating system msgsnd() function states:

If sufficient space is available in the queue, msgsnd() succeeds 
immediately.

The queue capacity is defined by the msg_qbytes field in the associated 
data structure for the message queue. During queue creation this field is 
initialized to MSGMNB bytes, but this limit can be modified using msgctl. 

If insufficient space is available in the queue, then the default 
behavior of msgsnd() is to block until space becomes available.

If IPC_NOWAIT is specified in msgflg, then the call instead fails with the error EAGAIN.

So the error occurred because the message queue was full: the message could not be sent without exceeding the msg_qbytes limit on the queue.

This limit can be increased by raising the MSGMNB kernel parameter:

MSGMNB - Default maximum size in bytes of a message queue: 16384 bytes

On Linux, this limit can be read and modified via /proc/sys/kernel/msgmnb.
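As a small illustration, the current value can be read from that file with a few lines of Python. This is a sketch for Linux only, and the function name is hypothetical.

```python
from pathlib import Path

DEFAULT_PATH = Path("/proc/sys/kernel/msgmnb")


def read_msgmnb(path=DEFAULT_PATH):
    """Return the default maximum message queue size in bytes, as read from `path`."""
    return int(Path(path).read_text().strip())


# Only attempt the read where the file exists (that is, on Linux).
if DEFAULT_PATH.exists():
    print("msgmnb =", read_msgmnb())
```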

The superuser can increase the size of a message queue beyond MSGMNB by a msgctl(2) system call.

Note that "ipcs -l" shows the current values.

Also note that these parameters can be modified by setting them in "/etc/sysctl.conf" and then running "sysctl -p" to load them into the kernel.
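The entry that corresponds to /proc/sys/kernel/msgmnb in sysctl.conf is kernel.msgmnb. The Python sketch below builds such a line and refuses values that do not actually raise the limit. The function name is hypothetical, and 65536 is only an example value, not a recommendation from this article; choose a size appropriate for your system.

```python
def sysctl_conf_line(new_bytes, current_bytes=16384):
    """Build a sysctl.conf entry that raises kernel.msgmnb above current_bytes.

    current_bytes defaults to the 16384-byte default quoted above.
    """
    if new_bytes <= current_bytes:
        raise ValueError("new value must be larger than the current limit")
    return f"kernel.msgmnb = {new_bytes}"


# 65536 is a hypothetical example value.
print(sysctl_conf_line(65536))
```

The resulting line is what would be added to the sysctl configuration file before loading it with "sysctl -p".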

For error 2, increasing the message queue size resolves the underlying operating system problem and prevents the CASKC0004W messages from appearing in the console.

#Server
#EnterpriseServer
#Enterprise
