
CASKC0004W Error attempting to start SEP

  • March 18, 2015

Problem Description

When starting an Enterprise Server region the following error message appears in the console:

    CASKC0004W Error attempting to start SEP

Resolution

The CASKC0004W messages in the console can be ignored.

They are only warnings. The Enterprise Server manager process handles the issue by comparing the number of SEPs requested with the number actually started.

If fewer SEPs were started than were requested, the manager attempts to start a new SEP. It repeats this until the requested number of SEPs is reached.

Although the messages can be safely ignored, further investigation is required if you want to prevent them from appearing in the console.

The first step is to enable auxiliary tracing.  Examining the trace for MgrCreate trace entries will show the return code when creating SEPs.  For example:


MgrCreate(Admin SEP)              437         2      26109 01 8d0183  8532172  <...! ....>  00000021 00000001 MGR dfhgglbl(dfhcpcrp)
MgrReqRC(    2 )                  438         2      26109 01 8d0284  8532175  <.... -rEA>  00000002 2d724541 MGR dfhgglbl(dfhcpcrp)

The above trace entries show that creation of the admin SEP failed with error code 2.  This error indicates an inter-process message send failure (ipc-msgsnd-fail-88 in the list that follows).  Other possible results from the MgrCreate function are as follows:

            02 ipc-Result       pic s9(8) comp-5.
               88 ipc-no-response-88            value -3.
               88 ipc-Startup-BAD-88            value -2.
               88 ipc-CD-not-avail-88           value -1.
               88 ipc-Startup-OK-88             value 0.
               88 ipc-msgget-fail-88            value 1.
               88 ipc-msgsnd-fail-88            value 2.
               88 ipc-fgets-fail-88             value 3.
               88 ipc-fopen-fail-88             value 4.
               88 ipc-mknod-fail-88             value 5.
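Picking the return codes out of a large auxiliary trace by eye can be tedious. The following is a minimal sketch, not a supported tool, that scans trace lines for MgrReqRC entries and maps each code to the condition names listed above. It assumes the entries always look like the MgrReqRC(    2 ) example shown earlier and that negative codes are printed with a leading minus sign; the function and variable names are illustrative only.

```python
import re

# Values copied from the ipc-Result level-88 list above.
IPC_RESULT_NAMES = {
    -3: "ipc-no-response-88",
    -2: "ipc-Startup-BAD-88",
    -1: "ipc-CD-not-avail-88",
    0: "ipc-Startup-OK-88",
    1: "ipc-msgget-fail-88",
    2: "ipc-msgsnd-fail-88",
    3: "ipc-fgets-fail-88",
    4: "ipc-fopen-fail-88",
    5: "ipc-mknod-fail-88",
}

# Assumed entry format, based on the sample trace: "MgrReqRC(    2 )".
MGR_REQ_RC = re.compile(r"MgrReqRC\(\s*(-?\d+)\s*\)")


def find_return_codes(trace_lines):
    """Return a list of (code, name) for every MgrReqRC entry found."""
    found = []
    for line in trace_lines:
        match = MGR_REQ_RC.search(line)
        if match:
            code = int(match.group(1))
            found.append((code, IPC_RESULT_NAMES.get(code, "unknown")))
    return found
```

Passing the lines of a trace file to find_return_codes() gives one pair per MgrReqRC entry, so any non-zero codes stand out quickly.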

More information on the cause of error 2 can be found by setting the ES_DBGIPC environment variable to ERROR, which writes additional traces to the syslog.  The *.info messages must also be enabled in syslog.conf.

In this instance, the syslog showed that every process that resulted in a CASKC0004W console message output the following message:

blocking msgsnd Error - errno = 11

(where errno 11 is EAGAIN)
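If the syslog contains many of these entries, the errno numbers can be translated mechanically. This is a small sketch that assumes the message text always matches the line shown above; the function name is hypothetical. The symbolic names come from the errno table of the machine running the script, so it should be run on the same platform as the server.

```python
import errno
import os
import re

# Assumed message format, taken from the syslog line quoted above.
MSGSND_ERROR = re.compile(r"msgsnd Error - errno = (\d+)")


def decode_msgsnd_errors(syslog_lines):
    """Return (number, symbolic name, description) for each msgsnd error line."""
    decoded = []
    for line in syslog_lines:
        match = MSGSND_ERROR.search(line)
        if match:
            number = int(match.group(1))
            name = errno.errorcode.get(number, "unknown")
            decoded.append((number, name, os.strerror(number)))
    return decoded
```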

The documentation for the operating system msgsnd() function states:

If sufficient space is available in the queue, msgsnd() succeeds immediately. The queue capacity is defined by the msg_qbytes field in the associated data structure for the message queue. During queue creation this field is initialized to MSGMNB bytes, but this limit can be modified using msgctl. If insufficient space is available in the queue, then the default behavior of msgsnd() is to block until space becomes available. If IPC_NOWAIT is specified in msgflg, then the call instead fails with the error EAGAIN.



So the error occurred because the message could not be sent: the message queue had reached its msg_qbytes limit.
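The behavior described in the msgsnd() documentation can be reproduced outside Enterprise Server. The sketch below is illustrative only and is not how Enterprise Server itself sends messages: it creates a private System V message queue through ctypes, sends 1024-byte messages with IPC_NOWAIT until the kernel refuses one, and then removes the queue. The constant values are the Linux ones and are an assumption here, as are the message size and the function name.

```python
import ctypes
import errno
import os

# Linux constants from <sys/ipc.h>; assumed, not taken from the article.
IPC_PRIVATE = 0
IPC_CREAT = 0o1000
IPC_NOWAIT = 0o4000
IPC_RMID = 0
PAYLOAD_BYTES = 1024


class Message(ctypes.Structure):
    # Mirrors struct msgbuf: a long message type followed by the payload.
    _fields_ = [("mtype", ctypes.c_long),
                ("mtext", ctypes.c_char * PAYLOAD_BYTES)]


def fill_queue_until_full(max_messages=100000):
    """Send until msgsnd() fails; return (messages sent, errno of failure)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.msgsnd.argtypes = [ctypes.c_int, ctypes.c_void_p,
                            ctypes.c_size_t, ctypes.c_int]
    libc.msgctl.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_void_p]

    queue_id = libc.msgget(IPC_PRIVATE, IPC_CREAT | 0o600)
    if queue_id == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

    message = Message(mtype=1, mtext=b"x" * (PAYLOAD_BYTES - 1))
    sent = 0
    failure = 0
    try:
        while sent < max_messages:
            # IPC_NOWAIT makes a full queue return an error instead of blocking.
            if libc.msgsnd(queue_id, ctypes.byref(message),
                           PAYLOAD_BYTES, IPC_NOWAIT) == -1:
                failure = ctypes.get_errno()
                break
            sent += 1
    finally:
        # Always remove the queue so it does not linger in "ipcs" output.
        libc.msgctl(queue_id, IPC_RMID, None)
    return sent, failure
```

On a system where the queue fills before max_messages is reached, the second value returned should equal errno.EAGAIN, and the first value shows roughly how many messages of that size fit within msg_qbytes.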

This can be increased using the MSGMNB system/kernel value:

MSGMNB - Default maximum size in bytes of a message queue: 16384 bytes

On Linux, this limit can be read and modified via /proc/sys/kernel/msgmnb.

The superuser can increase the size of a message queue beyond MSGMNB by a msgctl(2) system call.



Note that "ipcs -l" shows the current values.

Also note that these parameters can be made persistent by setting them in "/etc/sysctl.conf" and then running "sysctl -p" to load them into the kernel.
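As a rough sketch of the steps above, the current limit can be read from /proc and a matching sysctl.conf entry prepared. The kernel.msgmnb key is the sysctl name that corresponds to /proc/sys/kernel/msgmnb; the value 65536 is only a placeholder, since a suitable size depends on the workload, and the function names are hypothetical.

```python
from pathlib import Path


def read_msgmnb(path="/proc/sys/kernel/msgmnb"):
    """Return the current default queue size in bytes, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def sysctl_entry(new_bytes):
    """Build the line to place in /etc/sysctl.conf for the chosen size."""
    return f"kernel.msgmnb = {new_bytes}"


if __name__ == "__main__":
    print("current msgmnb:", read_msgmnb())
    # 65536 is a placeholder value, not a recommendation.
    print("sysctl.conf entry:", sysctl_entry(65536))
```

The script only reads the current value and prints a suggested entry. Editing /etc/sysctl.conf and running "sysctl -p" still needs to be done by the superuser.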

For error 2, increasing the message queue size resolves the underlying operating system problem and prevents CASKC0004W messages from appearing in the console.


#Server
#EnterpriseServer
#Enterprise
