Rocket iCluster

Repair multiple out of sync conditions with a single request.

    Posted 10-05-2020 14:33
    Edited by Mark Watts 10-05-2020 16:14
    Before we discuss the command, let's discuss what an 'out of sync' (OOS) condition is.  An OOS condition is reported when a Synchronization Check run produces a list of exceptions.  A Synchronization Check (SC) is a special process that compares attributes or contents (depending on the SC parameters) of objects selected for replication.  When a difference is detected, the objects are flagged as OOS.  Remember that an entry in the OOS exception list does not mean the object is not replicating or is in an error state.  It means that during the most recent SC, some aspect did not match.  Those specific exception conditions can be reviewed in more detail, but that is for a different post.  For the most accurate SC results, replication should be running smoothly and with low latency at the time of the SC run.

    If the SC has produced a list of possible exceptions, we can see the results from the monitor, dive deeper using option 8 (Object Status), and take corrective action on each one.  But wait, did you know that there is an Option 13 (Activate OOS) that will automatically refresh all OOS objects for a group?  There is no need to activate each one individually, because this option acts on the list of all OOS exceptions for a group and automatically refreshes each one.

    Notice also that there are three Refresh Types that can be selected: *FULL, *FIXREC, and *AUTH.  *FIXREC and *AUTH target specific exception types, while *FULL provides a generic full resync of the objects identified.

    Alternatively, the command DMACTOOS can be submitted from the command line or from within a CLP program following a Sync Check.  This is a very simple way to automate checking the fidelity of replication and automatically correcting any exceptions.

    Are there any possible gotchas?  I would say there is a slight risk that one of the objects in the OOS list is very large, and the resulting REFRESH request could take an unexpectedly large amount of time and bandwidth.  For that reason, I would not jump into submitting an Activate for ALL OOS too early in your deployment.  Set up your replication.  Correct any exceptions.  Monitor it over time and become familiar with which events on your system and which SC timings produce exceptions.  After a short while of observation and tuning to make your deployment reliable and predictable, the automated invocation of DMACTOOS can substantially reduce your need to correct OOS exceptions manually.
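
    One way to guard against the large-object risk described above is to check the size of the OOS list before an automated job issues the group-wide activate.  The Python sketch below only illustrates that decision; it does not call iCluster.  The OosEntry record, the should_auto_activate function, and both thresholds are hypothetical names invented for this example, and how you obtain object names and sizes from your own environment is left open.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class OosEntry:
    """One out-of-sync exception (hypothetical shape, not an iCluster record format)."""
    object_name: str
    size_bytes: int


def should_auto_activate(entries: Iterable[OosEntry],
                         max_object_bytes: int,
                         max_total_bytes: int) -> bool:
    """Return True only if refreshing every listed object stays within both limits.

    The activate request acts on the whole OOS list for a group, so a single
    oversized object is enough to hold the automated request back for review.
    """
    entries = list(entries)
    if not entries:
        return False  # nothing is out of sync, so there is nothing to activate
    largest = max(e.size_bytes for e in entries)
    total = sum(e.size_bytes for e in entries)
    return largest <= max_object_bytes and total <= max_total_bytes
```

    A scheduled job could call should_auto_activate after each Sync Check, submit DMACTOOS only when it returns True, and otherwise leave the exceptions for manual review.  The thresholds are assumptions that you would tune from the observation period described above.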

      Mark Watts
      Rocket Software