Rocket iCluster


iCluster Synchronization - Synchronizing replication using the network connection.


    Posted 06-16-2021 19:01
    In a previous post on synchronization, we discussed using a save-and-restore method to synchronize the backup node and start replication at a known point in time in the journal receivers. In this post we will tackle two methods that use built-in functions or commands to accomplish this task with ease. There are more possibilities, as customers have described other methods they have used in the past. I must say there are some limitations, depending on the bandwidth available, the volume of data to be sent, the hourly or daily rate of change the application produces, and the system resources available for the task. Although most replication groups can use this method successfully, there are certain conditions in which a network synchronize may not be possible. Don't lose heart, though, as we are going to discuss (right now) some powerful methods and commands that can make network synchronization easy and successful using Rocket iCluster.

    Method number one is the easier of the two network synchronize procedures shared in this article. After designing the selections for the group and adding the necessary exclusions so that replication runs as efficiently as possible, we are ready to start the replication group. I would point out that for each replication group to perform well, in addition to the 'replicate the right stuff' determination, invest the effort to create a unique 'Default Journal' for each replication group and use a separate staging library for each group. Many override settings are available to you; consider whether each one adds value and separation for the unique demands of the replication group. Now, with all options reviewed and set as desired, start the group with the 'Refresh selected objects' option set to *YES. I go ahead and submit the start job to batch so I am free to work on the next group start or monitor progress immediately. A command example follows:

    Note: I named the job 'AASTRPAY' just for alphabetic convenience. Use any name you wish.
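
    A minimal sketch of that batch submission. DMSTRGRP is iCluster's start-group command; the group name PAYROLL and the exact parameter keywords shown here are illustrative assumptions, so prompt the command (F4) or consult the iCluster command reference on your release to confirm them:

    ```
    /* Submit the group start to batch so the interactive session  */
    /* is free immediately. Group name PAYROLL and the REFRESH     */
    /* keyword are illustrative assumptions -- verify with F4.     */
    SBMJOB CMD(DMSTRGRP GROUP(PAYROLL) REFRESH(*YES)) +
           JOB(AASTRPAY)
    ```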

    Most admins are starting replication to a target system instance previously seeded with an original restore completed some weeks ago. As a result, most of the user profiles are already intact on the BACKUP node, but it is a good idea to design a replication group that maintains user profiles and authority lists, and to complete its synchronization before starting the other groups that depend on those user profiles existing. Use the procedure we discussed, but start with the group that includes user profiles and authorities first, then proceed with the remainder of your replication groups. Although you can run multiple refreshes at a time, I usually start them one at a time and observe the overall progress, monitoring any impact to system performance, before opting to start another. Competition for communication and system resources is usually very low during normal replication; however, the initial refresh of all objects in a replication group can temporarily spike resource demand, so it is prudent to watch how things are going and be a bit conservative until the REFRESH mode is complete.

    Method number two uses a new command for network synchronization that was introduced in iCluster version 8.2. The command is DMMRKSYNC, and its options allow the command to use the network more efficiently, which can make the network sync process faster. The command does the following:
    •  create/mark a sync point
    •  save all the objects to a save file in a temporary work library (one save file for each library in selection of the group)
    •  use the FTP service to send the save file(s) to the target system
    •  restore the objects from the save file
    •  optionally delete all the save files used in the invocation
    •  start the replication group automatically when restore processing is complete.
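
    The steps above run from a single command invocation, which can also be submitted to batch. Only the group name is shown here; the name AASYNCPAY and the group PAYROLL are placeholders, and you should prompt DMMRKSYNC (F4) for the save-file cleanup and auto-start options available on your release rather than rely on this sketch:

    ```
    /* Mark a sync point, save, FTP, restore, and start the group. */
    /* Group and job names are placeholders -- prompt DMMRKSYNC    */
    /* (F4) to review the full parameter list on your system.      */
    SBMJOB CMD(DMMRKSYNC GROUP(PAYROLL)) JOB(AASYNCPAY)
    ```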

    Again, although we can submit this for multiple replication groups concurrently, I prefer to activate one group at a time when using this method. That allows each group activation to complete without competing for available bandwidth and resources. As you are aware, an interrupted FTP transfer cannot be resumed, so we want to successfully complete each replication group start and lock in our successes before moving to the next group. You can create a CLP with some DLYJOB statements to wait between synchronize requests if you want to perform these unattended (over a weekend, for instance). As with any save operation, objects that are exclusively locked at the time of the save can be skipped, and special handling of these may be necessary if running this task while the applications are active. One easy way to address them is to schedule the activation of the exceptions for a time when the application can be briefly stopped: stop the application, activate the objects when the lock is released, and restart the application.
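
    A minimal CLP sketch of that unattended, one-group-at-a-time approach using DLYJOB. The group names and the one-hour delay are placeholders, and this assumes DMMRKSYNC runs synchronously when called directly from a CLP; if it submits its work to batch on your release, add your own completion check between steps instead of a fixed delay:

    ```
    PGM
       /* Sync each group in turn, pausing between requests so the  */
       /* groups do not compete for bandwidth. Group names and the  */
       /* 3600-second delay are placeholders for illustration only. */
       DMMRKSYNC  GROUP(PAYROLL)
       DLYJOB     DLY(3600)      /* wait one hour                   */
       DMMRKSYNC  GROUP(GENLEDGER)
       DLYJOB     DLY(3600)
       DMMRKSYNC  GROUP(INVENTORY)
    ENDPGM
    ```

    Submit the CLP itself to batch (SBMJOB) before leaving for the weekend so the delays run in a batch job rather than tying up an interactive session.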

    Considerations for using the DMMRKSYNC command:
    • In the iCluster system values there is a setting for 'Operation values / Object compression' that this command uses and that can reduce the overall size of the resulting save files. I recommend using *MEDIUM compression.
    • If the replication group is very large, with many libraries and a lot of storage, estimate the amount of work space needed to save all of the group's selections and confirm that the required space is available. In some extreme cases, the DMMRKSYNC command cannot be used with a group without reworking the object selections in scope and the resulting overall group size.
    • If you are submitting this task unattended, remember to review the scheduled jobs for IPLs, backups, or other tasks that could interrupt the sync process. Put these on hold, or schedule your sync attempt to avoid a conflict.
    In our next post on synchronization, we will discuss some tricks to help speed through challenges like limited available bandwidth over time, or using a stale backup target node state as a starting point.


    Mark Watts
    Rocket Software