Increase OOS Activation performance when there are a lot of OOS objects

August 30, 2023

Satid Singkorapoom
Participating Frequently

Hello

I'm testing iCluster on a pre-production machine. With the replication group stopped, I restored a large number of updated files from the User Acceptance Test system to the pre-PRD system, then ran DMMRKPOS and started the group with the marked position set to *YES. I then ran a Sync Check on the group, which reported around 3,000 OOS objects.

When I run WRKACTJOB on the primary system, several iCluster jobs are running, and together they use no more than 35% of the CPU on the P9 2-core PRD machine. CPU % Busy is below 10% on the backup node, the DR machine (same hardware capacity as PRD). When I press F5, the OOS count decreases slowly.

My question is whether iCluster has a parameter that lets me, say, increase the number of iCluster jobs performing OOS Activation so that the process runs faster.

Thanks in advance for your response.

------------------------------
Satid Singkorapoom
IBM i SME
Rocket Forum Shared Account
------------------------------

Mark Watts
Rocketeer
August 30, 2023

Hi Satid,

There are several different approaches to synchronizing the replication pair (Primary and Backup). One approach customers use is similar to the one you described: start a replication group that you know is out of sync, run a sync check, and then correct the exceptions. A DMACTOOS request performs an automated activation for every out-of-sync object in the list and, as you have noticed, it is single-threaded. It was not designed with the expectation that there would be thousands of out-of-sync objects to activate. It will still take on the task, but it needs the time required to refresh each object in turn. Note that the system values include an 'object compression' value, which can reduce the communication effort to the Backup node when many of the objects are compressible.

However, when you are synchronizing over a communication link, as in the scenario you shared, it may be more advantageous to use the DMMRKSYNC command instead. From the online help: "The Mark Journal Positions and Synchronize Objects (DMMRKSYNC) command marks journal positions in the iCluster metadata for a specified replication group, synchronizes the objects replicated by the group to the backup node, and starts the group using marked journal positions if necessary". This command uses FTP to send the resulting save file to the Backup node and then restores the objects there. There is more detail in the online help.

You can also reuse the save-and-send process you used to place the test objects on the test-prod system, and apply the same process to restore the objects to the backup node. Follow that with DMMRKPOS and 'start at marked'. Because the group's objects have been restored in sync on both nodes, your first sync check after startup should be very close to in sync, with few exceptions (if any). Be sure to set the auto-repair options in the sync check so that very few manual repair requests are necessary. Also allow the sync check to check for and delete obsolete objects, unless you anticipate exceptions that you want to keep.
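
As a rough sketch only, the sequence above might look like the following in CL. The group name MYGRP, library APPLIB, and save file QGPL/UATSAVF are hypothetical and do not come from this thread. The iCluster parameter spellings (GROUP, USEMARKED) are assumptions, so prompt each command with F4 and check the online help for your release before running anything.

```
/* Hypothetical names throughout: MYGRP, APPLIB, QGPL/UATSAVF           */

/* Primary node: end the group before restoring (parameters assumed)    */
DMENDGRP GROUP(MYGRP)

/* Primary node: restore the save file received from the UAT system     */
RSTLIB SAVLIB(APPLIB) DEV(*SAVF) SAVF(QGPL/UATSAVF)

/* Send the same save file to the backup node (for example with FTP in  */
/* binary mode), then on the backup node restore the same objects       */
RSTLIB SAVLIB(APPLIB) DEV(*SAVF) SAVF(QGPL/UATSAVF)

/* Primary node: mark journal positions, then start at marked           */
DMMRKPOS GROUP(MYGRP)
DMSTRGRP GROUP(MYGRP) USEMARKED(*YES)
```

The point of restoring the same save file on both nodes is that the objects are already identical when the group starts, so the sync check that follows has little or nothing left for DMACTOOS to activate one object at a time.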

Regards,

Mark

------------------------------
Mark Watts
Sr. Software Engineer
Rocket Software Inc
Waltham MA US
------------------------------

Satid Singkorapoom
Author
Participating Frequently
August 30, 2023

Dear Mark

Once again, I thank you for your very informative response that supplies me with enlightening bits of knowledge. I appreciate your sharing of abundant knowledge.

------------------------------
Satid Singkorapoom
IBM i SME
Rocket Forum Shared Account
------------------------------