Rocket iCluster

  • 1.  iCluster deletes LF objects on the target and sync check flags them as not found NFD

    Posted 10-21-2022 15:48
    I have an ongoing issue.  Yes, I do have a case opened with Rocket Support, but I thought I would reach out to the user community on this topic to see if anyone else has run into this.

    We have over 125 PAIRS of IBM i LPARs, production (source) and DR (target) pairs.  That's a LOT of systems all running iCluster replication!  They all run basically the same application.  I am having an issue with one specific pair of source/target LPARs when it comes to LFs being replicated.

    Our application has MANY LFs.  In this specific source/target pair, there are about 30 LFs that refuse to play nicely with iCluster.  They are flagged as OOS (out of sync) in the Latency screen (option 81 on the target side) with an NFD (not found) code.  This is the ONLY pair of systems that has an issue with this subset of LFs.  

    What have I done?  Well, I run a DMSNDOBJ (send object) on the source system to send the objects over to the target system.  I then do a WRKOBJ on the target system to make sure that the objects did, in fact, make it over to the target.  They are there.  At some point during the day, the objects disappear from the target system.  I once suspected that the sync check (which we run overnight) was deleting them.  It wasn't. I tried putting a HOLD on my sync check job on the IBM job scheduler and the objects still disappeared from the target. 

    I've gone around and around with my rep at Rocket support and we tried a lot of different things: running DMINZNODE, playing with the various options defined on the groups, and changing the various iCluster-wide system values, but to no avail.  We cannot prevent these LF objects from being deleted from the target system and becoming flagged as OOS on the monitor screen. 

    Has anyone run into a similar situation before?  If so, any suggestions as to what to do?  The last piece of advice I was given by Rocket support was to do a manual refresh of the whole library using virtual tape.  Due to the size of the library, that is not feasible, especially since there is no guarantee that these files will not still have the same issue afterwards.

    Thoughts?

    Thanks for reading.

    Sonia

    ------------------------------
    Sonia Dodov-Maloney
    Director Of Global Information Technology
    Computer Guidance Corporation
    Scottsdale AZ US
    ------------------------------


  • 2.  RE: iCluster deletes LF objects on the target and sync check flags them as not found NFD

    Posted 10-24-2022 08:22
    As part of your investigative work, have you reviewed the DO journal entries on the target to try and determine what is deleting the LFs? Won't solve your issue, but might give you some additional insight into what is happening to the LFs.

    We have some weekend processes that "rebuild" certain files and they inconsistently show up as NFD on Monday mornings. We've even had instances where the physical doesn't get replicated and then the LF shows up as NFD. There was no indication in the sync checks that the PF was also not there. Once we sent the PF over, then the LF could be replicated.

    ------------------------------
    John Techmeier
    Infrastructure Engineer Senior - IBM i
    Associated Banc-Corp
    Green Bay WI US
    ------------------------------



  • 3.  RE: iCluster deletes LF objects on the target and sync check flags them as not found NFD

    Posted 10-24-2022 13:56
    Hi John,

    Thanks for responding.

    I've tried to display the journal on the target system, but the receiver doesn't exist on it.  It only exists on the source system.  I tried to un-journal/re-journal one of these LFs, but these LFs are implicitly journaled by the system.  Meaning, they are not journaled via the STRJRNAP (start journal access path) command; rather, their based-on physical files are journaled, so the LFs are journaled automatically by the system.  So I was unable to un-journal and re-journal them to determine the process that is deleting the LFs.  Frustrating for sure!

    These LFs are not rebuilt over and over again.  They are built once, at software install time, and then are not deleted.  I can tell by the creation dates on the few that I've sampled via the DSPFD command. 

    Thanks,
    Sonia

    ------------------------------
    Sonia Dodov-Maloney
    IBM i Systems Engineer
    Computer Guidance Corporation
    Scottsdale AZ US
    ------------------------------



  • 4.  RE: iCluster deletes LF objects on the target and sync check flags them as not found NFD

    Posted 10-24-2022 14:40
    These would be audit journal entries, not database journal entries.

    DSPJRN JRN(QAUDJRN) RCVRNG(*CURCHAIN) FROMTIME('10/24/2022' '10:00:00') TOTIME('10/24/2022' '13:00:00') ENTTYP(DO)

    Prompt the command and fill in the appropriate FROMTIME and TOTIME values.

    Depending on your IBM i OS version and PTF level, you might be able to use the audit journal entry services to find it via SQL; see the AUDIT_JOURNAL_DO table function in the IBM Documentation.

    SELECT OBJECT_LIBRARY, OBJECT_NAME, USER_NAME
      FROM TABLE( SYSTOOLS.AUDIT_JOURNAL_DO( STARTING_TIMESTAMP => CURRENT TIMESTAMP - 7 DAYS ) )
      WHERE OBJECT_LIBRARY = '<library>'
        AND OBJECT_NAME = '<filename>'
        AND OBJECT_TYPE = '*FILE';


    ------------------------------
    John Techmeier
    Infrastructure Engineer Senior - IBM i
    Associated Banc-Corp
    Green Bay WI US
    ------------------------------



  • 5.  RE: iCluster deletes LF objects on the target and sync check flags them as not found NFD

    Posted 10-25-2022 12:50

    Hi John,

    Thanks for the info.  I tried running the commands you provided, but they were not very helpful.  The SQL statement returned one entry (I ran it for only 1 file name as an example).  I received the following output:
    ....+....1....+....2....+....3....
    OBJEC00001 OBJEC00002 USER_NAME
    CMSFIL            CRVPCLS01      DMCLUSTER
    ******** End of data ********
    So nothing that I didn't already know.  The user DMCLUSTER is the user doing the delete on this file.  But why?  That is the question.

    Thanks,
    Sonia



    ------------------------------
    Sonia Dodov-Maloney
    IBM i Systems Engineer
    Computer Guidance Corporation
    Scottsdale AZ US
    ------------------------------



  • 6.  RE: iCluster deletes LF objects on the target and sync check flags them as not found NFD

    ROCKETEER
    Posted 10-25-2022 14:27
    LFs are not just objects; they can be dependent objects, depending on whether they are an SQL index or view, or a logical built from a DDS file description.  When there is a refresh action on a physical file that has logicals, you will notice that iCluster deletes all the logicals associated with the physical file on the backup, then refreshes the physical, and lastly refreshes/rebuilds all the logicals.  If the refresh of the physical file fails to complete, the logicals and possibly the physical can be NFD on the backup.  If it failed, I would expect the physical to be SUSPENDED on the PRIMARY or BACKUP node.  If iCluster then automatically attempts to activate the file again, we would expect it to delete the logicals associated with it (on the BACKUP) and then attempt to ACTIVATE it.  Look at your AUTOMATIC REACTIVATION settings to understand the interval that is set.
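
    The ordering described above can be pictured with a small toy model. This is only a sketch in Python of the sequence as stated in this post, not iCluster's actual implementation; `BackupNode`, `refresh_physical`, and the object names `PFILE`, `LFILE01`, and `LFILE02` are hypothetical. The model assumes a failed refresh leaves the old physical in place, although in practice the physical may end up missing as well.

```python
from dataclasses import dataclass, field


@dataclass
class BackupNode:
    """Toy stand-in for the set of objects present on the backup system."""
    objects: set = field(default_factory=set)


def refresh_physical(backup, pf, logicals, pf_refresh_ok):
    """Model the described order: drop LFs, refresh the PF, rebuild LFs.

    Returns the names that would be missing (NFD) on the backup afterwards.
    """
    # Step 1: the dependent logicals are removed first.
    for lf in logicals:
        backup.objects.discard(lf)

    # Step 2: refresh the physical. Assumption: on failure the sequence
    # stops here and the stale physical is left untouched.
    if pf_refresh_ok:
        backup.objects.add(pf)

        # Step 3: logicals are rebuilt only after the physical succeeds.
        for lf in logicals:
            backup.objects.add(lf)

    return sorted(n for n in [pf, *logicals] if n not in backup.objects)


backup = BackupNode({"PFILE", "LFILE01", "LFILE02"})
missing = refresh_physical(backup, "PFILE", ["LFILE01", "LFILE02"],
                           pf_refresh_ok=False)
print(missing)  # ['LFILE01', 'LFILE02']
```

    In this model the logicals are the first thing to go and the last thing to come back, which is why a failure on the physical shows up as NFD on logicals that were never touched on the source.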

    Rather than using DMSNDOBJ on the logicals, you could try this process (DMACTOBJ) and see if you have success.  It will refresh the entire physical file and its logicals, which has some considerations that you are likely aware of: source-side locking from the application, users, backups, etc. at the time of your attempt.  Physical file data size and bandwidth limitations should also be considered.

    I would also consider upgrading to iCluster 8.4 for this pair if you are not currently running the latest. You have a lot of clusters to manage so this one could be the first 8.4 of your sets.

    You may also be aware that Rocket Development Lab is planning a delivery of iCluster 9.1 before year end with a planned Web Administrator.  (We will all know the exact version number of the web release when it is available, but for now I think 9.1 is a candidate.)  So, before you upgrade your world, you may want to preview that and see if it would be a better destination for an upgrade effort.  Reach out to me and I can show you a brief walkthrough of the Alpha/Beta I have access to.  (Let's get this other issue resolved first.  If this cluster pair is running a different OS version on source or target, it could impact whether you should be considering an iCluster upgrade even though your other clusters are working fine.)  Previewing 9.1 should not preclude your move to 8.4 for this cluster if that would resolve your issues.

    1. On the Primary, run 'DSPFD FILE(lib/lfile)'.
    2. Find 'PFILE'.
    3. Run 'DSPFD FILE(lib/pfile)'.
    4. Find 'Total' and review the total records and member sizes.
    5. Compare these attributes on the Backup manually by repeating these steps on the Backup node (just to confirm the situation).
    6. Manually suspend the physical file in iCluster with 'DMSUSOBJ'.
    7. Go to the backup node and delete the physical and all of its logicals.
    8. Check your iCluster Compression settings in option 6 from the main menu.  It is the third field from the top; set it to *MEDIUM (in older releases it was just *YES = *MEDIUM).  Using compression could reduce the overall data volume needed to synchronize during the save and restore.
    9. Prompt the DMACTOBJ and press F9 for all fields.
    - Move your cursor to Physical File refresh method, press F1, and review the field help.
    - Depending on the file allocations, you may need to perform this command (DMACTOBJ) when the application can be ended.
    - My preference is to set the environment as needed for a successful run using the *SAVRST option.
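
    The manual comparison in steps 4 and 5 boils down to checking a few attributes side by side. A minimal sketch in Python, assuming the values were copied by hand from the DSPFD output on each node; the function name, the attribute keys, and the sample numbers are all hypothetical placeholders, not real DSPFD field names or real data.

```python
def compare_file_attributes(primary, backup,
                            keys=("total_records", "total_member_size")):
    """Return {attribute: (primary_value, backup_value)} for each mismatch."""
    return {
        key: (primary.get(key), backup.get(key))
        for key in keys
        if primary.get(key) != backup.get(key)
    }


# Placeholder values typed in from the two DSPFD displays.
primary_attrs = {"total_records": 1000, "total_member_size": 81920}
backup_attrs = {"total_records": 950, "total_member_size": 81920}

print(compare_file_attributes(primary_attrs, backup_attrs))
# {'total_records': (1000, 950)}
```

    An empty result means the sampled attributes agree; an attribute that is absent on one side shows up as None for that node.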

    I hope you find that suggestion helpful.  Good Luck!

    ------------------------------
    Mark Watts
    Software Engineer
    Rocket Software Inc
    Waltham MA US
    ------------------------------