Rocket U2 | UniVerse & UniData


 Universe behaviour when loading a subroutine

PARTNER John Green posted 11-20-2023 23:02

I am hoping someone can explain how Universe determines when, and from where, to load object code.

What does UniVerse do when it has to call a subroutine? My experiments point to a globally cataloged subroutine being faster than a locally cataloged routine, but I don't see why.

I understand that with a local catalog the system must read the VOC to find the object location and then load the object code, but once it is loaded it remains in memory for the duration of the program. Why would a global catalog be faster, when it must effectively do a similar thing? And regardless of how it is found and loaded, the object thereafter remains in memory and there should be no difference in the cost of each subsequent call. Yet my testing shows a significant performance gain using a globally cataloged subroutine.

I believe a globally cataloged subroutine is loaded into shared memory so that multiple processes execute the same object code, but each with their own memory space. Is that correct?

What is the benefit of adding a subroutine to the SHM.TO.LOAD file? Does this just reduce the cost of the initial load the first time a subroutine is called?

Ultimately I'm trying to determine whether our frequently called subroutines (millions of calls per day responding to requests via Web DE) should be put into the global catalog.

Doing so would seem to come with the once-off cost of changing our code from "CALL MYSUB" to "CALL *MYSUB" in order to invoke the globally cataloged code. Is there a way around having to make this change?

John Jenkins

If I remember correctly:

  • When loading a program/subroutine, every process has its own space for variable data.
  • A globally catalogued program will only have one copy of the program object code in memory, loaded on demand and shared by all that use the same program/subroutine. When no-one is using it any more it goes away and requires re-loading to re-use.
  • If pre-loaded using SHM.TO.LOAD then the program object code is loaded into UniVerse memory when UniVerse starts, so is available immediately to all processes with no further loading required. It is never unloaded, even when not actively in use.

If using pre-loading with SHM.TO.LOAD, UniVerse will however take longer to start. I have seen one (only one) edge case where a massive number of programs and subroutines (tens or hundreds of thousands), a complete company application suite, was pre-loaded. The startup time for UniVerse became so long that various time-dependent applications could not run UniVerse when they expected to, as it was not yet ready: it was too busy loading all those subroutines. The solution was to trim the list from EVERYTHING to what was really relevant.

So catalogued programs are faster and global catalogues more conservative of memory space, which can matter if there are thousands of routines and also users. With current system speeds, though, it is moot whether there would be any worthwhile gain from pre-loading using SHM.TO.LOAD, though I can see extreme cases where it might matter. It might also matter if you are using a hosted cloud environment and want to minimise IOPS for co$t control rea$on$, even if the $aving is $mall.

Regards

JJ.

Brian Leach

Just to answer the last part of your question:

Given a globally cataloged subroutine *HELLO:

ED VOC HELLO
1> V
2> *HELLO
3> B

This will run it as HELLO, but whether that obviates any performance gains from having it globally cataloged is another question. One obvious one: how well sized is your VOC? It's the file everyone forgets, and it might be responsible for longer load times if it's having to search for the catalog pointer. Old school I know :)

ROCKETEER Neil Morris

John, are you able to provide an example of how you are testing the performance? My understanding is similar to yours in that for a local subroutine the reading of the VOC would be an extra step. But only for the initial call. I can take a look and see if there is an explanation for the behavior you are seeing. Thanks. Neil

PARTNER Doug Averch

We have found that on UniVerse, local cataloging works well using UniObjects for Java in web applications.

Our response is usually under 8 milliseconds. 

Additionally, we have found that since we are calling the routine through UOJ there is one copy per web license.

Posted: 11-22-2023 12:47

PARTNER John Green

Thanks for all the responses.

 @Brian Leach, there's always a case for old school and a very good point.

Further testing raised the interesting situation that I can call a subroutine that is not cataloged, hence my confusion that both CALL *SUB and CALL SUB ran the same code. In fact CALL *SUB ran the globally cataloged code while CALL SUB ran the object code from the .O file, even though there is no catalog entry for SUB. Case 00933081 raised as I did not expect that behaviour.

@Neil Morris, my testing has been using a routine we have for code analysis, with many levels of recursive calls, and it is complex enough to make it hard to provide for you to test. I'll keep digging and perhaps can come up with some easy-to-publish code that demonstrates my findings.

What I'm really after is the details of the CALL logic used to find the object code to execute, including how already loaded code is used/reused.

@John Jenkins, I agree the benefit of SHM.TO.LOAD seems small, especially in the Web DE scenario where the responder process is running for long periods and therefore subroutines remain in memory once called.

Brian Leach

Hi John

You can call a subroutine that isn't cataloged if it is in the same file as the calling routine.

PARTNER Gregor Scott

"I am hoping someone can explain how Universe determines when, and from where, to load object code."

From my testing and tracing I believe UniVerse uses the following search sequence to locate the object code for the called subroutine/function:

  1. Local object code (for local subroutines/functions compiled into the same object code as the caller)
  2. Local VOC (this is checked even for Global Catalog program names, such as "*MYSUB", allowing a local VOC entry to intercept the search for a Globally Cataloged program)
  3. Object library of the caller
  4. Global Catalog

I have not experimented with loading programs into shared memory, so am not sure if they still fall into step 4 or not.
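The search sequence above can be modelled with a short Python sketch. This only illustrates the order of the four steps as described in this post; it is not UniVerse's actual implementation, and the function name, parameter names, and sample data are all hypothetical.

```python
def locate_object_code(name, compiled_in, voc, caller_object_file, global_catalog):
    """Return (where_found, location) for a called name, or None if not found.

    Hypothetical model of the four-step order described in the post:
    each argument is a plain Python container standing in for a UniVerse
    structure, not a real UniVerse API.
    """
    # Step 1: routines compiled into the same object code as the caller.
    if name in compiled_in:
        return ("local object code", name)
    # Step 2: the account's VOC, checked even for names starting with "*".
    if name in voc:
        return ("local VOC", voc[name])
    # Step 3: the object file (library) that the caller was loaded from.
    if name in caller_object_file:
        return ("caller's object file", name)
    # Step 4: the global catalog.
    if name in global_catalog:
        return ("global catalog", name)
    return None


# Sample data (assumed for illustration): SUB has no VOC entry but its
# object code sits in the caller's file; *SUB is globally cataloged.
compiled_in = set()
voc = {"HELLO": "*HELLO"}
caller_object_file = {"SUB"}
global_catalog = {"*SUB", "*HELLO"}

print(locate_object_code("SUB", compiled_in, voc, caller_object_file, global_catalog))
print(locate_object_code("*SUB", compiled_in, voc, caller_object_file, global_catalog))
print(locate_object_code("HELLO", compiled_in, voc, caller_object_file, global_catalog))
```

With this sample data, "SUB" resolves at step 3 and "*SUB" at step 4, which matches the behaviour reported earlier in the thread, while "HELLO" is intercepted by the VOC entry at step 2.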

PARTNER John Green

@Gregor Scott, thanks for your input. I agree; my own testing showed the same thing. I measured the elapsed time to make 5 million calls (prog --> subA --> subB). My results show a range of approx. 3% between the fastest method (no catalogs, all object code in the same file) and the slowest (all object code globally cataloged). So I guess there's a trade-off to be made between memory used by object code per process vs. load time. The absolute load time difference per call is very small, so it makes me wonder if global catalog is worth it at all.
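To put "very small" into numbers, the per-call difference can be worked out from the total elapsed time. The sketch below is plain arithmetic in Python; the 3% spread and 5 million calls come from the post, but the total elapsed time was not given, so the 100 seconds used here is purely an assumed figure for illustration.

```python
def per_call_overhead_us(fastest_elapsed_s, relative_diff, calls):
    """Extra time per call, in microseconds, for the slowest method.

    fastest_elapsed_s: total elapsed seconds for the fastest method
    relative_diff:     fractional spread between slowest and fastest (0.03 = 3%)
    calls:             number of calls made in the test
    """
    extra_seconds = fastest_elapsed_s * relative_diff
    return extra_seconds / calls * 1_000_000


# 3% and 5 million are from the post; 100.0 seconds is an assumed total.
overhead = per_call_overhead_us(100.0, 0.03, 5_000_000)
print(f"{overhead:.2f} microseconds per call")
```

Substituting the real elapsed time from a test run gives the actual per-call figure to weigh against the memory trade-off.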

PARTNER Gregor Scott

John

Your approach of having no catalog entries in the VOC still results in the VOC being checked for the subroutine.

The fastest call method is to have all the subroutines compiled into a single object (you simply need to include the subroutines at the end of your program), like this:

* Example program
<...>
END
$INCLUDE subA
END
$INCLUDE subB

This will result in no additional VOC queries for the subroutines.

The obvious downside is the object code maintenance needed to deliver changes to the included subroutines when the main program has not changed.

PARTNER John Green

@Gregor Scott, thanks for the input. Tests I did a few years back showed local functions being significantly faster than external functions. Generally I think the cost of the dependency is greater than the benefit, except in a few cases. I had not considered the $INCLUDE method for including the function as a local function; that is a good idea that I will explore.