I have a challenge to automate the creation of a backup file for an existing data file. The data files being backed up will vary, and they can be hashed or dynamic files.
It has been reasonably straightforward until I came to supporting an existing dynamic file.
All my previous posts about dynamic files have been attempts to better understand how I can assess the existing file so that I can create the backup file with well-optimised specifications, making the item copy process as quick/efficient as possible.
I was first thinking that ANALYZE.FILE might help determine both the minimum modulus and the group size needed to properly cater for the data, but now I'm unsure if that is the best approach.
My alternative method is to simply count the number of items in the existing file and calculate an average item size, then set MINIMUM.MODULUS to the item count, set RECORD.SIZE to the average item size, and let UV's dynamic file magic do its work.
I have come up with this TCL command to count the items and calculate the average record size:
SELECT COUNT(*), AVG(CAST(EVAL "LEN(@RECORD)+LEN(@ID)+1" AS INT)) FROM TEMP_DYN COUNT.SUP COL.SUP MARGIN 0;
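To make that concrete, the sort of pass I'm planning to automate in BASIC is roughly the following (an untested sketch of my own; the variable names are placeholders):
* Untested sketch: count the items and work out an average item size.
OPEN '', 'TEMP_DYN' TO F.SRC ELSE STOP 'Cannot open TEMP_DYN'
SELECT F.SRC
ITEM.COUNT = 0
TOTAL.BYTES = 0
LOOP
   READNEXT ID ELSE EXIT
   READ REC FROM F.SRC, ID THEN
      ITEM.COUNT = ITEM.COUNT + 1
      TOTAL.BYTES = TOTAL.BYTES + LEN(REC) + LEN(ID) + 1
   END
REPEAT
IF ITEM.COUNT THEN
   AVG.SIZE = INT(TOTAL.BYTES / ITEM.COUNT)
   CRT 'Items: ':ITEM.COUNT:', average size: ':AVG.SIZE:' bytes'
END
The idea is then to feed ITEM.COUNT into MINIMUM.MODULUS and AVG.SIZE into RECORD.SIZE.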
Are these approaches valid, or are there other methods/options available to achieve the results I need to deliver?
Many thanks for any guidance on this.
------------------------------
Gregor Scott
Software Architect
Pentana Solutions Pty Ltd
Mount Waverley VIC AU
------------------------------
I tried the SELECT statement on a test file (it is dynamic, and has some VERY large items in it):
>SELECT COUNT(*), AVG(CAST(EVAL "LEN(@RECORD)+LEN(@ID)+1" AS INT)) FROM TEMP_DYN COUNT.SUP COL.SUP MARGIN 0;
LEN ( @RECORD ) + LEN ( @ID ) + 1
2455 34746
>
So I then created a file using the following commands:
ICREATE.FILE DICT TEMP_DYN$_311023 18 53 2
ICREATE.FILE DATA TEMP_DYN$_311023 DYNAMIC GENERAL MINIMUM.MODULUS 2456 RECORD.SIZE 34746
And copied the source data into the new backup file using
COPYI FROM TEMP_DYN TO GDS.TEMP_DYN$_311023 ALL OVERWRITING
This all worked OK:
>ICREATE.FILE DICT TEMP_DYN$_311023 18 53 2
Creating file "D_TEMP_DYN$_311023" as Type 18, Modulo 53, Separation 2.
Added "@ID", the default record for RetrieVe, to "D_TEMP_DYN$_311023".
>ICREATE.FILE DATA TEMP_DYN$_311023 DYNAMIC GENERAL MINIMUM.MODULUS 2456 RECORD.SIZE 34746
Creating file "TEMP_DYN$_311023" as Type 30.
>COPYI FROM TEMP_DYN TO TEMP_DYN$_311023 ALL OVERWRITING
2455 records copied.
>
This is what ANALYZE.FILE thinks:
>ANALYZE.FILE TEMP_DYN$_311023 STATS NO.PAGE
File name = TEMP_DYN$_311023
File has 2456 groups (each * represents 10 groups analyzed).
********************************************************************************
********************************************************************************
********************************************************************************
*****
File name .................. TEMP_DYN$_311023
Pathname ................... TEMP_DYN$_311023
File type .................. DYNAMIC
File style and revision .... 64BIT Revision 12
NLS Character Set Mapping .. NONE
Hashing Algorithm .......... GENERAL
No. of groups (modulus) .... 2456 current ( minimum 2456, 934 empty,
1285 overflowed, 1263 badly )
Number of records .......... 2455
Large record size .......... 3257 bytes
Number of large records .... 1867
Group size ................. 4096 bytes
Load factors ............... 80% (split), 50% (merge) and 6% (actual)
Total size ................. 99160064 bytes
Total size of record data .. 89682841 bytes
Total size of record IDs ... 78591 bytes
Unused space ............... 9390440 bytes
Total space for records .... 99151872 bytes
File name .................. TEMP_DYN$_311023
Number per group ( total of 2456 groups )
Average Minimum Maximum StdDev
Group buffers .............. 9.86 1 148 15.37
Records .................... 1.00 1 5 1.02
Large records .............. 0.76 1 5 0.89
Data bytes ................. 6515.81 27 602257 2967.32
Record ID bytes ............ 32.00 2 175 33.24
Unused bytes ............... 3823.47 88 4096 558.45
Total bytes ................ 10371.28 4096 606208 0.00
Number per record ( total of 2455 records )
Average Minimum Maximum StdDev
Data bytes ................. 6530.69 38 516090 0615.47
Record ID bytes ............ 32.01 2 62 7.40
Total bytes ................ 6562.70 40 516152 0616.95
File name .................. TEMP_DYN$_311023
Histogram of record and ID lengths
63.8%
Bytes ---------------------------------------------------------------------
up to 4|
up to 8|
up to 16|
up to 32|
up to 64|
up to 128|
up to 256|
up to 512| >>>>>>>>>
up to 1K| >>>>>>
up to 2K| >>>>>>>
up to 4K| >
up to 8K| >
up to 16K| >>>>>>>>>>>
More| >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
---------------------------------------------------------------------
>
From reading the manual, the RECORD.SIZE option for ICREATE.FILE should result in the group size and large record values being calculated to suit the value specified.
The resulting group size is 4096 bytes - aka GROUP.SIZE 2 - and the large record value of 3257 is approximately 80% of the calculated group size, but neither seems related to the RECORD.SIZE provided on the command line.
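(As a side note - just my own arithmetic, so it may be nothing - 80% of the full 4096 would be 3276.8, so the 3257 reported presumably has some small per-group overhead netted off before the 80% is taken.)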
Does this indicate that the RECORD.SIZE parameter is ignored entirely, or does it have an upper limit that is not documented?
------------------------------
Gregor Scott
Software Architect
Pentana Solutions Pty Ltd
Mount Waverley VIC AU
------------------------------
Hi Gregor
Dynamic files treat large items effectively the same as static hashed files do - they are handled as out-of-line (oversized) records in the OVER.30 file. That is not necessarily a bad thing: the ID is hashed in primary space and points directly to the first block of the record body, along with a chain count, so it's only one additional fetch. If you only have a relatively small number of those, it isn't worth getting hung up about, so long as the group size accommodates the vast majority well. As to the RECORD.SIZE parameter, it does seem to have some effect on the calculation, but I haven't spent time working out exactly what (!), so I always just go off the group size.
Unless someone else has some insight into RECORD.SIZE ...
BTW, just to be pedantic (never!): when it comes to determining group size, your calculation of the average size doesn't take account of the record header.
------------------------------
Brian Leach
Director
Brian Leach Consulting
Chipping Norton GB
------------------------------
All my previous posts about dynamic files have been attempts to better understand how I can assess the existing file so that I can create the backup file with well-optimised specifications, making the item copy process as quick/efficient as possible.
Wouldn't the quickest way to copy the file be to do an OS copy of the file and then create a new VOC pointer?
------------------------------
Martin Shields
Senior Technical Consultant
Meier Business Systems PTY LTD
Carnegie VIC AU
------------------------------
Hi Martin.
An OS-level copy is one fast method, but it does not cater for a couple of scenarios:
- Sometimes not all the items in the source file are going to be adjusted. There are workloads where only a known subset need changing, and those are the items that need backing up. The backup file specs need to reflect the smaller quantity. In these cases the item IDs are known, so it is possible to determine average record sizes (a rough sizing sketch for this case follows the list).
- Files which are involved in ADE cannot be safely duplicated via OS-level copy commands. For such files, the backup file also needs to have ADE protection applied to ensure the backed-up contents are kept safe.
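For the subset case, the sizing pass I have in mind is roughly this (untested sketch; ID.LIST and SRC.NAME are placeholder names for the known ID list and the source file):
* Untested sketch: size only the known subset of items.
* ID.LIST is a field-mark-delimited list of the item IDs to be backed up.
OPEN '', SRC.NAME TO F.SRC ELSE STOP 'Cannot open ':SRC.NAME
ITEM.COUNT = 0
TOTAL.BYTES = 0
LOOP
   REMOVE ID FROM ID.LIST SETTING MORE
   READ REC FROM F.SRC, ID THEN
      ITEM.COUNT = ITEM.COUNT + 1
      TOTAL.BYTES = TOTAL.BYTES + LEN(REC) + LEN(ID) + 1
   END
WHILE MORE DO REPEAT
IF ITEM.COUNT THEN AVG.SIZE = INT(TOTAL.BYTES / ITEM.COUNT) ELSE AVG.SIZE = 0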
------------------------------
Gregor Scott
Software Architect
Pentana Solutions Pty Ltd
Mount Waverley VIC AU
------------------------------
Thanks Brian.
I have raised a support case regarding the RECORD.SIZE parameter. I suspect that the GROUP.SIZE calculated from the RECORD.SIZE parameter value is being capped at 2 (i.e. 4096 bytes), so while the RECORD.SIZE parameter might be expected to help with larger records, it does not - the GROUP.SIZE approach seems the best for all-round consistent outcomes.
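For completeness, what I plan to automate once the sizing numbers are known is simply building and executing the create command along these lines (rough, untested fragment; the GROUP.SIZE value would ultimately be derived from the measured average):
* Untested fragment: build and execute the create command from the measured numbers.
BAK.NAME = 'TEMP_DYN$_311023'   ;* backup file name (example)
MIN.MOD = ITEM.COUNT + 1        ;* from the sizing pass
CMD = 'ICREATE.FILE DATA ':BAK.NAME:' DYNAMIC GENERAL'
CMD := ' MINIMUM.MODULUS ':MIN.MOD:' GROUP.SIZE 2'
EXECUTE CMD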
Regarding the record header in Type 30 files - how large is it, and does it change with different GROUP.SIZE settings?
------------------------------
Gregor Scott
Software Architect
Pentana Solutions Pty Ltd
Mount Waverley VIC AU
------------------------------
Looking at https://www.slideshare.net/rocketsoftware/universe-files (thanks for posting this, @Neil Morris!), slide 101 seems to indicate there is a 12-byte header per group.
I presume that is a fixed size and does not grow with different GROUP.SIZE settings.
------------------------------
Gregor Scott
Software Architect
Pentana Solutions Pty Ltd
Mount Waverley VIC AU
------------------------------
Hi Gregor
That is true for 32-bit files. For 64-bit files the addresses are double the size (DWORD), so 20 bytes (IIRC the flags are still a WORD?). It is independent of group size.
Brian
Hi Gregor,
The group header size is fixed. The only variation, as Brian states, is that the size is larger for a 64-bit file: it is 12 bytes for a 32-bit file and 24 bytes for a 64-bit file.
Neil
------------------------------
Neil Morris
Universe Advanced Technical Support
Rocket Software
------------------------------
From what I can tell, the RECORD.SIZE parameter will determine whether to set the GROUP.SIZE to 1 (2048) or 2 (4096), and the LARGE.RECORD will be set to 80% of the group size. When originally implemented way back when, the GROUP.SIZE for dynamic files was limited to 1 or 2. In the relatively recent past (11.3.1?), the restriction to group sizes of 2K or 4K was lifted: the GROUP.SIZE parameter now accepts values of 3, 4, 5, etc., where the actual group size is the specified value multiplied by 2K. The RECORD.SIZE parameter, however, still appears to limit the group size to 1 (2K) or 2 (4K).
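For example, specifying GROUP.SIZE 4 would give a 4 x 2K = 8192-byte group, with the large record threshold then set to roughly 80% of that, i.e. around 6553 bytes.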
Neil
------------------------------
Neil Morris
Universe Advanced Technical Support
Rocket Software
------------------------------
Hi Gregor,
This series may be of some interest...
------------------------------
Justin Gledhill
Business Analyst
Ecolab Incorporated
Saint Paul MN US
------------------------------