
I had a report that deleting some records from MYFILE was taking longer than it should. Below are the results of ANALYZE.FILE and HASH.HELP. Can anyone provide some thoughts or advice about the sizing of MYFILE?

ANALYZE.FILE MYFILE
File name= MYFILE
File type= 18
File style and revision                 = 64BIT Revision 12
Number of groups in file (modulo)= 2368289
Separation= 4
Number of records= 7225458
Number of physical bytes= 6209269760
Number of data bytes= 3505627456
 
Average number of records per group= 3.0509
Average number of bytes per group= 1480.2363
Minimum number of records in a group= 0
Maximum number of records in a group= 13
 
Average number of bytes per record= 485.1772
Minimum number of bytes in a record= 104
Maximum number of bytes in a record= 55112
 
Average number of fields per record= 110.6418
Minimum number of fields per record= 40
Maximum number of fields per record= 286
 
Groups   25%     50%     75%    100%    125%    150%    175%    200%  full
      288853  506367  593082  460663  278281  138798   61706   40539
 
 
HASH.HELP:
File MYFILE  Type= 18  Modulo= 2368289  Sep= 4      09:48:29am  17 Jan 2024  PAGE    1
 
Of the 7218428 total keys in this file:
 
      0  keys were wholly numeric (digits 0 thru 9)
         (Use File Type 2, 6, 10 or 14 for wholly numeric keys)
 
7218428  keys were numeric with separators (as reproduced below)
         0123456789#$%&*+-./:;_
         (Use File Type 3, 7, 11 or 15 for numeric keys with separators)
 
      0  keys were from the 64-character ASCII set reproduced below
         !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`
         (Use File Type 4, 8, 12 or 16 for 64-character ASCII keys)
 
      0  keys were from the 256-character ASCII set
         (Use File Type 5, 9, 13 or 17 for 256-character ASCII keys)
 
The keys in this file are more unique in their entirety.
The smallest modulo you should consider for this file is 2326255.
The smallest separation you should consider for this file is 4.
The best type to choose for this file is probably type 18.


------------------------------
Gary Rhodes
Universe Developer
NPW Companies
Hialeah FL USA
------------------------------


Are your drives SSDs?



------------------------------
Marcus Rhodes
marcus1@thinqware.com
------------------------------

Do you have any indices and/or triggers on the file? If so, the additional processing might be the cause of the slowness more than the sizing of the file.

Brian

Hey Marcus:
Yes, they are SSDs.

Thanks,
Gary

Hey Brian:
There are three indexes on the file. In your opinion, do I have any problems with the sizing? From what you see, could a different modulo and separation make a positive impact?

Thanks,
Gary


Hi Gary,

I'm not sure a different modulo would help, other than perhaps changing it to a prime number.

>PRIME 2368289
Next lower prime number:  2368273.
Next higher prime number: 2368297.
>
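If you do try the higher prime, the change itself would be a simple resize, something like this (illustrative values only; RESIZE rebuilds the file, so plan for exclusive access or a maintenance window):

>RESIZE MYFILE 18 2368297 4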

A larger separation might help if there are a lot of records larger than 2K. The statistics show the record sizes range from 104 bytes up to 55112 bytes, but it's unclear how many are very large, since the average size is 485 bytes. You could perform the following query to get an idea.

>LIST MYFILE BY.DSND SIZE SIZE USING DICT VOC

You could also experiment with file type since the number of records per group ranges from 0 to 13. But based on the nature of the keys, the current file type of 18 may be best. You could look at the HASH.AID command to test out some different values. 

But as Brian noted, there could be other reasons for the performance behavior you're seeing. If one of the index files has an index value containing a very large number of record IDs, that could impact performance. The LIST.INDEX command with the STATISTICS or DETAIL keyword would provide some info in this regard.
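For example, covering all three indexes in one pass (ALL just avoids naming each index; this is an illustrative invocation, not output from your system):

>LIST.INDEX MYFILE ALL STATISTICS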

Thanks,

Neil



------------------------------
Neil Morris
Universe Advanced Technical Support
Rocket Software
------------------------------
Gary,

I will have to age myself in saying that it has been many years (over a decade?) since I have worked on file sizing. We've gone to Dynamic files for a good chunk of our database, and another gentleman on the team does the majority (okay, all) of the database maintenance.

Your numbers are very close to what HASH.HELP recommends. While HASH.HELP is never the gospel, it's usually not so far off as to cause noticeable performance issues.

Is this the only file experiencing slowness?
Do you have other files that are similar in size and usage that are okay?
What's unique about this file?
Are these manual DELETE commands (at TCL) or is the delete happening from within an application? If the latter, does the application do anything else at the same time?
Is it only deletes that are slow, or are adds and changes affected as well?
Did this slowness appear suddenly, or creep up over time? If it was a sudden occurrence, what changed?
Note that updating indices (which includes deleting a record ID) can have a significant impact on file performance once certain index values grow very large (for example, a binary value where the vast majority of the millions of records have one of the two values).

I know I didn't really answer your question directly, but unfortunately I'm extremely rusty when it comes to actual file sizing. I hope the additional information might be helpful if the issue turns out to not be the sizing of the file.

Brian


I believe all SSDs read and write 8k blocks, so you may want to adjust your separation accordingly. If it's set to 4k, for example, then, in addition to the complexities (or simplicities, if you prefer) inherent to MV's data-storage model, your drives are always reading in a full block, updating only half of it, and then writing the entire 8k block back out, and on SSDs that's going to be very slow.
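For reference, separation on a static hashed file is counted in 512-byte frames, so the arithmetic (assuming the current Sep of 4 from the ANALYZE.FILE output) works out as:

   Sep  4 =  4 x 512 = 2,048 bytes per group
   Sep 16 = 16 x 512 = 8,192 bytes per group (one 8k flash page)

So aligning groups with an 8k drive block would mean a separation of 16.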



------------------------------
Marcus Rhodes
marcus1@thinqware.com
------------------------------
Thanks again Brian:
Everything anyone shares is helpful, and I do appreciate that you took the time to give me this information. When someone asked me about the file, they were deleting about 60k records from a select list.
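In rough form the delete is just a loop over that list, something like this (simplified; PURGE.LIST stands in for the real saved-list name):

   OPEN '', 'MYFILE' TO F.MYFILE ELSE STOP 201, 'MYFILE'
   GETLIST 'PURGE.LIST' TO DEL.LIST ELSE STOP
   LOOP
      READNEXT ID FROM DEL.LIST ELSE EXIT
      * Each DELETE also maintains the three secondary indexes
      DELETE F.MYFILE, ID
   REPEAT

So each of the ~60k deletes carries its index maintenance with it.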

Thanks,
Gary


Gary,

It's worth noting that not all SSDs are equal. There are multiple factors involved in any comparison, including:

interface type

  • SATA
  • M.2 
  • PCI Express
  • U.2

SSD cell type

  • SLC [single-level cell] (old-school SSD; fast, long endurance, relatively expensive)
  • MLC [multi-level cell] (cheaper, slower than SLC; often replaced by TLC and QLC)
  • TLC [triple-level cell] (slower than MLC, more data-dense; cache size can be vitally important)
  • 3D XPoint/Optane [newer tech from Intel and Micron; unfortunately Intel discontinued standalone Optane in 2021] (faster, with speed similar to DRAM, and longer endurance)
  • QLC [quad-level cell] (the general-usage SSD as of today; slower than TLC, higher density than TLC, generally lower endurance)
  • Z-NAND [Samsung-specific] (has promise, as in many ways it behaves like SLC; for detailed information see www.tomshardware.com)

Endurance

SSD drives 'wear': any single storage point can be written a finite number of times. There are internal mechanisms on commercial SSDs that perform optimisation and re-mapping to help here by limiting the number of physical writes to groups of cells, 'spreading the load'. You should figure on a regular SSD replacement cycle based on endurance and write cycles.

Bottom Line

Not all SSDs are equal, and for enterprise-class systems it is worth consulting a professional. It is also worth noting that SSDs do not eliminate the need for mirroring as a form of protection: SSDs can still fail, and, in time and with sufficient writes, they will fail (as will hard disk spindles, eventually, just differently).

JJ



------------------------------
John Jenkins
Thame, Oxfordshire
------------------------------


Thanks Brian:

This file was dynamic for a long time, but we had constant problems for a while. We converted it to a static hashed file, which has solved those issues, thankfully. Thanks for the list of considerations. All the information I am getting from the community is wonderful and extremely helpful.

Many thanks,
Gary