Problem:
A customer has questions about using large XML files with Net Express.
The attached sample is a demonstration that applies the recommendations of KB 19691:
http://supportline.microfocus.com/mf_kb_display.asp?kbnumber=19691
What does this sample show?
===========================
It demonstrates section 3 of KB 19691:
...
3. You can use the CBL_OPEN_FILE and CBL_READ_FILE byte stream routines to read the data directly from the XML document in manageable blocks of 100KB or more. You can then use INSPECT and STRING statements to parse the data read to find the first <childdocn> start tag and the last </childdocn> end tag in the data buffer and then extract this information into another buffer such as XML-BUFFER PIC X(100000).
In the SELECT statement for your XML enabled file you can have ASSIGN TO ADDRESS OF XML-BUFFER. Then when you open the file you can do a READ NEXT through each of the <childdocn> elements in the XML-BUFFER until you reach end of file. When you are done processing this buffer of data you would close the XML enabled file and reposition the starting offset for the next CBL_READ_FILE to point to the byte immediately following the last </childdocn> end tag that you extracted into XML-BUFFER. You can then perform the CBL_READ_FILE operation and repeat this process until the end of the XML document has been reached.
This method should be used when you do not have any control of the format of the XML input document.
...
===========================
The customer's context is reading a 400-megabyte XML file.
The demonstration described below does not use such a big file, but the method is the same for a huge file.
===========================
This demonstration was tested on Net Express 5.0 WebSync2 and on Server Express 5.0.
Resolution:
1) STEP 1 of the sample: paragraph GenerateXmlFile in ioxml.cbl
========================================================
STEP 1 is simply a quick way to generate an XML file, named FItest.xml.
An XML stream is generated with the XML GENERATE statement
(XML GENERATE converts a data item, here a group, to XML format).
The XML stream is generated from the COBOL group ROOTNODEANSI
and written to an ordinary line sequential file, so that the file can be reused later in an XML context.
The group ROOTNODEANSI contains several sub-groups:
78 Lg value 50.
01 ROOTNODEANSI.
   02 B-XMLNOTUSEFULL.
      03 B-XMLNOTUSEFULL03-1 PIC X(LG).
      03 B-XMLNOTUSEFULL03-2 PIC X(LG).
      03 B-XMLNOTUSEFULL03.
         04 B-XMLNOTUSEFULL04-1 PIC X(LG).
         04 B-XMLNOTUSEFULL04-2 PIC X(LG).
   02 COORDONNEESANSI.
      03 NOM PIC X(LG).
      03 PRENOM PIC X(LG).
      03 ADDRESSE.
         04 ADR1 PIC X(LG).
         04 ADR2 PIC X(LG).
      03 CODEPOSTAL PIC X(LG).
      03 VILLE PIC X(LG).
      03 PAYS PIC X(LG).
   02 E-XMLNOTUSEFULL.
      03 E-XMLNOTUSEFULL03-1 PIC X(LG).
      03 E-XMLNOTUSEFULL03-2 PIC X(LG).
      03 E-XMLNOTUSEFULL03.
         04 E-XMLNOTUSEFULL04-1 PIC X(LG).
         04 E-XMLNOTUSEFULL04-2 PIC X(LG).
The later steps read ONLY the sub-group COORDONNEESANSI of ROOTNODEANSI.
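As an illustration of STEP 1, here is a minimal sketch of an XML GENERATE call. It is not the code from ioxml.cbl: the program name, the shortened group, and the items xml-out and xml-len are assumptions made for this example, and the stream is only displayed rather than written to FItest.xml.

```cobol
       identification division.
       program-id. genxml.
       data division.
       working-storage section.
       78  Lg                 value 50.
      *> Shortened, hypothetical version of the COORDONNEESANSI group
       01  COORDONNEESANSI.
           03 NOM             pic x(Lg) value "Dupont".
           03 PRENOM          pic x(Lg) value "Charles".
           03 VILLE           pic x(Lg) value "Paname".
      *> Receiving area for the generated XML stream (assumed size)
       01  xml-out            pic x(1000).
      *> Number of characters actually generated
       01  xml-len            pic 9(9) comp-5.
       procedure division.
           xml generate xml-out from COORDONNEESANSI
               count in xml-len
               on exception
                   display "XML GENERATE failed: " xml-code
               not on exception
                   display xml-out(1:xml-len)
           end-xml
           stop run.
```

The element names are taken from the COBOL data names, which is why the stream isolated in STEP 3 uses tags such as <COORDONNEESANSI> and <NOM>.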
2) STEP 2 of the sample is a plain XML OPEN and READ of the line sequential file generated in STEP 1,
using the XML file named XMLFILE (select XMLfile ASSIGN FItest).
The XD used for this file was generated with the cbl2xml tool.
The whole XML file is read here.
========================================================
3) STEP3: we'll now READ ONLY the XML node COORDONNEESANSI and its sub-nodes.
========================================================
31)
Dynamic memory allocation with the Micro Focus library routine CBL_ALLOC_MEM.
The demonstration allocates 240 bytes
(78 78memsize value 240. *> unit = bytes)
This value is deliberately small, so that reading the file with the byte-stream routines takes several loop iterations.
It also shows that there is no need to allocate a HUGE amount of memory.
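The allocation can be sketched as follows. This is a minimal example, not the sample's code: the names chunk-size, mem-size, mem-flags and io-buffer are hypothetical, the parameter layout reflects my understanding of CBL_ALLOC_MEM and CBL_FREE_MEM and should be checked against the library routine reference, and the status is examined through RETURN-CODE.

```cobol
       identification division.
       program-id. allocdemo.
       data division.
       working-storage section.
       78  chunk-size         value 240.
       01  mem-pointer        usage pointer.
       01  mem-size           pic x(4) comp-5 value chunk-size.
      *> 0 = no special allocation options (assumption)
       01  mem-flags          pic x(4) comp-5 value 0.
       linkage section.
      *> Mapped onto the allocated block with SET ADDRESS OF
       01  io-buffer          pic x(chunk-size).
       procedure division.
           call "CBL_ALLOC_MEM" using mem-pointer
                                by value mem-size
                                by value mem-flags
           if return-code not = 0
               display "CBL_ALLOC_MEM failed: " return-code
               stop run
           end-if
           set address of io-buffer to mem-pointer
           move spaces to io-buffer
      *>   ... the buffer would be filled by CBL_READ_FILE here ...
           call "CBL_FREE_MEM" using by value mem-pointer
           stop run.
```

Keeping the pointer in its own item matters for the later steps, because the same pointer is what the SELECT of the in-memory XML file is assigned to.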
32)
Use the byte-stream routines CBL_OPEN_FILE and CBL_READ_FILE to open the file generated in STEP 1 and get its total size.
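A minimal sketch of this step is shown below. The data names are illustrative, not those of the sample, and the parameter sizes and flag values (access mode 1 for read-only, read flag 128 to return the file size in the offset field) are given as I understand the byte-stream routines; treat them as assumptions to verify in the library routine reference.

```cobol
       identification division.
       program-id. sizedemo.
       data division.
       working-storage section.
       01  file-name          pic x(20) value "FItest.xml".
      *> 1 = open for reading only
       01  access-mode        pic x comp-x value 1.
      *> 3 = do not deny other readers or writers (assumption)
       01  deny-mode          pic x comp-x value 3.
       01  device             pic x comp-x value 0.
       01  file-handle        pic x(4).
       01  file-offset        pic x(8) comp-x value 0.
       01  byte-count         pic x(4) comp-x value 0.
      *> 128 = return the file size in file-offset, read no data
       01  read-flags         pic x comp-x value 128.
       01  dummy-buffer       pic x.
       procedure division.
           call "CBL_OPEN_FILE" using file-name access-mode
                                      deny-mode device file-handle
           if return-code not = 0
               display "CBL_OPEN_FILE failed: " return-code
               stop run
           end-if
           call "CBL_READ_FILE" using file-handle file-offset
                                      byte-count read-flags
                                      dummy-buffer
           if return-code not = 0
               display "CBL_READ_FILE failed: " return-code
           else
               display "File size in bytes: " file-offset
           end-if
           call "CBL_CLOSE_FILE" using file-handle
           stop run.
```

Knowing the total size up front lets the read loop stop cleanly and shorten the last read so it does not run past the end of the file.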
33)
LOOP
..Read the next block with the byte-stream routine CBL_READ_FILE
...Search for the string held in the constant BEGIN-XMLstream
...... ( value "<COORDONNEESANSI>". )
...When found, save its OFFSET ( from the beginning of the file ) in a variable
...Search for the string held in the constant END-XMLstream
...... ( value "</COORDONNEESANSI>". )
...When found, save its OFFSET ( from the beginning of the file ) in a variable
END-LOOP
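The search inside one block can be sketched with INSPECT ... TALLYING, as below. This is an assumption about one possible implementation, not the sample's code: the block content, the base offset 480 and the names chunk, chunk-base, chars-before and begin-offset are all made up for the illustration.

```cobol
       identification division.
       program-id. finddemo.
       data division.
       working-storage section.
      *> Hypothetical block as it might come back from CBL_READ_FILE
       01  chunk              pic x(80) value
           "<B><X>1</X></B><COORDONNEESANSI><NOM>Dupont</NOM>".
       01  BEGIN-XMLstream    pic x(17) value "<COORDONNEESANSI>".
      *> File offset at which this block was read (assumed value)
       01  chunk-base         pic 9(9) comp-5 value 480.
       01  chars-before       pic 9(9) comp-5 value 0.
       01  begin-offset       pic 9(9) comp-5.
       procedure division.
      *>   Count the characters that precede the start tag. If the
      *>   tag is absent, every character of the block is counted.
           inspect chunk tallying chars-before
               for characters before initial BEGIN-XMLstream
           if chars-before < function length(chunk)
      *>       15 characters precede the tag: 480 + 15 = 495
               compute begin-offset = chunk-base + chars-before
               display "Start tag at file offset " begin-offset
           else
               display "Start tag not in this block"
           end-if
           stop run.
```

The end tag can be located the same way with END-XMLstream. One point to keep in mind with small blocks is that a tag may be split across two consecutive reads; overlapping each read with the previous one by at least the tag length minus one byte is one way to handle that.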
34)
Call the byte-stream routine CBL_READ_FILE, using as file-offset the offset of the beginning of the 'interesting' XML stream.
You have now isolated, from the initial HUGE XML stream, the data you are interested in:
(<COORDONNEESANSI><NOM>Dupont</NOM><PRENOM>Charles</PRENOM><ADDRESSE><ADR1>35 Rues de Alouettes</ADR1><ADR2>Quartier des affaires</ADR2></ADDRESSE><CODEPOSTAL>75000</CODEPOSTAL><VILLE>Paname</VILLE><PAYS>France</PAYS></COORDONNEESANSI>)
35)
Now you only have to open an XML file whose SELECT statement uses the pointer initially returned by CBL_ALLOC_MEM, and read this file:
(select XMLfilePart ASSIGN mem-pointer length is ptr-len
...)
ReadPartOfXMLstream.
    *> ptr-len is the LENGTH IS item of the SELECT for XMLfilePart
    move file-offset-wrk to ptr-len
    initialize pXML-coordonneesansi
    open input XMLfilePart
    if xml-status not = 0
        display "Pb Open XMLfilepart " xml-status
        stop run
    end-if
    read XMLfilePart
    if xml-status < 0
        display "Pb Read XMLfilepart " xml-status
        stop run
    end-if
    display pXML-coordonneesansi
    *> close the in-memory XML file opened above
    close XMLfilePart
    if xml-status < 0
        display "Pb Close XMLfilepart " xml-status
        stop run
    end-if
    .
36)
When finished, do not forget to close the file opened with the byte-stream routine CBL_OPEN_FILE (use CBL_CLOSE_FILE) and to free the memory allocated with CBL_ALLOC_MEM (use CBL_FREE_MEM).