
A jEDI that lets you access an XML file as if it were a standard jBASE hashed file

During a training class for jBASE, I was aided by Charles Barouch, Jenna Hernandez and Tyler McDonough in designing a jEDI to access XML files.
I want to thank them for their contributions and the brainstorming session.  This blossomed into the jEDI attached to this post and may be handy for anyone needing to access XML files from jBASE.

When run against a 500,000-record XML file 94 MB in size, it can access every item in the file in just under 2 minutes.

Why read and decode an XML file that looks like this:

<CATALOG>
<PLANT><XMLROW>1</XMLROW><COMMON>Buttercup</COMMON><BOTANICAL>Erythronium americanum</BOTANICAL><ZONE>21</ZONE><LIGHT>Mostly Shady</LIGHT><PRICE>$12.59</PRICE><AVAILABILITY>12,599</AVAILABILITY></PLANT>
<PLANT><XMLROW>2</XMLROW><COMMON>Greek Valerian</COMMON><BOTANICAL>Phlox divaricata</BOTANICAL><ZONE>33</ZONE><LIGHT></LIGHT><PRICE>$9.02</PRICE><AVAILABILITY>3,596</AVAILABILITY></PLANT>
<PLANT><XMLROW>3</XMLROW><COMMON>Columbine</COMMON><BOTANICAL>Oenothera</BOTANICAL><ZONE>4</ZONE><LIGHT>Sun or Shade</LIGHT><PRICE>$3.74</PRICE><AVAILABILITY>8,669</AVAILABILITY></PLANT>
<PLANT><XMLROW>4</XMLROW><COMMON>Anemone</COMMON><BOTANICAL>Eschscholzia californica</BOTANICAL><ZONE>8</ZONE><LIGHT>Sun or Shade</LIGHT><PRICE>$10.48</PRICE><AVAILABILITY>1,502</AVAILABILITY></PLANT>
</CATALOG>

When you could access it from jBASE in a structure like this:

    1
001 1
002 Buttercup
003 Erythronium americanum
004 21
005 Mostly Shady
006 1259
007 12599

    2
001 2
002 Greek Valerian
003 Phlox divaricata
004 33
005
006 902
007 3596

    3
001 3
002 Columbine
003 Oenothera
004 4
005 Sun or Shade
006 374
007 8669

    4
001 4
002 Anemone
003 Eschscholzia californica
004 8
005 Sun or Shade
006 1048
007 1502
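To make the mapping concrete, here is a minimal Python sketch of the idea (not the XMLJEDI source, which is jBASE code): each row element becomes one record, and each child tag becomes one field in tag order. It assumes the record key is the row's position in the file and that fields are joined with the attribute mark (character 254); `rows_to_records` is a hypothetical name. It shows the raw values, before any conversions are applied.

```python
import xml.etree.ElementTree as ET

AM = "\xfe"  # attribute mark (character 254), assumed as the field separator

SAMPLE = (
    "<CATALOG>"
    "<PLANT><XMLROW>1</XMLROW><COMMON>Buttercup</COMMON>"
    "<BOTANICAL>Erythronium americanum</BOTANICAL><ZONE>21</ZONE>"
    "<LIGHT>Mostly Shady</LIGHT><PRICE>$12.59</PRICE>"
    "<AVAILABILITY>12,599</AVAILABILITY></PLANT>"
    "<PLANT><XMLROW>2</XMLROW><COMMON>Greek Valerian</COMMON>"
    "<BOTANICAL>Phlox divaricata</BOTANICAL><ZONE>33</ZONE>"
    "<LIGHT></LIGHT><PRICE>$9.02</PRICE>"
    "<AVAILABILITY>3,596</AVAILABILITY></PLANT>"
    "</CATALOG>"
)


def rows_to_records(xml_text, row_tag):
    """Return {key: record}, one record per row_tag element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    records = {}
    for position, row in enumerate(root.iter(row_tag), start=1):
        # An empty element such as <LIGHT></LIGHT> has text None: store an empty field.
        fields = [(child.text or "").strip() for child in row]
        records[str(position)] = AM.join(fields)
    return records


records = rows_to_records(SAMPLE, "PLANT")
for key, record in records.items():
    print("    " + key)
    for number, field in enumerate(record.split(AM), start=1):
        print(f"{number:03d} {field}")
```

Unlike this sketch, which loads the whole document, a 94 MB file is better scanned in blocks, which is what the blocksize parameter described later is for.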

 

The other main advantage is that you can use standard JQL and access the file from jsh without any other programs.

Attached is the source for these three programs:

XMLJEDI     The jEDI subroutine which parses the XML so that it appears as a jBASE file.
MAKEXML     A program to generate a test XML file to see how XMLJEDI works.
TESTXML     A program to adjust blocksize in the XMLJEDI parameters for best performance.

The jEDI stub, which defines the parameters for XMLJEDI

A stub in the current directory called SAMPLE.XML:

JBC__SOB JediInitSUB XMLJEDI filename=/big/tmp/SampleLarge.xml,xmlmap=SAMPLE.MAP,dict=SAMPLE.XML]D,blocksize=63488<Enter>

Parameters explained:

    JBC__SOB JediInitSUB XMLJEDI          Defines this item to be a jEDI running the subroutine XMLJEDI
    filename=/big/tmp/SampleLarge.xml     The full path name of the XML file to be parsed
    xmlmap=SAMPLE.MAP                     The XML definition record explained later
    dict=SAMPLE.XML]D                     The dictionary level file used for this XML file (optional)
    blocksize=63488                       Block size used to buffer reads (optional)
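As an illustration of how the stub line breaks down, here is a small Python sketch that splits it into the subroutine name and its name=value parameters. This is only a model of the layout described above, not how jBASE or XMLJEDI actually parses the stub; `parse_stub` is a hypothetical name, treating filename and xmlmap as required is inferred from the other two being marked optional, and the sketch would not cope with a comma inside a path.

```python
def parse_stub(stub_line):
    """Split a jEDI stub line into (subroutine, params). Hypothetical helper."""
    parts = stub_line.split(None, 3)
    if len(parts) < 3 or parts[:2] != ["JBC__SOB", "JediInitSUB"]:
        raise ValueError("not a jEDI subroutine stub")
    subroutine = parts[2]

    params = {}
    if len(parts) == 4:
        for pair in parts[3].split(","):
            name, _, value = pair.partition("=")
            params[name.strip()] = value.strip()

    # Assumption: only dict and blocksize are optional.
    for required in ("filename", "xmlmap"):
        if required not in params:
            raise ValueError(f"missing required parameter: {required}")

    # None stands in for whatever default block size XMLJEDI really uses.
    blocksize = params.get("blocksize")
    params["blocksize"] = int(blocksize) if blocksize else None
    return subroutine, params


stub = ("JBC__SOB JediInitSUB XMLJEDI filename=/big/tmp/SampleLarge.xml,"
        "xmlmap=SAMPLE.MAP,dict=SAMPLE.XML]D,blocksize=63488")
subroutine, params = parse_stub(stub)
print(subroutine)
for name, value in params.items():
    print(f"{name} = {value}")
```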

Working example of the jEDI stub

Add the jEDI stub to the current directory

jsh>ED . SAMPLE.XML<Enter>
New Record
SAMPLE.XML
TOP
.I<Enter>
001+ JBC__SOB JediInitSUB XMLJEDI filename=/big/tmp/SampleLarge.xml,xmlmap=SAMPLE.MAP,dict=SAMPLE.XML]D,blocksize=63488<Enter>
002+<Enter>
.FI<Enter>

Create the DICT portion for the jEDI stub file 

jsh> CREATE-FILE DICT SAMPLE.XML<Enter>

 Create the XML.MAPS file and add SAMPLE.MAP

jsh>CREATE-FILE XML.MAPS<Enter>

jsh>ED XML.MAPS SAMPLE.MAP<Enter>
New Record
SAMPLE.MAP
TOP
.I<Enter>
001+ Large Sample XML record<Enter>
002+ PLANT<Enter>                                                  <-- In this sample, <PLANT> and </PLANT> enclose each data "row"
003+<Enter>
.FI<Enter>
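The row tag in field 2 of the map and the blocksize parameter in the stub work together: the file can be read one block at a time while rows are cut out between the opening and closing row tags, even when a row straddles two blocks. The Python sketch below shows one way to do that. It is an assumption about the general technique, not the XMLJEDI code; `iter_rows` is a hypothetical name, the block size here counts characters rather than bytes, and row tags carrying attributes are not handled.

```python
import io


def iter_rows(stream, row_tag, blocksize=63488):
    """Yield the text between <row_tag> and </row_tag>, reading in blocks."""
    open_tag = f"<{row_tag}>"
    close_tag = f"</{row_tag}>"
    buffer = ""
    while True:
        block = stream.read(blocksize)
        if not block:
            break
        buffer += block
        while True:
            start = buffer.find(open_tag)
            if start < 0:
                # Keep a short tail in case the opening tag is split across blocks.
                buffer = buffer[-(len(open_tag) - 1):]
                break
            end = buffer.find(close_tag, start)
            if end < 0:
                # Row is incomplete: keep it and wait for the next block.
                buffer = buffer[start:]
                break
            yield buffer[start + len(open_tag):end]
            buffer = buffer[end + len(close_tag):]


xml_text = "<CATALOG><PLANT><A>1</A></PLANT><PLANT><A>2</A></PLANT></CATALOG>"
# A tiny block size forces tags and rows to straddle block boundaries.
rows = list(iter_rows(io.StringIO(xml_text), "PLANT", blocksize=7))
print(rows)
```

A larger block means fewer reads but more memory per read, which is presumably the trade-off TESTXML helps you tune.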

ED item 1

The first time you edit an item through the XML stub, the XML.MAPS item is updated and the DICT is created and populated with DICT items. After that, you can adjust field 4 of the XML.MAPS item to apply conversions to the data before it is presented.

jsh> ED SAMPLE.XML 1<Enter>
1

TOP

.P<Enter>

TOP

001 1

002 Buttercup

003 Erythronium americanum

004 21

005 Mostly Shady

006 $12.59                                                                  <-- In the next step you will remove the $ and comma for fields 6 and 7

007 12,599

BOTTOM

.EX<Enter>

Record '1' exited from file 'SAMPLE.XML'

 Now the XML.MAPS item has been updated and the DICT populated
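What that first access generates can be modelled roughly as follows: the tag names found in the first row become field 3 of the map item (separated by value marks), and each tag gets two DICT items, one named for its position and one named for the tag. This Python sketch is a guess at the logic based on the items shown in this post, not the XMLJEDI source; `discover_tags` and `build_dict_items` are hypothetical names, and the A / L / 10 layout simply mirrors the DICT items listed further down.

```python
import re

VM = "\xfd"  # value mark (character 253), displayed as ] in the listings


def discover_tags(row_xml):
    """Return the opening tag names in a row, in order of appearance."""
    return re.findall(r"<([A-Za-z_][\w.-]*)>", row_xml)


def build_dict_items(tags):
    """Return {item_id: fields}: one item per position and one per tag name."""
    items = {}
    for position, tag in enumerate(tags, start=1):
        # Fields 1-10: type A, field number, heading, five empty fields,
        # left justification, width 10.
        fields = ["A", str(position), tag, "", "", "", "", "", "L", "10"]
        items[str(position)] = fields
        items[tag] = list(fields)
    return items


first_row = ("<XMLROW>1</XMLROW><COMMON>Buttercup</COMMON>"
             "<BOTANICAL>Erythronium americanum</BOTANICAL><ZONE>21</ZONE>"
             "<LIGHT>Mostly Shady</LIGHT><PRICE>$12.59</PRICE>"
             "<AVAILABILITY>12,599</AVAILABILITY>")
tags = discover_tags(first_row)
map_field_3 = VM.join(tags)
dict_items = build_dict_items(tags)
print(map_field_3.replace(VM, "]"))
print(sorted(dict_items))
```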

We will alter the MAP item to change how the data is extracted, and the DICT items to change how the data is presented.

jsh>ED XML.MAPS SAMPLE.MAP<Enter>

SAMPLE.MAP
TOP
.P<Enter>

TOP

001 Large Sample XML record

002 PLANT

003 XMLROW]COMMON]BOTANICAL]ZONE]LIGHT]PRICE]AVAILABILITY

BOTTOM

.I<Enter>

004+ ]]]]]MCN]MCN<Enter>                                    <-- Adds the internal conversion MCN for PRICE and AVAILABILITY
005+ <Enter>

BOTTOM
.FI<Enter>

Note: the ] characters in field 4 are actually value marks (^]), one position per tag in field 3. Multiple conversions can be applied to the same tag by separating them with sub-value marks. A conversion can also be CALL XXXX, where XXXX is a subroutine with the arguments indata,outdata; it can manipulate the data passed in indata and return the result in outdata.
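The effect of field 4 can be pictured with a short Python sketch: each value-mark position holds the conversion (or sub-value-separated conversions) for the tag in the same position of field 3, and MCN keeps only the numeric characters. This is an illustration of the behaviour shown in this post, not the XMLJEDI code; `apply_map` and the `CONVERSIONS` table are hypothetical, only MCN and MCU are modelled, and CALL subroutines are left out.

```python
VM = "\xfd"   # value mark, displayed as ]
SVM = "\xfc"  # sub-value mark, separates multiple conversions for one tag

# Only two conversion codes are modelled here.
CONVERSIONS = {
    "MCN": lambda text: "".join(ch for ch in text if ch.isdigit()),
    "MCU": str.upper,
}


def apply_map(values, conversion_field):
    """Apply the per-position conversions in conversion_field to values."""
    codes_by_position = conversion_field.split(VM)
    converted = []
    for index, value in enumerate(values):
        codes = codes_by_position[index] if index < len(codes_by_position) else ""
        for code in filter(None, codes.split(SVM)):
            value = CONVERSIONS[code](value)
        converted.append(value)
    return converted


field_4 = "]]]]]MCN]MCN".replace("]", VM)
row = ["1", "Buttercup", "Erythronium americanum", "21",
       "Mostly Shady", "$12.59", "12,599"]
result = apply_map(row, field_4)
for number, value in enumerate(result, start=1):
    print(f"{number:03d} {value}")
```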

 

jsh> ED SAMPLE.XML 1<Enter>

1

TOP

.P<Enter>

TOP

001 1

002 Buttercup

003 Erythronium americanum

004 21

005 Mostly Shady

006 1259                                                                      <-- Now presented as "multi-value style" ICONV data for (6) PRICE and (7) AVAILABILITY

007 12599

BOTTOM

.EX<Enter>

Record '1' exited from file 'SAMPLE.XML'

 

Now fix the DICT items. Each field has two of them: one named for the number of the value and one named for the tag held in field 3 of XML.MAPS.

 

jsh> ED DICT SAMPLE.XML 6 PRICE 7 AVAILABILITY<Enter>

6

TOP

.P<Enter>

TOP

001 A

002 6

003 PRICE

004

005

006

007

008

009 L

010 10

BOTTOM

.7<Enter>

007

.R<Enter>

007+MR2,$<Enter>
.9<Enter>

009 L

.R<Enter>

009+R<Enter>

.FI<Enter>

Record '6' written to file 'DICT SAMPLE.XML'

PRICE

TOP

.7<Enter>

007

.R<Enter>

007+MR2,$<Enter>

.9<Enter>

009 L

.R<Enter>

009+R<Enter>

.FI<Enter>

Record 'PRICE' written to file 'DICT SAMPLE.XML'

7

TOP

.P<Enter>

TOP

001 A

002 7

003 AVAILABILITY

004

005

006

007

008

009 L

010 10

BOTTOM

.7<Enter>

007

.R<Enter>

007+MR0,<Enter>

.9<Enter>

009 L

.R<Enter>

009+R<Enter>

.FI<Enter>

Record '7' written to file 'DICT SAMPLE.XML'

AVAILABILITY

TOP

.7<Enter>

007

.R<Enter>

007+MR0,<Enter>

.9<Enter>

009 L

.R<Enter>

009+R<Enter>

.FI<Enter>

Record 'AVAILABILITY' written to file 'DICT SAMPLE.XML'
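The two output conversions entered above undo, for display, what MCN stripped out: MR2,$ scales the stored integer by two decimal places and adds thousands separators and a dollar sign, while MR0, adds thousands separators only. The Python sketch below models just those two codes so you can check the arithmetic; `oconv_mask` is a hypothetical name, and it ignores the other options a real mask conversion supports (justification padding, negative-number handling, rounding and so on).

```python
import re
from decimal import Decimal


def oconv_mask(stored, code):
    """Format a stored integer string using a simplified MRn[,][$] code."""
    match = re.fullmatch(r"MR(\d)(,?)(\$?)", code)
    if match is None:
        raise ValueError(f"unsupported conversion code: {code}")
    if stored == "":
        return ""
    decimals = int(match.group(1))
    # Shift the decimal point left by the number of decimal places.
    value = Decimal(stored).scaleb(-decimals)
    spec = ("," if match.group(2) else "") + f".{decimals}f"
    text = format(value, spec)
    return ("$" if match.group(3) else "") + text


print(oconv_mask("1259", "MR2,$"))
print(oconv_mask("12599", "MR0,"))
```

Changing field 9 from L to R right-justifies the column, which is why PRICE and AVAILABILITY line up on the right in the listing that follows.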

LIST item 1 in the newly created SAMPLE.XML "file"

jsh> LIST SAMPLE.XML '1'<Enter>

 

PAGE    1                                                                                    06:59:59  10 MAY 2022
SAMPLE.XML....    XMLROW....    COMMON....    BOTANICAL.    ZONE......    LIGHT.....    PRICE.....    AVAILABILITY
1                 1             Buttercup     Erythroniu    21            Mostly Sha        $12.59          12,599
                                              m american                  dy
                                              um


------------------------------
Dan Ell
Senior Software Engineer
Rocket Internal - All Brands
------------------------------