Hi Ted,
Coincidentally, I was recently asked to look at JSON files and possibly parse them.
I modified some code I already use for reading large files so that it processes JSON files.
The code should place each key-value pair, object delimiter and array delimiter on a separate line in a dynamic array for further processing.
Other string swaps may be needed to put each item on its own line.
My goal with this was to get the key-value objects into their own attributes.
Once that is done, you should be able to navigate down through the array, keeping track of which object and/or array level you are at.
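As a rough sketch of that navigation (not part of the program below), a loop along these lines could run after the FixLastLine section. The names depth, total, item and first are made up for illustration. It assumes every object or array delimiter ended up at the start of its own line, so an empty object such as {} left on one line, or an array of plain numbers, would throw the count off.

```basic
depth = 0 ;! Current nesting level, objects and arrays combined (hypothetical variable)
total = DCOUNT( results, a ) ;! Number of lines produced by the program
FOR x = 1 TO total
   item = TRIM( results< x > )
   first = item[ 1, 1 ] ;! First character tells us open, close or key-value
   IF first EQ '}' OR first EQ ']' THEN depth -= 1 ;! Closing delimiter: step out before printing
   IF depth LT 0 THEN depth = 0 ;! Guard against unbalanced input
   CRT SPACE( depth * 2 ) : item[ 1, 100 ] ;! Indent by level so the structure is visible
   IF first EQ '{' OR first EQ '[' THEN depth += 1 ;! Opening delimiter: step in after printing
NEXT x
```

Checking only the first character means closers written as }, or ], are handled the same way as a bare } or ].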
If you want the keys and values in separate attributes, just add another string swap to convert ":" to ":crlf" (in this program the line separator is the attribute mark, a).
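One possible form of that extra swap, as an untested sketch, is shown here. It would go at the end of Convert2Attribute, just before the RETURN. Swapping on quote-colon instead of a bare colon is my assumption, made so that colons inside values (times, URLs) are less likely to be split. The second line collapses the doubled attribute marks that appear where an earlier swap already broke the line after a key.

```basic
! Hypothetical additions to Convert2Attribute - not in the original program
lines = SWAP( lines, \":\, \":\ : a ) ;! Break after the quote-colon that ends each key
lines = SWAP( lines, a : a, a ) ;! Remove empty lines where a break already existed
```

If the file has a space after the colon, the value will start with that space, so a TRIM may be needed when reading the values back.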
Of course you can create and convert to a dimensioned array for faster processing.
The file I processed was just over 18 MB; it was converted to a dynamic array of just under 400K attributes in just under 2 seconds.
I would be interested to see the code you are using to convert to D3 files.
Hope this works for you.
****** THIS PROGRAM MUST BE COMPILED WITH THE "COMPILE" COMMAND ******
INCLUDE DM,BP,UNIX.H FCNTL.H ;! Needed for "C" functions to work correctly.
output_folder = '/u/home/lance/json/'
json_filespec = output_folder : 'cigna.json'
outfilename = 'cignatest.json'
Initialize:
a = @AM
v = @VM
lf = CHAR(10) ;! LINE FEED character
cr = CHAR(13) ;! CARRIAGE RETURN character
buflen = 20000 ;! Serial Read buffer length (Tried values of 10k, 15k and 30k - 20k seems to be the best)
exit_flag = 0 ;! Control exit from LOOP statement
extra = '' ;! Hold partial record information between reads
groupptr = 0 ;! Pointer used to store info in the dimensioned array. Info that could go back to calling program.
ln = 0 ;! Temporary line counter used in debugging
results = '' ;! Hold return data to calling program
stime = SYSTEM( 12 )
DIM groups( 80000 ) ;! Create dimensioned array. Temporary holder before saving data to file
MAT groups = '' ;! Initialize dimensioned array
Read_File:
file_handle = %OPEN( json_filespec, O$RDONLY ) ;! Open file
bytesread = 0
IF file_handle LT 0 THEN ;! Check for bad file handle
   results = 'ERROR: OPEN file error in sub (PRE_PROCESS_EPIC_FILE): ' : json_filespec ;! Report file open error
END ELSE
   LOOP UNTIL exit_flag ;! Beginning of READ loop
      readbuffer = SPACE( buflen ) ;! Initialize readbuffer
      readbytes = %read( file_handle, readbuffer, buflen ) ;! Perform read
      readbuffer = SWAP( readbuffer, lf, '' ) ;! Remove all line feed characters in readbuffer
      readbuffer = SWAP( readbuffer, cr, '' ) ;! Remove all carriage return characters in readbuffer
      BEGIN CASE
         CASE readbytes = -1 ;! Problem with read
            results = 'ERROR: Read error: ' : SYSTEM( 0 )
            exit_flag = 1 ;! Set exit flag to exit loop
         CASE readbytes EQ buflen ;! Process current read buffer - NOT EOF
            lines = extra : readbuffer ;! Prepend remaining previous readbuffer to current readbuffer
            GOSUB Convert2Attribute
            linecount = COUNT( lines, a ) ;! Get count of complete lines based on attribute marks
            extrapos = INDEX( lines, a, linecount ) + 1 ;! Determine start point of trailing partial line
            extra = lines[ extrapos, LEN( lines ) ] ;! Save trailing partial line (LEN avoids truncating at 10000 characters)
            lines[ extrapos, LEN( lines ) ] = '' ;! Clear portion of lines now held in extra for next pass
            groupptr += 1
            groups( groupptr ) = lines
         CASE readbytes NE buflen ;! Process last read buffer - EOF
            lines = extra : readbuffer[ 1, readbytes ] ;! Prepend remaining previous data segment to current data segment
            GOSUB Convert2Attribute
            linecount = COUNT( lines, a ) ;! Get count of lines
            groupptr += 1
            groups( groupptr ) = lines
            exit_flag = 1 ;! Set exit flag to exit loop
      END CASE
   REPEAT
END
results = groups ;! Convert to dynamic array
FixLastLine:
last_attr = DCOUNT( results, a ) ;! Get attribute of last line
last = TRIM( results< last_attr > ) ;! Remove any leading/trailing/redundant spaces
DEL results< last_attr > ;! Remove last line
FOR ctr = 1 TO LEN( last ) ;! Loop through length of last line
results< -1 > = last[ ctr, 1 ] ;! Add new line for each character in last line
NEXT ctr
etime = SYSTEM( 12 )
Display2Screen:
*ln = DCOUNT( results, a )
*FOR x = 1 TO ln
*CRT x, results< x >[ 1, 100 ]
*NEXT x
WriteResults:
CRT 'Unique Lines: ' : DCOUNT( results, a ), 'Elapsed Time: ' : (etime - stime)/1000
OPEN output_folder TO TempFolder ELSE STOP
WRITE results ON TempFolder, outfilename
STOP
Convert2Attribute:
! These conversions were created based on a test json file I was working with.
! It is possible that there may be other strings that need to be converted to get each
! key-value pair, object delimiter and array delimiter on their own line
lines = SWAP( lines, \","\, \",\ : a : \"\ )
lines = SWAP( lines, \{"\, \{\ : a : \"\ )
lines = SWAP( lines, \":[{\, \":\ : a : \[\ : a : \{\ )
lines = SWAP( lines, \":{\, \":\ : a : \{\ )
lines = SWAP( lines, \"}],"\, \"\ : a : \}\ : a : \],\ : a : \"\ )
lines = SWAP( lines, \"}},{\, \"\ : a : \}\ : a : \},\ : a : \{\ )
lines = SWAP( lines, \"}\, \"\ : a : \}\ )
RETURN