Problem:
Resolution:
Net Express does not support this type of file directly but attached is a sample program that will read it using the CBL byte stream file routines and will parse the individual data records from the file and then further parse each record into its tab delimited fields placing the fields in a PIC N field for processing as UTF-16 data.
Here is the source code to the demo:
*------------------------------------------------------------------------------------------*
* CONVERTUTF16
*
* This example demonstrates how to read in a UTF-16 based text
* file which has records whose fields are delimited by tabs.
* There is no filetype in Net Express that is equivalent to a
* Unicode file so we must read the file using the CBL_ byte
* stream library routines.
*
* In this demo the entire file is read into a buffer and then
* the individual records are parsed and for each record the indiv-
* idual fields are parsed and processed.
*
* Although the file is in UTF-16 the parsing is all done using
* standard PIC X data fields instead of PIC N for ease of use.
*-------------------------------------------------------------------------------------------*
id division.
program-id. convertutf16.
data division.
working-storage section.
01 filename pic x(256) value "UTF16-LE_TAB-delimited.txt".
01 access-mode pic x comp-x value 1.
01 deny-mode pic x comp-x value 0.
01 device pic x comp-x value 0.
01 file-handle pic x(4) comp-5 value 0.
01 file-offset pic x(8) comp-x value 0.
01 byte-count pic x(4) comp-x value 0.
01 flags pic x comp-x value 0.
01 buffer pic X(100000) value spaces.
01 rec-size pic 9(5) value zeroes.
01 start-pos pic 9(5) value 0.
01 end-pos pic 9(5) value 0.
01 rec-num pic 9(5) value 0.
01 current-record pic X(1000) value spaces.
01 current-field pic x(100) value spaces.
01 current-field-n pic n(100) value spaces.
01 field-pointer pic 9(5) value zeroes.
01 field-size pic 9(5) value zeroes.
01 crlf pic x(4) value X"0D000A00".
01 tab-char pic x(2) value X"0900".
01 last-record-flag pic x value "N".
88 last-record value "Y"
when set to false "N".
procedure division.
call "CBL_OPEN_FILE"
using filename
access-mode
deny-mode
device
file-handle
end-call
if return-code not = 0
display "error on open = " return-code
stop run
end-if
*> The following call with flags set to 128 will get the filesize
*> into the file-offset field. It is then used in the next read
*> to read the entire file at once. You must ensure that the
*> buffer size is large enough to hold your largest file.
move 128 to flags *> get filesize
call "CBL_READ_FILE"
using file-handle
file-offset
byte-count
flags
buffer
end-call
if return-code not = 0
display "error on read rec length = " return-code
stop run
end-if
move file-offset to byte-count
move 0 to file-offset flags
call "CBL_READ_FILE"
using file-handle
file-offset
byte-count
flags
buffer
end-call
if return-code not = 0
display "error on read = " return-code
stop run
end-if
*> Contents of file is now in buffer
*> We want to skip the first two bytes which are the UTF markers
move 3 to start-pos
*> We will parse the buffer using unstring delimited by the CRLF
*> delimiters so that we process one record at a time. Since this
*> is UTF-16 the actual code is 0D000A00.
*> The size of the record is held in rec-size.
perform until exit
move spaces to current-record
unstring buffer
delimited by crlf
into current-record
count in rec-size
with pointer start-pos
end-unstring
*> For ease of parsing we add a tab character to the end of the
*> record so all fields are delimited
move tab-char to current-record(rec-size 1:2)
add 2 to rec-size
*> This procedure will actually parse the individual fields
*> delimited by tab character and process them.
perform 100-process-record
if start-pos >= byte-count
exit perform
end-if
end-perform
display "recs processed = " rec-num
stop run.
100-process-record.
add 1 to rec-num.
move 1 to field-pointer
perform until exit
move 0 to field-size
unstring current-record
delimited by tab-char
into current-field
count in field-size
with pointer field-pointer
end-unstring
*> space fill with UTF spaces
move all X"2000" to current-field(field-size 1:)
move current-field to current-field-n
*> Now field is in actual PIC N data type and you can process it
*> however you wish
display current-field-n(1:field-size / 2)
if field-pointer >= rec-size
exit perform
end-if
end-perform.
Attachments
#VisualCOBOL
#netexpress
#COBOL



