Extended ASCII characters changing value when written

Dominique Sacre
Author
Rocketeer
September 25, 2017

I am starting with an RM file (IDXFORMAT 21). While loading text data, I encounter an extended ASCII character (a value above 127).

Stepping through in the Visual Studio debugger, the value is the same as it appears in the original file.

Debug window with value of ASCII-VALUE

This is the entire record, where the character is.

After moving that record to a location in Working Storage, that record is written. This is the code in that section:

if WORK-FIELD3(WS-SUB1:1) < SPACE or WORK-FIELD3(WS-SUB1:1) > "~" then
    MOVE WORK-FIELD3(WS-SUB1:1) TO ASCII-VALUE
    move spaces to special-characters-record
    move ASCII-VALUE to special-character
    move SF-AR-PRIMARY-KEY to special-character-agr
    move WORK-FIELD3 to special-character-rec
    move DUMMY-FIELD-NAME to special-character-rec-name
    move special-characters-line to special-characters-record
    write special-characters-record
    display "special character: " ASCII-VALUE
    display "agreement = " SF-AR-PRIMARY-KEY
    move "?" to WORK-FIELD4(WS-SUB2:1)
end-if

(I would prefer to not replace the character, this is just what I'm doing right now)
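
The test in the COBOL fragment above (anything below SPACE or above "~") can also be tried outside COBOL, without replacing the character. The following is a minimal Python sketch, not part of the original program; the function name find_extended_bytes and the sample bytes are hypothetical and chosen only for illustration.

```python
def find_extended_bytes(field: bytes) -> list[tuple[int, int]]:
    """Return (position, value) pairs for bytes outside printable ASCII.

    Positions are 1-based, matching COBOL reference modification.
    """
    return [
        (pos, value)
        for pos, value in enumerate(field, start=1)
        if value < 0x20 or value > 0x7E  # below SPACE or above "~"
    ]


# Hypothetical sample: byte 167 (0xA7) sits right after the second "N".
sample = b"AV. LA FONTANA N\xa7 1155"
for pos, value in find_extended_bytes(sample):
    print(f"position {pos}: byte value {value}")
```

Reporting the position and numeric value, rather than the character itself, avoids any dependence on which code page the viewer uses.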

Directives at the top of the program:

$SET IDXFORMAT"21"
$SET DIALECT"RM"

After the write, this is the line that ends up in the output file. The special character has changed value (this is an error-logging record):

§ 174204001A AV. LA FONTANA N§ 1155 INT.01 SF-AR-INV-LINE-2
§ 174204002A AV. LA FONTANA N§ 1155 INT. 01 SF-AR-INV-LINE-2

Query the database using SQL Management Studio, for the values of the characters.

This value is stored in a database. When we retrieve that text field and put it in a string variable (which rebuilds the COBOL record), value 167 isn't displayed, which shortens the text field by one character and throws off every subsequent field by one position.

Is there any way to retain the original value? Or are we going to have to replace all extended ASCII characters with something else?

The assertion you make in the title is incorrect if by "written" you mean by the COBOL WRITE statement. A COBOL WRITE statement simply writes out the bytes in the record area with no changes. The I/O routines do not know if those bytes are part of binary data or are character data and therefore cannot know what changes would be appropriate. No formatting is done in COBOL WRITES.

The behavior you describe is consistent with Visual COBOL treating the data as OEM (perhaps code page 850), whereas the database probably represents the characters in Unicode. The Windows "ANSI" code page (1252; the name is a misnomer) would be a closer match to the database. When COBOL displays data on Windows, the routines used for the display expect the ANSI code page, so OEM data must be translated internally to ANSI. That may be why you think the data is being changed when it is not, or at least not by the WRITE.

I've been looking for a way to get Visual COBOL to treat the data as already being in the ANSI code page instead of assuming it's in the OEM code page (and thus translating it for DISPLAY), but I haven't found it yet. I believe that's what you need to make your COBOL program's representation of the data compatible with the database. Alternatively, in Visual COBOL you can use Unicode data, which would be an even better match to the database and is not limited to the characters available in a single Windows code page (a code page on Windows maps between 8-bit characters and 16-bit Unicode characters).
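
To make the OEM versus ANSI distinction concrete, here is a small Python sketch (Python is used only because it ships with both code page tables). It assumes the OEM code page is 850 and the ANSI code page is 1252, as suggested above; the actual code pages on a given machine may differ.

```python
RAW = bytes([167])  # the byte value reported in the question

# The same byte, interpreted under two different code pages.
as_oem = RAW.decode("cp850")    # assumed OEM code page
as_ansi = RAW.decode("cp1252")  # assumed Windows "ANSI" code page

print(f"byte 167 read as cp850:  {as_oem} (U+{ord(as_oem):04X})")
print(f"byte 167 read as cp1252: {as_ansi} (U+{ord(as_ansi):04X})")

# The byte never changes; only the interpretation does.
assert as_oem.encode("cp850") == as_ansi.encode("cp1252") == RAW
```

Under these assumptions the byte is the masculine ordinal indicator (º) in code page 850 and the section sign (§) in code page 1252, which matches the two renderings seen in the question.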

September 25, 2017

I found the Visual COBOL compiler directive runtime-encoding(ansi). It's described in the documentation as if it should be what you want, but so far my testing shows it as not working. The default of runtime-encoding(oem) seems to be what happens even if the program is compiled with runtime-encoding(ansi). That seems to happen with or without dialect(rm). Maybe you will get different results than I did if you try runtime-encoding(ansi) at compile-time.

September 27, 2017

RUNTIME-ENCODING(ANSI) gets me the same character as before. RUNTIME-ENCODING(OEM) stores the character in the database as the original character, but the display is still switched.

September 27, 2017

I need to know how you "store the character in the database" and how you "display" the character. The runtime only needs to know the encoding when calling functions that expect a certain encoding. Otherwise, the character is left unchanged.

In RM/COBOL, the DISPLAY statement displays into a Windows window using a function that expects ANSI encoding; thus, if the native character set is OEM, the character is translated to ANSI before being displayed and if the native character set is ANSI, the character is written as is for DISPLAY.

In Visual COBOL, a COBOL DISPLAY is likely just a write to the console with no change to the character(s). That is probably why runtime-encoding didn't seem to work for me: it likely isn't used at all in a simple DISPLAY. Are you displaying the characters in a message box or some other way? You seem to be expecting the Windows ANSI character for the specified code points, but a simple DISPLAY in Visual COBOL is likely giving you the OEM interpretation of the code point in the console window. An understanding of fonts, code points, code pages (in particular, Windows "ANSI" vs. "OEM" code pages) and Unicode is necessary to resolve your issue.

September 28, 2017

Storing the character in the database is done by moving the input record (as you say, probably type OEM) to working storage, then constructing an Insert statement from the values in working storage, then running that insert statement. If I use the directive SOURCE-ENCODING"OEM" (on the project properties/COBOL/Additional Directives field), the character is stored in the database in the original format. If I change that to SOURCE-ENCODING"ANSI", the field is stored in the changed value. I am not permitted by VS2010 to use SOURCE-ENCODING"OEM" as a compiler directive at the top of the cbl file. Using RUNTIME-ENCODING"OEM" gives me the same results.

I also move that record to a logging field, and print that in a file (open output in the cbl file). No matter what, that character is always the replaced character, no matter what is stored in the database.

select special-characters-file assign external SPECCHAR
    organization is line sequential.
fd  special-characters-file.
01  special-characters-record pic x(182).

I was making that file, to record what special characters were in Pic X fields, so I knew what I was dealing with.

Data in Database:
AR_INV_LINE_2
AV. LA FONTANA Nº 1155 INT.01
AV. LA FONTANA Nº 1155 INT. 01

Data in file (same run):
§ 174204001A AV. LA FONTANA N§ 1155 INT.01 SF-AR-INV-LINE-2
§ 174204002A AV. LA FONTANA N§ 1155 INT. 01 SF-AR-INV-LINE-2

October 2, 2017

You don't need the source-encoding directive unless you have literals in your source that are encoded with ANSI instead of OEM; OEM is the default source encoding. The source-encoding directive sets the runtime-encoding directive as a side effect, so it can affect interpretation of the runtime encoding of character values. If your display is to a Windows command prompt, characters are interpreted as being in the OEM character encoding. I think you have ANSI encoded characters in your database and thus when you display them, you don't get the characters you expect. Fortunately, the two characters you have specifically described have both an ANSI and OEM encoding; this is not true of all ANSI characters or OEM characters. You should Google Windows Code Page 1252 (ANSI) and Windows Code Page 850 (OEM); these pages have charts showing the encodings and interpretations of the characters.

Maybe this program will help you understand:

identification division.
program-id. cf167186.

environment division.
configuration section.

data division.
working-storage section.
01  ws-data-group.
    03  ws-data-chars occurs 128 times indexed by ch-inx.
        05  charbin pic 999 binary(1).
        05  charch redefines charbin pic x.

01  ws-data-char-ansi pic x(128).

01  ws-buf-length pic x(8) comp-5 value 128.

procedure division.
000-start.

*> Fill the table with the byte values 128 through 255.
    perform varying ch-inx from 128 by 1 until ch-inx > 255
        set charbin(ch-inx - 127) to ch-inx
    end-perform

*> Show the raw bytes, then the same bytes after ANSI-to-OEM conversion.
    display ws-data-group mode block line 1 erase

    call "PC_WIN_CHAR_TO_OEM"
        USING ws-data-group ws-data-char-ansi
        BY VALUE ws-buf-length ws-buf-length.
    display ws-data-char-ansi mode block line 3

    stop run.

end program cf167186.
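
The point that not every character exists in both code pages can be checked with a short Python sketch in the same spirit as the COBOL program above. It is only an illustration: it assumes code pages 850 and 1252, uses Python's codec tables rather than the Windows conversion routines, and the function name unmappable_oem_bytes is hypothetical.

```python
def unmappable_oem_bytes(oem: str = "cp850", ansi: str = "cp1252") -> list[tuple[int, str]]:
    """List the high OEM byte values whose character has no ANSI encoding."""
    missing = []
    for value in range(128, 256):
        ch = bytes([value]).decode(oem)
        try:
            ch.encode(ansi)
        except UnicodeEncodeError:
            # The character exists in the OEM page but not in the ANSI page.
            missing.append((value, ch))
    return missing


missing = unmappable_oem_bytes()
missing_values = [value for value, _ in missing]
print("OEM bytes without an ANSI equivalent:", missing_values)
```

Byte 167 does not appear in the result, so the character from this thread converts cleanly; box-drawing characters such as byte 176 do appear, and those would need a deliberate substitution policy.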

October 4, 2017

OK, I think I have this. The file being loaded is in the OEM code page (850). When reading it, the special characters must be converted to ANSI (1252) using PC_WIN_OEM_TO_CHAR (the opposite of the routine shown above). Then the value will be correct when put into the database. It also appears correct when displayed in a text file. The command window was right all along, because it displays in OEM format anyway.
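
The effect of that OEM-to-ANSI step on one of the records from this thread can be sketched in Python. This is only an illustration of the byte-level result, assuming code pages 850 and 1252; it is not the PC_WIN_OEM_TO_CHAR routine itself, whose Windows mapping may treat unmappable characters differently, and the function name oem_to_ansi is hypothetical.

```python
def oem_to_ansi(record: bytes, oem: str = "cp850", ansi: str = "cp1252") -> bytes:
    """Re-encode a fixed-width record from the OEM to the ANSI code page.

    Raises UnicodeEncodeError if a character has no ANSI equivalent,
    so bad data is noticed instead of being silently altered.
    """
    return record.decode(oem).encode(ansi)


# Record text from the thread, with byte 167 (0xA7) as stored in the OEM file.
oem_record = b"AV. LA FONTANA N\xa7 1155 INT.01"
ansi_record = oem_to_ansi(oem_record)

print(ansi_record)                   # raw bytes after conversion
print(ansi_record.decode("cp1252"))  # how an ANSI consumer reads them
```

Both code pages are single-byte, so the record keeps its length and the field offsets in the rebuilt COBOL record stay aligned.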

Dominique Sacre
Author
Rocketeer
Forum|Forum|8 years ago
October 5, 2017

Yes, sounds like you understand now. The default runtime encoding of OEM would have worked for you if the method you used to insert the data into the database had known that the database wanted ANSI encoding and had done the OEM-to-ANSI conversion for you. Since the method of insertion apparently does not do this, you are correct that you must supply the conversion yourself with PC_WIN_OEM_TO_CHAR.
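Since the original symptom was a field coming back one character short, a whole-record conversion should keep the record width unchanged. The following is a rough Python sketch of that idea, again assuming code pages 850 and 1252. The helper name `oem_to_ansi` is hypothetical, and replacing unmappable characters with "?" is a choice made for this sketch; it is not a statement about how PC_WIN_OEM_TO_CHAR treats such characters.

```python
def oem_to_ansi(record: bytes) -> bytes:
    """Convert a fixed-width record from code page 850 to code page 1252."""
    text = record.decode("cp850")
    # Characters with no cp1252 equivalent (for example box-drawing
    # characters) become "?" so the converted record keeps its width.
    return text.encode("cp1252", errors="replace")


# Part of the address line from the thread, with the OEM byte 0xA7 in it.
record = b"AV. LA FONTANA N\xa7 1155 INT. 01"
converted = oem_to_ansi(record)

# Same number of bytes before and after, so later fields do not shift.
print(len(record), len(converted))
print(converted.decode("cp1252"))
```

Both code pages are single-byte, so every input byte yields exactly one output byte and the fields that follow stay aligned.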
