[Migrated content. Thread originally posted on 18 April 2011]
Hi,
I need to process UTF-8 data. According to the documentation, I have to convert the data (using its code page) to UTF-16 in a national data item and then convert it back to UTF-8 for output, using the functions NATIONAL-OF and DISPLAY-OF, respectively.
For example, if I have to convert data in Russian:
01 Russian-data pic X(10) value 'Russian-text'.
01 UnicodeString pic N(10).
01 UTF-8-String pic X(20).
Move function National-of(Russian-data, 866) to UnicodeString
Move function Display-of(UnicodeString, 01208) to UTF-8-String
...but it doesn't work. What am I doing wrong? What should I do?
Thank you
Try setting the NSYMBOL"NATIONAL" compiler directive.
That does not work.
I will try to explain my problem with another example.
I have a program in a COBOL project with UTF-8 text file encoding. Using the code values from different code pages, I try to display the character 'Ф':
...
01 NATIONAL-Prova pic N(20) usage National.
01 UNICODE-Prova pic X(40).
01 UTF8-Prova pic X(40).
....
*Using the UTF-16 code number
move NX"0424" to NATIONAL-PROVA.
move function Display-of(NATIONAL-Prova,1208) to UTF8-Prova.
Display "UTF8-Prova(1208): ",UTF8-Prova.
*Using the UTF-8 code number
move X'D0A4' to UNICODE-PROVA.
move function national-of(UNICODE-Prova,1208) to NATIONAL-Prova.
move function Display-of(NATIONAL-Prova,1208) to UTF8-Prova.
Display "UTF8-Prova(1208): ",UTF8-Prova.
*Using the 866 code number
move X'94' to UNICODE-PROVA.
move function national-of(UNICODE-PROVA,866) to NATIONAL-PROVA.
move function Display-of(NATIONAL-Prova,1208) to UTF8-Prova.
Display "UTF8-Prova(1208): ",UTF8-Prova.
...
Only in the first and second cases is the character 'Ф' displayed correctly; in the third case, the program displays the character '”'.
The problem is in the statement "move function national-of(UNICODE-PROVA,866) to NATIONAL-PROVA", where the program does not use code page 866; it uses the ANSI code page to convert the string to national.
What should I do to make the program use code page 866 instead of the ANSI code page?
Thank you!
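The symptom described above can be reproduced outside COBOL. The following is a minimal Python sketch, not Visual COBOL code, that decodes the single byte X'94' under two code pages. It assumes that the "ANSI" code page in question is Windows-1252; the thread does not say which ANSI page is active, but the '”' shown in the third case is consistent with that assumption.

```python
# Byte X'94' as it would sit in a PIC X item holding code page 866 text.
raw = bytes([0x94])

# Intended interpretation: code page 866 (DOS Cyrillic).
as_cp866 = raw.decode("cp866")

# Interpretation when the ANSI page is used instead.
# Assumption: ANSI here means Windows-1252.
as_ansi = raw.decode("cp1252")

print(f"cp866  -> U+{ord(as_cp866):04X}")   # U+0424, CYRILLIC CAPITAL LETTER EF
print(f"cp1252 -> U+{ord(as_ansi):04X}")    # U+201D, RIGHT DOUBLE QUOTATION MARK

# UTF-8 encoding of the correctly decoded character.
utf8 = as_cp866.encode("utf-8")
print(utf8.hex().upper())                    # D0A4
```

Under code page 866 the byte maps to U+0424 and encodes to X'D0A4' in UTF-8, which matches the second case in the post; under Windows-1252 the same byte maps to the right double quotation mark.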
The documentation for the NATIONAL-OF intrinsic function (see the NATIONAL-OF docs) clearly states the following regarding the use of a code page as argument-2 to the function:
Must be an integer. Argument-2 identifies the source code page for the conversion.
Currently beyond the default codepage only 1208 is supported.
I believe that this is why it is failing for you.
Hi,
In Visual COBOL, is there another way to convert a string encoded in code page 866 (or in another code page) to a string encoded in Unicode (UTF-16, UTF-8)?
Thank you
You don't say whether you are creating a native project or a .NET managed code project.
The following example shows how this can be done in a .NET managed code program:
       program-id. Program1 as "testcodepage.Program1".
       data division.
       working-storage section.
       01 greektext type Byte[] value x"AFACAE9E".
       01 wsEncoding type Encoding.
       01 codePageValues type Byte[].
       01 unicodeValues string.
       01 b type Byte.
       01 unicodeString string.
       01 enumerator type System.Globalization.TextElementEnumerator.
       01 s string.
       01 i binary-long.
       01 any-key pic x.
       procedure division.
           invoke type System.IO.File::WriteAllBytes("greek.txt", greektext)
           *> Specify the code page to correctly interpret byte values
           set wsEncoding to type Encoding::GetEncoding(737) *>(DOS) Greek code page
           set codePageValues to type System.IO.File::ReadAllBytes("greek.txt")
           *> Same content is now encoded as UTF-16
           set unicodeValues to wsencoding::GetString(codePageValues)
           *> Show that the text content is still intact in Unicode string
           *> (Add a reference to System.Windows.Forms.dll)
           invoke type System.Windows.Forms.MessageBox::Show(unicodeValues)
           *> Same content "ψυχή" is stored as UTF-8
           invoke type System.IO.File::WriteAllText("greek_unicode.txt", unicodeValues)
           *> Conversion is complete. Show the bytes to prove the conversion.
           display "8-bit encoding byte values:"
           perform varying b thru codePageValues
               invoke type Console::Write("{0:X}-", b)
           end-perform
           display " "
           display "Unicode values:"
           set unicodeString to type System.IO.File::ReadAllText("greek_unicode.txt")
           set enumerator to type System.Globalization.StringInfo::GetTextElementEnumerator(unicodeString)
           perform until exit
               if enumerator::MoveNext()
                   set s to enumerator::GetTextElement()
                   set i to type Char::ConvertToUtf32(s, 0)
                   invoke type Console::Write("{0:X}-", i)
               else
                   exit perform
               end-if
           end-perform
           display " "
           display "Press any key to exit."
           accept any-key
           goback.
       end program Program1.
Do you have an example for a native project or for a COBOL JVM project?
I'm working with Visual COBOL for Eclipse.
Thank you.