[Migrated content. Thread originally posted on 03 November 2011]
Hello, I am trying to convert a Chinese string encoded in UTF-8 to Unicode (UTF-16) so that I can store it in a PIC N item.
In order to achieve that, I am working with a .NET managed code project. Here is my program:
program-id. Program1 as "testcodepage.Program1".
data division.
working-storage section.
* 01 greektext type Byte[] value x"AFACAE9E".
01 chinesetext type Byte[] value x"E58F91E7A5A8".
01 wsEncoding type Encoding.
01 codePageValues type Byte[].
01 unicodeValues string.
01 b type Byte.
01 unicodeString string.
01 enumerator type System.Globalization.TextElementEnumerator.
01 s string.
01 i binary-long.
01 any-key pic x.
01 final-string pic n(2).
01 ind pic 9.
procedure division.
invoke type System.IO.File::WriteAllBytes("chinese.txt", chinesetext)
*> Specify the code page to correctly interpret byte values
* set wsEncoding to type Encoding::GetEncoding(737) *>(DOS) Greek code page
set wsEncoding to type Encoding::UTF8
set codePageValues to type System.IO.File::ReadAllBytes("chinese.txt")
*> Same content is now encoded as UTF-16
set unicodeValues to wsencoding::GetString(codePageValues)
*> Show that the text content is still intact in Unicode string
*> (Add a reference to System.Windows.Forms.dll)
invoke type System.Windows.Forms.MessageBox::Show(unicodeValues)
*> Same content is written back to disk as UTF-8 (the WriteAllText default)
invoke type System.IO.File::WriteAllText("chinese_unicode.txt", unicodeValues)
*> Conversion is complete. Show the bytes to prove the conversion.
display "8-bit encoding byte values:"
perform varying b thru codePageValues
invoke type Console::Write("{0:X}-", b)
end-perform
display " "
display "Unicode values:"
set unicodeString to type System.IO.File::ReadAllText("chinese_unicode.txt")
set enumerator to type System.Globalization.StringInfo::GetTextElementEnumerator(unicodeString)
move 1 to ind
perform until exit
if enumerator::MoveNext()
set s to enumerator::GetTextElement()
set i to type Char::ConvertToUtf32(s, 0)
set final-string(ind:1) to i *> How to perform the conversion to PIC(N)????
add 1 to ind
invoke type Console::Write("{0:X}-", i)
else
exit perform
end-if
end-perform
*> Show the chinese string converted
invoke type System.Windows.Forms.MessageBox::Show(final-string)
display " "
display "Press any key to exit."
accept any-key
goback.
end program Program1.

I hope somebody can help me with this problem.
Thank you
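
One possible direction, offered as an untested sketch rather than a confirmed answer: a .NET string is already UTF-16, so once GetString has decoded the UTF-8 bytes it may be enough to assign the whole string to the PIC N item, instead of copying code points one position at a time. The program name Utf8ToNational is hypothetical, and the implicit string-to-national conversion on SET is an assumption that should be checked against the compiler documentation.

```cobol
       program-id. Utf8ToNational as "testcodepage.Utf8ToNational".
       data division.
       working-storage section.
       *> Same six UTF-8 bytes as in the program above
       01 chinesetext   type Byte[] value x"E58F91E7A5A8".
       01 unicodeValues string.
       01 final-string  pic n(2).
       procedure division.
           *> Decode the UTF-8 bytes into a .NET string (UTF-16 internally)
           set unicodeValues to
               type System.Text.Encoding::UTF8::GetString(chinesetext)
           *> Assumption: SET converts a string to a national item implicitly
           set final-string to unicodeValues
           *> (Add a reference to System.Windows.Forms.dll)
           invoke type System.Windows.Forms.MessageBox::Show(final-string)
           goback.
       end program Utf8ToNational.
```

The six bytes decode to the two code points U+53D1 and U+7968. Both are in the Basic Multilingual Plane, so each needs a single UTF-16 code unit and the pair fits in pic n(2). A character outside that plane needs two UTF-16 code units, so it would not fit in a single PIC N position.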




