Problem:
What is UTF-8?
Does Server Express support the UTF-8 or UTF-16 format?
Resolution:
What is UTF-8?
UTF-8 stands for Unicode Transformation Format, 8-bit. The "Uni" in "Unicode" comes from "Universal": the standard is not only for the English language. Unicode text can be stored in several encoding formats. UTF-8 is the one used by default by XML, because it can represent every Unicode character without increasing the size of the file too much. It is a variable-length encoding: the number of bytes used to represent a character depends on where in the Unicode table the character sits.
The "8" in UTF-8 can be misleading: it suggests that every character is represented with 8 bits, but that is not the case. The 8 refers to the 8-bit units (octets) from which each encoded character is built.
UTF-8 uses a single byte for characters in the ASCII range (only the low 7 bits are used), which makes it compatible with ASCII: each character in the range U+0000 through U+007F is represented as a single octet with the same value it has in ASCII. Of the standard Unicode encoding forms (UTF-8, UTF-16 and UTF-32), it is the only one that is compatible with ASCII, and that is its main advantage.
UTF-8 is variable length, however, and uses more than one octet when the character falls outside that range. Programs that read a UTF-8 file should be prepared to read up to 4 octets per character (the original definition allowed sequences of up to 6 octets, but the current standard, RFC 3629, limits them to 4). Text that contains such multi-octet characters is no longer plain ASCII, and a tool that understands only ASCII will not interpret it correctly.
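The variable-length behaviour can be checked with a short script. The following is a minimal sketch in Python (chosen only for illustration; it is unrelated to Server Express), and the helper names utf8_length and is_ascii_readable are hypothetical.

```python
def utf8_length(ch: str) -> int:
    """Return the number of octets UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))


def is_ascii_readable(data: bytes) -> bool:
    """Return True if an ASCII-only tool could decode these bytes."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


# Sample characters from different parts of the Unicode table.
samples = [
    "A",           # U+0041, inside the ASCII range: 1 octet
    "\u00e9",      # U+00E9, e with acute accent: 2 octets
    "\u20ac",      # U+20AC, euro sign: 3 octets
    "\U0001F600",  # U+1F600, an emoji: 4 octets
]

for ch in samples:
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} octet(s)")

# ASCII-only text produces identical bytes in ASCII and in UTF-8.
assert "Hello".encode("ascii") == "Hello".encode("utf-8")

# Text with a character outside U+0000..U+007F is not valid ASCII.
print(is_ascii_readable("Hello".encode("utf-8")))      # True
print(is_ascii_readable("caf\u00e9".encode("utf-8")))  # False
```

The last two lines show the compatibility limit described above: the same UTF-8 bytes are readable as ASCII only while every character stays within the 7-bit range.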
Finally, if you ever need to convert an ASCII text file into a UTF-8 file, you can use Notepad in recent versions of Windows: open the ASCII-encoded file and save it as UTF-8.
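Where Notepad is not available, or many files need converting, the same conversion can be scripted. This is a minimal sketch in Python, not a Micro Focus utility; the function name convert_to_utf8 and the file names in the usage line are hypothetical, and the source encoding must be supplied by the caller because it cannot be detected reliably from the file itself.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, source_encoding: str = "ascii") -> None:
    """Re-encode the file at src as UTF-8 and write the result to dst.

    The file is handled as raw bytes so that line endings are preserved.
    Raises UnicodeDecodeError if src is not valid in source_encoding.
    """
    raw = Path(src).read_bytes()
    text = raw.decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical file names, shown only as a usage example.
    if Path("input.txt").exists():
        convert_to_utf8("input.txt", "output-utf8.txt")
```

A file that contains only ASCII characters comes out byte-for-byte identical. If the file holds accented characters in a Windows code page, pass that code page instead (for example source_encoding="cp1252", an assumption that depends on how the file was created).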
For more information on UTF-8 and the Unicode Standard, please visit
http://www.utf-8.com
Does Server Express support UTF-8 or UTF-16?
Micro Focus Server Express v4.0 SP1 does support UTF-8 in a PIC X field. You do not have to do anything special; that is, you do not have to specify any directives. The character is held in one byte.
Server Express v4.0 SP1 also supports UTF-16 in a PIC N field.
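When deciding how much storage a piece of text needs in each encoding, it can help to compare the encoded sizes directly. The sketch below uses Python and describes only the encodings themselves, not how Server Express lays out PIC X or PIC N fields; the function name encoded_sizes is hypothetical.

```python
def encoded_sizes(text: str) -> dict:
    """Return the size of text in characters, UTF-8 bytes and UTF-16 bytes."""
    return {
        "characters": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        # "utf-16-be" is used so that no byte-order mark is counted.
        "utf16_bytes": len(text.encode("utf-16-be")),
    }


print(encoded_sizes("ABC"))
# {'characters': 3, 'utf8_bytes': 3, 'utf16_bytes': 6}

print(encoded_sizes("caf\u00e9"))
# {'characters': 4, 'utf8_bytes': 5, 'utf16_bytes': 8}
```

ASCII text takes one byte per character in UTF-8 and two in UTF-16, while a character such as the accented e (U+00E9) takes two bytes in either encoding.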



