Uniface User Forum

  • 1.  How to deal with special character Ć (C accent aigu) in my app

    Posted 05-17-2021 07:07

    Hi,

    I need to add a special character to the set of special characters used in my app: Ć (C with acute accent, "C accent aigu"). Unlike the other special characters used in my app, it is not in the ASCII table. Inserting Ć with the Windows Character Map, for example into the XML report file that the app creates, causes trouble. It looks like this:


    How can I handle this special character in my app?



  • 2.  RE: How to deal with special character Ć (C accent aigu) in my app

    Posted 05-17-2021 07:52

    Hi Jozef

    Short explanation:
    "UTF-8" and "UTF-16" are both compact ("ZIP"-like) encodings for Unicode, as they try to use as few bytes per character as possible.
    Both of them can handle all Unicode characters, so they do not restrict Unicode to a codepage.
    "CP1250", on the other hand, is a codepage of 256 characters (suitable for Central Europe).
    "ASCII" is a codepage with only 96 characters (plus 32 controls) (suitable for US/UK/...).

    Your character (if I interpret the glyph correctly) is "LATIN CAPITAL LETTER C WITH ACUTE", Unicode U+0106.
    It is in codepage CP1250 at position 0xC6, so you can also use it within "Central European" text.
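
    These values can be double-checked outside UnifAce. The following is only a minimal sketch in Python (chosen here for illustration; it is not part of the original post), using the standard `unicodedata` module and the built-in `cp1250` and `utf-8` codecs:

```python
import unicodedata

ch = "\u0106"  # the character discussed in this thread

name = unicodedata.name(ch)          # official Unicode character name
cp1250_bytes = ch.encode("cp1250")   # one byte in the Central European codepage
utf8_bytes = ch.encode("utf-8")      # two bytes in UTF-8

# ASCII has no slot for this character, so encoding to ASCII fails.
try:
    ch.encode("ascii")
    in_ascii = True
except UnicodeEncodeError:
    in_ascii = False

print(name)                # LATIN CAPITAL LETTER C WITH ACUTE
print(cp1250_bytes.hex())  # c6
print(utf8_bytes.hex())    # c486
print(in_ascii)            # False
```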

    UnifAce is Unicode-enabled, so you can use any Unicode character inside UnifAce.
    The problems are the input/output interfaces, as they are often only capable of handling a limited range of characters (codepages).
    If you create the XML yourself and then write it with lfiledump, you can set the encoding, e.g. UTF-8.
    But always write the encoding into the XML declaration:
    <?xml version="1.0" encoding="utf-8"?>
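
    To illustrate the principle (declared encoding and actual bytes must agree), here is a small sketch in Python rather than in UnifAce Proc. The element names `report` and `name` are made up for the example; `xml.etree.ElementTree` is the standard-library XML module:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical report structure, only for illustration.
root = ET.Element("report")
ET.SubElement(root, "name").text = "\u0106"

# Write to an in-memory buffer; encoding and declaration are set together,
# so the bytes on "disk" match what the declaration promises.
buf = io.BytesIO()
ET.ElementTree(root).write(buf, encoding="utf-8", xml_declaration=True)
data = buf.getvalue()

# A parser that honours the declaration gets the original character back.
round_trip = ET.fromstring(data).find("name").text

print(data)
print(round_trip == "\u0106")
```

    The same idea applies whatever tool writes the file: the bytes for Ć must be 0xC4 0x86 if the declaration says UTF-8.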

    If you see a few cryptic glyphs in the XML, this could be due to three things:

    1. You use the wrong Unicode code point
    2. You don't write the encoding, so the other program doesn't know how to interpret the bytes
    3. The other program ignores any encoding

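    To make a character readable for a text editor that has no knowledge of the encoding, you can use XML character references:

    So U+0106 becomes "&#x0106;". The named form "&Cacute;" is more readable, but it is an HTML entity and only works in XML if the DTD declares it.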
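
    The difference between the numeric and the named form can be tried out with a plain XML parser. This is a sketch in Python using only the standard library; the `<name>` element is a made-up example:

```python
import html
import xml.etree.ElementTree as ET

ch = "\u0106"
numeric_ref = "&#x{:04X};".format(ord(ch))  # builds "&#x0106;"

# A numeric character reference is pure ASCII and understood by any XML parser.
parsed = ET.fromstring("<name>{}</name>".format(numeric_ref)).text

# "&Cacute;" is not one of XML's predefined entities, so a plain XML parser
# rejects it unless a DTD declares it.
try:
    ET.fromstring("<name>&Cacute;</name>")
    named_ok_in_xml = True
except ET.ParseError:
    named_ok_in_xml = False

# In HTML the named entity is defined.
html_value = html.unescape("&Cacute;")

print(numeric_ref, parsed == ch, named_ok_in_xml, html_value == ch)
```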

    Ingo



  • 3.  RE: How to deal with special character Ć (C accent aigu) in my app

    Posted 05-18-2021 09:26

    Hi Ingo, thanks for your answer. I now see another issue, though. When I enter this C with acute accent in the app, it is displayed correctly on screen, but after I save it, it is stored in the DB as the weird character Æ. How can I fix this so that it is also stored as C with acute accent in the DB?



  • 4.  RE: How to deal with special character Ć (C accent aigu) in my app

    Posted 05-21-2021 07:54

    Hi Jozef,

    AFAIK Ingo's description was complete.

    Could it be that:
    1) you are saving a Unicode character in a field that is NOT enabled for it?
      or
    2) you are looking at the saved field in the DB with a tool that is not properly Unicode-enabled?

    Uniface can save the full Unicode character set when:
    a) your (R)DBMS is fully Unicode-enabled as its default charset
    b) your (R)DBMS is NOT fully Unicode-enabled as its default charset, but each single field can be specifically enabled/mapped. Uniface acts this way when a field is mapped to a W packing code (Wx, VWx, W*).

    I am currently using option b) a lot on Oracle 12.2 with success.

    Hope it helps.

    Gianni



  • 5.  RE: How to deal with special character Ć (C accent aigu) in my app

    Posted 05-18-2021 10:49

    Hi Jozef

    Is the character still displayed correctly after storing it and retrieving it back to the surface?

    If so, there is no need to worry about the strange glyphs in the DB 🙂

    If the character/glyph has changed after readback, then it is an issue with the database/database driver.

    Which interface type does this field have in UnifAce (C*, W*, R*, ...)?
    Which DBMS are you using? "MSS" or ...?
    Which COLLATION (MS-SQL) are you using?


    An idea to check whether the character is correct despite the strange glyphs:
    Use an SQL statement like this:
    SELECT CAST(my_field as varbinary(100)) FROM my_table
    Then check the hex code(s) for the letter.
    If it is 0xC486, then the Unicode code point 0x0106 has been correctly transformed to UTF-8 and your UnifAce driver uses UTF-8.

    Check whether the COLLATION ends with "_UTF8".
    If not, everything is fine, as the UTF-8 byte sequence is treated as a normal "ASCII"/"CPnnnn" sequence.
    If the COLLATION is a UTF8 one, you have to blame Microsoft 🙂

    If it is neither 0xC486 nor 0xC6, then your UnifAce driver is using some strange codepage (neither UTF-8 nor CP1250).
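
    The hex check above can be written down as a small helper. This is only a sketch in Python: the function name `classify_stored_bytes` is made up, and the UTF-16LE candidate is an addition that is not in the post (it is how Ć would look in an NVARCHAR-style column):

```python
TARGET = "\u0106"  # LATIN CAPITAL LETTER C WITH ACUTE

# Encodings to compare against; "utf-16-le" is an extra candidate (assumption).
CANDIDATES = ("utf-8", "cp1250", "utf-16-le")


def classify_stored_bytes(hex_value):
    """Guess which encoding produced the stored bytes for the single letter Ć.

    hex_value is the hex string shown by the varbinary CAST, e.g. "0xC486".
    """
    text = hex_value.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    raw = bytes.fromhex(text)
    for codec in CANDIDATES:
        if raw == TARGET.encode(codec):
            return codec
    return "unknown"


for sample in ("0xC486", "0xC6", "0x0601", "0x3F"):
    print(sample, "->", classify_stored_bytes(sample))
```

    A result of "unknown" (for example 0x3F, a plain question mark) would suggest that the character was already lost or converted before it reached the column.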

    Ingo

    PS:
    Character: One distinct letter/sign/..., regardless of how it is displayed or stored
    Glyph: The visual representation of a character
    Byte sequence: The representation of a character/code point/something else in bits and bytes
    Codepage: A selection of characters together with their byte sequences
    Codepoint: A character or some special things in Unicode
    UTF:  UCS Transformation Format      (Not necessarily Unicode!)
    UCS: Universal Coded Character Set  (Not necessarily Unicode!)

    UTF-8 on Unicode:
    All code points <= 0x007F are converted to one byte which is 1:1 ASCII. So in most cases, the length of a document doesn't change, or only very little. Code points from 0x0080 upward take two bytes or more.
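
    The byte lengths are easy to verify. A short Python sketch (the sample characters are arbitrary picks for illustration):

```python
# One sample per UTF-8 length class, plus the character from this thread.
samples = ["A", "\u007f", "\u0080", "\u0106", "\u20ac", "\U0001F642"]

# Map "U+XXXX" -> number of bytes the code point needs in UTF-8.
lengths = {"U+{:04X}".format(ord(c)): len(c.encode("utf-8")) for c in samples}

# Pure ASCII text is byte-for-byte identical in ASCII and UTF-8.
ascii_text = "plain report text"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

for code_point, size in lengths.items():
    print(code_point, size)
print(same_bytes)
```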

    All code points can be transformed into a byte sequence by e.g. UTF-8 or UTF-16.
    Not all code points can be displayed. This depends (amongst other things) on the fonts installed.
    A codepage (normally) holds (32+)96+128 characters:
       32 control characters
       96 characters close to the ASCII default
     128 extended characters
    Far from all code points can be mapped to a codepage.
    When only byte sequences are passed from one application to another, it is not clear which transformation/codepage/code unit is behind the bytes, so this can lead to very strange glyphs 🙂
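
    How the same bytes turn into different glyphs can be shown with a few lines of Python. This is a sketch, not a diagnosis of any particular setup; the helper name `views` and the choice of the three codecs are assumptions made for the example:

```python
def views(raw):
    """Decode the same bytes under several assumptions about their encoding."""
    codecs = ("utf-8", "cp1250", "cp1252")
    # errors="replace" yields U+FFFD instead of raising on invalid input.
    return {codec: raw.decode(codec, errors="replace") for codec in codecs}


single = views(b"\xc6")       # Ć as written by a CP1250 producer
double = views(b"\xc4\x86")   # Ć as written by a UTF-8 producer

print(single)
print(double)
```

    Read as CP1250, the byte 0xC6 is Ć; read as CP1252 (Western European), it is Æ, which would be consistent with the Æ reported earlier in the thread. The UTF-8 pair 0xC4 0x86 shows up as "Ä†" in a tool that assumes either codepage.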