Or better: internally, a character holds the meta-font number and the number of the character within that meta-font,
plus the attributes (if I'm not completely wrong about UnifAce).
This strange coding is due to the fact that UnifAce has existed longer than Unicode and had to deal with non-ASCII characters long before Unicode came along.
$tometa((192+v_FONT_NBR)*256+CHAR_NBR)
or
$concat($tometa(192+v_FONT_NBR),$tometa(CHAR_NBR))
So if you need to write the Euro character, the statement should look like this:
$tometa(50558)
or
$concat($tometa(197),$tometa(126))
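The arithmetic behind those two forms can be checked outside of UnifAce. The following is only a Python sketch of the formula quoted above, not Proc code; the helper names `meta_code` and `split_meta_code` are made up for illustration, and font number 5 for the Euro is simply what falls out of 197 = 192 + 5.

```python
META_BASE = 192  # offset added to the font number, as in the formula above


def meta_code(font_nbr: int, char_nbr: int) -> int:
    """Combine font and character number into the single value passed to $tometa."""
    return (META_BASE + font_nbr) * 256 + char_nbr


def split_meta_code(code: int) -> tuple[int, int]:
    """Inverse operation: recover (font_nbr, char_nbr) from a combined value."""
    lead, char_nbr = divmod(code, 256)
    return lead - META_BASE, char_nbr


print(meta_code(5, 126))       # 50558
print(split_meta_code(50558))  # (5, 126)
```

So the one-argument and the two-argument ($concat) forms describe the same pair of values: 197 as the leading value and 126 as the character number.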
Original Message:
Sent: 09-17-2024 09:31
From: Larry Adkins
Subject: Convert from CP-1252 to UTF-8
Hi Ingo,
This is interesting stuff....makes me glad I don't have to deal with extended character sets on a regular basis. I believe I can use some of what you have in that component to come up with a character translator that will fit my needs. In the end, I'll still have to decide on what to do with characters that don't land within characters 0 - 127. There's also the possibility that I will choose to save the data to be merged in Word to a plain text file letting Word do the conversion for me then take that result into the final merge process. My comfort level with Word COM calls is higher than dealing with converting individual characters.
I do appreciate the suggestion and if I choose to use something I gleaned from your code, I will post my solution here.
------------------------------
Larry Adkins
Proware
Cincinnati OH US
Original Message:
Sent: 09-17-2024 03:37
From: Ingo Stiller
Subject: Convert from CP-1252 to UTF-8
Hi Larry
First of all, UTF-8 is actually not a code page but only a kind of ZIP procedure (a variable-length encoding) for reducing the size of Unicode strings, optimized for the characters 0-127 (good old ASCII) :-)
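To make the "optimized for 0-127" point concrete, here is a small Python sketch (not UnifAce code; the helper name `utf8_bytes` is just for illustration). It shows that an ASCII character takes one byte in UTF-8 while characters outside that range take more.

```python
def utf8_bytes(ch: str) -> bytes:
    """Return the UTF-8 byte sequence for a single character."""
    return ch.encode("utf-8")


for ch in ["A", "\u00e9", "\u20ac"]:  # A, e-acute, Euro sign
    encoded = utf8_bytes(ch)
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex(' ')}")

# U+0041 -> 1 byte(s): 41
# U+00E9 -> 2 byte(s): c3 a9
# U+20AC -> 3 byte(s): e2 82 ac
```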
The "meta" codepages are a specialty of UnifAce, where font 0 again corresponds to ASCII (7-bit).
A long time ago I played around with codepages in UnifAce and programmed a few forms that display the fonts/codepages and also offer a few other operations. Compile and call PS0891Z.
As I said, it's very old, but at least it's better than nothing :-)
Ingo
------------------------------
Ingo Stiller
Aareon Deutschland GmbH
Original Message:
Sent: 09-13-2024 13:14
From: Larry Adkins
Subject: Convert from CP-1252 to UTF-8
I was wondering if anyone else had run into this before. If someone copies characters from a Word document and then pastes them into an editbox, sometimes Uniface returns an error message or stores them as some weird character. I have seen this countless times over the decades and have always just handled it with $format and $replace in the formatFromDisplay trigger. Has anyone else run into the same type of thing and, if so, did you come up with a way to convert any CP-1252 character to UTF-8?
Here's a quick example of how I currently resolve this on a case-by-case basis. These are the "smart" quotes that Word puts in automatically while you type. I just replace them with the standard ASCII quotes.
trigger formatFromDisplay
$format = $replace($replace($format, 1, "“", $tometa(34), -1), 1, "”", $tometa(34), -1)
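For comparison, the same two ideas (converting CP-1252 bytes to UTF-8, and replacing smart quotes with ASCII quotes) can be sketched in Python. This is not Uniface Proc and says nothing about how the formatFromDisplay trigger receives its data; the function names are hypothetical, and the single-quote mappings are an addition beyond the double quotes handled above.

```python
# Smart quotes as Unicode code points, mapped to their plain ASCII equivalents.
SMART_TO_ASCII = str.maketrans({
    "\u201c": '"',  # left double quotation mark (0x93 in CP-1252)
    "\u201d": '"',  # right double quotation mark (0x94 in CP-1252)
    "\u2018": "'",  # left single quotation mark (0x91 in CP-1252), added here
    "\u2019": "'",  # right single quotation mark (0x92 in CP-1252), added here
})


def cp1252_to_utf8(raw: bytes) -> bytes:
    """Re-encode CP-1252 bytes as UTF-8 without changing any characters."""
    return raw.decode("cp1252").encode("utf-8")


def straighten_quotes(text: str) -> str:
    """Replace smart quotes with ASCII quotes, leaving everything else alone."""
    return text.translate(SMART_TO_ASCII)


raw = b"\x93Hello\x94"  # "Hello" wrapped in smart quotes, as CP-1252 bytes
print(cp1252_to_utf8(raw))                      # b'\xe2\x80\x9cHello\xe2\x80\x9d'
print(straighten_quotes(raw.decode("cp1252")))  # "Hello"
```

The first function keeps the smart quotes and only changes their byte representation; the second does what the trigger above does and throws the distinction away. Which one fits depends on whether the target can store the characters at all.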
Here's a similar approach in c# (link below). It's trying to convert the contents of a file whereas I would be happy with converting the contents of a single field:
Steve McGill - Ramblings about Sitecore, EXM, and more
If anyone has any ideas on this, I'd love to hear them.
Best,
Larry
------------------------------
Larry Adkins
Proware
Cincinnati OH US
------------------------------