------------------------------
Michael Rösch
Abrechnungszentrum Emmendingen
------------------------------
Hello Michaël,
In our software we interfaced the iconv library, which is the best tool we have found for converting between character sets, including replacing unsupported characters.
Note: It also exists as a Unix executable.
Bertrand Daene
CGM Lab molis
Hello Bertrand,
thank you very much for your reply. I did some testing with iconv and the //TRANSLIT option on Solaris.
But unfortunately nothing I tested met my needs. :-(
During my research I found the page https://www.baeldung.com/linux/utf-8-ascii-conversion. From there I tried the following call:
$ iconv -f UTF-8 -t ASCII//TRANSLIT input_utf8.txt -o output_ascii.txt
According to that page, the output is ASCII-encoded and contains the transliterated characters, for example every 'ç' is changed to 'c'.
My results with my own test file were different. I tried, for example:
Salihamidžić
Brebrić
and got
Salihamidzic'
Brebric'
€ and £ were replaced with EUR and GBP. This could cause a "string too long" error.
Do you know whether it's possible to use iconv with custom transliteration rules?
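For comparison, here is a minimal sketch of a length-preserving conversion done outside iconv, assuming Python is available on the system. It relies on Unicode decomposition from the standard unicodedata module, not on iconv's transliteration tables. The function name strip_diacritics and the '?' fallback are assumptions made for illustration only.

```python
import unicodedata


def strip_diacritics(text: str, fallback: str = "?") -> str:
    """Return an ASCII string with exactly one character per input character.

    The input is first normalized to NFC so that a letter plus a separate
    combining mark counts as a single character where a composed form exists.
    """
    out = []
    for ch in unicodedata.normalize("NFC", text):
        # NFD splits e.g. 'ž' into 'z' followed by a combining caron;
        # keeping only the first code point drops the accent.
        base = unicodedata.normalize("NFD", ch)[0]
        # Characters without an ASCII base letter (such as the euro sign)
        # become the single-character fallback, so the length never grows.
        out.append(base if ord(base) < 128 else fallback)
    return "".join(out)


if __name__ == "__main__":
    for name in ("Salihamidžić", "Brebrić", "5 € and 3 £"):
        print(name, "->", strip_diacritics(name))
```

With this approach 'ž' and 'ć' lose their accents without gaining an apostrophe, while € and £ each turn into one fallback character instead of EUR and GBP. Letters that have no Unicode decomposition, such as 'đ', also fall back to '?', so a custom rule table may still be needed for those.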
Best regards
------------------------------
Michael Rösch
Abrechnungszentrum Emmendingen
------------------------------
Hello Michaël,
We mainly used it to convert between UTF-8 and LATIN1 (= ISO-8859-1), which involves less of this transliteration expansion. In fact, we use our own prefilter to eliminate such unwanted characters. But transliteration often means extending the length of the string. I have no solution if you have to preserve the length.
Note that transliteration also depends on the LANG setting (or setlocale in C sources): the transliteration of ö is different between de_DE and fr_FR.
On our RedHat Linux there is no possibility to define our own transliteration rules. But on some systems it is possible. See for example geniconvtbl - man pages section 1: User Commands (oracle.com). I don't know about Solaris.
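The prefilter idea mentioned above could be sketched roughly as follows. This is a hypothetical Python illustration, not the prefilter actually used in our software: the RULES table, the function name prefilter and the '?' fallback are all assumptions. Only characters that the target character set cannot encode are touched, and every rule maps to exactly one character, so the string length stays the same.

```python
# Hypothetical one-to-one replacement rules; extend as needed.
RULES = {
    "ž": "z", "ć": "c", "č": "c", "š": "s", "đ": "d",
    "ö": "o", "€": "E", "£": "L",
}

# Every replacement must be a single character to preserve the length.
assert all(len(value) == 1 for value in RULES.values())


def prefilter(text: str, encoding: str = "iso-8859-1") -> str:
    """Replace characters that `encoding` cannot represent.

    Characters the target encoding supports are kept unchanged; all others
    are looked up in RULES and fall back to '?' if no rule exists.
    """
    out = []
    for ch in text:
        try:
            ch.encode(encoding)
            out.append(ch)
        except UnicodeEncodeError:
            out.append(RULES.get(ch, "?"))
    return "".join(out)


if __name__ == "__main__":
    print(prefilter("Salihamidžić"))        # target ISO-8859-1
    print(prefilter("Rösch 5€ £", "ascii"))  # target ASCII
```

The filtered string can then be passed to iconv (or encoded directly) without triggering any transliteration, because nothing unsupported is left in it.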
Best regards
Hello Bertrand,
thank you very much for your hints. We have already done some investigation into it. It looks promising.
Best regards and have a nice weekend