I'm trying to import email text into the database. For some incoming emails I'm getting the above error when the text data is stored.
Is there an intelligent way to find out which characters are being objected to, so I can then decide whether to translate or ditch them?


------------------------------
Iain Sharp
Head of Technical Services
Pci Systems Ltd
Sheffield GB
------------------------------
How is the field in question defined? The error 0131 is usually thrown when a character is not part of the allowed character set. E.g. when the field interface is C then you can only specify the characters that are part of the system character set (e.g. Windows-1252). If you use the interface W then the complete Unicode range should be possible. But you could (e.g.) limit this with the field syntax property Characters Allowed.

Scanning for "offending" characters might be a bit more tricky. You could try to save the email text with filedump using $DEF_CHARSET and then load it again. If one or more characters fall outside the character set, they will be "translated". A comparison of the original and "translated" text should then (hopefully) give you a clue. But right now I cannot think of a generic function that you could use here.
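
To illustrate the dump-and-compare idea outside Uniface, here is a minimal Python sketch. It assumes the target character set is Windows-1252 (Python codec name `cp1252`) and it imitates the "translation" step by encoding with replacement. The function name `find_unsupported` is hypothetical, and this is not how filedump itself works internally.

```python
def find_unsupported(text, charset="cp1252"):
    """Return (index, character) pairs that do not survive a round trip through charset."""
    # errors="replace" substitutes "?" for every character the charset lacks,
    # which stands in for a dump/load that "translates" unsupported characters.
    round_tripped = text.encode(charset, errors="replace").decode(charset)
    # Each unsupported character becomes exactly one "?", so positions line up.
    return [
        (index, original)
        for index, (original, translated) in enumerate(zip(text, round_tripped))
        if original != translated
    ]


if __name__ == "__main__":
    for index, ch in find_unsupported("Hi 🙂 • €"):
        print(index, f"U+{ord(ch):04X}", ch)
```

The result lists the position and code point of each character that was changed by the round trip, which is the comparison described above.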

I hope this helps.

------------------------------
Daniel Iseli
Principal Technical Support Engineer
Uniface Services
Rocket Software, Switzerland
------------------------------
It's a text field (c*), dating from before W existed (worse, it's not even an SC*, so there are probably overflow fields in the database).
It should (ideally) be converted to W* (SW*?) and mapped to nvarchar(max), but that will take a bit of work to re-assimilate all the data.
I'll look at the filedump. 
It's a shame there's no way to programmatically check for errors before attempting the store. As with most automation, default errors are a terrible way to trap problems and handle them in proc.

------------------------------
Iain Sharp
Head of Technical Services
Pci Systems Ltd
Sheffield GB
------------------------------
If you execute validatefield (or validate/e) before the store then this should also throw an error and $dataerrorcontext should then return the exact error.

------------------------------
Daniel Iseli
Principal Technical Support Engineer
Uniface Services
Rocket Software, Switzerland
------------------------------
Indeed, but (unless the filedump works), there's then no way to 'follow up' on this message, so it's as easy to find out at store (it's a 10 line proc). 

I suppose one could write a (fairly complex) loop where you store the data, then add it back to the field one character at a time, running the validate each time and dropping the character if the validate fails...
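
The loop described above can be sketched in Python to show its shape. Everything here is an assumption for illustration: `keep_valid_characters` is a hypothetical helper, and `fits_cp1252` is only a stand-in for whatever validation the real field performs (in Uniface that would be the validate call, which is not modelled here).

```python
def keep_valid_characters(text, is_valid):
    """Rebuild text one character at a time, dropping characters that fail is_valid."""
    kept = []
    dropped = []
    for ch in text:
        # Validate the text accepted so far plus the next candidate character.
        candidate = "".join(kept) + ch
        if is_valid(candidate):
            kept.append(ch)
        else:
            dropped.append(ch)
    return "".join(kept), dropped


def fits_cp1252(value):
    """Hypothetical validator: accept only text encodable in Windows-1252."""
    try:
        value.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    cleaned, dropped = keep_valid_characters("ok 🙂 then", fits_cp1252)
    print(repr(cleaned), dropped)
```

Revalidating the whole string on every step is quadratic in the text length, which is acceptable for an email body but worth keeping in mind for large texts.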

------------------------------
Iain Sharp
Head of Technical Services
Pci Systems Ltd
Sheffield GB
------------------------------
@Daniel Iseli

I have now come to set this up as it will require manual massage of the database entries on release. 
I have redefined the field as SW*, which has changed the default script generation to nvarchar(max) in the database, so I have altered the field in the database.
I am trying to insert a piece of text containing a 'smiley' (🙂), which shows on the editbox on the UI. However, it is still failing with 0131 come time to store in the database.
I have changed $SYS_CHARSET and $DEF_CHARSET to UTF-8 and set the field syntax on the field to FUL (and MUL), but nothing appears to allow me to store this character.
Am I missing something obvious?
Regards,
Iain

------------------------------
Iain Sharp
Head of Technical Services
Pci Systems Ltd
Sheffield GB
------------------------------

I have no solution, but just throwing this out.

Is your problem the euro sign? It is often a problem, depending on how the validation is done. The euro sign should be available in e.g. ISO-8859-15 (also known as Latin-9).

Is your problem the BOM-character?

Regards RogerW.



------------------------------
Roger Wallin
Abilita Oy
------------------------------
I have tried the lfiledump/lfileload option. If I use "UTF-8" as the dump type, it saves and loads the smiley face, and then fails the validation again.
If I remove the UTF-8, it strips the smiley face and passes the validation.
This seems to indicate that I have not defined the data field to take Unicode characters properly. What am I missing?
Regards, 
Iain

------------------------------
Iain Sharp
Head of Technical Services
Pci Systems Ltd
Sheffield GB
------------------------------

I am trying to support customers' incoming emails. I'm not sure why some of them fail, but in several that I have seen it's been the smiley face character (or, in other cases, the 'dots' caused by copying a dotted list from Outlook into Uniface).
Experiments all this morning have convinced me that setting the datatype to SW* (or W*) does not, in itself, provide support for the full sweep of Unicode characters.

------------------------------
Iain Sharp
Head of Technical Services
Pci Systems Ltd
Sheffield GB
------------------------------

Do you have working Unicode tables, or is that new for you in Uniface?
This is from an outdated application of ours that used Unicode:

Table Interface    "U*W(U)"

Field interface  "VW1000"

Field Syntax  "YCR,LEN(0-1000)"



------------------------------
Roger Wallin
Abilita Oy
------------------------------
The character needs to be part of the font you are using for the field in question. The default fonts of Windows do not include all the available characters. There are some Unicode fonts available that have more characters. For more info see:
On my system I have found the font "Lucida Sans Unicode" that includes the following smiley character: ☺ (U+263A; UTF-8 hex: E2 98 BA). When I try to store this then everything works fine when the Unicode field (SW* and FUL) is using this font.

When I copy the smiley from your message then I get a character with the UTF-8 hex value F0 9F 99 82. As far as I can see the font "Lucida Sans Unicode" does not include this character and will correctly cause the error 0131.
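
As a quick way to confirm that the two smileys are different characters, a small Python sketch can print the code point and UTF-8 bytes of each. The helper name `describe` is hypothetical, and the `in_bmp` flag (whether the code point is at or below U+FFFF) is an extra detail that has not been discussed in this thread; it is shown only for information.

```python
def describe(ch):
    """Return the code point, UTF-8 bytes (hex) and BMP membership of one character."""
    code_point = ord(ch)
    return {
        "code_point": f"U+{code_point:04X}",
        "utf8_hex": " ".join(f"{byte:02X}" for byte in ch.encode("utf-8")),
        # True when the code point lies in the Basic Multilingual Plane.
        "in_bmp": code_point <= 0xFFFF,
    }


if __name__ == "__main__":
    for smiley in ("☺", "🙂"):
        print(smiley, describe(smiley))
```

Running the same check over every character of a failing email would show which code points are involved before deciding how to handle them.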

Not sure if there is a straightforward solution for this - besides the filedump/fileload workaround I have mentioned before. Sorry. Maybe I can find another workaround, but probably will take some time.

Regards,
Daniel

------------------------------
Daniel Iseli
Principal Technical Support Engineer
Uniface Services
Rocket Software, Switzerland
------------------------------

It seems that you could use $string to create these smiley characters. If I understand it correctly then these characters are in the range of 1F600-1F64F (Emoticons). For example:

@$fieldname = $replace(@$fieldname, 1, $string("🙂"), $string("☺"), -1)

This will replace the emoticons smiley with the one I have mentioned in my last post (and is part of the font "Lucida Sans Unicode" ).

Since we do not really know which characters could be part of an email, there are potentially a lot of Unicode blocks you need to check. That could be quite tedious. And I do not really know if there is actually a font that would include all these Unicode characters. Probably not. The fonts that support 'Emoticons' seem quite "exotic".
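
To show what checking whole Unicode blocks could look like, here is a minimal Python sketch that replaces every character in a list of code point ranges with a fallback. Only the Emoticons block (1F600-1F64F) mentioned above is listed. The names `BLOCKS_TO_REPLACE` and `replace_blocks` are hypothetical, and the choice of ☺ (U+263A) as fallback simply follows the replacement suggested above.

```python
# Inclusive code point ranges to replace; only the Emoticons block is listed here.
# Further blocks would have to be added by hand as they turn up in real emails.
BLOCKS_TO_REPLACE = [(0x1F600, 0x1F64F)]


def replace_blocks(text, fallback="\u263A", blocks=BLOCKS_TO_REPLACE):
    """Replace each character that falls in one of the given ranges with fallback."""
    def in_blocks(ch):
        return any(low <= ord(ch) <= high for low, high in blocks)

    return "".join(fallback if in_blocks(ch) else ch for ch in text)


if __name__ == "__main__":
    print(replace_blocks("Thanks 🙂"))
```

Passing an empty string as `fallback` drops the characters instead of substituting them, which is closer to what a dump and reload in the system character set would do.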

Maybe using filedump/fileload is the way forward. At least you can be sure that the text that has been dumped can be stored (when loaded). You might lose some characters, but are smileys business critical?

I hope this helps.

Regards,
Daniel



------------------------------
Daniel Iseli
Principal Technical Support Engineer
Uniface Services
Rocket Software, Switzerland
------------------------------