Hello there,
I am writing data to XML in UniVerse BASIC.
It seems some invalid ASCII characters were included in the data, which is causing an error in the XML.
Could anybody advise which command excludes these bad characters, or converts them to null, in UniVerse BASIC before the data is written to XML, please?
Your help would be appreciated. Thanks.
------------------------------
Yossy Tan
Senior Universe Developer
A2B Australia
Brisbane QLD AU
------------------------------
One simple solution may be to OCONV all the output data using an MCP conversion. That will replace non-printable characters (control codes + upper ASCII) with a dot. No doubt, there are a variety of other solutions.
But I would want to know what the "bad" data is, in case it is actually good data that has been corrupted somehow. You could write a program to go through your data and report every data fragment that is not equal to its OCONV(fragment, 'MCP') equivalent.
And finally, work out how it got there, and stop it at its source.
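As a rough sketch of the check described above (not code from this thread): the snippet below compares a fragment with its OCONV 'MCP' form and reports the positions that differ. RAW and its sample contents are made up for illustration, and it assumes NLS is off so that one character is one byte.

```basic
* Sketch only: RAW is a hypothetical variable holding one data fragment.
RAW = 'Smith':CHAR(9):'and Co':CHAR(150)
CLEAN = OCONV(RAW, 'MCP')     ;* non-printable characters become dots
IF CLEAN # RAW THEN
   * Report the position and character code of each changed character.
   FOR I = 1 TO LEN(RAW)
      IF RAW[I,1] # CLEAN[I,1] THEN
         CRT 'Position ':I:' holds CHAR(':SEQ(RAW[I,1]):')'
      END
   NEXT I
END
```

Writing CLEAN instead of RAW is the quick fix; the report is what helps trace where the characters came from.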
Cheers,
Brian
------------------------------
Brian Speirs
Senior Analyst - Information Systems
Self Registered
Wellington NZ
------------------------------
Hi Yossi,
It's not only non-printable characters you have to worry about when writing an XML file. You also need to do the following conversions.
1) Convert the ampersand '&' to '&amp;'
2) The apostrophe to '&apos;'
3) The < to '&lt;'
4) The > to '&gt;'
5) The double-quote to '&quot;'
We wrote a Function that you can pass a value and it will return the value converted to a proper XML string.
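A minimal sketch of what such a function might look like is below. It is not the function mentioned above; the name XML.ESCAPE and its argument are hypothetical. It replaces the five characters with the predefined XML entities, handling the ampersand first so that the '&' in the other entities is not escaped twice.

```basic
FUNCTION XML.ESCAPE(IN.STRING)
* Hypothetical helper: escape the five XML special characters.
* The ampersand must be done first; otherwise the '&' produced by
* the later replacements would itself be escaped again.
   OUT.STRING = IN.STRING
   OUT.STRING = CHANGE(OUT.STRING, '&', '&amp;')
   OUT.STRING = CHANGE(OUT.STRING, '<', '&lt;')
   OUT.STRING = CHANGE(OUT.STRING, '>', '&gt;')
   OUT.STRING = CHANGE(OUT.STRING, '"', '&quot;')
   OUT.STRING = CHANGE(OUT.STRING, "'", '&apos;')
RETURN(OUT.STRING)
END
```

A calling program would need to declare it with DEFFUN XML.ESCAPE(IN.STRING) and the function would need to be compiled and cataloged first.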
Regards, Sam
------------------------------
Samuel A Powell
President/Developer
Advanced Transportation Systems Inc
Colorado Springs CO US
------------------------------
I use a routine throughout our web applications for JSON as well as XML data.
ENCODE.IT:*
VAR1 = PARAM1
VAR1 = CHANGE(VAR1,'%',"%25") ;* encode '%' first so the "%xx" codes added below are not re-encoded
VAR1 = CHANGE(VAR1,'"',"%22")
VAR1 = CHANGE(VAR1,'#',"%23")
VAR1 = CHANGE(VAR1,'$',"%24")
VAR1 = CHANGE(VAR1,'&',"%26")
VAR1 = CHANGE(VAR1,'*',"%2A")
VAR1 = CHANGE(VAR1,'+',"%2B")
VAR1 = CHANGE(VAR1,',',"%2C")
VAR1 = CHANGE(VAR1,'/',"%2F")
VAR1 = CHANGE(VAR1,'\',"%5C")
VAR1 = CHANGE(VAR1,':',"%3A")
VAR1 = CHANGE(VAR1,';',"%3B")
VAR1 = CHANGE(VAR1,'=',"%3D")
VAR1 = CHANGE(VAR1,'?',"%3F")
VAR1 = CHANGE(VAR1,'@',"%40")
RETURN.ID = VAR1
RETURN
------------------------------
Doug Averch
Owner
U2 Logic
------------------------------
Yossy,
While all the answers are good, we may not be addressing your issue.
I for one am not sure how you are creating the XML file, and I am not entirely sure where the invalid characters are coming from.
For instance, are you using the XDOM API in U2 in the BASIC code?
If so, what are the error code and message returned after the UDO or XDOM function call fails?
Note you can call the XMLGetError(errorCode, errorMessage) function after any XDOM function that returns a status other than XML.SUCCESS.
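For illustration, a sketch of that error check is below. It assumes XDOMOpen takes (xmlDocument, docLocation, domHandle) with XML.FROM.STRING defined in XML.H, and the malformed sample string is made up; please check the signatures against the documentation for your release.

```basic
$INCLUDE UNIVERSE.INCLUDE XML.H
* Sketch: parse a deliberately malformed string so the call fails.
xmlDocument = '<root><item>unclosed</root>'
S = XDOMOpen(xmlDocument, XML.FROM.STRING, domHandle)
IF S # XML.SUCCESS THEN
   * Fetch the code and text of the most recent XML error.
   S = XMLGetError(errorCode, errorMessage)
   CRT 'XDOM error ':errorCode:': ':errorMessage
END
```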
Are you using NLS with UniVerse?
Which characters read from the database are not working in the XML?
Are they valid for the encoding specified by the XML that is created?
i.e. the first line of the XML file
<?xml version="1.0" encoding="UTF-8"?>
Note that there are XML options that can change how the XDOM functions work.
>XMLGETOPTIONS TOXML
<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<xmlconfig>
<config key="standalone" value="yes" />
<config key="out-xml-declaration" value="true" />
<config key="out-format-pretty-print" value="true" />
<config key="out-normalize-characters" value="true" />
<config key="out-split-cdata-sections" value="true" />
<config key="out-validation" value="false" />
<config key="out-expand-entity-references" value="false" />
<config key="out-whitespace-in-element-content" value="true" />
<config key="out-discard-default-content" value="true" />
<config key="out-format-canonical" value="false" />
<config key="out-write-bom" value="false" />
<config key="matchelement" value="1" />
<config key="useunderscore" value="1" />
<config key="emptyattribute" value="0" />
<config key="elementmode" value="0" />
<config key="schematype" value="ref" />
<config key="hidemv" value="0" />
<config key="hidems" value="0" />
<config key="collapsemv" value="0" />
<config key="collapsems" value="0" />
<config key="hideroot" value="0" />
<config key="root" value="" />
<config key="record" value="" />
<config key="TARGET" value="" />
<config key="encode" value="" />
</xmlconfig>
In addition to the default options, there are also options for the in/out encoding
i.e.
<config key="in-encoding" value="iso-8859-1" />
<config key="out-encoding" value="utf-8" />
The in-encoding is the codepage of the data you are putting into the XDOM object, and the out-encoding is the encoding of the XML created when you do the XDOMWrite.
Based on the characters that are failing, you may just need to set the out-encoding to the same codepage used for the data being read.
i.e.
$INCLUDE UNIVERSE.INCLUDE XML.H
S = XDOMCreateRoot(domHandle)
S = XMLSetOptions("out-encoding=iso-8859-1")
S = XDOMWrite(domHandle, xmlDocument, XML.TO.STRING)
CRT xmlDocument
S = XMLSetOptions("out-encoding=utf-8")
S = XDOMWrite(domHandle, xmlDocument, XML.TO.STRING)
CRT xmlDocument
When you compile and run the above code it will produce:
<?xml version="1.0" encoding="iso-8859-1" standalone="yes" ?>
<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
If neither this nor any of the other suggestions has helped, can you provide a small reproducible example that shows what you are getting?
------------------------------
Mike Rajkowski
MultiValue Product Evangelist
Rocket Internal - All Brands
DENVER CO US
------------------------------
Hi Brian,
I found out that, during the data entry process, some of the operators use cut-and-paste to enter part of the data.
I believe this is the source of the problem, rather than corrupted data.
I have applied your solution in my program. My issue has been resolved now.
Thanks a lot for your solution.
I really appreciate your help.
Cheers
Yossy
Hi Sam,
Thanks a lot for your advice.
I haven't created the function you suggested yet, but I will keep it in mind and might do it later.
Cheers
Yossy
Hi Mike,
Thanks a lot for your advice.
I did not use the XDOM API in my code.
So far my problem has been resolved using OCONV "MCP".
I will keep your advice for my future reference.
Thanks again for your support and advice.
Cheers
Yossy
Hi Yossy,
Just remember - this is a "band-aid" solution. You should really make sure that your data is validated on input (which would hopefully take care of any issues introduced by the cut and paste method of data entry), and then make sure any data is suitably encoded on output (as suggested by Sam).
I have attached a sample encoding routine - but note that it is incomplete, as it only allows 7-bit ASCII whereas XML *should* allow most Unicode characters.
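The attachment is not reproduced here. As a hypothetical sketch of the 7-bit restriction only (not the attached routine), the loop below keeps printable ASCII, codes 32 to 126, and drops everything else. RAW and its contents are invented for the example, and it assumes NLS is off.

```basic
* Hypothetical sketch: keep printable 7-bit ASCII (codes 32 to 126)
* and drop every other character.
RAW = 'Caf':CHAR(233):' ':CHAR(7):'menu'
CLEAN = ''
FOR I = 1 TO LEN(RAW)
   C = RAW[I,1]
   N = SEQ(C)
   IF N >= 32 AND N <= 126 THEN CLEAN := C
NEXT I
CRT CLEAN
```

Note that this also drops tabs, line feeds and the system delimiters (field, value and subvalue marks), so it would need to be applied to one value at a time, and the XML special characters would still need escaping afterwards.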
Good luck!
Brian
------------------------------
Brian Speirs
Senior Analyst - Information Systems
Self Registered
Wellington NZ
------------------------------
Morning Guys,
I have a different method of encoding data into an xml structure.
The 'lazy person' approach. Simple, easy - just the way I like things.
I use the LIST or SORT command with the TOXML keyword.
This nicely handles the correct encoding of <>& (et al) characters.
In my UniVerse Basic program
COMMAND = 'SORT CUSTOMERS BY NAME ACCOUNT NAME PHONE EMAIL TOXML'
EXECUTE COMMAND CAPTURING DETAILS
DETAILS now contains the CUSTOMERS records encoded in an xml structure.
Alternatively, I simply execute the SORT or LIST directly at TCL, and use screen capture to capture the output.
And depending on the version of UniVerse that you are on, there may also be the TOJSON keyword available.
I can then take DETAILS, save it as a file with an .xml extension, open it using Excel, and Excel will automatically do some formatting and add filters. However, Excel will not handle MV fields using this method.
This method is excellent for ad hoc extracts.
------------------------------
Bruce Philp
Senior Services Consultant
Meier Business Systems PTY LTD
Carnegie VIC AU
------------------------------