Question

Encoding in Universe Python

  • December 7, 2023
  • 3 replies

Héctor Cortiguera

Hi all

I'm trying to use PyCallFunction to run some Python code, but I'm having trouble decoding Universe strings on the Python side.


This is a snippet of the Python code:

COD_DEF_SYS = 'cp1252'
SEP_MUL_DEF = chr(253)
SEP_FLD_DEF = chr(254)


def python_compression(operation: str, files: str, extra_params: str) -> str:
    if operation == 'c':
        file_as_bytes = bytes(files, COD_DEF_SYS, errors='backslashreplace')
        bytes_separador = bytes(SEP_FLD_DEF, COD_DEF_SYS)
        bytes_multivalorado = bytes(SEP_MUL_DEF, COD_DEF_SYS)
        xresultado = file_as_bytes.split(bytes_separador)
        print(file_as_bytes)
        print(bytes_separador)
        print(bytes_multivalorado)
        print(xresultado)

And this is the BASIC call:

   LISTA_F=''
   LISTA_F<1>='hola'
   LISTA_F<2>='que'
   LISTA_F<3>='tal'
   LISTA_F<4,1>='multi'
   LISTA_F<4,2>='valorado'
   RESPUESTA=PyCallFunction(NOMBRE.MODULO.PY, NOMBRE.FUNCION.PY,'c', LISTA_F, 'test')

This prints the following:

b'hola\\uf8feque\\uf8fetal\\uf8femulti\\uf8fdvalorado'
b'\xfe'
b'\xfd'
[b'hola\\uf8feque\\uf8fetal\\uf8femulti\\uf8fdvalorado']
b'hola\\uf8feque\\uf8fetal\\uf8femulti\\uf8fdvalorado'

If I print the string I'm getting from the PyCallFunction I get this:

hola´ú¥que´ú¥tal´ú¥multi´ú¢valorado

What encoding is this?

3 replies

Héctor Cortiguera

The values I'm getting in the input string are weird.

Instead of the usual field and multivalue separators, I'm getting much larger numbers:


field separator is 63742 (0xF8FE) instead of 254 (0xFE)
multivalue separator is 63741 (0xF8FD) instead of 253 (0xFD)

    for c in files:
        print(ord(c))

104
111
108
97
63742
113
117
101
63742
116
97
108
63742
109
117
108
116
105
63741
118
97
108
111
114
97
100
111

  • Participating Frequently
  • December 8, 2023

Hello Hector

The marker characters don't play nicely with any modern string encodings. If you look at the u2py.DynArray() it uses a byte array internally (as do the equivalents in .NET).
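As a rough illustration of the mismatch, here is a minimal sketch based only on the values printed earlier in the thread. The 0xF800 offset is inferred from those ord() values, the 0xFC entry is assumed by analogy, and PUA_TO_LEGACY and to_legacy_marks are made-up names, not part of u2py. It only works while the payload has no characters above U+00FF.

```python
# Offset inferred from the ord() values printed earlier in the thread
# (63742 == 0xF8FE, 63741 == 0xF8FD); treat it as an assumption.
PUA_OFFSET = 0xF800

# Map the private-use code points back to the classic single-byte marks.
# Only 0xFD and 0xFE are confirmed by the thread; 0xFC is assumed by analogy.
PUA_TO_LEGACY = {PUA_OFFSET + mark: mark for mark in (0xFC, 0xFD, 0xFE)}


def to_legacy_marks(text: str) -> str:
    """Return text with private-use marks replaced by chr(252..254)."""
    return text.translate(PUA_TO_LEGACY)


sample = "hola\uf8feque\uf8fetal\uf8femulti\uf8fdvalorado"

# cp1252 has no slot for U+F8FE, so backslashreplace emits a literal escape
# and a later split on b"\xfe" finds nothing to split on.
broken = sample.encode("cp1252", errors="backslashreplace")

# After translating, every character in this sample is below U+0100,
# so latin-1 encodes it one byte per character.
fixed = to_legacy_marks(sample).encode("latin-1")
fields = fixed.split(b"\xfe")
print(fields)
```

If the data can contain characters outside latin-1, splitting the str directly on the private-use characters avoids the byte round trip altogether.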

Rather than messing about with encodings, if I'm passing data into a Python function or method I tend to structure it in a way that makes sense to Python rather than to U2 - for example, passing it in JSON format, which just needs a json.loads() to turn it into native Python types. I've found that is generally safer and easier. Having some useful wrappers around the horrible UDO library also helps :)
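A minimal sketch of the Python side of that approach, assuming the BASIC program has already serialised the list to a JSON string before calling PyCallFunction. The files_json parameter and the shape of the reply are hypothetical, chosen only for illustration.

```python
import json


def python_compression(operation: str, files_json: str, extra_params: str) -> str:
    """Hypothetical variant of the function that takes its file list as JSON."""
    files = json.loads(files_json)  # plain Python lists and strings, no marks
    if operation == "c":
        # Placeholder for the real work: just report what was received.
        return json.dumps({"received": files, "count": len(files)})
    return json.dumps({"error": f"unknown operation {operation!r}"})


# Stand-in for what the BASIC side would send.
payload = json.dumps(["hola", "que", "tal", ["multi", "valorado"]])
reply = json.loads(python_compression("c", payload, "test"))
print(reply["count"])
```

Returning JSON as well keeps the reply free of mark characters, so the BASIC side can parse it with whatever JSON tooling it already uses.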


Mike Rajkowski

Hector,

When you pass the data, it is not seen as a dynamic array unless you marshal it into a DynArray.

Import u2py into your Python code, then marshal the passed data into a u2py.DynArray. You can then convert the DynArray to a Python list.
Here is a simple test method:
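import u2py

def test(instring):
    print(instring)  # raw string as received from PyCallFunction
    d = u2py.DynArray(instring)  # marshal the string into a DynArray
    l = d.to_list()  # convert the DynArray to a Python list
    print(str(l))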
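
For trying the parsing logic somewhere u2py is not installed (for example in unit tests off the server), a pure-Python stand-in might look like the sketch below. It is not how u2py does it, and its output shape is not guaranteed to match DynArray.to_list(). split_dynamic is a made-up name, and the private-use code points are the ones observed earlier in the thread.

```python
# Field and value marks, in both the classic form and the private-use form
# observed in this thread (U+F8FE / U+F8FD).
FIELD_MARKS = ("\u00fe", "\uf8fe")
VALUE_MARKS = ("\u00fd", "\uf8fd")


def split_dynamic(text: str) -> list:
    """Split a dynamic-array string into fields, with nested lists for multivalues."""
    # Normalise the private-use marks to the classic ones first.
    text = text.replace(FIELD_MARKS[1], FIELD_MARKS[0])
    text = text.replace(VALUE_MARKS[1], VALUE_MARKS[0])

    result = []
    for field in text.split(FIELD_MARKS[0]):
        values = field.split(VALUE_MARKS[0])
        # Single-valued fields stay as plain strings.
        result.append(values if len(values) > 1 else values[0])
    return result


parsed = split_dynamic("hola\uf8feque\uf8fetal\uf8femulti\uf8fdvalorado")
print(parsed)
```

Because it also accepts the classic marks, a genuine þ or ý in the data would be treated as a separator, so this is only suitable where that cannot happen.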