Question

Left shift of fields in file when the previous field contains a utf8 character

Forum|Forum|21 days ago
February 3, 2026
11 replies
75 views

M Carmen De Paz
Participating Frequently

Hi, we work with Micro Focus Visual COBOL 8.0 for Eclipse on Linux/UNIX.

We have a program with a file that we define as follows:

       file-control.
           select prevcob assign to wk-prevcob
               organization is line sequential.

       data division.
       file section.
       fd  prevcob.
       01  reg-prevcob                 pic x(185).

and a working variable that defines the file structure as follows:

       01  wk-reg-prevcob-pru.
           10 prevcob-nombre-pru       pic x(40) value spaces.
           10 prevcob-impdivisa-pru    pic 9(16)v99 value 0.
           10 resto                    pic x(127).

We define a cursor that fetches data from an Informix database with a UTF-8 locale and passes it to the working-storage variables (the host variables are defined as pic x(300) and pic s9(16)v9(2)):

           move clinombr   to prevcob-nombre-pru
           move srvimpdivi to prevcob-impdivisa-pru

We write the file as follows:

           write reg-prevcob from wk-reg-prevcob-pru

The problem is that when the field prevcob-nombre-pru contains a UTF-8 character, the field prevcob-impdivisa-pru appears shifted one position to the left and the file becomes misaligned. This is an image of the file opened in the vi editor.

The file is in a directory on the UNIX server.

Is it possible to make the file appear aligned?

Chris Glazier
Moderator
February 3, 2026

Line Sequential files can contain only printable ASCII characters unless you use the configuration option INSERTNULL. It may also be a problem with how vi interprets the character. Can you please attach a small file that demonstrates the problem so that I may review it?

M Carmen De Paz
Author
Participating Frequently
February 4, 2026

Hi Chris, It's great to talk to you again. I hope you can help me.

I think the problem is that bytes are being written, not characters. For example, the byte representation of the letter Ñ is c391 (2 bytes), so 40 bytes translate to 39 printable characters.

I've run an `od` (octal dump) on the file, and there are indeed 40 bytes in the field containing the Ñ.
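
To make the byte count concrete, here is a minimal Python sketch (not part of the COBOL program). The sample name `PEÑA` is a made-up value, and the space padding to 40 bytes is an assumption that mirrors how a pic x(40) field is filled.

```python
# Hypothetical sample value; the real data comes from the database cursor.
name = "PEÑA"

FIELD_BYTES = 40  # pic x(40) reserves 40 bytes, not 40 characters

utf8 = name.encode("utf-8")
# Pad with spaces up to the field size, as a pic x(40) field would be.
field = utf8.ljust(FIELD_BYTES, b" ")

byte_count = len(field)                  # bytes stored in the field
char_count = len(field.decode("utf-8"))  # characters a UTF-8 aware editor shows

print("Ñ in UTF-8:", "Ñ".encode("utf-8").hex())
print("bytes:", byte_count, "characters:", char_count)
```

Each two-byte character makes the displayed field one column narrower, which matches the one-position shift seen in the editor.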

Is there any way to maintain the 40 printable characters even if it contains UTF-8 characters?

I am attaching the file. We generate it without an extension, but I had to add one because otherwise the forum wouldn't let me attach it.

FileShift.txt

Chris Glazier
Moderator
February 4, 2026

The UTF-8 representation of the character is always going to be 2 bytes in the file as that is how it is represented in the UTF-8 character set.

If you want it to be stored as one byte you would have to convert it to ASCII/ANSI before writing it.

No guarantees but here is an example that works in your case. It will convert a string of UTF-8 characters including yours to ANSI and then write the file.

       identification division.
       program-id. Program1.

       environment division.
       input-output section.
       file-control.
           select test-file assign to "testfile.dat"
                            organization is line sequential
                            file status is file-status.
       data division.
       file section.
       fd test-file.
       01 test-rec pic x(15).
       working-storage section.
       01 file-status   pic x(2).
       01 my-char       pic x(15)   value spaces.
       01 my-utf        pic x(15)  value x"C391C391C391C3913132".
       01 out-length    pic x(4) comp-x value 15.
       01 reserved      pic x(4) comp-x value 0.
       01 status-code   pic x(4) comp-5.
       procedure division.


           call "CBL_STRING_CONVERT" using by reference my-utf
                                by value 15
                                by value 0
                                by reference my-char
                                by reference out-length
                                by value 3
                                by value 0
                                by reference reserved
                                returning status-code.
           open output test-file
           move my-char to test-rec
           write test-rec
           display file-status
           close test-file
           open input test-file
           read test-file
           close test-file

           goback.

       end program Program1.

M Carmen De Paz
Author
Participating Frequently
February 4, 2026

Thank you so much for your help Chris, but unfortunately it doesn't work for what I need.

I wrote this program:

       identification division.
       program-id. anook7.
       author. cdepaz.
       date-written. 04.02.2026.
      ******************************************************************
      * test program: file with UTF-8 characters
      ******************************************************************
      ******************************************************************
       environment division.
      ******************************************************************
      ******************************************************************
       configuration section.
      ******************************************************************
       special-names.
           decimal-point is comma.

       input-output section.
      ******************************************************************
       file-control.
           select test-file assign to "testfile.dat"
               organization is line sequential
               file status is file-status.
      ******************************************************************
       data division.
       file section.
       fd  test-file.
       01  test-rec.
           05 test-rec-char pic x(15).
           05 test-rec-num  pic 9(05).

      ******************************************************************
       working-storage section.
      ******************************************************************
       01  sw-fin pic 9 value zero.
           88 si-fin value 1.
           88 no-fin value zero.

       01  file-status pic x(2).
       01  my-char     pic x(15) value spaces.
       01  my-utf      pic x(15) value x"C391C391C391C3913132".
       01  out-length  pic x(4) comp-x value 15.
       01  reserved    pic x(4) comp-x value 0.
       01  status-code pic x(4) comp-5.

      *-----------------------------------------------------------------
      ******************************************************************
       linkage section.
      *-----------------------------------------------------------------
       procedure division.
      ******************************************************************
      *-----------------------------------------------------------------
           call "CBL_STRING_CONVERT" using by reference my-utf
                                           by value 15
                                           by value 0
                                           by reference my-char
                                           by reference out-length
                                           by value 3
                                           by value 0
                                           by reference reserved
                                 returning status-code.
           open output test-file
           move 'ABCDEFG' to test-rec-char
           move 0 to test-rec-num
           write test-rec

           move my-char to test-rec-char
           move 0 to test-rec-num
           write test-rec
           close test-file

           open input test-file
           read test-file at end set si-fin to true
           end-read
           perform until si-fin
               display test-rec
               read test-file at end set si-fin to true
               end-read
           end-perform
           close test-file
           .
       fin.
           goback.

And the result is:

There is a left shift for each UTF-8 character.

M Carmen De Paz
Author
Participating Frequently
February 4, 2026

The problem is that on our system the environment variable is LANG=es_ES.UTF-8, and out-encoding 3 means:

ASCII/MBCS characters (current locale). The result depends on the locale, and since ours is UTF-8, the function does nothing.
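
A minimal Python sketch of the effect described above. It does not call CBL_STRING_CONVERT; the helper `convert` is hypothetical and only models "UTF-8 in, target encoding out", on the assumption that "current locale" resolves to UTF-8 when LANG=es_ES.UTF-8.

```python
# Same bytes as my-utf in the test program: four Ñ followed by "12".
source = bytes.fromhex("C391C391C391C3913132")

def convert(data: bytes, target: str) -> bytes:
    """Hypothetical stand-in for a UTF-8 to target-encoding conversion."""
    return data.decode("utf-8").encode(target)

# Target is the locale's own encoding (UTF-8): the bytes come back unchanged.
same = convert(source, "utf-8")

# Target is a single-byte encoding: every Ñ shrinks from two bytes to one.
latin = convert(source, "iso-8859-1")

print(len(source), len(same), len(latin))
```

With a UTF-8 target the output is byte-for-byte identical to the input, so the shift stays. Only a single-byte target shortens the field.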

Chris Glazier
Moderator
February 5, 2026

Is the problem with storing the accented UTF-8 characters in the file or in how they are displayed on a terminal? Is it ok to store the 2 bytes in the file as long as it displays correctly or do you want the extended ASCII equivalent stored in the file as one character?

I have found if there is some type of field terminator (“,” or tab?) between the columns then I can get it to display correctly using the column command like:

column -t -s $',' testfile.dat

This is when I define the file as:
fd test-file.
01 test-rec.
05 test-rec-char pic x(15).
05 delim pic x. *> insert a comma
05 test-rec-num pic 9(5).

If I know more of what you actually plan to do with these characters, I might be able to be more helpful.

Thanks

M Carmen De Paz
Author
Participating Frequently
February 5, 2026

The problem is in the file, and I would like to store the extended ASCII equivalent in the file.

We need to generate a file with a fixed data structure to pass to another software provider. The data is expected in specific positions, which is why it's so important that there are no offsets. What I don't know is the encoding the third party uses, although I imagine it's not UTF-8; it may be ISO-8859-1.
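
A minimal Python sketch of the fixed-position requirement, using the 40 + 18 + 127 byte layout of wk-reg-prevcob-pru. The sample name and amount are made-up values, `build_record` is a hypothetical helper, and ISO-8859-1 is only the guessed target encoding mentioned above.

```python
NAME_LEN, AMOUNT_LEN, REST_LEN = 40, 18, 127  # pic x(40), pic 9(16)v99, pic x(127)

def build_record(name: str, amount_cents: int, encoding: str) -> bytes:
    """Hypothetical helper: lay out one 185-byte record in the given encoding."""
    # Truncating by bytes could split a multi-byte character; the short
    # sample name used here never reaches the limit.
    name_field = name.encode(encoding)[:NAME_LEN].ljust(NAME_LEN, b" ")
    amount_field = f"{amount_cents:0{AMOUNT_LEN}d}".encode("ascii")
    return name_field + amount_field + b" " * REST_LEN

rec_latin = build_record("MARÍA PEÑA", 123456, "iso-8859-1")
rec_utf8 = build_record("MARÍA PEÑA", 123456, "utf-8")

# Character position where the amount starts when each record is displayed.
pos_latin = rec_latin.decode("iso-8859-1").index("0")
pos_utf8 = rec_utf8.decode("utf-8").index("0")

print(len(rec_latin), len(rec_utf8))  # both records have the same byte length
print(pos_latin, pos_utf8)            # the UTF-8 amount appears two columns early
```

Both records are 185 bytes long. Only in the single-byte encoding do byte offsets and character positions coincide, which is what a position-based consumer needs.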

I'm now trying to use the C function iconv in a similar way to how we sometimes use it in the operating system for conversion, but I'm getting some runtime errors.

M Carmen De Paz
Author
Participating Frequently
February 5, 2026

I've been studying the cobutf8 utility I found in the Micro Focus documentation, which is based precisely on iconv. Do you think it could be useful to me? How can I use it?

Chris Glazier
Moderator
February 5, 2026

I don’t think the cobutf8 utility will help but I am not an expert in it.

I was able to get the “iconv” function call API to work with COBOL though.

Here is the new example:

      $set sourceformat"variable"
       identification division.
       program-id. Program1.

       environment division.
       input-output section.
       file-control.
           select test-file assign to "testfile.dat"
                            organization is line sequential
                            file status is file-status.
       data division.
       file section.
       fd test-file.
       01 test-rec.
          05 test-rec-char  pic x(15).
          05 test-rec-num   pic 9(5).
       working-storage section.
       01 file-status   pic x(2).
       01 my-char       pic x(20)   value spaces.
       01 my-utf        pic x(15)  value x"C391C391C391C39131322020202020".
       01 in-len        pic x(8) comp-5 value 15.
       01 out-len       pic x(8) comp-5 value 20.
       01 capacity      pic x(8) comp-5 value zeroes.
       01 status-code   pic s9(18) comp-5.  *> signed, so iconv's (size_t)-1 error return compares equal to -1
       01 cdesc         pointer.
       01 to-charset    pic x(13)   value z"WINDOWS-1252".
       01 from-charset  pic x(6)    value z"UTF-8".
       01 utf-point     pointer.
       01 char-point    pointer.
       procedure division.

           set utf-point to address of my-utf
           set char-point to address of my-char
           move length of my-utf to in-len
           move length of my-char to out-len capacity

           call "iconv_open" using to-charset, from-charset
              returning cdesc

           if cdesc = null
              display "error on iconv_open"
              stop run
           end-if

           call "iconv" using by value cdesc
                              by reference utf-point
                              by reference in-len
                              by reference char-point
                              by reference out-len
              returning status-code
           if status-code = -1
              display "error on iconv"
              stop run
           end-if

           call "iconv_close" using by value cdesc

           open output test-file
           move "ABCDEFG" to test-rec-char
           move zeroes to test-rec-num
           write test-rec
           display file-status
           move spaces to test-rec-char
           move my-char(1:capacity - out-len) to test-rec-char
           write test-rec
           close test-file

           open input test-file
           perform until exit
              read test-file
                  at end
                     exit perform
                  not at end
                   display "read ok"
               end-read
           end-perform

           close test-file

           goback.
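
The reference modification `my-char(1:capacity - out-len)` relies on iconv decrementing the output counter by the number of bytes it writes. A minimal Python sketch of that arithmetic follows; it uses Python's own cp1252 codec as a stand-in for the iconv call, which is an assumption about equivalent output for these particular bytes.

```python
# Same 15 bytes as my-utf: four Ñ, "12", and five trailing spaces.
my_utf = bytes.fromhex("C391C391C391C39131322020202020")
capacity = 20  # size of my-char, the output buffer

# Stand-in for the iconv call (UTF-8 to WINDOWS-1252).
converted = my_utf.decode("utf-8").encode("cp1252")

# iconv leaves the count of unused output bytes in out-len.
out_len = capacity - len(converted)

# Bytes actually written, i.e. the length used in my-char(1:capacity - out-len).
written = capacity - out_len

# Moving into the 15-byte test-rec-char pads the rest with spaces.
test_rec_char = converted[:written].ljust(15, b" ")

print(len(my_utf), written, out_len)
print(test_rec_char)
```

The 15 input bytes become 11 output bytes, leaving 9 of the 20 unused, so only the first 11 bytes of my-char are moved and the field keeps its 15-byte width.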

M Carmen De Paz
Author
Participating Frequently
February 9, 2026

Thank you so much for your reply. I've implemented it and it works perfectly, but I'd like to delve deeper into the cobutf8 utility because I think it suits our needs. Is there anyone who knows it and could help me?

Or would it be better to start a new discussion?

Chris Glazier
Moderator
February 9, 2026

I am happy that the iconv program worked for you!

Please start a new discussion, so that it doesn’t get lost in this one, and specify exactly what problem you are trying to solve using the cobutf8 utility.

Thanks
