Open-source Languages & Tools for z/OS

  • 1.  Python: read an EBCDIC file

    Posted 12-16-2021 13:46
    I am trying to read a sequential data set (FB, LRECL(1000)) with this simple code:

    # ==================================================
    count = 0
    thefilepath = "//'sa8sb22.resultfb'"
    count = len(open(thefilepath, 'r', encoding="cp500").readlines( ))
    print("There are ", count , " lines in the file")
    # ==================================================

    I have 133 records in the file, but the code above reports 1 (one). I suppose no read is executed.
    I suspect it is related to the encoding, but I cannot find anything better than cp500.
    The file is definitely EBCDIC (x'81' for 'a').

    Any hints, please?

    ------------------------------
    Gabor Markon
    Mainframe Architect
    Self Registered
    Budapest HU
    ------------------------------


  • 2.  RE: Python: read an EBCDIC file

    Posted 12-17-2021 12:41
    Addendum: with Java / JZOS I can read the file without any problem.

    ------------------------------
    Gabor Markon
    Mainframe Architect
    Self Registered
    Budapest HU
    ------------------------------



  • 3.  RE: Python: read an EBCDIC file

    Posted 12-20-2021 06:39
    This is how I read an EBCDIC file. In the read command you need to provide the length of the record; in this case it was 87. That length might include a CR or LF, as I copied the file to USS before reading.

    import struct
    from collections import namedtuple

    fieldwidths = (8, 8, 4, -1, 1, -2, 7, 46, 8, 1, -2)  # negative widths represent ignored padding fields
    fmtstring = ' '.join('{}{}'.format(abs(fw), 'x' if fw < 0 else 's')
                         for fw in fieldwidths)
    fieldstruct = struct.Struct(fmtstring)                # build the struct once, outside the loop
    jobnames = namedtuple('F1', 'F2 F3 F4 F5 F6 F7 F8 F9')
    parse = fieldstruct.unpack_from

    with open('file.txt', 'r', encoding="cp1140") as f:   # source file encoding
        # 87 is the record length; unpack_from needs at least fieldstruct.size bytes (88 for the widths above)
        read_data = bytes(f.read(87), 'utf-8')            # destination encoding
        while read_data:
            fields = parse(read_data)
            f1 = jobnames._make(fields)
            read_data = bytes(f.read(87), 'utf-8')        # read the next record

    I also used Struct to parse the fields.
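
    A variation on the approach above, as a minimal sketch: read each record as raw bytes, let struct slice the EBCDIC bytes, and decode the fields afterwards. Because cp1140 is a single-byte encoding, byte widths equal character widths. The field layout, the names FIELD_WIDTHS, Record and parse_record, and the sample record are all hypothetical and only illustrate the idea.

```python
import struct
from collections import namedtuple

# Hypothetical layout: 4-char job id, 1 pad byte, 6-char name, 2 pad bytes.
FIELD_WIDTHS = (4, -1, 6, -2)  # negative widths are ignored padding

# Build a struct format such as '4s1x6s2x' from the widths.
FMT = ''.join('{}{}'.format(abs(w), 'x' if w < 0 else 's') for w in FIELD_WIDTHS)
RECORD_STRUCT = struct.Struct(FMT)

Record = namedtuple('Record', 'job_id name')  # one name per non-padding field


def parse_record(raw, encoding='cp1140'):
    """Slice one fixed-width EBCDIC record and decode each field to str."""
    if len(raw) != RECORD_STRUCT.size:
        raise ValueError(
            'expected {} bytes, got {}'.format(RECORD_STRUCT.size, len(raw)))
    fields = RECORD_STRUCT.unpack(raw)
    return Record._make(f.decode(encoding).rstrip() for f in fields)


# Sample record built in memory so the sketch runs without a data set.
sample = 'JOB1 PAYROL  '.encode('cp1140')
record = parse_record(sample)
print(record)
```

    The length check makes a short or over-long record fail loudly instead of producing shifted fields.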

    This only works for character data; it does not handle packed decimal or binary fields.

    Chris Hanks

    ------------------------------
    Christopher Hanks
    Architect
    Citigroup Technology Inc
    Tampa FL US
    ------------------------------



  • 4.  RE: Python: read an EBCDIC file

    Posted 12-20-2021 07:57
    Edited by Gabor Markon 12-20-2021 08:00
    Hello Christopher,

    thanks for the detailed answer.
    Is cp1140 an extra package? If so, I'll try to download and install it.
    Unfortunately, I do not have access to my system at the moment because of an administration error.
    I will try it and give feedback in any case.

    ------------------------------
    Gabor Markon
    Mainframe Architect
    Self Registered
    Budapest HU
    ------------------------------



  • 5.  RE: Python: read an EBCDIC file

    ROCKETEER
    Posted 12-20-2021 12:09
    Hello Gabor,

    What encoding does your data set really use? Are there any characters that look different in the IBM-500 encoding than in IBM-1047? If it's in IBM-1047, you can use the solution below.

    Internally, Python first converts your data from the encoding you specify, and then splits it into lines using the linefeed characters. Unfortunately the 'native' mapping of characters in the community version of Python (and therefore in ours, too) doesn't work well for EBCDIC newline characters (0x15) used on z/OS. To do the proper mapping, you need to specify encoding as 'cp1047_oe':

    count = len(open(thefilepath, 'r', encoding="cp1047_oe").readlines( ))

    This way Python will split text into lines correctly, and you should be getting 133 lines in the count. Please note that cp1047_oe is Rocket's extension to the original community version of Python, and is not available on other systems (e.g. Windows). You do not need to download any extra packages to use cp1047_oe.
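
    The behaviour described above can be reproduced on any platform with the standard cp500 codec, as a small sketch. Byte 0x15 decodes to U+0085 (NEL), which text-mode readlines() does not treat as a line ending, while str.splitlines() does. The sample bytes are made up. Where cp1047_oe is not available, splitting explicitly like this is one possible workaround, assuming the data really contains 0x15 record separators.

```python
import io

# Two made-up EBCDIC records, each terminated by the z/OS newline byte 0x15.
raw = 'abc'.encode('cp500') + b'\x15' + 'def'.encode('cp500') + b'\x15'

# Text mode: the stock codec turns 0x15 into U+0085 (NEL), and universal
# newline handling only recognises \n, \r and \r\n, so nothing is split.
with io.TextIOWrapper(io.BytesIO(raw), encoding='cp500') as f:
    text_mode_lines = f.readlines()

# Explicit split: str.splitlines() treats U+0085 as a line boundary.
explicit_lines = raw.decode('cp500').splitlines()

print(len(text_mode_lines))
print(explicit_lines)
```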

    Christopher's solution will work if you have fixed-width records in your data set, but even in this case I'd highly recommend splitting the data on newlines explicitly (e.g. using .split()) before converting it to structs, in case you read records of different lengths, which can easily happen when you open files in text mode.
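
    One way to follow this advice, sketched under assumptions: split the decoded text into lines first, and only unpack lines whose length matches the record layout. The 13-byte layout, the name parse_lines and the sample text are hypothetical. Each line is encoded back with the single-byte cp1140 codec so that one character is one byte for struct.

```python
import struct

RECORD = struct.Struct('4s1x6s2x')  # hypothetical 13-byte layout
ENCODING = 'cp1140'                 # single-byte, so 1 char == 1 byte


def parse_lines(text):
    """Split decoded text into lines and unpack only full-length records.

    Returns (parsed, skipped): the parsed field tuples and the 1-based
    numbers of lines whose length does not match the record layout.
    """
    parsed, skipped = [], []
    for lineno, line in enumerate(text.splitlines(), start=1):
        raw = line.encode(ENCODING)
        if len(raw) != RECORD.size:
            skipped.append(lineno)
            continue
        fields = RECORD.unpack(raw)
        parsed.append(tuple(f.decode(ENCODING).rstrip() for f in fields))
    return parsed, skipped


# Made-up text as it might look after a text-mode read; line 2 is too short.
text = 'JOB1 PAYROL  \nshort\nJOB2 BILLNG  \n'
parsed, skipped = parse_lines(text)
print(parsed)
print(skipped)
```

    Collecting the skipped line numbers rather than raising keeps one bad record from stopping the whole run; raising instead is equally reasonable if every record must be valid.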

    Regards,
    Vladimir

    ------------------------------
    Vladimir Ein
    Software Engineer
    Rocket Software
    ------------------------------



  • 6.  RE: Python: read an EBCDIC file

    Posted 12-21-2021 04:37
    Hello Vladimir,

    thank you very much, now I understand the whole background.
    I will try your solution as soon as I am able to use the affected mainframe again.

    Regards
    Gábor

    ------------------------------
    Gabor Markon
    Mainframe Architect
    Self Registered
    Budapest HU
    ------------------------------



  • 7.  RE: Python: read an EBCDIC file

    Posted 12-20-2021 17:32
    I like this cp1047_oe encoding.

    Gabor, be aware that readlines() returns a list.

    With the stock encoding you are getting a single item in the list therefore count=1.
    With the nifty Rocket encoding you get a list with n items. n being the number of records.

    Compare
    lines = open(thefilepath, 'r', encoding="cp500").readlines( )
    print(lines)

    with the result you get with cp1047_oe.

    As a recent Python experimenter, I find the object structure and class concepts take a while to get used to.
    docs.python.org is your friend, as well as Google :-)

    ------------------------------
    hank oerlemans
    Lead Software Engineer
    HCL America Inc.
    North Sydney NSW AU
    ------------------------------



  • 8.  RE: Python: read an EBCDIC file

    Posted 01-05-2022 13:09

    Hello Hank,

    sorry, I have not read the messages recently. My work notebook was wrecked by a system error, and I have managed to restore it (say, to 80%) by now.

    Thank you for your suggestion; I will try it as soon as possible, but first I must struggle to get back my lost mainframe credentials as well.

    Best regards

    Gabor 



    ------------------------------
    Gabor Markon
    Mainframe Architect
    Self Registered
    Budapest HU
    ------------------------------



  • 9.  RE: Python: read an EBCDIC file

    Posted 01-06-2022 09:04
    Hi Hank, 
    I've successfully restored my mainframe access today. The first thing I tried was the CP1047_OE setting, and it works perfectly!
    Thank you, and thanks to all for the competent information!
    Regards
    Gabor


    ------------------------------
    Gabor Markon
    Mainframe Architect
    Self Registered
    Budapest HU
    ------------------------------