Hi, I am trying to run a model training task on a POWER8 CPU + GPU server. I used the provided env YAML to create my conda environment, and the installation succeeded, but whenever I run Python or any Python-based package I get an "Illegal instruction" error. I checked python-config: the CFLAGS for python3.9 are -mcpu=power9 -mtune=power10 -mcpu=power9 -mtune=power10, but my CPU is POWER8. A binary built with -mcpu=power9 may contain instructions that a POWER8 CPU does not implement, so this is probably the cause. I looked at the conda packages provided by RocketCE and found only one python3.9 package, which is the one I am using. Here are the details of my problem.
Environment used:
https://anaconda.org/rocketce/rocketce-1.9.1-conda-env-py3.9-cuda11.8
Details:
- System Architecture: Power8
- CUDA version: 11.8
- Driver Version: 520.61.05
- Error:
Illegal instruction (core dumped)
Installation step:
conda env create -f rocketce-1.9.1-conda-env-py3.9-cuda11.8.yaml -n megatron-lm
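For anyone who wants to reproduce the check without starting the Python interpreter (which crashes here), below is a minimal shell sketch of the comparison I did by hand. The function names are hypothetical helpers I made up for illustration, and the sketch assumes /proc/cpuinfo contains a line of the form "cpu : POWER8 (raw), altivec supported"; the CFLAGS string is the one reported by python-config above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of RocketCE or conda.

# Print the POWER generation number (e.g. 8) found in a cpuinfo-style file.
# $1: file to read; defaults to /proc/cpuinfo (line format is an assumption).
power_gen_from_cpuinfo() {
    sed -n 's/^cpu[[:space:]]*:[[:space:]]*POWER\([0-9][0-9]*\).*/\1/p' \
        "${1:-/proc/cpuinfo}" 2>/dev/null | head -n 1
}

# Print the generation number of the last -mcpu=powerN flag in a CFLAGS string.
# $1: the CFLAGS string, e.g. the output of python-config.
power_gen_from_cflags() {
    printf '%s\n' "$1" | tr ' ' '\n' \
        | sed -n 's/^-mcpu=power\([0-9][0-9]*\)$/\1/p' | tail -n 1
}

# Print "mismatch" if the build targets a newer generation than the host,
# "ok" if it does not, and "unknown" if either value could not be determined.
# $1: host generation, $2: build target generation.
check_power_compat() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "unknown"
    elif [ "$1" -lt "$2" ]; then
        echo "mismatch"
    else
        echo "ok"
    fi
}

# Compare this machine against the flags quoted in the post.
check_power_compat \
    "$(power_gen_from_cpuinfo)" \
    "$(power_gen_from_cflags '-mcpu=power9 -mtune=power10 -mcpu=power9 -mtune=power10')"
```

The sketch only compares generation numbers, so a "mismatch" result suggests, but does not prove, that the illegal instruction comes from the -mcpu setting.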
------------------------------
Qingyang Huang
Rocket Forum Shared Account
------------------------------