
Hi,

does anyone here have experience with https://github.com/openai/whisper? I tried to transcribe a 13:35 min MP3 on different hardware, with the following results:

I used the LARGE model:

1) AC922, GPU 1xV100: utilization (U) = 30 %, transcription time (TS) = 5:38 min

2) AC922, 32xP9 CPU: U = 90+ %, TS = 80 min

3) 1050, py311, pytorch-cpu 2.1.1, 8 CPUs: TS = 28 min, U = more or less 4.5 cores (only a few threads work)

4) 1050, py311, pytorch-cpu 2.1.1, 12 CPUs: TS = 25 min

In 4) I used:

export OPENBLAS_NUM_THREADS=12
export GOTO_NUM_THREADS=12
export OMP_NUM_THREADS=12

I found out that by increasing the number of threads (export OPENBLAS_NUM_THREADS, etc.) I can increase CPU utilization, but this does not shorten the transcription time; rather, it increases it.
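For reference, the same thread cap can also be set from inside Python via PyTorch itself (a minimal sketch; the count of 8 is illustrative, not a recommendation):

```python
import torch

# Equivalent to OMP_NUM_THREADS for torch's intra-op parallelism
# (matmuls, convolutions); must be called before the heavy work starts.
torch.set_num_threads(8)

print(torch.get_num_threads())  # -> 8
```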

Is it possible to get closer to the V100 with some tuning, or is this the best I can expect from the P10 with this quite large model?

Is it normal that with pytorch-cpu 2.1.2 py311_1 (rocketce) I get this message?

/data/miniconda3/lib/python3.11/site-packages/whisper/transcribe.py:126: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Error in cpuinfo: processor architecture is not supported in cpuinfo
Error in cpuinfo: processor architecture is not supported in cpuinfo
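As an aside: the FP16 warning appears because Whisper defaults to fp16=True and falls back to FP32 on CPU. It can be silenced by requesting FP32 explicitly, e.g. via the CLI (a sketch; the audio filename and thread count are illustrative):

```shell
# Run on CPU in FP32 explicitly and cap torch's CPU threads from the CLI
whisper audio.mp3 --model large --device cpu --fp16 False --threads 8
```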

Thanks,

Tomas



------------------------------
Tomas Kovacik
Rocket Forum Shared Account
------------------------------


Hi Tomas,

Thanks for reaching out.

We don't have a direct Whisper measurement for comparison at present.

In general, some tweaks are required to get the best performance out of a P10 LPAR.

Let me investigate and come back to you.

Could you please answer the following questions?

  • How was Whisper compiled for P9 and P10? Were any specific flags used?
  • What is the configuration of the P10 LPAR from a CPU perspective?



------------------------------
Suyog Jadhav
Rocket Internal - All Brands
------------------------------


Hi,

this is an example of how the libraries were installed in the GPU environment:

conda install pytorch=2.1.2=cuda12.2_py310_1
pip install llvmlite
conda install tiktoken=0.6.0
pip install numba
pip install -U openai-whisper

The other environments were installed similarly; the library and Python versions are visible in the listings:

#P10 GPU 

#conda list | egrep -w  "openai-whisper|numba|pytorch|llvmlite|tiktoken"

llvmlite                  0.43.0                   pypi_0    pypi
numba                     0.60.0                   pypi_0    pypi
openai-whisper            20231117                 pypi_0    pypi
pytorch                   2.1.2           cuda12.2_py310_1    https://ftp.osuosl.org/pub/open-ce/current
pytorch-base              2.1.2           cuda12.2_py310_pb4.21.12_7    https://ftp.osuosl.org/pub/open-ce/current
tiktoken                  0.6.0           py310ha2369f3_0    https://ftp.osuosl.org/pub/open-ce/current

python --version
Python 3.10.13


# P9 CPU
conda list | egrep -w  "openai-whisper|numba|pytorch|llvmlite|tiktoken"


llvmlite                  0.43.0                   pypi_0    pypi
numba                     0.60.0                   pypi_0    pypi
openai-whisper            20231117                 pypi_0    pypi
pytorch                   1.13.1          cpu_py310hc26b713_0
tiktoken                  0.6.0           py310ha2369f3_0    https://ftp.osuosl.org/pub/open-ce/current

python --version
Python 3.10.13

#P10 CPU MMA
(base) [root@sk06qmn50v ~]# conda list | egrep -w  "openai-whisper|numba|pytorch|llvmlite|tiktoken"

llvmlite                  0.43.0                   pypi_0    pypi
numba                     0.60.0                   pypi_0    pypi
openai-whisper            20240927                 pypi_0    pypi
pytorch-base              2.1.2           cpu_py311_pb4.21.12_7    rocketce
pytorch-cpu               2.1.2                   py311_1    rocketce
tiktoken                  0.6.0           py311ha2369f3_0    rocketce

python --version
Python 3.11.5

P10 1050 hardware: 72 cores, 40 activated
LPAR configuration was 8 or 12 CPUs in dedicated mode, 16 or 32 GB RAM

Thanks,
Tomas



------------------------------
Tomas Kovacik
Rocket Forum Shared Account
------------------------------