Description
Hello,
I used the GPU configuration, oneAPI is installed correctly, and I am working in a Python virtual environment named `ai_tr`.

I have this issue with PyTorch. The two imports at the top of the file are:

```python
import torch
import intel_extension_for_pytorch as ipex
```
I ran `source ${ONEAPI_ROOT}/setvars.sh` with this output:

```
(ai_tr) axel@Artishima:~/ai_tr/cod$ source ${ONEAPI_ROOT}/setvars.sh

:: WARNING: setvars.sh has already been run. Skipping re-execution.
   To force a re-execution of setvars.sh, use the '--force' option.
   Using '--force' can result in excessive use of your environment variables.

usage: source setvars.sh [--force] [--config=file] [--help] [...]
  --force        Force setvars.sh to re-run, doing so may overload environment.
  --config=file  Customize env vars using a setvars.sh configuration file.
  --help         Display this help message and exit.
  ...            Additional args are passed to individual env/vars.sh scripts
                 and should follow this script's arguments.

  Some POSIX shells do not accept command-line options. In that case, you can pass
  command-line options via the SETVARS_ARGS environment variable. For example:

  $ SETVARS_ARGS="ia32 --config=config.txt" ; export SETVARS_ARGS
  $ . path/to/setvars.sh

  The SETVARS_ARGS environment variable is cleared on exiting setvars.sh.
```
With `--force`:

```
(ai_tr) axel@Artishima:~/ai_tr/cod$ source ${ONEAPI_ROOT}/setvars.sh --force

:: initializing oneAPI environment ...
   -bash: BASH_VERSION = 5.1.16(1)-release
   args: Using "$@" for setvars.sh arguments: --force
:: advisor -- latest
:: ccl -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpcpp-ct -- latest
:: dpl -- latest
:: ipp -- latest
:: ippcp -- latest
:: mkl -- latest
:: mpi -- latest
:: tbb -- latest
:: vpl -- latest
:: vtune -- latest
:: oneAPI environment initialized ::
```
But this error keeps appearing whenever I try to run my training Python file:

```
xpu
/home/axel/ai_tr/lib/python3.10/site-packages/intel_extension_for_pytorch/xpu/lazy_init.py:73: UserWarning: DPCPP Device count is zero! (Triggered internally at /build/intel-pytorch-extension/csrc/gpu/runtime/Device.cpp:120.)
  _C._initExtension()
/home/axel/ai_tr/lib/python3.10/site-packages/torch/nn/modules/module.py:985: UserWarning: dpcppSetDevice: device_id is out of range (Triggered internally at /build/intel-pytorch-extension/csrc/gpu/runtime/Device.cpp:159.)
  return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
Traceback (most recent call last):
  File "/home/axel/ai_tr/cod/train.py", line 190, in <module>
    m = model.to(device)
  File "/home/axel/ai_tr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 987, in to
    return self._apply(convert)
  File "/home/axel/ai_tr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 639, in _apply
    module._apply(fn)
  File "/home/axel/ai_tr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 662, in _apply
    param_applied = fn(param)
  File "/home/axel/ai_tr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 985, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: Number of dpcpp devices should be greater than zero!
```
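As a sanity check I tried counting the devices the runtime can see (a minimal sketch; it assumes the IPEX GPU build registers the `torch.xpu` namespace on import, which the 1.13.x releases do, and falls back to 0 if either package is missing):

```python
# Device-visibility check (sketch). Assumes IPEX registers torch.xpu
# on import; prints 0 if torch/IPEX are absent or no XPU is visible.
try:
    import torch
    import intel_extension_for_pytorch as ipex  # noqa: F401
    n_xpu = torch.xpu.device_count() if hasattr(torch, "xpu") else 0
except ImportError:
    n_xpu = 0
print("XPU devices visible:", n_xpu)
```

On my machine this reports zero devices, which matches the `DPCPP Device count is zero!` warning above.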
Everything related to MKL is installed correctly, and the paths are set correctly and working. I am on Ubuntu 22.04 under WSL 2 on Windows 11, using torch 1.13.1, with the Intel drivers installed on the Windows 11 side, on an Arc A770 with an i9-13900K.
The error is triggered here:

```python
# the line below is triggering the error
m = model.to(device)
m = ipex.optimize(m)
# print the number of parameters in the model
print(sum(p.numel() for p in m.parameters())/1e6, 'M parameters')
# create a PyTorch optimizer
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)
```
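For now I can at least avoid the hard crash with a guarded device selection like the sketch below (again assuming the `torch.xpu` namespace is what the IPEX build exposes; the fallback string `"cpu"` is just the standard PyTorch device name):

```python
# Guarded device selection (sketch): fall back to CPU instead of
# crashing when the XPU runtime reports zero devices.
try:
    import torch
    import intel_extension_for_pytorch as ipex  # noqa: F401
    has_xpu = hasattr(torch, "xpu") and torch.xpu.device_count() > 0
except ImportError:
    has_xpu = False
device = "xpu" if has_xpu else "cpu"
print("using device:", device)
```

With this, `model.to(device)` runs on CPU, but of course that defeats the point of the Arc card.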
(As an aside, `optimize` seems like an odd name for that function.)

It does seem to be a device-index out-of-range issue — the runtime enumerates zero DPC++ devices — but I have no idea how to solve it.