Describe the issue
I have been using ipex-xpu for a while with an Arc-series dGPU, and it works great.
Recently, I tried to run the same code on my iGPU (Intel(R) UHD Graphics 770 1.3 [1.3.26241] / Intel(R) Graphics [0x46a6] 1.3 [1.3.26241]), and I get the error below every time F.linear is called:
```
onednn_verbose,info,oneDNN v3.2.0 (commit 67bc621a2da4aefc51f0a59b2af2398fa1d3e1c8)
onednn_verbose,info,cpu,runtime:threadpool,nthr:10
onednn_verbose,info,cpu,isa:Intel AVX2 with Intel DL Boost
onednn_verbose,info,gpu,runtime:DPC++
onednn_verbose,info,gpu,engine,0,backend:Level Zero,name:Intel(R) Arc(TM) A730M Graphics,driver_version:1.3.26241,binary_kernels:enabled
onednn_verbose,info,gpu,engine,1,backend:Level Zero,name:Intel(R) Graphics [0x46a6],driver_version:1.3.26241,binary_kernels:enabled
onednn_verbose,info,experimental features are enabled
onednn_verbose,info,use batch_normalization stats one pass is enabled
onednn_verbose,info,prim_template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,error,level_zero,errcode 1879048196
```
```
Traceback (most recent call last):
  File "/home/arda/ruonan/gpu-test/test_chatglm2.py", line 46, in <module>
    output = model.generate(**inputs, do_sample=False, temperature=0.9, max_new_tokens=32)
  File "/home/arda/miniconda3/envs/ruonan-ipex2/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/arda/miniconda3/envs/ruonan-ipex2/lib/python3.9/site-packages/transformers/generation/utils.py", line 1538, in generate
    return self.greedy_search(
  File "/home/arda/miniconda3/envs/ruonan-ipex2/lib/python3.9/site-packages/transformers/generation/utils.py", line 2362, in greedy_search
    outputs = self(
  File "/home/arda/miniconda3/envs/ruonan-ipex2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/arda/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 845, in forward
    transformer_outputs = self.transformer(
  File "/home/arda/miniconda3/envs/ruonan-ipex2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/arda/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 741, in forward
    hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
  File "/home/arda/miniconda3/envs/ruonan-ipex2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/arda/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 588, in forward
    hidden_states, kv_cache = layer(
  File "/home/arda/miniconda3/envs/ruonan-ipex2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/arda/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 510, in forward
    attention_output, kv_cache = self.self_attention(
  File "/home/arda/miniconda3/envs/ruonan-ipex2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/arda/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 342, in forward
    mixed_x_layer = self.query_key_value(hidden_states)
  File "/home/arda/miniconda3/envs/ruonan-ipex2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/arda/miniconda3/envs/ruonan-ipex2/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: could not create a primitive
```
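For reference, the failure comes down to a single `F.linear` call inside the model's `query_key_value` projection. A minimal sketch is below; the tensor shapes are illustrative (not the exact chatglm2-6b ones), and it falls back to CPU when no XPU device is available, so on a dGPU it runs fine while on the iGPU the same call raises the error above:

```python
import torch
import torch.nn.functional as F

# Pick the XPU device if intel_extension_for_pytorch is installed and an
# XPU backend is available; otherwise fall back to CPU for comparison.
device = "cpu"
try:
    import intel_extension_for_pytorch  # noqa: F401
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        device = "xpu"
except ImportError:
    pass

x = torch.randn(1, 32, 4096, device=device)   # hidden_states (illustrative shape)
w = torch.randn(4608, 4096, device=device)    # query_key_value weight (illustrative)
b = torch.randn(4608, device=device)          # bias

# On the UHD Graphics 770 iGPU this raises:
#   RuntimeError: could not create a primitive
out = F.linear(x, w, b)
print(out.shape)
```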
I'm just wondering: is there any plan to support Iris iGPUs?