Describe the bug
Getting an error while using enable_sequential_cpu_offload() with CogView4Pipeline.
The models are loaded without quantization, so sequential CPU offload should work.
Reproduction
from diffusers import CogView4Pipeline
import torch
pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B", torch_dtype=torch.bfloat16)
pipe.enable_sequential_cpu_offload()
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()
prompt = "A vibrant cherry red sports car sits proudly under the gleaming sun, its polished exterior smooth and flawless, casting a mirror-like reflection. The car features a low, aerodynamic body, angular headlights that gaze forward like predatory eyes, and a set of black, high-gloss racing rims that contrast starkly with the red. A subtle hint of chrome embellishes the grille and exhaust, while the tinted windows suggest a luxurious and private interior. The scene conveys a sense of speed and elegance, the car appearing as if it's about to burst into a sprint along a coastal road, with the ocean's azure waves crashing in the background."
image = pipe(
    prompt=prompt,
    guidance_scale=3.5,
    num_images_per_prompt=1,
    num_inference_steps=50,
    width=1024,
    height=1024,
).images[0]
image.save("cogview4.png")
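
For comparison, here is a variant of the same script that uses model-level offload instead of sequential offload. enable_model_cpu_offload() is the standard DiffusionPipeline alternative; whether it avoids this particular error is an assumption on my part and not something I have verified on this setup:

import torch
from diffusers import CogView4Pipeline

pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B", torch_dtype=torch.bfloat16)
# Model-level offload moves whole submodules (text encoder, transformer, VAE)
# to the GPU one at a time instead of offloading individual weights.
pipe.enable_model_cpu_offload()
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()

image = pipe(
    prompt="A vibrant cherry red sports car under the gleaming sun.",
    guidance_scale=3.5,
    num_inference_steps=50,
    width=1024,
    height=1024,
).images[0]
image.save("cogview4_model_offload.png")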
Logs
(venv) C:\aiOWN\diffuser_webui>python cogview4.py
Loading checkpoint shards: 100%|████████████████████████████████████| 3/3 [00:00<00:00, 19.10it/s]
Loading checkpoint shards: 100%|████████████████████████████████████| 4/4 [00:00<00:00, 4.69it/s]
Loading pipeline components...: 100%|███████████████████████████████| 5/5 [00:02<00:00, 1.79it/s]
Traceback (most recent call last):
  File "C:\aiOWN\diffuser_webui\cogview4.py", line 10, in <module>
    image = pipe(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\pipelines\cogview4\pipeline_cogview4.py", line 538, in __call__
    prompt_embeds, negative_prompt_embeds = self.encode_prompt(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\pipelines\cogview4\pipeline_cogview4.py", line 273, in encode_prompt
    prompt_embeds = self._get_glm_embeds(prompt, max_sequence_length, device, dtype)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\diffusers\pipelines\cogview4\pipeline_cogview4.py", line 217, in _get_glm_embeds
    prompt_embeds = self.text_encoder(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\accelerate\hooks.py", line 176, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\transformers\models\glm\modeling_glm.py", line 561, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\accelerate\hooks.py", line 171, in new_forward
    args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\accelerate\hooks.py", line 370, in pre_forward
    return send_to_device(args, self.execution_device), send_to_device(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\accelerate\utils\operations.py", line 174, in send_to_device
    return honor_type(
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\accelerate\utils\operations.py", line 81, in honor_type
    return type(obj)(generator)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\accelerate\utils\operations.py", line 175, in <genexpr>
    tensor, (send_to_device(t, device, non_blocking=non_blocking, skip_keys=skip_keys) for t in tensor)
  File "C:\aiOWN\diffuser_webui\venv\lib\site-packages\accelerate\utils\operations.py", line 155, in send_to_device
    return tensor.to(device, non_blocking=non_blocking)
NotImplementedError: Cannot copy out of meta tensor; no data!
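
The traceback shows the error coming from accelerate's pre_forward hook on the GLM text encoder's embed_tokens module, where a tensor with no data (on the meta device) is being moved. As a diagnostic, this small sketch (my own addition, assuming the pipe object from the reproduction above is still in scope) lists any text-encoder parameters that remain on the meta device after sequential offload is enabled:

# Diagnostic sketch, not part of the original run: list GLM text-encoder
# parameters that still live on the "meta" device after
# enable_sequential_cpu_offload(). Assumes `pipe` from the reproduction script.
meta_params = [
    name
    for name, param in pipe.text_encoder.named_parameters()
    if param.device.type == "meta"
]
print(f"{len(meta_params)} text_encoder parameters on the meta device")
print(meta_params[:10])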
System Info
- 🤗 Diffusers version: 0.33.0.dev0
- Platform: Windows-10-10.0.26100-SP0
- Running on Google Colab?: No
- Python version: 3.10.11
- PyTorch version (GPU?): 2.5.1+cu124 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.27.1
- Transformers version: 4.49.0
- Accelerate version: 1.4.0.dev0
- PEFT version: 0.14.1.dev0
- Bitsandbytes version: 0.45.3
- Safetensors version: 0.5.2
- xFormers version: not installed
- Accelerator: NVIDIA GeForce RTX 4060 Laptop GPU, 8188 MiB
- Using GPU in script?:
- Using distributed or parallel set-up in script?: