train_dreambooth_lora_flux validation RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same #9476

Open
@squewel

Describe the bug

When train_dreambooth_lora_flux.py generates images during validation, it fails in the VAE decode step with RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same.
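The same class of error can be reproduced in isolation, which may help when narrowing down where the dtypes diverge. The snippet below is only a minimal sketch, assuming plain PyTorch on CPU: `conv`, `latents`, and `try_decode` are hypothetical stand-ins with arbitrary shapes, not code from the training script, and the snippet does not identify the actual cause in `log_validation`.

```python
import torch

# Stand-in for the VAE decoder's first convolution, cast to bfloat16.
# Channel counts and kernel size are arbitrary (assumption, not the real config).
conv = torch.nn.Conv2d(16, 32, kernel_size=3, padding=1).to(torch.bfloat16)

# Stand-in for latents that arrive at the decoder in float32.
latents = torch.randn(1, 16, 8, 8, dtype=torch.float32)


def try_decode(x):
    """Run the conv and return (ok, message) instead of raising."""
    try:
        conv(x)
        return True, ""
    except RuntimeError as err:
        return False, str(err)


# float32 input into a bfloat16 layer: expected to fail with a dtype mismatch.
mismatch_ok, mismatch_msg = try_decode(latents)

# Same input cast to the layer's dtype: expected to run.
matched_ok, _ = try_decode(latents.to(conv.weight.dtype))

print(mismatch_ok, mismatch_msg)
print(matched_ok)
```

If the isolated case behaves the same way, comparing the dtype of the latents with `pipeline.vae.dtype` just before `self.vae.decode(...)` would be one way to confirm the mismatch in the real run.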

Reproduction

Just follow the steps from README_flux.md for DreamBooth LoRA with text-encoder training:

```shell
export OUTPUT_DIR="trained-flux-dev-dreambooth-lora"

accelerate launch train_dreambooth_lora_flux.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="bf16" \
  --train_text_encoder \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --guidance_scale=1 \
  --gradient_accumulation_steps=4 \
  --optimizer="prodigy" \
  --learning_rate=1. \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --seed="0" \
  --push_to_hub
```

Logs

```shell
09/19/2024 23:08:58 - INFO - __main__ - Running validation... ███████████████████████████████████████████████████| 7/7 [00:00<00:00, 13.76it/s]
 Generating 4 images with prompt: a photo of sks dog
W0919 23:12:39.471000 139969377689600 torch/fx/experimental/symbolic_shapes.py:4449] [0/3] xindex is not in var_ranges, defaulting to unknown range.
W0919 23:17:03.532000 139969377689600 torch/fx/experimental/symbolic_shapes.py:4449] [0/4] xindex is not in var_ranges, defaulting to unknown range.
Traceback (most recent call last):
  File "/workspace/flux-diffusers/diffusers/examples/dreambooth/train_dreambooth_lora_flux.py", line 1890, in <module>
    main(args)
  File "/workspace/flux-diffusers/diffusers/examples/dreambooth/train_dreambooth_lora_flux.py", line 1810, in main
    images = log_validation(
  File "/workspace/flux-diffusers/diffusers/examples/dreambooth/train_dreambooth_lora_flux.py", line 189, in log_validation
    images = [pipeline(**pipeline_args, generator=generator).images[0] for _ in range(args.num_validation_images)]
  File "/workspace/flux-diffusers/diffusers/examples/dreambooth/train_dreambooth_lora_flux.py", line 189, in <listcomp>
    images = [pipeline(**pipeline_args, generator=generator).images[0] for _ in range(args.num_validation_images)]
  File "/workspace/flux-diffusers/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/flux-diffusers/diffusers/src/diffusers/pipelines/flux/pipeline_flux.py", line 762, in __call__
    image = self.vae.decode(latents, return_dict=False)[0]
  File "/workspace/flux-diffusers/diffusers/src/diffusers/utils/accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
  File "/workspace/flux-diffusers/diffusers/src/diffusers/models/autoencoders/autoencoder_kl.py", line 321, in decode
    decoded = self._decode(z).sample
  File "/workspace/flux-diffusers/diffusers/src/diffusers/models/autoencoders/autoencoder_kl.py", line 292, in _decode
    dec = self.decoder(z)
  File "/workspace/flux-diffusers/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/workspace/flux-diffusers/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/flux-diffusers/diffusers/src/diffusers/models/autoencoders/vae.py", line 291, in forward
    sample = self.conv_in(sample)
  File "/workspace/flux-diffusers/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/workspace/flux-diffusers/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/flux-diffusers/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 458, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/workspace/flux-diffusers/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 454, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same
```

System Info

Diffusers:

- Platform: Linux-5.4.0-167-generic-x86_64-with-glibc2.35
- Running on Google Colab?: No
- Python version: 3.10.12
- PyTorch version (GPU?): 2.4.1+cu121 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.25.0
- Transformers version: 4.44.2
- Accelerate version: 0.34.2
- PEFT version: 0.12.0
- Bitsandbytes version: not installed
- Safetensors version: 0.4.5
- xFormers version: not installed
- Accelerator: NVIDIA L40, 46068 MiB

Accelerate config:

compute_environment: LOCAL_MACHINE
debug: false
distributed_type: 'NO'
downcast_bf16: 'no'
dynamo_config:
  dynamo_backend: INDUCTOR
enable_cpu_affinity: false
gpu_ids: all
machine_rank: 0
main_training_function: main
mixed_precision: 'no'
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false

Who can help?

@sayakpaul @linoytsaban
