Describe the bug
When training the AutoencoderKL model with the train_autoencoderkl.py example script, the training loss does not converge on the ImageNet dataset.
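A minimal sanity check (not part of the original report; only a sketch assuming diffusers 0.33.0.dev0 and torchvision are installed): load the pretrained stabilityai/sd-vae-ft-mse VAE that the script fine-tunes from and measure its reconstruction MSE on one of the validation images from the reproduction command below, to get a baseline against which the non-converging training loss can be compared.

import torch
import torch.nn.functional as F
from torchvision import transforms
from diffusers import AutoencoderKL
from diffusers.utils import load_image

# Load the same pretrained VAE that train_autoencoderkl.py starts from.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
vae.eval()

# Preprocess to the training resolution and the [-1, 1] range the VAE expects.
preprocess = transforms.Compose([
    transforms.Resize(128),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
    transforms.Normalize([0.5], [0.5]),
])

image = load_image("./val/ILSVRC2012_val_00000293.JPEG")  # path from the command below
x = preprocess(image).unsqueeze(0)

with torch.no_grad():
    posterior = vae.encode(x).latent_dist        # diagonal Gaussian over latents
    recon = vae.decode(posterior.sample()).sample
    print(f"baseline reconstruction MSE: {F.mse_loss(recon, x).item():.4f}")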
Reproduction
Script
accelerate launch --multi_gpu --num_processes=2 --gpu_ids=0,1 \
train_autoencoderkl.py \
--pretrained_model_name_or_path stabilityai/sd-vae-ft-mse \
--max_train_steps 850000 \
--validation_steps 100 \
--checkpointing_steps 1000 \
--gradient_accumulation_steps 2 \
--learning_rate 4.5e-6 \
--lr_scheduler cosine \
--report_to wandb \
--mixed_precision bf16 \
--train_batch_size 8 \
--dataloader_num_workers 16 \
--output_dir autoencoderkl-model/imagenet \
--train_data_dir /datasets/image/imagenet-test/train \
--validation_image ./val/ILSVRC2012_val_00000293.JPEG ./val/ILSVRC2012_val_00002138.JPEG \
--resolution 128
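For reference (an inference from the flags above, not a statement from the original report): with --train_batch_size 8 per process, 2 processes, and --gradient_accumulation_steps 2, each optimizer step sees 8 × 2 × 2 = 32 images at 128×128 resolution, with a base learning rate of 4.5e-6 decayed by a cosine schedule in bf16 mixed precision.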
Logs
System Info
- 🤗 Diffusers version: 0.33.0.dev0
- Platform: Linux-5.15.0-67-generic-x86_64-with-glibc2.17
- Running on Google Colab?: No
- Python version: 3.8.20
- PyTorch version (GPU?): 2.4.1+cu121 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.30.1
- Transformers version: 4.46.3
- Accelerate version: 1.0.1
- PEFT version: not installed
- Bitsandbytes version: 0.45.4
- Safetensors version: 0.5.3
- xFormers version: 0.0.28.post1
- Accelerator: NVIDIA GeForce RTX 3090, 24576 MiB
  NVIDIA GeForce RTX 3090, 24576 MiB
- Using GPU in script?: Yes (2x NVIDIA GeForce RTX 3090, per the launch command above)
- Using distributed or parallel set-up in script?: Yes (accelerate launch --multi_gpu with 2 processes)