Fixes training resuming: Advanced Dreambooth LoRa Training #6566

Merged

Contributor

@steverhoades steverhoades commented Jan 13, 2024

What does this PR do?

Fixes #6482
Part of #6552

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Train

!accelerate launch scripts/train_dreambooth_lora_sdxl_advanced_orig.py \
  --report_to="wandb" \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
  --pretrained_vae_model_name_or_path="madebyollin/sdxl-vae-fp16-fix" \
  --dataset_name="./training_set" \
  --output_dir="father_lora_v21" \
  --cache_dir="./dataset_cache_dir" \
  --caption_column="prompt" \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of Brian de palma" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4   \
  --gradient_checkpointing \
  --snr_gamma=5.0 \
  --lr_scheduler="cosine_with_restarts" \
  --lr_warmup_steps=0 \
  --repeats=10 \
  --max_train_steps=20 \
  --checkpointing_steps=10 \
  --validation_prompt="a photo of Brian de palma in a suit, looking at the camera" \
  --validation_epochs=1 \
  --with_prior_preservation \
  --class_data_dir="./prior_preservation-man-v2" \
  --num_class_images=110 \
  --class_prompt="a photo of a man" \
  --rank=32 \
  --optimizer="prodigy" \
  --prodigy_safeguard_warmup=True \
  --prodigy_use_bias_correction=True \
  --adam_beta1=0.9 \
  --adam_beta2=0.99 \
  --adam_weight_decay=0.01 \
  --train_text_encoder \
  --learning_rate=1 \
  --text_encoder_lr=1 \
  --resume_from_checkpoint="checkpoint-10" \
  --seed="0"

Resume

!accelerate launch scripts/train_dreambooth_lora_sdxl_advanced.py \
  --report_to="wandb" \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
  --dataset_name="./training_set_ti" \
  --output_dir="father_lora_v1-test" \
  --cache_dir="./dataset_cache_dir" \
  --caption_column="prompt" \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of TOK man" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4   \
  --gradient_checkpointing \
  --snr_gamma=5.0 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=20 \
  --checkpointing_steps=10 \
  --checkpoints_total_limit=10 \
  --validation_prompt="a photo of TOK man in a suit, looking directly at the camera" \
  --validation_epochs=1 \
  --with_prior_preservation \
  --class_data_dir="./prior_preservation-man-v2" \
  --num_class_images=100 \
  --class_prompt="a photo of man" \
  --rank=32 \
  --optimizer="prodigy" \
  --prodigy_safeguard_warmup=True \
  --prodigy_use_bias_correction=True \
  --adam_beta1=0.9 \
  --adam_beta2=0.99 \
  --adam_weight_decay=0.01 \
  --learning_rate=1 \
  --text_encoder_lr=1 \
  --train_text_encoder_ti \
  --train_text_encoder_frac=0.5 \
  --resume_from_checkpoint="checkpoint-10" \
  --seed="0"
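Both commands pass `--resume_from_checkpoint="checkpoint-10"` together with `--checkpointing_steps=10`. As a rough illustration of how a checkpoint name can be mapped back to a training step, here is a minimal sketch. It assumes checkpoints are saved as `checkpoint-<step>` directories under `--output_dir` and that the value `latest` means "pick the highest step"; the helper name `resolve_checkpoint` is hypothetical and is not taken from the training script.

```python
import os
import tempfile


def resolve_checkpoint(output_dir, resume_from_checkpoint):
    """Return (directory_name, global_step), or (None, 0) if nothing matches.

    Hypothetical helper: assumes directories are named `checkpoint-<step>`.
    """
    if resume_from_checkpoint != "latest":
        name = os.path.basename(resume_from_checkpoint)
    else:
        candidates = [d for d in os.listdir(output_dir) if d.startswith("checkpoint-")]
        # Sort by the numeric step, so checkpoint-10 ranks after checkpoint-2.
        candidates.sort(key=lambda d: int(d.split("-")[1]))
        name = candidates[-1] if candidates else None

    if name is None or not os.path.isdir(os.path.join(output_dir, name)):
        return None, 0
    return name, int(name.split("-")[1])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        for step in (2, 10):
            os.mkdir(os.path.join(out_dir, f"checkpoint-{step}"))
        print(resolve_checkpoint(out_dir, "latest"))
        print(resolve_checkpoint(out_dir, "checkpoint-2"))
```

The numeric sort is the detail worth noting: a plain string sort would rank `checkpoint-2` after `checkpoint-10` and resume from the wrong step.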

Member

@sayakpaul sayakpaul left a comment

Thanks for the PR. Let's also include the training example commands as instructed here.

@sayakpaul sayakpaul requested a review from linoytsaban January 15, 2024 11:37
@steverhoades steverhoades force-pushed the fix_train_dreamboth_lora_sdxl_advanced branch from 0e2769e to ca1f7df Compare January 16, 2024 04:15
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

LGTM! Thanks for the PR! 🔥

@sayakpaul sayakpaul merged commit 181280b into huggingface:main Jan 16, 2024
@sayakpaul
Member

Thanks for getting this in. Much appreciated!

AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
…ce#6566)

* Fixes huggingface#6418 Advanced Dreambooth LoRa Training

* change order of import to fix nit

* fix nit, use cast_training_params

* remove torch.compile fix, will move to a new PR

* remove unnecessary import
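
The commit list above mentions `cast_training_params`, and the linked issue reports `ValueError: Attempting to unscale FP16 gradients.` on resume. PyTorch's gradient scaler raises that error when a parameter being optimized holds fp16 gradients, so a common remedy is to keep trainable parameters in fp32 while leaving frozen weights in half precision. The sketch below illustrates that idea with plain PyTorch. It is an assumption about the kind of fix involved, not the PR's actual diff, and `upcast_trainable_params` is a hypothetical name.

```python
import torch
from torch import nn


def upcast_trainable_params(modules, dtype=torch.float32):
    """Cast only parameters with requires_grad=True to `dtype`.

    Hypothetical helper for illustration; frozen parameters are left untouched.
    """
    if isinstance(modules, nn.Module):
        modules = [modules]
    for module in modules:
        for param in module.parameters():
            if param.requires_grad:
                param.data = param.data.to(dtype)


# Stand-in for a half-precision model with one frozen and one trainable tensor.
layer = nn.Linear(4, 4).half()
layer.bias.requires_grad_(False)

upcast_trainable_params(layer)

# The trainable weight is now float32; the frozen bias stays float16.
print(layer.weight.dtype, layer.bias.dtype)
```

In a resume path, a call like this would run after the checkpoint weights are loaded and before the optimizer takes its first step, since loading can reintroduce fp16 tensors.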

Successfully merging this pull request may close these issues.

train_dreambooth_lora_sdxl_advanced.py --resume_from_checkpoint fails with ValueError: Attempting to unscale FP16 gradients.