Faster set_adapters #10777


Merged

4 commits merged into huggingface:main from the faster-set-adapters branch on Feb 12, 2025

Conversation

@Luvata (Contributor) commented on Feb 12, 2025

What does this PR do?

The previous code iterated through model.named_modules() for each adapter, which can be very costly when the number of adapters reaches hundreds.

I've slightly changed the logic to iterate over model.named_modules() only once, setting the adapters for each submodule within that pass.

I haven't run extensive qualitative tests yet, but in my local experiment (Flux with 150+ adapters 😅) this change is significantly faster.
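A simplified sketch of the idea (illustrative only, not the exact diffusers code; it assumes PEFT's BaseTunerLayer marker class and the set_scale/set_adapter methods that LoRA-injected layers expose):

from peft.tuners.tuners_utils import BaseTunerLayer

# Before (roughly): one full pass over the model per adapter -> O(n_adapters * n_modules)
def set_adapters_slow(model, adapter_names, weights):
    for name, weight in zip(adapter_names, weights):
        for _, module in model.named_modules():
            if isinstance(module, BaseTunerLayer):
                module.set_scale(name, weight)
    # ... plus one more pass to activate all adapters
    for _, module in model.named_modules():
        if isinstance(module, BaseTunerLayer):
            module.set_adapter(adapter_names)

# After: a single pass, configuring every adapter on each LoRA layer -> O(n_modules)
def set_adapters_fast(model, adapter_names, weights):
    for _, module in model.named_modules():
        if isinstance(module, BaseTunerLayer):
            for name, weight in zip(adapter_names, weights):
                module.set_scale(name, weight)
            module.set_adapter(adapter_names)  # activate all adapters at once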

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sayakpaul (Member)

Thanks for this PR. Do you have any benchmarking numbers on this?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Luvata (Contributor, Author) commented on Feb 12, 2025

I made a small benchmark with Colab here.

import time
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")


def load_n_lora(n):
    # Start from a clean state, then load the same LoRA n times under different adapter names
    pipe.unload_lora_weights()
    adapter_names = []
    for i in range(n):
        adapter_names.append(f"floor{i}")
        pipe.load_lora_weights("maria26/Floor_Plan_LoRA", adapter_name=f"floor{i}")  # also very slow
    return adapter_names


for n_lora in [1, 5, 10, 20, 50]:
    adapter_names = load_n_lora(n_lora)
    adapter_weights = [1.0 / n_lora] * n_lora

    # Time only the set_adapters call
    start = time.time()
    pipe.set_adapters(adapter_names, adapter_weights=adapter_weights)
    end = time.time()
    print(f"n_lora: {n_lora}, time: {end - start}")
(times in seconds)

n_lora    main branch set_adapters    faster set_adapters    load_lora_weights
1         0.0164                      0.012                  0.466
5         0.118                       0.067                  2.763
10        0.699                       0.206                  6.082
20        1.762                       1.283                  14.064
50        9.661                       5.975                  49.573

It's consistently 1.5 to 2x faster on SD1.5, and the difference will be more significant with larger models (e.g. Flux).

Btw, load_lora_weights is also very slow as the number of LoRAs increases, but since in my use case it runs only once at the start to load all LoRAs, I didn't look into optimizing it.

@sayakpaul (Member)

Thanks for providing the benchmark!

> Btw, load_lora_weights is also very slow as the number of LoRAs increases, but since in my use case it runs only once at the start to load all LoRAs, I didn't look into optimizing it.

Can you ensure you're using low_cpu_mem_usage=True? By default it should be set to True based on your env, but just double-checking.
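For reference, passing it explicitly would look like this (a minimal sketch, assuming a diffusers/PEFT version where load_lora_weights accepts this kwarg):

pipe.load_lora_weights(
    "maria26/Floor_Plan_LoRA",
    adapter_name="floor0",
    low_cpu_mem_usage=True,  # avoids fully initializing the LoRA layers before loading the state dict
)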

@Luvata (Contributor, Author) commented on Feb 12, 2025

I reran the benchmark with low_cpu_mem_usage=True, averaged over 3 runs.

load_lora_weights is still very slow. I will look into it tomorrow.

(times in seconds)

n_lora    main: load_lora_weights    main: set_adapters    this branch: load_lora_weights    this branch: set_adapters
1         0.482 ± 0.027              0.017 ± 0.001         0.521 ± 0.064                     0.013 ± 0.000
5         2.532 ± 0.245              0.266 ± 0.218         2.576 ± 0.136                     0.074 ± 0.001
10        5.734 ± 0.157              0.470 ± 0.173         5.271 ± 0.359                     0.281 ± 0.069
20        13.121 ± 0.204             1.926 ± 0.190         13.936 ± 1.141                    1.283 ± 0.061
50        50.295 ± 0.381             10.506 ± 0.027        49.062 ± 1.400                    6.517 ± 0.327

@sayakpaul (Member) commented on Feb 12, 2025

Can you confirm whether the machine you're using to benchmark is shared with other users? Sometimes that can perturb the results.

It's a bit weird that you're experiencing such slow load times (even though this issue should be in a different thread). We benchmarked it with low_cpu_mem_usage and you can check the results here: #9510.

Don't you think it's natural to see longer load times with an increasing number of LoRAs being loaded, since we're also doing unload_lora_weights()?

@Luvata (Contributor, Author) commented on Feb 12, 2025

Oh I see, I forgot that the unload is still inside the timing; I'm rerunning the benchmark now.
Btw, for reference, in my local experiment with Flux Schnell, loading LoRA weights with the default settings for 152 adapters (rank 4) takes around 4 minutes with no unload.
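For what it's worth, a corrected timing loop based on the snippet above might look like this (a sketch; unload_lora_weights is excluded from the timed regions, and loading and set_adapters are timed separately):

import time

for n_lora in [1, 5, 10, 20, 50]:
    pipe.unload_lora_weights()  # reset state, not timed
    adapter_names = [f"floor{i}" for i in range(n_lora)]

    start = time.time()
    for name in adapter_names:
        pipe.load_lora_weights("maria26/Floor_Plan_LoRA", adapter_name=name)
    load_time = time.time() - start

    start = time.time()
    pipe.set_adapters(adapter_names, adapter_weights=[1.0 / n_lora] * n_lora)
    set_time = time.time() - start
    print(f"n_lora: {n_lora}, load: {load_time:.3f}s, set_adapters: {set_time:.3f}s")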

@Luvata (Contributor, Author) commented on Feb 12, 2025

Btw, this is my screenshot of loading 152 LoRAs with Flux Schnell, rank 4. It looks fast at first but then gets slower and slower.

It's running on my lab cluster, only me, not shared with anyone, and low_cpu_mem_usage is set to True.

[screenshots: loading progress for 152 LoRAs]

I've rerun it twice, and it still takes 5 minutes.
I should open a new issue for load_lora_weights.

@sayakpaul (Member)

> Btw, for reference, in my local experiment with Flux Schnell, loading LoRA weights with the default settings for 152 adapters (rank 4) takes around 4 minutes with no unload.

What is the expected time you would like to see here? 👀 4 mins for 152 LoRAs seems reasonable to me.

@Luvata (Contributor, Author) commented on Feb 12, 2025

In the first 7 seconds, diffusers can load 20 LoRAs, so I'd expect it could load faster overall. Never mind, I'll look into it more closely tomorrow since it's pretty late now.

@BenjaminBossan (Member) left a review

The changes here LGTM, even if there were no speedup. Thanks.

Regarding the LoRA loading, I'd suggest opening another issue and investigating the underlying problem there.

@sayakpaul (Member)

Failing test is unrelated. Thanks for your contributions!

@sayakpaul merged commit 067eab1 into huggingface:main on Feb 12, 2025
11 of 12 checks passed
@Luvata deleted the faster-set-adapters branch on February 13, 2025 at 00:34