
model_cpu_offload failed in unidiffusers pipeline #11443

Open
@yao-matrix

Description


Describe the bug

The UniDiffusers pipeline fails under model CPU offload with the error shown in the Reproduction section below.

I took a deeper look. In this case, self.text_decoder.encode is called after text_encoder and before image_encoder. The problem is that this call only exercises a submodule of text_decoder and is not covered by model_cpu_offload_seq, so no offload hook is registered for it during enable_model_cpu_offload and it is left orphaned on the CPU. I don't have a good idea for a fix, since it is an embedded submodule of a sub-model and whether it is triggered is a runtime decision based on reduce_text_emb_dim, but I'm willing to contribute the fix.
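For context, a minimal workaround sketch that sidesteps the mismatch by pinning text_decoder on the accelerator after enabling offload. The checkpoint id, prompt, and device selection below are assumptions for illustration, not taken from the failing test:

```python
# Workaround sketch, not a proposed fix: text_decoder.encode is not moved to the
# accelerator by the offload hooks, so keep the whole text_decoder resident on the
# execution device; its weights then match the device of the text_encoder output.
import torch
from diffusers import UniDiffuserPipeline

device = "xpu" if hasattr(torch, "xpu") and torch.xpu.is_available() else "cuda"

pipe = UniDiffuserPipeline.from_pretrained(
    "thu-ml/unidiffuser-v1", torch_dtype=torch.float16  # checkpoint id assumed
)
pipe.enable_model_cpu_offload()

# Trades the memory savings for this one component against correctness.
pipe.text_decoder.to(device)

sample = pipe(prompt="an astronaut riding a horse", num_inference_steps=20)
```

This only papers over the problem; a real fix presumably has to make the offload machinery aware of the text_decoder.encode call site.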

@yiyixuxu @sayakpaul @DN6

Reproduction

pytest -rA tests/pipelines/unidiffuser/test_unidiffuser.py::UniDiffuserPipelineFastTests::test_model_cpu_offload_forward_pass

Running the test above produces the error log below. The same issue happens on CUDA.

self = Linear(in_features=32, out_features=32, bias=True)
input = tensor([[[-0.8407, -0.3964, -0.6832, ..., -0.2908, 0.1523, -1.0043],
[-0.8155, -0.1579, 0.6659, ..., 1.4...375, -0.4626, -0.3352],
[-1.2005, -0.1820, 0.4218, ..., -0.3822, -0.5105, -0.2234]]],
device='xpu:0')

def forward(self, input: Tensor) -> Tensor:
    # print(f"input.device: {input.device}, weight device: {self.weight.device}")
  return F.linear(input, self.weight, self.bias)

E RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and xpu:0! (when checking argument for argument mat1 in method wrapper_XPU_addmm)
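
To make the orphaned-module claim easier to verify, here is a small diagnostic sketch that prints which pipeline components received an accelerate offload hook from enable_model_cpu_offload. The checkpoint id is an assumption (the fast test builds its own tiny components), and the check relies on accelerate attaching the _hf_hook attribute when it registers a hook:

```python
# Diagnostic sketch: list which nn.Module components carry an offload hook and
# where their weights live right after enable_model_cpu_offload.
import torch
from diffusers import UniDiffuserPipeline

pipe = UniDiffuserPipeline.from_pretrained(
    "thu-ml/unidiffuser-v1", torch_dtype=torch.float16  # checkpoint id assumed
)
pipe.enable_model_cpu_offload()

for name, module in pipe.components.items():
    if isinstance(module, torch.nn.Module):
        hooked = hasattr(module, "_hf_hook")
        dev = next(module.parameters()).device
        print(f"{name:15s} hook={hooked} device={dev}")
```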

Logs

System Info

N/A

Who can help?

No response
