Description
Model/Pipeline/Scheduler description
The current solution from PR #4015 is not optimal for using the refinement model as part of an ensemble, because it relies on a float value to select which timesteps to execute. This is described further in the comments on the PR.
I am proposing to overload the type of denoising_end and denoising_start so that they accept int as well as float. When the value is an int, it is treated as an approximate timestep/sigma that determines the start and end of the denoising loop for the sampling step.
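To make the proposal concrete, here is a minimal sketch of how an int or float value could be resolved to a timestep cutoff. It is not the code from PR #4015: the helper names `resolve_cutoff` and `split_timesteps`, the 1000-step training schedule, and the rule that the base model keeps timesteps at or above the cutoff are all assumptions for illustration.

```python
from typing import List, Tuple, Union

NUM_TRAIN_TIMESTEPS = 1000  # assumed length of the training noise schedule


def resolve_cutoff(value: Union[int, float],
                   num_train_timesteps: int = NUM_TRAIN_TIMESTEPS) -> int:
    """Map a denoising_end / denoising_start value to a discrete timestep.

    int   -> taken directly as the timestep (the proposed behaviour)
    float -> taken as the fraction of the schedule already denoised
    """
    if isinstance(value, bool):  # bool is a subclass of int, reject it explicitly
        raise TypeError("cutoff must be an int or a float, not a bool")
    if isinstance(value, int):
        if not 0 <= value <= num_train_timesteps:
            raise ValueError("int cutoff must lie in [0, num_train_timesteps]")
        return value
    if not 0.0 <= value <= 1.0:
        raise ValueError("float cutoff must lie in [0.0, 1.0]")
    return int(round(num_train_timesteps * (1.0 - value)))


def split_timesteps(timesteps: List[int],
                    value: Union[int, float]) -> Tuple[List[int], List[int]]:
    """Split a descending timestep list into (base part, refiner part)."""
    cutoff = resolve_cutoff(value)
    base_part = [t for t in timesteps if t >= cutoff]    # high-noise steps
    refiner_part = [t for t in timesteps if t < cutoff]  # low-noise steps
    return base_part, refiner_part
```

With this sketch, `denoising_end=200` and `denoising_end=0.8` resolve to the same cutoff on a 1000-step schedule, but the int form states the intended timestep directly instead of relying on rounding a fraction.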
After this is done, add the two new pipelines. These pipelines will be easy for the end user to use, will already apply the optimal split for the ensemble (base model for timesteps 201-1000, refinement model for timesteps 0-200), and will work with all schedulers. Usage would look something like this:
import torch
from diffusers import DiffusionPipeline, StableDiffusionXLImg2ImgPipeline
pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-0.9", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained("stabilityai/stable-diffusion-xl-refiner-0.9", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
# StableDiffusionEnsemblePipeline is the proposed pipeline; it does not exist in diffusers yet
ensemble_pipe = StableDiffusionEnsemblePipeline(pipe, refiner)
image = ensemble_pipe(prompt=prompt, num_inference_steps=30).images
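As a rough sketch of what such a wrapper could do internally, the class below chains two pipeline-like callables at a single cutoff. The class name `EnsembleSketch` and its `cutoff` argument are hypothetical, and passing an int to `denoising_end` / `denoising_start` is the behaviour proposed in this issue, not something the library is known to accept today. It only assumes that each pipeline is callable with keyword arguments and returns an object with an `.images` attribute.

```python
from typing import Any, Callable, Union


class EnsembleSketch:
    """Hypothetical wrapper that hands latents from a base to a refiner pipeline."""

    def __init__(self,
                 base: Callable[..., Any],
                 refiner: Callable[..., Any],
                 cutoff: Union[int, float] = 200) -> None:
        self.base = base
        self.refiner = refiner
        # int = timestep (proposed behaviour), float = fraction of the schedule
        self.cutoff = cutoff

    def __call__(self, prompt: str, num_inference_steps: int = 30,
                 **kwargs: Any) -> Any:
        # Base pass: stop at the cutoff and return latents instead of a decoded image.
        latents = self.base(
            prompt=prompt,
            num_inference_steps=num_inference_steps,
            denoising_end=self.cutoff,
            output_type="latent",
            **kwargs,
        ).images
        # Refiner pass: resume from the same cutoff, starting from the base latents.
        return self.refiner(
            prompt=prompt,
            num_inference_steps=num_inference_steps,
            denoising_start=self.cutoff,
            image=latents,
            **kwargs,
        )
```

Keeping the cutoff in one place means the base and the refiner cannot drift apart, which is the main convenience the proposed pipelines would offer over calling the two models by hand.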
Open source status
- The model implementation is available
- The model weights are available (Only relevant if addition is not a scheduler).
Provide useful links for the implementation
No response