Commit 748cb0f (parent 790a909)

Add CogVideoX DDIM Inversion to Community Pipelines (#10956)

* add cogvideox ddim inversion script
* implement as a pipeline, and add documentation

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>

File tree: 2 files changed, +682 −0 lines


examples/community/README.md

Lines changed: 37 additions & 0 deletions
@@ -83,6 +83,7 @@ PIXART-α Controlnet pipeline | Implementation of the controlnet model for pixar
| [🪆Matryoshka Diffusion Models](https://huggingface.co/papers/2310.15111) | A diffusion process that denoises inputs at multiple resolutions jointly and uses a NestedUNet architecture where features and parameters for small scale inputs are nested within those of the large scales. See [original codebase](https://github.com/apple/ml-mdm). | [🪆Matryoshka Diffusion Models](#matryoshka-diffusion-models) | [![Hugging Face Space](https://img.shields.io/badge/🤗%20Hugging%20Face-Space-yellow)](https://huggingface.co/spaces/pcuenq/mdm) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/gist/tolgacangoz/1f54875fc7aeaabcf284ebde64820966/matryoshka_hf.ipynb) | [M. Tolga Cangöz](https://github.com/tolgacangoz) |
| Stable Diffusion XL Attentive Eraser Pipeline |[[AAAI2025 Oral] Attentive Eraser](https://github.com/Anonym0u3/AttentiveEraser) is a novel tuning-free method that enhances object removal capabilities in pre-trained diffusion models.|[Stable Diffusion XL Attentive Eraser Pipeline](#stable-diffusion-xl-attentive-eraser-pipeline)|-|[Wenhao Sun](https://github.com/Anonym0u3) and [Benlei Cui](https://github.com/Benny079)|
| Perturbed-Attention Guidance |StableDiffusionPAGPipeline is a modification of StableDiffusionPipeline to support Perturbed-Attention Guidance (PAG).|[Perturbed-Attention Guidance](#perturbed-attention-guidance)|[Notebook](https://github.com/huggingface/notebooks/blob/main/diffusers/perturbed_attention_guidance.ipynb)|[Hyoungwon Cho](https://github.com/HyoungwonCho)|
+| CogVideoX DDIM Inversion Pipeline | Implementation of DDIM inversion and a guided attention-based editing denoising process on CogVideoX. | [CogVideoX DDIM Inversion Pipeline](#cogvideox-ddim-inversion-pipeline) | - | [LittleNyima](https://github.com/LittleNyima) |

To load a custom pipeline, pass the `custom_pipeline` argument to `DiffusionPipeline`, naming one of the files in `diffusers/examples/community`. Feel free to send a PR with your own pipelines; we will merge them quickly.

@@ -5222,3 +5223,39 @@ with torch.no_grad():

In the folder examples/pixart there is also a script that can be used to train new models.
Please check the script `train_controlnet_hf_diffusers.sh` on how to start the training.

# CogVideoX DDIM Inversion Pipeline

This implementation performs DDIM inversion on an input video with CogVideoX and uses guided attention to reconstruct or edit the inverted latents.

## Example Usage

```python
import torch

from examples.community.cogvideox_ddim_inversion import CogVideoXPipelineForDDIMInversion


# Load the pretrained pipeline
pipeline = CogVideoXPipelineForDDIMInversion.from_pretrained(
    "THUDM/CogVideoX1.5-5B",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Run DDIM inversion; the returned output holds both the inverted latents
# and the reconstructed/edited latents
output = pipeline(
    prompt="prompt that describes the edited video",
    video_path="path/to/input.mp4",
    guidance_scale=6.0,
    num_inference_steps=50,
    skip_frames_start=0,
    skip_frames_end=0,
    frame_sample_step=None,
    max_num_frames=81,
    width=720,
    height=480,
    seed=42,
)

# Decode the final latents and save them as videos
pipeline.export_latents_to_video(output.inverse_latents[-1], "path/to/inverse_video.mp4", fps=8)
pipeline.export_latents_to_video(output.recon_latents[-1], "path/to/recon_video.mp4", fps=8)
```
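For intuition, DDIM inversion runs the deterministic DDIM update in reverse: each step re-noises the latent toward a noisier timestep using the model's predicted noise. The following is a minimal NumPy sketch of the two update rules under the standard cumulative noise schedule (`alpha_bar`), not the pipeline's actual implementation; the function names and signatures are illustrative only:

```python
import numpy as np


def ddim_inversion_step(x_t, eps, alpha_bar_t, alpha_bar_next):
    """One deterministic DDIM inversion step: map x_t to the noisier x_{t+1}.

    eps is the noise predicted by the diffusion model at timestep t;
    alpha_bar_* are cumulative noise-schedule products in (0, 1].
    """
    # Predict the clean sample x0 implied by x_t and the predicted noise
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps) / np.sqrt(alpha_bar_t)
    # Deterministically re-noise x0 toward the next (noisier) timestep
    return np.sqrt(alpha_bar_next) * x0_pred + np.sqrt(1.0 - alpha_bar_next) * eps


def ddim_denoise_step(x_next, eps, alpha_bar_next, alpha_bar_t):
    """The matching DDIM sampling step; given the same predicted noise,
    it exactly undoes the inversion step."""
    x0_pred = (x_next - np.sqrt(1.0 - alpha_bar_next) * eps) / np.sqrt(alpha_bar_next)
    return np.sqrt(alpha_bar_t) * x0_pred + np.sqrt(1.0 - alpha_bar_t) * eps
```

Because both steps are deterministic, replaying the denoising steps with the same noise predictions reconstructs the original latents; editing instead injects guided attention during this replay.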
