Hunyuan I2V #10983
@a-r-r-o-w Hi, I'm Kaisa Lim. I use and study image AI with diffusers.
thanks!
self.vae_scale_factor_spatial = self.vae.spatial_compression_ratio if getattr(self, "vae", None) else 8
self.video_processor = VideoProcessor(vae_scale_factor=self.vae_scale_factor_spatial)

def _get_llama_prompt_embeds(
it's not copied from the other pipeline?
It has extra logic to deal with image embeddings
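For readers unfamiliar with the fallback in the `vae_scale_factor_spatial` line quoted in this thread, here is a minimal sketch of the pattern. `DummyVAE`, `DummyPipeline`, and `latent_size` are hypothetical names used only for illustration. The idea that latent height and width are the pixel dimensions integer-divided by the factor is an assumption about how the value is typically used, not something stated in this PR.

```python
class DummyVAE:
    """Hypothetical stand-in for a VAE exposing a spatial compression ratio."""

    def __init__(self, spatial_compression_ratio=8):
        self.spatial_compression_ratio = spatial_compression_ratio


class DummyPipeline:
    """Hypothetical pipeline showing the getattr fallback from the diff."""

    def __init__(self, vae=None):
        # Only set the attribute when a VAE is supplied, so the getattr
        # fallback below is exercised when it is missing.
        if vae is not None:
            self.vae = vae
        # Same expression as in the diff: use the VAE's ratio when a VAE
        # is present, otherwise fall back to 8.
        self.vae_scale_factor_spatial = (
            self.vae.spatial_compression_ratio if getattr(self, "vae", None) else 8
        )


def latent_size(height, width, scale_factor):
    """Assumed relationship: latent dims are pixel dims // scale factor."""
    return height // scale_factor, width // scale_factor
```

The fallback lets the pipeline be constructed without a VAE (for example, when only some components are loaded) while still giving the video processor a usable scale factor.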
@Kaisa-Supergene I'll look into that asap. I believe these values are from the official code, so for the integration we're going to use them anyway (even if they're incorrect). We can update on our end if it is indeed different.
Failing tests are unrelated.
I got this🧐
I got the same issue too.
This version gives a different error🤣
That is bad news lol.
I'm on the v4.48.0-dev branch of transformers during the integration. Here's my environment where it does not error out:
I think we might have to version guard Hunyuan-I2V if it is causing problems.
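A rough sketch of what such a version guard could look like, using only the standard library. The helper names (`release_tuple`, `is_at_least`, `require_min_version`) are hypothetical, and the `"4.48.0"` threshold in the comments is a placeholder taken from the branch mentioned above, not a confirmed minimum. This is not how diffusers implements its own checks.

```python
import re


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release part, e.g. '4.48.0.dev0' -> (4, 48, 0).

    Simplification: dev/rc suffixes are ignored, so a dev build compares
    equal to the release it precedes.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Cannot parse version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_at_least(installed: str, minimum: str) -> bool:
    """Compare release tuples, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def require_min_version(package: str, installed: str, minimum: str) -> None:
    """Raise ImportError when the installed version is below the minimum."""
    if not is_at_least(installed, minimum):
        raise ImportError(
            f"{package}>={minimum} is required, but {installed} is installed."
        )


# In real code the installed version could be read with
# importlib.metadata.version("transformers") and passed in, e.g.
# require_min_version("transformers", installed, "4.48.0")  # placeholder minimum
```

Raising at import or pipeline-construction time gives users a clear message instead of an unrelated error deep inside the text encoder.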
Nice work!! It looks like the inference scripts and model checkpoint are from Tencent's March 6 release. They released another version on March 7 to fix the ID consistency bug, with 16 input channels to the transformer instead of 33. Any plans to adapt that as well? Thank you!
Thanks to the Tencent Hunyuan team for the amazing release!
Checkpoint: https://huggingface.co/hunyuanvideo-community/HunyuanVideo-I2V
Example:
output2.mp4