The downsampling ratio of VAE in maisi #1841
DopamineLcy asked this question in Q&A (unanswered).
-
The downsampling ratio of the VAE in MAISI is 4, i.e. a [128, 128, 128] patch results in a [32, 32, 32] latent. However, 32^3 = 32768 latent tokens is very large for a transformer-based model. Is there any pre-trained VAE with a larger downsampling ratio, such as 8 ([128, 128, 128] -> [16, 16, 16])?
Thank you!
Replies: 1 comment
-
We currently do not have an 8x8x8 downsampling VAE. Thank you for letting us know that this is desired!
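
For reference, a minimal sketch of the shape arithmetic behind the question (the downsampling factors 4 and 8 and the [128, 128, 128] patch size come from the thread; the helper function is illustrative and not part of the MAISI API):

```python
from math import prod

def latent_tokens(patch_shape, downsample_factor):
    """Return the latent spatial shape and token count for a given
    spatial downsampling factor (illustrative helper, not a MAISI API)."""
    latent_shape = tuple(s // downsample_factor for s in patch_shape)
    return latent_shape, prod(latent_shape)

patch = (128, 128, 128)
for factor in (4, 8):
    shape, tokens = latent_tokens(patch, factor)
    print(f"factor {factor}: latent {shape}, {tokens} tokens per patch")

# factor 4: latent (32, 32, 32), 32768 tokens per patch
# factor 8: latent (16, 16, 16), 4096 tokens per patch
```

An 8x downsampling VAE would cut the token count per patch by a factor of 8 (32768 to 4096), which is why it matters for transformer-based latent models.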