
feat: add bpe #119

Merged · 7 commits into leejet:master · Dec 23, 2023

Conversation

@Cyberhan123 (Contributor) commented Dec 19, 2023

fix: #85

@leejet (Owner) commented Dec 19, 2023

I don't think it's necessary to load the merges file; the vocab is already available, just use it. The only thing we need to do is implement the actual BPE function.
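For reference, the core of the BPE function mentioned here can be sketched as a loop that repeatedly merges the adjacent symbol pair with the lowest merge rank. This is a minimal illustrative sketch, not the code from this PR; `merge_rank` is an assumed map from symbol pairs to their priority in merges.txt:

```cpp
#include <limits>
#include <map>
#include <string>
#include <vector>

using Pair = std::pair<std::string, std::string>;

// Greedy BPE: repeatedly merge the adjacent pair with the lowest rank
// until no adjacent pair appears in the merge table.
std::vector<std::string> bpe(std::vector<std::string> symbols,
                             const std::map<Pair, int>& merge_rank) {
    while (symbols.size() > 1) {
        int    best_rank = std::numeric_limits<int>::max();
        size_t best_i    = 0;
        for (size_t i = 0; i + 1 < symbols.size(); i++) {
            auto it = merge_rank.find({symbols[i], symbols[i + 1]});
            if (it != merge_rank.end() && it->second < best_rank) {
                best_rank = it->second;
                best_i    = i;
            }
        }
        if (best_rank == std::numeric_limits<int>::max()) {
            break;  // no applicable merge left
        }
        symbols[best_i] += symbols[best_i + 1];       // merge the pair
        symbols.erase(symbols.begin() + best_i + 1);  // drop the right half
    }
    return symbols;
}
```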

@Cyberhan123 (Contributor, Author)

Okay, I'll adjust it

@Cyberhan123 (Contributor, Author) commented Dec 19, 2023

> I don't think it's necessary to load the merges file; the vocab is already available, just use it. The only thing we need to do is implement the actual BPE function.

I actually want to implement customized CLIP loading, for example chinese-clip. I wonder if this idea is feasible?

Edit: a possible obstacle is that the tokenizer it uses is not BPE at all.

@leejet (Owner) commented Dec 19, 2023

The CLIP model works if it can produce output with a distribution similar to that of the CLIP model used by Stable Diffusion, or if there is a dedicated UNet model. I don't know whether chinese-clip is suitable for Stable Diffusion, because I haven't studied its training process in detail.

@Cyberhan123 (Contributor, Author)

> I don't think it's necessary to load the merges file; the vocab is already available, just use it. The only thing we need to do is implement the actual BPE function.

@leejet I just checked the built-in vocab.json. vocab.json cannot replace the function of merges.txt because it does not provide the subword merge order. Another problem is that the token IDs of the BPE algorithm are derived from word-occurrence frequency rather than a fixed mapping, so I am a bit worried that unexpected errors may occur when using the built-in vocabulary list.
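To make the ordering point concrete: the rank map that drives BPE comes straight from the line order of merges.txt, which vocab.json does not record. An illustrative sketch, not the PR's code:

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

using Pair = std::pair<std::string, std::string>;

// Line i of merges.txt becomes rank i: an earlier line means that pair
// is merged first. vocab.json alone carries no such ordering.
std::map<Pair, int> build_merge_ranks(const std::vector<Pair>& merges) {
    std::map<Pair, int> merge_rank;
    for (size_t rank = 0; rank < merges.size(); rank++) {
        merge_rank[merges[rank]] = (int)rank;
    }
    return merge_rank;
}
```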

@leejet (Owner) commented Dec 20, 2023

I've improved the tokenizer's handling of Unicode, generating the vocab from the merges information, similar to https://github.com/openai/CLIP/blob/main/clip/simple_tokenizer.py. Now you can merge the latest master branch into your PR, making it convenient to implement BPE. Please refer to that file for guidance.
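The vocab derivation described here follows simple_tokenizer.py's ordering: base byte tokens, their "</w>" end-of-word variants, one token per merge, then the special tokens. A hedged sketch; `byte_tokens` stands in for the bytes-to-unicode table and all names are illustrative assumptions:

```cpp
#include <string>
#include <vector>

// Token id == index into the returned list, matching simple_tokenizer.py.
std::vector<std::string> build_vocab(
        const std::vector<std::pair<std::string, std::string>>& merges,
        const std::vector<std::string>& byte_tokens) {
    std::vector<std::string> vocab = byte_tokens;   // base byte characters
    for (const auto& t : byte_tokens) {
        vocab.push_back(t + "</w>");                // end-of-word variants
    }
    for (const auto& m : merges) {
        vocab.push_back(m.first + m.second);        // one token per merge
    }
    vocab.push_back("<|startoftext|>");
    vocab.push_back("<|endoftext|>");
    return vocab;
}
```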

@FSSRepo (Contributor) commented Dec 20, 2023

I already have an implementation of embeddings (textual inversion) ready, and it works fine. However, it requires a specific way of loading the embedding from the prompt, e.g. `bad quality, ugly, face malformed, bad anatomy, embd:bad-art-neg, embd:<embedding filename>`, since custom embeddings are appended to the end of token_embed_weight, and "new tokens" that point to the positions of the new embeddings have to be created.

Another issue is that it is unclear how other projects handle very long prompts of more than 77 tokens. Do they only use the first 77 and ignore the rest? Some embeddings require up to 75 tokens, which covers almost the entire available context.
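For context, the scheme described above can be pictured as appending rows to the token embedding matrix and minting fresh token IDs that point at them. All names below (`TokenEmbeddings`, `add_embedding`) are illustrative assumptions, not the PR's actual identifiers:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

struct TokenEmbeddings {
    size_t hidden_size = 768;                        // CLIP-L hidden size
    std::vector<float> weight;                       // [n_tokens * hidden_size], row-major
    std::unordered_map<std::string, int> token_ids;  // "embd:<name>" -> id

    // Append one embedding vector and return the new token id that
    // points at the appended row of token_embed_weight.
    int add_embedding(const std::string& name, const std::vector<float>& vec) {
        int new_id = (int)(weight.size() / hidden_size);
        weight.insert(weight.end(), vec.begin(), vec.end());
        token_ids["embd:" + name] = new_id;
        return new_id;
    }
};
```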

@Cyberhan123 (Contributor, Author)

> I've improved the tokenizer's handling of Unicode, generating the vocab from the merges information, similar to https://github.com/openai/CLIP/blob/main/clip/simple_tokenizer.py. Now you can merge the latest master branch into your PR, making it convenient to implement BPE. Please refer to that file for guidance.

Thx

Merge commit message:

    # Conflicts:
    #	model.cpp
    #	model.h
    #	stable-diffusion.cpp
@Cyberhan123 (Contributor, Author)

Now we have completed all the logic, but I feel the performance is not very good. After investigation, I found that load_from_merges does a lot of looping. Now we need to make a trade-off between faster loading and lower memory usage.

If we want to improve the speed, we can add a built-in vocab.json, which will increase memory usage by about 1 MB. If we want to use less memory, we can keep the current code.

@Cyberhan123 (Contributor, Author)

> I already have an implementation of embeddings (textual inversion) ready, and it works fine. However, it requires a specific way of loading the embedding from the prompt, e.g. `bad quality, ugly, face malformed, bad anatomy, embd:bad-art-neg, embd:<embedding filename>`, since custom embeddings are appended to the end of token_embed_weight, and "new tokens" that point to the positions of the new embeddings have to be created.
>
> Another issue is that it is unclear how other projects handle very long prompts of more than 77 tokens. Do they only use the first 77 and ignore the rest? Some embeddings require up to 75 tokens, which covers almost the entire available context.

@FSSRepo Please check out the link below:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/cf2772fab0af5573da775e7437e6acdca424f26e/modules/sd_hijack_clip.py#L203

In summary, the prompt is split into multiple chunks.
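Based on the linked AUTOMATIC1111 code, the idea is roughly: split the token stream into chunks of at most 75 IDs, wrap each with BOS/EOS (plus padding) into a full 77-token context, encode each chunk separately, and concatenate the results. A hedged sketch of the splitting step, with assumed names:

```cpp
#include <algorithm>
#include <vector>

// Split `tokens` into 77-token contexts: [BOS, up to 75 ids, EOS/padding].
std::vector<std::vector<int>> chunk_tokens(const std::vector<int>& tokens,
                                           int bos, int eos,
                                           size_t chunk_size = 75) {
    std::vector<std::vector<int>> chunks;
    size_t n = tokens.size();
    for (size_t i = 0; i < n || chunks.empty(); i += chunk_size) {
        size_t end = std::min(i + chunk_size, n);
        std::vector<int> chunk;
        chunk.push_back(bos);
        chunk.insert(chunk.end(), tokens.begin() + i, tokens.begin() + end);
        while (chunk.size() < chunk_size + 2) {
            chunk.push_back(eos);  // pad the tail of the last chunk
        }
        chunks.push_back(chunk);
    }
    return chunks;  // each chunk is encoded separately, then concatenated
}
```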

@leejet (Owner) commented Dec 21, 2023

> Now we have completed all the logic, but I feel the performance is not very good. After investigation, I found that load_from_merges does a lot of looping. Now we need to make a trade-off between faster loading and lower memory usage.
>
> If we want to improve the speed, we can add a built-in vocab.json, which will increase memory usage by about 1 MB. If we want to use less memory, we can keep the current code.

On my computer, load_merges takes 82 ms and load_vocab takes 53 ms. There isn't a significant difference. These operations, like loading the model weights, run only once, and the time spent is negligible compared to loading the weights. I believe there's no need to do this.

@Cyberhan123 (Contributor, Author)

We can merge this PR.

@Cyberhan123 (Contributor, Author)

@FSSRepo Looking forward to the merge of PR #104 so I can go ahead and work on the HIP PR.

@FSSRepo (Contributor) commented Dec 21, 2023

My pull request is ready; however, a user is experiencing issues with the VAE tiling function for reasons unknown to me. I have tested it with images of various resolutions and haven't encountered problems with incomplete images.

@Cyberhan123 (Contributor, Author)

> My pull request is ready; however, a user is experiencing issues with the VAE tiling function for reasons unknown to me. I have tested it with images of various resolutions and haven't encountered problems with incomplete images.

I tested on macOS and on Windows with CUDA and didn't encounter this problem.

@leejet (Owner) commented Dec 21, 2023

> My pull request is ready; however, a user is experiencing issues with the VAE tiling function for reasons unknown to me. I have tested it with images of various resolutions and haven't encountered problems with incomplete images.

I've finished reviewing your PR and I'm ready to merge it, but I'm unsure whether Green-Sky will still encounter the issue.

@leejet (Owner) commented Dec 21, 2023

Once I've reviewed the changes in this PR, I might merge it first.

@mwt11a commented Dec 21, 2023

I am finding that this PR produces images that have very little relation to the prompt. Running on Linux and using CPU only.

Even with a simple prompt like "photo of a cat next to a christmas tree, wearing a christmas hat": master will generate an image of a cat sitting next to a Christmas tree and wearing a red Santa-style hat, while this PR will generate a woman wearing a woolly hat, generally a close-up head shot of her.

@Cyberhan123 (Contributor, Author)

> I am finding that this PR produces images that have very little relation to the prompt. Running on Linux and using CPU only.
>
> Even with a simple prompt like "photo of a cat next to a christmas tree, wearing a christmas hat": master will generate an image of a cat sitting next to a Christmas tree and wearing a red Santa-style hat, while this PR will generate a woman wearing a woolly hat, generally a close-up head shot of her.

I just tried it and it happened to me too. Let me see what caused it.

@FSSRepo (Contributor) commented Dec 22, 2023

@Cyberhan123 Is this a complete implementation of BPE, or are there still things missing? If it's complete, you could remove "Implement BPE Tokenizer" from the TODO section of the README.

@Cyberhan123 (Contributor, Author) commented Dec 22, 2023

> @Cyberhan123 Is this a complete implementation of BPE, or are there still things missing? If it's complete, you could remove "Implement BPE Tokenizer" from the TODO section of the README.

@FSSRepo This is the complete implementation; I just made a low-level mistake, which has now been fixed.

@Cyberhan123 (Contributor, Author)

@mwt11a
If you want, please pull the latest code and test it; it works fine for me.

    ./sd -m ../../models/sd-v1-4.ckpt -p "photo of a cat next to a christmas tree, wearing a christmas hat" --type q8_0

[output image]

@Cyberhan123 (Contributor, Author)

emmm, the cat generated by this model looks a bit like Gargamel...

@Cyberhan123 (Contributor, Author) commented Dec 22, 2023

@leejet
When I build with MSVC and Ninja, I think I know what caused it: the -DCMAKE_BUILD_TYPE=Debug option.

[screenshot]
[screenshot]

But -DCMAKE_BUILD_TYPE=Release does not trigger it.

[screenshot]

So I changed the code to:

  auto end = merges.begin() + 49152 - 256 - 2 + 1; // this is the line that throws
  merges = std::vector<std::u32string>(merges.begin() + 1, end);

Then I stepped into the exception. The macro _ITERATOR_DEBUG_LEVEL, which checks whether an iterator moves out of range, only takes effect in Debug mode; in Release mode the check is compiled away to (void)_Off.

[screenshot]
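For reference, one way to avoid the Debug-mode assertion is to clamp the slice bound so the iterator can never advance past merges.end(), since advancing past the end is exactly what the _ITERATOR_DEBUG_LEVEL machinery asserts on. A sketch against the snippet above, assuming merges holds at least one entry:

```cpp
#include <algorithm>

// Clamp to merges.size() so begin() + take is always a valid iterator.
size_t take = std::min(merges.size(), (size_t)(49152 - 256 - 2 + 1));
merges = std::vector<std::u32string>(merges.begin() + 1, merges.begin() + take);
```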

@Cyberhan123 (Contributor, Author) commented Dec 22, 2023

I am more accustomed to using JetBrains products, such as WebStorm, GoLand, and CLion. Maybe due to lack of experience, I didn't find a suitable way to format the code, which caused extra work for you. I'm sorry.

@FSSRepo (Contributor) commented Dec 22, 2023

I'm working with VSCode 😂😂😂 and a lot of extensions

@leejet (Owner) commented Dec 23, 2023

@Cyberhan123 I've completed reviewing the code in this PR, and it looks good to me. I'm planning to merge it now. Thank you for your contribution.

leejet merged commit 0e64238 into leejet:master on Dec 23, 2023.
Cyberhan123 deleted the bpe branch on January 5, 2024.
Successfully merging this pull request may close these issues:

bug: tokenizer only matches exact words and no subwords (#85)