
CUDA cannot generate images  #95

Closed
ggml-org/ggml
#669
@wailovet

Description

@wailovet

I ran into a strange problem: when running with CUDA, the generated image is pure green. However, it works fine on another computer.

sd_cuda.exe  -m meinamix_meinaV11-f16.gguf -p "1girl" -v
Option:
    n_threads:       6
    mode:            txt2img
    model_path:      meinamix_meinaV11-f16.gguf
    output_path:     output.png
    init_img:
    prompt:          1girl
    negative_prompt:
    cfg_scale:       7.00
    width:           512
    height:          512
    sample_method:   euler_a
    schedule:        default
    sample_steps:    20
    strength:        0.75
    rng:             cuda
    seed:            42
    batch_count:     1
System Info:
    BLAS = 1
    SSE3 = 1
    AVX = 1
    AVX2 = 1
    AVX512 = 0
    AVX512_VBMI = 0
    AVX512_VNNI = 0
    FMA = 1
    NEON = 0
    ARM_FMA = 0
    F16C = 1
    FP16_VA = 0
    WASM_SIMD = 0
    VSX = 0
[DEBUG] stable-diffusion.cpp:3701 - Using CUDA backend
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce GTX 1070, compute capability 6.1
[INFO]  stable-diffusion.cpp:3715 - loading model from 'meinamix_meinaV11-f16.gguf'
[DEBUG] stable-diffusion.cpp:3733 - load_from_file: - kv   0:                              sd.model.name str
[DEBUG] stable-diffusion.cpp:3733 - load_from_file: - kv   1:                             sd.model.dtype i32
[DEBUG] stable-diffusion.cpp:3733 - load_from_file: - kv   2:                           sd.model.version i8
[DEBUG] stable-diffusion.cpp:3733 - load_from_file: - kv   3:                            sd.vocab.tokens arr
[INFO]  stable-diffusion.cpp:3743 - Stable Diffusion 1.x | meinamix_meinaV11.safetensors
[INFO]  stable-diffusion.cpp:3751 - model data type: f16
[DEBUG] stable-diffusion.cpp:3755 - loading vocab
[DEBUG] stable-diffusion.cpp:3771 - ggml tensor size = 416 bytes
[DEBUG] stable-diffusion.cpp:887  - clip params backend buffer size =  236.18 MB (449 tensors)
[DEBUG] stable-diffusion.cpp:2028 - unet params backend buffer size =  1641.16 MB (706 tensors)
[DEBUG] stable-diffusion.cpp:3118 - vae params backend buffer size =  95.47 MB (164 tensors)
[DEBUG] stable-diffusion.cpp:3780 - preparing memory for the weights
[DEBUG] stable-diffusion.cpp:3798 - loading weights
[DEBUG] stable-diffusion.cpp:3903 - model size = 1969.67MB
[INFO]  stable-diffusion.cpp:3913 - total memory buffer size = 1972.80MB (clip 236.18MB, unet 1641.16MB, vae 95.47MB)
[INFO]  stable-diffusion.cpp:3915 - loading model from 'meinamix_meinaV11-f16.gguf' completed, taking 0.92s
[INFO]  stable-diffusion.cpp:3939 - running in eps-prediction mode
[DEBUG] stable-diffusion.cpp:3966 - finished loaded file
[DEBUG] stable-diffusion.cpp:4647 - prompt after extract and remove lora: "1girl"
[INFO]  stable-diffusion.cpp:4652 - apply_loras completed, taking 0.00s
[DEBUG] stable-diffusion.cpp:1118 - parse '1girl' to [['1girl', 1], ]
[DEBUG] stable-diffusion.cpp:521  - split prompt "1girl" to tokens ["1</w>", "girl</w>", ]
[DEBUG] stable-diffusion.cpp:1051 - learned condition compute buffer size: 1.58 MB
[DEBUG] stable-diffusion.cpp:4061 - computing condition graph completed, taking 455 ms
[DEBUG] stable-diffusion.cpp:1118 - parse '' to [['', 1], ]
[DEBUG] stable-diffusion.cpp:521  - split prompt "" to tokens []
[DEBUG] stable-diffusion.cpp:1051 - learned condition compute buffer size: 1.58 MB
[DEBUG] stable-diffusion.cpp:4061 - computing condition graph completed, taking 415 ms
[INFO]  stable-diffusion.cpp:4681 - get_learned_condition completed, taking 876 ms
[INFO]  stable-diffusion.cpp:4691 - sampling using Euler A method
[INFO]  stable-diffusion.cpp:4694 - generating image: 1/1
[DEBUG] stable-diffusion.cpp:2384 - diffusion compute buffer size: 552.57 MB
  |==================================================| 20/20 - 7.42s/it
[INFO]  stable-diffusion.cpp:4706 - sampling completed, taking 157.10s
[INFO]  stable-diffusion.cpp:4714 - generating 1 latent images completed, taking 157.12s
[INFO]  stable-diffusion.cpp:4716 - decoding 1 latents
[DEBUG] stable-diffusion.cpp:3252 - vae compute buffer size: 1664.00 MB
[DEBUG] stable-diffusion.cpp:4605 - computing vae [mode: DECODE] graph completed, taking 6.65s
[INFO]  stable-diffusion.cpp:4724 - latent 1 decoded, taking 6.66s
[INFO]  stable-diffusion.cpp:4728 - decode_first_stage completed, taking 6.66s
[INFO]  stable-diffusion.cpp:4735 - txt2img completed in 164.66s
save result image to 'output.png'


output
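
One detail in the log above that may be worth checking: `ggml_init_cublas` prints `CUDA_USE_TENSOR_CORES: yes`, while the detected GPU is a GTX 1070 with compute capability 6.1, and tensor cores were only introduced with compute capability 7.0. Whether this has anything to do with the green output is not established here. The following is a minimal sketch, not part of stable-diffusion.cpp or ggml, that extracts the device lines from such a log and flags GPUs that predate tensor cores. The helper names `parse_devices` and `predates_tensor_cores` are hypothetical.

```python
import re

# Tensor cores first appeared with compute capability 7.0 (Volta).
# A GPU below this threshold cannot have them; a GPU at or above it may or may not.
FIRST_TENSOR_CORE_CC = (7, 0)


def parse_devices(log_text):
    """Return (index, name, (major, minor)) for each 'Device N: ...' line in the log."""
    pattern = re.compile(
        r"Device\s+(\d+):\s+(.+?),\s+compute capability\s+(\d+)\.(\d+)"
    )
    return [
        (int(idx), name, (int(major), int(minor)))
        for idx, name, major, minor in pattern.findall(log_text)
    ]


def predates_tensor_cores(cc):
    """True if the compute capability is older than the first tensor-core generation."""
    return cc < FIRST_TENSOR_CORE_CC


# Lines copied from the log in this report.
log = """\
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce GTX 1070, compute capability 6.1
"""

for idx, name, cc in parse_devices(log):
    note = "predates tensor cores" if predates_tensor_cores(cc) else "7.0 or newer"
    print(f"device {idx}: {name} (compute capability {cc[0]}.{cc[1]}): {note}")
```

Running the same check on the log from the computer where generation works would show whether the two machines differ in GPU generation.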
