
Bug: llama-server crash with --embeddings #9978

Closed
@mokeyish

Description

What happened?

After starting the server with the following command, it occasionally crashes without warning while running.

llama-server -m ./bge-large-zh-v1.5 --port 3358 -a emb@bge-large-zh-v1.5 -ngl 100 -c 8192 --samplers temperature;top_p --embeddings -ub 8192 --pooling cls
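
The report does not show the requests that were in flight when the crash happened; the sketch below only illustrates the kind of traffic such a server receives. It assumes llama-server's OpenAI-compatible /v1/embeddings route on the port given above, uses the model alias set with -a, and sends a placeholder input string. Build with libcurl (cc client.c -lcurl):

/* Minimal client sketch for exercising the embeddings endpoint.
   Assumptions: the /v1/embeddings route, the emb@bge-large-zh-v1.5
   alias from -a, and a placeholder input text. */
#include <curl/curl.h>

int main(void) {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL * curl = curl_easy_init();
    if (curl == NULL) return 1;

    struct curl_slist * headers = curl_slist_append(NULL, "Content-Type: application/json");

    const char * body =
        "{\"model\": \"emb@bge-large-zh-v1.5\","
        " \"input\": \"placeholder document text\"}";

    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:3358/v1/embeddings");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);

    CURLcode res = curl_easy_perform(curl); /* response body goes to stdout */

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return res == CURLE_OK ? 0 : 1;
}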

Name and Version

ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 4 CUDA devices:
Device 0: NVIDIA A100 80GB PCIe, compute capability 8.0, VMM: yes
Device 1: NVIDIA A100 80GB PCIe, compute capability 8.0, VMM: yes
Device 2: NVIDIA A100 80GB PCIe, compute capability 8.0, VMM: yes
Device 3: NVIDIA A100 80GB PCIe, compute capability 8.0, VMM: yes
version: 3945 (45f0976)
built with cc (Debian 10.2.1-6) 10.2.1 20210110 for x86_64-linux-gnu

What operating system are you seeing the problem on?

Linux

Relevant log output

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f210ada7787 in __GI___wait4 (pid=3074567, stat_loc=0x7fffc0b6d3a4, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27      ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0  0x00007f210ada7787 in __GI___wait4 (pid=3074567, stat_loc=0x7fffc0b6d3a4, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27      in ../sysdeps/unix/sysv/linux/wait4.c
#1  0x00007f210b21b638 in ggml_abort () from /home/user/llama.cpp/build/ggml/src/libggml.so
#2  0x00007f210b21f700 in ggml_compute_forward_get_rows () from //home/user/llama.cpp/build/ggml/src/libggml.so
#3  0x00007f210b24d0a2 in ggml_graph_compute_thread.isra () from /home/user/llama.cpp/build/ggml/src/libggml.so
#4  0x00007f210b250cf6 in ggml_graph_compute () from /home/user/llama.cpp/build/ggml/src/libggml.so
#5  0x00007f210b25caf3 in ggml_backend_cpu_graph_compute(ggml_backend*, ggml_cgraph*) () from /home/user/llama.cpp/build/ggml/src/libggml.so
#6  0x00007f210b261d75 in ggml_backend_sched_graph_compute_async () from /home/user/llama.cpp/build/ggml/src/libggml.so
#7  0x00007f2120eb15c2 in llama_decode () from /home/user/llama.cpp/build/src/libllama.so
#8  0x000055776938aa04 in server_context::update_slots() ()
#9  0x000055776936d7e1 in server_queue::start_loop() ()
#10 0x0000557769324981 in main ()
[Inferior 1 (process 2694414) detached]
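
Frame #1 (ggml_abort) called from frame #2 (ggml_compute_forward_get_rows) points at a failed GGML_ASSERT inside the get_rows CPU kernel, typically a bounds check on a row index. The following is a simplified, standalone sketch of that assert-to-abort pattern, not the exact ggml source; get_row and the index values are hypothetical stand-ins:

/* Simplified sketch of how a failed GGML_ASSERT ends in ggml_abort
   (pattern follows ggml, but this is not the exact source). */
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

static void ggml_abort(const char * file, int line, const char * fmt, ...) {
    va_list args;
    va_start(args, fmt);
    fprintf(stderr, "%s:%d: ", file, line);
    vfprintf(stderr, fmt, args);
    fputc('\n', stderr);
    va_end(args);
    abort(); /* raises SIGABRT; the backtrace above starts here */
}

#define GGML_ASSERT(x) \
    do { if (!(x)) ggml_abort(__FILE__, __LINE__, "GGML_ASSERT(%s) failed", #x); } while (0)

/* Hypothetical stand-in for the bounds check in
   ggml_compute_forward_get_rows: each row index read from the index
   tensor must fall inside the source tensor's row count. */
static float * get_row(float * src, long long n_rows, long long n_cols, long long i) {
    GGML_ASSERT(i >= 0 && i < n_rows); /* an out-of-range index aborts */
    return src + i * n_cols;
}

int main(void) {
    float data[4 * 8] = {0};
    get_row(data, 4, 8, 2);  /* in range: fine */
    get_row(data, 4, 8, 99); /* out of range: GGML_ASSERT -> ggml_abort */
    return 0;
}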

Labels

bug: Something isn't working
critical severity: Used to report critical severity bugs in llama.cpp (e.g. Crashing, Corrupted, Dataloss)
