Description
What happened?
After starting the server with the following command, it occasionally crashes without warning while running.
llama-server -m ./bge-large-zh-v1.5 --port 3358 -a emb@bge-large-zh-v1.5 -ngl 100 -c 8192 --samplers 'temperature;top_p' --embeddings -ub 8192 --pooling cls
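The crash happens while the server is handling embedding requests. For reproduction, a request along these lines is representative (this is a sketch: the payload shape follows my understanding of the native /embeddings endpoint, and the input text is made up; the exact production inputs differ):

# hypothetical example request; adjust host/port and content as needed
curl http://localhost:3358/embeddings \
    -H "Content-Type: application/json" \
    -d '{"content": "example input text"}'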
Name and Version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 4 CUDA devices:
Device 0: NVIDIA A100 80GB PCIe, compute capability 8.0, VMM: yes
Device 1: NVIDIA A100 80GB PCIe, compute capability 8.0, VMM: yes
Device 2: NVIDIA A100 80GB PCIe, compute capability 8.0, VMM: yes
Device 3: NVIDIA A100 80GB PCIe, compute capability 8.0, VMM: yes
version: 3945 (45f0976)
built with cc (Debian 10.2.1-6) 10.2.1 20210110 for x86_64-linux-gnu
What operating system are you seeing the problem on?
Linux
Relevant log output
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f210ada7787 in __GI___wait4 (pid=3074567, stat_loc=0x7fffc0b6d3a4, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27      ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0  0x00007f210ada7787 in __GI___wait4 (pid=3074567, stat_loc=0x7fffc0b6d3a4, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27 in ../sysdeps/unix/sysv/linux/wait4.c
#1 0x00007f210b21b638 in ggml_abort () from /home/user/llama.cpp/build/ggml/src/libggml.so
#2 0x00007f210b21f700 in ggml_compute_forward_get_rows () from //home/user/llama.cpp/build/ggml/src/libggml.so
#3 0x00007f210b24d0a2 in ggml_graph_compute_thread.isra () from /home/user/llama.cpp/build/ggml/src/libggml.so
#4 0x00007f210b250cf6 in ggml_graph_compute () from /home/user/llama.cpp/build/ggml/src/libggml.so
#5  0x00007f210b25caf3 in ggml_backend_cpu_graph_compute(ggml_backend*, ggml_cgraph*) () from /home/user/llama.cpp/build/ggml/src/libggml.so
#6 0x00007f210b261d75 in ggml_backend_sched_graph_compute_async () from /home/user/llama.cpp/build/ggml/src/libggml.so
#7 0x00007f2120eb15c2 in llama_decode () from /home/user/llama.cpp/build/src/libllama.so
#8 0x000055776938aa04 in server_context::update_slots() ()
#9 0x000055776936d7e1 in server_queue::start_loop() ()
#10 0x0000557769324981 in main ()
[Inferior 1 (process 2694414) detached]
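The abort originates in ggml_compute_forward_get_rows on the CPU backend, reached from llama_decode while the server is updating its slots. To capture the assertion message and a full backtrace at the moment of the abort, I can re-run the server under gdb; a sketch, using the same paths and flags as the launch command above:

# hypothetical debugging run: start the server under gdb so the next
# ggml_abort stops at the failing assertion with its message visible
gdb -q --args llama-server -m ./bge-large-zh-v1.5 --port 3358 \
    -a emb@bge-large-zh-v1.5 -ngl 100 -c 8192 --samplers 'temperature;top_p' \
    --embeddings -ub 8192 --pooling cls
# inside gdb: run, wait for the crash, then: bt full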