
Description
It feels like there is minimal or no testing before commits are merged into llama.cpp. I don't understand why there are so many unintentional consequences and bugs.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
I expect ./server to function so that I can visit http://127.0.0.1:8080
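Besides the web UI, I'd expect the HTTP API itself to respond. A minimal check from another terminal (a sketch, assuming the server example's default /completion endpoint; the prompt and n_predict values here are just placeholders):
curl --request POST \
    --url http://127.0.0.1:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Hello, world", "n_predict": 16}'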
Current Behavior
After navigating to http://127.0.0.1:8080, the page is blank, and refreshing doesn't help. Here's the log:
./server -m ~/Pygmalion-Vicuna-1.1-7b.ggmlv3.Q4_0.bin -c 2048 -t 2 -b 7
{"timestamp":1692016653,"level":"INFO","function":"main","line":1179,"message":"build info","build":983,"commit":"1cd06fa"}
{"timestamp":1692016653,"level":"INFO","function":"main","line":1184,"message":"system info","n_threads":2,"total_threads":8,"system_info":"AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 | "}
llama.cpp: loading model from /data/data/com.termux/files/home/Pygmalion-Vicuna-1.1-7b.ggmlv3.Q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 5504
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_head_kv = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: n_gqa = 1
llama_model_load_internal: rnorm_eps = 5.0e-06
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: freq_base = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.08 MB
llama_model_load_internal: mem required = 3647.96 MB (+ 1024.00 MB per state)
llama_new_context_with_model: kv self size = 1024.00 MB
llama_new_context_with_model: compute buffer total size = 3.42 MB
llama server listening at http://127.0.0.1:8080
{"timestamp":1692016659,"level":"INFO","function":"main","line":1413,"message":"HTTP server listening","hostname":"127.0.0.1","port":8080}
{"timestamp":1692016667,"level":"INFO","function":"log_server_request","line":1152,"message":"request","remote_addr":"127.0.0.1","remote_port":37016,"status":200,"method":"GET","path":"/","params":{}}
{"timestamp":1692016667,"level":"INFO","function":"log_server_request","line":1152,"message":"request","remote_addr":"127.0.0.1","remote_port":37016,"status":404,"method":"GET","path":"/json-schema-to-grammar.mjs","params":{}}
{"timestamp":1692016667,"level":"INFO","function":"log_server_request","line":1152,"message":"request","remote_addr":"127.0.0.1","remote_port":37016,"status":200,"method":"GET","path":"/index.js","params":{}}
{"timestamp":1692016667,"level":"INFO","function":"log_server_request","line":1152,"message":"request","remote_addr":"127.0.0.1","remote_port":37018,"status":200,"method":"GET","path":"/completion.js","params":{}}
Environment and Context
$ uname -a
Linux localhost 4.14.190-23725627-abG975WVLS8IWD1 #2 SMP PREEMPT Mon Apr 10 18:16:39 KST 2023 aarch64 Android
Python 3.11.4
GNU Make 4.4.1
Built for aarch64-unknown-linux-android
clang version 16.0.6
Target: aarch64-unknown-linux-android24
Thread model: posix
InstalledDir: /data/data/com.termux/files/usr/bin
Steps to Reproduce
- git clone
- cd llama.cpp
- cmake -B build -DCMAKE_C_FLAGS=-march=armv8.4a
- cd build
- cmake --build . --config Release
- ./server ...
./server works when built with make, but fails as described above when built with CMake.
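For comparison, the make-based build that works for me is roughly the following (a sketch from the repository root on Termux; exact make targets and flags may differ depending on the checkout):
make clean
make
./server -m ~/Pygmalion-Vicuna-1.1-7b.ggmlv3.Q4_0.bin -c 2048 -t 2 -b 7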