Closed
Description
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
The server should start normally when launched with --n_gqa 8.
Current Behavior
warnings.warn(
usage: __main__.py [-h] [--model MODEL]
                   [--model_alias MODEL_ALIAS] [--n_ctx N_CTX]
                   [--n_gpu_layers N_GPU_LAYERS]
                   [--tensor_split TENSOR_SPLIT]
                   [--rope_freq_base ROPE_FREQ_BASE]
                   [--rope_freq_scale ROPE_FREQ_SCALE]
                   [--seed SEED] [--n_batch N_BATCH]
                   [--n_threads N_THREADS] [--f16_kv F16_KV]
                   [--use_mlock USE_MLOCK] [--use_mmap USE_MMAP]
                   [--embedding EMBEDDING] [--low_vram LOW_VRAM]
                   [--last_n_tokens_size LAST_N_TOKENS_SIZE]
                   [--logits_all LOGITS_ALL] [--cache CACHE]
                   [--cache_type CACHE_TYPE]
                   [--cache_size CACHE_SIZE]
                   [--vocab_only VOCAB_ONLY] [--verbose VERBOSE]
                   [--host HOST] [--port PORT]
                   [--interrupt_requests INTERRUPT_REQUESTS]
                   [--n_gqa N_GQA] [--rms_norm_eps RMS_NORM_EPS]
__main__.py: error: argument --n_gqa: invalid Optional value: '8'
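Possible cause (a guess, not confirmed by the output above): the wording "invalid Optional value" is what argparse prints when the callable given as `type=` is named `Optional` and raises on the input string. That would happen if the parser received an annotation such as `typing.Optional[int]` directly instead of `int`. The sketch below is a minimal, standalone illustration of unwrapping such an annotation before handing it to argparse. `unwrap_optional` and `build_parser` are hypothetical names for this example, not functions from the project.

```python
import argparse
from typing import Optional, Union, get_args, get_origin


def unwrap_optional(annotation):
    """Return T for Optional[T]; return any other annotation unchanged."""
    if get_origin(annotation) is Union:
        # Optional[T] is Union[T, None]; drop the None member.
        inner = [a for a in get_args(annotation) if a is not type(None)]
        if len(inner) == 1:
            return inner[0]
    return annotation


def build_parser():
    # Hypothetical parser with a single option, standing in for the server CLI.
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument(
        "--n_gqa",
        type=unwrap_optional(Optional[int]),  # argparse now receives plain int
        default=None,
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--n_gqa", "8"])
    print(args.n_gqa)
```

With the annotation unwrapped, `--n_gqa 8` parses to the integer 8, and omitting the flag leaves the value as `None`. Whether the project builds its parser this way is an assumption.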