Description

Update: Disabling GPU offloading (changing --n-gpu-layers 83 to --n-gpu-layers 0) seems to "fix" my issue with embeddings.
Dear Llama Community,
I could use a hint about the embeddings API of the example server. Thanks a lot in advance for your time and any help!
I'm sending a string (e.g. "hello") to the /embedding endpoint of the llama example server, running against a 70B Llama 2 model.
In the raw result, I see this json:
{"embedding":[3.1069589551009844e-41,4.1016006050787396e-42,4.736388809417882e-43,3.8956097308229915e-43,5.1834030195374983e-42,4.200111887120774e-41,1.0165019060212223e-41,4.1883409800204457e-41,0.0883883461356163,7.370829922348538e-43,3.685414961174269e-43,1.1832564232758755e-41,4.188761369559743e-41,4.75040179406113e-42,1.8483126744444337e-42,4.512181055125911e-43,0.0,0.0,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,1.401298464324817e-45,0.0,0.0,null,null,null,null,null,0.009384572505950928,-0.029291629791259766,0.007542848587036133,0.025960087776184082,-0.005349218845367432,0.014909744262695313,0.007542848587036133,-0.0035074949264526367,-0.0016657710075378418,-0.0163995623588562,-0.012716114521026611,...
My goal is to convert this to an OpenAI-compatible format and use it with LangChain and a Chroma DB.
(see here for a llama.cpp-to-openai-server: https://github.com/adrianliechti/llama/blob/main/llama-openai/provider/llama/llama.go)
Currently the vector DB search returns no results when using the llama embeddings (but it works fine using the OpenAI embeddings).
My assumption is that I somehow have to convert or format this result. I'm not sure about the "null" values, since I haven't seen such values in OpenAI results.
Do these embedding numbers look normal or weird to you? Do you have a hint on how to convert them properly? If I parse them as float32 in Go, the "null" values would become 0; would that make sense?
Thanks a ton!! Adrian