Bug: Invalid Embeddings if GPU offloaded (CUDA) #3625

Closed
@adrianliechti

Update:
Disabling GPU offloading (changing --n-gpu-layers 83 to --n-gpu-layers 0) seems to "fix" my issue with embeddings.

Dear Llama Community,

I might need a hint about the embeddings API on the (example) server. Thanks a lot in advance for your time and any help.

I'm sending a string (e.g. "hello") to the /embedding endpoint of the llama-example-server, which is running a 70B Llama 2 model.
In the raw result, I see this JSON:

{"embedding":[3.1069589551009844e-41,4.1016006050787396e-42,4.736388809417882e-43,3.8956097308229915e-43,5.1834030195374983e-42,4.200111887120774e-41,1.0165019060212223e-41,4.1883409800204457e-41,0.0883883461356163,7.370829922348538e-43,3.685414961174269e-43,1.1832564232758755e-41,4.188761369559743e-41,4.75040179406113e-42,1.8483126744444337e-42,4.512181055125911e-43,0.0,0.0,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,1.401298464324817e-45,0.0,0.0,null,null,null,null,null,0.009384572505950928,-0.029291629791259766,0.007542848587036133,0.025960087776184082,-0.005349218845367432,0.014909744262695313,0.007542848587036133,-0.0035074949264526367,-0.0016657710075378418,-0.0163995623588562,-0.012716114521026611,...

My goal is to convert this to an OpenAI-compatible format and use it with langchain and a chroma db.
(see here for a llama.cpp-to-openai-server: https://github.com/adrianliechti/llama/blob/main/llama-openai/provider/llama/llama.go)

Currently the vector db search returns no results when using the llama-embeddings (but works fine using the openai embeddings).

My assumption is that I somehow have to convert or format this result. I am not sure about the "null" values, since I haven't seen such values in OpenAI results.

Do these embedding numbers look normal or weird to you? Do you have a hint on how to convert them properly? If I parse them as float32 in Go, each "null" becomes 0. Would that make sense?
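
To make the "null becomes 0" question concrete, here is a minimal Go sketch of one way to decode the response without silently losing the nulls. It is not the code from the linked repository: the names embeddingResponse and decodeEmbedding are hypothetical, and the only assumption about the server is the {"embedding": [...]} shape shown above. Decoding into []*float32 instead of []float32 leaves a nil pointer wherever the JSON had null, so the nulls can be counted and the result rejected rather than stored as zeros.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingResponse mirrors the {"embedding": [...]} shape shown above.
// Pointer elements let us tell a JSON null apart from a real 0.
// (Hypothetical type name, used only for this illustration.)
type embeddingResponse struct {
	Embedding []*float32 `json:"embedding"`
}

// decodeEmbedding returns the vector with nulls mapped to 0, plus the
// number of nulls found, so the caller can decide to reject the result.
func decodeEmbedding(raw []byte) ([]float32, int, error) {
	var resp embeddingResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, 0, err
	}
	vals := make([]float32, len(resp.Embedding))
	nulls := 0
	for i, p := range resp.Embedding {
		if p == nil {
			nulls++ // vals[i] stays 0, same as decoding into []float32
			continue
		}
		vals[i] = *p
	}
	return vals, nulls, nil
}

func main() {
	raw := []byte(`{"embedding":[0.0883883461356163,null,null,-0.029291629791259766]}`)
	vals, nulls, err := decodeEmbedding(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d values, %d nulls\n", len(vals), nulls)
}
```

A non-zero null count is probably better treated as an error than converted to 0, since a vector with missing components would not compare meaningfully against other embeddings in the vector db.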

Thanks a ton!! Adrian
