v1.0 drops embeddings_utils.py, breaking semantic text search #676

Closed
@mrbullwinkle

Describe the bug

The previous version of the OpenAI Python library contained embeddings_utils.py, which provided functions such as cosine_similarity that are used for semantic text search with embeddings. Without this module, existing code that imports from it fails, including OpenAI's cookbook example: https://cookbook.openai.com/examples/semantic_text_search_using_embeddings

Are there plans to add this support back in, or should we create our own cosine_similarity function based on the one that was present in embeddings_utils.py:

import numpy as np

def cosine_similarity(a, b):
    # dot product divided by the product of the two vector norms
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

To Reproduce

The cookbook example cannot be converted to v1.0 without removing its dependency on embeddings_utils.py: https://cookbook.openai.com/examples/semantic_text_search_using_embeddings

Code snippets

from openai.embeddings_utils import get_embedding, cosine_similarity

# search through the reviews for a specific product
def search_reviews(df, product_description, n=3, pprint=True):
    product_embedding = get_embedding(
        product_description,
        engine="text-embedding-ada-002"
    )
    df["similarity"] = df.embedding.apply(lambda x: cosine_similarity(x, product_embedding))

    results = (
        df.sort_values("similarity", ascending=False)
        .head(n)
        .combined.str.replace("Title: ", "")
        .str.replace("; Content:", ": ")
    )
    if pprint:
        for r in results:
            print(r[:200])
            print()
    return results


results = search_reviews(df, "delicious beans", n=3)
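
For anyone hitting the same import error, here is a rough sketch of local stand-ins for the two removed helpers. It is not an official replacement: the get_embedding signature below (including the optional client parameter) is my own assumption, and it relies on the v1.0 client's embeddings.create call. cosine_similarity uses the same formula quoted above.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


def get_embedding(text, model="text-embedding-ada-002", client=None):
    """Hypothetical stand-in for the removed helper, built on the v1.0 client.

    `client` is an assumed optional parameter; when omitted, a default
    client is created.
    """
    if client is None:
        # Imported lazily so cosine_similarity works without openai installed.
        from openai import OpenAI

        client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.embeddings.create(input=[text], model=model)
    return response.data[0].embedding
```

With these defined locally, the snippet above should only need its import line removed and the engine= keyword changed to model= in the get_embedding call.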

OS

Windows

Python version

Python v3.10.11

Library version

openai-python==1.0.0rc2
