Metadata field not properly deserialized when using async_mode=True with PGVector #124

When using the `PGVector` class with `async_mode=True`, the `metadata` field of the `Document` objects returned from query methods (e.g., `asimilarity_search_with_score_by_vector`) is not deserialized into a Python `dict`. Instead, it remains as a `Fragment` object or another non-dict type. This causes a `ValidationError` when the `Document` class expects `metadata` to be a dictionary.

**To Reproduce**

Steps to reproduce the behavior:

1. Initialize `PGVector` with `async_mode=True` and `use_jsonb=True`.
2. Add documents to the vector store with metadata.
3. Perform an asynchronous similarity search, e.g., `asimilarity_search` or `asimilarity_search_with_score_by_vector`.
4. Observe that the returned `Document` objects have `metadata` fields that are not dictionaries.

**Expected behavior**

The `metadata` field of the returned `Document` objects should be properly deserialized into Python dictionaries, matching the behavior when `async_mode=False`.

**Actual behavior**

When `async_mode=True`, the `metadata` field is a `Fragment` object (from `asyncpg`), leading to errors when the code expects a `dict`.

**Error message**

```
ValidationError: 1 validation error for Document
metadata
  Input should be a valid dictionary [type=dict_type, input_value=Fragment(buf=b'{"user_id": "ahmed"}'), input_type=Fragment]
```

**Environment:**

- `langchain_postgres` version: 0.0.12
- Python version: 3.10, 3.11, 3.12
- Database: PostgreSQL with `pgvector` extension
- Async driver: `asyncpg`

**Additional context**

This issue arises because `asyncpg` returns JSONB fields as `Record` or `Fragment` objects, which are not automatically deserialized into Python dictionaries by SQLAlchemy when using asynchronous sessions.

**Code to Reproduce**

Replace the placeholders `connection_string`, `collection_name`, and `embedding_model` with your own values before running the code, and keep real credentials out of shared snippets.

```python
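import asyncio

from langchain_core.documents import Document
from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_postgres.vectorstores import PGVector

# Placeholders: supply your own values. With async_mode=True the URL must use
# an async driver, e.g. one starting with "postgresql+asyncpg://".
connection_string = "your_connection_string_here"
collection_name = "your_collection_name_here"
# Any LangChain Embeddings instance works; a fake model keeps the repro small.
embedding_model = DeterministicFakeEmbedding(size=8)


async def main() -> None:
    # Initialize PGVector with the necessary parameters
    vstore = PGVector(
        connection=connection_string,
        collection_name=collection_name,
        embeddings=embedding_model,
        use_jsonb=True,
        pre_delete_collection=False,
        async_mode=True,  # Set to True to reproduce the issue
    )

    # Add a document with metadata
    await vstore.aadd_documents(
        [Document(page_content="example", metadata={"user_id": "ahmed"})]
    )

    # Perform an asynchronous similarity search
    query_vector = await embedding_model.aembed_query("example")
    results = await vstore.asimilarity_search_with_score_by_vector(query_vector, k=1)
    for doc, _score in results:
        # Expected: <class 'dict'>. The issue: the search raises ValidationError
        # before this line because metadata is not a dictionary.
        print(type(doc.metadata))


asyncio.run(main())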
```

**Proposed Solution**

Modify the `_results_to_docs_and_scores` method in the `PGVector` class to ensure that the `metadata` field is correctly converted into a dictionary before creating the `Document` objects.
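
As a rough illustration of the conversion described above, here is a minimal sketch of a normalizing helper. The name `coerce_metadata` is hypothetical and not part of `langchain_postgres`, and the `.buf` attribute is an assumption taken from the `Fragment(buf=b'...')` repr in the error message rather than from the driver's documented API.

```python
import json
from typing import Any


def coerce_metadata(value: Any) -> dict:
    """Best-effort conversion of a raw metadata value into a dict.

    Hypothetical helper for illustration; not part of langchain_postgres.
    """
    if value is None:
        return {}
    if isinstance(value, dict):
        return value
    # Assumption: Fragment-like objects expose the raw JSON bytes as `.buf`,
    # as suggested by the repr shown in the error message.
    raw = getattr(value, "buf", value)
    if isinstance(raw, (bytes, bytearray, memoryview)):
        raw = bytes(raw).decode("utf-8")
    if isinstance(raw, str):
        parsed = json.loads(raw)
        if isinstance(parsed, dict):
            return parsed
    raise TypeError(f"Cannot convert {type(value).__name__} to a metadata dict")
```

A helper along these lines could be applied to each row's metadata inside `_results_to_docs_and_scores` before the `Document` is constructed. Values that are already dictionaries pass through unchanged, so the `async_mode=False` path would not be affected.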


Related Issues:
#118

