Description
Confirm this is an issue with the Python library and not an underlying OpenAI API
- This is an issue with the Python library
Describe the bug
Progressive memory leak despite use of close(). It eventually leads to OOM, even on large-memory systems, after just a few days.
Related: #820
To Reproduce
I'm using 1.12.0 and still hit this issue.
I'm using text completion. I already call close() in a try-finally, so close() does not help. Enforcing a single global client connection doesn't make sense. The old pre-1.x client had global attributes that made the API poor, and I presume some legacy parts are still in place.
In h2oGPT, we use the OpenAI client for OpenAI or vLLM connections, and I see about 5GB of leaked memory for every 6000 connections. This happens whether I yield the generator for streaming or just exit after creating the completion.
Normally connections are not this intense, but the leak was easily reproducible by bisecting the OpenAI creation/generation parts of the code. For typical workloads this leads to OOM on a 256GB system after just a few days of usage.
Here is a repro. Please set base_url to an endpoint that you have set up, such as vLLM, TGI, or gpt-3.5-turbo, so that it is not expensive. Choose api_key and model accordingly.
import os
import psutil
from openai import OpenAI

for i in range(6000):
    client_args = dict(base_url='<choose>', api_key="EMPTY")
    client = OpenAI(**client_args)
    responses = client.completions.create(
        model='h2oai/h2ogpt-4096-llama2-13b-chat',
        prompt="Say exactly one word.",
        stream=True,
    )
    client.close()
    p = psutil.Process(os.getpid())
    print(p.memory_full_info())
The memory consumed does not increase on every step of the loop, but it does grow monotonically from pss=48523264 to pss=107862016 within a few minutes (i.e. it more than doubles), and this continues indefinitely.
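To narrow down which Python source lines retain the memory, something like the following tracemalloc sketch might help. It is not part of the openai library or h2oGPT: `top_growth` and the `workload` callable are hypothetical names, and the demo uses a stand-in that deliberately retains memory instead of a real request. In practice `workload` would wrap the create/close body of the loop above.

```python
import tracemalloc
from typing import Callable, List


def top_growth(
    workload: Callable[[], None],
    iterations: int = 100,
    limit: int = 5,
) -> List[tracemalloc.StatisticDiff]:
    """Run `workload` repeatedly and return the source lines whose
    traced allocations changed the most between two snapshots."""
    tracemalloc.start(25)  # keep up to 25 frames per allocation
    try:
        workload()  # warm-up, so one-time imports and caches are not counted
        before = tracemalloc.take_snapshot()
        for _ in range(iterations):
            workload()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to() sorts by absolute size difference, largest first
    return after.compare_to(before, "lineno")[:limit]


if __name__ == "__main__":
    _retained = []

    def fake_request() -> None:
        # Hypothetical stand-in for "create client, request, close":
        # it keeps a reference alive on purpose to simulate a leak.
        _retained.append(bytearray(10_000))

    for stat in top_growth(fake_request, iterations=50):
        print(stat)
```

Note that tracemalloc only sees allocations made through Python's allocator, so growth that shows up in pss but not here would point at native code or allocator fragmentation rather than retained Python objects.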
The problem seems to be even worse with concurrent requests in a multi-threaded setup, as if the clean-up were not thread-safe. I'm trying to put together a repro that showcases the 5GB after 6000 connections and takes only half an hour to run, but perhaps the above is sufficient.
Code snippets
No response
OS
ubuntu 22
Python version
Python v3.10
Library version
openai v1.12.0