Skip to content

Connection reset from long-running or stale API connections #371

Closed
@mathcass

Description

@mathcass

Describe the bug

As we've used the openai.ChatCompletion.create (with gpt-3.5-turbo), we've had intermittent

requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

without a clear reproduction. At first I thought it was #91 and due to too many open connections to the OpenAI servers. Now I think it looks more like #368 instead, but I have some hypotheses about it. I'm opening a new issue separate from #368 in case they're different. If this is a duplicate, we can feel free to tack on my details there.

My hypothesis is that if you have a long running process (like a web server), and it calls out to OpenAI, that periods of inactivity cause the server side to terminate the connection and it takes a long time for the client to reestablish the connection. I dug into related issues on the requests side (like this one, psf/requests#4937) that hinted at the root cause. Essentially, what I think is happening is that,

  • First connection is made to OpenAI, returns a result, requests maintains a connection under the hood with default keep-alive
  • some time passes, in my experience, around 10 minutes should do
  • New connection is made to OpenAI, but the client throws a ConnectionResetError
    • A new call after this succeeds

I believe that the OpenAI servers are terminating the connection after a brief time (perhaps minutes) but the client still tries to keep it alive.

The reason why I think this is a bug worth reporting is that I think you could modify the client code so it responds more gracefully to these server-side settings. Changing some of the keep-alive settings from the default ones would help out several folks using this.

To Reproduce

  1. Write a long-running program. In our case, we have a Python web server running FastAPI
  2. As part of a route for the server, call OpenAI to do some work. In our case, we're calling openai.ChatCompletion.create with gpt-3.5-turbo to manipulate some input language and respond back with it
  3. Run the server and call the endpoint once
  4. Wait 10 minutes
  5. Call the endpoint again
  6. You'll likely get a Connection reset by peer issue on the second call

Code snippets

No response

OS

Linux

Python version

Python v3.8

Library version

openai-python 0.27.2

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingfixed in v1Issues addressed by the v1 beta

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions