Pool never closes if acquire with timeout is cancelled #547

Closed
@aaliddell

  • asyncpg version: 0.20.1
  • PostgreSQL version: 11
  • Do you use a PostgreSQL SaaS?: No
  • Python version: 3.6 and 3.7
  • Platform: Linux
  • Do you use pgbouncer?: No
  • Did you install asyncpg with pip?: Yes
  • If you built asyncpg locally, which version of Cython did you use?: N/A
  • Can the issue be reproduced under both asyncio and uvloop?: Yes

Here's a demo:

import asyncio
import asyncpg

async def main():
    # Create a pool
    pool = await asyncpg.create_pool(...)

    # An example function using the pool
    async def use_pool(pool):
        async with pool.acquire(timeout=200):
            pass

    # Schedule the task and cancel
    task = asyncio.ensure_future(use_pool(pool))
    await asyncio.sleep(0.0000000001)  # Yield to let task start
    task.cancel()

    print('Closing pool')
    await pool.close()
    print('Closed')

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
loop.close()

This code creates a pool, then starts a task that tries to acquire a connection with a timeout. This task is allowed to start, but is then cancelled. Following this, the pool is closed. The expected behaviour is that the pool should close cleanly, as no connections are in use.

When running this code, I see 'Closing pool' printed, then the script hangs indefinitely. It appears the pool is waiting for the release of the connection, which has not correctly handled the cancellation. When observing this in actual code, printing the pending tasks shows one stuck at PoolConnectionHolder.wait_until_released() (line 229). This suggests that the _in_use future is not being resolved correctly during cancellation.
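For anyone trying to confirm the same symptom, the pending tasks can be listed from inside the hung program. This is a minimal sketch, assuming Python 3.7+ (`asyncio.all_tasks()` and `asyncio.current_task()`; on 3.6 the equivalents are `asyncio.Task.all_tasks()` and `asyncio.Task.current_task()`). The helper name `describe_pending_tasks` and the `stuck_waiter` demo coroutine are hypothetical and not part of asyncpg:

```python
import asyncio


def describe_pending_tasks():
    """Return one line per unfinished task, naming where it is suspended."""
    current = asyncio.current_task()
    lines = []
    for task in asyncio.all_tasks():
        if task is current or task.done():
            continue
        # get_stack() returns the frame(s) at which the task's coroutine
        # is currently suspended; the last entry is the coroutine's own frame.
        frames = task.get_stack()
        if frames:
            frame = frames[-1]
            where = "{} ({}:{})".format(
                frame.f_code.co_name, frame.f_code.co_filename, frame.f_lineno
            )
        else:
            where = "<no frame available>"
        lines.append("pending: " + where)
    return lines


async def _demo():
    # Hypothetical stand-in for a task that never finishes.
    async def stuck_waiter():
        await asyncio.Event().wait()  # this event is never set

    task = asyncio.ensure_future(stuck_waiter())
    await asyncio.sleep(0)  # yield once so the task starts and suspends
    for line in describe_pending_tasks():
        print(line)
    task.cancel()


if __name__ == "__main__":
    asyncio.run(_demo())
```

In the hanging script, something like this would have to run from a separate watchdog task started just before `pool.close()`, since `main()` itself never gets past that await.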

However, if I remove the timeout=200 on the acquire, the bug goes away. Additionally, the 0.0000000001 second sleep is somewhat important:

  • Setting this to a higher value eventually makes the code work, as the task completes before it is cancelled.
  • Removing this sleep also makes the code 'work', as the task never gets a chance to start.

Effectively, there is a very short 'critical' window (~10 µs on this machine) during which an arriving cancellation leaves the pool unclosable. Hence this bug has been a total pain to hunt for.
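As a possible stopgap while the cause is unknown, the shutdown can be bounded instead of awaiting `pool.close()` unconditionally. This is a minimal sketch, assuming the pool object offers a `close()` coroutine and a synchronous `terminate()` method (asyncpg's `Pool` documents both). The helper name `close_with_deadline` is hypothetical, and this only avoids the hang; it does not fix whatever leaves the connection marked as in use:

```python
import asyncio


async def close_with_deadline(pool, timeout):
    """Try a graceful close; fall back to terminate() if it does not finish.

    Returns True if close() completed within `timeout` seconds, else False.
    """
    try:
        await asyncio.wait_for(pool.close(), timeout)
        return True
    except asyncio.TimeoutError:
        # The graceful close is still waiting on a connection that was
        # never released, so drop the connections forcibly instead.
        pool.terminate()
        return False
```

In the demo above, `await pool.close()` would become something like `await close_with_deadline(pool, 5)`, where the 5-second limit is an arbitrary illustrative value.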
