Description
- asyncpg version: 0.12
- PostgreSQL version: 9.6.4
- Python version: 3.5.3
- Platform: Linux x86_64 (Debian Stretch)
- Do you use pgbouncer?: no
- Did you install asyncpg with pip?: no
- If you built asyncpg locally, which version of Cython did you use?: 0.25.2
- Can the issue be reproduced under both asyncio and
uvloop?: yes
My DB schema contains thousands of tables inherited from one table. These tables store some logs or historical information and have around 100 columns. As a result, PostgreSQL information schema table pg_attribute
contains enormous number of records.
I use my own DB user sergey
to connect to PostgreSQL who is a superuser. When I connect as this user, some simple queries run very long in asyncpg.
Preparation. This code creates 5000 inherited tables with 100 columns to prepare similar environment for the test cases below:
import asyncpg
import asyncio
loop = asyncio.get_event_loop()
async def run():
conn = await asyncpg.connect(user='sergey')
await conn.execute(
'create table testlog ({})'.format(
','.join('c{:02} varchar'.format(n) for n in range(100))
)
)
for n in range(5000):
await conn.execute(
'create table testlog_{:04} () inherits (testlog)'.format(n)
)
loop.run_until_complete(run())
Example 1: Connect as sergey
and run this simple query:
import asyncpg
import asyncio
#import uvloop
#asyncio.set_event_loop(uvloop.new_event_loop())
loop = asyncio.get_event_loop()
async def run():
conn = await asyncpg.connect(user='sergey')
return await conn.fetch('select $1::integer[]', [1, 2, 3])
print(loop.run_until_complete(run()))
While it was running for about 2 minutes, I saw this query in pg_activity_stat
:
WITH RECURSIVE typeinfo_tree(
oid, ns, name, kind, basetype, has_bin_io, elemtype, elemdelim,
range_subtype, elem_has_bin_io, attrtypoids, attrnames, depth)
AS (
WITH composite_attrs
AS (
SELECT
c.reltype AS comptype_oid,
array_agg(ia.atttypid ORDER BY ia.attnum) AS typoids,
array_agg(ia.attname::text ORDER BY ia.attnum) AS names
FROM
pg_attribute ia
INNER JOIN pg_class c
ON (ia.attrelid = c.oid)
WHERE
ia.attnum > 0 AND NOT ia.attisdropped
GROUP BY
c.reltype
),
typeinfo
AS (
SELECT
t.oid AS oid,
ns.nspname AS ns,
t.typname AS name,
t.typtype AS kind,
(CASE WHEN t.typtype = 'd' THEN
(WITH RECURSIVE typebases(oid, depth) AS (
Unfortunately, pg_activity_stat
does not show the whole query but only first part. I guess the slow execution has to do with the size of pg_attribute
.
Example 2: Now connect as user postgres
and run the same query. It is still not instant but much faster (~1 second):
import asyncpg
import asyncio
#import uvloop
#asyncio.set_event_loop(uvloop.new_event_loop())
loop = asyncio.get_event_loop()
async def run():
conn = await asyncpg.connect(user='postgres')
return await conn.fetch('select $1::integer[]', [1, 2, 3])
print(loop.run_until_complete(run()))
Example 3: Not all queries execute long. This one is fast:
import asyncpg
import asyncio
#import uvloop
#asyncio.set_event_loop(uvloop.new_event_loop())
loop = asyncio.get_event_loop()
async def run():
conn = await asyncpg.connect(user='sergey')
return await conn.fetch('select $1::integer', 1)
print(loop.run_until_complete(run()))
P.S. I tried vacuum full analyze pg_attribute
and vacuum full analyze pg_class
, it did not help.