Skip to content

Avoiding bitmap scan with ANY #18

Open
@andreak

Description

@andreak

Hi.

As Teodor commented on #15 PG will use bitmap scan when having queries like this:

EXPLAIN ANALYZE
SELECT
    del.entity_id,
    del.received_timestamp,
    del.received_timestamp <=> '3000-01-01' :: TIMESTAMP,
    del.folder_id
FROM origo_email_delivery del
WHERE del.fts_all @@ to_tsquery('simple', 'andreas&joseph')
      AND del.folder_id IN (44961, 204483, 44965, 2470519)
ORDER BY del.received_timestamp <=> '3000-01-01' :: TIMESTAMP
LIMIT 10 offset 10000
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                       QUERY PLAN                                                                       │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Limit  (cost=32066.97..32066.97 rows=1 width=32) (actual time=217.899..217.900 rows=10 loops=1)                                                        │
│   ->  Sort  (cost=32042.41..32066.97 rows=9822 width=32) (actual time=217.300..217.688 rows=10010 loops=1)                                             │
│         Sort Key: ((received_timestamp <=> '3000-01-01 00:00:00'::timestamp without time zone))                                                        │
│         Sort Method: quicksort  Memory: 1541kB                                                                                                         │
│         ->  Bitmap Heap Scan on origo_email_delivery del  (cost=452.67..31391.13 rows=9822 width=32) (actual time=196.797..214.216 rows=14806 loops=1) │
│               Recheck Cond: ((fts_all @@ '''andreas'' & ''joseph'''::tsquery) AND (folder_id = ANY ('{44961,204483,44965,2470519}'::bigint[])))        │
│               Heap Blocks: exact=13043                                                                                                                 │
│               ->  Bitmap Index Scan on rum_idx  (cost=0.00..450.22 rows=9822 width=0) (actual time=195.369..195.369 rows=14806 loops=1)                │
│                     Index Cond: ((fts_all @@ '''andreas'' & ''joseph'''::tsquery) AND (folder_id = ANY ('{44961,204483,44965,2470519}'::bigint[])))    │
│ Planning time: 0.721 ms                                                                                                                                │
│ Execution time: 217.969 ms                                                                                                                             │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(11 rows)

... preventing ordered output, hence requiring the extra Sort-step and preventing the query from using only the index. This results in all tuples must be processed then sorted to match the LIMIT.

Will this be fixed, ie. will index scan be able to process ANY, or will it require fixing something in PG, if so - what needs to be done?

Thanks.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions