When I attempted to implement faceted search against a Postgres full-text index, I found I had issues getting Postgres (9.2) to pick up the index:
CREATE INDEX search_idx2 ON data2 USING gin(to_tsvector('english', search));
SELECT *
FROM (
SELECT author, count(*) c
FROM data2
WHERE search @@ to_tsquery('linux')
GROUP BY author
) counts
ORDER BY c desc;
This query will do a full scan on the table (~9-12 minutes on a 1.1 million row table).
Sort (cost=233587.16..233587.17 rows=6 width=22)" Sort Key: (count(*))" HashAggregate (cost=233586.96..233587.02 rows=6 width=14)" Seq Scan on data2 (cost=0.00..233578.12 rows=1768 width=14)" Filter: 1"
The fix is simply to apply the function used on the search column to the query, which cuts the time to ~23 seconds on a 1.1 million row table:
SELECT *
FROM (
SELECT author, count(*) c
FROM data2
WHERE to_tsvector('english', search) @@ to_tsquery('linux')
GROUP BY author
) counts
ORDER BY c desc;
And the resulting plan:
HashAggregate (cost=74668.49..74669.28 rows=79 width=14) (actual time=23726.071..23726.598 rows=1625 loops=1) Bitmap Heap Scan on data2 (cost=306.87..74535.03 rows=26692 width=14) (actual time=15.051..23683.230 rows=27604 loops=1) Recheck Cond: (to_tsvector('english'::regconfig, (search)::text) @@ to_tsquery('linux'::text)) Rows Removed by Index Recheck: 42867 Bitmap Index Scan on search_idx2 (cost=0.00..300.19 rows=26692 width=0) (actual time=11.436..11.436 rows=27604 loops=1) Index Cond: (to_tsvector('english'::regconfig, (search)::text) @@ to_tsquery('linux'::text)) Total runtime: 23727.012 ms
- search)::text @@ to_tsquery('linux'::text [↩]