Covering Indexes for GiST

Covering indexes are not just another feature that may come in handy. They are purely practical: without them, an Index Only Scan often brings no real win. That said, how much a covering index helps varies from situation to situation.



Strictly speaking, this is not entirely about covering indexes: what has appeared in Postgres are so-called inclusive indexes. But first things first. A covering index is an index that contains all the column values required by a query, so that access to the table itself is no longer needed. Almost. You can read about that "almost" and other nuances in an article by Egor Rogov, part of his series on indexes with 10 (!) installments. An inclusive index, in turn, is built for typical queries: columns that cannot be searched on are added to the index anyway, purely so that the table does not have to be visited again. Such indexes are created with the keyword INCLUDE.
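For orientation, here is a minimal btree sketch of the idea (the users table, its columns, and the index name are hypothetical, invented for illustration, not taken from the article):

 CREATE TABLE users(id int, email text, name text);

 -- email is the search key; name is carried along only so that the query
 -- below can be answered by an Index Only Scan, without visiting the table.
 CREATE INDEX users_email_include ON users USING btree(email) INCLUDE (name);

 SELECT email, name FROM users WHERE email = 'someone@example.com';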



Anastasia Lubennikova (Postgres Professional) extended the btree access method so that additional columns could be included in an index. That patch made it into PostgreSQL 11. But the patches for the GiST / SP-GiST access methods did not have time to mature before that release. By version 12, GiST ripened.



The constructive push toward inclusive indexes for GiST came long ago: Andrey Borodin offered a test patch to the community back in mid-April 2018. He did all the basic, hardest work.



In early August 2019, Alexander Korotkov added cosmetic improvements and committed the patch.



For demonstration and a bit of research, we will generate a set of 3 million rectangles. But first, a few words about the box type, since not all manipulations with it are intuitive.



The box type, that is, the rectangle, has been in Postgres for a long time. It is defined by 2 points (of the geometric type point): the opposite vertices of the rectangle (so the rectangle cannot be oblique, tilted to one side). The documentation says that values of type box are written in one of the following forms:



 ( ( x1 , y1 ) , ( x2 , y2 ) )
 ( x1 , y1 ) , ( x2 , y2 )
 x1 , y1 , x2 , y2





In practice, you write it, say, like this:



 SELECT box('1,2', '3,4');

      box
 -------------
  (3,4),(1,2)
 (1 row)





Postgres shows us the upper-right vertex first, then the lower-left. And if we write this:



 SELECT box('5,2', '3,4');

      box
 -------------
  (5,4),(3,2)
 (1 row)





then we see that Postgres did not return the vertices we gave it: it computed the upper-right and lower-left vertices from our upper-left and lower-right. This is a convenient property when the positions of the vertices are not known in advance, for example with random generation. The notation '1,2', '3,4' is equivalent to point(1,2), point(3,4); this form is sometimes more convenient.
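It is easy to check that the two notations build the same box; ~= is the geometric "same as" operator (plain = compares boxes by area, so it is the wrong test here):

 SELECT box('1,2', '3,4') ~= box(point(1,2), point(3,4));
 -- returns t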





Down to business: searching among 3 million rectangles



 CREATE TABLE boxes(id serial, thebox box, name text);





We will generate 3 million random rectangles. We would like something like a normal distribution, but to avoid pulling in the tablefunc extension we take the poor man's approach: random() - random(). The difference of two uniform random values has a triangular distribution peaked at zero, which also gives a nice picture (see fig.), with more rectangles the closer to the center. Their centers are random as well. Distributions like this are characteristic of some kinds of real city data. Those who want to dig into the laws of statistics, or refresh their memory, can read about the difference of random variables, for example, here.









 INSERT INTO boxes(thebox, name)
 SELECT box(
          point( random()-random(), random()-random() ),
          point( random()-random(), random()-random() )
        ),
        'box no.' || x
 FROM generate_series(1,3000000) AS g(x);







The size of the table, as \dt+ shows, is 242MB.
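Before searching, two quick sanity checks, as a sketch (neither query appears in the original article): the table size without psql meta-commands, and a histogram of the x-coordinates of the box centers, which should peak around zero given the random()-random() generation:

 SELECT pg_size_pretty(pg_table_size('boxes'));

 -- counts should peak in the middle buckets, i.e. around zero
 SELECT width_bucket((center(thebox))[0], -1, 1, 10) AS bucket, count(*)
 FROM boxes
 GROUP BY bucket
 ORDER BY bucket;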



Now we can start the search. First, without an index:





 EXPLAIN ANALYZE SELECT thebox, name FROM boxes
 WHERE thebox @> box('0.5, 0.4','0.3, 0.2');

                                   QUERY PLAN
 --------------------------------------------------------------------------------------
  Gather  (cost=1000.00..47853.00 rows=3000 width=46) (actual time=0.140..246.998 rows=139189 loops=1)
    Workers Planned: 2
    Workers Launched: 2
    ->  Parallel Seq Scan on boxes  (cost=0.00..46553.00 rows=1250 width=46) (actual time=0.011..106.708 rows=46396 loops=3)
          Filter: (thebox @> '(0.5,0.4),(0.3,0.2)'::box)
          Rows Removed by Filter: 953604
  Planning Time: 0.040 ms
  Execution Time: 259.262 ms
 (8 rows)





We see a Parallel Seq Scan: a sequential scan, albeit a parallelized one.



Create a regular, non-inclusive index:



 CREATE INDEX ON boxes USING gist(thebox);





The size of the boxes_thebox_idx index, as \di+ shows, is 262MB. In response to the same query, we get:





 EXPLAIN ANALYZE SELECT thebox, name FROM boxes
 WHERE thebox @> box('0.5, 0.4','0.3, 0.2');

                                   QUERY PLAN
 --------------------------------------------------------------------------------
  Bitmap Heap Scan on boxes  (cost=159.66..9033.30 rows=3000 width=46) (actual time=29.101..80.283 rows=139189 loops=1)
    Recheck Cond: (thebox @> '(0.5,0.4),(0.3,0.2)'::box)
    Heap Blocks: exact=30629
    ->  Bitmap Index Scan on boxes_thebox_idx  (cost=0.00..158.91 rows=3000 width=0) (actual time=25.029..25.029 rows=139189 loops=1)
          Index Cond: (thebox @> '(0.5,0.4),(0.3,0.2)'::box)
  Planning Time: 0.053 ms
  Execution Time: 86.206 ms
 (7 rows)





The search time dropped by a factor of three, and instead of the Parallel Seq Scan we got a Bitmap Index Scan. This plan is not parallelized, but it works faster anyway.



Now let's drop the old index and create an inclusive one:



 CREATE INDEX ON boxes USING gist(thebox) INCLUDE(name);





The boxes_thebox_name_idx index is fatter: 356MB. Off we go:





 EXPLAIN ANALYZE SELECT thebox, name FROM boxes
 WHERE thebox @> box('0.5, 0.4','0.3, 0.2');

                                   QUERY PLAN
 ------------------------------------------------------------------------------
  Bitmap Heap Scan on boxes  (cost=207.66..9081.30 rows=3000 width=46) (actual time=86.822..152.014 rows=139189 loops=1)
    Recheck Cond: (thebox @> '(0.5,0.4),(0.3,0.2)'::box)
    Heap Blocks: exact=30629
    ->  Bitmap Index Scan on boxes_thebox_name_idx  (cost=0.00..206.91 rows=3000 width=0) (actual time=83.044..83.044 rows=139189 loops=1)
          Index Cond: (thebox @> '(0.5,0.4),(0.3,0.2)'::box)
  Planning Time: 3.807 ms
  Execution Time: 157.997 ms
 (7 rows)







The new index is used, but there is no Index Only Scan, and the picture is sad: the time is almost 2 times worse than with the ordinary index. Let's consult part I of the index series already mentioned:



"Regular PostgreSQL indexes do not contain information that would allow judging row visibility. Therefore, the access method returns all row versions that match the search condition, whether or not they are visible to the current transaction. However, if the indexing machinery had to look into the table every time to determine visibility, this scanning method would be no different from an ordinary index scan. The problem is solved by the fact that PostgreSQL maintains a so-called visibility map for tables, in which the vacuum process marks pages whose data has not changed long enough to be visible to all transactions, regardless of start time and isolation level. If the row identifier returned by the index refers to such a page, visibility need not be checked."



So we run VACUUM.
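A sketch of this step (the exact commands are not shown in the article; relallvisible in pg_class is one way to confirm that the visibility map has been filled):

 VACUUM boxes;

 -- how many of the table's pages are marked all-visible:
 SELECT relpages, relallvisible FROM pg_class WHERE relname = 'boxes';

Now we repeat the query: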



 EXPLAIN ANALYZE SELECT thebox, name FROM boxes
 WHERE thebox @> box('0.5, 0.4','0.3, 0.2');

                                   QUERY PLAN
 --------------------------------------------------------------------------------
  Index Only Scan using boxes_thebox_name_idx on boxes  (cost=0.41..236.91 rows=3000 width=46) (actual time=0.104..38.651 rows=139189 loops=1)
    Index Cond: (thebox @> '(0.5,0.4),(0.3,0.2)'::box)
    Heap Fetches: 0
  Planning Time: 0.052 ms
  Execution Time: 44.337 ms
 (5 rows)





Quite a different matter! Twice the gain compared to the non-inclusive index.





Selectivity and gain



How much inclusive indexes pay off depends strongly on the selectivity of the conditions in queries. To investigate this dependence a little, we will solve the inverse problem: generate a small table with an indexed column of type point and see how many points fall into a given box. This time the points are distributed uniformly over a square.



 CREATE TABLE test_covergist(id serial, tochka point, name text);





 INSERT INTO test_covergist(tochka, name)
 SELECT point(trunc(1000000*random()), trunc(1000000*random())),
        'point no.' || x
 FROM generate_series(1,3000000) AS g(x);





The size of the table is 211 MB.



 CREATE INDEX on test_covergist USING gist(tochka);





Size 213 MB.



Start with a box that obviously captures all the available points:



 EXPLAIN (ANALYZE, buffers) SELECT tochka, name FROM test_covergist
 WHERE box('0,0','3000000,3000000') @> tochka;

                                   QUERY PLAN
 -------------------------------------------------------------------------------------
  Bitmap Heap Scan on test_covergist  (cost=135.66..8778.83 rows=3000 width=32) (actual time=1087.964..1864.059 rows=3000000 loops=1)
    Recheck Cond: ('(3000000,3000000),(0,0)'::box @> tochka)
    Heap Blocks: exact=27025
    Buffers: shared read=54287
    ->  Bitmap Index Scan on test_covergist_tochka_idx  (cost=0.00..134.91 rows=3000 width=0) (actual time=1084.949..1084.949 rows=3000000 loops=1)
          Index Cond: (tochka <@ '(3000000,3000000),(0,0)'::box)
          Buffers: shared read=27262
  Planning Time: 0.102 ms
  Execution Time: 2029.501 ms
 (9 rows)





We asked EXPLAIN to show buffers; that will come in handy. For now the query takes more than 2 seconds, and we can see Buffers: shared read=54287. In other circumstances we might see a mix of shared read and shared hit, that is, some buffers read from disk (or from the OS cache) and some from the buffer cache. We know the approximate sizes of the table and indexes, so let's play it safe and set shared buffers large enough for everything to fit, restarting Postgres with the option



 -o "-c shared_buffers=1GB"
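If you prefer not to pass command-line options, the same change can be made with ALTER SYSTEM, as a sketch (shared_buffers takes effect only after a server restart):

 ALTER SYSTEM SET shared_buffers = '1GB';
 -- then restart the server, for example: pg_ctl restart -D <data directory>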
      
      





Now:



 EXPLAIN (ANALYZE, buffers) SELECT tochka, name FROM test_covergist
 WHERE box('0,0','3000000,3000000') @> tochka;

                                   QUERY PLAN
 -------------------------------------------------------------------------------
  Bitmap Heap Scan on test_covergist  (cost=135.66..8778.83 rows=3000 width=32) (actual time=231.032..613.326 rows=3000000 loops=1)
    Recheck Cond: ('(3000000,3000000),(0,0)'::box @> tochka)
    Heap Blocks: exact=27025
    Buffers: shared hit=54248
    ->  Bitmap Index Scan on test_covergist_tochka_idx  (cost=0.00..134.91 rows=3000 width=0) (actual time=228.068..228.068 rows=3000000 loops=1)
          Index Cond: (tochka <@ '(3000000,3000000),(0,0)'::box)
          Buffers: shared hit=27223
  Planning Time: 0.070 ms
  Execution Time: 755.915 ms
 (9 rows)





That is, shared read turned into shared hit, and the time dropped threefold.



Another important detail in this EXPLAIN: 3 million points are returned, while the estimated row count is 3 thousand. Spoiler: this number will not change at any selectivity. The optimizer has no statistics-based way to estimate cardinality for the box and point types; the geometric containment operators fall back on a hard-wired default selectivity of 0.001, and 0.1% of 3 million rows is exactly the 3000 in the estimate. The plan will not change either: for any size of rectangle there will be a Bitmap Index Scan on test_covergist_tochka_idx.
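This is easy to check, as a sketch (the box here is arbitrary):

 -- the estimate again says rows=3000, however tiny the box
 EXPLAIN SELECT tochka, name FROM test_covergist
 WHERE box('0,0','1,1') @> tochka;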



Here are two more measurements, with result counts differing by orders of magnitude:



 EXPLAIN (ANALYZE, buffers) SELECT tochka, name FROM test_covergist
 WHERE box('0,0','300000,300000') @> tochka;

                                   QUERY PLAN
 ---------------------------------------------------------------------------------------
  Bitmap Heap Scan on test_covergist  (cost=135.66..8778.83 rows=3000 width=32) (actual time=27.889..134.054 rows=269882 loops=1)
    Recheck Cond: ('(300000,300000),(0,0)'::box @> tochka)
    Heap Blocks: exact=27024
    Buffers: shared hit=29534
    ->  Bitmap Index Scan on test_covergist_tochka_idx  (cost=0.00..134.91 rows=3000 width=0) (actual time=24.847..24.847 rows=269882 loops=1)
          Index Cond: (tochka <@ '(300000,300000),(0,0)'::box)
          Buffers: shared hit=2510
  Planning Time: 0.074 ms
  Execution Time: 151.269 ms
 (9 rows)





This returns more than 10 times fewer records (actual ... rows=269882), and the time decreased by about 5 times.



 EXPLAIN (ANALYZE, buffers) SELECT tochka, name FROM test_covergist
 WHERE box('0,0','30000,30000') @> tochka;

                                   QUERY PLAN
 ----------------------------------------------------------------------------------------------
  Bitmap Heap Scan on test_covergist  (cost=135.66..8778.83 rows=3000 width=32) (actual time=1.882..16.095 rows=2780 loops=1)
    Recheck Cond: ('(30000,30000),(0,0)'::box @> tochka)
    Heap Blocks: exact=2624
    Buffers: shared hit=2655
    ->  Bitmap Index Scan on test_covergist_tochka_idx  (cost=0.00..134.91 rows=3000 width=0) (actual time=1.035..1.035 rows=2780 loops=1)
          Index Cond: (tochka <@ '(30000,30000),(0,0)'::box)
          Buffers: shared hit=31
  Planning Time: 0.154 ms
  Execution Time: 16.702 ms
 (9 rows)





The contents of a 30K × 30K square (2780 rows) are fetched in just 16 ms. And when only dozens of records qualify, they come back in fractions of a millisecond, and such measurements are not very reliable.



Finally, measure the same with the inclusive index:



 CREATE INDEX on test_covergist USING gist(tochka) INCLUDE(name);





Size 316 MB.



 EXPLAIN (ANALYZE, buffers) SELECT tochka, name FROM test_covergist
 WHERE box('0,0','3000000,3000000') @> tochka;

                                   QUERY PLAN
 ---------------------------------------------------------------------------------------
  Index Only Scan using test_covergist_tochka_name_idx on test_covergist  (cost=0.41..216.91 rows=3000 width=32) (actual time=0.160..568.707 rows=3000000 loops=1)
    Index Cond: (tochka <@ '(3000000,3000000),(0,0)'::box)
    Heap Fetches: 0
    Buffers: shared hit=40492
  Planning Time: 0.090 ms
  Execution Time: 709.837 ms
 (6 rows)





The time is almost the same as with the ordinary index, even though this is an Index Only Scan.



But:



 EXPLAIN (ANALYZE, buffers) SELECT tochka, name FROM test_covergist
 WHERE box('0,0','300000,300000') @> tochka;

                                   QUERY PLAN
 --------------------------------------------------------------------------------------------
  Index Only Scan using test_covergist_tochka_name_idx on test_covergist  (cost=0.41..216.91 rows=3000 width=32) (actual time=0.083..53.277 rows=269882 loops=1)
    Index Cond: (tochka <@ '(300000,300000),(0,0)'::box)
    Heap Fetches: 0
    Buffers: shared hit=3735
  Planning Time: 0.077 ms
  Execution Time: 66.162 ms
 (6 rows)





Before, it was 151 ms. And correspondingly:



 EXPLAIN (ANALYZE, buffers) SELECT tochka, name FROM test_covergist
 WHERE box('0,0','30000,30000') @> tochka;

                                   QUERY PLAN
 --------------------------------------------------------------------------------------------
  Index Only Scan using test_covergist_tochka_name_idx on test_covergist  (cost=0.41..216.91 rows=3000 width=32) (actual time=0.043..0.639 rows=2780 loops=1)
    Index Cond: (tochka <@ '(30000,30000),(0,0)'::box)
    Heap Fetches: 0
    Buffers: shared hit=52
  Planning Time: 0.053 ms
  Execution Time: 0.791 ms
 (6 rows)





That is already a fraction of a millisecond for the same 2780 point records.



Buffers like guns



The explanation can be sought, and found, in the gun that has been hanging on the wall without firing yet: the number of blocks read. With the inclusive index, only blocks of the index itself are read (Heap Fetches: 0); in the three cases those numbers were 40492, 3735 and 52. With the ordinary index, we add up the buffers counted at the Bitmap Heap Scan node (54248 in the 3-million-row case) and at the Bitmap Index Scan node (27223), since the name field cannot be extracted from the index and the heap has to be visited: 54248 + 27223 = 81471 against the inclusive index's 40492. For the other two cases: 29534 + 2510 = 31044 against 3735, and 2655 + 31 = 2686 against 52. The ordinary index reads more blocks in every case, but as selectivity improves, the difference grows from about 2 times to orders of magnitude, because the number of heap blocks needed shrinks more slowly than the number of index blocks read.



 Records returned, thousands              3000           270           2.7
 Blocks read, ordinary / inclusive        81471 / 40492  31044 / 3735  2686 / 52
 Time, ms, ordinary / inclusive           755 / 710      151 / 66      16 / 0.7




But maybe the point is not selectivity at all, but simply the size of the table? Just in case, we repeat the same steps on a table with 300 thousand rows instead of 3 million:



 CREATE TABLE test_covergist_small(id serial, tochka point, name text);

 INSERT INTO test_covergist_small(tochka, name)
 SELECT point(trunc(1000000*random()), trunc(1000000*random())),
        'point no.' || x
 FROM generate_series(1,300000) AS g(x);

 CREATE INDEX ON test_covergist_small USING gist(tochka);

 EXPLAIN (ANALYZE, buffers) SELECT tochka, name FROM test_covergist_small
 WHERE box('0,0','3000000,3000000') @> tochka;

                                   QUERY PLAN
 ----------------------------------------------------------------------------
  Bitmap Heap Scan on test_covergist_small  (cost=14.61..867.19 rows=300 width=31) (actual time=36.115..130.435 rows=300000 loops=1)
    Recheck Cond: ('(3000000,3000000),(0,0)'::box @> tochka)
    Heap Blocks: exact=2500
    Buffers: shared hit=5225
    ->  Bitmap Index Scan on test_covergist_small_tochka_idx  (cost=0.00..14.53 rows=300 width=0) (actual time=35.894..35.895 rows=300000 loops=1)
          Index Cond: (tochka <@ '(3000000,3000000),(0,0)'::box)
          Buffers: shared hit=2725
  Planning Time: 0.060 ms
  Execution Time: 158.580 ms
 (9 rows)





Next, we repeat the same steps for the inclusive index.
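The commands are elided here; a sketch of them, by analogy with the large table:

 CREATE INDEX ON test_covergist_small USING gist(tochka) INCLUDE(name);
 -- ...followed by the same three EXPLAIN (ANALYZE, buffers) queries as above.

Here are the results: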



 Records returned, thousands              300          27          0.25
 Blocks read, ordinary / inclusive        5225 / 3726  3026 / 352  270 / 8
 Time, ms, ordinary / inclusive           158 / 178    20 / 13     0.4 / 0.2




In the case of 100% coverage of the points, the query was even a little slower than with the ordinary index. Beyond that, just as with 3 million rows, everything fell into place. So selectivity is what matters.



Our company has tested inclusive GiST indexes on real data: a set with several million rectangles on a map of Moscow. The conclusion is the same: in many situations such indexes noticeably speed up queries. But we cannot illustrate this with pictures and test numbers, since that data is not publicly available.



Instead of a conclusion



Let's go back for a moment to the random rectangles and try to do the same with spgist. To recall, or to learn, what that is and how SP-GiST differs from GiST, see the article Indexes in PostgreSQL - 6. We create an inclusive index:



 CREATE INDEX ON boxes USING spgist(thebox) INCLUDE(name);
 ERROR:  access method "spgist" does not support included columns





Alas, inclusive indexes are not yet implemented for SP-GiST.

So there is room for improvement!









