Upgrade for the lazy: how PostgreSQL 12 improves performance







PostgreSQL 12 , the latest version of "the world's best open-source relational database," comes out in a couple of weeks (if everything goes according to plan). This corresponds to the usual schedule - a new version with a lot of new features is released once a year, and, frankly, it is impressive. Therefore, I became an active member of the PostgreSQL community.







In my opinion, unlike previous releases, PostgreSQL 12 does not contain one or two revolutionary functions (such as partitioning or concurrency of queries). I once joked that the main feature of PostgreSQL 12 is more stability. Isn't that what you need when you manage critical business data?







But PostgreSQL 12 is not limited to this: with new features and enhancements, applications will work better, and you just need to upgrade!







(Well, maybe even rebuild the indices, but in this release it's not as scary as we used to.)







It will be great to upgrade PostgreSQL and immediately enjoy significant improvements without unnecessary gestures. A few years ago, I analyzed the upgrade from PostgreSQL 9.4 to PostgreSQL 10 and saw how the application accelerated due to improved query parallelism in PostgreSQL 10. And, most importantly, almost nothing was required of me (just set the max_parallel_workers



configuration max_parallel_workers



).







Agree, itā€™s convenient when, right after the upgrade, applications work better. And we are trying very hard to please users, because PostgreSQL has more of them.







And how does a simple upgrade to PostgreSQL 12 make you happy? I'll tell you now.







Major indexing improvements



Without indexing, the database will not go far. And how else to quickly find information? The fundamental PostgreSQL indexing system is called the B-tree . This type of index is optimized for storage systems.







We just use the CREATE INDEX ON some_table (some_column)



, and PostgreSQL does a great job of keeping the index up to date while we constantly insert, update, and delete values. Everything works by itself, as if by magic.







But PostgreSQL indexes have one problem - they bloat and take up extra disk space, and the performance of retrieving and updating data is reduced. By "bloat," I mean ineffectively maintaining the index structure. This may or may not be due to garbage tuples that VACUUM removes (thanks for the info to Peter Geoghegan ). Index bloat is particularly noticeable in workloads where the index is actively changing.







PostgreSQL 12 seriously improves the performance of B-tree indexes, and experiments with tests such as TPC-C showed that space is now used, on average, by 40% less. Now we spend less time not only on maintaining the B-tree indexes (that is, writing operations), but also on extracting data, because the indexes have become much smaller.







Applications that are actively updating their tables ā€” typically OLTP applications ( real-time transaction processing ) ā€”are much more efficient in using the disk and processing requests. The more disk space, the more space the database has for growth without upgrading the infrastructure.







Some upgrade strategies require rebuilding B-tree indexes to take advantage of these (for example, pg_upgrade will not rebuild indexes automatically). In previous versions of PostgreSQL, rebuilding large indexes in tables led to significant downtime, because at that time it was impossible to make changes. But PostgreSQL 12 has another cool trick: you can now rebuild indexes in parallel with the REINDEX CONCURRENTLY command to completely avoid downtime.







PostgreSQL 12 has other improvements to the indexing infrastructure. Another thing that could not do without magic is the write-ahead log, which is also WAL (write-ahead log). A write-ahead log records every transaction in PostgreSQL in case of failure and replication. Applications use it for backup and restore at a point in time . Of course, the write-ahead log is written to disk, and this can affect performance.







PostgreSQL 12 has reduced the overhead of WALs created by the GiST, GIN, and SP-GiST indexes when building the index. This provides several tangible advantages: WAL records take up less disk space, and data plays faster, for example, during recovery from a failure or recovery at a point in time. If you use such indexes in your applications (for example, PostGIS-based geospatial applications use the GiST index a lot), this is another feature that will significantly improve performance without any effort on your part.







Partitioning - Bigger, Better, Faster



PostgreSQL 10 introduced declarative partitioning . PostgreSQL 11 made it much easier to use. In PostgreSQL 12, you can change the scale of sections.







In PostgreSQL 12, partitioning system performance is much better, especially if there are thousands of partitions in the table. For example, if a query affects only a few sections in a table, where there are thousands of them, it will run much faster. Performance has been improved not only for these types of queries. You will also notice how INSERT operations in tables with many partitions have accelerated.







Writing data using COPY - by the way, this is a great way to bulk load data and here is an example of receiving JSON - in partitioned tables in PostgreSQL 12 has also become more efficient. With COPY, everything was so fast, but in PostgreSQL 12 it flies.







Thanks to these benefits, PostgreSQL can store even larger datasets and make retrieval easier. And no effort on your part. If the application has many sections, for example, it records time series data, a simple upgrade will significantly improve its performance.







Although this improvement is not entirely from the ā€œupgrade and rejoiceā€ category, in PostgreSQL 12 you can create foreign keys that reference partitioned tables so that working with partitioning is a pleasure.







WITH queries are much better



When the patch was applied to the built-in generalized table expressions (they are CTEs, they are also WITH queries), I was eager to write an article about how application developers with PostgreSQL were delighted . This is one of those features that will speed up the application. Unless, of course, you are using CTE.







I often notice that newcomers to SQL like to use CTE: if you write them in a certain way, you directly feel that you are writing an imperative program. Personally, I loved to rewrite these queries in order to dispense with CTE and increase performance. Now everything is different.







PostgreSQL 12 allows you to embed a specific type of CTE without side effects ( SELECT



), which is used only once near the end of the query. If I kept statistics of requests with CTE, which I rewrote, most of them would fall into this category. This helps developers write understandable code that now also works quickly.







Moreover, PostgreSQL 12 optimizes the execution of SQL itself, you do not have to do anything. And although now, probably, I will not need to optimize such queries, it is great that PostgreSQL continues to work on query optimization.







Just-in-Time (JIT) - Now Default



On PostgreSQL 12 systems with LLVM support, JIT compilation is enabled by default. Firstly, you get JIT support for some internal operations, and secondly, queries with expressions (the simplest example is x + y) in the select lists (which you have after SELECT), aggregates, expressions with WHERE clauses and others can use JIT to improve performance.







Since JIT is included in PostgreSQL 12 by default, performance will improve on its own, but I recommend testing the application in PostgreSQL 11, where JIT just appeared to measure query performance and see if you need to tune anything.







But what about the other new features of PostgreSQL 12?



There are a lot of cool new features in PostgreSQL, from the ability to examine JSON data using standard SQL / JSON route expressions to multi-factor authentication with the clientcert=verify-full



parameter, created columns and much more. Enough for a separate post.







Like PostgreSQL 10, PostgreSQL 12 will improve overall performance immediately after the upgrade. Of course, you can have your own way - test the application under similar conditions on the working system before turning on the improvements, as I did with PostgreSQL 10. Even if PostgreSQL 12 is more stable now than I expected, donā€™t be lazy to test applications quality, before releasing them in production.








All Articles