Optimize Postgres queries to add value to applications
There is no silver bullet when it comes to query optimization. However, with the knowledge gleaned from monitoring tools, it is possible for DBAs to understand and troubleshoot slow-running queries.
During normal PostgreSQL operations, resource-intensive, unoptimized queries are inevitably encountered, causing slow response times that can lead to user frustration and customer churn. In financial services, for example, slow response times can lead to failed transactions; in e-commerce, they can lead to abandoned shopping carts. Whether unoptimized queries turn milliseconds into minutes or minutes into hours or days, it is critical for every database engineer, DBA, and software developer to identify and fix problematic queries as quickly as possible. In this article, we’ll look at the top tools and extensions that those working with PostgreSQL can use to streamline their queries.
Choose a workload monitoring tool
Workload monitoring tools help administrators see what a workload consists of – the types of queries being sent to the database, which ones are consuming the most resources, and which are taking too long to execute. The Postgres monitoring tools covered here work with Zabbix Server, a free, open-source monitoring solution from Zabbix Inc. that can be downloaded from the official Zabbix website. There are two Postgres-specific monitoring tools to choose from; both work well, so you may want to try each and see which one you prefer.
Mamonsu is an open-source active Zabbix agent that can collect RDBMS metrics, connect to a Zabbix server, and send the metrics there. It works with PostgreSQL 9.5 and higher and runs on different operating systems. Mamonsu provides various metrics related to PostgreSQL activity, including connection and locking statistics, autovacuum worker activity, the oldest transaction identifier, and many others. You can download Mamonsu for free from the PostgresPro repository on GitHub.
Zabbix Agent 2, another tool for collecting various metrics, is available for PostgreSQL version 10+ and Zabbix Server version 4.4+. Zabbix Agent 2 features include the ability to collect 95+ metrics from multiple PostgreSQL instances with a single agent, support for custom plugins written in Golang, and options to monitor and review metrics in real-time via a command line. Zabbix Agent 2 can be downloaded from the Zabbix repository.
It is important to note that these tools do not themselves identify problematic queries. Instead, they provide information that can help you identify problems and figure out how to fix them. Armed with this information, you become a detective collecting clues to solve a mystery; the more experience you have, the easier the detection becomes.
Below are some types of information that you can collect using monitoring tools.
Connection metrics – Connection metrics indicate the number of client connections to the database against the total number of allowed connections, which is set by the max_connections parameter and constrained in practice by the memory available on the primary node. Once the limit is reached, new client connections are refused until existing ones are closed. A sudden increase in the number of connections can be a symptom of slow queries: sessions that hold locks for a long time block other sessions, and applications keep opening new connections while the existing ones wait. One common reason for a high lock count is that some transactions have not committed their changes and therefore have not released the locks they acquired.
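As a quick illustration of the kind of check a monitoring tool automates, the following query (a minimal sketch, assuming a role that is allowed to read pg_stat_activity) compares the current connection count with the configured limit and shows how many sessions are waiting on locks:

```sql
-- Current connections vs. the configured limit, plus sessions waiting on locks.
SELECT count(*)                                          AS current_connections,
       current_setting('max_connections')::int           AS max_connections,
       count(*) FILTER (WHERE wait_event_type = 'Lock')  AS waiting_on_locks
FROM pg_stat_activity;
```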
Lock sampling – Lock sampling shows the types of locks held during a defined period of time. One type is the lock taken by a write query; a growing share of such locks can indicate that a query performance problem is being caused by write queries contending for the same data.
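For an ad-hoc look at lock activity, something along these lines can be used – monitoring tools essentially sample a view like this on a schedule, so this is a sketch rather than a substitute for proper lock sampling:

```sql
-- Sessions that hold or wait for locks, with the lock mode and a query snippet.
SELECT a.pid,
       a.state,
       l.locktype,
       l.mode,
       l.granted,
       left(a.query, 60) AS query_snippet
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE NOT l.granted
   OR a.wait_event_type = 'Lock'
ORDER BY a.pid;
```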
Query plans – PostgreSQL has a sophisticated built-in planner that automatically creates a query plan for each incoming query. The planner tries to choose the optimal approach for the query structure and the properties of the data to ensure the best performance. The EXPLAIN command displays the plan for any query, providing insight into the aspects of query execution that might affect performance. A DBA can then try to improve the plan by collecting better planner statistics and by using PostgreSQL configuration settings and modules such as pg_hint_plan.
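For example, running EXPLAIN with the ANALYZE and BUFFERS options executes the statement and reports actual row counts, timings, and buffer usage alongside the plan (the table and column names below are illustrative):

```sql
-- ANALYZE executes the query, so use it with care on production systems.
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, o.total
FROM orders o
WHERE o.customer_id = 42
ORDER BY o.created_at DESC
LIMIT 20;
```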
Extensions for tracking resource-intensive queries
While workload monitoring tools can help identify the time intervals when a database is performing poorly, they cannot display resource-intensive query text. The following PostgreSQL extensions can be used for this purpose.
pg_stat_statements tracks execution statistics for all SQL statements, showing which queries have the longest execution time. It is included in the standard PostgreSQL distribution. For more information, see: https://www.postgresql.org/docs/13/pgstatstatements.html
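A typical way to use it is to rank queries by total execution time; the sketch below uses the column names as of PostgreSQL 13, where total_time was split into planning and execution components:

```sql
-- Top 10 queries by total execution time.
SELECT queryid,
       calls,
       round(total_exec_time::numeric, 1) AS total_ms,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       left(query, 80)                    AS query_snippet
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```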
pg_stat_kcache is used to identify queries that consume the most system and user CPU time. This extension is not part of the PostgreSQL distribution and must therefore be downloaded separately. For more information, see: https://github.com/powa-team/pg_stat_kcache
auto_explain is used to automatically log the execution plans (and, optionally, the parameters) of slow queries. It is part of the PostgreSQL distribution. For more information, see: https://www.postgresql.org/docs/13/auto-explain.html
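A minimal way to try it in a single session is shown below; the same parameters can be set globally in postgresql.conf once the module is preloaded:

```sql
-- Log plans of statements slower than 500 ms, with actual timings and buffer usage.
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '500ms';
SET auto_explain.log_analyze = on;
SET auto_explain.log_buffers = on;
```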
pg_store_plans is used to collect statistics about the execution plans of all SQL statements executed by a server. For more information, see: http://ossc-db.github.io/pg_store_plans/
pg_wait_sampling is used to collect information about the wait events of specific processes. Two kinds of statistics are collected: the wait events history, which shows samples of wait events for each process, and the waits profile, which shows the number of samples per process and wait event. This extension is not part of the PostgreSQL distribution and must therefore be downloaded separately. For more information, see: https://github.com/postgrespro/pg_wait_sampling
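Once installed, the waits profile can be aggregated with a simple query against the view the extension provides (a sketch based on the pg_wait_sampling documentation):

```sql
-- Most frequent wait events across all sampled processes.
SELECT event_type,
       event,
       sum(count) AS samples
FROM pg_wait_sampling_profile
GROUP BY event_type, event
ORDER BY samples DESC
LIMIT 10;
```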
plprofiler is used to create performance profiles of PL/pgSQL functions and stored procedures in the form of a FlameGraph, which helps identify the slowest function or procedure. This extension is not part of the PostgreSQL distribution and must therefore be downloaded separately. For more information, see: https://github.com/bigsql/plprofiler
pgpro_stats is a combination of the modules pg_stat_statements, pg_stat_kcache and pg_wait_sampling. It is included in the Postgres Professional distribution and can display query execution plans with related wait profiles. For more information, see https://postgrespro.ru/docs/enterprise/12/pgpro-stats?lang=en
Using pg_profile to detect resource-intensive queries
pg_profile, which can be downloaded from its repository, is a particularly useful extension for creating a historical workload repository containing various metrics. These metrics include:
- SQL query statistics
- DML statistics
- Metrics related to index usage
- Top growing tables
- Top tables by delete/update operations
- Metrics related to vacuum and autovacuum processes
- User function statistics
pg_profile regularly collects data from the Postgres statistics collector and extensions, making it possible to detect the most resource-consuming activities for defined periods in the past. pg_profile is easy to install and can be used in restricted environments without access to the server file system. It allows DBAs to set a specific time interval, e.g. two hours, and create a report containing many performance statistics for that interval. A pg_profile difference report compares performance statistics for two time periods side by side. Since one or two samples per hour is usually sufficient to detect the most resource-intensive activities, pg_profile does not incur much overhead.
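In practice this usually means scheduling regular samples (via cron, pg_cron, or similar) and then building a report between two sample IDs. The sketch below follows the function names in the pg_profile documentation; the profile schema and the sample IDs are illustrative assumptions about a particular installation:

```sql
-- Take a sample (normally run on a schedule, e.g. twice per hour).
SELECT profile.take_sample();

-- HTML report covering samples 120 through 124.
SELECT profile.get_report(120, 124);

-- Difference report comparing two intervals side by side.
SELECT profile.get_diffreport(120, 124, 130, 134);
```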
pgpro_pwr is an improved version of pg_profile that can interact with pgpro_stats, collect its metrics and store them in a separate database for further processing. It is included with the Postgres Professional distribution and displays wait statistics and query execution plans in separate sections of a pgpro_pwr report. The more detailed statistics in pgpro_pwr allow for easier problem identification.
Additional query optimization tips
With the information gleaned from Postgres monitoring, DBAs can identify and fix problems that are causing slow-running queries. Other optimization strategies are:
Index-only scans – Sometimes it makes sense to rewrite a query and create additional indexes. PostgreSQL can then use the index-only scan access method to retrieve data directly from the index rather than from the table, reducing the number of random reads and improving performance.
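As a hedged sketch with illustrative table and column names, a covering index can make an index-only scan possible for a query that previously had to visit the table (provided the visibility map is reasonably up to date):

```sql
-- The INCLUDE clause (PostgreSQL 11+) stores extra columns in the index leaf pages.
CREATE INDEX orders_customer_created_idx
    ON orders (customer_id, created_at) INCLUDE (total);

-- The plan should now show an Index Only Scan instead of an Index Scan or Seq Scan.
EXPLAIN
SELECT customer_id, created_at, total
FROM orders
WHERE customer_id = 42;
```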
Checking execution plans can also reveal cases where a LIMIT clause can replace a DISTINCT clause, reducing the amount of computation required and greatly improving performance.
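For instance, when the application only needs to know whether at least one matching row exists, a LIMIT lets execution stop early, whereas DISTINCT forces the executor to deduplicate the whole result set (table and column names below are illustrative):

```sql
-- DISTINCT must read and deduplicate every matching row.
EXPLAIN SELECT DISTINCT customer_id FROM orders WHERE status = 'pending';

-- LIMIT lets the executor stop as soon as one row has been produced.
EXPLAIN SELECT customer_id FROM orders WHERE status = 'pending' LIMIT 1;
```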
It is also possible to create extended statistics on multiple columns of a table to reduce the difference between the actual and estimated number of rows. Since PostgreSQL 12, these statistics even work for range conditions such as BETWEEN, “less than or equal to”, and “greater than or equal to”.
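A short sketch of extended statistics, with illustrative table and column names; the planner starts using them after the next ANALYZE of the table:

```sql
-- Multi-column statistics help when column values are correlated (e.g. state and city).
CREATE STATISTICS addr_state_city_stats (ndistinct, dependencies, mcv)
    ON state, city FROM customer_addresses;

ANALYZE customer_addresses;
```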
Conclusion
Postgres offers several mechanisms for achieving the same goal, and choosing the right one is crucial. DBAs should always be looking for a better or faster way to execute a query, so it is important that someone with deep Postgres experience guides and coaches the DBA team. Alternatively, organizations can hire an expert consultant to handle workloads that are processing too slowly.
There is no silver bullet when it comes to query optimization. However, with the knowledge gleaned from monitoring tools—along with patience and an eye for detail—DBAs can understand and troubleshoot slow-running queries. Remember that all of the above tools are available for download along with other resources and explanations on how to use them. You can also check out the PostgreSQL documentation and this blog post on PostgreSQL query optimization for more tips.