Trying to create an ASH analog for PostgreSQL

Formulation of the problem



To optimize PostgreSQL queries, the ability to analyze the activity history, in particular - expectations, locks, table statistics, is very required.



Available options



Historical load analysis tool or โ€œAWR for Postgresโ€ : a very interesting solution, but there is no history of pg_stat_activity and pg_locks.



Pgsentinel extension :

" All the accumulated information is stored only in RAM, and the consumed amount of memory is governed by the number of last stored records.



The queryid field is added - the same queryid from the pg_stat_statements extension (a preliminary installation is required). "



This would certainly help a lot, but the very nuisance is the first paragraph, โ€œ All accumulated information is stored only in RAM โ€, i.e. we have an impact on the target base. In addition, there is no lock history and table statistics. Those. generally speaking, the solution is incomplete: โ€œ There is no ready-made package for installation yet. It is proposed to download the sources and build the library yourself. First you need to install the โ€œdevelโ€ package for your server and write the path to pg_config in the PATH variable. "



In general - a lot of fuss, and in the case of serious production databases, maybe there will be no way to do something with the server. Again, you need to come up with something of your own.



Warning.



Due to the rather large volume and due to the incompletion of the testing period, the article is mainly for informational purposes, rather as a set of abstracts and intermediate results.

More detailed material will be prepared later, in parts.




Outline solution requirements



It is necessary to develop a tool to store:



Pg_stat_activity view history

Session lock history using pg_locks view



The decision requirement is to minimize the impact on the target database.



The general idea is that the data collection agent is launched not in the target database, but in the monitoring database as a systemd service. Yes, some data loss is possible, but this is not critical for reporting, but there is no impact on the target database from memory and disk space. And in the case of using a connection pool, the impact on user processes is minimal.



Implementation stages



1. Service tables



A separate scheme is used to store the tables, so as not to complicate the analysis of the main tables used.



DROP SCHEMA IF EXISTS activity_hist ; CREATE SCHEMA activity_hist AUTHORIZATION monitor ;
      
      





Important: the scheme is not created in the target database, but in the monitoring database.

Pg_stat_activity view history



Use the table to store the current snapshots of the pg_stat_activity view



activity_hist.history_pg_stat_activity:
 --ACTIVITY_HIST.HISTORY_PG_STAT_ACTIVITY DROP TABLE IF EXISTS activity_hist.history_pg_stat_activity; CREATE TABLE activity_hist.history_pg_stat_activity ( timepoint timestamp without time zone , datid oid , datname name , pid integer, usesysid oid , usename name , application_name text , client_addr inet , client_hostname text , client_port integer, backend_start timestamp with time zone , xact_start timestamp with time zone , query_start timestamp with time zone , state_change timestamp with time zone , wait_event_type text , wait_event text , state text , backend_xid xid , backend_xmin xid , query text , backend_type text , queryid bigint );
      
      





To speed up the insertion - no indexes or restrictions.



To store the history directly, a partitioned table is used:



activity_hist.archive_pg_stat_activity:
 DROP TABLE IF EXISTS activity_hist.archive_pg_stat_activity; CREATE TABLE activity_hist.archive_pg_stat_activity ( timepoint timestamp without time zone , datid oid , datname name , pid integer, usesysid oid , usename name , application_name text , client_addr inet , client_hostname text , client_port integer, backend_start timestamp with time zone , xact_start timestamp with time zone , query_start timestamp with time zone , state_change timestamp with time zone , wait_event_type text , wait_event text , state text , backend_xid xid , backend_xmin xid , query text , backend_type text , queryid bigint ) PARTITION BY RANGE (timepoint);
      
      





Since there are no requirements for insertion speed in this case, some indexes have been created to speed up reporting.



Session Lock History



To store current snapshots of session locks, use the table:



activity_hist.history_locking:
 --ACTIVITY_HIST.HISTORY_LOCKING DROP TABLE IF EXISTS activity_hist.history_locking; CREATE TABLE activity_hist.history_locking ( timepoint timestamp without time zone , locktype text , relation oid , mode text , tid xid , vtid text , pid integer , blocking_pids integer[] , granted boolean );
      
      





Also, to speed up the insertion - no indexes or restrictions.



To store the history directly, a partitioned table is used:



activity_hist.archive_locking:
 DROP TABLE IF EXISTS activity_hist.archive_locking; CREATE TABLE activity_hist.archive_locking ( timepoint timestamp without time zone , locktype text , relation oid , mode text , tid xid , vtid text , pid integer , blocking_pids integer[] , granted boolean ) PARTITION BY RANGE (timepoint);
      
      





Since there are no requirements for insertion speed in this case, some indexes have been created to speed up reporting.



2. Filling the current story



To directly capture snapshots of a view, a bash script is used that runs the plpgsql function.



get_current_activity.sh
 #!/bin/bash ######################################################### #get_current_activity.sh ERROR_FILE='/home/demon/get_current_activity'$(date +%Y%m%d-)'T'$(date +%H)$(date +%M)$(date +%S) host=$1 s_name=$2 s_pass=$3 psql -A -t -q -v ON_ERROR_STOP=1 -c "SELECT activity_hist.get_current_activity( '$host' , '$s_name' , '$s_pass' )" >/dev/null 2>$ERROR_FILE line_count=`cat $ERROR_FILE | wc -l` if [[ $line_count != '0' ]]; then rm -f /home/demon/*.err >/dev/null 2>/dev/null cp $ERROR_FILE $ERROR_FILE'.err' >/dev/null 2>/dev/null fi rm $ERROR_FILE >/dev/null 2>/dev/null exit 0
      
      





The plpgsql dblink function accesses views in the target database and inserts rows into service tables in the monitoring database.



get_current_activity.sql
 CREATE OR REPLACE FUNCTION activity_hist.get_current_activity( current_host text , current_s_name text , current_s_pass text ) RETURNS BOOLEAN AS $$ DECLARE database_rec record; dblink_str text ; BEGIN EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||current_host||' port=5432 dbname=postgres'|| ' user='||current_s_name||' password='||current_s_pass|| ' '')'; -------------------------------------------------------------------- --GET pg_stat_activity stats INSERT INTO activity_hist.history_pg_stat_activity ( SELECT * FROM dblink('LINK1', 'SELECT now() , datid , datname , pid , usesysid , usename , application_name , client_addr , client_hostname , client_port , backend_start , xact_start , query_start , state_change , wait_event_type , wait_event , state , backend_xid , backend_xmin , query , backend_type FROM pg_stat_activity ') AS t ( timepoint timestamp without time zone , datid oid , datname name , pid integer, usesysid oid , usename name , application_name text , client_addr inet , client_hostname text , client_port integer, backend_start timestamp with time zone , xact_start timestamp with time zone , query_start timestamp with time zone , state_change timestamp with time zone , wait_event_type text , wait_event text , state text , backend_xid xid , backend_xmin xid , query text , backend_type text ) ); --------------------------------------- --ACTIVITY_HIST.HISTORY_LOCKING INSERT INTO activity_hist.history_locking ( SELECT * FROM dblink('LINK1', 'SELECT now() , lock.locktype, lock.relation, lock.mode, lock.transactionid as tid, lock.virtualtransaction as vtid, lock.pid, pg_blocking_pids(lock.pid), lock.granted FROM pg_catalog.pg_locks lock LEFT JOIN pg_catalog.pg_database db ON db.oid = lock.database WHERE NOT lock.pid = pg_backend_pid() ') AS t ( timepoint timestamp without time zone , locktype text , relation oid , mode text , tid xid , vtid text , pid integer , blocking_pids integer[] , granted boolean ) ); PERFORM dblink_disconnect('LINK1'); RETURN TRUE ; END $$ LANGUAGE plpgsql;
      
      





To collect snapshots of the view, systemd service is used, and two scripts:



pg_current_activity.service
 # /etc/systemd/system/pg_current_activity.service [Unit] Description=Collect history of pg_stat_activity , pg_locks Wants=pg_current_activity.timer [Service] Type=forking StartLimitIntervalSec=0 ExecStart=/home/postgres/pgutils/demon/get_current_activity.sh 10.124.70.40 postgres postgres [Install] WantedBy=multi-user.target
      
      





pg_current_activity.timer
 # /etc/systemd/system/pg_current_activity.timer [Unit] Description=Run pg_current_activity.sh every 1 second Requires=pg_current_activity.service [Timer] Unit=pg_current_activity.service OnCalendar=*:*:0/1 AccuracySec=1 [Install] WantedBy=timers.target
      
      







Assign rights to scripts:

# chmod 755 pg_current_activity.timer

# chmod 755 pg_current_activity.service



We start the service:

# systemctl daemon-reload

# systemctl start pg_current_activity.service



Thus, the history of performances is collected in the form of every second shots. Of course, if you leave everything as it is, the tables will increase in size very quickly and more or less productive work will become impossible.



It is necessary to organize data archiving.



3.Archive history



For archiving, partitioned tables * are used.



New sections are created every hour, while old data from the history * tables is deleted, so the size of the history * tables does not change much and the insertion speed does not degrade over time.



Creating new sections is performed by the plpgsql function activity_hist.archive_current_activity. The algorithm of work is very simple (on the example of a section for the archive_pg_stat_activity table).



Create and fill in a new section
 EXECUTE format( 'CREATE TABLE ' || partition_name || ' PARTITION OF activity_hist.archive_pg_stat_activity FOR VALUES FROM ( %L ) TO ( %L ) ' , to_char(date_trunc('year', partition_min_range ),'YYYY')||'-'|| to_char(date_trunc('month', partition_min_range ),'MM')||'-'|| to_char(date_trunc('day', partition_min_range ),'DD')||' '|| to_char(date_trunc('hour', partition_min_range ),'HH24')||':00', to_char(date_trunc('year', partition_max_range ),'YYYY')||'-'|| to_char(date_trunc('month', partition_max_range ),'MM')||'-'|| to_char(date_trunc('day', partition_max_range ),'DD')||' '|| to_char(date_trunc('hour', partition_max_range ),'HH24')||':00' ); INSERT INTO activity_hist.archive_pg_stat_activity ( SELECT * FROM activity_hist.history_pg_stat_activity WHERE timepoint BETWEEN partition_min_range AND partition_max_range );
      
      





Create indexes
 EXECUTE format ( 'CREATE INDEX '||index_name|| ' ON '||partition_name||' ( wait_event_type , backend_type , timepoint )' ); EXECUTE format ('CREATE INDEX '||index_name|| ' ON '||partition_name||' ( wait_event_type , backend_type , timepoint , queryid )' );
      
      





Delete old data from the history_pg_stat_activity table
 DELETE FROM activity_hist.history_pg_stat_activity WHERE timepoint < partition_max_range;
      
      





Of course, periodically, old sections are deleted as unnecessary.



Basic reports



Actually, why is all this being done. To receive reports very remotely, roughly reminiscent of Oracle AWR.

It is important to add that in order to receive reports, it is necessary to build a relationship between the pg_stat_activity and pg_stat_statements views. Tables are linked by adding the 'queryid' column to the tables 'history_pg_stat_activity', 'archive_pg_stat_activity'. The way to add a column value is beyond the scope of this article and is described here - pg_stat_statements + pg_stat_activity + loq_query = pg_ash? .

TOTAL CPU TIME FOR QUERIES



Inquiry :
 WITH hist AS ( SELECT aa.query ,aa.queryid , count(*) * interval '1 second' AS duration FROM activity_hist.archive_pg_stat_activity aa WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( aa.wait_event_type IS NULL ) ANDaa.state = 'active' GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid UNION SELECT ha.query ,ha.queryid, count(*) * interval '1 second' AS duration FROM activity_hist.history_pg_stat_activity_for_reports ha WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( ha.wait_event_type IS NULL )AND ha.state = 'active' GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid ) SELECT query , queryid , SUM( duration ) as duration FROM hist GROUP BY query , queryid ORDER BY 3 DESC
      
      





Example:

 ------------------------------------------------------------------- | TOTAL CPU TIME FOR QUERIES : 07:47:36 +----+----------------------------------------+-------------------- | #| queryid| duration +----+----------------------------------------+-------------------- | 1| 389015618226997618| 04:28:58 | 2| | 01:07:29 | 3| 1237430309438971376| 00:59:38 | 4| 4710212362688288619| 00:50:48 | 5| 28942442626229688| 00:15:50 | 6| 9150846928388977274| 00:04:46 | 7| -6572922443698419129| 00:00:06 | 8| | 00:00:01 +----+----------------------------------------+--------------------
      
      







TOTAL WAITINGS TIME FOR QUERIES



Inquiry :
 WITH hist AS ( SELECT aa.query ,aa.queryid , count(*) * interval '1 second' AS duration FROM activity_hist.archive_pg_stat_activity aa WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( aa.wait_event_type IS NOT NULL ) GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid UNION SELECT ha.query ,ha.queryid, count(*) * interval '1 second' AS duration FROM activity_hist.history_pg_stat_activity_for_reports ha WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( ha.wait_event_type IS NOT NULL ) GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid ) SELECT query , queryid , SUM( duration ) as duration FROM hist GROUP BY query , queryid ORDER BY 3 DESC
      
      





An example:

 ------------------------------------------------------------------- | TOTAL WAITINGS TIME FOR QUERIES : 21:55:04 +----+----------------------------------------+-------------------- | #| queryid| duration +----+----------------------------------------+-------------------- | 1| 389015618226997618| 16:19:05 | 2| | 03:47:04 | 3| 8085340880788646241| 00:40:20 | 4| 4710212362688288619| 00:13:35 | 5| 9150846928388977274| 00:12:25 | 6| 28942442626229688| 00:11:32 | 7| 1237430309438971376| 00:09:45 | 8| 2649515222348904837| 00:09:37 | 9| | 00:03:45 | 10| 3167065002719415275| 00:02:20 | 11| 5731212217001535134| 00:02:13 | 12| 8304755792398128062| 00:01:31 | 13| 2649515222348904837| 00:00:59 | 14| 2649515222348904837| 00:00:22 | 15| | 00:00:12 | 16| 3422818749220588372| 00:00:08 | 17| -5730801771815999400| 00:00:03 | 18| -1473395109729441239| 00:00:02 | 19| 2404820632950544954| 00:00:02 | 20| -6572922443698419129| 00:00:02 | 21| 2369289265278398647| 00:00:01 | 22| 180077086776069052| 00:00:01 +----+----------------------------------------+--------------------
      
      





WAITINGS FOR QUERIES



Requests:
 WITH hist AS ( SELECT aa.wait_event_type , aa.wait_event FROM activity_hist.archive_pg_stat_activity aa WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND aa.wait_event IS NOT NULL GROUP BY aa.wait_event_type , aa.wait_event UNION SELECT ha.wait_event_type , ha.wait_event FROM activity_hist.history_pg_stat_activity_for_reports ha WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ha.wait_event IS NOT NULL GROUP BY ha.wait_event_type , ha.wait_event ) SELECT wait_event_type , wait_event FROM hist GROUP BY wait_event_type , wait_event ORDER BY 1 ASC,2 ASC ---------------------------------------------------------------------- WITH hist AS ( SELECT aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid , count(*) * interval '1 second' AS duration FROM activity_hist.archive_pg_stat_activity aa WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( aa.wait_event_type = waitings_stat_rec.wait_event_type AND aa.wait_event = waitings_stat_rec.wait_event ) GROUP BY aa.wait_event_type , aa.wait_event , aa.query ,aa.queryid UNION SELECT ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid, count(*) * interval '1 second' AS duration FROM activity_hist.history_pg_stat_activity_for_reports ha WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND backend_type = 'client backend' AND datname != 'postgres' AND ( ha.wait_event_type = waitings_stat_rec.wait_event_type AND ha.wait_event = waitings_stat_rec.wait_event ) GROUP BY ha.wait_event_type , ha.wait_event , ha.query ,ha.queryid ) SELECT query , queryid , SUM( duration ) as duration FROM hist GROUP BY query , queryid ORDER BY 3 DESC
      
      





Example:

 ------------------------------------------------ | WAITINGS FOR QUERIES +----------------------------------------------- | wait_event_type = Client| | wait_event = ClientRead| | Total time = 00:46:56| ------------------------------------------------ | #| queryid| duration +-----+--------------------+-------------------- | 1| 8085340880788646241| 00:40:20 | 2| | 00:03:45 | 3| 5731212217001535134| 00:01:53 | 4| | 00:00:12 | 5| 9150846928388977274| 00:00:09 | 6| 3422818749220588372| 00:00:08 | 7| 1237430309438971376| 00:00:06 | 8| 28942442626229688| 00:00:05 | 9| 4710212362688288619| 00:00:05 | 10|-5730801771815999400| 00:00:03 | 11| 8304755792398128062| 00:00:02 | 12|-6572922443698419129| 00:00:02 | 13|-1473395109729441239| 00:00:02 | 14| 2404820632950544954| 00:00:02 | 15| 180077086776069052| 00:00:01 | 16| 2369289265278398647| 00:00:01 +----------------------------------------------- | wait_event_type = IO| | wait_event = BufFileRead| | Total time = 00:00:38| ------------------------------------------------ | #| queryid| duration +-----+--------------------+-------------------- | 1| 28942442626229688| 00:00:38 +-----------------------------------------------
      
      





LOCKED PROCESSES HISTORY



Inquiry:
 SELECT MIN(date_trunc('second',timepoint)) AS started , count(*) * interval '1 second' as duration , pid , blocking_pids , relation , mode , locktype FROM activity_hist.archive_locking al WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND NOT granted AND locktype = 'relation' GROUP BY pid , blocking_pids , relation , mode , locktype UNION SELECT MIN(date_trunc('second',timepoint)) AS started , count(*) * interval '1 second' as duration , pid , blocking_pids , relation , mode , locktype FROM activity_hist.history_locking WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND NOT granted AND locktype = 'relation' GROUP BY pid , blocking_pids , relation , mode , locktype ORDER BY 1
      
      





Example:

  -------------------------------------------------- -------------------------------------------------- ---------------------------------
 |  LOCKED PROCESSES HISTORY
 + ----- + ---------- + -------------------- + ---------- + -------------------- + -------------------- + -------- ------------ + --------------------
 |  # |  pid |  started |  duration |  blocking_pids |  relation |  mode |  locktype
 + ----- + ---------- + -------------------- + ---------- + -------------------- + -------------------- + -------- ------------ + --------------------
 |  1 |  26224 |  2019-09-02 19: 32: 16 |  00: 01: 45 |  {26211} |  16541 |  AccessShareLock |  relation
 |  2 |  26390 |  2019-09-02 19: 34: 03 |  00: 00: 53 |  {26211} |  16541 |  AccessShareLock |  relation
 |  3 |  26391 |  2019-09-02 19: 34: 03 |  00: 00: 53 |  {26211} |  16541 |  AccessShareLock |  relation
 |  4 |  26531 |  2019-09-02 19: 35: 27 |  00: 00: 12 |  {26211} |  16541 |  AccessShareLock |  relation
 |  5 |  27284 |  2019-09-02 19: 44: 02 |  00: 00: 19 |  {27276} |  16541 |  AccessShareLock |  relation
 |  6 |  27283 |  2019-09-02 19: 44: 02 |  00: 00: 19 |  {27276} |  16541 |  AccessShareLock |  relation
 |  7 |  27286 |  2019-09-02 19: 44: 02 |  00: 00: 19 |  {27276} |  16541 |  AccessShareLock |  relation
 |  8 |  27423 |  2019-09-02 19:45:24 |  00: 00: 12 |  {27394} |  16541 |  AccessShareLock |  relation
 |  9 |  27648 |  2019-09-02 19: 48: 06 |  00: 00: 20 |  {27647} |  16541 |  AccessShareLock |  relation
 |  10 |  27650 |  2019-09-02 19: 48: 06 |  00: 00: 20 |  {27647} |  16541 |  AccessShareLock |  relation
 |  11 |  27735 |  2019-09-02 19: 49: 08 |  00: 00: 06 |  {27650} |  16541 |  AccessExclusiveLock |  relation
 |  12 |  28380 |  2019-09-02 19: 56: 03 |  00: 01: 56 |  {28379} |  16541 |  AccessShareLock |  relation
 |  13 |  28379 |  2019-09-02 19: 56: 03 |  00: 00: 01 |  28377 |  16541 |  AccessExclusiveLock |  relation
 |  |  |  |  |  28376 |  | 


BLOCKING PROCESSES HISTORY



Requests:
 SELECT blocking_pids FROM activity_hist.archive_locking al WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND NOT granted AND locktype = 'relation' GROUP BY blocking_pids UNION SELECT blocking_pids FROM activity_hist.history_locking WHERE timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND NOT granted AND locktype = 'relation' GROUP BY blocking_pids ORDER BY 1 --------------------------------------------------------------- SELECT pid , usename , application_name , datname , MIN(date_trunc('second',timepoint)) as started , count(*) * interval '1 second' as duration , state , query FROM activity_hist.archive_pg_stat_activity WHERE pid= current_pid AND timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') GROUP BY pid , usename , application_name , datname , state_change, state , query UNION SELECT pid , usename , application_name , datname , MIN(date_trunc('second',timepoint)) as started , count(*) * interval '1 second' as duration , state , query FROM activity_hist.history_pg_stat_activity_for_reports WHERE pid= current_pid AND timepoint BETWEEN pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') GROUP BY pid , usename , application_name , datname , state_change, state , query ORDER BY 5 , 1
      
      





Example:

  -------------------------------------------------- -------------------------------------------------- -------------------------------------------------- -----------------------
 BLOCKING PROCESSES HISTORY
 + ---- + ---------- + ---------- + -------------------- + - --------- + -------------------- + ------------------- - + ------------------------------ + ----------------- -----------------------
 |  # |  pid |  usename |  application_name |  datname |  started |  duration |  state |  query
 + ---- + ---------- + ---------- + -------------------- + - --------- + -------------------- + ------------------- - + ------------------------------ + ----------------- -----------------------
 |  1 |  26211 |  tuser |  psql |  tdb1 |  2019-09-02 19: 31: 54 |  00: 00: 04 |  idle |
 |  2 |  26211 |  tuser |  psql |  tdb1 |  2019-09-02 19: 31: 58 |  00: 00: 06 |  idle in transaction |  begin;
 |  3 |  26211 |  tuser |  psql |  tdb1 |  2019-09-02 19: 32: 16 |  00: 01: 45 |  idle in transaction |  lock table wafer_data;
 |  4 |  26211 |  tuser |  psql |  tdb1 |  2019-09-02 19: 35: 54 |  00: 01: 23 |  idle |  commit;
 |  5 |  26211 |  tuser |  psql |  tdb1 |  2019-09-02 19: 38: 46 |  00: 00: 02 |  idle in transaction |  begin;
 |  6 |  26211 |  tuser |  psql |  tdb1 |  2019-09-02 19: 38: 54 |  00: 00: 08 |  idle in transaction |  lock table wafer_data;
 |  7 |  26211 |  tuser |  psql |  tdb1 |  2019-09-02 19: 39: 08 |  00: 42: 42 |  idle |  commit;
 |  8 |  26211 |  tuser |  psql |  tdb1 |  2019-09-03 07: 12: 07 |  00: 00: 52 |  active |  select test_del ();




Development.



The shown basic queries and received reports already greatly simplify life when analyzing performance incidents.

Based on basic queries, you can get a report remotely approximately reminiscent of Oracle AWR.

Sample Summary Report
  + ------------------------------------------------- -----------------------------------
 |  CONSOLIDATED REPORT FOR ACTIVITY AND WAITINGS.  DATETIME: 09/03/2019 14:08
 | ------------------------------------------------- -----------------------------------
 |  HOST: 10.124.70.40
 |  BEGIN_SNAPSHOT: 09/02/2019 14:08 END_SNAPSHOT: 09/03/2019 14:00
 | ------------------------------------------------- -----------------------------------
 |  CURRENT DATABASES SIZE:
 |  DATABASE: monitor
 |  SIZE (MB): 1370.00
 | ------------------------------------------------- -----------------------------------
 |  CLUSTER CPU TIME: 19:44:22
 |  CLUSTER WAITINGS TIME: 78:49:16
 |
 |  SQL DBTIME: 65:53:09
 |  SQL CPU TIME: 19:05:21
 |  SQL WAITINGS TIME: 21:50:46
 |  SQL IOTIME: 20:53:00
 |  SQL READ TIME: 20:52:55
 |  SQL WRITE TIME: 00:00:05
 |
 |  SQL CALLS: 311293
 -------------------------------------------------- -----------
 |  SQL SHARED BLOCKS READS: 13351563334
 |  SQL SHARED BLOCKS HITS: 2775427045
 |  SQL SHARED BLOCKS HITS / READS%: 20.79
 |  SQL SHARED BLOCKS DIRTED: 21105
 |  SQL SHARED BLOCKS WRITTEN: 3656
 |
 |  SQL TEMPORARY BLOCKS READS: 7464932
 |  SQL TEMPORARY BLOCKS WRITTEN: 10176024
 -------------------------------------------------- -----------
 |
 |  WAITINGS STATICTICS
 |
 + ------------------------------------------------- -----------------------------------
 |  TOP 10 WAITINGS BY TOTAL WAIT TIME FOR SYSTEM PROCESSES
 + ----- + ------------------------------ + ------------ -------- + --------------------
 |  # |  wait_event_type |  wait_event |  duration
 + ----- + ------------------------------ + ------------ -------- + --------------------
 |  1 |  Activity |  LogicalLauncherMain |  11:21:01
 |  2 |  Activity |  CheckpointerMain |  11:20:35
 |  3 |  Activity |  AutoVacuumMain |  11:20:31
 |  4 |  Activity |  WalWriterMain |  11:19:35
 |  5 |  Activity |  BgWriterMain |  10:14:19
 |  6 |  Activity |  BgWriterHibernate |  01:06:04
 |  7 |  Activity |  WalSenderMain |  00:04:05
 |  8 |  Client |  ClientWrite |  00:04:00
 |  9 |  IO |  BufFileWrite |  00:02:45
 |  10 |  LWLock |  buffer_mapping |  00:02:14
 + ----- + ------------------------------ + ------------ -------- + --------------------
 |  TOP 10 WAITINGS BY TOTAL WAIT TIME FOR CLIENTS PROCESSES
 + ----- + ------------------------------ + ------------ -------- + -------------------- + ----------
 |  # |  wait_event_type |  wait_event |  duration |  % dbtime
 + ----- + ------------------------------ + ------------ -------- + -------------------- + ----------
 |  1 |  Lock |  transactionid |  11: 55: 37 |  18.1
 |  2 |  IO |  DataFileRead |  07: 19: 43 |  12/12
 |  3 |  Client |  ClientRead |  00: 46: 54 |  1.19
 |  4 |  Lock |  relation |  00: 40: 37 |  1.03
 |  5 |  LWLock |  buffer_mapping |  00: 31: 08 |  0.79
 |  6 |  LWLock |  buffer_io |  00: 22: 12 |  0.56
 |  7 |  Timeout |  PgSleep |  00: 10: 58 |  0.28
 |  8 |  Lock |  tuple |  00: 01: 30 |  0.04
 |  9 |  IO |  BufFileWrite |  00: 01: 16 |  0.03
 |  10 |  IO |  BufFileRead |  00: 00: 37 |  0.02
 + ----- + ------------------------------ + ------------ -------- + -------------------- + ----------
 |  WAITINGS TYPES BY TOTAL WAIT TIME, FOR SYSTEM PROCESSES
 + ----- + ------------------------------ + ------------ --------
 |  # |  wait_event_type |  duration
 + ----- + ------------------------------ + ------------ --------
 |  1 |  Activity |  56:46:10
 |  2 |  IO |  00:05:13
 |  3 |  Client |  00:04:00
 |  4 |  LWLock |  00:03:07
 + ----- + ------------------------------ + ------------ --------
 |  WAITINGS TYPES BY TOTAL WAIT TIME, FOR CLIENTS PROCESSES
 + ----- + ------------------------------ + ------------ -------- + --------------------
 |  # |  wait_event_type |  duration |  % dbtime
 + ----- + ------------------------------ + ------------ -------- + --------------------
 |  1 |  Lock |  12: 37: 44 |  19.17
 |  2 |  IO |  07: 21: 40 |  11.17
 |  3 |  LWLock |  00: 53: 26 |  1.35
 |  4 |  Client |  00: 46: 54 |  1.19
 |  5 |  Timeout |  00: 10: 58 |  0.28
 |  6 |  IPC |  00: 00: 04 |  0
 + ----- + ------------------------------ + ------------ -------- + --------------------
 |  WAITINGS FOR SYSTEM PROCESSES
 + ----- + ----------------------------- + ---------- + - ------------------ + ---------------------- + -------- ------------
 |  # |  backend_type |  dbname |  wait_event_type |  wait_event |  duration
 + ----- + ----------------------------- + ---------- + - ------------------ + ---------------------- + -------- ------------
 |  1 |  logical replication launcher |  |  Activity |  LogicalLauncherMain |  11:21:01
 |  2 |  checkpointer |  |  Activity |  CheckpointerMain |  11:20:35
 |  3 |  autovacuum launcher |  |  Activity |  AutoVacuumMain |  11:20:31
 |  4 |  walwriter |  |  Activity |  WalWriterMain |  11:19:35
 |  5 |  background writer |  |  Activity |  BgWriterMain |  10:14:19
 |  6 |  background writer |  |  Activity |  BgWriterHibernate |  01:06:04
 |  7 |  walsender |  |  Activity |  WalSenderMain |  00:04:05
 |  8 |  walsender |  |  Client |  ClientWrite |  00:04:00
 |  9 |  parallel worker |  tdb1 |  IO |  BufFileWrite |  00:02:45
 |  10 |  parallel worker |  tdb1 |  LWLock |  buffer_mapping |  00:02:05
 |  11 |  parallel worker |  tdb1 |  IO |  DataFileRead |  00:01:10
 |  12 |  parallel worker |  tdb1 |  IO |  BufFileRead |  00:01:05
 |  13 |  parallel worker |  tdb1 |  LWLock |  buffer_io |  00:00:45
 |  14 |  autovacuum worker |  tdb1 |  LWLock |  buffer_mapping |  00:00:09
 |  15 |  walwriter |  |  IO |  WALWrite |  00:00:08
 |  16 |  walwriter |  |  LWLock |  WALWriteLock |  00:00:04
 |  17 |  background writer |  |  LWLock |  WALWriteLock |  00:00:03
 |  18 |  background writer |  |  IO |  WALWrite |  00:00:02
 |  19 |  background writer |  |  IO |  DataFileWrite |  00:00:02
 |  20 |  checkpointer |  |  IO |  ControlFileSyncUpdate |  00:00:01
 |  21 |  autovacuum worker |  tdb1 |  LWLock |  buffer_io |  00:00:01
 + ----- + ----------------------------- + ---------- + - ------------------ + ---------------------- + -------- ------------
 |  WAITINGS FOR SQL
 + ----- + ------------------------- + ---------- + ------ -------------- + -------------------- + -------------- ------ + ----------
 |  # |  queryid |  dbname |  wait_event_type |  wait_event |  duration |  % dbtime
 + ----- + ------------------------- + ---------- + ------ -------------- + -------------------- + -------------- ------ + ----------
 |  1 |  389015618226997618 |  tdb1 |  Lock |  transactionid |  09: 47: 43 |  14.87
 |  2 |  389015618226997618 |  tdb1 |  IO |  DataFileRead |  05: 47: 07 |  8.78
 |  3 |  |  tdb1 |  Lock |  transactionid |  02: 07: 54 |  3.24
 |  4 |  |  tdb1 |  IO |  DataFileRead |  01: 30: 24 |  2.29
 |  5 |  8085340880788646241 |  tdb1 |  Client |  ClientRead |  00: 40: 20 |  1.02
 |  6 |  389015618226997618 |  tdb1 |  LWLock |  buffer_mapping |  00: 20: 41 |  0.52
 |  7 |  389015618226997618 |  tdb1 |  LWLock |  buffer_io |  00: 17: 30 |  0.44
 |  8 |  2649515222348904837 |  tdb1 |  Timeout |  PgSleep |  00: 10: 58 |  0.28
 |  9 |  4710212362688288619 |  tdb1 |  Lock |  relation |  00: 10: 44 |  0.27
 |  10 |  9150846928388977274 |  tdb1 |  Lock |  relation |  00: 10: 24 |  0.26
 |  11 |  28942442626229688 |  tdb1 |  Lock |  relation |  00: 07: 48 |  0.2
 |  12 |  1237430309438971376 |  tdb1 |  Lock |  relation |  00: 07: 32 |  0.19
 |  13 |  |  tdb1 |  LWLock |  buffer_mapping |  00: 04: 32 |  0.11
 |  14 |  |  tdb1 |  LWLock |  buffer_io |  00: 04: 13 |  0.11
 |  15 |  |  tdb1 |  Client |  ClientRead |  00: 03: 57 |  0.1
 |  16 |  4710212362688288619 |  tdb1 |  LWLock |  buffer_mapping |  00: 02: 26 |  0.06
 |  17 |  3167065002719415275 |  tdb1 |  Lock |  relation |  00: 02: 20 |  0.06
 |  18 |  5731212217001535134 |  tdb1 |  Client |  ClientRead |  00: 01: 53 |  0.05
 |  19 |  1237430309438971376 |  tdb1 |  LWLock |  buffer_mapping |  00: 01: 42 |  0.04
 |  20 |  389015618226997618 |  tdb1 |  Lock |  tuple |  00: 01: 30 |  0.04
 |  21 |  8304755792398128062 |  tdb1 |  Lock |  relation |  00: 01: 29 |  0.04
 |  22 |  28942442626229688 |  tdb1 |  IO |  BufFileWrite |  00: 01: 16 |  0.03
 |  23 |  9150846928388977274 |  tdb1 |  IO |  DataFileRead |  00: 01: 07 |  0.03
 |  24 |  28942442626229688 |  tdb1 |  LWLock |  buffer_mapping |  00: 01: 03 |  0.03
 |  25 |  9150846928388977274 |  tdb1 |  LWLock |  buffer_mapping |  00: 00: 44 |  0.02
 |  26 |  28942442626229688 |  tdb1 |  IO |  BufFileRead |  00: 00: 37 |  0.02
 |  27 |  28942442626229688 |  tdb1 |  LWLock |  buffer_io |  00: 00: 25 |  0.01
 |  28 |  1237430309438971376 |  tdb1 |  IO |  DataFileRead |  00: 00: 24 |  0.01
 |  29 |  28942442626229688 |  tdb1 |  IO |  DataFileRead |  00: 00: 22 |  0.01
 |  30 |  5731212217001535134 |  tdb1 |  Lock |  relation |  00: 00: 20 |  0.01
 |  31 |  4710212362688288619 |  tdb1 |  IO |  DataFileRead |  00: 00: 19 |  0.01
 |  32 |  9150846928388977274 |  tdb1 |  Client |  ClientRead |  00: 00: 09 |  0
 |  33 |  3422818749220588372 |  tdb1 |  Client |  ClientRead |  00: 00: 08 |  0
 |  34 |  1237430309438971376 |  tdb1 |  Client |  ClientRead |  00: 00: 06 |  0
 |  35 |  389015618226997618 |  tdb1 |  LWLock |  buffer_content |  00: 00: 05 |  0
 |  36 |  4710212362688288619 |  tdb1 |  Client |  ClientRead |  00: 00: 05 |  0
 |  37 |  4710212362688288619 |  tdb1 |  LWLock |  buffer_io |  00: 00: 04 |  0
 |  38 |  28942442626229688 |  tdb1 |  Client |  ClientRead |  00: 00: 04 |  0
 |  39 |  28942442626229688 |  tdb1 |  IPC |  ParallelFinish |  00: 00: 03 |  0
 |  40 |  389015618226997618 |  tdb1 |  IO |  DataFileWrite |  00: 00: 02 |  0
 |  41 |  -5730801771815999400 |  tdb1 |  Client |  ClientRead |  00: 00: 02 |  0
 |  42 |  2404820632950544954 |  tdb1 |  Client |  ClientRead |  00: 00: 02 |  0
 |  43 |  -6572922443698419129 |  tdb1 |  Client |  ClientRead |  00: 00: 02 |  0
 |  44 |  8304755792398128062 |  tdb1 |  Client |  ClientRead |  00: 00: 02 |  0
 |  45 |  -1473395109729441239 |  tdb1 |  Client |  ClientRead |  00: 00: 02 |  0
 |  46 |  |  tdb1 |  LWLock |  buffer_content |  00: 00: 01 |  0
 |  47 |  180077086776069052 |  tdb1 |  Client |  ClientRead |  00: 00: 01 |  0
 |  48 |  |  tdb1 |  IO |  DataFileWrite |  00: 00: 01 |  0
 |  49 |  28942442626229688 |  tdb1 |  IPC |  MessageQueueReceive |  00: 00: 01 |  0
 |  50 |  2369289265278398647 |  tdb1 |  Client |  ClientRead |  00: 00: 01 |  0
 |  51 |  9150846928388977274 |  tdb1 |  IO |  DataFileWrite |  00: 00: 01 |  0
 + ----- + ------------------------- + ---------- + ------ -------------- + -------------------- + -------------- ------ + ----------
 |
 |  CLIENT SQL STATICTICS
 |
 + ------------------------------------------------- -----------------------------------
 |  CLIENT SQL ordered by Elapsed Time
 + -------------------- + ---------- + ---------- + ------ ---- + ---------- + ---------- + --------------------
 |  elapsed time |  calls |  % dbtime |  % CPU |  % IO |  dbname |  queryid
 + -------------------- + ---------- + ---------- + ------ ---- + ---------- + ---------- + --------------------
 |  06: 43: 19 |  36 |  10.2 |  September 85 |  17.38 |  tdb1 |  389015618226997618
 |  02: 06: 53 |  715 |  3.21 |  0.85 |  0.06 |  tdb1 |  1237430309438971376
 |  01: 52: 07 |  720 |  2.84 |  1.19 |  0.08 |  tdb1 |  4710212362688288619
 |  00: 39: 03 |  357 |  0.99 |  1.02 |  0.33 |  tdb1 |  28942442626229688
 |  00: 22: 00 |  8 |  0.56 |  0.96 |  0 |  tdb1 |  2649515222348904837
 + -------------------- + ---------- + ---------- + ------ ---- + ---------- + ---------- + --------------------
 |  CLIENT SQL ordered by CPU Time
 + -------------------- + ---------- + ---------- + ------ ---- + ---------- + ---------- + ---------- + ------------ --------
 |  cpu time |  calls | % dbtime|total_time| % CPU| % IO| dbname| queryid
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | 16:14:38| 36| 10.2| 06:43:19| 85.09| 17.38| tdb1| 389015618226997618
 | 00:13:38| 720| 2.84| 01:52:07| 1.19| 0.08| tdb1| 4710212362688288619
 | 00:11:39| 357| 0.99| 00:39:03| 1.02| 0.33| tdb1| 28942442626229688
 | 00:10:58| 8| 0.56| 00:22:00| 0.96|  0 | tdb1| 2649515222348904837
 | 00:09:44| 715| 3.21| 02:06:53| 0.85| 0.06| tdb1| 1237430309438971376
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | CLIENT SQL ordered by User I/O Wait Time
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | io_wait time| calls| % dbtime|total_time| % CPU| % IO| dbname| queryid
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | 05:47:09| 36| 10.2| 06:43:19| 85.09| 17.38| tdb1| 389015618226997618
 | 00:02:15| 357| 0.99| 00:39:03| 1.02| 0.33| tdb1| 28942442626229688
 | 00:00:24| 715| 3.21| 02:06:53| 0.85| 0.06| tdb1| 1237430309438971376
 | 00:00:19| 720| 2.84| 01:52:07| 1.19| 0.08| tdb1| 4710212362688288619
 | 00:00:00| 8| 0.56| 00:22:00| 0.96|  0 | tdb1| 2649515222348904837
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | CLIENT SQL ordered by Shared Buffers Reads
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | buffers reads| calls| % dbtime|total_time| % CPU| % IO| dbname| queryid
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | 2562353244| 36| 10.2| 06:43:19| 85.09| 17.38| tdb1| 389015618226997618
 | 11041689| 357| 0.99| 00:39:03| 1.02| 0.33| tdb1| 28942442626229688
 | 3303551| 715| 3.21| 02:06:53| 0.85| 0.06| tdb1| 1237430309438971376
 | 3242892| 720| 2.84| 01:52:07| 1.19| 0.08| tdb1| 4710212362688288619
 |  0 | 8| 0.56| 00:22:00| 0.96|  0 | tdb1| 2649515222348904837
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | CLIENT SQL ordered by Disk Reads Time
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | read time| calls| % dbtime|total_time| % CPU| % IO| dbname| queryid
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | 03:37:46| 36| 10.2| 06:43:19| 85.09| 17.38| tdb1| 389015618226997618
 | 00:04:07| 357| 0.99| 00:39:03| 1.02| 0.33| tdb1| 28942442626229688
 | 00:00:59| 720| 2.84| 01:52:07| 1.19| 0.08| tdb1| 4710212362688288619
 | 00:00:42| 715| 3.21| 02:06:53| 0.85| 0.06| tdb1| 1237430309438971376
 | 00:00:00| 8| 0.56| 00:22:00| 0.96|  0 | tdb1| 2649515222348904837
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | CLIENT SQL ordered by Executions
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | calls| rows| % dbtime|total_time| % CPU| % IO| dbname| queryid
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | 720| 720| 2.84| 01:52:07| 1.19| 0.08| tdb1| 4710212362688288619
 | 715| 715| 3.21| 02:06:53| 0.85| 0.06| tdb1| 1237430309438971376
 | 357|  0 | 0.99| 00:39:03| 1.02| 0.33| tdb1| 28942442626229688
 | 36| 36| 10.2| 06:43:19| 85.09| 17.38| tdb1| 389015618226997618
 | 8| 8| 0.56| 00:22:00| 0.96|  0 | tdb1| 2649515222348904837
+--------------------+----------+----------+----------+----------+----------+----------+--------------------
 | Complete List of SQL Text
----------------------------------------------
 ...


To be continued. The next step is the formation of the lock history (pg_stat_locks), a more detailed description of the process of filling tables.



All Articles