PostgreSQL的執行計劃分析
想查看Postgresql的執行計劃,下面分析下PG執行計劃中的cost等相關值是怎么計算出來的:
PG的版本是9.1.2
1.終端工具PGADMIN,對執行的語句按F7即可,然后看數據輸出和解釋
2.命令行分析:explain select * from table_name;
一般我們會比較關注消耗值cost和掃描的方式,如走索引或者full scan全表掃描.當COST值消耗比較大時需要注意是否有優化的可能。
與執行計劃相關的幾個參數,參看下面的示例:
kenyon=# select count(1) from dba.website ; --普通堆棧表,無任何索引約束count
20
(1 row)
kenyon=# explain select * from dba.website ;
QUERY PLAN
Seq Scan on website (cost=0.00..1.20 rows=20 width=4) (1 row)
--relpages磁盤頁,reltuples是行數(與實際不一定相符,一般略小) kenyon=# select relpages,reltuples from pg_class where relname = 'website'; relpages | reltuples ----------+----------- 1 | 20 (1 row)
kenyon=# select 11+200.01;
--cost = relpages seq_page_cost + reltuples cpu_tuple_cost?column?
1.20
(1 row)
kenyon=# show cpu_tuple_cost ;
cpu_tuple_cost
0.01 (1 row)
kenyon=# show seq_page_cost;
seq_page_cost
1 (1 row)
--加限制條件的執行計劃
kenyon=# select count(1) from dba.website where hits >15;
count
5
(1 row)
kenyon=# explain select * from dba.website where hits >15;
QUERY PLAN
Seq Scan on website (cost=0.00..1.25 rows=5 width=4) Filter: (hits > 15) (2 rows)
kenyon=# show cpu_operator_cost ;
cpu_operator_cost
0.0025 (1 row)
因為掃描的總數是20行,不變的,所以COST不會下降,相反反而增加了0.05,這是因為額外消耗了CPU的時間去檢查符合約束條件數據,即cost 在原來的基礎上再增加 20 0.0025 = 0.05 (reltuples cpu_operator_cost)
--加索引的執行計劃 kenyon=# select count(1) from dba.website_2 ;
count
8000 (1 row)
kenyon=# explain select * from dba.website_2 ;
QUERY PLAN
Seq Scan on website_2 (cost=0.00..112.00 rows=8000 width=4) (1 row)
kenyon=# select relpages,reltuples from pg_class where relname = 'website_2'; relpages | reltuples ----------+----------- 32 | 8000 (1 row)
kenyon=# explain select * from dba.website_2 where hits >7900; --走的索引
QUERY PLAN
Index Scan using ind_website_2 on website_2 (cost=0.00..10.00 rows=100 width=4) Index Cond: (hits > 7900) (2 rows) () kenyon=# explain select * from dba.website_2 where hits >10; --未走索引(不滿足索引條件,full scan)
QUERY PLAN
Seq Scan on website_2 (cost=0.00..132.00 rows=7991 width=4) -- 132 = 112+8000*0.0025 Filter: (hits > 10) (2 rows)
雖然讀取的COST更大,但是因為索引的緣故,訪問的數據量變小了,所以總體COST是下降的。</pre>--多表JOIN的執行計劃 示例: 若想看實際的一個執行時間,可以加上 analyze 參數
kenyon=# explain analyze select * from dba.website a ,dba.website_2 b where a.hits = b.hits and a.hits >18;QUERY PLAN
Merge Join (cost=1.26..1.90 rows=2 width=8) (actual time=0.070..0.075 rows=2 loops=1) Merge Cond: (b.hits = a.hits) -> Index Scan using ind_website_2 on website_2 b (cost=0.00..235.25 rows=8000 width=4) (actual time=0.013..0.020 rows=21 loops=1) -> Sort (cost=1.26..1.26 rows=2 width=4) (actual time=0.035..0.037 rows=2 loops=1) Sort Key: a.hits Sort Method: quicksort Memory: 17kB -> Seq Scan on website a (cost=0.00..1.25 rows=2 width=4) (actual time=0.009..0.011 rows=2 loops=1) Filter: (hits > 18) Total runtime : 0.120 ms (9 rows) </pre>total runtime 是執行器啟動和關閉的時間,但不包括解析,重寫和規劃的時間
注意: pg_class中的relpages,reltuples數據不是實時更新的,一般在vacuum analyze和少部分DDL(如建立索引)后更新。
示例1:kenyon=# insert into dba.website select generate_series(8000,9000); INSERT 0 1001 kenyon=# select relpages,reltuples,relname,relkind from pg_class where relname like '%website%'; relpages | reltuples | relname | relkind ----------+-----------+---------------+--------- 1 | 20 | website | r 32 | 8000 | website_2 | r 20 | 8000 | ind_website_2 | i (3 rows)kenyon=# vacuum analyze dba.website; VACUUM kenyon=# vacuum analyze dba.website; VACUUM kenyon=# select relpages,reltuples,relname,relkind from pg_class where relname like '%website%'; relpages | reltuples | relname | relkind ----------+-----------+---------------+--------- 5 | 1021 | website | r 36 | 8999 | website_2 | r 22 | 8999 | ind_website_2 | i (3 rows)</pre>示例2:
kenyon=# insert into dba.website select generate_series(8000,9000); INSERT 0 1001 kenyon=# select relpages,reltuples,relname,relkind from pg_class where relname like '%website%'; relpages | reltuples | relname | relkind ----------+-----------+---------------+--------- 1 | 21 | website | r 36 | 8999 | website_2 | r 22 | 8999 | ind_website_2 | i (3 rows)kenyon=# create index ind_website on dba.website(hits); CREATE INDEX kenyon=# select relpages,reltuples,relname,relkind from pg_class where relname like '%website%'; relpages | reltuples | relname | relkind ----------+-----------+---------------+--------- 5 | 1022 | website | r 36 | 8999 | website_2 | r 22 | 8999 | ind_website_2 | i 5 | 1022 | ind_website | i (4 rows)</pre>所涉及的系統表:
pg_stats
pg_statistic
pg_class
pg_stat是任何人都可以看的,而且可讀性高,比較直觀,pg_statistic只有superuser才能讀,并且可讀性差,普通人員建議看 pg_stats,pg_stats是pg_statistic的視圖。 這兩個表也不是實時更新的,需要vacuum analyze時會更新
所涉及的系統變量:
default_statistics_target
geqo_threshold
join_collapse_limit
from_collapse_limitkenyon=# show default_statistics_target ;default_statistics_target
100 (1 row)
kenyon=# show geqo_threshold ; --這個參數的大小會設置執行計劃從窮舉搜索到概率選擇性搜索的臨界值
geqo_threshold
12 (1 row)
kenyon=# show join_collapse_limit ; --join連接走執行計劃上限
join_collapse_limit
8 (1 row)
kenyon=# show from_collapse_limit ;
from_collapse_limit
8 (1 row)</pre>EXPLAIN
Name
EXPLAIN— show the execution plan of a statement
Synopsis
EXPLAIN [ ( option [, ...] ) ] statement
EXPLAIN [ ANALYZE ] [ VERBOSE ] statement
where option can be one of:
ANALYZE [ boolean ]
VERBOSE [ boolean ]
COSTS [ boolean ]
BUFFERS [ boolean ]
FORMAT { TEXT | XML | JSON | YAML }
例子:kenyon=# explain (analyze,verbose,costs,buffers) select id from dba.test222 order by id desc limit 1;QUERY PLAN
Limit (cost=1807.80..1807.80 rows=1 width=4) (actual time=87.167..87.168 rows=1 loops=1) Output: id Buffers: shared hit=393 -> Sort (cost=1807.80..2043.60 rows=94320 width=4) (actual time=87.165..87.165 rows=1 loops=1) Output: id Sort Key: test222.id Sort Method: top-N heapsort Memory: 17kB Buffers: shared hit=393 -> Seq Scan on dba.test222 (cost=0.00..1336.20 rows=94320 width=4) (actual time=0.036..42.847 rows=100000 loops=1) Output: id Buffers: shared hit=393 Total runtime: 87.183 ms (12 rows)
kenyon=# explain (analyze,verbose,costs,buffers) select max(id) from dba.test222;
QUERY PLAN
Aggregate (cost=1572.00..1572.01 rows=1 width=4) (actual time=77.679..77.680 rows=1 loops=1) Output: max(id) Buffers: shared hit=393 -> Seq Scan on dba.test222 (cost=0.00..1336.20 rows=94320 width=4) (actual time=0.012..36.908 rows=100000 loops=1) Output: id Buffers: shared hit=393 Total runtime: 77.701 ms (7 rows)</pre>explain參數解釋:
ANALYZE :執行命令并顯示執行事件,默認false
VERBOSE :對執行計劃提供額外的信息,如查詢字段信息等,默認false
COSTS :顯示執行計劃的,默認true
BUFFERS :默認false,前置條件是analyze
FORMAT :默認格式是text