PostgreSQL 執行計劃緩存和失效講解
來自: http://yq.aliyun.com/articles/4197
prepare statement的目的是讓同樣的SQL(無參數)或者類似的SQL(有參數)僅需要一次parse analyse, 因此減少了parse analyse的次數.
同時使用 prepare statement 還有一個功效是cache plan, 也就是說plan的次數也可以減少 .
本文介紹一下PostgreSQL的幾種prepare statement的方法, 僅供參考.
首先是SPI, 如SPI_prepare, 語法如下 :
SPIPlanPtr SPI_prepare ( const char * command , int nargs , Oid * argtypes )
const char * command
command string
int nargs
number of input parameters ( $1 , $2 , etc .)
Oid * argtypes
pointer to an array containing the OIDs of the data types of the parameters
</div>
注意SPI_prepare返回結果類型為SPIPlanPtr(一個指針), 用于存儲plan的數據.
作為SPI_execute_plan或者SPI_execute_plan_with_paramlist 的參數傳入 :
因為SPI_prepare可以帶參數(nargs,argtypes), 也可以不帶參數. SPI_execute_plan建立generic plan 對帶參數和不帶參數的處理方式略有不同, 見下面的解釋.
1, 無參數的情況.
If no parameters are defined , a generic plan will be created at the first use of SPI_execute_plan , and used for all subsequent executions as well .
2, 有參數的情況.
If there are parameters , the first few uses of SPI_execute_plan will generate custom plans that are specific to the supplied parameter values .
After enough uses of the same prepared statement , SPI_execute_plan will build a generic plan , and if that is not too much more expensive than the custom plans , it will start using the generic plan instead of re - planning each time .
另外就是使用如果使用prepared 游標, SPI_prepare_cursor, 可以指定flag. 如下 :
If this default behavior is unsuitable , you can alter it by passing the CURSOR_OPT_GENERIC_PLAN or CURSOR_OPT_CUSTOM_PLAN flag to SPI_prepare_cursor , to force use of generic or custom plans respectively .
</div>
這些flag來自src/include/nodes/parsenodes.h
/* ----------------------
* Declare Cursor Statement
*
* Note: the "query" field of DeclareCursorStmt is only used in the raw grammar
* output. After parse analysis it's set to null, and the Query points to the
* DeclareCursorStmt, not vice versa.
* ----------------------
*/
#define CURSOR_OPT_BINARY 0x0001 /* BINARY */
#define CURSOR_OPT_SCROLL 0x0002 /* SCROLL explicitly given */
#define CURSOR_OPT_NO_SCROLL 0x0004 /* NO SCROLL explicitly given */
#define CURSOR_OPT_INSENSITIVE 0x0008 /* INSENSITIVE */
#define CURSOR_OPT_HOLD 0x0010 /* WITH HOLD */
/* these planner-control flags do not correspond to any SQL grammar: */
#define CURSOR_OPT_FAST_PLAN 0x0020 /* prefer fast-start plan */
#define CURSOR_OPT_GENERIC_PLAN 0x0040 /* force use of generic plan */
#define CURSOR_OPT_CUSTOM_PLAN 0x0080 /* force use of custom plan */
SPI_prepare_cursor 用法 如下 :
SPIPlanPtr SPI_prepare_cursor ( const char * command , int nargs ,
Oid * argtypes , int cursorOptions )
PL/pgsql cache plan 實際上都是調用的SPI. 所以處理cache plan也分兩種情況 (帶參數和不帶參數).
例如如果以下SQL在PL/pgSQL中執行 :
1. select id,info from test where id=$1; 這屬于帶參數的SQL語句.
所以SPI_prepare這條SQL, 當第一次用SPI_execute_plan執行后不會馬上建立generic plan. 而需要有足夠多次使用到同一個prepare statement后才會建立generic plan.
2. select id,info from test where id=1; 這屬于不帶參數的SQL語句.
所以SPI_prepare這條SQL, 當第一次用SPI_execute_plan執行后將會建立generic plan.
選擇使用generic plan還是重新plan(custom plan), 用到這個函數 :
src/backend/utils/cache/plancache.c
00840 /*
00841 * choose_custom_plan: choose whether to use custom or generic plan
00842 *
00843 * This defines the policy followed by GetCachedPlan.
00844 */
00845 static bool
00846 choose_custom_plan ( CachedPlanSource * plansource , ParamListInfo boundParams )
00847 {
00848 double avg_custom_cost ;
00849
00850 /* Never any point in a custom plan if there's no parameters */
00851 if ( boundParams == NULL )
00852 return false ;
00853
00854 /* See if caller wants to force the decision */
00855 if ( plansource -> cursor_options & CURSOR_OPT_GENERIC_PLAN )
00856 return false ;
00857 if ( plansource -> cursor_options & CURSOR_OPT_CUSTOM_PLAN )
00858 return true ;
00859
00860 /* Generate custom plans until we have done at least 5 (arbitrary) */
00861 if ( plansource -> num_custom_plans < 5 )
00862 return true ;
00863
00864 avg_custom_cost = plansource -> total_custom_cost / plansource -> num_custom_plans ;
00865
00866 /*
00867 * Prefer generic plan if it's less than 10% more expensive than average
00868 * custom plan. This threshold is a bit arbitrary; it'd be better if we
00869 * had some means of comparing planning time to the estimated runtime cost
00870 * differential.
00871 *
00872 * Note that if generic_cost is -1 (indicating we've not yet determined
00873 * the generic plan cost), we'll always prefer generic at this point.
00874 */
00875 if ( plansource -> generic_cost < avg_custom_cost * 1.1 )
00876 return false ;
00877
00878 return true ;
00879 }
從函數內容來看,
首先無綁定參數的情況使用generic plan, 因此函數返回false.
然后判斷cursor_option是否包含CURSOR_OPT_GENERIC_PLAN和CURSOR_OPT_CUSTOM_PLAN. 直接返回true或false
然后判斷num_custom_plans, 如果大于等于5則使用generic plan. 小于5則返回true.
最后如果上面都沒有返回, 那么進入cost的判斷. 如果plansource->generic_cost < avg_custom_cost * 1.1 則選擇generic plan.
generic_cost , total_custom_cost 和 num_custom_plans都記錄在CachedPlanSource這個數據結構里面. 因此custom plan不是每次都產生, 而是通過以上函數來選擇的. 每使用一次custom plan都會更新 total_custom_cost 和 num_custom_plans的 數據, 用作下次判斷的依據.
下面舉例說明一下第二種情況, 也就是有參數的情況 :
首先生成測試數據 :
測試表 :
digoal => create table test ( id int , info text );
CREATE TABLE
測試數據 :
digoal => insert into test select 1 , repeat ( random ():: text , 10 ) from generate_series ( 1 , 500000 );
INSERT 0 500000
digoal => insert into test values ( 2 , 'test' );
INSERT 0 1
digoal => select id , count (*) from test group by id ;
id | count
----+--------
1 | 500000
2 | 1
( 2 rows )
</div>
索引 :
digoal => create index idx_test_1 on test ( id );
CREATE INDEX
接下來使用PL/pgsql 的cache plan情況測試 :
PL/pgSQL的cache plan用法細節參見 http://www.postgresql.org/docs/9.2/static/plpgsql-implementation.html
創建函數 :
CREATE OR REPLACE FUNCTION digoal . f_immutable ( i_id integer )
RETURNS bigint
LANGUAGE plpgsql
STRICT IMMUTABLE
AS $function$
declare
result bigint ;
begin
select count (*) into result from digoal . test where id = i_id ;
return result ;
end ;
$function$ ;
測試 :
digoal => \timing
Timing is on .
注意這些SQL的執行耗時. 很明顯的區分索引掃描和全表掃描.
使用PL/pgsql 封裝的 prepared statement 在第一次執行后SPI_execute_plan并沒有馬上產生 generic plan.
digoal => select * from f_immutable ( 1 );
f_immutable
-------------
500000
( 1 row )
Time : 148.080 ms
因此第二次執行 f_immutable(2) 依然進行了plan , 也就是用了custom plan.
因為1走了全表掃描, 2走了索引掃描.
digoal => select * from f_immutable ( 2 );
f_immutable
-------------
1
( 1 row )
Time : 0.736 ms
</div>
執行計劃如下 :
digoal => explain analyze select count (*) from test where id = 1 ;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 20637.01 .. 20637.02 rows = 1 width = 0 ) ( actual time = 245.336 .. 245.336 rows = 1 loops = 1 )
-> Seq Scan on test ( cost = 0.00 .. 19387.01 rows = 500001 width = 0 ) ( actual time = 0.012 .. 166.642 rows = 500000 loops = 1 )
Filter : ( id = 1 )
Rows Removed by Filter : 1
Total runtime : 245.384 ms
( 5 rows )
Time : 246.075 ms
digoal => explain analyze select count (*) from test where id = 2 ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 1.41 .. 1.42 rows = 1 width = 0 ) ( actual time = 0.031 .. 0.031 rows = 1 loops = 1 )
-> Index Only Scan using idx_test_1 on test ( cost = 0.00 .. 1.40 rows = 1 width = 0 ) ( actual time = 0.023 .. 0.024 rows = 1 loops = 1 )
Index Cond : ( id = 2 )
Heap Fetches : 1
Total runtime : 0.083 ms
( 5 rows )
Time : 0.772 ms
那么如SPI_prepare所述, 同一個prepare statement被SPI_execute_plan多次使用后, 將會生成generic plan.
所以接下來我多次(這里選擇5次) 調用 select * from f_immutable(1);
根據前面提到的choose_custom_plan函數, 5次后將會進入 plansource->generic_cost < avg_custom_cost * 1.1 的判斷.
因為前面5次的cost都一樣, 所以存儲在 plansource里的custom cost相關數值得出的custom cost平均值與generic_cost沒有相差10%. 所以會走generic plan. 那就可以解釋為什么第六次的SQL: select * from f_immutable(2); 未走索引了.
digoal => \timing
Timing is on .
digoal => select * from f_immutable ( 1 );
f_immutable
-------------
500000
( 1 row )
Time : 242.742 ms
digoal => select * from f_immutable ( 1 );
f_immutable
-------------
500000
( 1 row )
Time : 179.910 ms
digoal => select * from f_immutable ( 1 );
f_immutable
-------------
500000
( 1 row )
Time : 180.052 ms
digoal => select * from f_immutable ( 1 );
f_immutable
-------------
500000
( 1 row )
Time : 180.027 ms
digoal => select * from f_immutable ( 1 );
f_immutable
-------------
500000
( 1 row )
Time : 179.758 ms
經過以上5次調用 select * from f_immutable(1);后,
所以下面把參數改成2, 也不走索引掃描了. 執行時間139毫秒.
digoal => select * from f_immutable ( 2 );
f_immutable
-------------
1
( 1 row )
Time : 139.941 ms
digoal => select * from f_immutable ( 2 );
f_immutable
-------------
1
( 1 row )
Time : 139.994 ms
</div>
將函數的strict改成stable和volatile測試的結果與上面一致, 因為和函數的volatile無關.
digoal => alter function f_immutable ( int ) strict volatile ;
ALTER FUNCTION
Time : 0.490 ms
digoal => alter function f_immutable ( int ) strict stable ;
ALTER FUNCTION
Time : 0.451 ms
測試結果略.
接下來使用prepare SQL COMMAND進行測試 :
結果和PL/pgsql一致.
digoal => \timing
Timing is on .
digoal => prepare p_test ( int ) as select count (*) from test where id = $1 ;
PREPARE
Time : 1.154 ms
digoal => explain analyze execute p_test ( 1 );
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 20637.01 .. 20637.02 rows = 1 width = 0 ) ( actual time = 298.463 .. 298.463 rows = 1 loops = 1 )
-> Seq Scan on test ( cost = 0.00 .. 19387.01 rows = 500001 width = 0 ) ( actual time = 0.034 .. 220.116 rows = 500000 loops = 1 )
Filter : ( id = 1 )
Rows Removed by Filter : 1
Total runtime : 298.580 ms
( 5 rows )
Time : 299.951 ms
digoal => explain analyze execute p_test ( 1 );
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 20637.01 .. 20637.02 rows = 1 width = 0 ) ( actual time = 243.990 .. 243.990 rows = 1 loops = 1 )
-> Seq Scan on test ( cost = 0.00 .. 19387.01 rows = 500001 width = 0 ) ( actual time = 0.012 .. 166.291 rows = 500000 loops = 1 )
Filter : ( id = 1 )
Rows Removed by Filter : 1
Total runtime : 244.040 ms
( 5 rows )
Time : 244.800 ms
digoal => explain analyze execute p_test ( 1 );
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 20637.01 .. 20637.02 rows = 1 width = 0 ) ( actual time = 244.184 .. 244.184 rows = 1 loops = 1 )
-> Seq Scan on test ( cost = 0.00 .. 19387.01 rows = 500001 width = 0 ) ( actual time = 0.012 .. 166.416 rows = 500000 loops = 1 )
Filter : ( id = 1 )
Rows Removed by Filter : 1
Total runtime : 244.235 ms
( 5 rows )
Time : 244.817 ms
digoal => explain analyze execute p_test ( 1 );
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 20637.01 .. 20637.02 rows = 1 width = 0 ) ( actual time = 244.380 .. 244.380 rows = 1 loops = 1 )
-> Seq Scan on test ( cost = 0.00 .. 19387.01 rows = 500001 width = 0 ) ( actual time = 0.012 .. 166.055 rows = 500000 loops = 1 )
Filter : ( id = 1 )
Rows Removed by Filter : 1
Total runtime : 244.432 ms
( 5 rows )
Time : 245.028 ms
digoal => explain analyze execute p_test ( 1 );
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 20637.01 .. 20637.02 rows = 1 width = 0 ) ( actual time = 244.029 .. 244.029 rows = 1 loops = 1 )
-> Seq Scan on test ( cost = 0.00 .. 19387.01 rows = 500001 width = 0 ) ( actual time = 0.011 .. 166.128 rows = 500000 loops = 1 )
Filter : ( id = 1 )
Rows Removed by Filter : 1
Total runtime : 244.081 ms
( 5 rows )
Time : 244.701 ms
傳入參數2, 使用了generic plan, 而沒有執行custom plan.( 本來參數2應該走索引. )
digoal => explain analyze execute p_test ( 2 );
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 20637.01 .. 20637.02 rows = 1 width = 0 ) ( actual time = 115.265 .. 115.265 rows = 1 loops = 1 )
-> Seq Scan on test ( cost = 0.00 .. 19387.01 rows = 500001 width = 0 ) ( actual time = 115.257 .. 115.257 rows = 1 loops = 1 )
Filter : ( id = $1 )
Rows Removed by Filter : 500000
Total runtime : 115.317 ms
( 5 rows )
Time : 116.046 ms
</div>
下面把前5次的參數改成2, 主要說明choose_custom_plan的 plansource->generic_cost < avg_custom_cost * 1.1 比較過程.
如下 :
ocz@db - 172 - 16 - 3 - 150 -> psql digoal digoal
psql ( 9.2 . 1 )
Type "help" for help .
digoal => prepare p_test ( int ) as select count (*) from test where id = $1 ;
PREPARE
digoal => explain analyze execute p_test ( 2 );
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 1.41 .. 1.42 rows = 1 width = 0 ) ( actual time = 0.083 .. 0.084 rows = 1 loops = 1 )
-> Index Only Scan using idx_test_1 on test ( cost = 0.00 .. 1.40 rows = 1 width = 0 ) ( actual time = 0.074 .. 0.075 rows = 1 loops = 1 )
Index Cond : ( id = 2 )
Heap Fetches : 1
Total runtime : 0.200 ms
( 5 rows )
digoal => explain analyze execute p_test ( 2 );
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 1.41 .. 1.42 rows = 1 width = 0 ) ( actual time = 0.026 .. 0.026 rows = 1 loops = 1 )
-> Index Only Scan using idx_test_1 on test ( cost = 0.00 .. 1.40 rows = 1 width = 0 ) ( actual time = 0.021 .. 0.022 rows = 1 loops = 1 )
Index Cond : ( id = 2 )
Heap Fetches : 1
Total runtime : 0.074 ms
( 5 rows )
digoal => explain analyze execute p_test ( 2 );
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 1.41 .. 1.42 rows = 1 width = 0 ) ( actual time = 0.021 .. 0.021 rows = 1 loops = 1 )
-> Index Only Scan using idx_test_1 on test ( cost = 0.00 .. 1.40 rows = 1 width = 0 ) ( actual time = 0.017 .. 0.018 rows = 1 loops = 1 )
Index Cond : ( id = 2 )
Heap Fetches : 1
Total runtime : 0.065 ms
( 5 rows )
digoal => explain analyze execute p_test ( 2 );
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 1.41 .. 1.42 rows = 1 width = 0 ) ( actual time = 0.020 .. 0.020 rows = 1 loops = 1 )
-> Index Only Scan using idx_test_1 on test ( cost = 0.00 .. 1.40 rows = 1 width = 0 ) ( actual time = 0.016 .. 0.016 rows = 1 loops = 1 )
Index Cond : ( id = 2 )
Heap Fetches : 1
Total runtime : 0.063 ms
( 5 rows )
digoal => explain analyze execute p_test ( 2 );
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 1.41 .. 1.42 rows = 1 width = 0 ) ( actual time = 0.020 .. 0.020 rows = 1 loops = 1 )
-> Index Only Scan using idx_test_1 on test ( cost = 0.00 .. 1.40 rows = 1 width = 0 ) ( actual time = 0.016 .. 0.017 rows = 1 loops = 1 )
Index Cond : ( id = 2 )
Heap Fetches : 1
Total runtime : 0.061 ms
( 5 rows )
第六次傳入參數1, 走的是custom plan.
digoal => explain analyze execute p_test ( 1 );
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 20637.01 .. 20637.02 rows = 1 width = 0 ) ( actual time = 343.385 .. 343.385 rows = 1 loops = 1 )
-> Seq Scan on test ( cost = 0.00 .. 19387.01 rows = 500001 width = 0 ) ( actual time = 0.053 .. 253.714 rows = 500000 loops = 1 )
Filter : ( id = 1 )
Rows Removed by Filter : 1
Total runtime : 343.458 ms
( 5 rows )
</div>
下面來解釋一下原因 :
1. 第6次執行時,
num_custom_plans = 5
total _custom_cost = 7.1 /* (1.42*5) */
generic_cost等于多少呢? 還是-1(初始值) , 從后面使用gdb跟蹤的結果可以看到.
因此choose_custom_plan的執行結果為true. 也就是選擇custom plan, 因此需要額外的plan 優化選擇過程.
到第7次執行時會變成20637.
GDB跟蹤舉例 :
首先確保編譯PostgreSQL使用了enable-debug :
ocz@db - 172 - 16 - 3 - 150 -> pg_config -- configure
'--prefix=/home/ocz/pgsql9.2.1' '--with-pgport=9201' '--with-perl' '--with-python' '--with-tcl' '--with-openssl' '--with-pam' '--without-ldap' '--with-libxml' '--with-libxslt' '--enable-thread-safety' '--with-wal-blocksize=16' '--enable-debug' '--with-ossp-uuid' '--with-libs=/opt/uuid-1.6.2/lib'
</div>
終端1 :
digoal => prepare p_test ( int ) as select count (*) from test where id = $1 ;
PREPARE
digoal => explain analyze execute p_test ( 2 );
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Aggregate ( cost = 1.41 .. 1.42 rows = 1 width = 0 ) ( actual time = 0.113 .. 0.113 rows = 1 loops = 1 )
-> Index Only Scan using idx_test_1 on test ( cost = 0.00 .. 1.40 rows = 1 width = 0 ) ( actual time = 0.102 .. 0.103 rows = 1 loops = 1 )
Index Cond : ( id = 2 )
Heap Fetches : 1
Total runtime : 0.269 ms
( 5 rows )
digoal => select pg_backend_pid ();
pg_backend_pid
----------------
10921
( 1 row )
</div>
</div>
終端2 :
ocz@db - 172 - 16 - 3 - 150 -> gdb
GNU gdb ( GDB ) Red Hat Enterprise Linux ( 7.0 . 1 - 37.el5 )
Copyright ( C ) 2009 Free Software Foundation , Inc .
License GPLv3 +: GNU GPL version 3 or later < http : //gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
綁定進程號.
(gdb) attach 10921
Attaching to process 10921
(gdb) break choose_custom_plan
Breakpoint 1 at 0x6ee730: file plancache.c, line 850.
<div>
(gdb) continue
</div>
<div>
Continuing.
</div>
</div>
</div>
</div>
digoal => explain analyze execute p_test ( 2 );
Breakpoint 1 , choose_custom_plan ( plansource = 0x14477350 , boundParams = 0x144b1678 ) at plancache . c : 850
850 if ( boundParams == NULL )
進入單步模式
( gdb ) step
854 if ( plansource -> cursor_options & CURSOR_OPT_GENERIC_PLAN )
( gdb )
856 if ( plansource -> cursor_options & CURSOR_OPT_CUSTOM_PLAN )
( gdb )
860 if ( plansource -> num_custom_plans < 5 )
( gdb )
choose_custom_plan ( plansource = 0x14477350 , boundParams = 0x144b1678 ) at plancache . c : 878
878 }
到了 choose_custom_plan 后, 打印一下 plansource -> total_custom_cost
( gdb ) print plansource -> total_custom_cost
$1 = 1.4174
繼續下一輪
( gdb ) continue
Continuing .
</div>
</div>
digoal => explain analyze execute p_test ( 2 );
</div>
終端2 :
Breakpoint 1 , choose_custom_plan ( plansource = 0x14477350 , boundParams = 0x144b1678 ) at plancache . c : 850
850 if ( boundParams == NULL )
( gdb ) step
854 if ( plansource -> cursor_options & CURSOR_OPT_GENERIC_PLAN )
( gdb )
856 if ( plansource -> cursor_options & CURSOR_OPT_CUSTOM_PLAN )
( gdb )
860 if ( plansource -> num_custom_plans < 5 )
( gdb )
choose_custom_plan ( plansource = 0x14477350 , boundParams = 0x144b1678 ) at plancache . c : 878
878 }
注意此時 total_custom_cost 增長了.
( gdb ) print plansource -> total_custom_cost
$2 = 2.8348
此時還未生成 generic plan , 所以 generic_cost =- 1
( gdb ) print plansource -> generic_cost
$3 = - 1
( gdb ) continue
Continuing .
digoal => explain analyze execute p_test ( 2 );
</div>
略去中間幾步, 下面是第6次和第7次執行SQL時跟蹤到的結果, 說明第6次執行完后生成了generic plan.
終端2 :
</div>
Breakpoint 1 , choose_custom_plan ( plansource = 0x143998c0 , boundParams = 0x1439aff8 ) at plancache . c : 850
850 if ( boundParams == NULL )
( gdb ) step
854 if ( plansource -> cursor_options & CURSOR_OPT_GENERIC_PLAN )
( gdb )
856 if ( plansource -> cursor_options & CURSOR_OPT_CUSTOM_PLAN )
( gdb )
860 if ( plansource -> num_custom_plans < 5 )
( gdb )
GetCachedPlan ( plansource = 0x143998c0 , boundParams = 0x1439aff8 , useResOwner = 1 '\001' ) at plancache . c : 941
941 if (! customplan )
( gdb ) print plansource -> generic_cost
$4 = - 1
( gdb ) print plansource -> total_custom_cost
$5 = 7.0869999999999997
( gdb ) print plansource -> num_custom_plans
$6 = 5
( gdb ) continue
Continuing .
Breakpoint 1 , choose_custom_plan ( plansource = 0x143998c0 , boundParams = 0x1439aff8 ) at plancache . c : 850
850 if ( boundParams == NULL )
( gdb ) continue
Continuing .
Breakpoint 1 , choose_custom_plan ( plansource = 0x143998c0 , boundParams = 0x1439abe8 ) at plancache . c : 850
850 if ( boundParams == NULL )
( gdb ) step
854 if ( plansource -> cursor_options & CURSOR_OPT_GENERIC_PLAN )
( gdb )
856 if ( plansource -> cursor_options & CURSOR_OPT_CUSTOM_PLAN )
( gdb )
860 if ( plansource -> num_custom_plans < 5 )
( gdb )
GetCachedPlan ( plansource = 0x143998c0 , boundParams = 0x1439abe8 , useResOwner = 1 '\001' ) at plancache . c : 941
941 if (! customplan )
( gdb ) print plansource -> generic_cost
$7 = 20637.024999999998
( gdb ) print plansource -> total_custom_cost
$8 = 8.5044000000000004
( gdb ) print plansource -> num_custom_plans
$9 = 6
generic_cost = 20637.024999999998 對應id=1的執行計劃得到的cost, 如下 :
digoal=> explain analyze execute p_test(1);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Aggregate (cost=20637.01..20637.02 rows=1 width=0) (actual time=433.100..433.100 rows=1 loops=1)
-> Seq Scan on test (cost=0.00..19387.01 rows=500001 width=0) (actual time=0.133..344.716 rows=500000 loops=1)
Filter: (id = 1)
Rows Removed by Filter: 1
為什么id=1的執行計劃作為generic plan了. 因為id=1的值在test表的出現頻率最高.
通過pg_stats視圖可以看到. 如下 :
digoal => \d pg_stats
View "pg_catalog.pg_stats"
Column | Type | Modifiers
------------------------+----------+-----------
schemaname | name |
tablename | name |
attname | name |
inherited | boolean |
null_frac | real |
avg_width | integer |
n_distinct | real |
most_common_vals | anyarray |
most_common_freqs | real [] |
histogram_bounds | anyarray |
correlation | real |
most_common_elems | anyarray |
most_common_elem_freqs | real [] |
elem_count_histogram | real [] |
digoal => \x
Expanded display is on .
digoal => select * from pg_stats where schemaname = 'digoal' and tablename = 'test' and attname = 'id' ;
-[ RECORD 1 ]----------+-------
schemaname | digoal
tablename | test
attname | id
inherited | f
null_frac | 0
avg _width | 4
n _distinct | 1
most _common_vals | { 1 }
most_common_freqs | { 1 }
histogram_bounds |
correlation | 1
most _common_elems |
most_common_elem_freqs |
elem_count_histogram |
</div>
注意這兩項 :
most_common_vals | { 1 }
most_common_freqs | { 1 }
采樣出來它的出現頻率是100%
【其他】
1. 使用SPI_prepare_cursor, 通過設置flag [CURSOR_OPT_GENERIC_PLAN or CURSOR_OPT_CUSTOM_PLAN] 可以強制指定使用custom plan還是generic plan.
2. generic plan的生命周期 :
The statement returned by SPI_prepare can be used only in the current invocation of the procedure, since SPI_finish frees memory allocated for such a statement.
But the statement can be saved for longer using the functions SPI_keepplan or SPI_saveplan. 一般不推薦使用SPI_saveplan, 因為數據會重復.
SPI_finish釋放gplan :
/*
* ReleaseGenericPlan: release a CachedPlanSource's generic plan, if any.
*/
static void
ReleaseGenericPlan ( CachedPlanSource * plansource )
{
/* Be paranoid about the possibility that ReleaseCachedPlan fails */
if ( plansource -> gplan )
{
CachedPlan * plan = plansource -> gplan ;
Assert ( plan -> magic == CACHEDPLAN_MAGIC );
plansource -> gplan = NULL ;
ReleaseCachedPlan ( plan , false );
}
}
</div>
3. 最后補充以下, 數據分布非常傾斜, 就像本例id=1的有50000條, id=2的只有1條的情況. 當傳入id=2的值時可以考慮強制custom plan.
當然如果經常要這樣做, 不推薦使用prepare statement. 因為generic plan是通過出現最頻繁的值產生的執行計劃, 至少從gdb跟蹤出來的是這樣的.
【參考】
1. http://www.postgresql.org/docs/9.2/static/spi-spi-prepare.html
2. http://www.postgresql.org/docs/9.2/static/spi-spi-execute-plan.html
3. http://www.postgresql.org/docs/9.2/static/plpgsql-implementation.html
4. http://www.postgresql.org/docs/9.2/static/xfunc-volatility.html
5. http://www.postgresql.org/docs/9.2/static/sql-prepare.html
6. http://www.postgresql.org/docs/9.2/static/sql-execute.html
7. http://www.postgresql.org/docs/9.2/static/sql-deallocate.html
8. http://www.postgresql.org/docs/9.2/static/spi-spi-keepplan.html
9. http://www.postgresql.org/docs/9.2/static/spi-spi-saveplan.html
10. http://blog.163.com/digoal@126/blog/static/16387704020121015265329/
11. http://blog.163.com/digoal@126/blog/static/16387704020124219333824/
12. http://blog.163.com/digoal@126/blog/static/16387704020111168855258/
13. http://blog.163.com/digoal@126/blog/static/16387704020109286417600/
14. src/backend/executor/spi.c
15. src/include/executor/spi_priv.h
16. src/backend/utils/cache/plancache.c
</div>
17. src/backend/commands/prepare.c
18. src/include/utils/plancache.h
00024 /*
00025 * CachedPlanSource (which might better have been called CachedQuery)
00026 * represents a SQL query that we expect to use multiple times. It stores
00027 * the query source text, the raw parse tree, and the analyzed-and-rewritten
00028 * query tree, as well as adjunct data. Cache invalidation can happen as a
00029 * result of DDL affecting objects used by the query. In that case we discard
00030 * the analyzed-and-rewritten query tree, and rebuild it when next needed.
00031 *
00032 * An actual execution plan, represented by CachedPlan, is derived from the
00033 * CachedPlanSource when we need to execute the query. The plan could be
00034 * either generic (usable with any set of plan parameters) or custom (for a
00035 * specific set of parameters). plancache.c contains the logic that decides
00036 * which way to do it for any particular execution. If we are using a generic
00037 * cached plan then it is meant to be re-used across multiple executions, so
00038 * callers must always treat CachedPlans as read-only.
00039 *
00040 * Once successfully built and "saved", CachedPlanSources typically live
00041 * for the life of the backend, although they can be dropped explicitly.
00042 * CachedPlans are reference-counted and go away automatically when the last
00043 * reference is dropped. A CachedPlan can outlive the CachedPlanSource it
00044 * was created from.
00045 *
00046 * An "unsaved" CachedPlanSource can be used for generating plans, but it
00047 * lives in transient storage and will not be updated in response to sinval
00048 * events.
00049 *
00050 * CachedPlans made from saved CachedPlanSources are likewise in permanent
00051 * storage, so to avoid memory leaks, the reference-counted references to them
00052 * must be held in permanent data structures or ResourceOwners. CachedPlans
00053 * made from unsaved CachedPlanSources are in children of the caller's
00054 * memory context, so references to them should not be longer-lived than
00055 * that context. (Reference counting is somewhat pro forma in that case,
00056 * though it may be useful if the CachedPlan can be discarded early.)
00057 *
00058 * A CachedPlanSource has two associated memory contexts: one that holds the
00059 * struct itself, the query source text and the raw parse tree, and another
00060 * context that holds the rewritten query tree and associated data. This
00061 * allows the query tree to be discarded easily when it is invalidated.
00062 *
00063 * Note: the string referenced by commandTag is not subsidiary storage;
00064 * it is assumed to be a compile-time-constant string. As with portals,
00065 * commandTag shall be NULL if and only if the original query string (before
00066 * rewriting) was an empty string.
00067 */
00068 typedef struct CachedPlanSource
00069 {
00070 int magic ; /* should equal CACHEDPLANSOURCE_MAGIC */
00071 Node * raw_parse_tree ; /* output of raw_parser() */
00072 char * query_string ; /* source text of query */
00073 const char * commandTag ; /* command tag (a constant!), or NULL */
00074 Oid * param_types ; /* array of parameter type OIDs, or NULL */
00075 int num_params ; /* length of param_types array */
00076 ParserSetupHook parserSetup ; /* alternative parameter spec method */
00077 void * parserSetupArg ;
00078 int cursor_options ; /* cursor options used for planning */
00079 bool fixed_result ; /* disallow change in result tupdesc? */
00080 TupleDesc resultDesc ; /* result type; NULL = doesn't return tuples */
00081 struct OverrideSearchPath * search_path ; /* saved search_path */
00082 MemoryContext context ; /* memory context holding all above */
00083 /* These fields describe the current analyzed-and-rewritten query tree: */
00084 List * query_list ; /* list of Query nodes, or NIL if not valid */
00085 List * relationOids ; /* OIDs of relations the queries depend on */
00086 List * invalItems ; /* other dependencies, as PlanInvalItems */
00087 MemoryContext query_context ; /* context holding the above, or NULL */
00088 /* If we have a generic plan, this is a reference-counted link to it: */
00089 struct CachedPlan * gplan ; /* generic plan, or NULL if not valid */
00090 /* Some state flags: */
00091 bool is_complete ; /* has CompleteCachedPlan been done? */
00092 bool is_saved ; /* has CachedPlanSource been "saved"? */
00093 bool is_valid ; /* is the query_list currently valid? */
00094 int generation ; /* increments each time we create a plan */
00095 /* If CachedPlanSource has been saved, it is a member of a global list */
00096 struct CachedPlanSource * next_saved ; /* list link, if so */
00097 /* State kept to help decide whether to use custom or generic plans: */
00098 double generic_cost ; /* cost of generic plan, or -1 if not known */
00099 double total_custom_cost ; /* total cost of custom plans so far */
00100 int num_custom_plans ; /* number of plans included in total */
00101 } CachedPlanSource ;
00102 generic plan 存儲在以下數據結構中:
00103 /*
00104 * CachedPlan represents an execution plan derived from a CachedPlanSource.
00105 * The reference count includes both the link from the parent CachedPlanSource
00106 * (if any), and any active plan executions, so the plan can be discarded
00107 * exactly when refcount goes to zero. Both the struct itself and the
00108 * subsidiary data live in the context denoted by the context field.
00109 * This makes it easy to free a no-longer-needed cached plan.
00110 */
00111 typedef struct CachedPlan
00112 {
00113 int magic ; /* should equal CACHEDPLAN_MAGIC */
00114 List * stmt_list ; /* list of statement nodes (PlannedStmts and
00115 * bare utility statements) */
00116 bool is_saved ; /* is CachedPlan in a long-lived context? */
00117 bool is_valid ; /* is the stmt_list currently valid? */
00118 TransactionId saved_xmin ; /* if valid, replan when TransactionXmin
00119 * changes from this value */
00120 int generation ; /* parent's generation number for this plan */
00121 int refcount ; /* count of live references to this struct */
00122 MemoryContext context ; /* context containing this CachedPlan */
00123 } CachedPlan ;
</div> </div>