A Brief Look at MySQL Subqueries and Their Optimization

Posted by jopen 11 years ago | 115K views | MySQL, database server

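DBAs and developers who have worked with Oracle or other relational databases tend to assume that the database already optimizes subqueries and picks a sensible driving table, and they carry that assumption over to MySQL. Unfortunately, MySQL's handling of subqueries can be a big disappointment. We have run into several such cases on our production systems, for example: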

SELECT i_id,
       sum(i_sell) AS i_sell
FROM table_data
WHERE i_id IN
    (SELECT i_id
     FROM table_data
     WHERE Gmt_create >= '2011-10-07 00:00:00')
GROUP BY i_id;

(Note: as an analogy for the business logic, the query first finds the 100 books newly sold on October 7, then looks up the full-year sales of those 100 books.)

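The performance problem comes from a weakness in how the MySQL optimizer handles subqueries: it rewrites them. Normally we expect evaluation from the inside out, with the subquery evaluated first and its result then driving the lookup in the outer table. Instead, MySQL scans every row of the outer table first and passes each row into the subquery to be matched against it. If the outer table is large, performance suffers.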
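To make the rewrite concrete: in MySQL versions of that era (before the semi-join and materialization strategies added in MySQL 5.6), an IN subquery like the one above was evaluated roughly as if it had been written as a correlated EXISTS. The sketch below illustrates that equivalent form; it is not output captured from the optimizer, and the aliases o and s are made up for the example.

```sql
-- Roughly how the optimizer treats the IN subquery: the inner query is
-- re-evaluated for each outer row, with the outer i_id pushed into it.
SELECT o.i_id,
       SUM(o.i_sell) AS i_sell
FROM table_data o
WHERE EXISTS
    (SELECT 1
     FROM table_data s
     WHERE s.gmt_create >= '2011-10-07 00:00:00'
       AND s.i_id = o.i_id)   -- correlation with the current outer row
GROUP BY o.i_id;
```

A subquery evaluated this way is what EXPLAIN reports as DEPENDENT SUBQUERY.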
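In our case, table_data holds about 700,000 rows, and the subquery returns many rows, a large share of them duplicates, so the subquery has to be correlated nearly 700,000 times. With that many correlated lookups, the statement had not finished after several hours, so we need to rewrite the SQL: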

SELECT t2.i_id,
       SUM(t2.i_sell) AS sold
FROM
  (SELECT DISTINCT i_id
   FROM table_data
   WHERE gmt_create >= '2011-10-07 00:00:00') t1,
  table_data t2
WHERE t1.i_id = t2.i_id
GROUP BY t2.i_id;

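We turned the subquery into a join and added DISTINCT to the derived table, which reduces the number of times t1 is joined to t2.
After the rewrite, the execution time dropped to under 100 ms.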
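Note that the DISTINCT in t1 matters for correctness as well as speed: without it, an item with several qualifying rows would be joined several times and its SUM inflated. The following is a small self-contained sketch of this; the table demo_sales and its rows are invented purely for illustration.

```sql
CREATE TABLE demo_sales (i_id INT, i_sell INT, gmt_create DATETIME);

INSERT INTO demo_sales VALUES
  (1, 10, '2011-10-07 09:00:00'),  -- item 1, first qualifying row
  (1,  5, '2011-10-08 10:00:00'),  -- item 1, second qualifying row
  (1, 20, '2011-03-01 12:00:00'),  -- item 1, earlier sale
  (2,  7, '2011-02-01 08:00:00');  -- item 2, never matches the date filter

-- With DISTINCT: item 1 appears once in t1, so its rows are summed once
-- (10 + 5 + 20 = 35).
SELECT t2.i_id, SUM(t2.i_sell) AS sold
FROM (SELECT DISTINCT i_id
      FROM demo_sales
      WHERE gmt_create >= '2011-10-07 00:00:00') t1,
     demo_sales t2
WHERE t1.i_id = t2.i_id
GROUP BY t2.i_id;

-- Without DISTINCT: item 1 appears twice in t1, so every one of its rows
-- is counted twice and the sum becomes 70.
SELECT t2.i_id, SUM(t2.i_sell) AS sold
FROM (SELECT i_id
      FROM demo_sales
      WHERE gmt_create >= '2011-10-07 00:00:00') t1,
     demo_sales t2
WHERE t1.i_id = t2.i_id
GROUP BY t2.i_id;
```

Only the first form returns the same totals as the original IN query.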
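MySQL's subquery optimization has never been very friendly and has long drawn criticism from the industry; it is also one of the problems I run into most often when tuning SQL. As described above, MySQL rewrites the subquery instead of executing it first, which is the opposite of the inside-out evaluation we usually expect. The rest of this article walks through a complete real-world case, including the analysis and the tuning process, to deepen the understanding of MySQL subqueries.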

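1. Case:

Users reported that the database was responding slowly and that many business updates were stuck. Logging in to the database showed a long-running SQL statement: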

| 10437 | usr0321t9m9 | 10.242.232.50:51201 | oms | Execute | 1179 | Sending
The SQL statement is:
SELECT tradedto0.*
FROM a1 tradedto0
WHERE tradedto0.tradestatus='1'
  AND (tradedto0.tradeoid IN
         (SELECT orderdto1.tradeoid
          FROM a2 orderdto1
          WHERE orderdto1.proname LIKE '%??%'
            OR orderdto1.procode LIKE '%??%'))
  AND tradedto0.undefine4='1'
  AND tradedto0.invoicetype='1'
  AND tradedto0.tradestep='0'
  AND (tradedto0.orderCompany LIKE '0002%')
ORDER BY tradedto0.tradesign ASC,
         tradedto0.makertime DESC LIMIT 15;
2. Symptom: updates to other tables are blocked


UPDATE a1
SET tradesign='DAB67634-795C-4EAC-B4A0-78F0D531D62F',
    markColor=' #CD5555',
    memotime='2012-09-22',
    markPerson='??'
WHERE tradeoid IN ('gy2012092204495100032');
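To restore the application as quickly as possible, we killed the long-running SQL statement, after which the application returned to normal.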
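For reference, finding and terminating such a statement might look like the sketch below. The thread id 10437 is taken from the process list row shown earlier; whether only the statement or the whole connection was killed is not stated in the original, so both variants are shown.

```sql
-- List running threads together with the full statement text
SHOW FULL PROCESSLIST;

-- Abort only the running statement, keeping the connection open
KILL QUERY 10437;

-- Or terminate the connection altogether
KILL CONNECTION 10437;
```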

3. Analyzing the execution plan:


db@3306: explain
SELECT tradedto0.*
FROM a1 tradedto0
WHERE tradedto0.tradestatus='1'
  AND (tradedto0.tradeoid IN
         (SELECT orderdto1.tradeoid
          FROM a2 orderdto1
          WHERE orderdto1.proname LIKE '%??%'
            OR orderdto1.procode LIKE '%??%'))
  AND tradedto0.undefine4='1'
  AND tradedto0.invoicetype='1'
  AND tradedto0.tradestep='0'
  AND (tradedto0.orderCompany LIKE '0002%')
ORDER BY tradedto0.tradesign ASC,
         tradedto0.makertime DESC LIMIT 15;
+----+--------------------+------------+------+---------------+------+---------+------+-------+-----------------------------+
| id | select_type        | table      | type | possible_keys | key  | key_len | ref  | rows  | Extra                       |
+----+--------------------+------------+------+---------------+------+---------+------+-------+-----------------------------+
|  1 | PRIMARY            | tradedto0  | ALL  | NULL          | NULL | NULL    | NULL | 27454 | Using where; Using filesort |
|  2 | DEPENDENT SUBQUERY | orderdto1_ | ALL  | NULL          | NULL | NULL    | NULL | 40998 | Using where                 |
+----+--------------------+------------+------+---------------+------+---------+------+-------+-----------------------------+
Working from the execution plan, we optimize step by step:

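First, look at the second row of the execution plan, the subquery part: orderdto1_ is read with a full table scan. Let us see whether a suitable index can be added:

A. Use a covering index:
db@3306: alter table a2 add index ind_a2(proname,procode,tradeoid);
ERROR 1071 (42000): Specified key was too long; max key length is 1000 bytes
Adding the composite index fails because it exceeds the maximum key length.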

B. Check the column definitions of the table:
db@3306: DESC a2;
+---------------------+---------------+------+-----+---------+-------+
| FIELD               | TYPE          | NULL | KEY | DEFAULT | Extra |
+---------------------+---------------+------+-----+---------+-------+
| OID                 | VARCHAR(50)   | NO   | PRI | NULL    |       |
| TRADEOID            | VARCHAR(50)   | YES  |     | NULL    |       |
| PROCODE             | VARCHAR(50)   | YES  |     | NULL    |       |
| PRONAME             | VARCHAR(1000) | YES  |     | NULL    |       |
| SPCTNCODE           | VARCHAR(200)  | YES  |     | NULL    |       |

C. Check the average length of the column:


db@3306: SELECT MAX(LENGTH(PRONAME)),avg(LENGTH(PRONAME)) FROM a2;
+----------------------+----------------------+
| MAX(LENGTH(PRONAME)) | avg(LENGTH(PRONAME)) |
+----------------------+----------------------+
|                   95 |              24.5588 |
+----------------------+----------------------+
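One caveat: LENGTH() returns a length in bytes, whereas the N in VARCHAR(N) counts characters. If PRONAME holds multi-byte text, CHAR_LENGTH() is the more direct measure when choosing a new column size. A variant of the check, as a sketch (the column aliases are made up here):

```sql
SELECT MAX(CHAR_LENGTH(PRONAME)) AS max_chars,  -- longest value in characters
       MAX(LENGTH(PRONAME))      AS max_bytes,  -- longest value in bytes
       AVG(CHAR_LENGTH(PRONAME)) AS avg_chars
FROM a2;
```

Since a character never takes less than one byte, the 95-byte maximum measured above already implies that no value is longer than 95 characters.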
D. Shrink the column length:


ALTER TABLE a2 MODIFY COLUMN PRONAME VARCHAR(156);
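A rough check of why the original definition could not be indexed and why the shortened one can. It assumes a utf8 column (up to 3 bytes per character), which the original does not state but which is consistent with the key_len of 777 reported in the plan that follows. The key_len shown by EXPLAIN additionally counts 2 length bytes for each VARCHAR and 1 byte for each nullable column.

```sql
-- Assumption: utf8, 3 bytes per character; columns are
-- PRONAME, PROCODE, TRADEOID in index order.
SELECT 1000 * 3 + 50 * 3 + 50 * 3                              AS bytes_before,  -- 3300, far above 1000
       156 * 3 + 50 * 3 + 50 * 3                               AS bytes_after,   -- 768, below 1000
       (156 * 3 + 2 + 1) + (50 * 3 + 2 + 1) + (50 * 3 + 2 + 1) AS key_len;       -- 777
```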
With the column shortened, the index ind_a2 can be created; then analyze the execution plan again:

db@3306: explain
SELECT tradedto0.*
FROM a1 tradedto0
WHERE tradedto0.tradestatus='1'
  AND (tradedto0.tradeoid IN
         (SELECT orderdto1.tradeoid
          FROM a2 orderdto1
          WHERE orderdto1.proname LIKE '%??%'
            OR orderdto1.procode LIKE '%??%'))
  AND tradedto0.undefine4='1'
  AND tradedto0.invoicetype='1'
  AND tradedto0.tradestep='0'
  AND (tradedto0.orderCompany LIKE '0002%')
ORDER BY tradedto0.tradesign ASC,
         tradedto0.makertime DESC LIMIT 15;
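+----+--------------------+-----------+-------+-----------------+-----------------+---------+-------------------------+-------+-----------------------------+
| id | select_type        | table     | type  | possible_keys   | key             | key_len | ref                     | rows  | Extra                       |
+----+--------------------+-----------+-------+-----------------+-----------------+---------+-------------------------+-------+-----------------------------+
|  1 | PRIMARY            | tradedto0 | ref   | ind_tradestatus | ind_tradestatus | 345     | const,const,const,const |  8962 | Using where; Using filesort |
|  2 | DEPENDENT SUBQUERY | orderdto1 | index | NULL            | ind_a2          | 777     | NULL                    | 41005 | Using where; Using index    |
+----+--------------------+-----------+-------+-----------------+-----------------+---------+-------------------------+-------+-----------------------------+
Performance still does not improve. The key point is that the number of rows examined across the two tables is still far too large (8962 × 41005, about 367 million row checks in the worst case), so the index added above has little effect. Next, check what the subquery returns on its own:


db@3306: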
SELECT orderdto1.tradeoid
FROM a2 orderdto1
WHERE orderdto1.proname LIKE '%??%'
  OR orderdto1.procode LIKE '%??%';
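Empty set (0.05 sec)
The result set is empty, so the subquery's result set should serve as the driving table.

4. Rewriting the subquery:

The tests above confirm that an ordinary MySQL subquery written this way performs very poorly; this is an inherent weakness of MySQL subqueries. The SQL has to be rewritten as a join: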


SELECT tradedto0_.*
FROM a1 tradedto0_ ,
  (SELECT orderdto1_.tradeoid
   FROM a2 orderdto1_
   WHERE orderdto1_.proname LIKE '%??%'
     OR orderdto1_.procode LIKE '%??%')t2
WHERE tradedto0_.tradestatus='1'
  AND (tradedto0_.tradeoid=t2.tradeoid)
  AND tradedto0_.undefine4='1'
  AND tradedto0_.invoicetype='1'
  AND tradedto0_.tradestep='0'
  AND (tradedto0_.orderCompany LIKE '0002%')
ORDER BY tradedto0_.tradesign ASC,
         tradedto0_.makertime DESC LIMIT 15;
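One caveat with this rewrite: unlike IN, a plain join returns one a1 row per matching a2 row, so if several a2 rows share a tradeoid, the same trade would appear more than once. Whether that can happen depends on the data, which the original does not describe. A sketch of a safer variant, which deduplicates the derived table in the same way as the DISTINCT in the first example:

```sql
SELECT tradedto0_.*
FROM a1 tradedto0_,
  (SELECT DISTINCT orderdto1_.tradeoid   -- each tradeoid at most once
   FROM a2 orderdto1_
   WHERE orderdto1_.proname LIKE '%??%'
     OR orderdto1_.procode LIKE '%??%') t2
WHERE tradedto0_.tradestatus='1'
  AND (tradedto0_.tradeoid=t2.tradeoid)
  AND tradedto0_.undefine4='1'
  AND tradedto0_.invoicetype='1'
  AND tradedto0_.tradestep='0'
  AND (tradedto0_.orderCompany LIKE '0002%')
ORDER BY tradedto0_.tradesign ASC,
         tradedto0_.makertime DESC LIMIT 15;
```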
5. Checking the execution plan:


db@3306: explain
SELECT tradedto0.*
FROM a1 tradedto0 ,
  (SELECT orderdto1.tradeoid
   FROM a2 orderdto1
   WHERE orderdto1.proname LIKE '%??%'
     OR orderdto1.procode LIKE '%??%')t2
WHERE tradedto0.tradestatus='1'
  AND (tradedto0.tradeoid=t2.tradeoid)
  AND tradedto0.undefine4='1'
  AND tradedto0.invoicetype='1'
  AND tradedto0.tradestep='0'
  AND (tradedto0.orderCompany LIKE '0002%')
ORDER BY tradedto0.tradesign ASC,
         tradedto0.makertime DESC LIMIT 15;
+----+-------------+-----------+-------+---------------+--------+---------+------+-------+-----------------------------------------------------+
| id | select_type | table     | type  | possible_keys | key    | key_len | ref  | rows  | Extra                                               |
+----+-------------+-----------+-------+---------------+--------+---------+------+-------+-----------------------------------------------------+
|  1 | PRIMARY     | NULL      | NULL  | NULL          | NULL   | NULL    | NULL |  NULL | Impossible WHERE noticed after reading const tables |
|  2 | DERIVED     | orderdto1 | index | NULL          | ind_a2 | 777     | NULL | 41005 | Using where; Using index                            |
+----+-------------+-----------+-------+---------------+--------+---------+------+-------+-----------------------------------------------------+
6. Execution time:


db@3306:
SELECT tradedto0.*
FROM a1 tradedto0 ,
  (SELECT orderdto1.tradeoid
   FROM a2 orderdto1
   WHERE orderdto1.proname LIKE '%??%'
     OR orderdto1.procode LIKE '%??%')t2
WHERE tradedto0.tradestatus='1'
  AND (tradedto0.tradeoid=t2.tradeoid)
  AND tradedto0.undefine4='1'
  AND tradedto0.invoicetype='1'
  AND tradedto0.tradestep='0'
  AND (tradedto0.orderCompany LIKE '0002%')
ORDER BY tradedto0.tradesign ASC,
         tradedto0.makertime DESC LIMIT 15;
Empty set (0.03 sec)
The execution time dropped to milliseconds.

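7. Summary:
1. MySQL subqueries have a clear weakness in their execution plans, so such subqueries need to be rewritten.

See also:

a. MySQL subqueries encountered in a production database: http://hidba.org/?p=412

b. Built-in InnoDB, subqueries blocking updates: http://hidba.org/?p=456

2. In table design, do not casually declare oversized varchar(N) columns; they can make the column impossible to index.

See also:

a. JDBC memory management, the impact of varchar2(4000): http://hidba.org/?p=31

b. Limits on large columns in InnoDB: http://hidba.org/?p=144

c. Some optimization tips for large text and blob columns in InnoDB: http://hidba.org/?p=551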

Source: http://hidba.org/?p=624


Article URL: http://www.baiduhome.net/lib/view/open1404887901263.html
