本文介绍MySQL InnoDB下索引、查询的实现以及优化。

查询

回表查询和覆盖索引

什么是回表查询？InnoDB中的主键(Primary Key, PK)使用了聚簇索引，即主键索引的叶子节点存放的是行数据本身。因此通过主键进行SELECT是很快的。
但是对于其他的索引，它们的叶子节点存放的是主键ID。这个是显然的，不然我每创建一个索引就得重新建立一个表结构了。在这种情况下访问行数据，就得通过主键ID去聚簇索引中再查一次，这个也就是所谓的回表查询。

看到这里不禁有一个想法，我们能不能把主键的索引做成哈希的，这样的话它的复杂度是O(1)，能减小回表开销，主要有下面的考虑：

自增主键往往规律可循，能够设计出很好的哈希函数。
因为自增索引不像银行卡号码或者手机号码那样具有实际的意义，所以B+树提供的一些范围查询的性能未必常用。
B+树毕竟是O(log n)的复杂度，如果使用哈希索引，能够提高回表查询的效率。
哈希索引更好做分区。

那有没有其他办法减少回表查询的开销呢？一个方案是通过覆盖索引(Covering Index)。

为了介绍覆盖索引，首先介绍联合索引。对于下面的语句生成的索引，可以用来加速c1、c1,c2、c1,c2,c3这三个查询。这个也是好理解的，例如我们在查询字典的时候，可以先查询第一个字母，然后再查询第二个字母；反之，没办法直接查第二个字母和第三个字母，即用不了c2,c3这样的索引。这启示我们，在设计联合索引的时候，应当把最常用或者最宽泛的条件放到最左边。

1	create index index_c1_c2_c3 on table1(c1,c2,c3);

通过覆盖索引，通过非聚簇索引查询数据也不需要再回表了，这得益于联合索引。例如index_c1_c2_c3就包含了c1、c2、c3这三个字段的值，那么如果我们只需要查询这些值的话，就不需要回表了。

Extra总结

Using temporary

需要创建临时表。

Using filesort

文件排序，通常出现在不能使用索引排序的情况。
一个通常的情况是使用查询索引之外的Key去做ORDER BY时。
文件排序一般有几种实现方式，令被排序的键是S，主键是ref，需要返回的列是addon1、addon2、addon3，则有

(S,ref)，即original filesort algorithm，回表排序
这种方式占用空间较小，但需要在排序后根据ref回表查询，从而产生很多随机IO。
(S,addon1,addon2,addon3)，即modified filesort algorithm，不回表排序
这种方案不需要回表，但是对排序空间要求高。当然，对于诸如varchar类型的addon字段，是可以压缩(pack)一下的，但是对于搜索排序键是不行的。

具体选择哪种方案，主要是看S和所有addon字段的长度总和是否超过一定的阈值。

Using index

使用覆盖索引获取所需要的数据。

Using Index Condition

这个实际上运用了索引下推(Index Condition Pushdown)技术。这个技术是MySQL 5.7之后的一个优化，涉及了服务器层和存储引擎层。
首先来先看下没有这个优化的select where过程：

首先读取下一行的index tuple，然后用index tuple去定位并读取整个行。
检查所有的WHERE条件，如果该条件属于这张表，就进行检测是否符合条件。
在有了这个优化之后，新的过程是：
首先读取下一行的index tuple，但不需要再去读取整个行。
检查所有的WHERE条件，如果该条件属于这张表，并且能够根据当前使用的索引就能检测，就直接检测了。如果条件不满足，直接看下一行。
如果条件满足，用index tuple去定位并读取整个行。
使用刚才剩下来没有被用到的WHERE条件，检测是否符合条件。

Using where

表示MySQL需要在收到存储引擎返回的结果后，对这个结果进行后过滤(Post filter)。

实验

首先构造样例，在db1中创建表table1_noindex、table1、table2_noindex。对表table1创建联合索引index_c1_c2_c3和索引index_c5。

create DATABASE db1;
use db1;

drop table table1_noindex;
drop table table1;
create TABLE table1_noindex(
    id INT AUTO_INCREMENT PRIMARY KEY,
    c1 INT NOT NULL,
    c2 VARCHAR(30) NOT NULL,
    c3 DATE NOT NULL,
    c4 INT NOT NULL,
    c5 INT NOT NULL UNIQUE
);

create TABLE table2_noindex(
    id INT AUTO_INCREMENT PRIMARY KEY,
    c5 INT NOT NULL UNIQUE
);

CREATE TABLE table1 LIKE table1_noindex;
INSERT INTO table1 SELECT * FROM table1_noindex;

create index index_c1_c2_c3 on table1(c1,c2,c3);
create index index_c5 on table1(c5);

insert into table1 values (NULL, 3, "a", NOW(), 4, 5);
insert into table1_noindex values (NULL, 3, "a", NOW(), 4, 5);

查看索引

1	mysql> show keys from table1;

使用主键或者UNIQUE键

mysql> explain select * from table1 where id=1\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: table1
   partitions: NULL
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL

需要注意的是，使用UNIQUE键，同样会是const类型而不是eq_ref

mysql> explain select c1,c5 from table1 where c5=5\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: table1
   partitions: NULL
         type: const
possible_keys: c5
          key: c5
      key_len: 4
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL

使用索引
Extra中的Using index表示单纯用索引即可获得所有数据，不需要回表查询，这也就是之前提到的覆盖索引。下面我们介绍覆盖索引

mysql> explain select c1,c2,c3 from table1 where c1=1 and c2="a"\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: table1
   partitions: NULL
         type: ref
possible_keys: index_c1_c2_c3
          key: index_c1_c2_c3
      key_len: 96
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: Using index

但是，如果我们加上c4，就必须要回表了

mysql> explain select c1,c4 from table1 where c1=1 and c2="a"\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: table1
   partitions: NULL
         type: ref
possible_keys: index_c1_c2_c3
          key: index_c1_c2_c3
      key_len: 96
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: NULL

此外，如果我们对索引列进行计算或者应用函数，也会导致不能使用索引

mysql> explain select c1,c2,c3 from table1 where c1*2=2 and c2="a"\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: table1
   partitions: NULL
         type: index
possible_keys: index_c1_c2_c3
          key: index_c1_c2_c3
      key_len: 99
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using where; Using index

使用索引下推
为了规避掉覆盖索引直接返回，我们这次用了select *。当然，也可以select索引之外的列，比如c4。

mysql> explain select * from table1 where c1=1 and c2 like "a*"\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: table1
   partitions: NULL
         type: range
possible_keys: index_c1_c2_c3
          key: index_c1_c2_c3
      key_len: 96
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using index condition

不使用索引
可以发现，type是个ALL，表示发生了全表扫描。

mysql> explain select c1,c2,c3 from table1_noindex where c1=1 and c2="a"\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: table1_noindex
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using where

破坏了最左匹配原则
这一次，type是index，这表示仍然需要进行全表扫描。但不同的是扫描是按照索引的顺序，也就是说不需要对结果排序了。

mysql> explain select c1,c2,c3 from table1 where c3=DATE('2012-12-21 00:00:00')\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: table1
   partitions: NULL
         type: index
possible_keys: index_c1_c2_c3
          key: index_c1_c2_c3
      key_len: 99
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using where; Using index

当然如果我们执行一下下面的语句，type又会变成ref

1	create index index_c3 on table1(c3);

多表查询

mysql> explain select table2_noindex.c5 from table1,table2_noindex where table1.c5=table2_noindex.c5;
+----+-------------+----------------+------------+--------+---------------+------+---------+---------------+------+----------+-------------+
| id | select_type | table          | partitions | type   | possible_keys | key  | key_len | ref           | rows | filtered | Extra       |
+----+-------------+----------------+------------+--------+---------------+------+---------+---------------+------+----------+-------------+
|  1 | SIMPLE      | table1         | NULL       | index  | c5            | c5   | 4       | NULL          |    1 |   100.00 | Using index |
|  1 | SIMPLE      | table2_noindex | NULL       | eq_ref | c5            | c5   | 4       | db1.table1.c5 |    1 |   100.00 | Using index |
+----+-------------+----------------+------------+--------+---------------+------+---------+---------------+------+----------+-------------+

mysql> explain select c1,table2_noindex.c5 from table1,table2_noindex where table1.c5=table2_noindex.c5;
+----+-------------+----------------+------------+--------+---------------+------+---------+---------------+------+----------+-------------+
| id | select_type | table          | partitions | type   | possible_keys | key  | key_len | ref           | rows | filtered | Extra       |
+----+-------------+----------------+------------+--------+---------------+------+---------+---------------+------+----------+-------------+
|  1 | SIMPLE      | table1         | NULL       | ALL    | c5            | NULL | NULL    | NULL          |    1 |   100.00 | NULL        |
|  1 | SIMPLE      | table2_noindex | NULL       | eq_ref | c5            | c5   | 4       | db1.table1.c5 |    1 |   100.00 | Using index |
+----+-------------+----------------+------------+--------+---------------+------+---------+---------------+------+----------+-------------+

MySQL查询

查询