MySQL查询缓存的小知识

前言

我们知道，缓存的设计思想在rdbms数据库中无处不在，就拿号称2500w行代码，bug堆积如山的oracle数据库来说，sql的执行计划可以缓存在library cache中避免再次执行相同sql发生硬解析（语法分析->语义分析->生成执行计划），sql执行结果缓存在result cache内存组件中，有效的将物理io转化成逻辑io，提高sql执行效率。

mysql的querycache跟oracle类似，缓存的是sql语句文本以及对应的结果集，看起来是一个很棒的idea，那为什么从mysql 4.0推出之后，5.6中默认禁用，5.7中被deprecated（废弃）以及8.0版本被removed，今天就聊聊mysql querycache的前世今生。

querycache介绍

mysql查询缓（qc：querycache）在mysql 4.0.1中引入，查询缓存存储select语句的文本以及发送给客户机的结果集，如果再次执行相同的sql，server端将从查询缓存中检索结果返回给客户端，而不是再次解析执行sql，查询缓存在session之间共享，因此，一个客户端生成的缓存结果集，可以响应另一个客户端执行同样的sql。

回到开头的问题，如何判断sql是否共享？

通过sql文本是否完全一致来判断，包括大小写，空格等所有字符完全一模一样才可以共享，共享好处是可以避免硬解析，直接从qc获取结果返回给客户端，下面的两个sql是不共享滴，因为一个是from，另一个是from。

--sql 1
select id, balance from account where id = 121;
--sql 2
select id, balance from account where id = 121;

下面是oracle数据库通过sql_text生成sql_id的算法，如果sql_id不一样说明就不是同一个sql，就不共享，就会发生硬解析。

#!/usr/bin/perl -w
use digest::md5 qw(md5 md5_hex md5_base64);
use math::bigint;
my $stmt = "select id, balance from account where id = 121#!/usr/bin/perl -w
use digest::md5 qw(md5 md5_hex md5_base64);
use math::bigint;
my $stmt = "select id, balance from account where id = 121\0"; 
my $hash = md5 $stmt; 
my($a,$b,$msb,$lsb) = unpack("v*",$hash);
my $sqln = $msb*(2**32)+$lsb;
my $stop = log($sqln) / log(32) + 1;
my $sqlid = '';
my $charbase32 = '0123456789abcdfghjkmnpqrstuvwxyz';
my @chars = split '', $charbase32;
for($i=0; $i < $stop-1; $i++){
my $x = math::bigint->new($sqln);
my $seq = $x->bdiv(32**$i)->bmod(32);
$sqlid = $chars[$seq].$sqlid;
}
print "sql is:\n $stmt \nsql_id is\n $sqlid\n";
"; 
my $hash = md5 $stmt; 
my($a,$b,$msb,$lsb) = unpack("v*",$hash);
my $sqln = $msb*(2**32)+$lsb;
my $stop = log($sqln) / log(32) + 1;
my $sqlid = '';
my $charbase32 = '0123456789abcdfghjkmnpqrstuvwxyz';
my @chars = split '', $charbase32;
for($i=0; $i < $stop-1; $i++){
 my $x = math::bigint->new($sqln);
 my $seq = $x->bdiv(32**$i)->bmod(32);
 $sqlid = $chars[$seq].$sqlid;
}
print "sql is:\n $stmt \nsql_id is\n $sqlid\n";

大家可以发现sql 1和sql 2通过代码生成的sql_id值是不一样，所以不共享。

sql is: select id, balance from account where id = 121 
sql_id is dm5c6ck1g7bds
sql is: select id, balance from account where id = 121 
sql_id is 6xb8gvs5cmc9b

如果让你比较两个java代码文件的内容的有何差异，只需要将这段代码理解透了，就可以改造实现自己的业务逻辑。

querycache配置

mysql> show variables like '%query_cache%';
+------------------------------+----------+
| variable_name    | value |
+------------------------------+----------+
| have_query_cache    | yes  |
| query_cache_limit   | 1048576 |
| query_cache_min_res_unit  | 4096  |
| query_cache_size    | 16777216 |
| query_cache_type    | off  |
| query_cache_wlock_invalidate | off  |

variable_name	description
have_query_cache	查询缓存是否可用，yes-可用；no-不可用，如果用标准二进制mysql，值总是yes。
query_cache_limit	控制单个查询结果集的最大尺寸，默认是1mb。
query_cache_min_res_unit	查询缓存分片数据块的大小，默认是4kb，可以满足大部分业务场景。
query_cache_size	查询缓存大小，单位bytes，设置为0是禁用querycache，注意：不要将缓存的大小设置得太大，由于在更新过程中需要线程锁定querycache，因此对于非常大的缓存，您可能会看到锁争用问题。
query_cache_type	当query_cache_size>0；该变量影响qc如何工作，有三个取值0，1，2，0：禁止缓存或检索缓存结果；1：启用缓存，select sql_no_cache的语句除外；2：只缓存以select sql_cache开头的语句。

query_cache_min_res_unit说明

默认大小是4kb，如果有很多查询结果很小，那么默认数据块大小可能会导致内存碎片，由于内存不足，碎片可能会强制查询缓存从缓存中删除查询。

在这种情况下，可以减小query_cache_min_res_unit的值，由于修剪而删除的空闲块和查询的数量由qcache_free_blocks和qcache_lowmem_prunes状态变量的值给出，如果大量的查询有较大的结果集，可以增大该参数的值来提高性能。

通常开启querycache方式

# 修改mysql配置文件/etc/my.cnf，添加如下配置，重启mysql server即可。
[mysqld]
query_cache_size = 32m
query_cache_type = 1

querycache使用

先搞点测试数据，分别对禁用和开启querycache下的场景进行测试。

--创建一个用户表users，并且插入100w数据。
create table `users` (
 `id` bigint not null auto_increment,
 `name` varchar(20) not null default '' comment '姓名',
 `age` tinyint not null default '0' comment 'age',
 `gender` char(1) not null default 'm' comment '性别',
 `phone` varchar(16) not null default '' comment '手机号',
 `create_time` datetime not null default current_timestamp comment '创建时间',
 `update_time` datetime not null default current_timestamp on update current_timestamp comment '修改时间',
 primary key (`id`)
) engine=innodb default charset=utf8mb4 comment='用户信息表';

select count(*) from users;
+----------+
| count(*) |
+----------+
| 1000000 |

禁用querycache场景

在不使用querycache的时候，每次执行相同的查询语句，都要发生一次硬解析，消耗大量的资源。

#禁用querycache的配置
query_cache_size = 0
query_cache_type = 0

重复执行下面查询，观察执行时间。

--第一次执行查询语句
mysql> select * from users order by create_time desc limit 10;
+---------+------------+-----+--------+-------------+---------------------+---------------------+
| id  | name  | age | gender | phone  | create_time   | update_time   |
+---------+------------+-----+--------+-------------+---------------------+---------------------+
| 997855 | user997854 | 54 | m  | 15240540354 | 2020-12-15 14:34:50 | 2020-12-15 14:34:50 |
.......
10 rows in set (0.89 sec)
--第二次执行同样的查询语句
mysql> select * from users order by create_time desc limit 10;
+---------+------------+-----+--------+-------------+---------------------+---------------------+
| id  | name  | age | gender | phone  | create_time   | update_time   |
+---------+------------+-----+--------+-------------+---------------------+---------------------+
| 997855 | user997854 | 54 | m  | 15240540354 | 2020-12-15 14:34:50 | 2020-12-15 14:34:50 |
.......
10 rows in set (0.90 sec)
-- profile跟踪情况
mysql> show profile cpu,block io for query 1; 
+----------------------+----------+----------+------------+--------------+---------------+
| status    | duration | cpu_user | cpu_system | block_ops_in | block_ops_out |
+----------------------+----------+----------+------------+--------------+---------------+
| preparing   | 0.000022 | 0.000017 | 0.000004 |   0 |    0 |
| sorting result  | 0.000014 | 0.000009 | 0.000005 |   0 |    0 |
| executing   | 0.000011 | 0.000007 | 0.000004 |   0 |    0 |
| sending data   | 0.000021 | 0.000016 | 0.000004 |   0 |    0 |
| creating sort index | 0.906290 | 0.826584 | 0.000000 |   0 |    0 |

可以看到，多次执行同样的sql查询语句，执行时间都是0.89s左右，几乎没有差别，同时时间主要消耗在creating sort index阶段。

开启querycache场景

开启查询缓存时，查询语句第一次被执行时会将sql文本及查询结果缓存在qc中，下一次执行同样的sql执行从qc中获取数据返回给客户端即可。

#禁用querycache的配置
query_cache_size = 32m
query_cache_type = 1

--第一次执行查询语句
mysql> select * from users order by create_time desc limit 10;
+---------+------------+-----+--------+-------------+---------------------+---------------------+
| id  | name  | age | gender | phone  | create_time   | update_time   |
+---------+------------+-----+--------+-------------+---------------------+---------------------+
| 997855 | user997854 | 54 | m  | 15240540354 | 2020-12-15 14:34:50 | 2020-12-15 14:34:50 |
.......
10 rows in set (0.89 sec)
--第二次执行查询语句
mysql> select * from users order by create_time desc limit 10;
+---------+------------+-----+--------+-------------+---------------------+---------------------+
| id  | name  | age | gender | phone  | create_time   | update_time   |
+---------+------------+-----+--------+-------------+---------------------+---------------------+
| 997855 | user997854 | 54 | m  | 15240540354 | 2020-12-15 14:34:50 | 2020-12-15 14:34:50 |
.......
10 rows in set (0.00 sec)
-- profile跟踪数据
mysql> show profile cpu,block io for query 3;
+--------------------------------+----------+----------+------------+--------------+---------------+
| status       | duration | cpu_user | cpu_system | block_ops_in | block_ops_out |
+--------------------------------+----------+----------+------------+--------------+---------------+
| waiting for query cache lock | 0.000016 | 0.000015 | 0.000001 |   0 |    0 |
| checking query cache for query | 0.000007 | 0.000007 | 0.000000 |   0 |    0 |
| checking privileges on cached | 0.000004 | 0.000003 | 0.000000 |   0 |    0 |
| checking permissions   | 0.000034 | 0.000033 | 0.000001 |   0 |    0 |
| sending cached result to clien | 0.000018 | 0.000017 | 0.000001 |   0 |    0 |

可以看到，第一次执行querycache里没有缓存sql文本及数据，执行时间0.89s，由于开启了qc，sql文本及执行结果被缓存在qc中，第二次执行执行同样的sql查询语句，直接命中qc且返回数据，不需要发生硬解析，所以执行时间降低为0s，从profile里看到sending cached result to client直接发送qc中的数据返回给客户端。

查询缓存命中率

查询缓存相关的status变量

mysql>show global status like 'qcache\_%';
+-------------------------+----------+
| variable_name   | value |
+-------------------------+----------+
| qcache_free_blocks  | 1  | --查询缓存中可用内存块的数目。
| qcache_free_memory  | 33268592 | --查询缓存的可用内存量。
| qcache_hits    | 121  | --从qc中获取结果集的次数。
| qcache_inserts   | 91  | --将查询结果集添加到qc的次数，意味着查询已经不在qc中。
| qcache_lowmem_prunes | 0  | --由于内存不足而从查询缓存中删除的查询数。
| qcache_not_cached  | 0  | --未缓存的查询数目。
| qcache_queries_in_cache | 106  | --在查询缓存中注册的查询数。
| qcache_total_blocks  | 256  | --查询缓存中的块总数。

查询缓存命中率及平均大小

           qcache_hits
query cache hit rate = ------------------------------------------------ x 100%
      qcache_hits + qcache_inserts + qcache_not_cached
      
        query_cache_size = qcache_free_memory
query cache avg query size = --------------------------------------- 
          qcache_queries_in_cache

更新操作对qc影响

举个例子，支付系统的里转账逻辑，先要锁定账户再修改余额，主要步骤如下：

query_id	query	description
1	reset query cache	清空查询缓存。
2	select balance from account where id = 121	第一次执行，未命中qc，添加到qc。
3	select balance from account where id = 121	命中qc，直接返回结果。
4	update account set balance = balance – 1000 where id = 121	更新，锁定query cche进行更新，缓存数据失效。
5	select balance from account where id = 121	缓存已失效，未命中，添加到qc。
6	select balance from account where id = 121	命中qc，直接返回结果。

对于这种情况来说，qc是不太适合的，因为第一次执行查询sql未命中，返回结果给客户端，添加sql文本及结果集到qc之后，下一次执行同样的sql直接从qc返回结果，不需要硬解析操作，但是每次update都是先更新数据，然后锁定qc然后更新缓存结果，会导致之前的缓存结果失效，再次执行相的查询sql还是未命中，有得重新添加到qc，这样频繁的锁定qc->检查qc->添加qc->更新qc非常消耗资源，降低数据库的并发处理能力。

为何放弃querycache

一般业务场景

从业务系统的操作类型，可以分为oltp（online transaction processing 联机事务处理系统）和olap（online analysis processing联机分析处理系统），对于政企业务，也可以分为boss（business operation support system-业务操作支撑系统，简称业支）和bass（business analysis support system-业务分析支撑系统，简称经分），来总结下这两类系统的特点。

适合querycache的场景

首先，查询缓存qc的大小只有几mb，不适合将缓存设置得太大，由于在更新过程中需要线程锁定querycache，因此对于非常大的缓存，可能会看到锁争用问题。那么，哪些情况有助于从查询缓存中获益呢？以下是理想条件：

相同的查询是由相同或多个客户机重复发出的。
被访问的底层数据本质上是静态或半静态的。
查询有可能是资源密集型和/或构建简短但计算复杂的结果集，同时结果集比较小。
并发性和查询qps都不高。

这4种情况只是理想情况下，实际的业务系统都是有crud操作的，数据更新比较频繁，查询接口的qps比较高，所以能满足上面的理想情况下的业务场景实在很少，我能想到就是配置表，数据字典表这些基本都是静态或半静态的，可以时通过qc来提高查询效率。

不适合querycache的场景

如果表数据变化很快，则查询缓存将失效，并且由于不断从缓存中删除查询，从而使服务器负载升高，处理速度变得更慢，如果数据每隔几秒钟更新一次或更加频繁，则查询缓存不太可能合适。

同时，查询缓存使用单个互斥体来控制对缓存的访问，实际上是给服务器sql处理引擎强加了一个单线程网关，在查询qps比较高的情况下，可能成为一个性能瓶颈，会严重降低查询的处理速度。因此，mysql 5.6中默认禁用了查询缓存。

删除querycache

the query cache is deprecated as of mysql 5.7.20, and is removed in mysql 8.0. deprecation includes query_cache_type，可以看到从mysql 5.6的默认禁用，5.7的废弃以及8.0的彻底删除，oracle也是综合了各方面考虑做出了这样的选择。

上面聊了下适合和不适合的querycache的业务场景，发现这个特性对业务场景要求过于苛刻，与实际业务很难吻合，而且开启之后，对数据库并发度和处理能力都会降低很多，下面总结下为何mysql从disabled->deprecated->removed querycache的主要原因。

同时查询缓存碎片化还会导致服务器的负载升高，影响数据库的稳定性，在oracle官方搜索querycache可以发现，有很多bug存在，这也就决定了mysql 8.0直接果断的remove了该特性。

总结

上面为大家介绍了mysql querycache从推出->禁用->废弃->删除的心路历程，设计之初是为了减少重复sql查询带来的硬解析开销，同时将物理io转化为逻辑io，来提高sql的执行效率，但是mysql经过了多个版本的迭代，同时在硬件存储发展之快的今天，qc几乎没有任何收益，而且还会降低数据库并发处理能力，最终在8.0版本直接removd掉了。

其实缓存设计思想在硬件和软件领域无处不在，硬件方面：raid卡，cpu都有自己缓存，软件方面就太多了，os的cache，数据库的buffer pool以及java程序的缓存，作为一名研发工程师，需要根据业务场景选择合适缓存方案是非常重要的，如果都不合适，就需进行定制化开发缓存，来更好的match自己的业务场景，今天就聊这么多，希望对大家有所帮助。

我是敖丙，你知道的越多，你不知道的越多，感谢各位人才的：点赞、收藏和评论，我们下期见！

以上就是mysql查询缓存的小知识的详细内容，更多关于mysql查询缓存的资料请关注www.887551.com其它相关文章！