特定语句下TiDB与MySQL执行效果对比研究之二

网友投稿 194 2024-03-10



TiDB 与MySQL优化器在特定语句下执行效果对比(二)

特定语句下TiDB与MySQL执行效果对比研究之二

一、引言

Index join 是表关联的常用连接方式,笔者在近期学习TiDB 的index join 过程中,发现TiDB的优化器在选择表连接策略的过程中,在某些场景下并不能选择index join 作为表的连接方式,这一点与MySQL的优化器的选择策略是不同的,下面笔者就以几个例子来展现下这种差异:

备注:笔者测试语句的表数据来源均来自tpch,可以由tiup bench tpch ...导入生成这些数据

测试的TiDB版本为6.5.3,mysql 版本为8.0.30

二、测试案例1关联条件有函数

该测试案例的信息来源于 https://as

对应的表结构

mysql> show create table customer \G *************************** 1. row *************************** Table: customer Create Table: CREATE TABLE `customer` ( `C_CUSTKEY` bigint(20) NOT NULL, `C_NAME` varchar(25) NOT NULL, `C_ADDRESS` varchar(40) NOT NULL, `C_NATIONKEY` bigint(20) NOT NULL, `C_PHONE` char(15) NOT NULL, `C_ACCTBAL` decimal(15,2) NOT NULL, `C_MKTSEGMENT` char(10) NOT NULL, `C_COMMENT` varchar(117) NOT NULL, PRIMARY KEY (`C_CUSTKEY`) /*T![clustered_index] CLUSTERED */ ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin 1 row in set (0.00 sec) mysql> show create table orders \G *************************** 1. row *************************** Table: orders Create Table: CREATE TABLE `orders` ( `O_ORDERKEY` bigint(20) NOT NULL, `O_CUSTKEY` bigint(20) NOT NULL, `O_ORDERSTATUS` char(1) NOT NULL, `O_TOTALPRICE` decimal(15,2) NOT NULL, `O_ORDERDATE` date NOT NULL, `O_ORDERPRIORITY` char(15) NOT NULL, `O_CLERK` char(15) NOT NULL, `O_SHIPPRIORITY` bigint(20) NOT NULL, `O_COMMENT` varchar(79) NOT NULL, PRIMARY KEY (`O_ORDERKEY`) /*T![clustered_index] CLUSTERED */, KEY `index_O_TOTALPRICE` (`O_TOTALPRICE`), KEY `index_a` (`O_ORDERSTATUS`,`O_TOTALPRICE`), KEY `indexO_CUSTKEY` (`O_CUSTKEY`), KEY `index_date` (`O_ORDERDATE`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin 1 row in set (0.00 sec)

原始语句及其执行计划

mysql> explain analyze select /*+ INL_JOIN(a,b) */ count(*) from customer a,orders b where a.c_phone=25-989-741-2988 and a.c_custkey=b.o_custkey and a.C_COMMENT=b.O_COMMENT; +------------------------------------+-----------+---------+-----------+------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+ | id | estRows | actRows | task | access object | execution info | operator info | memory | disk | +------------------------------------+-----------+---------+-----------+------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+ | StreamAgg_9 | 1.00 | 1 | root | | time:901.5ms, loops:2 | funcs:count(1)->Column#18 | 8 Bytes | N/A | | └─IndexJoin_39 | 1.05 | 0 | root | | time:901.5ms, loops:1, inner:{total:183.6ms, concurrency:5, task:1, construct:247.6μs, fetch:183.2ms, build:61μs}, probe:41.2μs | inner join, inner:IndexLookUp_38, outer key:tpch2.customer.c_custkey, inner key:tpch2.orders.o_custkey,equal cond:eq(tpch2.customer.c_comment, tpch2.orders.o_comment), eq(tpch2.customer.c_custkey, tpch2.orders.o_custkey) | 65.6 KB | N/A | | ├─TableReader_33(Build) | 1.04 | 1 | root | | time:717.2ms, loops:3, cop_task: {num: 1, max: 717.2ms, proc_keys: 150000, tot_proc: 714ms, rpc_num: 1, rpc_time: 717.1ms, copr_cache: disabled,distsql_concurrency:15} | data:Selection_32 | 423 Bytes | N/A | | │ └─Selection_32 | 1.04 | 1 | cop[tikv] | | tikv_task:{time:714ms, loops:151}, scan_detail: {total_process_keys: 150000, total_process_keys_size: 30533765, total_keys: 150001, get_snapshot_time: 647.8μs, rocksdb: {key_skipped_count: 150000, block: {cache_hit_count: 5, read_count: 495, read_byte: 11.1 MB, read_time: 345.8ms}}} | eq(tpch2.customer.c_phone, "25-989-741-2988") | N/A | N/A | | │ └─TableFullScan_31 | 150000.00 | 150000 | cop[tikv] | table:a | tikv_task:{time:694ms, loops:151} | keep order:false | N/A | N/A | | └─IndexLookUp_38(Probe) | 1.05 | 6 | root | | time:183ms, loops:2, index_task: {total_time: 1.84ms,fetch_handle:1.83ms, build: 670ns, wait: 1.09μs}, table_task: {total_time: 181.1ms, num: 1, concurrency: 5}, next: {wait_index: 2.01ms, wait_table_lookup_build: 0s, wait_table_lookup_resp: 181ms} | | 9.34 KB | N/A | | ├─IndexRangeScan_36(Build) | 1.05 | 6 | cop[tikv] | table:b, index:indexO_CUSTKEY(O_CUSTKEY) | time:1.83ms, loops:3,cop_task: {num:1, max: 1.79ms, proc_keys: 6, rpc_num: 1, rpc_time: 1.77ms, copr_cache: disabled, distsql_concurrency: 15}, tikv_task:{time:1ms, loops:1}, scan_detail: {total_process_keys: 6, total_process_keys_size: 276, total_keys: 7, get_snapshot_time: 888.5μs, rocksdb: {key_skipped_count: 6, block: {cache_hit_count: 2}}} |range: decidedby [eq(tpch2.orders.o_custkey, tpch2.customer.c_custkey)], keep order:false | N/A | N/A | | └─TableRowIDScan_37(Probe) | 1.05 | 6 | cop[tikv] | table:b | time:181ms, loops:2, cop_task: {num: 3, max: 85.9ms, min: 35.3ms, avg: 60.3ms, p95: 85.9ms, max_proc_keys: 2, p95_proc_keys: 2, tot_proc: 174ms, rpc_num: 3, rpc_time: 180.8ms,copr_cache: disabled, distsql_concurrency: 15}, tikv_task:{proc max:83ms, min:34ms, avg: 58.7ms, p80:83ms, p95:83ms, iters:3, tasks:3}, scan_detail: {total_process_keys: 6, total_process_keys_size: 929, total_keys: 6, get_snapshot_time: 2.35ms, rocksdb: {block: {cache_hit_count: 9, read_count: 18, read_byte: 3.76 MB, read_time: 170.7ms}}}| keep order:false | N/A | N/A | +------------------------------------+-----------+---------+-----------+------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+ 8 rows in set (0.94 sec)

从该执行计划来看,还是可以走正常的index join执行计划,但是如果把关联条件加个函数呢?语句如下

select count(*) from customer a,orders b where a.c_phone=25-989-741-2988 and a.c_custkey=b.o_custkey and substr(a.C_COMMENT,1,10)=substr(b.O_COMMENT,1,10);

再跑一遍执行计划

mysql> explain analyze select count(*) from customer a,orders b where a.c_phone=25-989-741-2988 and a.c_custkey=b.o_custkey and substr(a.C_COMMENT,1,10)=substr(b.O_COMMENT,1,10); +----------------------------------+------------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------+-----------+---------+ | id |estRows| actRows | task | access object | execution info | operator info | memory | disk | +----------------------------------+------------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------+-----------+---------+ | StreamAgg_11 | 1.00 | 1 | root | | time:1.51s, loops:2 | funcs:count(1)->Column#18 | 8 Bytes | N/A | | └─HashJoin_22 | 1.05 | 0 | root | | time:1.51s, loops:1, build_hash_table:{total:154.5ms, fetch:154.5ms, build:5.21μs}, probe:{concurrency:5, total:7.54s, max:1.51s, probe:190.6ms, fetch:7.35s} | inner join, equal:[eq(tpch2.customer.c_custkey, tpch2.orders.o_custkey) eq(Column#19, Column#20)] | 25.7 KB | 0 Bytes | | ├─Projection_14(Build) | 1.04 | 1 | root | | time:154.4ms, loops:2, Concurrency:OFF | tpch2.customer.c_custkey, substr(tpch2.customer.c_comment, 1, 10)->Column#19 | 40.7 KB | N/A | | │ └─TableReader_17 | 1.04 | 1 | root | | time:154.4ms, loops:2, cop_task: {num: 1, max: 169.2ms, proc_keys: 150000, tot_proc: 167ms, rpc_num: 1, rpc_time: 169.2ms, copr_cache: disabled, distsql_concurrency: 15}

版权声明:本文内容由网络用户投稿,版权归原作者所有,本站不拥有其著作权,亦不承担相应法律责任。如果您发现本站中有涉嫌抄袭或描述失实的内容,请联系我们jiasou666@gmail.com 处理,核实后本网站将在24小时内删除侵权内容。

上一篇:特定语句下TiDB与MySQL执行效果对比研究之一
下一篇:理解TiDB集群P99计算方式的重要性
相关文章