索引加速功能真能提升10倍吗?--TiDB V6.1.0-V7.1.0建索引速度对比

网友投稿 624 2023-11-24

记得在某个6的版本发布宣传里面看到,新版本建索引速度提升了10倍,一直没抽时间亲自测试验证下,前段时间V7.1发布,趁着这次机会,把我的一个V6.1.0的环境做下升级,顺便测试了下各版本创建索引速度对比下。

索引加速功能真能提升10倍吗?--TiDB V6.1.0-V7.1.0建索引速度对比

不测不知道,一测吓一跳,满满的惊喜。一个3kw的表新索引速度从v6.1.0的21min,到v7.1.0只要1min8s。提升了20倍!

以下来自官方文档相关版本关于索引加速功能的Release Notes:

TiDB v6.3.0 支持开启添加索引加速功能,提升了创建索引回填过程的速度。开启该功能后,TiDB 添加索引的性能提升约为原来的 3 倍。 从 v6.5.0 起,添加索引加速功能默认开启。

1、环境说明

环境信息

单节点TiDB环境,物理机。配置:40c512G,7T *** TiDB初始版本:v6.1.0,逐步升级到v6.5.0和v7.1.0

表信息:3kw

直接使用的tpcc里面的customer表,3000w,大小约18G

[root@127.0.0.1][tpcc][05:17:48]> show create table customer\G *************************** 1. row *************************** Table: customer Create Table: CREATE TABLE `customer` ( `c_id` int(11) NOT NULL, `c_d_id` int(11) NOT NULL, `c_w_id` int(11) NOT NULL, `c_first` varchar(16) DEFAULT NULL, `c_middle` char(2) DEFAULT NULL, `c_last` varchar(16) DEFAULT NULL, `c_street_1` varchar(20) DEFAULT NULL, `c_street_2` varchar(20) DEFAULT NULL, `c_city` varchar(20) DEFAULT NULL, `c_state` char(2) DEFAULT NULL, `c_zip` char(9) DEFAULT NULL, `c_phone` char(16) DEFAULT NULL, `c_since` datetime DEFAULT NULL, `c_credit` char(2) DEFAULT NULL, `c_credit_lim` decimal(12,2) DEFAULT NULL, `c_discount` decimal(4,4) DEFAULT NULL, `c_balance` decimal(12,2) DEFAULT NULL, `c_ytd_payment` decimal(12,2) DEFAULT NULL, `c_payment_cnt` int(11) DEFAULT NULL, `c_delivery_cnt` int(11) DEFAULT NULL, `c_data` varchar(500) DEFAULT NULL, PRIMARY KEY (`c_w_id`,`c_d_id`,`c_id`) /*T![clustered_index] NONCLUSTERED */, KEY `idx_customer` (`c_w_id`,`c_d_id`,`c_last`,`c_first`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin 1 row in set (0.01 sec) [root@127.0.0.1][tpcc][05:21:48]> show variables like tidb_ddl%; +----------------------------+--------------+ | Variable_name | Value | +----------------------------+--------------+ | tidb_ddl_error_count_limit | 512 | | tidb_ddl_reorg_batch_size | 256 | | tidb_ddl_reorg_priority | PRIORITY_LOW | | tidb_ddl_reorg_worker_cnt | 4 | +----------------------------+--------------+ 4 rows in set (0.00 sec)

2、v6.1.0测试3kw的表加索引

测试加3个不同类型列的索引

分别测试三个索引,一个是varchar列、一个是char列,一个组合索引

alter table tpcc.customer add index idx_01(c_city); alter table tpcc.customer add index idx_02(c_phone); alter table tpcc.customer add index idx_03(c_phone,c_since);

修改tidb_ddl_reorg_worker_cnt,观察建索引的速度。

经测试,时间差不多,后面就主要以组合索引idx_03为主来做对比。

--默认4时 [root@127.0.0.1][tpcc][05:25:40]> alter table tpcc.customer add index idx_01(c_city); Query OK, 0 rows affected (21 min 4.48 sec) [root@127.0.0.1][tpcc][05:46:48]> alter table tpcc.customer add index idx_02(c_phone); Query OK, 0 rows affected (21 min 8.08 sec)[root@127.0.0.1][tpcc][06:08:50]> set global tidb_ddl_reorg_worker_cnt=8; Query OK, 0 rows affected (0.02 sec) [root@127.0.0.1][tpcc][06:08:58]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (10 min 44.26 sec) [root@127.0.0.1][tpcc][06:22:43]> set global tidb_ddl_reorg_worker_cnt=16; Query OK, 0 rows affected (0.02 sec) [root@127.0.0.1][tpcc][06:23:01]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (9 min 6.01 sec) [root@127.0.0.1][tpcc][08:38:03]> alter table tpcc.customer drop index idx_03; Query OK, 0 rows affected (0.33 sec) [root@127.0.0.1][tpcc][08:38:18]> set global tidb_ddl_reorg_worker_cnt=32; Query OK, 0 rows affected (0.02 sec) [root@127.0.0.1][tpcc][08:38:29]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (8 min 35.48 sec)

加索引时间

--平均每秒6000行,4个worker select 30000000/(6000*4)/60; --平均每秒1600行,32个worker select 30000000/(1600*32)/60;

tidb日志中可以看到:当tidb_ddl_reorg_worker_cnt=4时,创建索引的读取数据的速度。单线程每秒5k-7k行左右。 当tidb_ddl_reorg_worker_cnt=32时,创建索引的读取数据的速度。单线程每秒只有1600行左右。

[2023/06/19 17:35:06.521 +08:00] [INFO] [backfilling.go:271] ["[ddl] backfill worker back fill index"] [workerID=1] [addedCount=90624] [scanCount=90624] [nextHandle=12317094.next] [speed(rows/s)=6611.8594994214745] [2023/06/19 17:35:06.521 +08:00] [INFO] [backfilling.go:271] ["[ddl] backfill worker back fill index"] [workerID=3] [addedCount=90624] [scanCount=90624] [nextHandle=12609699.next] [speed(rows/s)=6681.175296206864] [2023/06/19 17:35:06.689 +08:00] [INFO] [backfilling.go:271] ["[ddl] backfill worker back fill index"] [workerID=2] [addedCount=90624] [scanCount=90624] [nextHandle=12463112.next] [speed(rows/s)=6661.681631740738] [2023/06/19 17:35:06.904 +08:00] [INFO] [backfilling.go:271] ["[ddl] backfill worker back fill index"] [workerID=0] [addedCount=90624] [scanCount=90624] [nextHandle=12171054.next] [speed(rows/s)=6364.5596641745815] .... [2023/06/20 08:45:14.810 +08:00] [INFO] [backfilling.go:271] ["[ddl] backfill worker back fill index"] [workerID=28] [addedCount=30208] [scanCount=30208] [nextHandle=27312608.next] [speed(rows/s)=1657.5438087595428]

3、v6.5.0测试3kw的表加索引

确认TiDB版本和表信息

[root@127.0.0.1][tpcc][09:14:33]> select version(); +--------------------+ | version() | +--------------------+ | 5.7.25-TiDB-v6.5.0 | +--------------------+ 1 row in set (0.00 sec) [root@127.0.0.1][tpcc][09:14:47]> select count(0) from tpcc.customer; +----------+ | count(0) | +----------+ | 30000000 | +----------+ 1 row in set (0.02 sec)

测试加索引速度

alter table tpcc.customer add index idx_01(c_city); alter table tpcc.customer add index idx_02(c_phone); alter table tpcc.customer add index idx_03(c_phone,c_since);

默认:tidb_ddl_reorg_worker_cnt=4

[root@127.0.0.1][tpcc][09:19:36]> show variables like tidb_ddl_reorg_worker_cnt; +---------------------------+-------+ | Variable_name | Value | +---------------------------+-------+ | tidb_ddl_reorg_worker_cnt | 4 | +---------------------------+-------+ 1 row in set (0.00 sec) [root@127.0.0.1][tpcc][09:19:40]> alter table tpcc.customer add index idx_01(c_city); Query OK, 0 rows affected (1 min 19.36 sec) [root@127.0.0.1][tpcc][09:21:03]> alter table tpcc.customer add index idx_02(c_phone); Query OK, 0 rows affected (1 min 19.49 sec) [root@127.0.0.1][tpcc][09:23:51]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (1 min 16.22 sec)

分别测试tidb_ddl_reorg_worker_cnt=8、16、32

set global tidb_ddl_reorg_worker_cnt=8; alter table tpcc.customer drop index idx_03; alter table tpcc.customer add index idx_03(c_phone,c_since); set global tidb_ddl_reorg_worker_cnt=16; alter table tpcc.customer drop index idx_03; alter table tpcc.customer add index idx_03(c_phone,c_since); set global tidb_ddl_reorg_worker_cnt=32; alter table tpcc.customer drop index idx_03; alter table tpcc.customer add index idx_03(c_phone,c_since);

操作日志:

[root@127.0.0.1][tpcc][09:39:18]> set global tidb_ddl_reorg_worker_cnt=8; Query OK, 0 rows affected (0.02 sec) [root@127.0.0.1][tpcc][09:40:07]> alter table tpcc.customer drop index idx_03; Query OK, 0 rows affected (0.73 sec) [root@127.0.0.1][tpcc][09:40:08]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (53.23 sec) [root@127.0.0.1][tpcc][09:41:03]> set global tidb_ddl_reorg_worker_cnt=16; Query OK, 0 rows affected (0.02 sec) [root@127.0.0.1][tpcc][09:41:41]> alter table tpcc.customer drop index idx_03; Query OK, 0 rows affected (0.49 sec) [root@127.0.0.1][tpcc][09:41:41]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (43.99 sec) [root@127.0.0.1][tpcc][09:42:27]> set global tidb_ddl_reorg_worker_cnt=32; Query OK, 0 rows affected (0.24 sec) [root@127.0.0.1][tpcc][09:43:24]> alter table tpcc.customer drop index idx_03; Query OK, 0 rows affected (0.68 sec) [root@127.0.0.1][tpcc][09:43:25]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (38.04 sec)

索引加速功能

执行 ADMIN SHOW DDL JOBS 语句查看 JOB_TYPE 一列中是否含有 ingest 字样

[root@127.0.0.1][tpcc][09:17:35]> admin show ddl jobs; +--------+---------+------------+------------------------+----------------------+-----------+----------+-----------+---------------------+---------------------+---------------------+---------+ | JOB_ID | DB_NAME | TABLE_NAME | JOB_TYPE | SCHEMA_STATE | SCHEMA_ID | TABLE_ID | ROW_COUNT | CREATE_TIME | START_TIME | END_TIME | STATE | +--------+---------+------------+------------------------+----------------------+-----------+----------+-----------+---------------------+---------------------+---------------------+---------+ | 23464 | tpcc | customer | add index /* ingest */ | write reorganization | 23386 | 23447 | 25853098 | 2023-06-21 09:22:31 | 2023-06-21 09:22:32 | NULL | running | | 23463 | tpcc | customer | add index /* ingest */ | public | 23386 | 23447 | 30000000 | 2023-06-21 09:19:44 | 2023-06-21 09:19:44 | 2023-06-21 09:21:03 | synced | | 23462 | tpcc | customer | drop index | none | 23386 | 23447 | 0 | 2023-06-21 09:19:34 | 2023-06-21 09:19:34 | 2023-06-21 09:19:34 | synced | | 23461 | tpcc | customer | add index /* ingest */ | public | 23386 | 23447 | 30000000 | 2023-06-21 09:16:40 | 2023-06-21 09:16:40 | 2023-06-21 09:17:44 | synced |

相关参数:

tidb_ddl_enable_fast_reorg tidb_ddl_disk_quota

https://docs.pingcap.com/zh/tidb/stable/system-variables#tidb_ddl_enable_fast_reorg-span-classversion-mark%E4%BB%8E-v630-%E7%89%88%E6%9C%AC%E5%BC%80%E5%A7%8B%E5%BC%95%E5%85%A5span

在升级到 v6.5.0 及以上版本时,建议你检查 TiDB 的 temp-dir 路径是否正确挂载了 *** 磁盘。该参数是 TiDB 的配置参数,设置后需要重启 TiDB 才能生效。因此,在升级前提前进行设置,可以避免再次重启。

使用快速索引功能,建索引的时候,数据会回填到temp-dir指定目录,过程中可以看到该目录空间在持续增长。

[tidb@host_130 tmp_ddl-4000]$ du -sh /tmp/tidb/tmp_ddl-4000/ 0 /tmp/tidb/tmp_ddl-4000/ [tidb@host_130 tmp_ddl-4000]$ du -sh /tmp/tidb/tmp_ddl-4000/ 88M /tmp/tidb/tmp_ddl-4000/ [tidb@host_130 tmp_ddl-4000]$ du -sh /tmp/tidb/tmp_ddl-4000/ 193M /tmp/tidb/tmp_ddl-4000/ [tidb@host_130 tmp_ddl-4000]$ tree /tmp/tidb/tmp_ddl-4000/ /tmp/tidb/tmp_ddl-4000/ └── 23464 ├── d4cacd68-ac06-558d-8dca-9bd229f322d2 │   ├── 000002.log │   ├── CURRENT │   ├── LOCK │   ├── MANIFEST-000001 │   └── OPTIONS-000003 └── d4cacd68-ac06-558d-8dca-9bd229f322d2.sst ├── 010ff049-964c-4f42-8880-47d69a44fede.sst ├── 078d2401-a319-4a62-928a-9af030faf742.sst ├── 2497f448-a7fe-4e6f-a9be-cea2bd4413c4.sst ├── 26ac2d83-2ca9-4962-80a3-2dcdf556e3f1.sst ├── 39549396-c9f2-4f06-8edf-7b325376e6a4.sst ├── 69be99c9-cc7c-42bf-97cc-f67547d72865.sst ├── 70b137b4-0fcf-4139-8c25-d03a4a1ef932.sst ├── 735fe13f-cf00-468c-91f9-2d9578064d60.sst ├── 7c50175e-f97f-49e5-8a5b-eb7ea4acbfb1.sst ├── 86e551b8-93fb-4862-82cb-a3a0c34ce280.sst ├── 913eef57-b90b-45df-b1d5-85ee52a95471.sst ├── 9d736cbb-fc5b-4df2-a203-aab03afdedea.sst ├── def3a75f-bee9-4e45-b1ae-a8085f7daa57.sst ├── e0151d63-ff16-4f09-946f-c18f35fb1e94.sst └── f32bbfe0-9672-4f25-a20f-af391e7c50d6.sst 3 directories, 20 files

4、v7.1.0测试3kw的表加索引

TiDB版本和表信息

[root@127.0.0.1][tpcc][08:40:23]> select version(); +--------------------+ | version() | +--------------------+ | 5.7.25-TiDB-v7.1.0 | +--------------------+ 1 row in set (0.00 sec) [root@127.0.0.1][tpcc][08:40:42]> select count(0) from tpcc.customer; +----------+ | count(0) | +----------+ | 30000000 | +----------+ 1 row in set (4.00 sec)

不同worker下,加索引速度对比

默认:tidb_ddl_reorg_worker_cnt=4

[root@127.0.0.1][tpcc][08:40:55]> set global tidb_ddl_reorg_worker_cnt=4; Query OK, 0 rows affected (0.04 sec) [root@127.0.0.1][tpcc][08:41:20]> show variables like tidb_ddl_%; +--------------------------------+--------------+ | Variable_name | Value | +--------------------------------+--------------+ | tidb_ddl_disk_quota | 107374182400 | | tidb_ddl_enable_fast_reorg | ON | | tidb_ddl_error_count_limit | 512 | | tidb_ddl_flashback_concurrency | 64 | | tidb_ddl_reorg_batch_size | 256 | | tidb_ddl_reorg_priority | PRIORITY_LOW | | tidb_ddl_reorg_worker_cnt | 4 | +--------------------------------+--------------+ 7 rows in set (0.01 sec) [root@127.0.0.1][tpcc][10:37:58]> alter table tpcc.customer add index idx_01(c_city); Query OK, 0 rows affected (1 min 8.36 sec) [root@127.0.0.1][tpcc][10:29:31]> alter table tpcc.customer add index idx_02(c_phone); Query OK, 0 rows affected (1 min 7.93 sec) [root@127.0.0.1][tpcc][10:36:00]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (1 min 7.64 sec)

分别测试tidb_ddl_reorg_worker_cnt=8、16、32

set global tidb_ddl_reorg_worker_cnt=8; alter table tpcc.customer drop index idx_03; alter table tpcc.customer add index idx_03(c_phone,c_since); set global tidb_ddl_reorg_worker_cnt=16; alter table tpcc.customer drop index idx_03; alter table tpcc.customer add index idx_03(c_phone,c_since); set global tidb_ddl_reorg_worker_cnt=32; alter table tpcc.customer drop index idx_03; alter table tpcc.customer add index idx_03(c_phone,c_since);

执行记录:

[root@127.0.0.1][tpcc][10:43:58]> set global tidb_ddl_reorg_worker_cnt=8; Query OK, 0 rows affected (0.03 sec) [root@127.0.0.1][tpcc][10:44:00]> alter table tpcc.customer drop index idx_03; Query OK, 0 rows affected (0.47 sec) [root@127.0.0.1][tpcc][10:44:01]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (44.50 sec) [root@127.0.0.1][tpcc][10:44:47]> set global tidb_ddl_reorg_worker_cnt=16; Query OK, 0 rows affected (0.01 sec) [root@127.0.0.1][tpcc][10:46:04]> alter table tpcc.customer drop index idx_03; Query OK, 0 rows affected (0.47 sec) [root@127.0.0.1][tpcc][10:46:05]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (36.20 sec) [root@127.0.0.1][tpcc][10:46:42]> set global tidb_ddl_reorg_worker_cnt=32; Query OK, 0 rows affected (0.02 sec) [root@127.0.0.1][tpcc][10:46:46]> alter table tpcc.customer drop index idx_03; Query OK, 0 rows affected (0.41 sec) [root@127.0.0.1][tpcc][10:46:46]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (31.45 sec)

不同tidb_ddl_reorg_batch_size下速度对比

tidb_ddl_reorg_batch_size 默认256.

set global tidb_ddl_reorg_worker_cnt=4; set global tidb_ddl_reorg_batch_size=512; alter table tpcc.customer drop index idx_03; alter table tpcc.customer add index idx_03(c_phone,c_since); set global tidb_ddl_reorg_batch_size=1024; alter table tpcc.customer drop index idx_03; alter table tpcc.customer add index idx_03(c_phone,c_since); set global tidb_ddl_reorg_batch_size=2048; alter table tpcc.customer drop index idx_03; alter table tpcc.customer add index idx_03(c_phone,c_since);

操作日志:

[root@127.0.0.1][tpcc][05:23:46]> set global tidb_ddl_reorg_batch_size=512; Query OK, 0 rows affected (0.02 sec) [root@127.0.0.1][tpcc][05:23:53]> alter table tpcc.customer drop index idx_03; Query OK, 0 rows affected (0.44 sec) [root@127.0.0.1][tpcc][05:23:54]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (1 min 5.60 sec) [root@127.0.0.1][tpcc][05:25:01]> show variables like tidb_ddl_%; +--------------------------------+--------------+ | Variable_name | Value | +--------------------------------+--------------+ | tidb_ddl_disk_quota | 107374182400 | | tidb_ddl_enable_fast_reorg | ON | | tidb_ddl_error_count_limit | 512 | | tidb_ddl_flashback_concurrency | 64 | | tidb_ddl_reorg_batch_size | 512 | | tidb_ddl_reorg_priority | PRIORITY_LOW | | tidb_ddl_reorg_worker_cnt | 4 | +--------------------------------+--------------+ 7 rows in set (0.00 sec) [root@127.0.0.1][tpcc][05:25:26]> set global tidb_ddl_reorg_batch_size=1024; Query OK, 0 rows affected (0.02 sec) [root@127.0.0.1][tpcc][05:25:31]> alter table tpcc.customer drop index idx_03; Query OK, 0 rows affected (0.47 sec) [root@127.0.0.1][tpcc][05:25:32]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (58.42 sec) [root@127.0.0.1][tpcc][05:26:30]> [root@127.0.0.1][tpcc][05:26:30]> set global tidb_ddl_reorg_batch_size=2048; Query OK, 0 rows affected (0.02 sec) [root@127.0.0.1][tpcc][05:26:57]> alter table tpcc.customer drop index idx_03; Query OK, 0 rows affected (0.42 sec) [root@127.0.0.1][tpcc][05:26:58]> alter table tpcc.customer add index idx_03(c_phone,c_since); Query OK, 0 rows affected (56.39 sec)

5、小结

版本表数据量tidb_ddl_reorg_worker_cnt=4(默认)tidb_ddl_reorg_worker_cnt=8tidb_ddl_reorg_worker_cnt=16tidb_ddl_reorg_worker_cnt=32v6.1.03000w21min10min44s9min6s8min35sv6.5.03000w1min19s53s43s38sv7.1.03000w1min8s44s36s31s

从上面的数据我们可以看到,在V6.5.0版本中,默认4个worker下,建索引速度对比v6.1.0提升了约20倍,随着worker个数增加速度会进一步提升,但是越到后面提升越少,可能跟数据量有关系哈,测试结果仅供参考。

V7.1.0,相比V6.5.0也有一定提升,在worker为8和16的情况下,大约提升了16%((53-44)/53=0.1698,(43-36)/43=0.1628)。

另外,在V7.1.0下,测试了不同tidb_ddl_reorg_batch_size下的影响,也是会有一定提升。

版本表数据量tidb_ddl_reorg_batch_size=512(默认)tidb_ddl_reorg_batch_size=1024tidb_ddl_reorg_batch_size=2048v7.1.03000w1min5s58s56s

以上就是在同一台机器上测试不同TiDB版本建索引速度的对比。V6.3开始新增的索引加速功能确实提升非常大,还在用老版本的同学,快升级测试下吧。

版权声明:本文内容由网络用户投稿,版权归原作者所有,本站不拥有其著作权,亦不承担相应法律责任。如果您发现本站中有涉嫌抄袭或描述失实的内容,请联系我们jiasou666@gmail.com 处理,核实后本网站将在24小时内删除侵权内容。

上一篇:TiDB 7.1 资源管控验证测试
下一篇:tidb之旅——生成列
相关文章