黄东旭解析 TiDB 的核心优势
348
2024-03-30
定义:TIDB的资源管控 (Resource Control) ,使用资源管控特性,将用户绑定到某个资源组后,TiDB 层会根据用户 所绑定资源组设定的配额对用户的读写请求做流控,TiKV 层会根据配额映射的优先级来对请求做调度。通过 流控和调度这两层控制,可以实现应用的资源隔离,满足服务质量 (QoS) 要求。
个人理解:resource manager在于我的理解,最大的收益就是资源整合,降低数据库的使用成本,尤其对于企业db层被paas平台化后的性价比提升。多租户应用程序,多业务系统资源整合,实现并发访问控制,每个租户配置自己的隔离组,确保资源使用容量不相互干扰,这有助于优化资源利用率并降低成本。除此之外还适用这些场景:读写分离控制;测试开发混部;批处理夜维与工作日资源控制。
通过 TiUP 部署 TiDB v7.1.0 (略)
开启tidb的reource control(默认开启),并在进行资源规划之前,使用 CALIBRATE RESOURCE 估算集群容量。
使用默认test数据库、oltp_user及oltp_rg资源组,使用tiup bench tpcc测试方法,生成测试数据,模拟日常的oltp交易场景
创建olap数据库、olap_user及olap_rg资源组,使用tiup bench tpch测试方法,生成测试数据,模拟日常的olap统计场景
分别启动 tpcc/tpch测试。tpcc测试 20 分钟,tpch测试 20 轮。
在dashboard观察资源隔离效果,并简单总结测试。
[root@vlan99 ~]# mpstat -P ALL|head -1 Linux 3.10.0-957.el7.x86_64 (vlan99) 06/09/2023 x86_64 (8 CPU) [root@vlan99 ~]# free -m total used free shared buff/cache available Mem: 15884 12050 290 793 3544 2710<截图dashboard的主机面板>
3.2 确认reource control开启状态并估算容量MySQL [(none)]> show variables like %tidb_enable_resource_control%; +------------------------------+-------+ | Variable_name | Value | +------------------------------+-------+ | tidb_enable_resource_control | ON | +------------------------------+-------+ 1 row in set (0.00 sec) MySQL [(none)]> calibrate resource workload oltp_read_write; +-------+ | QUOTA | +-------+ | 14886 | +-------+ 1 row in set (0.01 sec)
MySQL [(none)]> create resource group if not exists oltp_rg ru_per_sec=5000; Query OK, 0 rows affected (0.15 sec) MySQL [(none)]> create resource group if not exists olap_rg ru_per_sec=2000; Query OK, 0 rows affected (0.15 sec) MySQL [(none)]> create user oltp_user identified by oltp_user resource group oltp_rg; Query OK, 0 rows affected (0.18 sec) MySQL [(none)]> create user olap_user identified by olap_user resource group olap_rg; Query OK, 0 rows affected (0.03 sec) MySQL [(none)]> grant all on . to oltp_user; Query OK, 0 rows affected (0.03 sec) MySQL [(none)]> grant all on . to olap_user; Query OK, 0 rows affected (0.03 sec) MySQL [(none)]> select user,host,user_attributes from mysql.user; +-----------+-----------+-------------------------------+ | user | host | user_attributes | +-----------+-----------+-------------------------------+ | root | % | NULL | | root | localhost | NULL | | oltp_user | % | {"resource_group": "oltp_rg"} | | olap_user | % | {"resource_group": "olap_rg"} | +-----------+-----------+-------------------------------+ 4 rows in set (0.00 sec)
4.2 TPCC测试数据准备[root@vlan99 ~]# tiup bench tpcc --warehouses 1 --parts 1 prepare tiup is checking updates for component bench ...timeout(2s)! Starting component bench: /root/.tiup/components/bench/v1.12.0/tiup-bench tpcc --warehouses 1 --parts 1 prepare creating table warehouse creating table district creating table customer 略 begin to check warehouse 1 at condition 3.3.2.9 begin to check warehouse 1 at condition 3.3.2.12 Finished
4.3 TPCH测试数据准备[root@vlan99 ~]# time tiup bench tpch --sf=1 prepare tiup is checking updates for component bench ...timeout(2s)! Starting component bench: /root/.tiup/components/bench/v1.12.0/tiup-bench tpch --sf=1 prepare creating nation creating region 略 generate orders/lineitem tables done Finished
tpcc测试 20 分钟,tpch测试 20 轮。
5.1 tpcc测试结果tiup bench tpcc --warehouses 1 --time 20m --user=oltp_user -poltp_user run
TPC-C 使用 tpmC 值 (Transactions per Minute) 来衡量系统最大有效吞吐量 (MQTh, Max Qualified Throughput),其中 Transactions 以 NewOrder Transaction 为准,即最终衡量单位为每分钟处理的新订单数。
[Summary] DELIVERY - Takes(s): 599.2, Count: 2376, TPM: 237.9, Sum(ms): 100035.8, Avg(ms): 42.1, 50th(ms): 41.9, 90th(ms): 50.3, 95th(ms): 52.4, 99th(ms): 62.9, 99.9th(ms): 104.9, Max(ms): 125.8 [Summary] NEW_ORDER - Takes(s): 599.7, Count: 26849, TPM: 2686.3, Sum(ms): 311837.6, Avg(ms): 11.6, 50th(ms): 11.5, 90th(ms): 14.2, 95th(ms): 15.2, 99th(ms): 19.9, 99.9th(ms): 44.0, Max(ms): 369.1 [Summary] ORDER_STATUS - Takes(s): 599.5, Count: 2436, TPM: 243.8, Sum(ms): 11000.2, Avg(ms): 4.5, 50th(ms): 4.7, 90th(ms): 5.8, 95th(ms): 6.8, 99th(ms): 9.4, 99.9th(ms): 29.4, Max(ms): 62.9 [Summary] PAYMENT - Takes(s): 599.9, Count: 25509, TPM: 2551.4, Sum(ms): 152591.7, Avg(ms): 6.0, 50th(ms): 5.8, 90th(ms): 7.3, 95th(ms): 8.4, 99th(ms): 11.5, 99.9th(ms): 32.5, Max(ms): 125.8 [Summary] PAYMENT_ERR - Takes(s): 599.9, Count: 1, TPM: 0.1, Sum(ms): 2.2, Avg(ms): 2.4, 50th(ms): 2.6, 90th(ms): 2.6, 95th(ms): 2.6, 99th(ms): 2.6, 99.9th(ms): 2.6, Max(ms): 2.6 [Summary] STOCK_LEVEL - Takes(s): 599.5, Count: 2343, TPM: 234.5, Sum(ms): 16923.3, Avg(ms): 7.2, 50th(ms): 7.3, 90th(ms): 8.9, 95th(ms): 9.4, 99th(ms): 12.1, 99.9th(ms): 17.8, Max(ms): 21.0 tpmC: 2686.3, tpmTotal: 5953.9, efficiency: 20888.9%
5.2 tpch测试结果[root@vlan99 ~]# tiup bench tpch --count=20 --sf=1 --user=olap_user -polap_user --db=olap run tiup is checking updates for component bench ...timeout(2s)! Starting component bench: /root/.tiup/components/bench/v1.12.0/tiup-bench tpch --count=20 --sf=1 --user=olap_user -polap_user --db=olap run [Current] Q1: 5.27s [Current] Q2: 1.71s 略[Summary] Q7: 1.78s [Summary] Q8: 1.71s [Summary] Q9: 3.12s
dashboard观察资源隔离效果
6.1 tpcc(oltp用户)测试结果,可以限制到指定的5000RU 6.2 tpch(olap用户)测试结果,可以限制到指定的2000RU优点:
v7.1版本的resource control资源管控,将cpu/内存/io的物理资源抽象成RU的感念,简化了使用方式,并且测试中发现该特性可以正常限制住RU,达到资源控制的效果。
改进方向:
不能实时限制,需要用户重新登陆。
resource语句整合到createuser中。
建议:最好能用百分比控制RU。
整合session/sql的并发限制/执行时长/idle time等资源到resource control中。
版权声明:本文内容由网络用户投稿,版权归原作者所有,本站不拥有其著作权,亦不承担相应法律责任。如果您发现本站中有涉嫌抄袭或描述失实的内容,请联系我们jiasou666@gmail.com 处理,核实后本网站将在24小时内删除侵权内容。