DM 数据迁移分库分表准备过程探讨

网友投稿 678 2024-03-13

一、概述

分库分表的悲观协调方法是 2018 年开发的特性，是 DM 首次支持 MySQL 分库分表的迁移。由于分库分表在各个公司中的应用实在太过广泛，所以只有在支持分库分表迁移后，DM 才有了工程实践的意义，否则只能算作一个玩具。这对 DM 来说意义重大。

DM 数据迁移分库分表准备过程探讨

由于悲观协调的内容庞大，本节只讲述悲观协调的准备过程：

Master 的准备过程

Worker 的准备过程

以及在悲观协调过程中至关重要的结构体：

Lock

Info

Operation

ShardingMeta

ShardingGroup

...

注：为了专注于我们的目的（悲观协调），本文不会对无关代码进行解读

这一节外链中的代码，读者可能会产生这个逻辑为什么是这里的疑问。这是因为本节并没有完全按照顺序读代码。所以读者可以仅带着学习的心态阅读本节，暂时不需要知道它为什么在代码的这个地方。在下一节中会顺序阅读代码

本节基于 DM release-6.0.0

二、Master

1、Pessimist 和 LockKeeper

NewPessimist：Pessimist 是 Master 处理悲观协调的结构体，其中最重要的成员是 LockKeeper。

LockKeeper 的作用是管理各种各样的 lock。

// LockKeeper used to keep and handle DDL lock conveniently. // The lock information do not need to be persistent, and can be re-constructed from the shard DDL info. type LockKeeper struct { mu sync.RWMutex locks map[string]*Lock // lockID -> Lock }

Lock 可以看作是每一个 ddl 影响的库表，这些库表分布在不同的 MySQL/source 中。由于在悲观协调过程中，数据流（binlog 流）不是停滞的，所以在数据流中，会源源不断地出现新 DDL，即新 lock。

故 LockKeeper就是管理这些 lock。

// Lock represents the shard DDL lock in memory. // This information does not need to be persistent, and can be re-constructed from the shard DDL info. type Lock struct { mu sync.RWMutex ID string // locks ID Task string // locks corresponding task name Owner string // Owners source ID (not DM-workers name) DDLs []string // DDL statements remain int // remain count of sources needed to receive DDL info // whether the DDL info received from the source. // if all of them have been ready, then we call the lock `synced`. ready map[string]bool // whether the operations have done (exec/skip the shard DDL). // if all of them have done, then we call the lock `resolved`. done map[string]bool }

2、Lock

ID 表现形式为 ${task}-${schema}.${table}，从 LockKeeper 的结构可以看到同一任务同一表下只能有一个 lock，如果遇到了不同的 DDL，影响的又是同任务同表，则报错。

DDLs 为什么是 []string？因为这里是经过 Split 的 DDLs，单条 DDL 可能会被 Split 成多条 DDL。

Remain 表示剩余的还未收到 DDL 的 source/worker（以下 worker 都表示 source/worker 对）数量。即如果 worker 把这个 Lock 相关的 DDLs（实际上是 info 的形式）发给 master，master 则把remain --。

Ready 和 remain 表达相同的意思，remain 即表示 ready 中为 true 的个数

Done 表示 operation 结束的 worker 的状态

注意：下面说的所有 worker 都 ready 即是 remain <= 0，代码中即是 synced = true。这里创造的概念过多，应该收束一下，减少代码的理解难度。

TODO：可以把代码中的 remain 改成 noReadies；synced 改成 allReady

3、Info

// Info represents the shard DDL information. // This information should be persistent in etcd so can be retrieved after the DM-master leader restarted or changed. // NOTE: `Task` and `Source` are redundant in the etcd key path for convenient. type Info struct { Task string `json:"task"` // data migration task name Source string `json:"source"` // upstream source ID Schema string `json:"schema"` // schema name of the DDL Table string `json:"table"` // table name of the DDL DDLs []string `json:"ddls"` // DDL statements }

Info 并没有什么特别的地方，只是用作 worker 通知 master，某 lock ready 的消息。

4、Operation

// Operation represents a shard DDL coordinate operation. // This information should be persistent in etcd so can be retrieved after the DM-master leader restarted or changed. // NOTE: `Task` and `Source` are redundant in the etcd key path for convenient. type Operation struct { ID string `json:"id"` // the corresponding DDL lock ID Task string `json:"task"` // data migration task name Source string `json:"source"` // upstream source ID DDLs []string `json:"ddls"` // DDL statements Exec bool `json:"exec"` // execute or skip the DDL statements Done bool `json:"done"` // whether the `Exec` operation has done // only used to report to the caller of the watcher, do not marsh it. // if its true, it means the Operation has been deleted in etcd. IsDeleted bool `json:"-"` }

Exec：是 master 向 worker 发送消息时所用到的字段。它告知该 worker 是否需要执行 DDL，只需要 owner 执行 DDL

Done：是 worker 向 master 发送消息时所用到的字段。它告知 master 该 worker done 了。done 并不代表该 worker 执行了 DDL，而是表示一种状态，因为只有 owner 可以执行 DDL，non-owner done 可以当作 non-worker 接受到了 DDL 已执行的信息的确认。

3、Start

以下可以直接在代码中搜索对应的名字，请务必与对应代码一起阅读！

s.pessimist.Start

p.run

watchInfoOperation

pessimism.WatchInfoPut：等待 worker 发送 info

p.handleInfoPut：

收到 Worker 发送的 info 之后，意味着该 worker 已 ready

如果这个时候所有的 source/worker 都 ready（synced）了，则可以发送一个 exec operation 给该 lock 的 owner（第一个创建 lock 的 worker）

pessimism.WatchOperationPut：

等待 worker 发送 operation done。

这里的 operation 和上面发送的 operation 其实是一样的，这里监听的没有指定 task 和 source，上面发送的 operation 是指定 task 和 source 的。所以上面发送的 operation 这里也会监听到，

p.handleOperationPut：

收到 not done operation 这里并不进行操作。即上面发送 exec operation 的时候没有 done。

当收到 operation is done 时，才进行一系列操作。即从 worker 发送过来的 done operation。

首先 markDone

如果是 owner done 了，还会对 non-owner 发送 exec operation，督促 non-owner done

如果所有的 worker 都 done 了（即 resolved），则把 lock 释放

错误处理

三、Worker

1、Pessimist

// Pessimist used to coordinate the shard DDL migration in pessimism mode. typePessimiststruct { mu sync.RWMutex logger log.Logger cli *clientv3.Client task string source string // the shard DDL info which is pending to handle. pendingInfo *pessimism.Info // the shard DDL lock operation which is pending to handle. pendingOp *pessimism.Operation}

NewPessimist：worker 中的 pessimist 不同于 master 中的 pessimist。它只保存了 pendingInfo 和 pendingOp，用于保存当前的 Info 和 Operation

2、ShardingGroupKeeper

NewShardingGroupKeeper：其中最重要的结构体就是 shardingGroup。shardingGroup 和 master 中的 Lock 类似，也是用于管理 DDL 影响到的 table（在 Lock 中是 source/worker）。

// ShardingGroupKeeper used to keep ShardingGroup. type ShardingGroupKeeper struct { sync.RWMutex groups map[string]*ShardingGroup // target table ID -> ShardingGroup cfg *config.SubTaskConfig shardMetaSchema string shardMetaTable string shardMetaTableName string db *conn.BaseDB dbConn *dbconn.DBConn tctx*tcontext.Context }

3、ShardingGroup

// ShardingGroup represents a sharding DDL sync group. type ShardingGroup struct { sync.RWMutex // remain count waiting for syncing // == len(sources): DDL syncing not started or resolved // == 0: all DDLs synced, will be reset to len(sources) after resolved combining with other dm-workers // (0, len(sources)): waiting for syncing // NOTE: we can make remain to be configurable if needed remain int sources map[string]bool // source table ID -> whether source tables DDL synced IsSchemaOnly bool // whether is a schema (database) only DDL TODO: zxc add schema-level syncing support later sourceID string // associate dm-worker source ID meta *shardmeta.ShardingMeta // sharding sequence meta storage firstLocation *binlog.Location // first DDLs binlog pos and gtid, used to restrain the global checkpoint when un-resolved firstEndLocation *binlog.Location // first DDLs binlog End_log_pos and gtid, used to re-direct binlog streamer after synced ddls []string // DDL which current in syncing flavor string enableGTID bool }

Remain 和 Lock 中的 remain 意义完全一样，但是这里指 table 是否 ready（现代码中用的 synced）

whether source tables DDL synced

TODO：应该统一概念，这里也用 ready 表示

sources：和 Lock 中的 ready 类似

firstLocation/firstEndLocation/ddls：这三个其实在 meta 中都有保存，所以感觉其实没啥用

meta：里面保存着所有的有用的信息，下面详细说

4、ShardingMeta

这些信息会被持久化到磁盘中

// ShardingMeta stores sharding ddl sequence // including global sequence and each sources own sequence // NOTE: sharding meta is not thread safe, it must be used in thread safe context. typeShardingMetastruct { activeIdx int // the first unsynced DDL index global *ShardingSequence // merged sharding sequence of all source tables sources map[string]*ShardingSequence // source table ID -> its sharding sequence tableName string // table name (with schema) used in downstream meta db enableGTID bool // whether enableGTID, used to compare location }

activeIdx：表示当前活跃的 DDL 下标，即现在被哪个 DDL 卡住了

global：表示全局的 DDL 序列，即所有 table 中最长的

sources：表示各个 table 的 DDL 序列

5、ShardingSequence 和 DDLItem

// ShardingSequence records a list of DDLItem. type ShardingSequence struct { Items []*DDLItem `json:"items"` }// DDLItem records ddl information used in sharding sequence organization. type DDLItem struct { FirstLocation binlog.Location `json:"-"` // first DDLs binlog Pos, not the End_log_pos of the event DDLs []string `json:"ddls"` // DDLs, these ddls are in the same QueryEvent Source string `json:"source"` // source table ID // just used for jsons marshal and unmarshal, because gtid.Set in FirstLocation is interface, // cant be marshal and unmarshal FirstPosition mysql.Position `json:"first-position"` FirstGTIDSet string `json:"first-gtid-set"` }

这里的 DDLItem 和 Lock 中的 []DDL 其实是一样的，可以看到一个 DDLItem 中也包含多个 DDL。但是这里的 DDLItem 中封装了更多的信息：

DDLs []string `json:"ddls"` // DDLs, these ddls are in the same QueryEvent

FirstLocation：DDL 开始的位点，主要用于识别该 DDL 之前是否来过了。

6、Init

s.sgk.Init()：在下游创建库表 dm_meta.($task)_syncer_sharding_meta

s.initShardingGroups

TODO：可用 utils.FetchTargetDoTables() 替换 fromDB.FetchAllDoTables

s.sgk.LoadShardMeta：把数据读出来，存到内存里

s.sgk.AddGroup：把 ShardingGroup 恢复

Reset

7、Start

启动 syncDDL 线程，等待 DDL job

四、总结

本小节主要介绍了在悲观协调中：

会用到的各种数据结构

Master 运行了几个线程，等待 worker 中信息的来临

Worker 运行了 syncDDL 线程，等待着 DDL 到来

这一节中，罗列了特别多的概念，看不懂是正常的！在下一节，将从 worker 接受到一条 DDL 开始，直到这条 DDL 的 lock 解除的过程。在这个过程中，我们来学习这些结构体到底是怎么用的。

标签：数据迁移 MySQL 分库分表

DM 数据迁移分库分表准备过程探讨

一、概述

二、Master

1、Pessimist 和 LockKeeper

2、Lock

3、Info

4、Operation

3、Start

三、Worker

1、Pessimist

2、ShardingGroupKeeper

3、ShardingGroup

4、ShardingMeta

5、ShardingSequence 和 DDLItem

6、Init

7、Start

四、总结

记录 TiDB v5.2.3 迁移到 v6.1.0 的升级过程

解锁 DM 数据迁移旅程分库分表协调难题

将 AWS S3 数据迁移到 TiDB Cloud 集群的实践

推荐文章

HTAP 还可以这么玩？丨TiDB 在 IoT 智慧园区的应用

新特性解析丨TiDB 资源管控的设计思路与场景解析

TiDB赋能保险业-首个全栈自主核心保单系统成功投产

首个云原生、分布式、全栈国产化银行核心业务系统投产上线丨TiDB × 杭州银行

TiDB 在社交场景的解决方案实践

电商数据技术栈，在海量数据增长下如何实现实时与全量兼得？

金融行业数据库的选择

TiDB 在智能制造中的应用实践

TiDB 在全球头部物流企业计费管理系统的应用实践

PingCAP与教育部教育管理信息中心合作，推动普惠教育数字化转型

友情链接

热评文章

TiDB 中标杭州银行核心系统数据库项目

TiDB 首批通过信通院 HTAP 数据库基础能力评

PingCAP 与 Wisconsin-Madiso

PingCAP 成为中国唯一入选 Forrester

TiDB 走进东软集团，共建医疗数字化基石

共享开源技术，共建开放生态丨平凯星辰余梦杰出席 20

DM 数据迁移分库分表准备过程探讨

一、概述

二、Master

1、Pessimist 和 LockKeeper

2、Lock

3、Info

4、Operation

3、Start

三、Worker

1、Pessimist

2、ShardingGroupKeeper

3、ShardingGroup

4、ShardingMeta

5、ShardingSequence 和 DDLItem

6、Init

7、Start

四、总结

微信扫一扫：分享

推荐文章

友情链接

热评文章