Mycat2 全局二級索引

2021-09-09 15:23 更新

為了解決分庫分表中間件在路由分片規(guī)則難以映射多個分片維度的問題,Mycat2使用分片表數(shù)據(jù)冗余方案。使用不同與原分片表的分片算法,建立原分片表的列子集甚至全集的分片表,這類分片表在Mycat2中稱為索引表.

約束

一個原分片表可以建立多個索引表,是一對多關(guān)系。

原分片表必須有主鍵,主鍵用于建立索引表與主表的行關(guān)系。

索引表也必須有主鍵,原分片表的分片鍵:

  • 當使用索引表后,建議在插入語句中,需要顯式帶有原分片表和索引表的分片的值.
  • 當使用索引表后,不建議對分片表進行頻繁更新操作,因為會觸發(fā)索引表更新.在帶有條件的update或者delete語句下,一次更新不能超過1000行.而不帶條件下的delete語句則沒有這個限制.

Mycat僅保證在更新.插入語句中自動使用XA事務(在XA事務模式下),保證數(shù)據(jù)一致性。

在使用分片表途中建立全局二級索引,Mycat2僅僅是更改配置,而不負責數(shù)據(jù)同步,因為暫時缺乏schema變更的數(shù)據(jù)同步組件

在使用全局二級索引后,能有效減少全表掃描,對于減少連接使用,減少計算節(jié)點與存儲節(jié)點的數(shù)據(jù)傳輸有幫助.

自動提交下如果使用XA事務模式,涉及多個節(jié)點,自動開啟XA事務,如果遇上插入失敗,則隱式自動嘗試回滾XA事務,建議顯式啟動事務進行數(shù)據(jù)插入,修改,且插入值不要涉及SQL函數(shù)調(diào)用避免運行時報錯

如果僅涉及一個mysql為主節(jié)點,那么可以使用proxy事務

自動提交下如果使用Proxy(本地)事務模式,涉及多個節(jié)點,如果遇上提交失敗,則無法回滾已經(jīng)成功插入數(shù)據(jù)的連接.需要業(yè)務系統(tǒng)顯式編寫刪除數(shù)據(jù)的代碼

所以建議使用XA事務使用

具體的執(zhí)行語句可以通過explain語句查看

建立全局二級索引有三種辦法

1.在建表語句中帶有全局二級索引信息

CREATE TABLE IF NOT EXISTS db1.`travelrecord` (
    `id` bigint NOT NULL AUTO_INCREMENT,
    `user_id` varchar(100) DEFAULT NULL,
    `traveldate` date DEFAULT NULL,
    `fee` decimal(10, 0) DEFAULT NULL,
    `days` int DEFAULT NULL,
    `blob` longblob,
    PRIMARY KEY (`id`),
    KEY `id` (`id`),
    GLOBAL INDEX `g_i_user_id`(`user_id`) COVERING (`fee`, id) DBPARTITION BY mod_hash(`user_id`) TBPARTITION BY mod_hash(`user_id`) DBPARTITIONS 2 TBPARTITIONS 2
) ENGINE = InnoDB CHARSET = utf8
DBPARTITION BY mod_hash(id) DBPARTITIONS 2
TBPARTITION BY mod_hash(id) TBPARTITIONS 2

該全局二級索引的語法是

GLOBAL INDEX `g_i_user_id`(`user_id`) COVERING (`fee`, id) DBPARTITION BY mod_hash(`user_id`) TBPARTITION BY mod_hash(`user_id`) DBPARTITIONS 2 TBPARTITIONS 2

g_i_user_id是索引名字,它不能與mysql中的索引名重復

其后的user_id是索引列(字段)名字,在這里它就是分片字段,必須與分片字段相同

COVERING后面的是覆蓋列,冗余的信息,用于減少掃描原分片表或者優(yōu)化為不掃描原分片表,

索引列與覆蓋列必須包含原分片表的主鍵,原分片字段,否則無法進行數(shù)據(jù)插入,無法正確路由.

DBPARTITION BY mod_hash(`user_id`) TBPARTITION BY mod_hash(`user_id`) DBPARTITIONS 2 TBPARTITIONS 2

此處配置與分片表的建表語句相同,不再重復.

2.在已有分片表中添加全局二級索引

//建立分片表
CREATE TABLE db1.`travelrecord` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `user_id` varchar(100) DEFAULT NULL,
  `traveldate` date DEFAULT NULL,
  `fee` decimal(10,0) DEFAULT NULL,
  `days` int DEFAULT NULL,
  `blob` longblob,
  PRIMARY KEY (`id`),
  KEY `id` (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 dbpartition by mod_hash(id) tbpartition by mod_hash(id) tbpartitions 2 dbpartitions 2;


//建立全局二級索引
 CREATE UNIQUE GLOBAL INDEX `g_i_user_id` ON `db1`.`travelrecord`(`user_id`) 
    COVERING(`fee`,id) 
    dbpartition by mod_hash(`user_id`) tbpartition by mod_hash(`user_id`) dbpartitions 2 tbpartitions 2

3.在配置文件中自定義索引表

{
    "customTables":{},
    "globalTables":{},
    "normalTables":{},
    "schemaName":"db1",
    "shardingTables":{
        "travelrecord":{
            "createTableSQL":"CREATE TABLE IF NOT EXISTS db1.`travelrecord` (\n\t`id` bigint NOT NULL AUTO_INCREMENT,\n\t`user_id` varchar(100) DEFAULT NULL,\n\t`traveldate` date DEFAULT NULL,\n\t`fee` decimal(10, 0) DEFAULT NULL,\n\t`days` int DEFAULT NULL,\n\t`blob` longblob,\n\tPRIMARY KEY (`id`),\n\tKEY `id` (`id`),\n\tGLOBAL INDEX `g_i_user_id`(`user_id`) COVERING (`fee`, id) DBPARTITION BY mod_hash(`user_id`) TBPARTITION BY mod_hash(`user_id`) DBPARTITIONS 2 TBPARTITIONS 2\n) ENGINE = InnoDB CHARSET = utf8\nDBPARTITION BY mod_hash(id) DBPARTITIONS 2\nTBPARTITION BY mod_hash(id) TBPARTITIONS 2",
            "function":{
                "properties":{
                    "dbNum":"2",
                    "mappingFormat":"c${targetIndex}/db1_${dbIndex}/travelrecord_${index}",
                    "tableNum":"2",
                    "tableMethod":"mod_hash(id)",
                    "storeNum":2,
                    "dbMethod":"mod_hash(id)"
                }
            },
            "shardingIndexTables":{
                "travelrecord_g_i_user_id":{
                    "createTableSQL":"CREATE TABLE IF NOT EXISTS db1.travelrecord_g_i_user_id (\n\t`user_id` varchar(100) DEFAULT NULL,\n\t`fee` decimal(10, 0) DEFAULT NULL,\n\t`id` bigint NOT NULL AUTO_INCREMENT,\n\tPRIMARY KEY (`id`)\n)\nDBPARTITION BY mod_hash(`user_id`)\nTBPARTITION BY mod_hash(`user_id`) TBPARTITIONS 2",
                    "function":{
                        "properties":{
                            "dbNum":"2",
                            "mappingFormat":"c${targetIndex}/db1_${dbIndex}/travelrecord_g_i_user_id_${index}",
                            "tableNum":"2",
                            "tableMethod":"mod_hash(`user_id`)",
                            "storeNum":2,
                            "dbMethod":"mod_hash(`user_id`)"
                        }
                    }
                }
            }
        }
    },
    "views":{}
}

shardingIndexTablesshardingTables中元素的一個屬性,travelrecord_g_i_user_id的配置與分片表的配置沒有區(qū)別

以下是mycat2中索引表的表名自動生成規(guī)則

CREATE UNIQUE GLOBAL INDEX `g_i_user_id` ON `db1`.`travelrecord`(`user_id`) 
    COVERING(`fee`,id) 
    dbpartition by mod_hash(`user_id`) tbpartition by mod_hash(`user_id`) dbpartitions 2 tbpartitions 2

索引表名 = 原分片表名 + "_" + 索引名 索引庫名 = 原分庫名

所以索引名就是索引表名去掉(原分片表名+"_")

例子

例子1 查詢原分片表,分片鍵是id

select * from db1.travelrecord where id = 1


MycatView(distribution=[[db1.travelrecord]], conditions=[=($0, CAST(?0):BIGINT NOT NULL)])
Each(targetName=c0, sql=SELECT * FROM db1_0.travelrecord_1 AS `travelrecord` WHERE (`travelrecord`.`id` = ?))

例子2 查詢索引表,分片鍵是user_id

select * from db1.travelrecord where user_id = 1


MycatProject(id=[$0], user_id=[$1], traveldate=[$3], fee=[$2], days=[$4], blob=[$5])
  MycatSQLTableLookup(condition=[=($0, $7)], joinType=[inner], type=[BACK], correlationIds=[[$cor0]], leftKeys=[[0]])
    MycatView(distribution=[[db1.travelrecord_g_i_user_id]], conditions=[=($0, ?0)])
    MycatView(distribution=[[db1.travelrecord]])

    
    //查詢原分片表
Each(targetName=c0, sql=SELECT * FROM (SELECT `travelrecord`.`traveldate`, `travelrecord`.`days`, `travelrecord`.`blob`, `travelrecord`.`user_id`, `travelrecord`.`id`         FROM db1_0.travelrecord_0 AS `travelrecord`         WHERE (`travelrecord`.`user_id` = ?)) AS `t0` WHERE ((`t0`.`id`) IN ($cor0)) union all SELECT * FROM (SELECT `travelrecord`.`traveldate`, `travelrecord`.`days`, `travelrecord`.`blob`, `travelrecord`.`user_id`, `travelrecord`.`id`         FROM db1_0.travelrecord_1 AS `travelrecord`         WHERE (`travelrecord`.`user_id` = ?)) AS `t0` WHERE ((`t0`.`id`) IN ($cor0)))


    //查詢原分片表
Each(targetName=c1, sql=SELECT * FROM (SELECT `travelrecord`.`traveldate`, `travelrecord`.`days`, `travelrecord`.`blob`, `travelrecord`.`user_id`, `travelrecord`.`id`         FROM db1_1.travelrecord_2 AS `travelrecord`         WHERE (`travelrecord`.`user_id` = ?)) AS `t0` WHERE ((`t0`.`id`) IN ($cor0)) union all SELECT * FROM (SELECT `travelrecord`.`traveldate`, `travelrecord`.`days`, `travelrecord`.`blob`, `travelrecord`.`user_id`, `travelrecord`.`id`         FROM db1_1.travelrecord_3 AS `travelrecord`         WHERE (`travelrecord`.`user_id` = ?)) AS `t0` WHERE ((`t0`.`id`) IN ($cor0)))


    //查詢索引表
Each(targetName=c0, sql=SELECT `travelrecord_g_i_user_id`.`id`, `travelrecord_g_i_user_id`.`user_id`, `travelrecord_g_i_user_id`.`fee` FROM db1_0.travelrecord_g_i_user_id_1 AS `travelrecord_g_i_user_id` WHERE (`travelrecord_g_i_user_id`.`user_id` = ?))

其中查詢原分片表的sql可以在執(zhí)行器中進一步進行分區(qū)剪裁(因為主鍵是分片鍵),實際上執(zhí)行SQL只有兩條

例子3 查詢索引表,分片鍵是user_id且只查詢覆蓋列

select fee from db1.travelrecord where user_id = 1


MycatView(distribution=[[db1.travelrecord_g_i_user_id]], conditions=[=($0, ?0)])
Each(targetName=c0, sql=SELECT `travelrecord_g_i_user_id`.`fee` FROM db1_0.travelrecord_g_i_user_id_1 AS `travelrecord_g_i_user_id` WHERE (`travelrecord_g_i_user_id`.`user_id` = ?))

可以看出只有一條SQL

例子4 使用全局二級索引注釋(2021-7-26后)

 EXPLAIN SELECT * FROM db1.travelrecord FORCE INDEX(g_i_user_id) WHERE user_id =1 


plan
MycatProject(id=[$0], user_id=[$1], traveldate=[$3], fee=[$2], days=[$4], blob=[$5])
  MycatHashJoin(condition=[=($0, $7)], joinType=[inner])
    MycatView(distribution=[[db1.travelrecord_g_i_user_id]], conditions=[=($0, ?0)])
    MycatView(distribution=[[db1.travelrecord]], conditions=[=(CAST($1):DOUBLE, CAST(?0):DOUBLE NOT NULL)])

    
Each(targetName=c0, sql=SELECT `travelrecord_g_i_user_id`.`id`, `travelrecord_g_i_user_id`.`user_id`, `travelrecord_g_i_user_id`.`fee` FROM db1_0.travelrecord_g_i_user_id_1 AS `travelrecord_g_i_user_id` WHERE (`travelrecord_g_i_user_id`.`user_id` = ?))
Each(targetName=c0, sql=SELECT `travelrecord`.`traveldate`, `travelrecord`.`days`, `travelrecord`.`blob`, `travelrecord`.`user_id`, `travelrecord`.`id` FROM db1_0.travelrecord_0 AS `travelrecord` WHERE (`travelrecord`.`user_id` = ?) union all SELECT `travelrecord`.`traveldate`, `travelrecord`.`days`, `travelrecord`.`blob`, `travelrecord`.`user_id`, `travelrecord`.`id` FROM db1_0.travelrecord_1 AS `travelrecord` WHERE (`travelrecord`.`user_id` = ?))
Each(targetName=c1, sql=SELECT `travelrecord`.`traveldate`, `travelrecord`.`days`, `travelrecord`.`blob`, `travelrecord`.`user_id`, `travelrecord`.`id` FROM db1_1.travelrecord_2 AS `travelrecord` WHERE (`travelrecord`.`user_id` = ?) union all SELECT `travelrecord`.`traveldate`, `travelrecord`.`days`, `travelrecord`.`blob`, `travelrecord`.`user_id`, `travelrecord`.`id` FROM db1_1.travelrecord_3 AS `travelrecord` WHERE (`travelrecord`.`user_id` = ?))

可以看出使用了travelrecord_g_i_user_id查詢

寫放大問題

explain insert db1.travelrecord (id,user_id) values(100,100)

對應實際執(zhí)行的sql是

VertxExecuter.EachSQL(target=c0, sql=INSERT INTO db1_0.travelrecord_0 (id, user_id)
VALUES (?, ?), params=[100, 100])
VertxExecuter.EachSQL(target=c0, sql=INSERT INTO db1_0.travelrecord_g_i_user_id_1 (user_id, fee, id)
VALUES (?, NULL, ?), params=[100, 100])

而在不配置索引表的情況下是

VertxExecuter.EachSQL(target=c0, sql=INSERT INTO db1_0.travelrecord_0 (id, user_id)
VALUES (?, ?), params=[100, 100])

盡管Mycat2會把插入語句以集群一個連接為鍵進行分組并行插入,但是還是會有一點性能開銷

以上內(nèi)容是否對您有幫助:
在線筆記
App下載
App下載

掃描二維碼

下載編程獅App

公眾號
微信公眾號

編程獅公眾號