51Testing软件测试论坛

 找回密码
 (注-册)加入51Testing

QQ登录

只需一步,快速开始

微信登录,快人一步

手机号码,快捷登录

查看: 2767|回复: 0
打印 上一主题 下一主题

[转贴] Mysql Fabric高可用集群分片功能测试

[复制链接]
  • TA的每日心情
    擦汗
    6 小时前
  • 签到天数: 1046 天

    连续签到: 4 天

    [LV.10]测试总司令

    跳转到指定楼层
    1#
    发表于 2016-3-21 13:21:16 | 只看该作者 回帖奖励 |倒序浏览 |阅读模式

    一、MySQL Fabric高可用集群中一台数据库崩溃了,不影响数据的完整性

    1.测试前准备

    a) 查看group_id-1集群组的服务器状态

    mysqlfabric group lookup_servers group_id-1

    返回结果:

    Command :
    { success = True
    return = [{'status': 'PRIMARY', 'server_uuid': '7a45f71d-7934-11e4-9e8c-782bcb74823a', 'mode': 'READ_WRITE', 'weight': 1.0, 'address': '192.168.1.71:3306'}, {'status': 'SECONDARY', 'server_uuid': '9cf162ca-7934-11e4-9e8d-782bcb1b6b98','mode': 'READ_ONLY', 'weight': 1.0, 'address': '192.168.1.76:3306'}, {'status': 'SECONDARY', 'server_uuid': 'ae94200b-7932-11e4-9e81-a4badb30e16b', 'mode': 'READ_ONLY', 'weight': 1.0, 'address': '192.168.1.70:3306'}]
    activities =}

    此时primary机为机器:192.168.1.71:3306,group_id-1集群组正常

    b) 查看group_id-1集群组中的数据

    mysql -P 3306 -h 192.168.1.76 -u root -e "select * from test.subscribers"
    mysql -P 3306 -h 192.168.1.71 -u root -e "select * from test.subscribers"
    mysql -P 3306 -h 192.168.1.70 -u root -e "select * from test.subscribers"
    返回结果如下:
    +--------+------------+-----------+
    | sub_no | first_name | last_name |
    +--------+------------+-----------+
    | 500 | Billy | Joel |
    | 1500 | Arthur | Askey |
    | 5000 | Billy | Fish |
    | 17542 | Bobby | Ball |
    | 22 | Billy | Bob |
    | 8372 | Banana | Man |
    | 93846 | Bill | Ben |
    | 15050 | John | Smith |
    +--------+------------+-----------+

    c) 利用python接口查看数据库test的数据表subscribers分片后的全部数据

    python read_table_ha.py

    返回结果如下:

    (u'Billy', u'Bob')
    (u'Billy', u'Fish')
    (u'Billy', u'Joel')
    (u'Arthur', u'Askey')
    (u'Banana', u'Man')
    (u'Billy', u'Fish')
    (u'Bill', u'Ben')
    (u'Jimmy', u'White')
    (u'John', u'Smith')
    (u'Bobby', u'Ball')

    2.开始测试

    a) 激活故障自动切换

    即使Fabric选出了master角色,但当这个master角色宕机时,fanric不会自动将secondary角色切换为master角色,所以需要将HA集群组的配置改为自动切换角色

    mysqlfabric group activate group_id-1

    b) 停止group_id-1组中的master角色,即数据库实例192.168.1.71:3306,再查看fabric的状态

    mysqlfabric group lookup_servers group_id-1

    返回结果:

    Command :
    { success = True
    return = [{'status': 'FAULTY', 'server_uuid': '7a45f71d-7934-11e4-9e8c-782bcb74823a', 'mode': 'READ_WRITE', 'weight': 1.0, 'address': '192.168.1.71:3306'}, {'status': 'SECONDARY', 'server_uuid': '9cf162ca-7934-11e4-9e8d-782bcb1b6b98', 'mode': 'READ_ONLY', 'weight': 1.0, 'address': '192.168.1.76:3306'}, {'status': 'PRIMARY', 'server_uuid': 'ae94200b-7932-11e4-9e81-a4badb30e16b', 'mode': 'READ_WRITE', 'weight': 1.0, 'address': '192.168.1.70:3306'}]
    activities =}

    其中,数据库实例192.168.1.71:3306的状态为FAULTY,mysql fabric自动检测到了HA组的故障,并自动选举了slave角色为primary角色

    c) 查看group_id-1集群组中的数据

    mysql -P 3306 -h 192.168.1.76 -u root -e "select * from test.subscribers"
      mysql -P 3306 -h 192.168.1.70 -u root -e "select * from test.subscribers"

    返回结果:

    +--------+------------+-----------+
    | sub_no | first_name | last_name |
    +--------+------------+-----------+
    | 500 | Billy | Joel |
    | 1500 | Arthur | Askey |
    | 5000 | Billy | Fish |
    | 17542 | Bobby | Ball |
    | 22 | Billy | Bob |
    | 8372 | Banana | Man |
    | 93846 | Bill | Ben |
    | 15050 | John | Smith |
    +--------+------------+-----------+

     d) 利用python接口查看数据库test的数据表subscribers分片后的全部数据

    python read_table_ha.py

    返回结果:

     (u'Billy', u'Bob')
      (u'Billy', u'Fish')
      (u'Billy', u'Joel')
      (u'Arthur', u'Askey')
      (u'Banana', u'Man')
      (u'Billy', u'Fish')
      (u'Bill', u'Ben')
      (u'Jimmy', u'White')
      (u'John', u'Smith')
      (u'Bobby', u'Ball')

    结论:由以上测试返回数据表明,当HA组(group_id-1)中的master角色崩溃,mysql fabric会自动选举一个slave角色为master,并可正常查询原来的数据,因此当mysql fabric高可用集群中某一台数据库崩溃掉,并不影响数据的完整性

    PS:手工恢复FAULT状态的数据库实例

    mysql fabric高可用集群中的服务器状态有四种:primary、secondary、faulty、spare

    关闭primary,primary不能从组中移除,需要使用mysqlfabric group demote group_id-1来关闭组中的primary,不会重新选举一个新的master,同时也不会关闭故障检测

    a) 当其中的一个数据库实例崩溃,可以将这个数据库实例从组中删除,启动这个数据库实例后再重新添加,例如:

    mysqlfabric group remove group_id-1 7a45f71d-7934-11e4-9e8c-782bcb74823a
      mysqlfabric group add group_id-1 192.168.1.71:3306
      mysqlfabric group lookup_servers group_id-1

    返回结果:

    Command :
      { success = Truereturn = [{'status': 'SECONDARY', 'server_uuid': '7a45f71d-7934-11e4-9e8c-782bcb74823a', 'mode': 'READ_ONLY','weight': 1.0, 'address': '192.168.1.71:3306'}, {'status': 'SECONDARY', 'server_uuid': '9cf162ca-7934-11e4-9e8d-782bcb1b6b98', 'mode': 'READ_ONLY', 'weight': 1.0, 'address': '192.168.1.76:3306'}, {'status': 'PRIMARY', 'server_uuid': 'ae94200b-7932-11e4-9e81-a4badb30e16b','mode': 'READ_WRITE', 'weight': 1.0, 'address': '192.168.1.70:3306'}] activities =}

    b) 当HA组中全部的服务器宕机,实例重启后,fabric的状态不会自动恢复,需要先执行命令:mysqlfabric group demote group_id-1,因为不能直接修改server的状态由faulty为secondary,需要将状态修改为spare,在改为secondary,如下:

    mysqlfabric group demote group_id-1
      mysqlfabric group deactivate group_id-1
      mysqlfabric server set_status 7a45f71d-7934-11e4-9e8c-782bcb74823a spare
      mysqlfabric server set_status 7a45f71d-7934-11e4-9e8c-782bcb74823a secondary

    ###当所有的数据库实例都为secondary状态时,可选举master角色,命令如下:

    mysqlfabric group promote group_id-1

    二、MySQL Fabric高可用集群存储均衡

    1.测试前准备

    a) 测试脚本

    cat test_add_subs_shards.py
    import mysql.connector
    from mysql.connector import fabric
    import math
    def add_subscriber(conn, sub_no, first_name, last_name):
    conn.set_property(tables=["test.subscribers"], key=sub_no, \
    mode=fabric.MODE_READWRITE)
    cur = conn.cursor()
    cur.execute(
    "INSERT INTO subscribers VALUES (%s, %s, %s)",
    (sub_no, first_name, last_name)
    )
    conn = mysql.connector.connect(
    fabric={"host" : "localhost", "port" : 32274, "username": "admin", \
    "password" : "admin"},
    user="root", database="test", password="",
    autocommit=True
    )
    conn.set_property(tables=["test.subscribers"], scope=fabric.SCOPE_LOCAL)
    for num in range(10):
    add_subscriber(conn, "%s" % num, "k%s" % num, "kw%s" % num)

    2.开始hash分片测试

    a)插入10条记录,当group_id-1组的三台服务器正常,group_id-2组只有一台服务器正常运作时,

    group_id-1组的数据如下:

    b) 测试插入1万条记录,group_id-1组和group_id-2组都有三台服务器正常运作

    分片后,group_id-1组插入的数据行数如下:

    mysql -P 3306 -h 192.168.1.70 -u root -e "select count(*) from test.subscribers"

    返回结果:

    +----------+
      | count(*) |
      +----------+
      | 7138 |
      +----------+

    分片后,group_id-2组插入的数据行数如下:

     mysql -P 3309 -h 192.168.1.76 -u root -e "select count(*) from test.subscribers"

    返回结果:

    +----------+
      | count(*) |
      +----------+
      | 2903 |
      +----------+

    c) 测试插入10万条记录,group_id-1组和group_id-2组都有三台服务器正常运作

    分片后,group_id-1组插入的数据行数如下:

    mysql -P 3306 -h 192.168.1.70 -u root -e "select count(*) from test.subscribers"

    返回结果:

    +----------+
      | count(*) |
      +----------+
      | 78719 |
      +----------+
      mysql -P 3309 -h 192.168.1.76 -u root -e "select count(*) from test.subscribers"

    返回结果:

    +----------+
      | count(*) |
      +----------+
      | 31321 |
      +----------+

    d) 测试插入100万条记录,group_id-1组和group_id-2组都有三台服务器正常运作

    分片后,group_id-1组插入的数据行数如下:

     mysql -P 3306 -h 192.168.1.70 -u root -e "select count(*) from test.subscribers"

    返回结果:

    +----------+
      | count(*) |
      +----------+
      | 794287 |
      +----------+

    分片后,group_id-2组插入的数据行数如下:

     mysql -P 3309 -h 192.168.1.76 -u root -e "select count(*) from test.subscribers"

    返回结果:

    +----------+
      | count(*) |
      +----------+
      | 315752 |
      +----------+

    添加一个新HA组(group_id-3),对HA组(group_id-1)进行分片,操作如下:

    a) 在机器:192.168.1.71 192.168.1.76 192.168.1.230上各起一个数据库实例,添加为组group_id-3

    mysqlfabric group add group_id-3 192.168.1.71:3313
      mysqlfabric group add group_id-3 192.168.1.76:3315
      mysqlfabric group add group_id-3 192.168.1.230:3317

    选举primary角色

    mysqlfabric group promote group_id-3

    b) 重置HA组group_id-3所有的数据库实例

    mysql -P 3313 -h 192.168.1.71 -u root -e "reset master"
      mysql -P 3315 -h 192.168.1.76 -u root -e "reset master"
      mysql -P 3317 -h 192.168.1.230 -u root -e "reset master"

     c) 查看HA组(group_id-1)的shard_id值,在储存机查看fabric数据库

    mysql -h 127.0.0.1 -P3306 -u root -e 'select * from fabric.shards'

    返回结果:

    +----------+------------+---------+
      | shard_id | group_id | state |
      +----------+------------+---------+
      | 4 | group_id-1 | ENABLED |
      | 5 | group_id-2 | ENABLED |
      +----------+------------+---------+

    d) 对HA组(group_id-1)进行分片分离,将

    mysqlfabric sharding split_shard 4 group_id-3

     ps:如果这里执行分片不成功,错误如下:

    Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'could not find next log; the first event '' at 4, the last event read from './binlog.000003' at 621, the last byte read from './binlog.000003' at 621.'

    解决方法:

    将master角色的数据dump出来,如下:

    mysqldump -h 192.168.1.76 -u root -P3315 --all-databases --flush-privileges --single-transaction --flush-logs --triggers --routines --events --hex-blob >/data/database/full_backup_master.sql

    停掉slave机

    mysql -P 3317 -h 192.168.1.230 -u root -e "stop slave"
      mysql -P 3317 -h 192.168.1.230 -u root -e "reset master"
      mysql -P 3317 -h 192.168.1.230 -u root </data/full_backup_master.sql
      mysql -P 3317 -h 192.168.1.230 -u root -e "start slave"

    再次执行分片,会报以下错误:

    Procedure :
      { uuid = 4da230c2-31c3-4242-bd88-ccafd51bfac1,
      finished = True,
      success = False,
      return = BackupError: ('Error while restoring the backup using the mysql client\n, %s', "ERROR 1840 (HY000) at line 24 in file: 'MySQL_192.168.1.76_3306.sql': @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.\n"),
      activities =}

    解决方法:

    在HA组(group_id-3)的全部数据库实例上执行“reset master”,再分片即可

    e) 现在HA组有group_id-1、group_id-2、group_id-3,尝试插入100条记录

    group_id-1的插入的数据行数有:

    mysql -P 3306 -h 192.168.1.70 -u root -e "select count(*) from test.subscribers where sub_no between 20000 and 20100"

    返回结果:

    +----------+
      | count(*) |
      +----------+
      | 70 |
      +----------+

    group_id-2、group_id-3的插入的数据行数有:

    mysql -P 3309 -h 192.168.1.76 -u root -e "select count(*) from test.subscribers where sub_no between 20000 and 20100"
      mysql -P 3313 -h 192.168.1.230 -u root -e "select count(*)from test.subscribers where sub_no between 20000 and 20100"

    返回结果:

    +----------+
      | count(*) |
      +----------+
      | 30 |
      +----------+

    用select语句可知,HA组group_id-2、group_id-3插入的数据是一样的

    3.hash分片测试结论

    在插入1万条记录时,HA组(group_id-1)插入了7138条记录,HA组(group_id-2)插入了2903条记录

    在插入10万条记录时,HA组(group_id-1)插入了78719条记录,HA组(group_id-2)插入了31321条记录

    在插入100万条记录时,HA组(group_id-1)插入了794287条记录,HA组(group_id-2)插入了315752条记录

    因此,对mysql fabric高集群进行hash分片后,分片比例为7:3,负载并不是十分均衡


    分享到:  QQ好友和群QQ好友和群 QQ空间QQ空间 腾讯微博腾讯微博 腾讯朋友腾讯朋友
    收藏收藏2
    回复

    使用道具 举报

    本版积分规则

    关闭

    站长推荐上一条 /1 下一条

    小黑屋|手机版|Archiver|51Testing软件测试网 ( 沪ICP备05003035号 关于我们

    GMT+8, 2024-11-14 15:46 , Processed in 0.070526 second(s), 23 queries .

    Powered by Discuz! X3.2

    © 2001-2024 Comsenz Inc.

    快速回复 返回顶部 返回列表