51Testing软件测试论坛
标题:
千万级数据分区存储方案测试
[打印本页]
作者:
lsekfe
时间:
2021-9-1 10:47
标题:
千万级数据分区存储方案测试
场景
基于项目中的业务场景,对数据量大,对时间比较敏感(比如历史数据基本不用,热点数据集中在近期)的数据采用分区存储。 本次测试流程主要是创建测试表、插入千万条数据(本次两千万),查看分区表执行情况。
创建表
CREATE TABLE test (
id BIGINT(20) auto_increment,
area_id INT (8) NOT NULL DEFAULT 0,
create_time datetime NOT NULL ,
num INT (8) NOT NULL DEFAULT 0,
PRIMARY KEY (id, area_id, create_time)
) ENGINE = INNODB DEFAULT CHARSET = utf8 PARTITION BY RANGE (TO_DAYS(create_time))(
PARTITION p0
VALUES
LESS THAN (TO_DAYS('2019-11-01')) ENGINE = INNODB,
PARTITION p20191101
VALUES
LESS THAN (TO_DAYS('2019-11-02')) ENGINE = INNODB,
PARTITION p20191102
VALUES
LESS THAN (TO_DAYS('2019-11-03')) ENGINE = INNODB,
PARTITION p20191103
VALUES
LESS THAN (TO_DAYS('2019-11-04')) ENGINE = INNODB,
PARTITION p20191104
VALUES
LESS THAN (TO_DAYS('2019-11-05')) ENGINE = INNODB,
PARTITION p20191105
VALUES
LESS THAN (TO_DAYS('2019-11-06')) ENGINE = INNODB,
PARTITION p20191106
VALUES
LESS THAN (TO_DAYS('2019-11-07')) ENGINE = INNODB,
PARTITION p20191107
VALUES
LESS THAN (TO_DAYS('2019-11-08')) ENGINE = INNODB,
PARTITION p20191108
VALUES
LESS THAN (TO_DAYS('2019-11-09')) ENGINE = INNODB,
PARTITION p20191109
VALUES
LESS THAN (TO_DAYS('2019-11-10')) ENGINE = INNODB
);
复制代码
创建存储过程
DELIMITER //
create PROCEDURE insert_test(in num INT, in time varchar(10))
BEGIN
DECLARE rowid INT DEFAULT 0;
DECLARE create_time datetime;
DECLARE area_id INT;
SET @exedata = "";
WHILE rowid < num DO
SET create_time = (select str_to_date(CONCAT(time,' ',LPAD(FLOOR(0 + (RAND() * 23)),2,0),':',LPAD(FLOOR(0 + (RAND() * 59)),2,0),':',LPAD(FLOOR(0 + (RAND() * 59)),2,0)), "%Y-%m-%d %H:%i:%s") from dual);
SET area_id = (select FLOOR(910000 + (RAND() * 9)));
SET rowid = rowid + 1;
IF length(@exedata)>0 THEN
SET @exedata = CONCAT(@exedata,',');
END IF;
SET @exedata=concat(@exedata,"('",area_id,"','",create_time,"','",rowid,"')");
IF rowid%5000=0
THEN
SET @exesql =concat("insert into test(area_id,create_time,num) values ", @exedata);
prepare stmt from @exesql;
execute stmt;
DEALLOCATE prepare stmt;
SET @exedata = "";
END IF;
END WHILE;
IF length(@exedata)>0
THEN
SET @exesql =concat("insert into test(area_id,create_time,num) values ", @exedata);
prepare stmt from @exesql;
execute stmt;
DEALLOCATE prepare stmt;
END IF;
END //
DELIMITER ;
复制代码
调用存储过程插入数据
CALL insert_test (1000000, '2019-10-31');
CALL insert_test (1000000, '2019-11-01');
CALL insert_test (1000000, '2019-11-02');
CALL insert_test (1000000, '2019-11-03');
CALL insert_test (1000000, '2019-11-04');
CALL insert_test (1000000, '2019-11-05');
CALL insert_test (1000000, '2019-11-06');
CALL insert_test (1000000, '2019-11-07');
CALL insert_test (1000000, '2019-11-08');
CALL insert_test (1000000, '2019-11-09');
复制代码
分析
1、两千万数据下,根据日期按小时平均在1s左右;保持单分区数据不变情况下,后期sql性能影响基本不会太大。
2、此执行计划为全表扫描,经测试后发现分区后where已经问题不大,后期可尝试将主键优化(去掉ID)后,执行计划将来到range级别;
3、未分区表,同样数据走索引情况下,多次执行平均在2.5s左右。
select PARTITION_NAME as "分区",TABLE_ROWS as "行数" from information_schema.partitions where table_schema="test" and table_name="test";
explain partitions select * from test where area_id = 910005 and create_time > '2019-11-02 00:00:00' and create_time < '2019-11-02 23:59:59';
explain
select avg(num) , DATE_FORMAT(create_time,'%Y%m%d%H') time from test
where area_id = 910005 and create_time > '2019-11-06 00:00:00' and create_time < '2019-11-06 23:59:59'
group by time
复制代码
[attach]134199[/attach]
作者:
千里
时间:
2021-9-26 16:04
这个DBA可能会关注得更多
欢迎光临 51Testing软件测试论坛 (http://bbs.51testing.com/)
Powered by Discuz! X3.2