Data preparation:
Create the Bigtable table in MySQL for the comparison and build an index on it:

CREATE TABLE Bigtable (
    consumption_id VARCHAR(255),
    member_id INT(11),
    fans_id INT(11),
    bill_date DATE,
    money VARCHAR(255),
    people_num INT(8),
    dish_name VARCHAR(255),
    created_org INT(8),
    open_id VARCHAR(255),
    subscribed_time DATETIME,
    unsubscribed_time DATETIME,
    source_type INT(8),
    sns_type VARCHAR(255),
    is_subscribed VARCHAR(255)
) ENGINE=INNODB DEFAULT CHARSET=utf8;

CREATE INDEX indexbigtable ON bigtable(consumption_id,member_id,bill_date,created_org,source_type,sns_type,is_subscribed);
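Before running the benchmarks it can be worth confirming that MySQL can actually use this compound index for a given filter; a quick check like the one below is not part of the original post, and the filter values are hypothetical (note the index's leading column is consumption_id, so only queries constraining it can use the index prefix):

-- EXPLAIN shows whether indexbigtable appears under possible_keys/key for this filter
EXPLAIN SELECT COUNT(1)
FROM bigtable
WHERE consumption_id = 'some-id'   -- hypothetical value
  AND member_id = 12345;           -- hypothetical value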
Insert the data into Bigtable — 3.08 million rows in total.
INSERT INTO bigtable
SELECT a.consumption_id, a.member_id, b.fans_id, DATE(a.bill_date), a.need_pay_amount, a.people_num,
       a.dish_name, a.created_org, b.open_id, b.subscribed_time, b.unsubscribed_time, b.source_type,
       b.sns_type, b.is_subscribed
FROM (
    SELECT conn.consumption_id, conn.member_id, dish.dish_name, conn.bill_date, conn.people_num,
           conn.need_pay_amount, conn.created_org
    FROM `pos_consumption` AS conn
    INNER JOIN pos_dining_order AS porder
        ON conn.consumption_id = porder.consumption_id AND conn.created_org = porder.created_org
    INNER JOIN pos_order_dish AS dish
        ON porder.order_id = dish.dining_order_id AND dish.created_org = porder.created_org
    WHERE conn.`status_code` = 1 AND conn.`need_pay_amount` > 0
) a
LEFT JOIN (
    SELECT fan.fans_id, fan.member_id, bind.open_id, bind.subscribed_time, bind.unsubscribed_time,
           fan.source_type, sns_type, is_subscribed
    FROM rv_ol_fans AS fan
    INNER JOIN `0001790455_baseinfo`.`ol_fans_sns_bind` AS bind USE INDEX (idx_fans_id)
        ON fan.fans_id = bind.fans_id
    WHERE bind.status_code = 1
) b ON a.member_id = b.member_id;
Create the bigtable table in ClickHouse — the capitalization of the type names must be exact:
CREATE TABLE bigtable (
    consumption_id UInt64,
    member_id UInt64,
    fans_id UInt64,
    bill_date Date,
    money Float32,
    people_num UInt8,
    dish_name String,
    created_org UInt8,
    open_id String,
    subscribed_time DateTime,
    unsubscribed_time DateTime,
    source_type UInt8,
    sns_type UInt8,
    is_subscribed UInt8
) ENGINE = MergeTree(bill_date, (consumption_id, created_org), 8192)
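The ENGINE clause above uses the legacy MergeTree(date_column, (primary key), index_granularity) syntax. Newer ClickHouse releases usually express the same table with explicit PARTITION BY / ORDER BY clauses; a minimal sketch of the equivalent definition, assuming a recent server (verify against your version):

CREATE TABLE bigtable (
    consumption_id UInt64,
    member_id UInt64,
    fans_id UInt64,
    bill_date Date,
    money Float32,
    people_num UInt8,
    dish_name String,
    created_org UInt8,
    open_id String,
    subscribed_time DateTime,
    unsubscribed_time DateTime,
    source_type UInt8,
    sns_type UInt8,
    is_subscribed UInt8
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(bill_date)        -- the legacy form partitions by month of the date column
ORDER BY (consumption_id, created_org)
SETTINGS index_granularity = 8192;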
Export the data from MySQL to CSV:
SELECT consumption_id, member_id, fans_id, bill_date, money, people_num, dish_name,
       created_org, open_id, subscribed_time, unsubscribed_time,
       source_type, sns_type, is_subscribed
INTO OUTFILE '/var/lib/mysql-files/bigtable.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM bigtable;
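If INTO OUTFILE fails with a secure-file-priv error, the target path has to sit under the directory MySQL allows for file export (the /var/lib/mysql-files/ path above is the common default). A quick check, not part of the original post:

-- Shows the directory MySQL permits for INTO OUTFILE
-- (a path restricts exports to that directory, an empty value means no restriction, NULL disables exports)
SHOW VARIABLES LIKE 'secure_file_priv';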
Copy the CSV between the servers:
scp bigtable.csv root@192.168.x.xxx:/root/clickhouse-files
Import the CSV into ClickHouse:
cat /root/clickhouse-files/bigtable.csv | clickhouse-client --query="INSERT INTO test.bigtable FORMAT CSV"
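A quick sanity check after the import (not in the original post) is to make sure the ClickHouse row count matches the 3.08 million rows exported from MySQL:

-- Run via clickhouse-client; should return roughly 3.08 million if no rows were dropped
SELECT count() FROM test.bigtable;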
Tests on 3.08 million rows with 14 columns per row:
- First, select the whole table:
select * from bigtable
clickhouse: 67.62s ; mysql: 0.002s
ClickHouse seems to return the result instantly, but printing the 10,000 displayed rows to the screen took a very long time; the more columns a ClickHouse query selects, the longer this loading takes.
- TOP-N query:
select * from bigtable order by dish_name limit 1
clickhouse: 0.13s ; mysql: 0.020s
- Count how many rows there are:
select count(1) from bigtable
clickhouse: 0.015s ; mysql: 1.33s
- Count how many distinct orders there are:
select count(1) from (select consumption_id,created_org from bigtable group by consumption_id,created_org) a;
clickhouse: 0.121s ; mysql: 5.15s
- For each visit count, how many members visited that many times (82 result rows):
select sum_all, count(1) as countall
from (
    select member_id, count(1) as sum_all
    from (select member_id, consumption_id, created_org from bigtable group by consumption_id, created_org) a
    group by member_id
) a
group by sum_all;
clickhouse: 0.166s ; mysql: 5.50s
- A query with filter conditions added:
SELECT COUNT(1) FROM bigtable
WHERE member_id IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,
    20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,
    40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,
    60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,
    76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,
    95,96,97,98,99,100)
  AND bill_date BETWEEN '2017-01-01' AND '2017-12-31'
  AND fans_id > 10
clickhouse: 0.027s ; mysql: 2.05s
- Each member's most frequently ordered dish (80,000 result rows):
SELECT member_id, dish_name, sum_all FROM (
    SELECT member_id, dish_name, COUNT(1) AS sum_all
    FROM bigtable
    WHERE dish_name NOT IN ('米饭','打包盒','茶位费')
    GROUP BY member_id, dish_name
    ORDER BY member_id, sum_all DESC
) a
GROUP BY member_id
- Now for something heavier, a four-level nested subquery (1.71 million result rows):
SELECT * FROM bigtable
WHERE member_id IN (
    SELECT member_id FROM bigtable
    WHERE member_id IN (
        SELECT member_id FROM bigtable
        WHERE source_type = 1 AND member_id IN (
            SELECT member_id FROM bigtable
            WHERE source_type = 1 AND member_id IN (SELECT member_id FROM bigtable WHERE fans_id != 0)
        )
    ) AND is_subscribed = 1
)
clickhouse: 65.284s ; mysql: no result
MySQL couldn't cope — after 30 minutes it still had no result; ClickHouse again spent most of its time printing the result rows.
- Let's try a two-level nested subquery:
SELECT * FROM bigtable
WHERE member_id IN (
    SELECT member_id FROM bigtable
    WHERE member_id IN (SELECT member_id FROM bigtable WHERE source_type = 1)
      AND is_subscribed = 1
)
clickhouse: 63.216s ; mysql: no result
Surprisingly, even with just two nested subqueries MySQL could not produce a result within 30 minutes.
So the outcome of this round of query-time comparisons is clear:
For simple queries, MySQL has the advantage when many columns are selected, while ClickHouse is ahead when only a few columns are selected.
For complex queries, ClickHouse holds a significant advantage.
Also, the number of rows being displayed affects ClickHouse's measured query time; I'm not sure whether that is just an artifact of running the client in a Linux terminal (see the note below).
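If the extra time really is spent printing rows rather than executing the query (this is an assumption; the original post doesn't verify it), clickhouse-client can discard the output so the timing reflects only query execution:

-- The query runs in full, but the Null output format throws the result away,
-- so the reported time excludes the cost of scrolling millions of rows in the terminal
SELECT * FROM bigtable FORMAT Null;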
Tests on 124 million rows with 62 columns per row:
- Again, start by selecting the whole table:
select * from pdish_test
clickhouse: 276s ; mysql: 0.036s
- Count the rows:
select count(1) from pdish_test
clickhouse: 0.044s ; mysql: 32.168s
- How many times each dish was ordered and how much revenue it brought in (377 result rows):
SELECT dish_name, COUNT(1) AS cc, SUM(dish_sale_amount) AS c FROM pdish_test GROUP BY dish_name ORDER BY cc DESC
- A simple conditional query:
SELECT COUNT(1) FROM pdish_test WHERE dish_sale_amount > 10 AND created_on > '2017-01-01'
clickhouse: 0.448s ; mysql: no result
Once again 30 minutes passed and MySQL still hadn't responded.
It seems that at this scale of data MySQL simply can't keep up anymore.
Join queries across three tables of roughly 1,000,000, 130,000 and 130,000 rows:
A query joining and stitching the three tables together (80,000 result rows; a sketch of its shape follows below):
clickhouse: 3.65s ; mysql: 4min46s
Even on tables of tens to hundreds of thousands of rows, ClickHouse is still noticeably faster than MySQL.
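The original post doesn't include the SQL for this join, so the following is only an illustrative sketch of what such a three-table join might look like; the table and column names are hypothetical, not the ones actually benchmarked:

-- Hypothetical three-table join (names are placeholders, not from the original test)
SELECT c.consumption_id,
       c.member_id,
       f.open_id,
       d.dish_name,
       d.dish_sale_amount
FROM consumption AS c
INNER JOIN fans AS f
    ON c.member_id = f.member_id
INNER JOIN order_dish AS d
    ON c.consumption_id = d.consumption_id
WHERE c.bill_date BETWEEN '2017-01-01' AND '2017-12-31';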
Finally
That wraps up this comparison of ClickHouse and MySQL query speed.