PostgreSQL: Inserting and Updating Millions of Rows in a Single Table

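This post, shared on 靠谱客 (kaopuke) by the blogger 执着红牛, covers inserting and updating millions of rows in a single PostgreSQL table. It is offered here as a reference.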

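I. Inserting data

The first thing that comes to mind for inserting data is:

insert into A values(*******************)

For several rows, the most that comes to mind is the multi-row form:

insert into A values(**********),(*************),(*****************)

With millions of rows, however, both are far too slow.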
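
As a rough sketch of the multi-row form, the statement can be assembled in Python with placeholders instead of pasted literals. The helper name build_multirow_insert and the column names c1 and c2 are hypothetical; %s is the placeholder style psycopg2 uses, and the resulting pair would be handed to cursor.execute(sql, params).

```python
def build_multirow_insert(table, columns, rows):
    """Return (sql, params) for one INSERT carrying every row in `rows`.

    Hypothetical helper: table and column names are assumed to be trusted
    identifiers; only the values go through placeholders.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "insert into {0} ({1}) values {2}".format(
        table,
        ", ".join(columns),
        ", ".join([row_placeholder] * len(rows)),
    )
    # Flatten the rows so the parameters line up with the placeholders.
    params = [value for row in rows for value in row]
    return sql, params


sql, params = build_multirow_insert(
    "A", ["c1", "c2"], [(1, "x"), (2, "y"), (3, "z")]
)
print(sql)
print(params)
```

Keeping the values out of the SQL text avoids quoting mistakes, but it does not change the cost of sending one very large statement.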

1. Using a shell script

#!/bin/bash
# Build one multi-row INSERT statement by string concatenation.
strsql="insert into tbl_devaccess8021x (uidrecordid, dtaccesstime, strmac, strusername, strswitchip, strifname, iisauthsuc,iisantipolicy,iisaccessed,strmachinecode,strrandomcode,iaccesstype,straccessfailedcode,uidroleid ,struserdes) values('d71803axxx1','2019-08-02 20:37:35', '1:2:3:4:5:6', 'criss0', '192.168.2.146','FastEthernet0/1',0,0,1,'000000000020A0B01','020A0B01',1,0,'研发','crissxu10')"

# Format the timestamp like the first row and quote it; it is computed once
# because running date on every iteration is slow.
now=$(date '+%Y-%m-%d %H:%M:%S')

for ((i=1; i<=3000000; i++))
do
    # Append one more value tuple per iteration.
    strsql=$strsql",('d71803axxx$i','$now', '1:2:3:4:5:$i', 'criss$i', '192.168.2.$i','FastEthernet0/1',0,0,1,'000000000020A0B01','020A0B01',1,0,'研发','crissxu10')"
done
echo "$strsql"
#psql -d xxx -U xxx -c "$strsql"

This approach is workable for small amounts of data, but it becomes extremely slow as the row count grows.
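
One way to keep each statement at a manageable size is to split the rows into fixed-size batches and send one multi-row insert per batch. The sketch below only shows the batching step. The helper names make_row and batches, the three-column row shape, and the batch size of 1000 are all assumptions chosen for illustration.

```python
from itertools import islice


def make_row(i):
    """Build one row shaped like the script's values (illustrative data)."""
    return ("d71803axxx%d" % i, "1:2:3:4:5:%d" % i, "criss%d" % i)


def batches(rows, size):
    """Yield lists of at most `size` rows from any iterable of rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Rows are generated lazily, so only one batch is held in memory at a time.
rows = (make_row(i) for i in range(1, 2501))
sizes = [len(chunk) for chunk in batches(rows, 1000)]
print(sizes)
```

Each chunk would then be sent as one statement. psycopg2 also ships psycopg2.extras.execute_values, which pages a row list into multi-row inserts in a similar way.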

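2. PostgreSQL provides the COPY command, which makes bulk-loading data convenient. psycopg2 exposes it through the cursor's copy_from method.

Parameters of copy_from: copy_from(file, table, sep='\t', null='\\N', size=8192, columns=None)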
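
To make the sep and null parameters concrete, here is a minimal sketch of preparing the input that copy_from reads: one line per row, fields separated by a tab, NULL written as \N, and backslashes, tabs and newlines inside values escaped as COPY's text format requires. The helper names copy_field and rows_to_copy_buffer are hypothetical, not part of psycopg2.

```python
import io


def copy_field(value):
    """Encode one value for COPY's text format (tab separator, \\N for NULL)."""
    if value is None:
        return "\\N"
    text = str(value)
    # Backslash first, so the escapes added below are not escaped again.
    text = text.replace("\\", "\\\\")
    text = text.replace("\t", "\\t").replace("\n", "\\n").replace("\r", "\\r")
    return text


def rows_to_copy_buffer(rows):
    """Return a StringIO holding one tab-separated line per row."""
    buf = io.StringIO()
    for row in rows:
        buf.write("\t".join(copy_field(v) for v in row))
        buf.write("\n")
    buf.seek(0)  # rewind so the reader starts at the first row
    return buf


buf = rows_to_copy_buffer([("criss0", None, 1), ("a\tb", "x\\y", 0)])
print(repr(buf.getvalue()))
```

The returned buffer is the kind of file-like object that would be passed as the first argument of copy_from.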

import sys
import psycopg2
if sys.version_info.major == 2:
    import StringIO as io
else:
    import io
from datetime import datetime

if __name__ == '__main__':
    s = ""
    start_time = datetime.now()
    # Build one tab-separated line per row, the text format COPY expects.
    # The range is small here; raise it to generate the full data set.
    for i in range(0, 10):
        str_i = str(i)
        temp = "d71803axxx{0}\t{1}\t1:2:3:4:5:{2}\tcriss{3}\t192.168.2.{4}\tFastEthernet0/1\t0\t0\t1\t000000000020A0B01\t020A0B01\t1\t0\t研发\tcrissxu10\n".format(str_i, datetime.now(), str_i, str_i, str_i)
        s += temp
    conn = psycopg2.connect(host='127.0.0.1', user="xxx", password="xxx", database="xxx")
    cur = conn.cursor()
    # Stream the whole buffer into the table with a single COPY.
    cur.copy_from(io.StringIO(s), 'tbl_devaccess8021x', columns=('uidrecordid', 'dtaccesstime', 'strmac', 'strusername', 'strswitchip', 'strifname', 'iisauthsuc', 'iisantipolicy', 'iisaccessed', 'strmachinecode', 'strrandomcode', 'iaccesstype', 'straccessfailedcode', 'uidroleid', 'struserdes'))
    conn.commit()
    cur.close()
    conn.close()
    end_time = datetime.now()
    print('done. time:{0}'.format(end_time - start_time))

Loading three million rows with copy_from took roughly 7 minutes.

3. Insert into a temporary table first, then sync into the target table

insert into source_table select * from temporary_table
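
A possible shape for this temporary-table approach, combined with copy_from, is sketched below. The function names are hypothetical, the table names follow the statement above, and the "like ... including defaults" and "on commit drop" clauses are one reasonable choice rather than something the original post specifies. The cursor is assumed to be a psycopg2 cursor inside an open transaction, since "on commit drop" removes the temporary table at commit.

```python
def staging_statements(target, staging):
    """SQL run before and after the bulk load (identifiers assumed trusted)."""
    create = (
        "create temp table {0} (like {1} including defaults) "
        "on commit drop".format(staging, target)
    )
    sync = "insert into {0} select * from {1}".format(target, staging)
    return create, sync


def load_via_staging(cur, buf, target="source_table", staging="temporary_table"):
    """Load `buf` into a temporary table, then copy its rows into `target`.

    `cur` is expected to behave like a psycopg2 cursor; the caller commits.
    """
    create, sync = staging_statements(target, staging)
    cur.execute(create)            # empty table with the target's columns
    cur.copy_from(buf, staging)    # fast bulk load into the staging table
    cur.execute(sync)              # one set-based insert into the target


create_sql, sync_sql = staging_statements("source_table", "temporary_table")
print(create_sql)
print(sync_sql)
```

The staging step also gives a place to filter or deduplicate rows before they reach the target table, by changing the select in the final statement.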

II. Updating data

update table set col = value where col_condition=value;

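An update first locates the rows that match col_condition and then modifies them. On a small table the lookup is fast. Once the table grows large, the lookup slows down and update throughput drops with it.

Solutions:

1. Add an index on the column used in the query condition.

2. Merge many row updates into a single SQL statement:

update target_table set c2 = t.c2 from (values(1,1),(2,2),(3,3),…(2000,2000)) as t(c1,c2) where target_table.c1=t.c1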
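
The merged statement can be generated rather than typed by hand. The sketch below builds the same "update ... from (values ...)" shape for an arbitrary list of (c1, c2) pairs, using psycopg2-style %s placeholders. The helper name build_batch_update is hypothetical, and the table and column names are taken from the statement above.

```python
def build_batch_update(pairs):
    """Return (sql, params) updating target_table.c2 for every (c1, c2) pair.

    Hypothetical helper mirroring the merged update statement; the values
    travel as parameters instead of being pasted into the SQL text.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    values = ", ".join(["(%s, %s)"] * len(pairs))
    sql = (
        "update target_table set c2 = t.c2 "
        "from (values " + values + ") as t(c1, c2) "
        "where target_table.c1 = t.c1"
    )
    # Flatten the pairs so the parameters line up with the placeholders.
    params = [value for pair in pairs for value in pair]
    return sql, params


sql, params = build_batch_update([(1, 1), (2, 2), (3, 3)])
print(sql)
print(params)
```

With psycopg2 this would be run as cursor.execute(sql, params). If the values are passed as strings while the columns are not text, an explicit cast inside the values list may be needed.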

Reference:

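【1】 http://www.voidcn.com/article/p-stwpqgta-bdq.html

"Later I read a log entry by 葛班长: using Python, he inserted one million rows into SQLite in only 4 seconds. The reason is that Python optimized all one million insert statements by placing every insert in the same transaction, which greatly reduces the time spent beginning and ending transactions, and that is exactly the part that consumes most of the time."

This should explain why method 2 is faster.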

【2】http://www.voidcn.com/article/p-vvuwvbyw-yu.html

【3】https://help.aliyun.com/knowledge_detail/59076.html
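
The single-transaction effect described in the quote under reference 【1】 can be tried with Python's built-in sqlite3 module. This is only an illustrative sketch: the table t, the row count of 10000 and the function names are assumptions, and the timings depend heavily on the machine and on whether the database is in memory or on disk.

```python
import sqlite3
import time


def insert_rows(conn, n, commit_each):
    """Insert n rows, committing after every row or only once at the end."""
    cur = conn.cursor()
    for i in range(n):
        cur.execute("insert into t (id, name) values (?, ?)", (i, "criss%d" % i))
        if commit_each:
            conn.commit()  # one transaction per row
    conn.commit()          # closes the single transaction (or is a no-op)


def timed_run(n, commit_each):
    """Return (row_count, seconds) for a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer primary key, name text)")
    start = time.time()
    insert_rows(conn, n, commit_each)
    elapsed = time.time() - start
    count = conn.execute("select count(*) from t").fetchone()[0]
    conn.close()
    return count, elapsed


for commit_each in (True, False):
    count, elapsed = timed_run(10000, commit_each)
    print("commit_each=%s rows=%d seconds=%.3f" % (commit_each, count, elapsed))
```

On a file-backed database the gap is usually much larger than in memory, because every commit has to be flushed to disk.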

Reposted from: https://www.cnblogs.com/hoojjack/p/11345828.html

Category: Databases
Published: 2023-12-25 17:15:18
Link: https://www.kaopuke.com/article/k-p-k_13_u_23_o_2_f5_12__23__22_3.html