Overview
Judging from this question and others, building a pandas DataFrame with concat or append is not recommended, because each call copies the whole DataFrame.
My project retrieves a small amount of data every 30 seconds. It might run over a 3-day weekend, so one could easily expect more than 8,000 rows to be created one row at a time. What is the most efficient way to add rows to this DataFrame?
Solution
You can add rows to a DataFrame in place by assigning with loc to an index label that does not exist yet (the pandas documentation calls this "setting with enlargement"). From the pandas documentation:
In [119]: dfi
Out[119]:
   A  B  C
0  0  1  0
1  2  3  2
2  4  5  4

In [120]: dfi.loc[3] = 5

In [121]: dfi
Out[121]:
   A  B  C
0  0  1  0
1  2  3  2
2  4  5  4
3  5  5  5
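The documentation example broadcasts a single scalar across the new row. A minimal sketch of the same technique with one value per column follows. Using len(df) as the next label is an assumption that only holds for a default integer index (0, 1, 2, ...) with no gaps. The column names and values are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [0, 2, 4], "B": [1, 3, 5], "C": [0, 2, 4]})

# A scalar is broadcast to every column of the new row.
df.loc[3] = 5

# A list supplies one value per column, in column order.
# len(df) is the next free label only for a gap-free 0..n-1 index.
df.loc[len(df)] = [6, 7, 6]

print(df)
```

If the index is not a simple 0..n-1 range, pick the new label explicitly instead of relying on len(df), since assigning to an existing label overwrites that row rather than adding one.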
As expected, using loc is considerably faster than append (about 14x):
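import pandas as pd
df = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3], "C": [1, 2, 3]})

%%timeit
# append returns a new DataFrame; df itself is left unchanged.
# DataFrame.append was removed in pandas 2.0; use pd.concat([df, df2]) there.
df2 = pd.DataFrame({"A": [4], "B": [4], "C": [4]})
df.append(df2)
# 1000 loops, best of 3: 1.61 ms per loop

%%timeit
# Only the first iteration adds row 3; later iterations overwrite it.
df.loc[3] = 4
# 10000 loops, best of 3: 113 µs per loop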
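Applied to the polling scenario from the question, the loc approach could look roughly like the sketch below. fetch_reading is a hypothetical stand-in for whatever retrieves the data every 30 seconds, the column names are made up, and the sleep between readings is left out so the example runs instantly.

```python
import pandas as pd


def fetch_reading(i):
    """Hypothetical stand-in for the real data source."""
    return {"A": i, "B": i * 2, "C": i * 3}


columns = ["A", "B", "C"]
df = pd.DataFrame(columns=columns)

for i in range(5):  # in the real job this loop would sleep 30 s per pass
    reading = fetch_reading(i)
    # Add one row in place at the next free integer label.
    df.loc[len(df)] = [reading[c] for c in columns]

print(df)
```

Starting from an empty DataFrame with only the column names defined means pandas has to infer the column types from the rows as they arrive, so it may be worth checking df.dtypes after the first few readings.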