Common pandas Series methods


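I'm 勤恳冰棍, a blogger on 靠谱客. I put this article together during recent development work. It covers commonly used pandas Series methods, and I'm sharing it here as a reference.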

Overview

Type conversion

Series.to_list()
# Convert the Series to a Python list.
Series.copy()
# Copy this object's index and data.
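
To make the two calls above concrete, here is a minimal sketch. The sample values and the variable names (values, backup) are made up for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

values = s.to_list()   # plain Python list; the index labels are dropped
backup = s.copy()      # copies both index and data (deep=True is the default)

backup["a"] = 100      # changing the copy...
original_a = s["a"]    # ...does not touch the original
```

Because copy() is deep by default, it is a safe way to keep a snapshot before modifying a Series in place.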

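Creating a Series

From a dictionary

import numpy as np
import pandas as pd

dic = {'a':1,'b':2,'c':3,'1':'hello','2':'python','3':[1,2]}
s = pd.Series(dic)
print(s,type(s))
# Output (from an older pandas release that sorted the dict keys; recent releases keep insertion order, so a, b, c come first)
1     hello
2    python
3    [1, 2]
a         1
b         2
c         3
dtype: object <class 'pandas.core.series.Series'>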
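
A related detail: when a dictionary is combined with an explicit index, the index decides which keys are kept and in what order. The sketch below uses made-up data (prices) to show this.

```python
import pandas as pd

prices = {"apple": 3.0, "pear": 2.5}   # hypothetical data
s = pd.Series(prices, index=["pear", "apple", "kiwi"])

# "kiwi" is not a key of the dict, so its value is NaN
kiwi_is_missing = pd.isna(s["kiwi"])
```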

From an array (ndarray)
index: the labels of the Series.
name: the name of the Series, None by default. Use rename() to change it; rename() returns a new Series and leaves the original unchanged.

s = pd.Series(np.random.rand(5),index = list('abcde'),name = 'test')
print(s,type(s))
# Output
a    0.384840
b    0.202776
c    0.646176
d    0.215777
e    0.605895
Name: test, dtype: float64 <class 'pandas.core.series.Series'>
# Using rename()
s1 = s.rename('excel')
print(s1)
# Output (captured from a separate run, so the random values differ from those above)
a    0.499740
b    0.943519
c    0.643355
d    0.591372
e    0.418790
Name: excel, dtype: float64
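
Since the random values above make it hard to compare the two Series, here is a small deterministic sketch of the same rename() behaviour, using np.arange instead of random data.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(3), index=list("abc"), name="test")
s1 = s.rename("excel")   # returns a new Series with the new name

# The original keeps its name; the data is the same in both
names = (s.name, s1.name)
```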

From a scalar (pass index so the value is repeated for every label)

s = pd.Series(5,index = range(5))
print(s)
# Output
0    5
1    5
2    5
3    5
4    5
dtype: int64
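
As a quick check of how the scalar is broadcast, the sketch below compares a scalar with and without an index. The labels are arbitrary.

```python
import pandas as pd

repeated = pd.Series(0.0, index=list("xyz"))   # one copy of the scalar per label
single = pd.Series(5)                          # no index: a one-element Series

lengths = (len(repeated), len(single))
```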

Indexing a Series

Positional indexing

s = pd.Series(np.random.rand(5))
print(s,'\n')
print(s[2],type(s[2]),s[2].dtype)
# Output
0    0.961192
1    0.394670
2    0.948766
3    0.658049
4    0.214219
dtype: float64
0.948766189751 <class 'numpy.float64'> float64
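
With the default 0..n-1 index, labels and positions coincide, so s[2] looks positional. The following sketch, with a deliberately shuffled integer index, shows the difference and uses .iloc for explicit positional access.

```python
import pandas as pd

s = pd.Series([10.0, 20.0, 30.0], index=[2, 0, 1])

by_label = s[2]          # integer index: [] looks up the label 2
by_position = s.iloc[2]  # .iloc always counts positions from 0
```

When position is what you mean, .iloc avoids the ambiguity.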

Label (index) indexing

s = pd.Series(np.random.rand(5),index = list('abcde'))
print(s,'\n')
print(s['a'],'\n')
print(s[['a','b']])
# Output
a    0.593557
b    0.991561
c    0.611022
d    0.603023
e    0.518528
dtype: float64
0.593556973009
a    0.593557
b    0.991561
dtype: float64
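
The same lookups can be written with .loc, which is label-only. The sketch below also shows what happens with a label that does not exist; the label "z" is just an example.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=list("abc"))

one = s.loc["a"]           # a single label returns a scalar
two = s.loc[["a", "b"]]    # a list of labels returns a Series

try:
    s.loc["z"]             # unknown label
    missing_raises = False
except KeyError:
    missing_raises = True
```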

Slicing

s1 = pd.Series(np.random.rand(5),list('abcde'))
print(s1,'\n')
print(s1['a':'b'],'\n')
# Slicing by label includes the end label
print(s1[1:2],'\n')
# Slicing by position works like list slicing and excludes the end

# Output
a    0.973470
b    0.192143
c    0.805640
d    0.623555
e    0.040572
dtype: float64
a    0.973470
b    0.192143
dtype: float64
b    0.192143
dtype: float64
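
The inclusive/exclusive difference is easier to see with fixed values. This sketch uses .loc and .iloc to make the intent of each slice explicit.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=list("abcd"))

by_label = s.loc["a":"c"]    # label slice: the end label "c" is included
by_position = s.iloc[0:2]    # positional slice: position 2 is excluded
```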

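Basic Series operations

Adding (first, assign to a new position or index label directly; second, append another Series, which produces a new Series. Series.append() was removed in pandas 2.0, so pd.concat() is used here)

s = pd.Series(np.random.rand(2))
s[3]= 100
# Add by index label
s['a'] = 200
print(s,'\n')
# Output
0      0.646847
1      0.224802
3    100.000000
a    200.000000
dtype: float64
s2 = pd.Series(np.random.rand(2),index = ['value1','value2'])
s3 = pd.concat([s, s2])
# Add with pd.concat() (older pandas: s.append(s2))
print(s3)
# Output
0           0.646847
1           0.224802
3         100.000000
a         200.000000
value1      0.225087
value2      0.504572
dtype: float64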
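
For reference, a self-contained sketch of joining two Series with pd.concat(), including the ignore_index option for renumbering the result. The values are arbitrary.

```python
import pandas as pd

s = pd.Series([1.0, 2.0])
s2 = pd.Series([3.0, 4.0], index=["value1", "value2"])

combined = pd.concat([s, s2])                       # keeps the original labels
renumbered = pd.concat([s, s2], ignore_index=True)  # relabels as 0..n-1
```

Neither call modifies s or s2; both return a new Series.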

Deleting (first, with del; second, with .drop(), which returns a new Series)

s = pd.Series(np.random.rand(5),index = list('abcde'))
del s['a']
# Delete with del
print(s,'\n')
# Output
b    0.036584
c    0.319169
d    0.267866
e    0.855804
dtype: float64
s1 = s.drop(['c','d'])
# Delete with .drop(); pass a list to delete several labels
print(s1)
# Output
b    0.036584
e    0.855804
dtype: float64
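
The difference between the two approaches (in place versus new object) can be sketched as follows. The errors="ignore" argument is an extra that the text above does not cover; it makes drop() skip labels that are not present.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=list("abcd"))

s1 = s.drop(["c", "d"])            # new Series; s still has four items
s2 = s.drop("z", errors="ignore")  # unknown label is silently skipped

del s["a"]                         # removes "a" from s itself
```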

Modifying (assign directly through the index)

s = pd.Series(np.random.rand(5),index = list('abcde'))
print(s,'\n')
# Set by position; plain s[1] on a label index is deprecated in recent pandas
s.iloc[1] = 100
print(s,'\n')
s[['c','d']] = 200
print(s)
# Output
a    0.900485
b    0.955717
c    0.270206
d    0.186294
e    0.503710
dtype: float64
a      0.900485
b    100.000000
c      0.270206
d      0.186294
e      0.503710
dtype: float64
a      0.900485
b    100.000000
c    200.000000
d    200.000000
e      0.503710
dtype: float64
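
A compact sketch of the same kinds of assignment with fixed values. The boolean-mask assignment at the end is an additional pattern that the example above does not use.

```python
import pandas as pd

s = pd.Series([0.1, 0.9, 0.5], index=list("abc"))

s.iloc[1] = 100            # by position
s.loc[["a", "c"]] = 200    # by a list of labels
s[s > 150] = 0             # by boolean mask: every value above 150 becomes 0
```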

Viewing data
.head() returns the first few rows, 5 by default.
.tail() returns the last few rows, also 5 by default.

s = pd.Series(np.random.rand(10))
print(s.head(2),'\n')
print(s.tail())
# Output
0    0.301042
1    0.344857
dtype: float64
5    0.461262
6    0.337744
7    0.215328
8    0.735952
9    0.066285
dtype: float64
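
A small sketch with predictable values. It also shows a negative n, which the text above does not mention: head(-2) returns everything except the last two rows.

```python
import pandas as pd

s = pd.Series(range(10))

first_two = s.head(2)
last_three = s.tail(3)
all_but_last_two = s.head(-2)   # negative n: drop rows from the end
```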

Reindexing

.reindex(new_labels, fill_value = ) reorders the Series to match the new labels. Labels that are not in the original index are filled with NaN by default; fill_value sets the value used for them instead.

s = pd.Series(np.random.rand(3),index = ['a','b','c'])
s1 = s.reindex(['c','b','a','A'],fill_value = 100)
print(s1)
# Output
c      0.223246
b      0.484170
a      0.844028
A    100.000000
dtype: float64
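
To see the NaN default next to fill_value, here is a short sketch with integer data. Note that introducing NaN converts integer values to float.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=list("abc"))

with_nan = s.reindex(["c", "a", "A"])                 # "A" is new, so NaN
with_fill = s.reindex(["c", "a", "A"], fill_value=0)  # "A" is new, so 0
```

reindex() returns a new Series in both cases; s keeps its original labels.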

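The reset_index function

import pandas as pd
idx = "hello the cruel world".split()
val = [1000, 201, None, 104]
t = pd.Series(val, index = idx)
print(t, "<- t")
# reset_index() moves the labels into a column and returns a DataFrame
print(t.reset_index(), "<- reset_index")
# drop = True discards the old labels and returns a Series
print(t.reset_index(drop = True), "<- reset_index(drop = True)")
# t itself is unchanged
print(t, "<- t")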
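
Since the example above has no captured output, the sketch below checks the return types directly. The column names "index" and 0 are what pandas assigns for an unnamed Series.

```python
import pandas as pd

t = pd.Series([1000, 201, None, 104], index="hello the cruel world".split())

as_frame = t.reset_index()            # DataFrame with columns "index" and 0
as_series = t.reset_index(drop=True)  # Series relabelled 0..n-1

frame_columns = list(as_frame.columns)
```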

Sorting: the sort_index and sort_values functions

import pandas as pd
idx = "hello the cruel world".split()
val = [1, 21, None, 104]
t = pd.Series(val, index = idx)
print(t.sort_index(), "<- t.sort_index()")
# sort_values(inplace=True) sorts t in place and returns None, so print the non-inplace result
print(t.sort_values(), "<- t.sort_values()")
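
The sketch below works through the same data and records the resulting label order. The ascending and na_position arguments are additions that the example above does not use.

```python
import pandas as pd

t = pd.Series([1, 21, None, 104], index="hello the cruel world".split())

by_label = t.sort_index()    # alphabetical by label
by_value = t.sort_values()   # ascending; NaN goes last by default
descending = t.sort_values(ascending=False, na_position="first")

u = t.copy()
returned = u.sort_values(inplace=True)   # sorts u itself and returns None
```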