How to use the pandas Series data structure

  • 2021-06-29 11:24:04
  • OfStack

1. Series

A Series is a one-dimensional, array-like data structure whose values carry labels, known as the index.

1.1 The simplest Series object is created below. Because no index is specified, the default index (from 0 to N-1) is used.


#  Import Series and DataFrame
In [16]: from pandas import Series,DataFrame
In [17]: import pandas as pd

In [18]: ser1 = Series([1,2,3,4])

In [19]: ser1
Out[19]: 
0  1
1  2
2  3
3  4
dtype: int64
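
As a script rather than an interactive session, the default index can be checked directly. This is a minimal sketch; the variable names `ser` and `default_labels` are chosen here for illustration and do not come from the session above.

```python
import pandas as pd

# No index is given, so pandas assigns the labels 0 .. N-1
ser = pd.Series([10, 20, 30, 40])

# Convert the index to a plain list to inspect the labels
default_labels = list(ser.index)
print(default_labels)

# The default labels work like positions for lookup
first_value = ser[0]
print(first_value)
```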

1.2 To create a Series with a specified index, pass the index explicitly:


#  Pass a list as the index
In [23]: ser2 = Series(range(4),index = ["a","b","c","d"])

In [24]: ser2
Out[24]: 
a  0
b  1
c  2
d  3
dtype: int64
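
One detail worth knowing when passing an index this way: the index must have as many labels as there are values. The sketch below, with illustrative variable names, shows that a mismatch raises a ValueError.

```python
import pandas as pd

# Four values and four labels: this works
ser = pd.Series(range(4), index=["a", "b", "c", "d"])

# Four values but only three labels: pandas refuses to build the Series
try:
    pd.Series(range(4), index=["a", "b", "c"])
    length_error = False
except ValueError:
    length_error = True

print(length_error)
```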

1.3 You can also create a Series object from a dictionary:


In [45]: sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}

In [46]: ser3 = Series(sdata)
#  Note that the Series created from the dictionary is sorted by index
#  (behavior of the pandas version used here; recent versions keep insertion order)
In [47]: ser3
Out[47]: 
Ohio   35000
Oregon  16000
Texas   71000
Utah    5000
dtype: int64
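
Because the ordering of a dictionary-built Series depends on the pandas version, a script that needs the sorted order shown above can ask for it explicitly. A minimal sketch, assuming the same `sdata` dictionary; `sorted_ser` is a name introduced here for illustration.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}

# Keys become index labels, values become the data
ser = pd.Series(sdata)

# sort_index() returns a new Series ordered by label, whatever the
# original ordering was
sorted_ser = ser.sort_index()
print(sorted_ser)
```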

When a Series is created from a dictionary, an index can also be specified. If a label in the index has no matching key in the dictionary, its value is marked as missing (NaN). The functions pandas.isnull and pandas.notnull can be used to find out which labels have no corresponding value.


In [48]: states = ['California', 'Ohio', 'Oregon', 'Texas']

In [49]: ser4 = Series(sdata,index = states)

In [50]: ser4
Out[50]: 
California    NaN
Ohio     35000.0
Oregon    16000.0
Texas     71000.0
dtype: float64
#  Determine which values are null 
In [51]: pd.isnull(ser4)
Out[51]: 
California   True
Ohio     False
Oregon    False
Texas     False
dtype: bool

In [52]: pd.notnull(ser4)
Out[52]: 
California  False
Ohio      True
Oregon     True
Texas     True
dtype: bool
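
The Boolean results above are often used to count or filter missing entries. The following sketch uses the method forms `isnull()` and `notnull()` on the Series itself; the names `with_gaps`, `missing_labels`, `n_missing`, and `present` are illustrative.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']

# 'California' has no key in sdata, so its value is NaN
with_gaps = pd.Series(sdata, index=states)

# Labels whose value is missing
missing_labels = with_gaps[with_gaps.isnull()].index.tolist()

# True counts as 1, so summing the mask counts the missing values
n_missing = int(with_gaps.isnull().sum())

# Keep only the entries that have a value
present = with_gaps[with_gaps.notnull()]
print(missing_labels, n_missing)
```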

1.4 Accessing the elements and the index of a Series:


#  Access the element whose index is "a"
In [25]: ser2["a"]
Out[25]: 0
#  Access the elements whose indexes are "a" and "c"
In [26]: ser2[["a","c"]]
Out[26]: 
a  0
c  2
dtype: int64
#  Get all values 
In [27]: ser2.values
Out[27]: array([0, 1, 2, 3])
#  Get all indexes 
In [28]: ser2.index
Out[28]: Index([u'a', u'b', u'c', u'd'], dtype='object')
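
Besides square brackets, pandas offers explicit accessors that make it clear whether a lookup is by label or by position. The sketch below is not part of the original session; it shows `.loc`, `.iloc`, the `in` operator, and `get`, using a Series built like `ser2`.

```python
import pandas as pd

ser = pd.Series(range(4), index=["a", "b", "c", "d"])

# Label-based access
by_label = ser.loc["c"]

# Position-based access (0 is the first element)
by_position = ser.iloc[0]

# Membership tests check the index labels, not the values
has_a = "a" in ser
has_z = "z" in ser

# get() returns a default instead of raising KeyError for a missing label
fallback = ser.get("z", -1)
print(by_label, by_position, has_a, has_z, fallback)
```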

1.5 Simple operations

A Series supports NumPy array operations (filtering with a Boolean array, scalar multiplication, and applying mathematical functions), and the link between each index label and its value is preserved.


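#  Note: the outputs below assume that ser2["a"] was set to 64 and that NumPy
#  was imported as np (import numpy as np) in steps that are not shown
In [34]: ser2[ser2 > 2]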
Out[34]: 
a  64
d   3
dtype: int64

In [35]: ser2 * 2
Out[35]: 
a  128
b   2
c   4
d   6
dtype: int64

In [36]: np.exp(ser2)
Out[36]: 
a  6.235149e+27
b  2.718282e+00
c  7.389056e+00
d  2.008554e+01
dtype: float64
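
The same three operations can be reproduced as a self-contained script. This sketch assumes the values behind the outputs above (64, 1, 2, 3 under the labels a to d); the variable names are illustrative. In each result the labels stay attached to their values.

```python
import numpy as np
import pandas as pd

ser = pd.Series([64, 1, 2, 3], index=["a", "b", "c", "d"])

# Boolean filtering keeps only the matching labels
filtered = ser[ser > 2]

# Scalar multiplication is applied element by element
doubled = ser * 2

# NumPy functions accept a Series and return a Series
exponentials = np.exp(ser)

print(filtered)
print(doubled)
print(exponentials)
```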

1.6 Series Auto-alignment

An important feature of Series is automatic alignment. It is easy to overlook, so it is best understood from an example: when different Series objects are combined in an arithmetic operation, their values are matched by index label rather than by position.


#  Contents of ser3
In [60]: ser3
Out[60]: 
Ohio   35000
Oregon  16000
Texas   71000
Utah    5000
dtype: int64
#  Contents of ser4
In [61]: ser4
Out[61]: 
California    NaN
Ohio     35000.0
Oregon    16000.0
Texas     71000.0
dtype: float64
#  Values with the same index label are added; a label that is missing (or NaN) in either Series gives NaN
In [62]: ser3 + ser4
Out[62]: 
California     NaN
Ohio      70000.0
Oregon     32000.0
Texas     142000.0
Utah        NaN
dtype: float64
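
When NaN for unmatched labels is not wanted, the `add` method with `fill_value` is one option. The sketch below rebuilds two Series like `ser3` and `ser4` (the names `left`, `right`, `plain`, and `filled` are illustrative). With `fill_value=0`, a label present in only one Series is treated as 0 on the other side, while a label with no value on either side still gives NaN.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']

left = pd.Series(sdata)                 # has Utah, no California
right = pd.Series(sdata, index=states)  # California is NaN, no Utah

# Plain addition: any label without a value on both sides gives NaN
plain = left + right

# add() with fill_value: Utah is filled in, California stays NaN
# because it has no value in either Series
filled = left.add(right, fill_value=0)

print(plain)
print(filled)
```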

1.7 Naming

Both the Series object itself and its index have a name attribute:


In [64]: ser4.index.name = "state"

In [65]: ser4.name = "population"

In [66]: ser4
Out[66]: 
state
California    NaN
Ohio     35000.0
Oregon    16000.0
Texas     71000.0
Name: population, dtype: float64
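
One place where these names matter is when a Series is turned into a DataFrame: they become the column names. A minimal sketch, with illustrative variable names, that sets the Series name in the constructor and the index name afterwards:

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}

# The name can be given when the Series is created
population = pd.Series(sdata, name="population")
population.index.name = "state"

# reset_index() moves the index into a column; both names carry over
frame = population.reset_index()
print(frame.columns.tolist())
```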

