A brief introduction to pandas data structures in Python

  • 2021-07-09 08:38:52
  • OfStack

Series

A Series is similar to a one-dimensional array: it consists of a set of data and a set of labels (the index) associated with that data. It is created with the Series class of pandas.


import pandas as pd
s1 = pd.Series(['a', 'b', 'c', 'd'])
print(s1)

# Output:
# 0    a
# 1    b
# 2    c
# 3    d
# dtype: object

The code above creates a Series from a list; the 0, 1, 2 and 3 on the left are the default labels of the data. You can also set custom labels through the index parameter.


s2 = pd.Series(['1', '2', '3', '4'], index=['a', 'b', 'c', 'd'])  # index sets custom labels
print(s2)
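
As a minimal sketch of why custom labels are useful, the example below looks values up by label instead of by position. The variable names (s, first, subset) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# A Series with custom string labels instead of the default 0, 1, 2, 3
s = pd.Series(['1', '2', '3', '4'], index=['a', 'b', 'c', 'd'])

# Look up a single value by its label
first = s['a']

# Pass a list of labels to select several values at once
subset = s[['b', 'd']]

print(first)
print(subset)
```

Selecting with a list of labels returns a new Series that keeps the chosen labels as its index.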

A Series can also be created from a dictionary; the keys become the labels.


s3 = pd.Series({'a':1,'b':2})
print(s3.values)  # the values attribute returns the underlying data
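
To make the dictionary case more concrete, the following sketch reads back both the labels and the values of a Series built from a dictionary. The names labels and values are assumptions introduced only for this example.

```python
import pandas as pd

# Keys of the dictionary become the labels, values become the data
s3 = pd.Series({'a': 1, 'b': 2})

labels = list(s3.index)       # the labels, taken from the dictionary keys
values = s3.values.tolist()   # the data, converted to a plain Python list

print(labels)
print(values)
print(s3['b'])  # label-based lookup of a single value
```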

DataFrame

A DataFrame is a data structure made up of a set of data and a set of indexes, with both a row index and a column index. Like an Excel sheet, it is a tabular data structure. The following is a simple DataFrame:


    Skills  
 0  python 
 1  Java

The DataFrame class can be instantiated from a list to create a table data object; in that case the row and column indexes both default to integers starting from 0. The most common approach is to pass in a nested list, whose inner items can be lists or tuples. If no index is specified, the row and column indexes are numbered 0, 1, 2 and so on, and you can set your own column and row indexes through the columns and index parameters. See the following code for details.


import pandas as pd
df2 = pd.DataFrame([('a','A'),('b','B'),('c','C'),('d','D')])  # pass in a nested list; the inner items can be tuples or lists
print(df2)

The output is as follows:


   0  1
0  a  A
1  b  B
2  c  C
3  d  D

df3 = pd.DataFrame([('a','A'),('b','B'),('c','C'),('d','D')], columns=['Lowercase', 'Capitalized'])  # columns sets custom column labels
print(df3)

  Lowercase Capitalized
0         a           A
1         b           B
2         c           C
3         d           D

A dictionary can also be passed to the DataFrame class to instantiate a table data object. In that case the keys of the dictionary become the column index, and the row index starts from 0 by default. The row index can be customized through the index parameter.
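
The dictionary form described above could look like the following sketch. The variable names (data, df4, df5) and the row labels r1 to r4 are hypothetical and used only for illustration.

```python
import pandas as pd

# Each key becomes a column label; each list holds that column's values
data = {
    'Lowercase': ['a', 'b', 'c', 'd'],
    'Capitalized': ['A', 'B', 'C', 'D'],
}

# Without index, the rows are labeled 0, 1, 2, 3
df4 = pd.DataFrame(data)
print(df4)

# With index, the rows get custom labels (one label per row)
df5 = pd.DataFrame(data, index=['r1', 'r2', 'r3', 'r4'])
print(df5)

# Look up one cell by row label and column label
print(df5.loc['r2', 'Capitalized'])
```

Note that the list passed as index must have the same length as the columns, otherwise pandas raises an error.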

