Indexing Rows and Columns of a pandas DataFrame
- 2021-06-28 09:22:16
- OfStack
Row index
There are three ways to index rows: loc, iloc, and ix
import pandas as pd
import numpy as np
index = ["a", "b", "c", "d"]
data = np.random.randint(10, size=(4, 3))
df = pd.DataFrame(data, index=index)
"""
0 1 2
a 9 7 1
b 0 0 7
c 2 6 5
d 8 2 5
"""
loc
loc selects rows by their index labels
Single-row index, returns a Series object
df.loc["a"]
"""
0 9
1 7
2 1
Name: a, dtype: int64
"""
df.loc["b"]
"""
0 0
1 0
2 7
Name: b, dtype: int64
"""
Multi-row index, returns a DataFrame object
df.loc[["a", "c"]]
"""
0 1 2
a 9 7 1
c 2 6 5
"""
iloc
iloc selects rows by their integer position
Single-row index, returns a Series object
df.iloc[0]
"""
0 9
1 7
2 1
Name: a, dtype: int64
"""
df.iloc[1]
"""
0 0
1 0
2 7
Name: b, dtype: int64
"""
Multi-row index, returns a DataFrame object
df.iloc[[0, 2]]
"""
0 1 2
a 9 7 1
c 2 6 5
"""
ix (deprecated, removed in pandas 1.0)
ix selects rows by index label or by position. If the row index is of integer type, ix treats the key as a label only, and raises a KeyError if that label does not exist. Because of this ambiguity, ix was deprecated in pandas 0.20 and removed in pandas 1.0, so the examples below only run on older versions
index = [2, 3, 4, 5]
df = pd.DataFrame(data, index=index)
"""
0 1 2
2 9 7 1
3 0 0 7
4 2 6 5
5 8 2 5
"""
df.ix[2]
"""
0 9
1 7
2 1
Name: 2, dtype: int64
"""
# Deprecation warning
"""
.ix is deprecated. Please use
.loc for label based indexing or
.iloc for positional indexing
"""
# If the index is of integer type, you cannot index by row position
df.ix[0]
"""
...
KeyError: 0
"""
Column index
There are two ways to index columns: attribute access (.) and []
import pandas as pd
import numpy as np
columns = ["i", "ii", "iii"]
data = np.random.randint(10, size=(4, 3))
df = pd.DataFrame(data, columns=columns)
"""
i ii iii
0 4 5 9
1 0 3 4
2 7 9 1
3 8 2 3
"""
Attribute access (.) returns the specified column directly as a Series object
df.i
"""
0 4
1 0
2 7
3 8
Name: i, dtype: int64
"""
[]
Single-column index with a one-element list, returns a DataFrame object
df[["i"]]
"""
i
0 4
1 0
2 7
3 8
"""
Multi-column index, returns DataFrame object
df[["i", "ii"]]
"""
i ii
0 4 5
1 0 3
2 7 9
3 8 2
"""
Index both rows and columns
Index by specifying an index name or a slice
loc
Returns a DataFrame object by specifying lists of row and column index labels
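A minimal sketch of label-based selection on both axes. The frame is an assumption: it reuses the column names i, ii, iii from the column section, with hypothetical row labels a to d and fixed values in place of random data.

```python
import numpy as np
import pandas as pd

# Assumed frame: hypothetical row labels, column names from the column section.
data = np.array([[4, 5, 9], [0, 3, 4], [7, 9, 1], [8, 2, 3]])
df = pd.DataFrame(data, index=["a", "b", "c", "d"], columns=["i", "ii", "iii"])

sub = df.loc[["a", "b"], ["ii", "iii"]]  # lists of labels -> DataFrame
cell = df.loc["a", "ii"]                 # two scalar labels -> a single value

print(sub.values.tolist())  # [[5, 9], [3, 4]]
print(cell)                 # 5
```

Note that a DataFrame comes back only when lists (or slices) are passed; two scalar labels return the single cell value.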
Returns a DataFrame object by specifying ranges of row and column index labels; both endpoints are included
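Label slices with loc can be sketched as follows. The frame is an assumption, with hypothetical row labels a to d, the column names i, ii, iii, and fixed values in place of random data.

```python
import numpy as np
import pandas as pd

# Assumed frame: hypothetical row labels, column names from the column section.
data = np.array([[4, 5, 9], [0, 3, 4], [7, 9, 1], [8, 2, 3]])
df = pd.DataFrame(data, index=["a", "b", "c", "d"], columns=["i", "ii", "iii"])

# Both endpoints of a label slice are included: rows a..c and columns i..ii.
sub = df.loc["a":"c", "i":"ii"]

print(list(sub.index))    # ['a', 'b', 'c']
print(list(sub.columns))  # ['i', 'ii']
```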
iloc
Returns a DataFrame object by specifying lists of row and column positions
df.iloc[[0, 1], [1, 2]]
"""
ii iii
a 5 9
f 3 4
"""
Returns a DataFrame object by slicing ranges of row and column positions (left-closed, right-open)
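Positional slices with iloc can be sketched as follows. The frame is an assumption, with hypothetical row labels a to d, the column names i, ii, iii, and fixed values in place of random data.

```python
import numpy as np
import pandas as pd

# Assumed frame: hypothetical row labels, column names from the column section.
data = np.array([[4, 5, 9], [0, 3, 4], [7, 9, 1], [8, 2, 3]])
df = pd.DataFrame(data, index=["a", "b", "c", "d"], columns=["i", "ii", "iii"])

# Rows at positions 0 and 1, columns at positions 1 and 2; the stop values are excluded.
sub = df.iloc[0:2, 1:3]

print(list(sub.index))      # ['a', 'b']
print(sub.values.tolist())  # [[5, 9], [3, 4]]
```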
ix (deprecated, removed in pandas 1.0)
Returns a DataFrame object by specifying ranges of row and column positions or labels; unavailable in pandas 1.0 and later
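Mixing positions on one axis with labels on the other was the main use of ix. On current pandas the same selection can be written with loc or iloc by converting one side first. A sketch under assumptions: the frame has hypothetical row labels a to d (unique labels are assumed), the column names i, ii, iii, and fixed values.

```python
import numpy as np
import pandas as pd

# Assumed frame: hypothetical, unique row labels and the column names from the column section.
data = np.array([[4, 5, 9], [0, 3, 4], [7, 9, 1], [8, 2, 3]])
df = pd.DataFrame(data, index=["a", "b", "c", "d"], columns=["i", "ii", "iii"])

# Option 1: turn row positions into labels, then use loc.
via_loc = df.loc[df.index[0:2], ["ii", "iii"]]

# Option 2: turn column labels into positions, then use iloc.
col_positions = df.columns.get_indexer(["ii", "iii"])
via_iloc = df.iloc[0:2, col_positions]

print(via_loc.equals(via_iloc))  # True
```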
Tip: only slices by position (with iloc, or with ix when it falls back to positions) are left-closed and right-open; slices by label include both endpoints