Example of excluding specific rows from a pandas.DataFrame in Python
- 2020-05-26 09:38:27
- OfStack
preface
When you use Python for data analysis, one data structure you will use often is the pandas DataFrame. The basic operations of pandas.DataFrame in Python are covered in a separate article.
Excluding specific rows from a pandas.DataFrame
If we want to filter as in Excel, keeping only one row or a few rows, we can use the isin() method and pass in the values of the required rows as a list. You can also pass in a dictionary to specify which columns to filter.
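The two calling forms mentioned above can be sketched as follows. This is a minimal illustration, assuming the same five-row table used in the example later in this article; the variable names mask_list, mask_dict, rows_list and rows_dict are made up for the sketch. Note that the dictionary form is called on the whole DataFrame and returns a boolean DataFrame, so it has to be reduced to one value per row (here with any(axis=1)) before it can select rows.

```python
import pandas as pd

df = pd.DataFrame(
    [['GD', 'GX', 'FJ'], ['SD', 'SX', 'BJ'], ['HN', 'HB', 'AH'],
     ['HEN', 'HEN', 'HLJ'], ['SH', 'TJ', 'CQ']],
    columns=['p1', 'p2', 'p3'],
)

# List form on a single column: returns a boolean Series, one value per row.
mask_list = df['p1'].isin(['GD', 'HN'])

# Dictionary form on the whole DataFrame: keys are column names, and the
# result is a boolean DataFrame with the same shape as df. Columns that are
# not named in the dictionary come back as all False.
mask_dict = df.isin({'p1': ['GD', 'HN']})

# Index with the Series directly; reduce the DataFrame mask row-wise first.
rows_list = df[mask_list]
rows_dict = df[mask_dict.any(axis=1)]
```

Both selections keep the rows with index 0 and 2.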
But if we want everything except particular rows, there is no isnotin() method. I ran into exactly this requirement at work today. After a lot of searching, the approach I found was to use isin() in a roundabout way to implement it.
Here's an example:
In [1]: import pandas as pd
In [3]: df = pd.DataFrame([['GD', 'GX', 'FJ'], ['SD', 'SX', 'BJ'], ['HN', 'HB', 'AH'],
   ...:                    ['HEN', 'HEN', 'HLJ'], ['SH', 'TJ', 'CQ']],
   ...:                   columns=['p1', 'p2', 'p3'])
In [4]: df
Out[4]:
    p1   p2   p3
0   GD   GX   FJ
1   SD   SX   BJ
2   HN   HB   AH
3  HEN  HEN  HLJ
4   SH   TJ   CQ
If you only want the two rows where p1 is GD or HN, you can do this:
In [8]: df[df.p1.isin(['GD', 'HN'])]
Out[8]:
   p1  p2  p3
0  GD  GX  FJ
2  HN  HB  AH
But if we want everything other than these two rows, we need to take a small detour.
The idea is to convert the p1 column to a list, remove the unwanted values from that list, and then pass the remaining values to isin():
In [9]: ex_list = list(df.p1)
In [10]: ex_list.remove('GD')
In [11]: ex_list.remove('HN')
In [12]: ex_list
Out[12]: ['SD', 'HEN', 'SH']
In [13]: df[df.p1.isin(ex_list)]
Out[13]:
    p1   p2   p3
1   SD   SX   BJ
3  HEN  HEN  HLJ
4   SH   TJ   CQ
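As an alternative to building the list of remaining values by hand, the boolean mask returned by isin() can be inverted with the ~ operator, which is standard pandas boolean indexing. The sketch below is not the method used above; it assumes the same table, and the names unwanted, excluded, remaining and by_list are chosen only for illustration. It also checks that both routes select the same rows.

```python
import pandas as pd

df = pd.DataFrame(
    [['GD', 'GX', 'FJ'], ['SD', 'SX', 'BJ'], ['HN', 'HB', 'AH'],
     ['HEN', 'HEN', 'HLJ'], ['SH', 'TJ', 'CQ']],
    columns=['p1', 'p2', 'p3'],
)

unwanted = ['GD', 'HN']

# ~ flips True/False in the mask, so this keeps rows whose p1 is NOT listed.
excluded = df[~df['p1'].isin(unwanted)]

# The list-based route from the article, written as a comprehension so that
# every occurrence of an unwanted value is dropped, not just the first one.
remaining = [value for value in df['p1'] if value not in unwanted]
by_list = df[df['p1'].isin(remaining)]

print(excluded)
print(excluded.equals(by_list))
```

One thing to keep in mind with the list-based route: list.remove() deletes only the first matching element, so if p1 contained 'GD' more than once, the list would still include 'GD' after a single remove() call and those rows would not be excluded. The inverted mask does not depend on how often a value occurs.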
conclusion