Example of excluding specific rows from a pandas.DataFrame in Python

  • 2020-05-26 09:38:27
  • OfStack

preface

When you use Python for data analysis, one of the data structures you will use most often is the pandas DataFrame. The basic operations of pandas.DataFrame are covered in a separate article.

Excluding specific rows from a pandas.DataFrame

If we want to filter as in Excel, keeping only one row or a few rows, we can use the isin() method and pass in the values of the required rows as a list. You can also pass in a dictionary to specify which columns to filter on.
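
As a minimal sketch of both call styles: the small frame and the names `by_list`, `mask`, and `by_dict` below are made up for illustration and do not come from the original session. The list form is called on a single column, while the dict form is called on the whole DataFrame and returns a boolean frame that still has to be reduced to one flag per row.

```python
import pandas as pd

# Illustrative data only; not taken from the article's session.
df = pd.DataFrame({'p1': ['GD', 'SD', 'HN'], 'p2': ['GX', 'SX', 'HB']})

# List form: keep rows whose p1 value appears in the list.
by_list = df[df['p1'].isin(['GD', 'HN'])]

# Dict form: keys are column names, values are the allowed entries.
# Columns not named in the dict come back as all False,
# so any(axis=1) reduces the boolean frame to one flag per row.
mask = df.isin({'p1': ['GD', 'HN']}).any(axis=1)
by_dict = df[mask]

print(by_list)
print(by_dict)
```

With a single key in the dict, both forms should select the same rows; the dict form becomes more useful when several columns each have their own list of allowed values.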

But if we want everything except particular rows, there is no isnotin() method. I ran into exactly this need at work today. After a good deal of searching, I found that isin() can still be used to meet it, just in a roundabout way.

Here's an example:


In [2]: import pandas as pd

In [3]: df = pd.DataFrame([['GD', 'GX', 'FJ'], ['SD', 'SX', 'BJ'], ['HN', 'HB', 'AH'],
   ...:                    ['HEN', 'HEN', 'HLJ'], ['SH', 'TJ', 'CQ']],
   ...:                   columns=['p1', 'p2', 'p3'])

In [4]: df
Out[4]:
    p1   p2   p3
0   GD   GX   FJ
1   SD   SX   BJ
2   HN   HB   AH
3  HEN  HEN  HLJ
4   SH   TJ   CQ

If you only want the two rows whose p1 is GD or HN, you can do this:


In [8]: df[df.p1.isin(['GD', 'HN'])]
Out[8]:
   p1  p2  p3
0  GD  GX  FJ
2  HN  HB  AH

But if we want everything other than these two rows, we need to take a slight detour.

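The idea is to take the p1 column and convert it to a list, remove the unwanted values from that list, and then pass what remains to isin() on the DataFrame. Note that list.remove() deletes only the first matching element, so this relies on each unwanted value appearing only once in p1.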


In [9]: ex_list = list(df.p1)

In [10]: ex_list.remove('GD')

In [11]: ex_list.remove('HN')

In [12]: ex_list
Out[12]: ['SD', 'HEN', 'SH']

In [13]: df[df.p1.isin(ex_list)]
Out[13]:
    p1   p2   p3
1   SD   SX   BJ
3  HEN  HEN  HLJ
4   SH   TJ   CQ
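
A shorter route that should give the same result is to negate the boolean mask returned by isin() with the `~` operator, so the complement list never has to be built by hand. This is a sketch that reuses the data from the example; the names `unwanted` and `rest` are illustrative and not part of the original session.

```python
import pandas as pd

df = pd.DataFrame(
    [['GD', 'GX', 'FJ'], ['SD', 'SX', 'BJ'], ['HN', 'HB', 'AH'],
     ['HEN', 'HEN', 'HLJ'], ['SH', 'TJ', 'CQ']],
    columns=['p1', 'p2', 'p3'],
)

unwanted = ['GD', 'HN']

# isin() is True for the rows to drop; ~ flips the mask element by element,
# so indexing with it keeps every row whose p1 is NOT in the list.
rest = df[~df['p1'].isin(unwanted)]
print(rest)
```

Because the mask is evaluated row by row, every row carrying an unwanted value is dropped, even when that value occurs several times in p1.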
