Solving an error raised by pandas merge

  • 2021-10-24 23:24:35
  • OfStack

pandas reported this error when doing merge:

df22 = pd.merge(df1, df2, left_on='company_name', right_on='name', how='left')

Process finished with exit code 137

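After some checking, the reason turned out to be:

The two tables are too large, so the merge runs out of memory. Exit code 137 means the process was killed with SIGKILL (128 + 9), which is what typically happens when the operating system stops a process for using too much memory.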
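One way to lower the memory needed by such a merge is to check how big the tables are and to pass only the columns that are really needed. The following is a minimal sketch, not the original author's fix: the helper name merge_needed_columns and the columns x, city and unused are made up for illustration, and the tiny frames only stand in for the real df1 and df2.

```python
import pandas as pd


def merge_needed_columns(left, right, left_key, right_key, right_cols):
    """Left-join `left` with only the requested columns of `right`."""
    # Keep the join key plus the columns we actually want in the result.
    slim_right = right[[right_key] + list(right_cols)]
    return pd.merge(left, slim_right, left_on=left_key, right_on=right_key, how="left")


# Hypothetical stand-ins for the real tables.
df1 = pd.DataFrame({"company_name": ["a", "b", "c"], "x": [1, 2, 3]})
df2 = pd.DataFrame({"name": ["a", "b"], "city": ["P", "Q"], "unused": [0.1, 0.2]})

# Approximate memory footprint of each table, in bytes.
print(df1.memory_usage(deep=True).sum(), df2.memory_usage(deep=True).sum())

# Repeated keys on the right side multiply the rows of the result.
print(df2["name"].duplicated().sum())

df22 = merge_needed_columns(df1, df2, "company_name", "name", ["city"])
print(df22)
```

If the count of duplicated keys is large, the result can be far bigger than either input, so it is worth checking before the merge is run on the full data.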

Additional: a pitfall when using the pandas merge function (merge produces a large number of unexpected null values)

Anyone who has used the pandas merge function knows that it joins tables, and the left join is the most commonly used join in data processing. When using merge, the usual case looks like this:


dataframe1 : 
a b
1 1
2 2
3 3

dataframe2 : 
b c
1 2
2 3

dataframe = pd.merge(dataframe1, dataframe2, on='b', how='left')


dataframe:
a b c
1 1 2
2 2 3
3 3 nan
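
The example above can be reproduced with the short sketch below. It builds the two frames in memory instead of reading them from a file; note that pandas prints column c as floating point numbers here, because the missing value forces the column to a float type.

```python
import pandas as pd

dataframe1 = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3]})
dataframe2 = pd.DataFrame({"b": [1, 2], "c": [2, 3]})

# b = 3 has no partner in dataframe2, so c is missing on that row.
dataframe = pd.merge(dataframe1, dataframe2, on="b", how="left")
print(dataframe)
```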

But sometimes dataframe2 also contains a row for b = 3:


b c
1 2
2 3
3 4 

dataframe = pd.merge(dataframe1, dataframe2, on='b', how='left')


dataframe:
a b c
1 1 2
2 2 3
3 3 nan

Why is this?

The reason is that the data in a dataframe is usually read from a csv or xls file. A value in column b may look like 1 when the file is opened in Excel, but after reading with pandas, the same column may hold integers in one file and floating point numbers in another. The key is then 3 in one table and 3.0 in the other, and the row for 3 cannot be joined. This bites in particular when the keys are stored as text, because '3' and '3.0' are different strings.
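
A quick way to see this kind of mismatch is to look at the key columns directly and to ask merge which rows found no partner. The sketch below is only an illustration: it assumes the keys ended up as text, with '3' in one table and '3.0' in the other, and the frames are made up rather than read from real files.

```python
import pandas as pd

# Hypothetical data: the keys are text, and one table holds '3.0' instead of '3'.
dataframe1 = pd.DataFrame({"a": [1, 2, 3], "b": ["1", "2", "3"]})
dataframe2 = pd.DataFrame({"b": ["1", "2", "3.0"], "c": [2, 3, 4]})

# Compare the types and the raw values of the two key columns.
print(dataframe1["b"].dtype, dataframe2["b"].dtype)
print(dataframe1["b"].map(repr).tolist())
print(dataframe2["b"].map(repr).tolist())

# indicator=True adds a _merge column that says where each row came from.
checked = pd.merge(dataframe1, dataframe2, on="b", how="left", indicator=True)
unmatched = checked[checked["_merge"] == "left_only"]
print(unmatched)
```

Rows marked left_only are the ones that received null values, so printing their keys usually shows what differs between the two tables.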

Therefore, before using merge, always convert the join keys in both tables to the same type, either strings or integers.
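
As a minimal sketch of that advice, the keys of both tables can be converted to integers before the merge. The helper normalize_key is a made-up name, and it assumes that every key is a whole number and that no key is missing; the sample frames are illustrative only.

```python
import pandas as pd


def normalize_key(series):
    """Convert a key column to int64 (assumes whole numbers, no missing values)."""
    numeric = pd.to_numeric(series)
    # Refuse to truncate silently: 3.0 is fine, 3.5 is not a valid integer key.
    if not (numeric % 1 == 0).all():
        raise ValueError("key column contains non-integer values")
    return numeric.astype("int64")


# Hypothetical data with mismatched keys: '3' in one table, '3.0' in the other.
dataframe1 = pd.DataFrame({"a": [1, 2, 3], "b": ["1", "2", "3"]})
dataframe2 = pd.DataFrame({"b": ["1", "2", "3.0"], "c": [2, 3, 4]})

dataframe1["b"] = normalize_key(dataframe1["b"])
dataframe2["b"] = normalize_key(dataframe2["b"])

# With both key columns as int64, the row for 3 now finds its partner.
dataframe = pd.merge(dataframe1, dataframe2, on="b", how="left")
print(dataframe)
```

Integers are used here because they are unambiguous; converting both sides to strings only helps if the text itself is identical in both tables.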

