Left Join In Pandas With Approximately Equal Numeric Comparison

I am using the following to do a left join in Pandas: merged_left = pd.merge(left=xrf_df, right=statistics_and_notes_df, how='left',

Solution 1:

Assuming we have the following DFs:

In [111]: a
Out[111]:
      a  b  c
0  3.03  c  3
1  1.01  a  1
2  2.02  b  2

In [112]: b
Out[112]:
      a  x
0  1.02  Z
1  5.00  Y
2  3.04  X
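
To follow along, the two sample frames can be rebuilt as below. This is a minimal sketch: the values are read off the printed output above, and the dtypes (float64 for column a, int64 for column c) are an assumption.

```python
import pandas as pd

# Left frame: 'a' is the float column we want to join on approximately.
a = pd.DataFrame({
    "a": [3.03, 1.01, 2.02],
    "b": ["c", "a", "b"],
    "c": [3, 1, 2],
})

# Right frame: its 'a' values are close to, but not equal to, the left ones.
b = pd.DataFrame({
    "a": [1.02, 5.00, 3.04],
    "x": ["Z", "Y", "X"],
})
```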

Let's set the float64 join column as the index of each frame and sort by it:

In [113]: a = a.sort_values('a').set_index('a')

In [114]: b = b.assign(idx=b['a']).set_index('idx').sort_index()

In [115]: a
Out[115]:
      b  c
a
1.01  a  1
2.02  b  2
3.03  c  3

In [116]: b
Out[116]:
         a  x
idx
1.02  1.02  Z
3.04  3.04  X
5.00  5.00  Y

Now we can use DataFrame.reindex(..., method='nearest') to pick, for each index value in a, the row of b whose index is closest:

In [118]: a.join(b.reindex(a.index, method='nearest'), how='left')
Out[118]:
      b  c     a  x
a
1.01  a  1  1.02  Z
2.02  b  2  1.02  Z
3.03  c  3  3.04  X

In [119]: a.join(b.reindex(a.index, method='nearest'), how='left').rename(columns={'a':'a_right'})
Out[119]:
      b  c  a_right  x
a
1.01  a  1     1.02  Z
2.02  b  2     1.02  Z
3.03  c  3     3.04  X

In [120]: a.join(b.reindex(a.index, method='nearest'), how='left').rename(columns={'a':'a_right'}).reset_index()
Out[120]:
      a  b  c  a_right  x
0  1.01  a  1     1.02  Z
1  2.02  b  2     1.02  Z
2  3.03  c  3     3.04  X
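
As an alternative to the index-based approach above (not the method used in this answer), pandas also provides pd.merge_asof, which can match on the nearest key directly. The sketch below assumes the same sample data; the names left, right and out are hypothetical, and the right key is renamed to a_right so that both key columns survive the merge.

```python
import pandas as pd

# Both inputs to merge_asof must be sorted by their key column.
left = pd.DataFrame({
    "a": [3.03, 1.01, 2.02],
    "b": ["c", "a", "b"],
    "c": [3, 1, 2],
}).sort_values("a")

right = (
    pd.DataFrame({"a": [1.02, 5.00, 3.04], "x": ["Z", "Y", "X"]})
    .rename(columns={"a": "a_right"})  # keep the right key under its own name
    .sort_values("a_right")
)

# direction='nearest' picks the closest right key for every left row,
# which gives left-join semantics: every row of `left` is kept.
out = pd.merge_asof(
    left,
    right,
    left_on="a",
    right_on="a_right",
    direction="nearest",
)
print(out)
```

This avoids moving the key into the index, at the cost of requiring both frames to be sorted by their key columns.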

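PS: you may want to pass the tolerance parameter, df.reindex(..., method='nearest', tolerance=<value>), to limit how far apart matched values may be. A match is kept only if abs(index[indexer] - target) <= tolerance; otherwise the row is filled with NaN.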
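
A minimal sketch of the tolerance variant, assuming the same sample data and an arbitrarily chosen tolerance of 0.05 (pick a value that suits your data). The names matched and result are hypothetical.

```python
import pandas as pd

a = pd.DataFrame({"a": [3.03, 1.01, 2.02], "b": ["c", "a", "b"], "c": [3, 1, 2]})
b = pd.DataFrame({"a": [1.02, 5.00, 3.04], "x": ["Z", "Y", "X"]})

# Same preparation as above: the float key becomes a sorted index.
a = a.sort_values("a").set_index("a")
b = b.assign(idx=b["a"]).set_index("idx").sort_index()

# Only accept a nearest neighbour that is within 0.05 of the left key;
# rows with no neighbour that close are filled with NaN.
matched = b.reindex(a.index, method="nearest", tolerance=0.05)

result = (
    a.join(matched, how="left")
     .rename(columns={"a": "a_right"})
     .reset_index()
)
print(result)
```

Here 1.01 and 3.03 still match 1.02 and 3.04, while 2.02 has no neighbour within 0.05, so its a_right and x come back as NaN instead of being matched to 1.02.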
