Skip to content Skip to sidebar Skip to footer

Python2 Pandas: How To Merge A Part Of Another Dataframe To A Dataframe

I have a dataframe(df1) as following: datetime m d 1d 2d 3d 2014-01-01 1 1 2 2 3 2014-01-02 1 2 3 4 3 2014-01-03

Solution 1:

I think you need add NaNs by loc and then merge with left join:

np.random.seed(10)
N = 365
rng = pd.date_range('2015-01-01', periods=N)
df_tr_2014 = pd.DataFrame(np.random.randint(10, size=(N, 3)), index=rng).reset_index()
df_tr_2014.columns = ['datetime','7d','15d','20d']
df_tr_2014.insert(1,'month', df_tr_2014['datetime'].dt.month)
df_tr_2014.insert(2,'day_m', df_tr_2014['datetime'].dt.day)
#print (df_tr_2014.head())

N = 366
rng = pd.date_range('2016-01-01', periods=N)
df_te = pd.DataFrame(index=rng)
df_te['month'] = df_te.index.month
df_te['day_m'] = df_te.index.day
df_te = df_te.reset_index()
#print (df_te.tail())
df2 = df_te.copy()
df1 = df_tr_2014.copy()

df1 = df1.set_index('datetime')
df1.index += pd.offsets.DateOffset(years=1)

#correct 29 February
y = df1.index[0].year
df1 = df1.reindex(pd.date_range(pd.datetime(y,1,1), pd.datetime(y,12,31)))
idx = df1.index[(df1.index.month == 2) & (df1.index.day == 29)]

df1.loc[idx, :] = df1.loc[idx - pd.Timedelta(1, unit='d'), :].values
df1.loc[idx, 'day_m'] = idx.day
df1[['month','day_m']] = df1[['month','day_m']].astype(int)

df1[['7d','15d', '20d']] = df1[['7d','15d', '20d']].astype(float)

df1.loc[np.arange(len(df1.index))  % 30 == 0, ['7d','15d','20d']] = 0
df1 = df1.reset_index()
print (df1.iloc[57:62])
        index  month  day_m   7d  15d  20d
572016-02-272272.00.01.0582016-02-282282.03.05.0592016-02-292292.03.05.0602016-03-01310.00.00.0612016-03-02327.06.09.0

Solution 2:

Why don't you just remove the rows in df1 that don't match (m, d) pairs in df2?

df_new = df2.drop(df2[(not ((df2.m == df1.m) & (df2.n == df1.n)).any()) or (df2.index % 30 == 0)].index)

Or something along those lines.

Link to a related answer.

I'm not enormously familiar with Pandas and have not tested the above example.

Solution 3:

df_te is df2 df_tr_2014 is df1 7d 15d 20 is 1d 2d 3d respectively in question. size_df_te is the length of df_te, month and day_m are m, d in df2

df_te['7d'] = 0
df_te['15d'] = 0
df_te['20d'] = 0
mj = 0
dj = 0for i inrange(size_df_te):
    if i%30 != 0:
        m = df_te.loc[i,'month']
        d = df_te.loc[i,'day_m']
        if (m== 2) & (d == 29):
            m = 2
            d = 28
        dk_7 = df_tr_2014.loc[(df_tr_2014['month']==m) & (df_tr_2014['day_m']==d)]['7d']
        dk_15 = df_tr_2014.loc[(df_tr_2014['month']==m) & (df_tr_2014['day_m']==d)]['15d']
        dk_20 = df_tr_2014.loc[(df_tr_2014['month']==m) & (df_tr_2014['day_m']==d)]['20d']
        df_te.loc[i,'7d'] = float(dk_7)
        df_te.loc[i,'15d'] = float(dk_15)
        df_te.loc[i,'20d'] = float(dk_20)

EDIT:

Sample data:

np.random.seed(10)
N = 365
rng = pd.date_range('2014-01-01', periods=N)
df_tr_2014 = pd.DataFrame(np.random.randint(10, size=(N, 3)), index=rng).reset_index()
df_tr_2014.columns = ['datetime','7d','15d','20d']
df_tr_2014.insert(1,'month', df_tr_2014['datetime'].dt.month)
df_tr_2014.insert(2,'day_m', df_tr_2014['datetime'].dt.day)
#print (df_tr_2014.head())

N = 365
rng = pd.date_range('2015-01-01', periods=N)
df_te = pd.DataFrame(index=rng)
df_te['month'] = df_te.index.month
df_te['day_m'] = df_te.index.day
df_te = df_te.reset_index()
#print (df_te.head())

Post a Comment for "Python2 Pandas: How To Merge A Part Of Another Dataframe To A Dataframe"