
Groupby Max Value And Return Corresponding Row In Pandas Dataframe

My dataframe consists of students, dates, and test scores. I want to find the max date for each student and return the corresponding row (ultimately, I am most interested in the st

Solution 1:

You can sort the data frame by Date and then use groupby.tail to get the most recent record:

df.iloc[pd.to_datetime(df.Date, format='%m/%d/%y').argsort()].groupby('Student_id').tail(1)

#  Student_id      Date  Score
#2       Lia1  12/13/16  0.845
#0      Tina1   1/17/17  0.950
#3      John2   1/25/17  0.975

Or, to avoid sorting, use idxmax (this works only if the index has no duplicate labels, because idxmax returns index labels that df.loc then looks up):

df.loc[pd.to_datetime(df.Date, format='%m/%d/%y').groupby(df.Student_id).idxmax()]

#  Student_id      Date  Score
#3      John2   1/25/17  0.975
#2       Lia1  12/13/16  0.845
#0      Tina1   1/17/17  0.950
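
To try both approaches end to end, here is a minimal self-contained sketch. The sample data is an assumption: the original data frame is not shown in full, so three rows mirror the output above and the row at index 1 is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: rows 0, 2 and 3 mirror the output shown above;
# row 1 is invented so that John2 has more than one record.
df = pd.DataFrame({
    "Student_id": ["Tina1", "John2", "Lia1", "John2"],
    "Date": ["1/17/17", "1/18/17", "12/13/16", "1/25/17"],
    "Score": [0.950, 0.850, 0.845, 0.975],
})

# Parse the date strings once; comparing the raw strings would be wrong
# (for example, "12/13/16" sorts after "1/25/17" as text).
dates = pd.to_datetime(df["Date"], format="%m/%d/%y")

# Approach 1: reorder rows by date, then keep the last row of each group.
latest_sorted = df.iloc[dates.argsort()].groupby("Student_id").tail(1)

# Approach 2: find the index label of the max date per student, then look
# those labels up. This relies on the index labels being unique.
latest_idxmax = df.loc[dates.groupby(df["Student_id"]).idxmax()]

print(latest_sorted)
print(latest_idxmax)
```

Both results contain the same rows and differ only in order: tail keeps the date-sorted order, while idxmax follows the sorted group keys. If the index might contain duplicate labels, calling df.reset_index(drop=True) first is one way to make the idxmax version safe.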
