
Pandas - Df.loc - Can Only Compare Identically-labelled Series

My code below (sorry, I cannot share the exact data) takes a df, filters it by date range, and re-labels certain dates. I then want to pull those re-labeled dates back into the original df.

Solution 1:

This is not an answer; see my comments in the code. Also, at this point, I think this question is more appropriate for Code Review.

finaldf['Completed_Date'] = pd.to_datetime(finaldf['Completed_Date'], format="%m/%d/%Y").dt.date

# making it lower case y made it work 
finaldf['Due_Date'] = pd.to_datetime(finaldf['Due_Date'], format="%m/%d/%y").dt.date 

# this worked as of 4.16
current_week_flags = (finaldf.Completed_Date >= last_monday.date()) & (finaldf.Completed_Date <= today.date()) 
earlydue = (finaldf.Due_Date < last_monday.date())

flags = current_week_flags & earlydue
finaldfmon = finaldf[current_week_flags].copy()  # .copy() avoids SettingWithCopyWarning on the assignment below

# here we move every due date before Monday to Monday, within the rows filtered by completed date
# this works because last_monday is a single day
# (.date() keeps the column as plain dates instead of mixing in a Timestamp)
finaldfmon.loc[(finaldfmon['Due_Date'] < last_monday.date()), 'Due_Date'] = last_monday.date()

# this fails in two places:
# finaldf.loc[(finaldf['Due_Date'] != finaldfmon['Due_Date']), 'Due_Date'] = finaldfmon['Due_Date']
#
# finaldf['Due_Date'] != finaldfmon['Due_Date']
#   these two series have different lengths, so you can't compare them
#   even if they had the same length, they have different indices
#   (unless one of them is a single number/date, then it becomes the case above)
#
# finaldf.loc[..., 'Due_Date'] = finaldfmon['Due_Date']
#   same story

writer = pd.ExcelWriter('currentweek.xlsx', engine='xlsxwriter')
finaldf.to_excel(writer, index=False, sheet_name='Sheet1')    
writer.close()  # writer.save() was removed in pandas 2.0; close() also writes the file
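To make the failing comparison concrete, here is a minimal sketch on toy data. The dates and the fixed last_monday value are made up for illustration; only the names finaldf and finaldfmon come from the code above. It shows that comparing the full column with the filtered one raises the error from the title, and one possible way to write the filtered values back by index label rather than by comparison.

```python
import datetime as dt
import pandas as pd

last_monday = dt.date(2019, 4, 15)  # hypothetical "last Monday"

# Toy stand-in for finaldf (made-up dates)
finaldf = pd.DataFrame({
    "Completed_Date": [dt.date(2019, 4, 16), dt.date(2019, 4, 1), dt.date(2019, 4, 17)],
    "Due_Date": [dt.date(2019, 4, 10), dt.date(2019, 4, 2), dt.date(2019, 4, 20)],
})

# Filtered copy: only index labels 0 and 2 survive the filter
finaldfmon = finaldf[finaldf["Completed_Date"] >= last_monday].copy()
finaldfmon.loc[finaldfmon["Due_Date"] < last_monday, "Due_Date"] = last_monday

# Comparing Series whose indices differ raises ValueError
try:
    finaldf["Due_Date"] != finaldfmon["Due_Date"]
    comparison_error = None
except ValueError as exc:
    comparison_error = str(exc)
print(comparison_error)

# Write-back by index label: only the rows present in finaldfmon are touched
finaldf.loc[finaldfmon.index, "Due_Date"] = finaldfmon["Due_Date"]
print(finaldf)
```

Selecting the target rows with finaldfmon.index sidesteps the comparison entirely, since the filtered frame already knows which rows of the original it came from.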

The code below (mainly the last line) achieves the goal:

import pandas as pd
import xlrd  # added when using Visual Studio
from datetime import datetime
#read in excel file
finaldf = pd.read_excel("scrubcomplete.xlsx", dtype=object)  # encoding= dropped: read_excel in recent pandas versions does not accept it
finaldf.columns = finaldf.columns.str.strip().str.replace(' ', '_', regex=False).str.replace('(', '', regex=False).str.replace(')', '', regex=False)
#
today = pd.to_datetime(datetime.now().date())
day_of_week = today.dayofweek
last_monday = today - pd.to_timedelta(day_of_week, unit='d') 
# if day_of_week != 0:  (check left disabled, so the lines below always run)
finaldf['Completed_Date'] = pd.to_datetime(finaldf['Completed_Date'], format="%m/%d/%Y").dt.date
finaldf['Due_Date'] = pd.to_datetime(finaldf['Due_Date'], format="%m/%d/%y").dt.date  # making it lower case y made it work
current_week_flags = (finaldf.Completed_Date >= last_monday.date()) & (finaldf.Completed_Date <= today.date())
# one boolean mask built on finaldf itself, so the mask and the frame always share the same index
# (.date() keeps the column as plain dates instead of mixing in a Timestamp)
finaldf.loc[current_week_flags & (finaldf['Due_Date'] < last_monday.date()), 'Due_Date'] = last_monday.date()
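
For anyone who wants to try the single-mask approach without the spreadsheet, here is a small self-contained sketch. The three rows and the fixed value of today are invented so that the result is reproducible; the column names and date formats follow the code above.

```python
import pandas as pd

# Fixed "today" (a Thursday) so the example is reproducible; hypothetical date
today = pd.Timestamp("2019-04-18")
last_monday = today - pd.to_timedelta(today.dayofweek, unit="d")

# Made-up rows using the same string formats as the original columns
df = pd.DataFrame({
    "Completed_Date": ["04/16/2019", "04/01/2019", "04/17/2019"],
    "Due_Date": ["04/10/19", "04/02/19", "04/20/19"],
})
df["Completed_Date"] = pd.to_datetime(df["Completed_Date"], format="%m/%d/%Y").dt.date
df["Due_Date"] = pd.to_datetime(df["Due_Date"], format="%m/%d/%y").dt.date

# Both masks are computed on df itself, so they line up with its index
completed_this_week = (df["Completed_Date"] >= last_monday.date()) & (df["Completed_Date"] <= today.date())
due_before_monday = df["Due_Date"] < last_monday.date()

# Only rows completed this week AND due before Monday get their due date moved
df.loc[completed_this_week & due_before_monday, "Due_Date"] = last_monday.date()
print(df)
```

Because no second, filtered DataFrame is involved, there is nothing to compare or merge back, which is why the identically-labelled error cannot occur here.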
