The Truth Value Of A Series Is Ambiguous In Dataframe
I have the same code. I'm trying to create a new field in a pandas DataFrame with simple conditions: if df_reader['email1_b']=='NaN': df_reader['email1_fin']=df_reader['email1_a'] e
Solution 1:
df_reader['email1_b']=='NaN' is a vector of Boolean values (one per row), but if needs a single Boolean value to work. Use this instead:
df_reader['email1_fin'] = np.where(df_reader['email1_b']=='NaN',
                                   df_reader['email1_a'],
                                   df_reader['email1_b'])
As a side note, are you sure about 'NaN'? Is it not NaN? In the latter case, your expression should be:
df_reader['email1_fin'] = np.where(df_reader['email1_b'].isnull(),
                                   df_reader['email1_a'],
                                   df_reader['email1_b'])
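For anyone who wants to try this end to end, here is a minimal runnable sketch of the isnull() variant. The sample rows are made up for illustration; only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_reader = pd.DataFrame({
    "email1_a": ["a1@example.com", "a2@example.com", "a3@example.com"],
    "email1_b": ["b1@example.com", np.nan, "b3@example.com"],
})

# Row by row: take email1_a where email1_b is missing, otherwise keep email1_b.
df_reader["email1_fin"] = np.where(
    df_reader["email1_b"].isnull(),
    df_reader["email1_a"],
    df_reader["email1_b"],
)

result = df_reader["email1_fin"].tolist()
print(result)
```

If the only goal is to fill missing values from another column, df_reader['email1_b'].fillna(df_reader['email1_a']) should give the same column with less typing.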
Solution 2:
if expects a single scalar value. It doesn't understand an array of booleans, which is what your condition returns. If you think about it, what should it do if only some of the values in this array are True and the rest are False?
To do this properly, you can do the following:
df_reader['email1_fin'] = np.where(df_reader['email1_b'] == 'NaN', df_reader['email1_a'], df_reader['email1_b'] )
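To see where the error in the title comes from, the small sketch below triggers it on purpose, using a made-up Series. It also shows .any() and .all(), which reduce the mask to one Boolean for the cases where a single yes/no answer for the whole column really is what you want.

```python
import pandas as pd

# Hypothetical data for illustration.
s = pd.Series(["x", "NaN", "y"])
mask = s == "NaN"  # one Boolean per row, not a single True/False

error_message = None
try:
    if mask:  # pandas refuses to guess what a whole Series means as a condition
        pass
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Explicit reductions give `if` the single Boolean it needs.
any_match = bool(mask.any())  # is at least one row equal to 'NaN'?
all_match = bool(mask.all())  # is every row equal to 'NaN'?
```

Neither reduction replaces np.where here, since the question asks for a per-row choice rather than one decision for the whole column.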
Also, you seem to be comparing against the str 'NaN' rather than the numerical NaN. Is this intended?
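The difference matters in practice. A quick sketch, assuming a made-up column that holds both the literal text 'NaN' and a real missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical column: the text 'NaN', a real missing value, and a normal entry.
s = pd.Series(["NaN", np.nan, "c@example.com"])

# Comparing with the string only finds the literal text 'NaN'.
string_matches = (s == "NaN").tolist()

# isnull() only finds the real missing value.
missing = s.isnull().tolist()

# NaN never compares equal to anything, including itself,
# which is why == is the wrong tool for detecting it.
nan_equals_itself = np.nan == np.nan

print(string_matches, missing, nan_equals_itself)
```

If the data contains both forms, one option is to combine the two masks with | before passing them to np.where.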