Replace Duplicate Values Across Columns In Pandas
I have a simple dataframe as such: df = [ {'col1' : 'A', 'col2': 'B', 'col3': 'C', 'col4':'0'}, {'col1' : 'M', 'col2': '0', 'col3': 'M', 'col4':'0'}, {
Solution 1:
You can use the duplicated method to return a boolean indexer marking whether each element duplicates an earlier one:
In [214]: pd.Series(['M', '0', 'M', '0']).duplicated()
Out[214]:
0    False
1    False
2     True
3     True
dtype: bool
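As a small aside, here is a sketch of how the keep argument of duplicated changes which occurrences get flagged. The variable names are only illustrative; the default, keep='first', is the behaviour the rest of this answer relies on.

```python
import pandas as pd

s = pd.Series(['M', '0', 'M', '0'])

# Default: the first occurrence is kept, later repeats are flagged.
first = s.duplicated()

# keep='last': the last occurrence is kept, earlier repeats are flagged.
last = s.duplicated(keep='last')

# keep=False: every value that occurs more than once is flagged.
every = s.duplicated(keep=False)

print(first.tolist())  # [False, False, True, True]
print(last.tolist())   # [True, True, False, False]
print(every.tolist())  # [True, True, True, True]
```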
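Then you could create a mask by applying this across the rows of your dataframe, and use where to perform the substitution. Note that where keeps values wherever the condition is True and substitutes the replacement elsewhere, which is why the mask is negated: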
is_duplicate = df.apply(pd.Series.duplicated, axis=1)
df.where(~is_duplicate, 0)
  col1 col2 col3 col4
0    A    B    C    0
1    M    0    0    0
2    B    0    0    0
3    X    0    Y    0
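For anyone who wants to run this end to end, here is a minimal self-contained sketch. The first two rows are taken from the question; the last two are assumptions made up for illustration, since the original data is cut off. It also substitutes the string '0' instead of the integer 0, on the assumption that you want the replaced cells to match the existing string zeros.

```python
import pandas as pd

rows = [
    {'col1': 'A', 'col2': 'B', 'col3': 'C', 'col4': '0'},  # from the question
    {'col1': 'M', 'col2': '0', 'col3': 'M', 'col4': '0'},  # from the question
    {'col1': 'B', 'col2': 'B', 'col3': '0', 'col4': 'B'},  # hypothetical row
    {'col1': 'X', 'col2': '0', 'col3': 'Y', 'col4': '0'},  # hypothetical row
]
df = pd.DataFrame(rows)

# Row-wise mask: True where a value already appeared earlier in the same row.
is_duplicate = df.apply(pd.Series.duplicated, axis=1)

# Keep first occurrences; replace later repeats with the string '0'.
result = df.where(~is_duplicate, '0')
print(result)
```

Because '0' is itself a value in the rows, a second '0' in a row is also flagged as a duplicate and replaced with '0', so those cells look unchanged in the result.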