Find Duplicates For Mixed Type Values In Dictionaries
I would like to recognize and group duplicate values in a dictionary. To do this I build a pseudo-hash (or rather, a signature) of my data set as follows: from pickle import dumps
Solution 1:
The first thing is to remove the call to deepcopy, which is your bottleneck here:
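import collections.abc

def faithfulrepr(ds):
    # Mappings: sort the items by key so that insertion order does not
    # change the signature, then recurse into the values.
    if isinstance(ds, collections.abc.Mapping):
        res = collections.OrderedDict(
            (k, faithfulrepr(v)) for k, v in sorted(ds.items())
        )
    # Lists: keep the order, recurse into each element.
    elif isinstance(ds, list):
        res = [faithfulrepr(v) for v in ds]
    # Anything else is used as-is.
    else:
        res = ds
    return repr(res)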
However, sorted and repr have their drawbacks:
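- you can't truly compare custom types: unless a class defines its own __repr__, the default one embeds the object's address, so two equal instances get different signatures;
- you can't use mappings with different types of keys: in Python 3, sorted raises TypeError when the keys are not mutually orderable (for example int and str).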
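Both drawbacks are easy to reproduce. The sketch below assumes Python 3 and uses a made-up class, Point, that defines __eq__ but no __repr__; it is only an illustration and not part of the original code.

```python
class Point:
    """Hypothetical custom type: has __eq__, but no custom __repr__."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


a, b = Point(1, 2), Point(1, 2)

# The two objects are equal according to __eq__ ...
equal_by_eq = a == b
# ... but the default repr embeds each object's address, so the
# repr-based signatures of these two live objects differ.
equal_by_repr = repr(a) == repr(b)

# Sorting the items of a mapping with int and str keys fails in
# Python 3, because int and str cannot be ordered against each other.
try:
    sorted({1: "a", "b": 2}.items())
    mixed_keys_sortable = True
except TypeError:
    mixed_keys_sortable = False

print(equal_by_eq, equal_by_repr, mixed_keys_sortable)
```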
So the second thing is to get rid of faithfulrepr and compare objects with __eq__:
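# binder[i] holds the keys whose value equals values[i]
binder, values = [], []
for key, value in ds.items():
    try:
        # list.index compares with ==, i.e. it relies on __eq__
        index = values.index(value)
    except ValueError:
        # first time this value is seen: start a new group
        values.append(value)
        binder.append([key])
    else:
        # value already known: add the key to its group
        binder[index].append(key)
grouped = dict(zip(map(tuple, binder), values))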
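To see what grouped looks like, here is the same loop wrapped in a function and run on sample data. The function name group_duplicates and the sample dictionary are assumptions made for this illustration; the logic is unchanged.

```python
def group_duplicates(ds):
    """Map each tuple of keys to the one value those keys share."""
    binder, values = [], []
    for key, value in ds.items():
        try:
            index = values.index(value)  # linear scan, compares with ==
        except ValueError:
            values.append(value)
            binder.append([key])
        else:
            binder[index].append(key)
    return dict(zip(map(tuple, binder), values))


# Sample data (made up): unhashable values of mixed types.
data = {"a": [1, 2], "b": {"x": 1}, "c": [1, 2], "d": {"x": 1}, "e": 3}
grouped = group_duplicates(data)
print(grouped)
```

Since values.index scans the list of distinct values on every iteration, the running time grows quadratically with the number of distinct values. In exchange, the values need to be neither hashable nor sortable. Note also that == treats 1, 1.0 and True as equal, so such values end up in the same group.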