Relationship Between Pickle And Deepcopy
Solution 1:
You should not be confused by (1) and (2). In general, Python tries to include sensible fall-backs for missing methods. (For instance, it is enough to define __getitem__ in order to have an iterable class, but it may be more efficient to also implement __iter__. The same holds for operators like __add__, with its optional in-place counterpart __iadd__, etc.)
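As a small illustration of the fall-back idea, the sketch below defines a class with only __getitem__ and iterates over it. The Countdown class is a made-up example, not something from the question.

```python
class Countdown:
    """Hypothetical class that defines __getitem__ but no __iter__."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        # Raising IndexError tells the fall-back iterator to stop.
        if index < 0 or index >= self.start:
            raise IndexError(index)
        return self.start - index


# iter() falls back to calling __getitem__ with 0, 1, 2, ...
items = list(Countdown(3))
print(items)
```

Adding an __iter__ method later would change nothing for callers; Python would simply prefer it over the __getitem__ fall-back.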
__deepcopy__ is the most specialized method that deepcopy() will look for, but if it does not exist, falling back to the pickle protocol is a sensible thing to do. It does not really call dumps()/loads(), because it does not rely on the intermediate representation being a string, but it will indirectly make use of __getstate__ and __setstate__ (via __reduce__), as you observed.
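This can be observed directly. The sketch below assumes Python 3 and a made-up Tracked class that records which hooks are called while deepcopy() runs; no pickle string is produced at any point.

```python
import copy

calls = []  # records which pickle-protocol hooks deepcopy() triggers


class Tracked:
    """Hypothetical class with no __deepcopy__, only the state hooks."""

    def __init__(self, value):
        self.value = value

    def __getstate__(self):
        calls.append("getstate")
        return {"value": self.value}

    def __setstate__(self, state):
        calls.append("setstate")
        self.value = state["value"]


original = Tracked([1, 2])
clone = copy.deepcopy(original)

print(calls)
print(clone.value is original.value)  # the nested list was copied too
```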
Currently, the documentation still states
… The copy module does not use the copy_reg registration module.
but that seems to describe a limitation that has been fixed in the meantime (possibly, the documentation on the 2.7 branch has not gotten enough attention here).
Also note that this is pretty deeply integrated into Python (at least nowadays); the object class itself implements __reduce__ (and its versioned variant, __reduce_ex__), which refers to copy_reg.__newobj__ for creating fresh instances of the given object-derived class.
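To see what the default implementation hands back, one can call __reduce_ex__ by hand. This sketch assumes Python 3, where the copy_reg module is named copyreg; the Point class is only for illustration.

```python
import copyreg  # called copy_reg in Python 2


class Point:
    """Hypothetical plain class that relies on object's default __reduce_ex__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# Protocol 2 yields (factory, factory_args, state, ...).
factory, args, state = Point(1, 2).__reduce_ex__(2)[:3]

# Rebuild by hand, roughly the way copy and pickle would.
rebuilt = factory(*args)        # creates the instance without calling __init__
rebuilt.__dict__.update(state)  # restores the attributes

print(factory is copyreg.__newobj__, args, state)
```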
Solution 2:
Ok, I had to read the source code for this one, but it looks like it's a pretty simple answer. http://svn.python.org/projects/python/trunk/Lib/copy.py
copy first looks up the builtin types whose constructors it knows about (registered in the _copy_dispatch dictionary), and when it doesn't know how to copy the type, it imports copy_reg.dispatch_table, which is the place where pickle registers the methods it knows for producing new copies of objects. Essentially, it's a dictionary mapping the type of an object to the "function to produce a new object". This "function to produce a new object" is pretty much what you write when you write a __reduce__ or a __reduce_ex__ method for an object (and if one of those is missing or needs help, it defers to the __setstate__, __getstate__, etc. methods).
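For a concrete picture of such a "function to produce a new object", here is a minimal sketch of a __reduce__ method, assuming Python 3. The Temperature class is hypothetical; the point is only the shape of the return value, a callable plus the arguments to call it with.

```python
import copy


class Temperature:
    """Hypothetical class that describes how to rebuild itself."""

    def __init__(self, degrees, unit="C"):
        self.degrees = degrees
        self.unit = unit

    def __reduce__(self):
        # (callable, args): calling callable(*args) must recreate the object.
        return (Temperature, (self.degrees, self.unit))


t = Temperature(21.5, "C")
clone = copy.copy(t)  # copy finds __reduce__ through __reduce_ex__

print(clone.degrees, clone.unit, clone is t)
```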
So that's copy. Basically… (with some additional clauses…)
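    def copy(x):
        """Shallow copy operation on arbitrary Python objects.

        See the module's __doc__ string for more info.
        """
        cls = type(x)

        # 1. Built-in types that copy knows how to handle directly.
        copier = _copy_dispatch.get(cls)
        if copier:
            return copier(x)

        # 2. A class-specific __copy__ method.
        copier = getattr(cls, "__copy__", None)
        if copier:
            return copier(x)

        # 3. A reduction function registered with copy_reg (shared with pickle).
        reductor = dispatch_table.get(cls)
        if reductor:
            rv = reductor(x)
        else:
            # 4. Fall back to the pickle protocol methods on the object itself.
            reductor = getattr(x, "__reduce_ex__", None)
            if reductor:
                rv = reductor(2)
            else:
                reductor = getattr(x, "__reduce__", None)
                if reductor:
                    rv = reductor()
                else:
                    raise Error("un(shallow)copyable object of type %s" % cls)

        # Rebuild a new object from the reduce value.
        return _reconstruct(x, rv, 0)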
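The lookup order in the excerpt above can be checked with a small experiment. The sketch below assumes Python 3 (where the order is the same for these two hooks) and a made-up class that defines both __copy__ and __reduce__, recording which one copy() actually uses.

```python
import copy

used = []  # records which hook copy.copy() picks


class Both:
    """Hypothetical class offering both a __copy__ and a __reduce__ hook."""

    def __copy__(self):
        used.append("__copy__")
        return Both()

    def __reduce__(self):
        used.append("__reduce__")
        return (Both, ())


duplicate = copy.copy(Both())
print(used)
```

Since __copy__ is checked before the reduce machinery, the __reduce__ method is never consulted here.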
deepcopy does the same thing as the above, but in addition it recurses into each object and makes sure that every contained object is itself copied rather than shared by reference. deepcopy builds its own _deepcopy_dispatch table (a dict) where it registers functions that ensure the new objects produced do not hold references to the originals (possibly generated with the __reduce__ functions registered in copy_reg.dispatch_table).
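The difference between the two is easiest to see with a nested container. This is a minimal sketch using only built-in types; the variable names are arbitrary.

```python
import copy

inner = [1, 2]
outer = {"data": inner}

shallow = copy.copy(outer)
deep = copy.deepcopy(outer)

# The shallow copy is a new dict that still points at the same inner list.
print(shallow["data"] is inner)

# The deep copy has its own, equal but distinct, inner list.
print(deep["data"] is inner, deep["data"] == inner)

# Shared references inside one structure stay shared in the deep copy.
pair = [inner, inner]
deep_pair = copy.deepcopy(pair)
print(deep_pair[0] is deep_pair[1])
```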
Hence writing a __reduce__ method (or similar), or registering an equivalent function with copy_reg, should enable copy and deepcopy to do their thing as well.
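A minimal sketch of the registration route, assuming Python 3 (where copy_reg is spelled copyreg). The Money class and reduce_money function are hypothetical names; the counter only exists to show that both copy and deepcopy go through the registered function.

```python
import copy
import copyreg  # called copy_reg in Python 2

reducer_calls = []  # counts how often the registered function is used


class Money:
    """Hypothetical class with no copy or pickle hooks of its own."""

    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency


def reduce_money(m):
    # Same contract as __reduce__: return (callable, args).
    reducer_calls.append(m.amount)
    return Money, (m.amount, m.currency)


# Register the reduction function in copyreg.dispatch_table.
copyreg.pickle(Money, reduce_money)

wallet = [Money(5, "EUR")]
single = copy.copy(wallet[0])        # shallow copy uses reduce_money
deep_wallet = copy.deepcopy(wallet)  # so does the deep copy

print(len(reducer_calls), deep_wallet[0].amount, deep_wallet[0] is wallet[0])
```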