dill package documentation
dill: serialize all of Python
About Dill
dill extends Python’s pickle module for serializing and de-serializing
Python objects to the majority of the built-in Python types. Serialization
is the process of converting an object to a byte stream, and its inverse
is converting a byte stream back to a Python object hierarchy.
dill provides the user the same interface as the pickle module, and
also includes some additional features. In addition to pickling Python
objects, dill provides the ability to save the state of an interpreter
session in a single command. Hence, it would be feasible to save an
interpreter session, close the interpreter, ship the pickled file to
another computer, open a new interpreter, unpickle the session and
thus continue from the ‘saved’ state of the original interpreter
session.
dill can be used to store Python objects to a file, but the primary
usage is to send Python objects across the network as a byte stream.
dill is quite flexible, and allows arbitrary user defined classes
and functions to be serialized. Thus dill is not intended to be
secure against erroneously or maliciously constructed data. It is
left to the user to decide whether the data they unpickle is from
a trustworthy source.
dill is part of pathos, a Python framework for heterogeneous computing.
dill is in active development, so any user feedback, bug reports, comments,
or suggestions are highly appreciated. A list of issues is located at
https://github.com/uqfoundation/dill/issues, with a legacy list maintained at
https://uqfoundation.github.io/project/pathos/query.
Major Features
dill can pickle the following standard types:
none, type, bool, int, float, complex, bytes, str,
tuple, list, dict, file, buffer, builtin,
Python classes, namedtuples, dataclasses, metaclasses,
instances of classes,
set, frozenset, array, functions, exceptions
dill can also pickle more ‘exotic’ standard types:
functions with yields, nested functions, lambdas,
cell, method, unboundmethod, module, code, methodwrapper,
methoddescriptor, getsetdescriptor, memberdescriptor, wrapperdescriptor,
dictproxy, slice, notimplemented, ellipsis, quit
dill cannot yet pickle these standard types:
frame, generator, traceback
dill also provides the capability to:
save and load Python interpreter sessions
save and extract the source code from functions and classes
interactively diagnose pickling errors
Current Release
The latest released version of dill is available from:
dill is distributed under a 3-clause BSD license.
Development Version
You can get the latest development version with all the shiny new features at:
If you have a new contribution, please submit a pull request.
Installation
dill can be installed with pip:
$ pip install dill
To optionally include the objgraph diagnostic tool in the install:
$ pip install dill[graph]
To optionally include the gprof2dot diagnostic tool in the install:
$ pip install dill[profile]
For Windows users, to optionally install session history tools:
$ pip install dill[readline]
Requirements
dill requires:
python (or pypy), >=3.8
setuptools, >=42
Optional requirements:
objgraph, >=1.7.2
gprof2dot, >=2022.7.29
pyreadline, >=1.7.1 (on Windows)
Basic Usage
dill is a drop-in replacement for pickle. Existing code can be
updated to allow complete pickling using:
>>> import dill as pickle
or:
>>> from dill import dumps, loads
dumps converts the object to a unique byte string, and loads performs
the inverse operation:
>>> squared = lambda x: x**2
>>> loads(dumps(squared))(3)
9
There are a number of options to control serialization which are provided
as keyword arguments to several dill functions:
with protocol, the pickle protocol level can be set. The default is the same value as the pickle module's DEFAULT_PROTOCOL.
with byref=True, dill behaves a lot more like pickle, with certain objects (like modules) pickled by reference as opposed to attempting to pickle the object itself.
with recurse=True, objects referred to in the global dictionary are recursively traced and pickled, instead of the default behavior of attempting to store the entire global dictionary.
with fmode, the contents of the file can be pickled along with the file handle, which is useful if the object is being sent over the wire to a remote system which does not have the original file on disk. Options are HANDLE_FMODE for just the handle, CONTENTS_FMODE for the file content and FILE_FMODE for content and handle.
with ignore=False, objects reconstructed with types defined in the top-level script environment use the existing type in the environment rather than a possibly different reconstructed type.
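The options above can be passed per call. The following is a minimal sketch, assuming a function that depends on one module-level global; the names scale and FACTOR are hypothetical and not part of dill:

```python
import dill

FACTOR = 3  # hypothetical module-level global used by the function below


def scale(x):
    # hypothetical example function that depends on a global
    return FACTOR * x


# protocol takes the same values as in pickle; recurse=True asks dill to
# trace only the globals that scale refers to, rather than the whole
# global dictionary
payload = dill.dumps(scale, protocol=dill.DEFAULT_PROTOCOL, recurse=True)
restored = dill.loads(payload)
result = restored(2)
print(result)
```

The same keywords are accepted by dump, so the choice can be made per call without touching global state.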
The default serialization can also be set globally in dill.settings.
Thus, we can modify how dill handles references to the global dictionary
locally or globally:
>>> import dill.settings
>>> dumps(absolute) == dumps(absolute, recurse=True)
False
>>> dill.settings['recurse'] = True
>>> dumps(absolute) == dumps(absolute, recurse=True)
True
dill also includes source code inspection, as an alternative to pickling:
>>> import dill.source
>>> print(dill.source.getsource(squared))
squared = lambda x:x**2
To aid in debugging pickling issues, use dill.detect which provides tools like pickle tracing:
>>> import dill.detect
>>> with dill.detect.trace():
...     dumps(squared)
┬ F1: <function <lambda> at 0x7fe074f8c280>
├┬ F2: <function _create_function at 0x7fe074c49c10>
│└ # F2 [34 B]
├┬ Co: <code object <lambda> at 0x7fe07501eb30, file "<stdin>", line 1>
│├┬ F2: <function _create_code at 0x7fe074c49ca0>
││└ # F2 [19 B]
│└ # Co [87 B]
├┬ D1: <dict object at 0x7fe0750d4680>
│└ # D1 [22 B]
├┬ D2: <dict object at 0x7fe074c5a1c0>
│└ # D2 [2 B]
├┬ D2: <dict object at 0x7fe074f903c0>
│├┬ D2: <dict object at 0x7fe074f8ebc0>
││└ # D2 [2 B]
│└ # D2 [23 B]
└ # F1 [180 B]
With trace, we see how dill stored the lambda (F1) by first storing
_create_function, the underlying code object (Co) and _create_code
(which is used to handle code objects), then we handle the reference to
the global dict (D2) plus other dictionaries (D1 and D2) that
save the lambda object’s state. A # marks when the object is actually stored.
More Information
Probably the best way to get started is to look at the documentation at
http://dill.rtfd.io. Also see dill.tests for a set of scripts that
demonstrate how dill can serialize different Python objects. You can
run the test suite with python -m dill.tests. The contents of any
pickle file can be examined with undill. As dill conforms to
the pickle interface, the examples and documentation found at
http://docs.python.org/library/pickle.html also apply to dill
if one will import dill as pickle. The source code is also generally
well documented, so further questions may be resolved by inspecting the
code itself. Please feel free to submit a ticket on github, or ask a
question on stackoverflow (@Mike McKerns).
If you would like to share how you use dill in your work, please send
an email (to mmckerns at uqfoundation dot org).
Citation
If you use dill to do research that leads to publication, we ask that you
acknowledge use of dill by citing the following in your publication:
M.M. McKerns, L. Strand, T. Sullivan, A. Fang, M.A.G. Aivazis,
"Building a framework for predictive science", Proceedings of
the 10th Python in Science Conference, 2011;
http://arxiv.org/pdf/1202.1056
Michael McKerns and Michael Aivazis,
"pathos: a framework for heterogeneous computing", 2010- ;
https://uqfoundation.github.io/project/pathos
Please see https://uqfoundation.github.io/project/pathos or http://arxiv.org/pdf/1202.1056 for further information.
- exception PickleWarning
Bases: Warning, PickleError
- class Pickler(file, *args, **kwds)
Bases: _Pickler
python’s Pickler extended to interpreter sessions
This takes a binary file for writing a pickle data stream.
The optional protocol argument tells the pickler to use the given protocol; supported protocols are 0, 1, 2, 3, 4 and 5. The default protocol is 4. It was introduced in Python 3.4, and is incompatible with previous versions.
Specifying a negative protocol version selects the highest protocol version supported. The higher the protocol used, the more recent the version of Python needed to read the pickle produced.
The file argument must have a write() method that accepts a single bytes argument. It can thus be a file object opened for binary writing, an io.BytesIO instance, or any other custom object that meets this interface.
If fix_imports is True and protocol is less than 3, pickle will try to map the new Python 3 names to the old module names used in Python 2, so that the pickle data stream is readable with Python 2.
If buffer_callback is None (the default), buffer views are serialized into file as part of the pickle stream.
If buffer_callback is not None, then it can be called any number of times with a buffer view. If the callback returns a false value (such as None), the given buffer is out-of-band; otherwise the buffer is serialized in-band, i.e. inside the pickle stream.
It is an error if buffer_callback is not None and protocol is None or smaller than 5.
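As a minimal sketch of the file and protocol arguments described above, the following writes to an in-memory io.BytesIO stream and reads it back; the record dictionary is hypothetical example data:

```python
import io

import dill

record = {"name": "example", "values": [1, 2, 3]}  # hypothetical data

# any object with a write(bytes) method works; io.BytesIO keeps it in memory
buffer = io.BytesIO()

# a negative protocol selects the highest protocol version supported
dill.Pickler(buffer, protocol=-1).dump(record)

# rewind and read the stream back with the matching Unpickler
buffer.seek(0)
restored = dill.Unpickler(buffer).load()
print(restored == record)
```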
- _session = False
- dispatch: Dict[type, Callable[[Pickler, Any], None]]
The dispatch table, a dictionary of serializing functions used by Pickler to save objects of specific types. Use pickle() or register() to associate types to custom functions.
- dump(obj)
Write a pickled representation of obj to the open file.
- save(obj, save_persistent_id=True)
- settings = {'byref': False, 'fmode': 0, 'ignore': False, 'protocol': 4, 'recurse': False}
- exception PicklingError
Bases: PickleError
- exception PicklingWarning
Bases: PickleWarning, PicklingError
- class Unpickler(*args, **kwds)
Bases: Unpickler
python’s Unpickler extended to interpreter sessions and more types
- _session = False
- find_class(module, name)
Return an object from a specified module.
If necessary, the module will be imported. Subclasses may override this method (e.g. to restrict unpickling of arbitrary classes and functions).
This method is called whenever a class or a function object is needed. Both arguments passed are str objects.
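One way to use this hook is sketched below, assuming a deliberately strict policy that refuses every class or function lookup. DataOnlyUnpickler and data_only_loads are hypothetical names, not part of dill, and a real policy would more likely allow a reviewed list of (module, name) pairs:

```python
import io

import dill


class DataOnlyUnpickler(dill.Unpickler):
    """Hypothetical subclass that refuses to resolve any class or function."""

    def find_class(self, module, name):
        # called whenever the stream asks for a class or function object
        raise dill.UnpicklingError(f"global '{module}.{name}' is forbidden")


def data_only_loads(payload):
    # hypothetical helper: unpickle bytes with the restricted unpickler
    return DataOnlyUnpickler(io.BytesIO(payload)).load()


# plain containers and numbers need no lookups, so they load normally
plain = data_only_loads(dill.dumps({"values": [1, 2, 3]}))

# a pickled lambda needs dill's reconstruction helpers, so it is rejected
try:
    data_only_loads(dill.dumps(lambda x: x + 1))
    blocked = False
except dill.UnpicklingError:
    blocked = True

print(plain, blocked)
```

A policy like this narrows what a pickle can trigger, but it does not by itself make unpickling untrusted data safe.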
- load()
Load a pickle.
Read a pickled object representation from the open file object given in the constructor, and return the reconstituted object hierarchy specified therein.
- settings = {'byref': False, 'fmode': 0, 'ignore': False, 'protocol': 4, 'recurse': False}
- exception UnpicklingError
Bases: PickleError
- exception UnpicklingWarning
Bases: PickleWarning, UnpicklingError
- check(obj, *args, **kwds)
Check pickling of an object across another process.
python is the path to the python interpreter (defaults to sys.executable)
Set verbose=True to print the unpickled object in the other process.
- citation()
print citation
- copy(obj, *args, **kwds)
Use pickling to ‘copy’ an object (i.e. loads(dumps(obj))).
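A short sketch of what this means in practice, using a hypothetical nested dictionary: because the copy is rebuilt from a pickle, changing the copy leaves the original untouched.

```python
import dill

original = {"items": [1, 2, 3]}  # hypothetical nested data

# copy round-trips the object through dumps and loads
clone = dill.copy(original)
clone["items"].append(4)

print(original["items"], clone["items"])
```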
- dump(obj, file, protocol=None, byref=None, fmode=None, recurse=None, **kwds)
Pickle an object to a file.
See dumps() for keyword arguments.
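A minimal sketch of writing an object to disk and reading it back. The file location and the cube lambda are hypothetical, and dill.load, the file-based counterpart of loads, is not described in this section:

```python
import os
import tempfile

import dill

cube = lambda x: x ** 3  # hypothetical object to store

# dump expects a file opened for binary writing
path = os.path.join(tempfile.mkdtemp(), "cube.pkl")  # hypothetical location
with open(path, "wb") as handle:
    dill.dump(cube, handle)

# dill.load reads the object back from a file opened for binary reading
with open(path, "rb") as handle:
    restored = dill.load(handle)

print(restored(2))
```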
- dump_module(filename=None, module=None, refimported=False, **kwds)
Pickle the current state of __main__ or another module to a file.
Save the contents of __main__ (e.g. from an interactive interpreter session), an imported module, or a module-type object (e.g. built with ModuleType), to a file. The pickled module can then be restored with the function load_module().
- Parameters:
filename (str | PathLike | None) – a path-like object or a writable stream. If None (the default), write to a named file in a temporary directory.
module (ModuleType | str | None) – a module object or the name of an importable module. If None (the default), __main__ is saved.
refimported (bool) – if True, all objects identified as having been imported into the module’s namespace are saved by reference. Note: this is similar to but independent of dill.settings['byref'], as refimported refers to virtually all imported objects, while byref only affects select objects.
**kwds – extra keyword arguments passed to Pickler().
- Raises:
PicklingError – if pickling fails.
- Return type:
None
Examples
Save current interpreter session state:
>>> import dill
>>> squared = lambda x: x*x
>>> dill.dump_module() # save state of __main__ to /tmp/session.pkl
Save the state of an imported/importable module:
>>> import dill
>>> import pox
>>> pox.plus_one = lambda x: x+1
>>> dill.dump_module('pox_session.pkl', module=pox)
Save the state of a non-importable, module-type object:
>>> import dill
>>> from types import ModuleType
>>> foo = ModuleType('foo')
>>> foo.values = [1,2,3]
>>> import math
>>> foo.sin = math.sin
>>> dill.dump_module('foo_session.pkl', module=foo, refimported=True)
Restore the state of the saved modules:
>>> import dill
>>> dill.load_module()
>>> squared(2)
4
>>> pox = dill.load_module('pox_session.pkl')
>>> pox.plus_one(1)
2
>>> foo = dill.load_module('foo_session.pkl')
>>> [foo.sin(x) for x in foo.values]
[0.8414709848078965, 0.9092974268256817, 0.1411200080598672]
Use refimported to save imported objects by reference:
>>> import dill
>>> from html.entities import html5
>>> type(html5), len(html5)
(dict, 2231)
>>> import io
>>> buf = io.BytesIO()
>>> dill.dump_module(buf) # saves __main__, with html5 saved by value
>>> len(buf.getvalue()) # pickle size in bytes
71665
>>> buf = io.BytesIO()
>>> dill.dump_module(buf, refimported=True) # html5 saved by reference
>>> len(buf.getvalue())
438
Changed in version 0.3.6: Function dump_session() was renamed to dump_module(). Parameters main and byref were renamed to module and refimported, respectively.
Note
Currently, dill.settings['byref'] and dill.settings['recurse'] don’t apply to this function.
- dump_session(filename=None, main=None, byref=False, **kwds)
Pickle the current state of __main__ or another module to a file.
Save the contents of __main__ (e.g. from an interactive interpreter session), an imported module, or a module-type object (e.g. built with ModuleType), to a file. The pickled module can then be restored with the function load_module().
- Parameters:
filename – a path-like object or a writable stream. If None (the default), write to a named file in a temporary directory.
main – a module object or the name of an importable module. If None (the default), __main__ is saved. This parameter is named module in dump_module().
byref – if True, all objects identified as having been imported into the module’s namespace are saved by reference. This parameter is named refimported in dump_module(). Note: it is similar to but independent of dill.settings['byref'], which only affects select objects.
**kwds – extra keyword arguments passed to Pickler().
- Raises:
PicklingError – if pickling fails.
Changed in version 0.3.6: Function dump_session() was renamed to dump_module(). Parameters main and byref were renamed to module and refimported, respectively.
Note
Currently, dill.settings['byref'] and dill.settings['recurse'] don’t apply to this function.
- dumps(obj, protocol=None, byref=None, fmode=None, recurse=None, **kwds)
Pickle an object to a string.
protocol is the pickler protocol, as defined for Python pickle.
If byref=True, then dill behaves a lot more like pickle as certain objects (like modules) are pickled by reference as opposed to attempting to pickle the object itself.
If recurse=True, then objects referred to in the global dictionary are recursively traced and pickled, instead of the default behavior of attempting to store the entire global dictionary. This is needed for functions defined via exec().
fmode (HANDLE_FMODE, CONTENTS_FMODE, or FILE_FMODE) indicates how file handles will be pickled. For example, when pickling a data file handle for transfer to a remote compute service, FILE_FMODE will include the file contents in the pickle and cursor position so that a remote method can operate transparently on an object with an open file handle.
Default values for keyword arguments can be set in dill.settings.
- extend(use_dill=True)
add (or remove) dill types to/from the pickle registry
by default, dill populates its types to pickle.Pickler.dispatch. Thus, all dill types are available upon calling 'import pickle'. To drop all dill types from the pickle dispatch, use_dill=False.
- Parameters:
use_dill (bool, default=True) – if True, extend the dispatch table.
- Returns:
None
- license()
print license
- load_module(filename=None, module=None, **kwds)
Update the selected module (default is __main__) with the state saved at filename.
Restore a module to the state saved with dump_module(). The saved module can be __main__ (e.g. an interpreter session), an imported module, or a module-type object (e.g. created with ModuleType).
When restoring the state of a non-importable module-type object, the current instance of this module may be passed as the argument module. Otherwise, a new instance is created with ModuleType and returned.
- Parameters:
filename (str | PathLike | None) – a path-like object or a readable stream. If None (the default), read from a named file in a temporary directory.
module (ModuleType | str | None) – a module object or the name of an importable module; the module name and kind (i.e. imported or non-imported) must match the name and kind of the module stored at filename.
**kwds – extra keyword arguments passed to Unpickler().
- Raises:
UnpicklingError – if unpickling fails.
ValueError – if the argument module and the module saved at filename are incompatible.
- Returns:
A module object, if the saved module is not __main__ or a module instance wasn’t provided with the argument module.
- Return type:
ModuleType | None
Examples
Save the state of some modules:
>>> import dill
>>> squared = lambda x: x*x
>>> dill.dump_module() # save state of __main__ to /tmp/session.pkl
>>>
>>> import pox # an imported module
>>> pox.plus_one = lambda x: x+1
>>> dill.dump_module('pox_session.pkl', module=pox)
>>>
>>> from types import ModuleType
>>> foo = ModuleType('foo') # a module-type object
>>> foo.values = [1,2,3]
>>> import math
>>> foo.sin = math.sin
>>> dill.dump_module('foo_session.pkl', module=foo, refimported=True)
Restore the state of the interpreter:
>>> import dill
>>> dill.load_module() # updates __main__ from /tmp/session.pkl
>>> squared(2)
4
Load the saved state of an importable module:
>>> import dill
>>> pox = dill.load_module('pox_session.pkl')
>>> pox.plus_one(1)
2
>>> import sys
>>> pox in sys.modules.values()
True
Load the saved state of a non-importable module-type object:
>>> import dill
>>> foo = dill.load_module('foo_session.pkl')
>>> [foo.sin(x) for x in foo.values]
[0.8414709848078965, 0.9092974268256817, 0.1411200080598672]
>>> import math
>>> foo.sin is math.sin # foo.sin was saved by reference
True
>>> import sys
>>> foo in sys.modules.values()
False
Update the state of a non-importable module-type object:
>>> import dill
>>> from types import ModuleType
>>> foo = ModuleType('foo')
>>> foo.values = ['a','b']
>>> foo.sin = lambda x: x*x
>>> dill.load_module('foo_session.pkl', module=foo)
>>> [foo.sin(x) for x in foo.values]
[0.8414709848078965, 0.9092974268256817, 0.1411200080598672]
Changed in version 0.3.6: Function load_session() was renamed to load_module(). Parameter main was renamed to module.
See also
load_module_asdict() to load the contents of a module saved with dump_module() into a dictionary.
- load_module_asdict(filename=None, update=False, **kwds)
Load the contents of a saved module into a dictionary.
load_module_asdict() is the near-equivalent of:
lambda filename: vars(dill.load_module(filename)).copy()
however, it does not alter the original module. Also, the path of the loaded module is stored in the __session__ attribute.
- Parameters:
filename (str | PathLike | None) – a path-like object or a readable stream. If None (the default), read from a named file in a temporary directory.
update (bool) – if True, initialize the dictionary with the current state of the module prior to loading the state stored at filename.
**kwds – extra keyword arguments passed to Unpickler()
- Raises:
UnpicklingError – if unpickling fails
- Returns:
A copy of the restored module’s dictionary.
- Return type:
Note
If update is True, the corresponding module may first be imported into the current namespace before the saved state is loaded from filename to the dictionary. Note that any module that is imported into the current namespace as a side-effect of using update will not be modified by loading the saved module in filename to a dictionary.
Example
>>> import dill
>>> alist = [1, 2, 3]
>>> anum = 42
>>> dill.dump_module()
>>> anum = 0
>>> new_var = 'spam'
>>> main = dill.load_module_asdict()
>>> main['__name__'], main['__session__']
('__main__', '/tmp/session.pkl')
>>> main is globals() # loaded objects don't reference globals
False
>>> main['alist'] == alist
True
>>> main['alist'] is alist # was saved by value
False
>>> main['anum'] == anum # changed after the session was saved
False
>>> 'new_var' in main # would be True if the option 'update' was set
False
- load_session(filename=None, main=None, **kwds)
Update the selected module (default is __main__) with the state saved at filename.
Restore a module to the state saved with dump_module(). The saved module can be __main__ (e.g. an interpreter session), an imported module, or a module-type object (e.g. created with ModuleType).
When restoring the state of a non-importable module-type object, the current instance of this module may be passed as the argument main. Otherwise, a new instance is created with ModuleType and returned.
- Parameters:
filename – a path-like object or a readable stream. If None (the default), read from a named file in a temporary directory.
main – a module object or the name of an importable module; the module name and kind (i.e. imported or non-imported) must match the name and kind of the module stored at filename. This parameter is named module in load_module().
**kwds – extra keyword arguments passed to Unpickler().
- Raises:
UnpicklingError – if unpickling fails.
ValueError – if the argument main and the module saved at filename are incompatible.
- Returns:
A module object, if the saved module is not __main__ or a module instance wasn’t provided with the argument main.
Changed in version 0.3.6: Function load_session() was renamed to load_module(). Parameter main was renamed to module.
See also
load_module_asdict() to load the contents of a module saved with dump_module() into a dictionary.
- load_types(pickleable=True, unpickleable=True)
load pickleable and/or unpickleable types to dill.types
dill.types is meant to mimic the types module, providing a registry of object types. By default, the module is empty (for import speed purposes). Use the load_types function to load selected object types to the dill.types module.
- loads(str, ignore=None, **kwds)
Unpickle an object from a string.
If ignore=False then objects whose class is defined in the module __main__ are updated to reference the existing class in __main__, otherwise they are left to refer to the reconstructed type, which may be different.
Default values for keyword arguments can be set in dill.settings.
- pickles(obj, exact=False, safe=False, **kwds)
Quick check if object pickles with dill.
If exact=True then an equality test is done to check if the reconstructed object matches the original object.
If safe=True then any exception raised in copy signals that the object is not picklable; otherwise only pickling errors will be trapped.
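A brief sketch of using pickles as a pre-flight check, with hypothetical example objects. It relies only on the lists of supported and unsupported types given earlier in this document (lambdas are supported, generators are not):

```python
import dill

double = lambda x: 2 * x            # lambdas are among the supported types
numbers = (n for n in range(3))     # generators are listed as not yet picklable

lambda_ok = dill.pickles(double)
generator_ok = dill.pickles(numbers)

# exact=True additionally requires the round-tripped object to compare equal
list_ok = dill.pickles([1, 2, 3], exact=True)

print(lambda_ok, generator_ok, list_ok)
```

Note that exact=True is a stricter test: objects that do not define equality by value, such as functions, may survive a round trip without comparing equal to the original.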