cucumber is a serialization engine.
- No PicklingError ever again
- Handles classes defined in __main__, so multiprocessing just works, even in one-file scripts or tests
- debug=True and verbose=True show exactly what's happening
- Competitive with cloudpickle on basic types while covering vastly more types
- Multiple times faster than dill
It allows you to serialize and deserialize objects across Python processes.
It is built for the Python environment, and isn't directly meant for use in external or cross-language serialization.
However, it can do something that no other Python serializer can do: get rid of all of your PicklingErrors.
If you need raw speed for simple types, use base pickle. It is literally what Python originally gave us, and of course it's the fastest.
But if you need to serialize anything else, use cucumber.
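For plain data, base pickle really is all you need; a quick stdlib-only round trip:

```python
import pickle

# Plain data types are exactly what base pickle was built for.
payload = {"id": 7, "tags": ["a", "b"], "score": 0.95}
data = pickle.dumps(payload)
print(pickle.loads(data) == payload)  # True
```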
pickle vs cucumber — same object, different outcomes

import threading

class Worker:
    def __init__(self):
        self.lock = threading.Lock()
        self.thread = threading.Thread(target=self.run)
        self.results = []

    def run(self):
        self.results.append("done")

worker = Worker()
worker = Worker()
With pickle:
import pickle
pickle.dumps(worker)
# TypeError: cannot pickle '_thread.lock' objects
With cloudpickle:
import cloudpickle
cloudpickle.dumps(worker)
# TypeError: cannot pickle '_thread.lock' objects
With cucumber:
from suitkaise import cucumber
data = cucumber.serialize(worker)
restored = cucumber.deserialize(data)
# works. lock and thread become Reconnectors, ready to be recreated.
cucumber.reconnect_all(restored)
# lock and thread are live again.
No errors. No workarounds. No tiptoeing around types that cause PicklingErrors.
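You can reproduce the baseline failure yourself with nothing but the standard library; this is a stripped-down version of the Worker above:

```python
import pickle
import threading

class Worker:
    def __init__(self):
        self.lock = threading.Lock()   # the attribute pickle chokes on
        self.results = []

try:
    pickle.dumps(Worker())
    error = None
except TypeError as exc:
    error = exc

print(error)  # cannot pickle '_thread.lock' object
```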
cucumber handles every type that dill and cloudpickle can handle.
It also handles many more types that are frequently used in higher-level programming and parallel processing.
And it can handle user-created classes containing any of these objects!
cucumber can handle:
threading.local
multiprocessing.Queue
multiprocessing.Event
multiprocessing.Manager
queue.SimpleQueue
mmap
re.Match
sqlite3.Connection
sqlite3.Cursor
socket.socket
GeneratorType
CoroutineType
AsyncGeneratorType
asyncio.Task
asyncio.Future
ThreadPoolExecutor
ProcessPoolExecutor
ContextVar
Token
subprocess.Popen
FrameType

cucumber has a way to dissect your class instances, allowing you to serialize essentially anything.
cucumber can handle classes defined in __main__.
cucumber handles all circular references in your objects.
cucumber is faster than cloudpickle and dill for most simple types.
Additionally, it is multiple times faster than both of them for many types.
NamedTemporaryFile — 33x faster
TextIOWrapper — 21x faster
threading.Thread — 5x faster
dataclass — 2.5x faster
int — 2x faster

For a full performance breakdown, head to the performance page.
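Benchmark numbers always depend on payload shape, so it's worth measuring on your own data; a minimal stdlib harness for timing any serializer:

```python
import pickle
import timeit

# Time a serializer on a payload that resembles your real workload.
# Swap pickle.dumps for any other serializer's dump function to compare.
payload = {"xs": list(range(100)), "name": "job-1"}
seconds = timeit.timeit(lambda: pickle.dumps(payload), number=10_000)
print(f"10,000 pickle.dumps calls: {seconds:.4f}s")
```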
cucumber intelligently reconstructs complex objects using custom handlers.
All you have to do after deserializing is call reconnect_all() and provide any authentication needed, and all of your live resources will be recreated automatically.
You can even have threads started again automatically.
Reconnector pattern — nothing else does this

When cucumber encounters a live resource (a database connection, an open socket, a running thread), it doesn't try to freeze and resume it -- that would be unsafe and often impossible. Instead, it creates a Reconnector object that stores the information needed to recreate the resource.
import psycopg2
from suitkaise import cucumber
# serialize a live database connection
conn = psycopg2.connect(host='localhost', database='mydb', password='secret')
data = cucumber.serialize(conn)
# deserialize it in another process
restored = cucumber.deserialize(data)
# restored.connection is a Reconnector, not a live connection yet
# reconnect with credentials (password is never stored in serialized data)
cucumber.reconnect_all(restored, password='secret')
# now restored.connection is a live psycopg2 connection again
This is a security-conscious design: authentication credentials are never stored in the serialized bytes. You provide them at reconnection time, so serialized data can be stored or transferred without leaking secrets.
No other Python serializer has this concept. Most either crash on live resources or silently produce broken objects.
Additionally, objects that don't need auth will be lazily reconstructed on first attribute access.
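The core idea can be sketched in a few lines of plain Python. This is an illustration of the concept only, not cucumber's actual internals, and sqlite3 stands in for psycopg2 so the example is self-contained:

```python
import sqlite3

# Conceptual Reconnector sketch: store how to recreate a live resource
# instead of trying to serialize the resource itself.
class Reconnector:
    def __init__(self, factory, **stored_kwargs):
        self.factory = factory              # callable that rebuilds the resource
        self.stored_kwargs = stored_kwargs  # non-secret connection info only

    def reconnect(self, **auth_kwargs):
        # secrets arrive here, at reconnection time -- never serialized
        return self.factory(**self.stored_kwargs, **auth_kwargs)

# at "serialize" time: replace the live connection with its recipe
rec = Reconnector(sqlite3.connect, database=":memory:")

# at "reconnect" time: the resource comes back to life
conn = rec.reconnect()
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")
print(conn.execute("SELECT x FROM t").fetchone())  # (1,)
```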
cucumber creates an intermediate representation (IR) of the object using pickle-native types, then uses base pickle to serialize that IR to bytes.
{
"__cucumber_type__": "<type_name>",
"__handler__": "<handler_name>",
"__object_id__": <id>,
"state": {
# object's state in IR form
}
}
This allows everything to be cleanly organized and inspected.
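A toy version of the IR idea, using the field names shown above (the handler name and the state-extraction rule here are made up for illustration; cucumber's real handlers are far more capable):

```python
import pickle
import threading

# Describe an object as plain, pickle-native types (dicts, strings, ints),
# then pickle that description. The IR itself is always picklable.
def to_ir(obj):
    return {
        "__cucumber_type__": type(obj).__name__,
        "__handler__": "generic_object",   # hypothetical handler name
        "__object_id__": id(obj),
        "state": {k: v for k, v in vars(obj).items()
                  if isinstance(v, (int, float, str, list, dict, type(None)))},
    }

class Worker:
    def __init__(self):
        self.lock = threading.Lock()   # unpicklable: excluded from the IR here
        self.results = ["done"]

ir = to_ir(Worker())
data = pickle.dumps(ir)
print(pickle.loads(data)["state"])  # {'results': ['done']}
```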
Additionally, cucumber's functions provide traceable, simple explanations of what went wrong if something fails.
# all you have to do is add debug=True
cucumber.serialize(obj, debug=True)
It also has an option to see how the object is getting serialized or reconstructed in real time with color-coded output.
# all you have to do is add verbose=True
cucumber.serialize(obj, verbose=True)
Can cucumber handle any user class?

cucumber can serialize any object as long as it contains supported types.
99% of Python objects only have supported types within them.
To prove that cucumber can handle any user class, I created a monster.
WorstPossibleObject

WorstPossibleObject is an object I created that would never exist in real life.
Its only goal: try to break cucumber.
It contains every type that cucumber can handle, in a super nested, circular-referenced, randomly generated structure.
Each WorstPossibleObject is different from the last, and they all have ways to verify that they remain intact after being converted to and from bytes.
Not only does cucumber handle this object, it can handle more than 100 different WorstPossibleObjects per second.
By handle, I mean: cucumber can serialize and deserialize the object, then verify that it is the same object as when it was created, and that all of the complex objects within it still work as expected.
This test includes a full round trip.
`serialize()` → another process → `deserialize()` → `reconnect_all()` → verify → `serialize()` → back to original process → `deserialize()` → `reconnect_all()` → verify
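The verify step is the same pattern you'd use with any serializer: round-trip the object, then compare observable state instead of trusting the process blindly. A minimal sketch with base pickle:

```python
import pickle

# Round-trip verification: serialize, deserialize, compare state.
class Job:
    def __init__(self, items, name):
        self.items = items
        self.name = name

    def __eq__(self, other):
        return (isinstance(other, Job)
                and self.items == other.items
                and self.name == other.name)

original = Job([1, 2, 3], "demo")
restored = pickle.loads(pickle.dumps(original))
print(restored == original)  # True
```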
To see the full WorstPossibleObject code, head to the worst possible object page. Have fun!
Where cucumber sits in the landscape

cucumber's real competitor is dill, not cloudpickle. Both cucumber and dill prioritize type coverage over raw speed. The difference: cucumber far outclasses dill on speed while exceeding its type coverage.
The fact that cucumber also competes with cloudpickle on speed -- despite covering vastly more types -- is the surprising part. cloudpickle is designed for speed with limited types. cucumber is designed for coverage and still keeps up.
If all you need is simple types, stick with pickle. cloudpickle is solid. But no PicklingError ever, and still competitive speed? That's cucumber.
For a full performance breakdown, head to the performance page.
cucumber is the serialization backbone of the suitkaise ecosystem.
- processing uses cucumber by default for all cross-process communication. Every Skprocess, every Pool.map, every Share operation goes through cucumber. You never think about serialization.
- @autoreconnect from processing builds on the Reconnector pattern to automatically reconnect live resources (like database connections) when they cross process boundaries.
- Share relies on cucumber to serialize any object you assign to it. This is what makes share.anything = any_object possible.
- suitkaise objects (Sktimer, Circuit, Skpath, etc.) are designed to serialize cleanly through cucumber.