This is an archived version of the course. Please see the latest version of the course.

Pickle

If you need to save complex Python data structures, and only expect to load them back into Python in the future, then you can consider using Python’s pickle module.

The pickle module is used for storing Python object structures in a file (and retrieving them back into memory at a later time).

For example, you may use it to save the Machine Learning model that you have spent the whole week training.

You pickle your Python objects onto the disk as a binary file (serialising), and you unpickle them from the disk into memory (deserialising).

You can pickle integers, floats, booleans, strings, tuples, lists, sets and dictionaries (as long as the objects they contain can also be pickled), as well as classes defined at the top level of a module. No pickled gherkins, sorry! 🥒
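
A quick way to see which objects qualify is simply to try pickling them. The sketch below is only an illustration: is_picklable is a hypothetical helper (it is not part of the pickle module), and the sample objects are made up for this example.

```python
import pickle


def is_picklable(obj):
    """Hypothetical helper: return True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Made-up sample objects, chosen only for illustration
samples = {
    "int": 42,
    "nested dict": {"scores": [1.5, 2.5], "passed": True},
    "set of tuples": {(1, 2), (3, 4)},
    "lambda": lambda x: x + 1,              # has no importable name
    "dict with lambda": {"f": lambda x: x},  # a container is only as picklable as its contents
    "generator": (n for n in range(3)),
}

results = {name: is_picklable(obj) for name, obj in samples.items()}

for name, ok in results.items():
    print(f"{name}: {ok}")
```

The first three samples should report True. The lambda fails because pickle stores functions by name rather than by value, and a container fails as soon as one of its items does.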

Health warnings!

  • pickle is specific to Python. It is not recommended if you expect to share your data across different programming languages
  • Take care when moving pickled files between Python versions. A file written with a newer pickle protocol cannot be read by an older Python that does not support that protocol, and any custom classes must still be importable when you unpickle
  • Do not unpickle data from untrusted sources. Unpickling can execute arbitrary code stored inside the file, so a malicious file can run code on your machine
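
To make the last warning concrete, here is a deliberately harmless sketch of how a pickle can run code when it is loaded. The class name Innocent and the expression "6 * 7" are made up for this illustration. A real attacker would name a far more dangerous function than eval with a harmless expression.

```python
import pickle


class Innocent:
    """Made-up class whose pickle asks the loader to call a function."""

    def __reduce__(self):
        # Tell pickle: "to rebuild me, call eval('6 * 7')".
        # The call is made by pickle.loads(), on the loading machine.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Innocent())

# Loading does not give back an Innocent object at all.
# It gives back whatever the function call returned.
result = pickle.loads(payload)

print(result)        ## 42
print(type(result))  ## <class 'int'>
```

Note that the person loading the file never has to call anything on the result. The code runs as part of pickle.loads() itself.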

Pickling time!

You pickle with pickle.dump(obj, file), and unpickle with pickle.load(file).

import pickle

courses = {
    70053: {"lecturer": "Josiah Wang", "title": "Python Programming"},
    70051: {"lecturer": "Robert Craven", "title": "Symbolic AI"},
}

# Save courses to disk. Note binary mode!
with open("courses.pkl", "wb") as f:
    pickle.dump(courses, f)

# Load courses from disk. Again, it is a binary file!
with open("courses.pkl", "rb") as f: 
    pickled_courses = pickle.load(f)

print(pickled_courses)
## {70053: {'lecturer': 'Josiah Wang', 'title': 'Python Programming'}, 
## 70051: {'lecturer': 'Robert Craven', 'title': 'Symbolic AI'}} 

print(type(pickled_courses)) ## <class 'dict'> 

print(courses == pickled_courses)  ## True
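
If you do not need a file at all, the pickle module also provides pickle.dumps(obj) and pickle.loads(data), which work with a bytes object in memory instead. The sketch below reuses a shortened version of the courses dictionary. The choice of protocol=2 is just an assumption for illustration, standing in for "an older protocol that an older Python could read".

```python
import pickle

courses = {70053: {"lecturer": "Josiah Wang", "title": "Python Programming"}}

# Serialise to a bytes object rather than to a file
data = pickle.dumps(courses)
print(type(data))  ## <class 'bytes'>

# Deserialise from the bytes object
restored = pickle.loads(data)
print(restored == courses)  ## True
print(restored is courses)  ## False (unpickling builds a new copy)

# The optional protocol argument selects the pickle format version.
# Older protocols are readable by older Python versions.
old_format = pickle.dumps(courses, protocol=2)
print(pickle.loads(old_format) == courses)  ## True

# The default and the newest protocol supported by this interpreter
print(pickle.DEFAULT_PROTOCOL <= pickle.HIGHEST_PROTOCOL)  ## True
```

pickle.dump(obj, file) accepts the same protocol argument, so the same choice is available when writing to disk.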

Here is another example, this time pickling a list of objects of a custom class.

import pickle

class Vector: 
    def __init__(self, x, y): 
        self.x = x 
        self.y = y

    def __str__(self): 
        return f"Vector ({self.x}, {self.y})"

    def __repr__(self):
        """ Reuse the readable string from __str__, so that
            instances inside a list also print nicely
        """
        return str(self)

v1 = Vector(2, 3) 
v2 = Vector(4, 3) 
v = [v1, v2] 

# Save v to disk.
with open('vectors.pkl', 'wb') as f: 
    pickle.dump(v, f)

# Load pickled file from disk
with open('vectors.pkl', 'rb') as f: 
    pickled_vectors = pickle.load(f)

print(pickled_vectors)  ## [Vector (2, 3), Vector (4, 3)]

print(type(pickled_vectors))  ## <class 'list'>