
Encapsulation

I have only talked about public attributes so far.

What about private attributes?

In a C++ class, attributes are private by default. You are encouraged to hide all attributes, and to expose them only through getter and setter methods, e.g. get_age() or set_age().

Python has a different philosophy when it comes to encapsulation. Python gives you a choice of either public or non-public attributes. In Python, technically you cannot make any attribute truly private (there are hacky ways to access it if a programmer is desperate), hence the term ‘non-public’ is used.

Quoting Python’s official PEP 8 style guide:

Public attributes are those that you expect unrelated clients of your class to use, with your commitment to avoid backwards incompatible changes. Non-public attributes are those that are not intended to be used by third parties; you make no guarantees that non-public attributes won’t change or even be removed.

So more than anything, it boils down to whether you are willing to commit to never changing an attribute in a backwards-incompatible way, and whether it really needs to be part of the public interface. If so, then definitely make it public. Otherwise, you can choose to make it public or non-public.

There are several schools of thought here, even among Python developers. The official PEP 8 guide says “If in doubt, choose non-public; it’s easier to make it public later than to make a public attribute non-public.” But a common view among Python developers nowadays is to make everything public by default unless it is clearly non-public. If you ever need to make a public attribute non-public in the future, you can do so easily by exposing it as a property.

Let’s illustrate how to do this with an example. Here is some example code from before.

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

person = Person("Josiah Wang", 20)  
print(person.age)
person.age = 70
print(person.age)

Let’s say we have now decided that we do not want a person’s age to be exposed directly. Instead, a person will always publicly declare themselves to be two years younger than their real age! We might also allow a person’s age to be modified, but only if the new age is less than 30.

If we were to think like a C++ programmer, then we might end up with Python code that looks like this.

class Person:
    def __init__(self, name, age):
        self.name = name
        self.__age = age

    def get_age(self):
        return self.__age - 2

    def set_age(self, new_age):
        if new_age < 30:
            self.__age = new_age
        else:
            print("Never! I am always under 30!")


person = Person("Josiah Wang", 20)  
print(person.get_age())
person.set_age(10)
print(person.get_age())
person.set_age(70)
print(person.get_age())
print(person.__age)  # this won't work! (raises AttributeError)

In Python, you can make an attribute (or method) non-public by prepending two underscores, e.g. self.__age (Line 4). This activates name mangling: Python internally renames the attribute to _Person__age, so it cannot be reached as person.__age from outside the class definition (which is why Line 22 fails with an AttributeError). You can actually still access it if you are desperate enough (with person._Person__age)! Don’t do this though. There is probably a good reason why the original developer does not want you to use this variable in the first place!
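To see name mangling in action, here is a minimal sketch using a stripped-down version of the Person class above. The variable name reachable is made up for this illustration; the rest follows the example.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.__age = age  # stored on the instance as _Person__age


person = Person("Josiah Wang", 20)

# Outside the class, the name is not mangled, so this lookup fails.
try:
    person.__age
    reachable = True
except AttributeError:
    reachable = False

print(reachable)                        # False
print("_Person__age" in vars(person))   # True: the mangled name is what is stored
print(person._Person__age)              # 20 (possible, but please don't do this)
```

So the attribute is hidden rather than protected: the mangled name is an ordinary attribute that anybody can still read or overwrite.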

Ignoring Line 22, the code works. The real problems (from Python’s point of view) are:

  1. person.get_age() and person.set_age() make the code in Lines 17-21 less readable than simply using person.age as in our original version.
  2. Everybody who has previously used the original Person class will have to change all occurrences of person.age to person.get_age() or person.set_age().
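The second problem can be reproduced with a short sketch. It assumes some existing client code, written against the original class, that reads person.age directly. The function describe is hypothetical and exists only for this illustration.

```python
class Person:
    """The refactored class, reduced to the getter for brevity."""

    def __init__(self, name, age):
        self.name = name
        self.__age = age

    def get_age(self):
        return self.__age - 2


def describe(person):
    # Hypothetical client code written against the ORIGINAL Person class,
    # where age was a plain public attribute.
    return f"{person.name} is {person.age}"


person = Person("Josiah Wang", 20)

try:
    message = describe(person)
except AttributeError as error:
    # The refactored class no longer has a public 'age' attribute.
    message = f"Old client code broke: {error}"

print(message)
```

Every caller like describe would have to be found and rewritten to use person.get_age() before it works again.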

Is there a better solution to this? On the next page, I will show the Pythonic way to approach this.