Python’s type system is fascinating. Everything in Python (anything that can be given a name or passed as an argument) is an object. This includes primitives (such as `int`, `str`, and `bool` objects), compound objects (instances of handwritten classes), and, most interestingly, the classes or types themselves. Because everything in Python derives from a common base, any two objects can be combined or extended in similar ways, which makes the language very powerful. However, the implementation of this idea has consequences: it sometimes gives rise to confusing behaviours that are hard to reason about if you only understand them intuitively.
In this post, we will go over what this implies, understand the type-system bottom-up, and then use that understanding to reason about some confusing behaviours. By the end of this post, you will have a clear picture of what classes and objects really are and what lies behind this abstraction.
Note: I drafted an updated version of some of the explanations for a more recent talk. It drops the 2nd section on method lookup, but I believe it has a better introduction to the chicken-egg problem and to creating classes as objects. The JupyterLab slides can be found here. You can scroll to move between slides.
The Chicken-Egg Problem
`object` and `type`
When people say that everything is an object in Python, they mean that very literally. There is an `object` class that all Python objects are instances of. This is also the implicit base class when you create a new class.
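A quick way to see the implicit base class is to compare a class that lists no base with one that spells out `object`. The class names `A` and `B` here are only illustrative:

```python
class A:            # no base class listed
    pass


class B(object):    # base class spelled out
    pass


print(A.__bases__ == (object,))      # True
print(A.__bases__ == B.__bases__)    # True
print(isinstance(A(), object))       # True
print(isinstance(42, object))        # True
```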
Classes themselves are objects too. This is slightly counter-intuitive when you think of an object as an instance of some class. But a class is very much an object in that you can pass it around as a function argument, bind it to a different name, set attributes on it, and call methods on it.
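As a small sketch of what treating a class as an object looks like in practice (the names `Greeter`, `make_instance`, and `Alias` are made up for this example):

```python
class Greeter:
    def greet(self):
        return "hello"


def make_instance(cls):
    # A class is received like any other argument.
    return cls()


Alias = Greeter              # bind the same class object to another name
Greeter.language = "en"      # set an attribute on the class itself

print(Alias is Greeter)                # True
print(make_instance(Alias).greet())    # hello
print(Alias.language)                  # en
print(Greeter.mro())                   # call a method on the class object
```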
All classes are instances of the `type` class. In that sense, `type` is a metaclass and other metaclasses need to inherit from `type`. [1]
Relationship between `object` and `type`
There are three things about this that are confusing when considered together:
1. `object` is a class
2. All classes are objects of the type `type`
3. `type` itself is a class
We can verify this in the Python interpreter.
If `object` is a class, it should be an instance of the `type` class.
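The check for the first point can be written as:

```python
print(isinstance(object, type))   # True
print(type(object) is type)       # True
```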
The interpreter agrees.
To verify #2, we will create a custom class and check whether it’s an instance of `object`. Then we’ll also verify whether it is an instance of `type`.
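A minimal version of this check, with a hypothetical empty class `A`:

```python
class A:
    pass


print(isinstance(A, object))   # True: the class itself is an object
print(isinstance(A, type))     # True: the class is an instance of type
print(type(A) is type)         # True
```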
Checks out.
Now let’s see if `type` is a class. For this, we will check whether `type` is an instance of `type` and whether `type` is an `object`.
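In the interpreter, these checks might look like this:

```python
print(isinstance(type, type))     # True: type is an instance of itself
print(isinstance(type, object))   # True: type is also an object
print(type(type) is type)         # True
```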
See, I wasn’t lying.
What’s happening here? `type` is a type, `type` is also an object, and `object` is itself a type.
We can understand this relationship better with `issubclass`:
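A short sketch of the subclass checks, with the `__bases__` tuples printed alongside for reference:

```python
print(issubclass(type, object))   # True
print(issubclass(object, type))   # False

print(type.__bases__ == (object,))   # True: type inherits from object
print(object.__bases__ == ())        # True: object has no base classes
```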
While `type` is a subclass of `object`, `object` isn’t a subclass of `type`.
Instead, only the type of `object` is `type`. That is, `object` is an instance of `type`.
This diagram captures this relationship accurately:
Each object has a type, some other metadata, and its attributes associated to it. A type itself has everything that an object does (including a type), along with some extra fields such as the base classes that it borrows from.
This relationship can be better understood when we take a look at how it is implemented in CPython.
`PyObject` and `PyTypeObject`
In the CPython implementation, there are two basic structs behind `PyObject` and `PyTypeObject`: `_object` and `_typeobject`. In simplified form, the `_object` struct has a pointer to its type (`*ob_type`), and `_typeobject` has a separate `bases` tuple that stores the bases for a type object.
The `ob_type` of `object` is manually set to `type`, and this `object` is then added to the `bases` tuple in `type`.
To make it clear, let’s try to write the C code that will represent such a relationship:
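The same wiring can also be modelled in plain Python, which is easier to poke at than C structs. The sketch below is only a model: `Obj`, `TypeObj`, `object_`, and `type_` are hypothetical names, and the fields follow the simplified structs described above rather than CPython's actual layout.

```python
class Obj:
    """Stand-in for `struct _object`: every object points at its type."""

    def __init__(self, name):
        self.name = name
        self.ob_type = None      # filled in once the type object exists


class TypeObj(Obj):
    """Stand-in for `struct _typeobject`: an object plus a tuple of bases."""

    def __init__(self, name, bases=()):
        super().__init__(name)
        self.bases = bases


# Step 1: create both type objects with their type pointers still unset.
object_ = TypeObj("object")
type_ = TypeObj("type")

# Step 2: wire them together by hand.
object_.ob_type = type_      # object is an instance of type
type_.ob_type = type_        # type is an instance of itself
type_.bases = (object_,)     # type inherits from object

print(object_.ob_type.name)              # type
print(type_.ob_type.name)                # type
print([b.name for b in type_.bases])     # ['object']
```

Neither object can be fully described before the other exists, which is why the pointers are assigned after both have been created.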
wtfpython has a good explanation on this here and CPython has comments explaining this too: https://github.com/python/cpython/blob/a286caa937405f7415dcc095a7ad5097c4433246/Include/object.h#L24-L29
Method Lookup
Lookup order, and setting and deleting attributes
This section sets up the context for the next one about method lookup.
The lookup order in Python goes like Object → Class → Base classes. However, the setting and deleting of attributes only happens directly on the object itself.
Take this example where we initialise an object whose class has an attribute `x`, and then try to delete it:
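A minimal reconstruction of this scenario, assuming a class `A` with a class attribute `x` and an instance `a`:

```python
class A:
    x = 1


a = A()
print(a.x)               # 1, found on the class
print("x" in vars(a))    # False: the instance itself holds no 'x'

try:
    del a.x              # deletion only looks at the instance
except AttributeError as exc:
    print("cannot delete:", exc)
```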
The attributes of a class can be looked up by its objects, but cannot be deleted from there.
However, deleting directly from the class will work fine:
>>> del A.x # is fine
>>> print(a.x)
AttributeError: 'A' object has no attribute 'x'
The same happens when setting attributes too. We can see it in this example where we set `x` on both the object and the class, and then delete the `x` on the object:
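One way to write this out, again with a hypothetical class `A` and instance `a`:

```python
class A:
    x = 1            # attribute on the class


a = A()
a.x = 2              # stored on the instance, shadows A.x
print(a.x)           # 2
print(A.x)           # 1, the class attribute is untouched

del a.x              # removes only the instance attribute
print(a.x)           # 1, lookup falls back to the class
```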
Even though we deleted `a.x`, `A.x` was still there and that’s the value we got when we printed `a.x` the second time.
We can observe that the lookup is dynamic: any modification to the attributes on the class is reflected the next time we perform a lookup from the object.
`self` injection
When we define methods inside a class body, we take the bound object as the first argument `self`.
This is strange because `self` is supplied automatically when the method is called from an initialised object, but not when the method is called from the class itself.
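A short sketch of the difference, using a hypothetical class `A` with a `hello` method:

```python
class A:
    def hello(self):
        return "hello"


a = A()
print(a.hello())         # hello: `a` is passed as self automatically

try:
    A.hello()            # called from the class: nothing is passed as self
except TypeError as exc:
    print("TypeError:", exc)

print(A.hello(a))        # hello: self passed explicitly
```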
Additionally, this `self` injection does not happen at the time that the function is called, because Python also allows you to assign this method to a different variable and it still works:
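For instance, with a hypothetical class `A` that stores a `name`:

```python
class A:
    def __init__(self, name):
        self.name = name

    def hello(self):
        return f"hello {self.name}"


a = A("world")
greet = a.hello      # attribute access happens here; nothing is called yet
print(greet())       # hello world: the object is still supplied as self
```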
So what’s the magic here?
If we inspect the types of `A.hello` and `a.hello`, we’ll see the difference.
>>> type(A.hello)
<class 'function'>
>>> type(a.hello)
<class 'method'>
They’re not the same: `a.hello` is a `method` and not a `function`.
The `hello` attribute returns different values depending on whether it is accessed from the class or from an initialised object. Python allows such behaviour with descriptors.
Descriptors
Python descriptors are classes that define `__get__`, `__set__`, or `__delete__` methods. When an instance of such a class is accessed as an attribute, these methods are called instead of the actual value being returned. A popular example of descriptors in use is the `property` decorator, which can be used for dynamic attributes.
Python functions are actually descriptors with a `__get__` method that returns a `method` with the object and function bound to it. I’ll take a couple of examples to explain descriptors.
A simple descriptor that always returns 10 would look like this [2]:
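A sketch along the lines of the `Ten` example in the Descriptor HowTo Guide:

```python
class Ten:
    def __get__(self, obj, objtype=None):
        # Called on attribute access instead of returning the Ten instance.
        return 10


class A:
    x = 5        # regular class attribute
    y = Ten()    # descriptor instance


a = A()
print(a.x)       # 5
print(a.y)       # 10
```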
A descriptor has access to the object it is called with from the `obj` argument. We could write this descriptor to return the square of `obj.x`:
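One possible version is sketched below. The names `Square` and `squared` are illustrative, and returning the descriptor itself on class-level access is a common convention rather than a requirement:

```python
class Square:
    def __get__(self, obj, objtype=None):
        if obj is None:          # accessed on the class, not an instance
            return self
        return obj.x ** 2        # computed from the instance on each access


class A:
    squared = Square()

    def __init__(self, x):
        self.x = x


a = A(3)
print(a.squared)     # 9
a.x = 4
print(a.squared)     # 16
```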
You can also directly try using the `__get__` method on a function. Calling it with the object as its argument will return a method that binds this function and the object.
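As a sketch, with a module-level function `hello` that was never defined inside a class (the names are illustrative):

```python
class A:
    pass


def hello(self):
    return f"hello from {type(self).__name__}"


a = A()
bound = hello.__get__(a, A)          # what attribute lookup does for methods

print(type(bound).__name__)          # method
print(bound())                       # hello from A
print(bound.__self__ is a)           # True: the bound object
print(bound.__func__ is hello)       # True: the underlying function
print(hello.__get__(None, A) is hello)   # True: class access gives the function
```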
Crafting methods by hand
This is a side section, but rather interesting because we can directly observe how methods are actually represented differently from functions.
The `method` class isn’t available as a builtin, but you can get it by calling `type()` on a method. This class takes a function and an object during initialisation, in that order. We can use that to create methods by hand that were never bound to the class or the object.
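A minimal sketch of this, where `shout` is a hypothetical function that is never assigned to the class or the instance:

```python
import types


class A:
    def hello(self):
        return "hello"


a = A()
method = type(a.hello)               # the `method` class
print(method is types.MethodType)    # True: the stdlib exposes it under this name


def shout(self):
    return f"HELLO from {type(self).__name__}"


crafted = method(shout, a)           # function first, then the object
print(crafted())                     # HELLO from A
print(hasattr(a, "shout"))           # False: it was never an attribute
```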
We just crafted a method that was never an attribute.
Classes as Objects
Creating classes with type
Everything is an object, and classes are not exempt from it. We saw earlier that classes are instances of the `type` class.
We can also create classes with the `type` class initialiser, similar to other objects. For example:
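A small sketch with made-up names: a class `A` with one attribute and one method, and a subclass `B` that overrides the attribute.

```python
def hello(self):
    return f"hello from {self.name}"


# type(name, bases, dict)
A = type("A", (), {"name": "a", "hello": hello})
B = type("B", (A,), {"name": "b"})

print(A.__name__)          # A
print(type(A) is type)     # True
print(A().hello())         # hello from a
print(B().hello())         # hello from b
print(issubclass(B, A))    # True
```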
The type signature of the 3-argument `type` initialisation is `type(name, bases, dict) → type`.
The `name` argument will be the value taken by `cls.__name__`, `bases` can be used to achieve inheritance by providing a tuple of classes, and `dict` is a dict of key-value pairs that are the attributes on this class.
When a metaclass needs to be used, that class can be used instead of `type`.
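For example, assuming a hypothetical metaclass `Meta` that adds a `describe` method to the classes it creates:

```python
class Meta(type):
    def describe(cls):
        return f"class {cls.__name__}"


# Call the metaclass with the same three arguments as type.
C = Meta("C", (), {"x": 1})

print(type(C) is Meta)         # True
print(isinstance(C, type))     # True: Meta is a subclass of type
print(C.describe())            # class C
print(C().x)                   # 1
```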
`class` keyword as syntactic sugar
Syntactic sugar is additional syntax that makes it easier to express something but doesn’t add a new feature to the language.
The `class` keyword can be seen as calling `type` with the variables defined in its scope as the value of the `dict` argument. In that sense, `class` is syntactic sugar for creating instances of `type`. To show this, we’ll write a decorator that mimics the behaviour of the `class` keyword as closely as possible.
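At the end, we should be able to define a class by decorating a function whose body holds the class attributes and returns `locals()`. We should also be able to perform inheritance, by listing the base classes as the function’s parameters, and to use metaclasses, by giving the function a `metaclass` keyword argument with a default value.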
This syntax is very similar to the one used to declare classes with the `class` keyword, except that it is a `def`, has a `create_class` decorator, and returns `locals()`.
The decorator gets the function object, and can access the parameters and default arguments. From there, we can resolve the variables from `globals()` and initialise the class.
Feel free to give it a try now before reading on with the solution.
Our decorator gets access to the function, and we can inspect it to find its parameters (the base classes) and its keyword arguments (the metaclass). We only get the parameter names as strings, so we need to resolve them using `globals()`. Then we can call the metaclass (by default, `type`) with the three arguments. We assume that the function returns its `locals()` dict and use that to build the third argument.
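One way such a decorator could be written is sketched below. It is not necessarily the original implementation: using `inspect.signature`, reading names from `func.__globals__` with a fallback to `builtins` (so that a base such as `type` resolves), and dropping the parameters from the returned `locals()` are all choices made for this sketch. The example classes `A`, `B`, `Meta`, and `C` are illustrative too.

```python
import builtins
import inspect


def create_class(func):
    """Build a class from a function that returns its locals()."""
    params = inspect.signature(func).parameters
    # Positional parameter names are read as the names of the base classes.
    base_names = [name for name, p in params.items()
                  if p.kind is p.POSITIONAL_OR_KEYWORD]
    # Resolve the names in the function's globals, falling back to builtins.
    scope = {**vars(builtins), **func.__globals__}
    bases = tuple(scope[name] for name in base_names)
    # A keyword-only `metaclass=` default selects the metaclass.
    metaclass = (func.__kwdefaults__ or {}).get("metaclass", type)
    # Run the body; drop the parameters, which also show up in locals().
    namespace = {k: v for k, v in func(*bases).items() if k not in params}
    return metaclass(func.__name__, bases, namespace)


@create_class
def A():
    x = 1

    def hello(self):
        return f"hello, x is {self.x}"

    return locals()


@create_class
def B(A):                      # inherits from A
    x = 2
    return locals()


@create_class
def Meta(type):                # a metaclass is a subclass of type
    def describe(cls):
        return f"class {cls.__name__}"

    return locals()


@create_class
def C(*, metaclass=Meta):      # uses Meta as its metaclass
    return locals()


print(B().hello())             # hello, x is 2
print(issubclass(B, A))        # True
print(type(C) is Meta)         # True
print(C.describe())            # class C
```

Zero-argument `super()` will not work inside methods defined this way, because there is no real class body to provide the `__class__` cell; the base class has to be named explicitly instead.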
Now the examples above will work, and we can recreate the behaviour of the `class` keyword without using it.
We created a simple class, a subclass, a metaclass, and a class which uses that metaclass, all without the `class` keyword, using a decorator that spans fewer than 10 lines of code.
We could even get rid of the need to return `locals()` from the definitions by using something like `inspect.getsource` and then running `exec` on the function body with a local environment supplied. However, `inspect.getsource` reads the literal text from the Python file to produce its output. I didn’t want to get that hacky, and preferred to stick to what is always available at runtime.
Note on recreating the `class` keyword with functions
One point to note with this is that scoping in classes is different from scoping in functions. Even though variables declared inside a `class` block are scoped to that block only, the `class` block can readily access global variables, even ones whose names it also binds itself.
It can get slightly confusing, but a class body that reads a global name and rebinds it locally is perfectly valid Python code.
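A short sketch of the difference, assuming a global `x` that is both read and rebound inside the block (the function `body` is a hypothetical stand-in for a decorated class definition):

```python
x = 10


class A:
    x = x + 1        # reads the global x, then binds a class attribute x


print(A.x)           # 11
print(x)             # 10, the global is unchanged


def body():
    x = x + 1        # x is local to the whole function, so the read fails
    return locals()


try:
    body()
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```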
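This is not possible with our way of defining classes: once a function body assigns to a name, that name is local for the whole function, so reading the global value first fails unless the name is declared with `global` (or `nonlocal`, for a variable in an enclosing function).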
Conclusion
- Because everything in Python is an object and all objects have a type, there exists a strange relationship between `object` and `type`.
- We looked at how `type` is itself a `type` and how that relationship is defined in the interpreter.
- We saw how Python uses descriptors to implement methods and how we can craft our own.
- We blurred the line between classes and objects by eliminating the need for a special keyword for classes, and instead working with `type`.
The objects-all-the-way-down philosophy has its quirks, but it also makes Python very expressive. We can combine classes and functions in the same way as values of primitive types. Lists and tuples need not be restricted to a single type, and we can even have custom metatypes. All of this is possible because there is some shared structure to all objects that the interpreter can exploit. With the knowledge of a few rules, we can extend and combine anything in Python.
Footnotes
[1] In fact, Python allows you to define your own metaclasses. Then the class will be an instance of a type other than the default, which is `type`. Note that metaclasses also need to subclass from `type`, so the stated relationship between `type` and classes still holds.
[2] Example borrowed from Descriptor HowTo Guide, Python documentation.