Python’s type system is fascinating. Everything in Python (anything that can be given a name or passed as an argument) is an object. This includes primitives (such as int, str, and bool objects), compound objects (made from handwritten classes), and, very interestingly, the classes or types themselves. The fact that everything in Python derives from a common base makes it very powerful, because any two of these objects can be combined or extended in similar ways. However, the implementation of this idea has consequences. It sometimes gives rise to confusing behaviours that are hard to reason about if you only understand them intuitively.

In this post, we will go over what this implies, understand the type-system bottom-up, and then use that understanding to reason about some confusing behaviours. By the end of this post, you will have a clear picture of what classes and objects really are and what lies behind this abstraction.

Table of contents

  1. The Chicken-Egg problem
  2. Method lookup
  3. Classes as objects

Note: I drafted an updated version of some of the explanations for a more recent talk. It drops the 2nd section on method lookup, but I believe it has a better introduction into the chicken-egg problem and creating objects as classes. The JupyterLab slides can be found here. You can scroll to move between slides.

The Chicken-Egg Problem

object and type

When people say that everything is an object in Python, they mean that very literally. There is an object class that all Python objects are instances of. This is also the implicit base class when you create a new class.

>>> isinstance(5, object)
True
>>> isinstance(int, object)
True
>>> isinstance("abcdef", object)
True

Classes themselves are objects too. This is slightly counter-intuitive when you think of an object as an instance of some class. But it is very much an object in that you can pass it around as a function argument, name it differently, set attributes and call methods on it.
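
As a small sketch of what that means in practice (the names Greeter, describe, and Alias are made up for illustration), a class can be passed to a function, bound to another name, and given new attributes like any other object:

```python
class Greeter:
    def greet(self):
        return "hello"


def describe(cls):
    # A class is passed in here as an ordinary argument
    return f"{cls.__name__} has {len(cls.__mro__)} classes in its MRO"


Alias = Greeter               # bind the same class object to another name
Alias.language = "English"    # set an attribute on the class object

description = describe(Greeter)
same_object = Alias is Greeter
instance_greeting = Alias().greet()
print(description, same_object, Greeter.language, instance_greeting)
```

Because Alias and Greeter are two names for one object, the attribute set through Alias is visible through Greeter as well.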

All classes are instances of the type class. In that sense, type is a metaclass and other metaclasses need to inherit from type [1].

Relationship between object and type

There are three things about this that are confusing when considered together:

  1. object is a class
  2. All classes are objects of the type type.
  3. type itself is a class

We can verify this in the Python interpreter.

If an object is a class, it should be an instance of the type class.

>>> isinstance(object, type)
True

The interpreter agrees.

To verify #2, we will create a custom class and check whether it’s an instance of object. Then we’ll also verify whether it is an instance of type.

>>> class A:
...   pass
...
>>> isinstance(A, object)
True
>>> isinstance(A, type)
True

Checks out.

Now let’s see if type is a class. For this, we will check whether type is an instance of type and whether type is an object.

>>> isinstance(type, type)
True
>>> isinstance(type, object)
True

See, I wasn’t lying.

What’s happening here? type is a type, type is also an object, and object is itself a type.

We can understand this relationship better with issubclass:

>>> issubclass(type, object)
True
>>> issubclass(object, type)
False

While type is a subclass of object, object isn’t a subclass of type. Instead, only the type of object is type. That is, object is an instance of type, but does not inherit from it.

This diagram captures this relationship accurately:

Each object has a type, some other metadata, and its attributes associated to it. A type itself has everything that an object does (including a type), along with some extra fields such as the base classes that it borrows from.
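
The same two pieces of information, the type and the bases, can be read back from Python itself. This is only a quick sketch using the built-in type() function and the __bases__ attribute; the variable names are chosen for illustration:

```python
# Every object reports its type through type(), and every class
# reports its direct base classes through __bases__.
type_of_object = type(object)        # the type of object is type
type_of_type = type(type)            # type is its own type
bases_of_type = type.__bases__       # type inherits from object
bases_of_object = object.__bases__   # object has no base classes

print(type_of_object, type_of_type, bases_of_type, bases_of_object)
```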

This relationship can be better understood when we take a look at how it is implemented in CPython.

PyObject and PyTypeObject

In the CPython implementation, there are two basic structs for PyObject and PyTypeObject that look kind of like this. Notice that the _object struct has a pointer to its type (*ob_type), and there is a separate bases tuple in _typeobject that stores the bases for a type object.

// a simplified struct definition for PyObject and PyTypeObject

#define PyObject_HEAD       PyObject ob_base;

typedef struct _object PyObject;
typedef struct _typeobject PyTypeObject;

struct _object {
    PyTypeObject *ob_type;
    // and other metadata
};

struct _typeobject {
    PyObject_HEAD  // macro to "inherit" from PyObject
    const char  *name;
    PyObject    *bases;
    // and other things
};

The ob_type of object is manually set to type, and this object is then added to the bases tuple in type.

To make it clear, let’s try to write the C code that will represent such a relationship:

// let's assume this function is defined and initialises a tuple with a single PyObject
PyObject* tuple_init(PyObject*);

PyTypeObject type = { .name = "type" };
PyTypeObject object = { .name = "object" };

// in C, these assignments have to live inside a function
void init_core_types(void) {
    // type is its own type
    type.ob_base.ob_type = &type;

    // the type of object is type
    object.ob_base.ob_type = &type;

    // type.bases needs to be set to a Python Tuple
    type.bases = tuple_init((PyObject*) &object);
}

wtfpython has a good explanation on this here and CPython has comments explaining this too: https://github.com/python/cpython/blob/a286caa937405f7415dcc095a7ad5097c4433246/Include/object.h#L24-L29

Method Lookup

Lookup order, and setting and deleting attributes

This section sets up the context for the next one about method lookup.

Attribute lookup in Python goes roughly in the order Object → Class → Base classes. Setting and deleting attributes, however, only ever acts directly on the object itself.

Take this example where we initialise an object whose class has an attribute x, and then try to delete it:

>>> class A:
...   x = 10
...
>>> a = A()
>>> print(a.x)
10
>>> del a.x
AttributeError: x

The attributes of a class can be looked up by its objects, but cannot be deleted from there.

However, deleting directly from the class will work fine:

>>> del A.x  # is fine
>>> print(a.x)
AttributeError: 'A' object has no attribute 'x'

The same happens when setting attributes too. We can see it in this example where we set x on both the object and the class, and then delete the x on the object:

>>> class A:
...   x = 10  # set on class
...
>>> a = A()
>>> a.x = 20  # set on object
>>> print(a.x)
20
>>> del a.x  # delete from object
>>> print(a.x)  # doesn't raise attribute error because A.x is still there
10

Even though we deleted a.x, A.x was still there and that’s the value we got when we printed a.x the second time.

We can observe that the lookup is dynamic: any modification to the attributes on the class is reflected when we perform the lookup from the object.
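
One way to watch this happen is to look at the namespaces directly with the built-in vars(). The following is a minimal sketch; the variable names before, after, class_value, and looked_up are only there for illustration:

```python
class A:
    x = 10


a = A()
before = dict(vars(a))       # instance namespace is empty: x lives on the class
a.x = 20                     # creates an entry on the instance only
after = dict(vars(a))
class_value = vars(A)["x"]   # the class attribute is untouched

del a.x                      # removes the instance entry again
A.x = 30                     # rebinding on the class is visible through a
looked_up = a.x
print(before, after, class_value, looked_up)
```

Setting a.x never touched A, and once the instance entry is gone the lookup falls through to whatever A.x currently is.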

self injection

When we define methods inside a class body, they take the bound object as the first argument, self. This is strange because self is supplied automatically when the method is called from an initialised object, but not when the method is called from the class itself.

>>> class A:
...   def hello(self):
...     print("Hello, world!")
...
>>> A.hello()
TypeError: A.hello() missing 1 required positional argument: 'self'
>>> a = A()
>>> a.hello()
Hello, world!

Additionally, this self injection does not happen at the time that the function is called, because Python also allows you to assign this method to a different variable and it still works:

>>> a = A()
>>> f = a.hello
>>> f()
Hello, world! 

So what’s the magic here?

If we inspect the types of A.hello and a.hello, we’ll see the difference.

>>> type(A.hello)
<class 'function'>
>>> type(a.hello)
<class 'method'>

They’re not the same, and a.hello is a method and not a function.

The hello attribute returns different values depending on whether it is accessed from the class or from an initialised object. Python allows such behaviour with descriptors.

Descriptors

Python descriptors are objects whose class defines __get__, __set__, or __delete__ methods. When a descriptor is accessed as an attribute, these methods are called instead of the stored value being returned directly. A popular example of descriptors in use is the property decorator, which can be used for dynamic attributes.

Python functions are actually descriptors with a __get__ method that returns a method with the object and function bound to it. I’ll take a couple of examples to explain descriptors.

A simple descriptor that always returns 10 would look like this [2]:

>>> class Ten:
...   def __get__(self, obj, objtype=None):
...     return 10
...
>>> class A:
...   x = 5
...   y = Ten()
...
>>> a = A()
>>> a.x
5
>>> a.y
10

A descriptor has access to the object it is called with from the obj argument. We could write this descriptor to return the square of obj.x:

>>> class SquareX:
...   def __get__(self, obj, objtype=None):
...     return obj.x * obj.x
...
>>> class A:
...   y = SquareX()
...
...   def __init__(self, x):
...     self.x = x
...
>>> a = A(5)
>>> a.y
25

You can also directly try using the __get__ method on a function. Calling it with the object as its argument will return a method that binds this function and the object.
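
A minimal sketch of that experiment follows. It pulls the raw function out of the class namespace so that no binding has happened yet, and then calls __get__ by hand; the names plain_function and bound are illustrative:

```python
class A:
    def hello(self):
        return "Hello, world!"


a = A()
plain_function = A.__dict__["hello"]    # the raw function stored on the class
bound = plain_function.__get__(a, A)    # what attribute access on a does for us

print(type(plain_function).__name__, type(bound).__name__)
print(bound.__self__ is a, bound.__func__ is plain_function)
print(bound())
```

The resulting method remembers both halves: the object in __self__ and the original function in __func__.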

Crafting methods by hand

This is a side section, but rather interesting because we can directly observe how methods are actually represented differently from functions.

The method class isn’t available as a builtin, but you can get it by calling type() on a method. This class takes a function and an object during initialisation, in that order. We can use that to create methods by hand that were never bound to the class or the object.

>>> class A:
...   greeting = "Hello, I am Foo Bar."
...   def hello(self):
...     pass
...
>>> a = A()
>>> method_class = type(a.hello)
>>> def greet(self):
...   print(self.greeting)
...
>>> greet_with_a = method_class(greet, a)
>>> greet_with_a()
Hello, I am Foo Bar.

We just crafted a method that was never an attribute.
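
As a side note, the standard library exposes this same class as types.MethodType, so calling type() on an existing method is not strictly necessary. A short sketch, reusing the class from the example above but returning the greeting instead of printing it:

```python
import types


class A:
    greeting = "Hello, I am Foo Bar."

    def hello(self):
        pass


def greet(self):
    return self.greeting


a = A()
same_class = types.MethodType is type(a.hello)   # both names refer to one class
greet_with_a = types.MethodType(greet, a)        # function first, then object
print(same_class, greet_with_a())
```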

Classes as Objects

Creating classes with type 

Everything is an object, and classes are not exempt from it. We saw earlier that classes are instances of the type class.

We can also create classes with the type class initialiser, similar to other objects. For example:

>>> def foo(self):
...   return "bar"
...
>>> A = type("A", tuple(), {"foo": foo, "x": 10})
>>> a = A()
>>> a.foo()
'bar'
>>> a.x
10

The type signature of the 3-argument type initialisation is type(name, bases, dict) → type.

The name argument will be the value taken by cls.__name__, bases can be used to achieve inheritance by providing a tuple of classes, and dict is a dict of key-value pairs that are the attributes on this class.
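
To illustrate the bases argument, here is a small sketch that builds a parent and a child class without the class keyword. Animal, Bird, and describe are made-up names for this example:

```python
def describe(self):
    return f"{self.kind} with {self.legs} legs"


# A class with no explicit bases (object is implied)
Animal = type("Animal", (), {"kind": "animal", "legs": 4, "describe": describe})

# A subclass: bases is a tuple of classes, dict overrides two attributes
Bird = type("Bird", (Animal,), {"kind": "bird", "legs": 2})

b = Bird()
print(Bird.__name__, Bird.__bases__, b.describe())
```

Bird inherits describe from Animal through the normal lookup order, while its own kind and legs take precedence.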

When a metaclass needs to be used, that metaclass can be called in place of type with the same three arguments.
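
A minimal sketch of that, assuming a hypothetical metaclass named Meta that only customises how its classes are printed:

```python
class Meta(type):
    def __repr__(cls):
        return f"<Meta class {cls.__name__}>"


# Roughly what `class A(metaclass=Meta): x = 10` would produce
A = Meta("A", (), {"x": 10})

print(repr(A), type(A) is Meta, A().x)
```

A is still a class (it is an instance of type through inheritance), but its immediate type is now Meta.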

class keyword as syntactic sugar

Syntactic sugar is additional syntax that makes it easier to express something but doesn’t add a new feature to the language.

The class keyword can be seen as calling type with the variables defined in its scope as the value of the dict argument. In that sense, class is syntactic sugar for creating instances of type. To show this, we’ll write a decorator that mimics the behaviour of the class keyword as closely as possible.

At the end, we should be able to define classes with something like this:

@create_class()
def A():
    x = 10
    def __init__(self, y, z):
        self.y = y
        self.z = z
    return locals()

a = A(5, 10)

We should also be able to perform inheritance.

@create_class(A)
def B():
    x = 20
    return locals()

b = B(5, 10)  # y and z, as in A.__init__

And use metaclasses:

@create_class(type)
def Meta():
    def __repr__(self):
        return f"Meta {self.__name__}"
    return locals()

@create_class(metaclass=Meta)
def C():
    return locals()  # even an empty body has to return its namespace

This syntax is very similar to the one used to declare classes with the class keyword, except that it is a def, has a create_class decorator, and returns locals().

The decorator factory receives the base classes as positional arguments and an optional metaclass keyword argument, and the inner decorator receives the function object. Calling the function gives us the class namespace, and from there we can initialise the class.

Feel free to give it a try now before reading on with the solution.

Our decorator factory receives the base classes as positional arguments and the metaclass as an optional keyword argument, so we already have the actual class objects and nothing needs to be resolved by name. The inner decorator gets access to the function and calls it. We assume that the function returns the locals() dict from inside, and use that as the third argument in initialisation. Then we use the metaclass (by default, type) to initialise the class with the three arguments.

def create_class(*bases, **kwds):
    def wrapper(func):
        metaclass = kwds.get("metaclass", type)
        func_locals = func()
        klass = metaclass(
            func.__name__, bases, func_locals,
        )
        return klass
    return wrapper

Now we can recreate the behaviour of the class keyword without using it. These examples should work:

@create_class()
def A():
    x = 20
    return locals()

@create_class(A)
def B():
    y = 30

    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

    return locals()

@create_class(type)
def Meta():
    def __repr__(self):
        return f"meta {self.__name__}"
    return locals()

@create_class(B, metaclass=Meta)
def C():
    return locals()

We created a simple class, a subclass, a metaclass, and a class which uses that metaclass without using the class keyword, with a decorator that spans fewer than 10 lines of code.

We can even get rid of the need to return locals() from the definitions if we use something like inspect.getsource and then perform an exec on the function body with a local environment supplied. However, inspect.getsource actually reads the literal text from the Python file to produce its output. I didn’t want to get that hacky, and preferred to stick to what is always available at runtime.

Note on recreating the class keyword with functions

One point to note with this is that the scoping in classes is different from scoping in functions. Even though variables declared inside a class block are scoped to that block only, the class block can readily access global variables.

It can get slightly confusing, but this is perfectly valid Python code:

x = 10

class A:
    x = x + 5
    print("x inside A:", x)  # x inside A: 15

print("x outside A:", x)  # x outside A: 10

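This is not possible with our way of defining classes. Functions can read global variables, but assigning to x anywhere inside a function makes x local to the whole function body, so x = x + 5 raises UnboundLocalError instead of reading the global x. Declaring global x would not help either, because the assignment would then rebind the global instead of creating a class attribute.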
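
A short sketch of the failure, using a plain function named body as a stand-in for one of our decorated definitions:

```python
x = 10


def body():
    # Assigning to x anywhere in the function makes x a local name,
    # so the read on the right-hand side happens before x is bound.
    x = x + 5
    return locals()


try:
    namespace = body()
    error_name = None
except UnboundLocalError as exc:
    namespace = None
    error_name = type(exc).__name__

print(error_name)
```

A workaround in the function-based version would be to pick a different local name for the attribute or to read the global through globals()["x"], but neither matches what a class body allows.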

Conclusion

  • Because everything in Python is an object and all objects have a type, there exists a strange relationship between object and type
  • We looked at how type is itself a type and how that relationship is defined in the interpreter
  • We saw how Python uses descriptors to implement methods and how we can craft our own
  • We blurred the line between classes and objects by eliminating the need for a special keyword for classes, and instead working with type.

The objects-all-the-way-down philosophy has its quirks, but it also makes Python very expressive. We can combine classes and functions in the same way as values of primitive types. Lists and tuples need not be restricted to a single type and we can even have custom meta types. All of this is possible because there is some shared structure to all the objects that the interpreter can exploit. With the knowledge of a few rules, we can extend and combine anything in Python.

Footnotes

[1] In fact, Python allows you to define your own metaclasses. Then the class will be an instance of a different type other than the default that is type. Note that metaclasses also need to subclass from type, so the stated relationship between type and classes still holds.

[2] Example borrowed from Descriptor HowTo Guide, Python documentation.