Python instance lifecycle and metaclasses

Noel
6 min read · Feb 19, 2022

Before we start: this article reflects my own understanding after studying the references.

Instance lifecycle

In general, the lifecycle of a Python object begins with the creation of an instance.

The special methods that let you observe this process are __new__, __init__, and __del__.

Here is an example.

class Person:
    def __new__(cls, *args, **kwargs):
        new_person = object.__new__(cls)
        print("Person __new__ gets called")
        print(f"Person Instance Memory Address: {id(new_person)}")
        return new_person

    def __init__(self, name):
        print(self.__dict__)
        self.name = name
        print("Person __init__ gets called")
        print(f"Set Initial State: {id(self)}")
        print(self.__dict__)

    def __del__(self):
        print(f"The object is getting deleted: {self}")

First of all, __new__(cls, *args, **kwargs) receives a class, not an instance, as its first argument. At this point no instance exists yet for self to point to, so the method takes cls and creates the object from it.

In other words, this function creates an instance object in memory.

> Person Instance Memory Address: 2452543222352

When id() is called here, it returns an integer; in CPython that integer is the object's memory address.

Most of the actual work of creating the object is done inside the CPython implementation.

Next, __init__(self, *args) defines the initial state of the created object.
It is not called by __new__ itself: once __new__ has returned an instance of the class, the call machinery (type.__call__) invokes __init__ on that instance.
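Here is a minimal sketch of when __init__ actually runs. As far as I understand it, calling a class runs __new__ first and then runs __init__ only if __new__ returned an instance of that class. The class names Normal and ReturnsOtherType are made up for this illustration.

```python
class Normal:
    def __new__(cls, *args, **kwargs):
        return object.__new__(cls)

    def __init__(self, name):
        self.name = name


class ReturnsOtherType:
    def __new__(cls, *args, **kwargs):
        # The returned value is not an instance of cls,
        # so __init__ is skipped entirely.
        return "not an instance"

    def __init__(self, name):
        raise AssertionError("never reached")


normal = Normal("a")
other = ReturnsOtherType("a")

print(normal.name)   # a
print(other)         # not an instance
```

The second class never raises, which shows that __init__ was not called for it.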

Generally, you override this method to define the attributes of an instance, and the values are stored in a dict.

You can check these values through the __dict__ attribute (it is an attribute, not a method), as the example does with print(self.__dict__).

> {'name': 'a'}

If you call id(self) inside __init__, it prints the same value as id(new_person) did in __new__. This shows that both methods are working on the same object.
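The check can also be done without reading printed addresses by eye. The sketch below is a reduced variant of the Person class that records the ids instead of printing them; the seen_ids list is an addition for this illustration only.

```python
class Person:
    seen_ids = []  # ids recorded by __new__ and __init__, in call order

    def __new__(cls, *args, **kwargs):
        new_person = object.__new__(cls)
        cls.seen_ids.append(id(new_person))
        return new_person

    def __init__(self, name):
        type(self).seen_ids.append(id(self))
        self.name = name


person = Person("a")
id_in_new, id_in_init = Person.seen_ids

print(id_in_new == id_in_init == id(person))  # True
print(vars(person))                           # {'name': 'a'}
```

vars(person) returns the same dict as person.__dict__.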

Additionally, hackers often need the memory address expressed in hexadecimal.

You can get the hexadecimal form with something like the following:

hex(id(person))

If you do not assign the new object to a variable, nothing references it, so it is destroyed right away and __del__(self) is called as part of that.

However, if the object is assigned to a variable, the name is bound in the corresponding namespace and the object stays available for future use; its reference count is incremented for memory management.
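A small sketch of the difference, assuming CPython and running as a script (in the interactive prompt, the special variable _ may keep the last result alive). The Tracked class and its deleted list are hypothetical names for this illustration.

```python
class Tracked:
    deleted = []  # names of instances whose __del__ has run

    def __init__(self, name):
        self.name = name

    def __del__(self):
        type(self).deleted.append(self.name)


Tracked("temporary")    # no variable keeps this object alive
kept = Tracked("kept")  # the name `kept` holds a reference

print(Tracked.deleted)  # under CPython: ['temporary']
```

Other Python implementations may run __del__ later, because they do not rely on reference counting in the same way.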

Note on variables: other programming languages describe creating and initializing a variable as storing a value in memory, but in Python it is better to think of a variable as a label attached to an already created object. As proof of this, if you print the result of the function below, you can see that string keys (the names) are mapped to objects.

> print(globals())

There are two ways to check the reference count in Python, one is using the sys module and the other is using gc.

import sys
import gc

sys.getrefcount(person)
len(gc.get_referrers(person))

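gc.get_referrers returns a list of the objects that refer directly to person. Here that list contains the module's globals dict, so it looks like List[Dict[str, Any]].

Also, sys.getrefcount generally reports a value one higher than you might expect, because it includes the temporary reference created by passing the object as an argument.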
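Because the absolute number is hard to predict, comparing two readings is a more reliable way to watch the reference count change. A minimal sketch, assuming CPython; the Person class here is an empty stand-in.

```python
import gc
import sys


class Person:
    pass


person = Person()
count_before = sys.getrefcount(person)

alias = person  # one more label on the same object
count_after = sys.getrefcount(person)

print(count_after - count_before)  # 1 under CPython

# Objects that hold a direct reference to `person`
referrers = gc.get_referrers(person)
print(len(referrers))
```

The exact contents of referrers depend on where the code runs, so I would not rely on its length in real code.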

The del statement (it is a statement, not a function) removes a name and thereby decrements the object's reference count by one; when the reference count reaches zero, the object is removed from memory.

del person
'person' in globals()
# output is False

As the output shows, the name disappears from the namespace as well.

The del statement can be confused with __del__.

del only removes a variable label from its namespace, whereas __del__ defines the action to take when the object itself is destroyed.
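The difference shows up as soon as an object has two labels. In this sketch (CPython assumed, Resource and events are hypothetical names), removing the first label does not trigger __del__; only removing the last one does.

```python
class Resource:
    events = []

    def __del__(self):
        type(self).events.append("__del__ called")


first = Resource()
second = first  # two labels, one object

del first  # removes only the label `first`
events_after_first_del = list(Resource.events)

del second  # last reference gone, so CPython destroys the object now
events_after_second_del = list(Resource.events)

print(events_after_first_del)   # []
print(events_after_second_del)  # ['__del__ called']
```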

Metaclass

In general, we think of a class as a data type, but actually, a class itself is an object.

print(type(Person)) 
# output is <class 'type'>

This means that the Person class is itself an instance of a metaclass called type.

So it is possible to create a class without using the class keyword.

MyClass = type('MyClass', (), {})
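The three arguments are the class name, a tuple of base classes, and a dict that becomes the class namespace. A slightly fuller sketch, where Point and describe are made-up names:

```python
def describe(self):
    return f"{type(self).__name__}(x={self.x}, y={self.y})"


# type(name, bases, namespace) builds the class object directly
Point = type("Point", (object,), {"x": 0, "y": 0, "describe": describe})

point = Point()
point.x = 3

print(point.describe())  # Point(x=3, y=0)
print(type(Point))       # <class 'type'>
```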

But the class statement is not just syntactic sugar for calling type.

It also performs additional steps: it calls __prepare__ to create the class namespace, and it sets attributes such as __qualname__ (the dotted path to a class, function, or method within a module) and __doc__ (the docstring).

Among these double-underscore names are special methods, which provide functionality similar to operator overloading in C++.

Let me introduce a couple of the lesser-known ones.
First, there is __call__, which lets us call an object like a function (such objects are also known as functors).
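A minimal functor might look like this; Multiplier is a hypothetical class used only for illustration.

```python
class Multiplier:
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        # Runs when the instance is called like a function
        return value * self.factor


double = Multiplier(2)

print(double(21))        # 42
print(callable(double))  # True
```

Unlike a plain function, the object keeps its own state (factor) between calls.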

Next, __slots__ ensures that instances have no fields other than the ones specified.
Normally, fields live in __dict__, which allows them to be created and deleted freely.

However, if you use __slots__, instances are created without a __dict__, which reduces memory usage.

class Foobar:
    __slots__ = "a", "b", "c"

'''
>>> foo = Foobar()
>>> foo.a = 1
>>> foo.x = 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Foobar' object has no attribute 'x'
'''
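To see the missing __dict__ directly, the sketch below compares a regular class with a slotted one. WithDict and WithSlots are hypothetical names; the exact wording of the error message can differ between Python versions, so the code only captures it.

```python
class WithDict:
    def __init__(self):
        self.a = 1


class WithSlots:
    __slots__ = ("a",)

    def __init__(self):
        self.a = 1


plain = WithDict()
slotted = WithSlots()

print(hasattr(plain, "__dict__"))    # True
print(hasattr(slotted, "__dict__"))  # False

error_message = None
try:
    slotted.x = 2  # "x" is not listed in __slots__
except AttributeError as exc:
    error_message = str(exc)
print(error_message)
```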

This is how an object finds a property.

However, the way a class finds its own attributes is a little different.

When you look up an attribute on an instance, the search ends at its class (including base classes); when you look one up on a class, the search goes one level further, to the metaclass.
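A rough sketch of that difference: a method defined on the metaclass is reachable from the class but not from an instance of the class. Meta, Foo, and describe are hypothetical names.

```python
class Meta(type):
    def describe(cls):
        # Defined on the metaclass, so it is found when looked up on a class
        return f"class {cls.__name__}"


class Foo(metaclass=Meta):
    greeting = "hello"


foo = Foo()

print(foo.greeting)    # hello (found on the class Foo)
print(Foo.describe())  # class Foo (found on the metaclass Meta)

try:
    foo.describe()
except AttributeError:
    instance_sees_meta = False
else:
    instance_sees_meta = True
print(instance_sees_meta)  # False
```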

__new__ is a similar source of confusion between classes and metaclasses, because a metaclass such as type can define it too.

So, when we have a class called Foo:

  • Foo.__new__ is used to create an instance of Foo.
  • type.__new__ is used to create a class like Foo.
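The two bullets above can be sketched by counting calls. Meta stands in for type here (it subclasses type and delegates to it), and the counters are additions for this illustration.

```python
class Meta(type):
    created_classes = []

    def __new__(mcls, name, bases, namespace):
        # Runs once, when a class using this metaclass is created
        mcls.created_classes.append(name)
        return super().__new__(mcls, name, bases, namespace)


class Foo(metaclass=Meta):
    created_instances = 0

    def __new__(cls, *args, **kwargs):
        # Runs every time an instance of Foo is created
        cls.created_instances += 1
        return super().__new__(cls)


Foo()
Foo()

print(Meta.created_classes)   # ['Foo']
print(Foo.created_instances)  # 2
```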

The __slots__ discussion noted that fields (properties) are normally stored in __dict__. For a class, that namespace is created by __prepare__, which is called while the class is being created (that is, while the class statement that defines it is executed).
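To make the role of __prepare__ concrete, here is a sketch of a metaclass that logs its calls and puts one name into the namespace before the class body runs. Meta, calls, and prepared_by are hypothetical names.

```python
class Meta(type):
    calls = []

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # Called first: returns the mapping the class body will execute in
        mcls.calls.append("__prepare__")
        return {"prepared_by": mcls.__name__}

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Called after the class body has filled the namespace
        mcls.calls.append("__new__")
        return super().__new__(mcls, name, bases, namespace)


class Foo(metaclass=Meta):
    # The body can already see names placed there by __prepare__
    label = f"namespace prepared by {prepared_by}"


print(Meta.calls)  # ['__prepare__', '__new__']
print(Foo.label)   # namespace prepared by Meta
```

Everything left in the namespace after the body runs, including prepared_by, ends up as a class attribute.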

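Creating an ordinary instance works like this: calling the class invokes the metaclass's __call__, which runs the class's __new__ and then its __init__.

Creating a class goes through more steps: the metaclass's __prepare__ builds the namespace, the class body executes in that namespace, and then the metaclass is called with the name, the bases, and the namespace, which runs its __new__ and __init__.

Since that last step simply calls the metaclass, any callable can be used as a metaclass.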

class Foo(metaclass=print):
    pass
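With print as the metaclass, the name, bases, and namespace are simply printed, and Foo ends up bound to None because that is what print returns. A more useful callable returns a class. The sketch below uses a hypothetical function, upper_attrs, that renames attributes and then lets type build the class.

```python
def upper_attrs(name, bases, namespace):
    # Hypothetical metaclass function: upper-cases non-dunder attribute
    # names, then delegates the actual class creation to type
    converted = {
        (key if key.startswith("__") else key.upper()): value
        for key, value in namespace.items()
    }
    return type(name, bases, converted)


class Foo(metaclass=upper_attrs):
    bar = "baz"


print(Foo.BAR)              # baz
print(hasattr(Foo, "bar"))  # False
print(type(Foo))            # <class 'type'>
```

Note that type(Foo) is plain type, so a function used this way is not inherited by subclasses the way a real metaclass is.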

Also, since a metaclass is inherited by subclasses, problems arise when a class inherits from multiple classes that have different metaclasses.

class Meta1(type):
    pass

class Meta2(type):
    pass

class Base1(metaclass=Meta1):
    pass

class Base2(metaclass=Meta2):
    pass

class Foobar(Base1, Base2):
    pass

'''
Exception:
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases
'''
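One common way around the conflict, as far as I know, is to define a metaclass that inherits from both metaclasses and pass it explicitly. The sketch below first reproduces the error and then applies that workaround; CombinedMeta and Conflicting are hypothetical names.

```python
class Meta1(type):
    pass


class Meta2(type):
    pass


class Base1(metaclass=Meta1):
    pass


class Base2(metaclass=Meta2):
    pass


try:
    class Conflicting(Base1, Base2):
        pass
except TypeError:
    conflict_raised = True
else:
    conflict_raised = False


class CombinedMeta(Meta1, Meta2):
    # Subclass of both metaclasses, so it satisfies both bases
    pass


class Foobar(Base1, Base2, metaclass=CombinedMeta):
    pass


print(conflict_raised)        # True
print(type(Foobar).__name__)  # CombinedMeta
```

This only works cleanly when the two metaclasses cooperate, for example by calling super() in any methods they override.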

However, if the metaclasses are related to each other by inheritance, there is no problem and the most derived metaclass is used.

class Meta(type):
    pass

class SubMeta(Meta):
    pass

class Base1(metaclass=Meta):
    pass

class Base2(metaclass=SubMeta):
    pass

class Foobar(Base1, Base2):
    pass

print(type(Foobar))

# output is <class '__main__.SubMeta'>

Wrap-up

So far, we have briefly discussed objects and their metaclasses.

To find out more, you probably need to go beyond Python and into the implementation, so I'd recommend reading the CPython documentation for PyObject and PyTypeObject first.

To use Python for hacking, I think it is important to know the life cycle of an object, since that work requires an in-depth understanding of the memory structure.

It will also help general developers to write more efficient code.

References
