Classes in Python are also objects.
This is a key concept. In Python, almost everything is an object, including both functions and classes. This means that functions and classes can be provided as arguments, exist as members of class instances, and do anything that any other object is capable of doing.
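As a small illustration of this point, the following sketch passes classes around like any other value. The Dog and Duck classes and the make_and_speak function are hypothetical names invented for this example.

```python
class Dog(object):
    def speak(self):
        return "Woof"


class Duck(object):
    def speak(self):
        return "Quack"


def make_and_speak(cls):
    """Accept a class as an argument, instantiate it, and call a method."""
    return cls().speak()


# Classes can be stored in containers and passed to functions.
animal_classes = [Dog, Duck]
sounds = [make_and_speak(cls) for cls in animal_classes]

# Like other objects, classes can have attributes assigned to them.
Dog.legs = 4
```

Here, sounds ends up holding the result of calling speak on one instance of each class, and Dog.legs is an ordinary attribute on the class object.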
What else does it mean to say that classes are objects? Chapter 4, “Magic Methods,” discussed how object instantiation works. The __new__ and __init__ methods of the class are called, in that order, to create the new object. Classes are not an exception to this process. Classes themselves, being objects, are instances of another class, which is responsible for creating them.
The classes responsible for generating other classes are called metaclasses. “Meta-” is a Greek prefix that simply means “post-” or “after.” For example, a portion of Aristotle's work is called “The Physics,” and the subsequent portion is called “The Metaphysics,” which simply means “the stuff that comes after the physics.” However, the meaning assigned to this prefix has since evolved to refer to a level of self-reference—an instantiation of a concept in order to work on that concept. If you have ever been unfortunate enough to be forced to sit through a meeting to plan other meetings, that particular atrocity could rightly be called a meta-meeting.
This chapter covers metaclasses. First, it delves into the philosophy behind Python's object model, and how metaclasses, classes, and objects connect to one another. Then, it explores examples of specific ways metaclasses can be used.
The relationship between a class and an instance of that class is straightforward and two-fold. First, a class defines the properties and available actions of its instances. Second, a class serves as a factory that creates said instances.
With this in mind, the only additional understanding necessary to grasp metaclasses is the realization that this relationship can be hierarchical. When you instantiate a class that you write, your class serves as the definition of the instance's properties and actions, and performs the generation of the instance. When you define the class itself, you are simply using a special syntax that stands in for the instantiation of a different class, called type.
This can best be illustrated by creating a class using type directly, rather than using the Python class keyword. This is syntactically quite ugly, but it offers a clear view into what is going on under the hood.
Therefore, consider the following simple set of classes:
class Animal(object):
    """A class representing an arbitrary animal."""
    def __init__(self, name):
        self.name = name
    def eat(self):
        pass
    def go_to_vet(self):
        pass

class Cat(Animal):
    def meow(self):
        pass
    def purr(self):
        pass
The Animal class obviously represents an animal, and defines certain things that the animal is capable of doing, such as eating and being taken to the vet. The Cat subclass additionally knows how to meow and purr, functions not available to other animals. (The method bodies are stubbed, and left to the reader's intuition.)
What happens here is that when the Python interpreter executes the class Animal(object) block, it evaluates the body of the block and then invokes the type constructor under the hood. As alluded to earlier, type is a built-in class in Python, and it is the default class of other class objects. In other words, it is the default class that creates other classes: the default metaclass.
However, nothing stops you from simply doing this directly. The type constructor takes three positional arguments: name, bases, and attrs. The name argument (a string) is simply the name of the class. The bases argument is a tuple of the superclasses for that class. Python supports multiple inheritance, which is why this is a tuple. If you are only inheriting from a single class, just send a tuple with a single element. Finally, the attrs argument is a dictionary of all the attributes on the class.
The following code is (roughly) equivalent to the previous class Animal block:
def init(self, name):
    self.name = name

def eat(self):
    pass

def go_to_vet(self):
    pass

Animal = type('Animal', (object,), {
    '__doc__': 'A class representing an arbitrary animal.',
    '__init__': init,
    'eat': eat,
    'go_to_vet': go_to_vet,
})
This is, obviously, not the preferred way to instantiate a new class. Also, note that it is only roughly equivalent. It has a couple of differences, most notably that this code leaves freestanding functions called init, eat, and go_to_vet in the enclosing namespace, in addition to attaching them to the class. This is worth noting, but not particularly important for the purposes of this discussion.
Focus on the call to type. The first argument is just the string 'Animal'. There is some repetition here. You are sending this string to assign the name of the class, but you are also assigning the result of the type call to the variable Animal. The class keyword handled this for you. Because this is a direct call to type, you must manually assign the result to a variable, as you would for a new instance of any other class.
The second argument is a tuple with a single item: (object,). This means that the Animal class inherits from object, as it did in the initial class. You need the trailing comma to tell the Python interpreter that you want a tuple here. Parentheses have other uses in Python, and so a trailing comma is required for tuples with only a single element.
The third argument is a dictionary that defines the attributes of the class, equivalent to the indented portion of the class block. You previously defined functions that map to the functions in your original class, and you pass them in through the attrs dictionary. The dictionary keys determine the names of the attributes within the class. One thing to note here is the docstring. The Python interpreter automatically takes the docstring in a class block and assigns it to the attribute __doc__. Because you are instantiating type directly, you must do that manually.
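To make the "roughly equivalent" claim concrete, the following sketch builds the class both ways and compares the results. The name AnimalViaKeyword is introduced only for this comparison, and the go_to_vet method is omitted for brevity.

```python
def init(self, name):
    self.name = name


def eat(self):
    pass


# Built by calling type directly.
Animal = type('Animal', (object,), {
    '__doc__': 'A class representing an arbitrary animal.',
    '__init__': init,
    'eat': eat,
})


# Built with the class keyword (hypothetical name, for comparison only).
class AnimalViaKeyword(object):
    """A class representing an arbitrary animal."""
    def __init__(self, name):
        self.name = name

    def eat(self):
        pass


pet = Animal('Rex')

# One of the small differences: the function keeps the name it was
# defined with, even though the class exposes it as __init__.
init_function_name = Animal.__dict__['__init__'].__name__
```

Both classes have the same name-independent structure (the same bases, the same docstring, and type as their class), while init_function_name shows one place where the hand-built version differs.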
You can create the Cat class similarly, as shown here:
def meow(self):
    return None

def purr(self):
    return None

Cat = type('Cat', (Animal,), {
    'meow': meow,
    'purr': purr,
})
This is mostly more of the same. The big change here is that you are now subclassing Animal rather than object. What you are passing here is the Animal class itself. Also, note that it is still a tuple with a single element. You are not passing (Animal, object). The fact that object is Animal's superclass is baked into the Animal class already. Sending in a tuple with more than one element is only necessary for multiple inheritance situations.
Consider the following instance of the Cat class:
louisoix = Cat(name='Louisoix')
Notice the three things in play here. louisoix is an object, and an instance of Cat. The Cat class is also an object (because classes are objects), and is an instance of type. Finally, type is the top of the chain.
You can also observe this in another way. Passing a single object to type returns its class, as shown here:
>>> type(5)
<class 'int'>
So, observe the following chain:
>>> type(louisoix)
<class '__main__.Cat'>
>>> type(Cat)
<class 'type'>
>>> type(type)
<class 'type'>
The type class is the base case here. It is the top of the chain, and, therefore, type(type) returns itself.
type is the primary metaclass in Python. Ordinary classes that are created with the class keyword, by default, have type as their metaclass.
Colloquially, you can refer to type as the metaclass for both the class (Cat) and its instances (louisoix).
Additionally, type is also the superclass from which other metaclasses inherit. This is analogous to object being the class from which other classes inherit. Just as object is the top of the class hierarchy, type is the top of the metaclass hierarchy.
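The two hierarchies described here (the "is an instance of" chain and the "is a subclass of" chain) can be checked directly with isinstance and issubclass. The following is a minimal sketch; Animal and Meta are stand-in classes defined only for this example.

```python
class Animal(object):
    pass


class Meta(type):
    pass


checks = {
    # The "instance of" relationship: classes are instances of type.
    'Animal is an instance of type': isinstance(Animal, type),
    'object is an instance of type': isinstance(object, type),
    'type is an instance of itself': type(type) is type,
    # The "subclass of" relationship: object tops the class hierarchy,
    # and type tops the metaclass hierarchy.
    'Animal is a subclass of object': issubclass(Animal, object),
    'Meta is a subclass of type': issubclass(Meta, type),
    'type is a subclass of object': issubclass(type, object),
}
```

Every value in checks is True, which reflects the slightly circular arrangement at the top: type is itself a class (so it inherits from object), and object is itself a class (so it is an instance of type).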
Writing a metaclass is syntactically very straightforward. You simply declare a class (using the class keyword) that inherits from type. The beauty of this object model shines through here. Classes are just objects, and metaclasses are just classes. The behaviors that metaclasses take on are inherited from type. Any class that subclasses type is, therefore, capable of functioning as a metaclass.
Before going into examples, note as an aside that you should never attempt to declare or use a metaclass that does not subclass type, whether directly or through another metaclass. Doing so will cause havoc with Python's multiple inheritance. Python's inheritance model requires any class to have exactly one metaclass. Inheriting from two classes with different metaclasses is acceptable if (and only if) one of the metaclasses is a subclass of the other (in which case, the subclass is used). Attempting to implement a metaclass that does not subclass type will break multiple inheritance with any classes that use that metaclass, along with any classes that use type (that is, virtually all of them). You do not want to do this.
The most important method that custom metaclasses must define is the __new__ method. This method actually handles the creation of the class, and must return the new class.
The __new__ method is a static method that Python special-cases, so it does not need to be explicitly decorated as such, and it receives the class being instantiated as its first argument. The arguments sent to __new__ in custom metaclasses must mirror the arguments sent to type's __new__ method, which takes four positional arguments.
The first argument is the metaclass itself, prepended to the arguments in a manner similar to that of bound methods. By convention, this argument is called cls.
Beyond this, __new__ expects three positional arguments:
First, the desired name of the class as a string (name)
Second, a tuple of the class's superclasses (bases)
Third, a dictionary of attributes that the class should contain (attrs)
Most custom implementations of __new__ in metaclasses should ensure that they call the superclass implementation, and perform whatever work is needed in the code around that.
Recall at this point the distinction between the __new__ method and the __init__ method. In a class or a metaclass, the __new__ method is responsible for creating and returning the object. Conversely, the __init__ method is responsible for customizing the object after it has been created, and returns nothing.
In ordinary classes, you generally do not define a custom __new__ method at all. By contrast, defining a custom __init__ method is extremely common. This is because the implementation of __new__ provided by object is essentially always sufficient, but it is also necessary. Overriding it (even in direct subclasses of object) would require calling the superclass method and being careful to return the result (the new instance). Overriding __init__, on the other hand, is easy and relatively risk-free. The implementation of __init__ provided by object is a no-op, and the method does not return anything at all.
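The division of labor between the two methods in an ordinary class can be sketched as follows. The Point class and the calls list are hypothetical, introduced only to record the order in which the methods run.

```python
calls = []


class Point(object):
    def __new__(cls, x, y):
        calls.append('__new__')
        # object.__new__ accepts only the class here; its result (the new
        # instance) must be returned, or __init__ is never run.
        return super(Point, cls).__new__(cls)

    def __init__(self, x, y):
        calls.append('__init__')
        # Customize the already-created instance; nothing is returned.
        self.x = x
        self.y = y


p = Point(1, 2)
```

After this runs, calls records that __new__ ran before __init__, and p carries the attributes that __init__ assigned.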
When you're writing custom metaclasses, this behavior changes. Custom metaclasses generally should override the __new__ method, and generally do not implement an __init__ method at all. When doing this, keep in mind that you almost always must call the superclass implementation. type's implementation of __new__ will actually provide you with the object you need to do work on and return.
Before diving into a metaclass that customizes behavior, consider a custom metaclass that does nothing but check all the boxes that have been covered thus far.
class Meta(type):
    """A custom metaclass that adds no actual functionality."""
    def __new__(cls, name, bases, attrs):
        return super(Meta, cls).__new__(cls, name, bases, attrs)
This discussion has not yet explored how to assign a metaclass within class creation using the class keyword (more on that shortly). But you can create a class that uses the Meta metaclass by calling Meta directly, similar to the direct invocation of type earlier.
>>> C = Meta('C', (object,), {})
This creates a class, C, which is an instance of Meta rather than an instance of type. Observe the following:
>>> type(C)
<class '__main__.Meta'>
This is distinct from what you observe from a “normal” class, as shown here:
>>> class N(object):
...     pass
...
>>> type(N)
<class 'type'>
It is worth noting that metaclasses are inherited. Therefore, subclasses of C will be instances of Meta, rather than being direct instances of type, as shown in the following code and illustrated in Figure 5.1.
>>> class D(C):
...     pass
...
>>> type(D)
<class '__main__.Meta'>
Figure 5.1 Metaclass inheritance
In this case, D is an instance of Meta not because it has an explicit metaclass declared, or because you called Meta to create it, but rather because its superclass is an instance of Meta, and, therefore, so is D.
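One way to watch metaclass inheritance happen is to have the metaclass record each class it creates. The following is a minimal sketch; TrackingMeta, Base, Child, GrandChild, and the created list are hypothetical names used only for this illustration.

```python
created = []


class TrackingMeta(type):
    def __new__(cls, name, bases, attrs):
        new_class = super(TrackingMeta, cls).__new__(cls, name, bases, attrs)
        # Record every class that this metaclass is asked to create.
        created.append(name)
        return new_class


# Calling the metaclass directly works on both Python 2 and Python 3.
Base = TrackingMeta('Base', (object,), {})


class Child(Base):
    pass


class GrandChild(Child):
    pass
```

Neither Child nor GrandChild mentions TrackingMeta, yet the metaclass's __new__ method runs for both of them, because each inherits its metaclass from its superclass.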
It is important to note here that classes may only have one metaclass. Under most circumstances, this is fine, even in scenarios where multiple inheritance is in play. If a class subclasses two or more classes with distinct metaclasses, the Python interpreter will try to resolve this by checking the ancestry of the metaclasses. If one of the metaclasses is a subclass of all the others, that most-derived metaclass will be used.
Consider the following class that subclasses both C (an instance of Meta) and N (an instance of type):
>>> class Z(C, N):
...     pass
...
>>> type(Z)
<class '__main__.Meta'>
Figure 5.2 shows what is happening in this code.
Figure 5.2 Metaclass inheritance with subclasses
What is going on here? The Python interpreter is told to create class Z, and that it should subclass both C and N. This would be the equivalent of type('Z', (C, N), {}).
First, the Python interpreter examines C, and realizes that it is an instance of Meta. Then it examines N, and realizes that it is an instance of type. This is a potential conflict. The two superclasses have different metaclasses. However, the Python interpreter also realizes that Meta is a subclass of type. Therefore, it knows it can safely use Meta, and does so.
What happens if you have two metaclasses where neither is a descendant of the other? Now there is a conflict, and the Python interpreter does not know how to solve it. And it will cowardly refuse to try, as shown here:
>>> class OtherMeta(type):
...     def __new__(cls, name, bases, attrs):
...         return super(OtherMeta, cls).__new__(cls, name, bases, attrs)
...
>>> OtherC = OtherMeta('OtherC', (object,), {})
>>>
>>> class Invalid(C, OtherC):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in __new__
TypeError: Error when calling the metaclass bases
    metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases
This happens because Python can only have one metaclass for each class, and will not try to guess which metaclass to use in an ambiguous case.
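The resolution rule can be exercised in both directions. The following sketch, using hypothetical metaclasses named SubMeta, SubSubMeta, and OtherMeta, shows that a metaclass several levels below another still resolves cleanly, whereas two unrelated metaclasses produce the conflict. The exact wording of the error message varies between Python versions, so the sketch only captures it rather than assuming its full text.

```python
class Meta(type):
    pass


class SubMeta(Meta):
    pass


class SubSubMeta(SubMeta):
    # An indirect descendant of Meta (two levels down).
    pass


class OtherMeta(type):
    # Related to Meta only through type.
    pass


C = Meta('C', (object,), {})
Deep = SubSubMeta('Deep', (object,), {})
OtherC = OtherMeta('OtherC', (object,), {})


# Meta is an ancestor of SubSubMeta, so the more derived metaclass is used.
class Mixed(C, Deep):
    pass


# Neither Meta nor OtherMeta descends from the other, so this fails.
try:
    class Invalid(C, OtherC):
        pass
except TypeError as exc:
    conflict_message = str(exc)
else:
    conflict_message = None
```

Mixed ends up as an instance of SubSubMeta, and conflict_message holds the interpreter's explanation of why Invalid could not be created.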
Before delving into more complex metaclasses, let's explore how to use them. Although it is, of course, possible to instantiate metaclasses directly (as shown with type and Meta earlier), it is not the preferred method.
The class construct in Python provides a mechanism to declare the metaclass if type is not the metaclass being used. However, the syntax for declaring the metaclass differs, depending on which version of Python you are using.
In Python 3, metaclasses are declared alongside the superclasses (if any). The syntax resembles that of a keyword argument in a function declaration or a function call, and the “keyword argument” is metaclass.
Earlier, you created the C class by calling Meta directly. Here is the preferred way to do this in Python 3:
class C(metaclass=Meta):
    pass
This class statement does the exact same thing as creating the class by directly calling Meta. This, however, is the preferred style.
One thing to note here is that you did not explicitly specify object as the superclass. In most of the examples used in this book, you have explicitly specified object as the superclass. This is because this book intends examples to be run on either Python 2 or Python 3. In Python 2, specifying this matters, because subclassing object is what makes the class a “new-style class,” which is a construct introduced a long time ago (Python 2.2) that altered Python's method-resolution order, as well as some of the other guts of how Python classes work. The direct subclassing of object was used as a way to ensure backward-compatibility, forcing developers to “opt-in” to new-style classes, rather than to opt out of them.
In Python 3, which was a backward-incompatible release, all classes are new-style, and directly subclassing object is no longer necessary, and thus is not done here. That said, the previous code is exactly equivalent to the following:
class C(object, metaclass=Meta):
    pass
This style allows you to observe more explicitly the distinction between the superclasses, which are declared here using a syntax akin to positional arguments in a function declaration, and the metaclass, which is declared with keyword argument syntax. They must be specified in this order, with the superclasses before the metaclass keyword, just like positional and keyword arguments to a function.
When directly subclassing object in Python 3, either style (explicitly including it or omitting it) is acceptable.
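The equivalence of these spellings can be checked directly. The following Python 3 sketch declares the class three ways; the names D and E are introduced here only so that all three can coexist.

```python
class Meta(type):
    def __new__(cls, name, bases, attrs):
        return super(Meta, cls).__new__(cls, name, bases, attrs)


# Python 3 syntax, with object left implicit.
class C(metaclass=Meta):
    pass


# Python 3 syntax, with object spelled out.
class D(object, metaclass=Meta):
    pass


# Direct call to the metaclass.
E = Meta('E', (object,), {})

metaclasses = [type(C), type(D), type(E)]
bases = [C.__bases__, D.__bases__, E.__bases__]
```

All three classes are instances of Meta and have object as their only base.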
Python 2 has an entirely different syntax for metaclass declaration. The Python 2 syntax is not supported under Python 3, and the Python 3 syntax is not supported under Python 2. (Skip down a section to see how to declare a metaclass in a way that does the right thing on both.)
The Python 2 syntax for declaring a metaclass is to assign a __metaclass__ attribute to the class. Consider the earlier creation of class C using a call to Meta. Following is the equivalent code in Python 2:
class C(object):
    __metaclass__ = Meta
In this case, the metaclass is being assigned in the class body. This is fine. The Python interpreter looks for this when the class statement is executed, and uses Meta rather than type to create the new class.
Because Python 3 introduced backward-incompatible changes to the Python language, Python developers have come up with strategies for running the same set of code under either the Python 3 interpreter or the Python 2 interpreter with similar results.
One of the most popular ways to do this involves using a tool called six, which was written by Benjamin Peterson and is available from PyPI.
six provides two ways to declare a metaclass: by creating a stand-in class and using it as a direct superclass, or by using a decorator to add the metaclass.
The first method (which is the stand-in class method) looks like this:
import six

class C(six.with_metaclass(Meta)):
    pass
What is happening here? six.with_metaclass creates a dummy class of sorts that subclasses object, and has Meta as its metaclass, but which does nothing else. By applying this class as the superclass to C, and based on how metaclasses interact with class inheritance (discussed previously), C is now an instance of Meta, regardless of which Python version is in use.
Depending on exactly what the metaclass in question does, sometimes this solution will not actually work. Because six.with_metaclass actually instantiates a class, some metaclasses may want to do work, and it is possible that said work would not be compatible with having an abstract superclass.
six provides one other way to assign a metaclass to a class, which is using a decorator: @six.add_metaclass. The syntax for that looks like this:
import six

@six.add_metaclass(Meta)
class C(object):
    pass
The result here is the same as with the Python 2- or Python 3-specific syntax. Class C is created, using the class keyword and the Meta metaclass, rather than using type. The decorator does this without instantiating an abstract class.
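To see how a decorator can attach a metaclass after the fact, consider the following simplified sketch. It is not the actual six implementation; add_metaclass_sketch is a hypothetical stand-in that ignores details such as __slots__, and it exists only to illustrate the general idea of re-creating the class by calling the metaclass.

```python
def add_metaclass_sketch(metaclass):
    """Hypothetical, simplified decorator (not the six implementation)."""
    def wrapper(cls):
        attrs = dict(cls.__dict__)
        # These entries are created automatically for each class and
        # should not be carried over into the re-created class.
        attrs.pop('__dict__', None)
        attrs.pop('__weakref__', None)
        # Re-create the class by calling the metaclass directly.
        return metaclass(cls.__name__, cls.__bases__, attrs)
    return wrapper


class Meta(type):
    pass


@add_metaclass_sketch(Meta)
class C(object):
    greeting = 'hello'
```

The class statement first builds C with type as usual. The decorator then builds a second class with the same name, bases, and attributes by calling Meta, and that second class is what ends up bound to the name C.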
Because there are two incompatible syntaxes for Python 2 as opposed to Python 3, it's important to explore at this point when it is better to use the “pure” language approach, and when it is the right time to introduce six.
Without delving too deeply into the theory, a good rule of thumb here is that if you are running Python 2, assume that you may at some point want to migrate to Python 3, and try to write cross-compatible code. This will entail using six for any number of things (this among them), and so introducing six into your codebase is probably wise. By contrast, if you are already exclusively in a Python 3 environment, it is unlikely that you will ever want to shift backward, and just writing Python 3 code should be fine.
One of the trickiest things when you're learning about metaclasses is understanding when it is really appropriate to actually use them. Realistically, most code fits pretty well into the traditional class and object structure, and does not really require the use of metaclasses.
Similarly, using metaclasses needlessly adds a layer of complexity and challenge to that code. Code is read more often than it is written, and, therefore, it is usually desirable to solve problems in the simplest possible way that meets the objectives.
That said, when in situations where metaclasses are appropriate, they are often a very clear solution that can make code much simpler to understand. Realizing when metaclasses can make code simpler rather than more complex is a valuable skill.
The most common reason to use a custom metaclass is to create a delineation between class declaration and class structure, particularly when you're creating APIs for other developers to use.
First, consider an example from the wild. Many Python developers are familiar with models in Django, a popular web framework. Django models usually correspond to discrete database tables in a relational database.
A Django model declaration is quite straightforward. The following sample model might represent a book:
from django.db import models

class Book(models.Model):
    author = models.CharField(max_length=100)
    title = models.CharField(max_length=250)
    isbn = models.CharField(max_length=20)
    publication_date = models.DateField()
    pages = models.PositiveIntegerField()
Given what you know about normal classes in Python, what do you expect to happen here? Clearly, models.CharField, models.DateField, and the like are instantiations of objects. So, you expect that when you create a Book instance, you should get back those instances if you access those attributes.
Those familiar with Django know well that this is not what happens. If you try to get the author attribute of a Book instance, it will be a string. The same goes for title and isbn. The publication_date attribute will be a datetime.date object, and pages will be an int. If any of these are not yet provided to the model, they will be None.
How does this happen? What magic is going on under the hood to differentiate between how this class was declared (the code provided to generate it) and how it is structured when inspected? When the class is declared, its attributes are complex field objects. However, when you look at an instance of the class, those same attributes are set to values for a particular book.
The answer is, of course, that Django models use a special metaclass that ships with Django, which happens to be called ModelBase. This is largely invisible when you're using Django, because django.db.models.Model uses the ModelBase metaclass. Therefore, subclasses get it for free.
ModelBase does quite a lot of things. (Django is a mature framework, and its ORM has undergone a lot of iteration.) But a major thing it does is translate between how the model classes in Django are declared versus how their objects are structured. It is advantageous to Django to have a model declaration syntax that is extremely simple and straightforward. A model represents a table; the attributes on the model correspond to columns on the table.
Instances in the Django ecosystem represent rows within a table. When you are accessing a field on the instance, what you really want is the value for that row. So, a specific Book instance might be The Hobbit, and you would want book.title to be 'The Hobbit' in this case.
Essentially, using a metaclass here is desirable because it allows both the declaration of your Book class and accessing data on your Book instances to be very clean, and to use a very intuitive API, even though those attributes do not match.
Going into every detail of the implementation of ModelBase is beyond the scope of this book, but the implementation of this particular concept is actually extremely straightforward.
First, when the model class is being created, recall that the attributes of that class are passed to the metaclass's __new__ method in a dictionary, usually called attrs. In this example model, this dictionary would include author, title, and so on, as keys in that dictionary. The values for those keys would be the field objects (instances of classes that all subclass django.db.models.Field).
The ModelBase metaclass has a __new__ method that (among other things) iterates over the attrs dictionary looking for instances of Field subclasses. Any fields that it finds are popped off of the attrs dictionary and placed in another location: a separate dictionary called fields (which actually lives in an object called _meta that is written to the class). This implementation detail is not particularly important except to know that the actual field objects live somewhere else, hidden away where internal Django code can get at them when needed. But the average person who just wants to write a Django model does not need to see it.
Then, when an instance is created, the attributes corresponding to the fields are set to None unless a default or a specific value for that row is provided, in which case that value takes precedence. Now, suddenly, when the attribute is accessed on that instance, the value for that row is returned instead of the Field object. Similarly, the value can be written in a straightforward manner, without plowing over the Field.
Essentially, what the metaclass does is take the class declaration, reorganize the structure of the attributes of the class, and then create the class with the new structure.
This paradigm is exceptionally useful when you're designing APIs. A primary goal of a good API is to be as simple as possible, and contain as little boilerplate code as possible. This means both that declaring a class should be simple and straightforward, and that using the class should be similarly simple and straightforward.
In the case of a Django model, those two goals are somewhat in conflict. The ModelBase metaclass resolves that conflict.
Using metaclasses is an excellent way to bridge this gap. They do this by essentially making the class declaration into a front, and then transforming the declaration of the class into the actual class structure in the __new__ method.
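The general technique can be sketched in a few lines. The following is not Django's code. Field, ModelMeta, Model, and the _fields attribute are hypothetical stand-ins chosen for this illustration, and the sketch ignores concerns such as inheriting fields from parent models.

```python
class Field(object):
    """Hypothetical declarative field (not Django's Field class)."""
    def __init__(self, default=None):
        self.default = default


class ModelMeta(type):
    """Move Field objects out of the class namespace into _fields."""
    def __new__(cls, name, bases, attrs):
        fields = {}
        # Iterate over a copy, because attrs is modified in the loop.
        for key, value in list(attrs.items()):
            if isinstance(value, Field):
                fields[key] = attrs.pop(key)
        attrs['_fields'] = fields
        return super(ModelMeta, cls).__new__(cls, name, bases, attrs)


def _model_init(self, **values):
    # Each declared field becomes a plain value on the instance.
    for key, field in self._fields.items():
        setattr(self, key, values.get(key, field.default))


# Calling the metaclass directly keeps the sketch version-neutral.
Model = ModelMeta('Model', (object,), {'__init__': _model_init})


class Book(Model):
    title = Field()
    pages = Field(default=0)


book = Book(title='The Hobbit')
```

The declaration of Book uses Field objects, but the finished class no longer has title or pages attributes at all. The Field objects live in Book._fields, and each instance carries ordinary values instead.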
Another key use for metaclasses is for class verification. If a class must conform to a particular interface, a metaclass can be a very effective way to enforce this. Usually, it is preferable that this sort of problem be handled by a sensible default. Occasionally, however, this is not possible.
For example, consider a class that requires either one or another attribute to be set, but not both. This is difficult to handle with a sensible default if it is important that one attribute be unset (as opposed to set to None).
This concept can be handled using a metaclass. The following simple metaclass requires classes to contain either a foo attribute or a bar attribute, but not both:
class FooOrBar(type):
    def __new__(cls, name, bases, attrs):
        if 'foo' in attrs and 'bar' in attrs:
            raise TypeError('Class %s cannot contain both `foo` and '
                            '`bar` attributes.' % name)
        if 'foo' not in attrs and 'bar' not in attrs:
            raise TypeError('Class %s must provide either a `foo` '
                            'attribute or a `bar` attribute.' % name)
        return super(FooOrBar, cls).__new__(cls, name, bases, attrs)
The following Python 3 class uses this metaclass and conforms to this interface:
>>> class Valid(metaclass=FooOrBar):
...     foo = 42
...
>>>
Everything here works fine. What happens if you try to set both attributes, or neither?
>>> class Invalid(metaclass=FooOrBar):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 9, in __new__
TypeError: Class Invalid must provide either a `foo` attribute or a `bar` attribute.
>>>
>>> class Invalid(metaclass=FooOrBar):
...     foo = 42
...     bar = 42
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in __new__
TypeError: Class Invalid cannot contain both `foo` and `bar` attributes.
This particular implementation has a problem. It will not work well continuing down the subclass chain. The reason is that the metaclass examines the attrs dictionary directly, but this only contains attributes set for the class being declared. It does not know anything about attributes that are inherited from superclasses.
>>> class Valid(metaclass=FooOrBar):
...     foo = 42
...
>>> class AlsoValid(Valid):
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in __new__
TypeError: Class AlsoValid must provide either a `foo` attribute or a `bar` attribute.
This is a problem. After all, your AlsoValid class is also valid. It contains a foo attribute. An alternate approach to the FooOrBar metaclass is necessary.
class FooOrBar(type):
    def __new__(cls, name, bases, attrs):
        answer = super(FooOrBar, cls).__new__(cls, name, bases, attrs)
        if hasattr(answer, 'foo') and hasattr(answer, 'bar'):
            raise TypeError('Class %s cannot contain both `foo` and '
                            '`bar` attributes.' % name)
        if not hasattr(answer, 'foo') and not hasattr(answer, 'bar'):
            raise TypeError('Class %s must provide either a `foo` '
                            'attribute or a `bar` attribute.' % name)
        return answer
What is the difference here? This time, you are checking for the attributes on the instantiated class before it is returned, rather than looking at the attrs dictionary.
The new class will get all the attributes from the superclass as part of the call to type's constructor on the first line of the __new__ method. Therefore, the hasattr calls work, regardless of whether the attribute is declared on this class or inherited from a superclass.
Could this be handled without a metaclass? Absolutely. Nothing prevents writing a simple method that receives the class as an argument and does this same check. In fact, this is an excellent use for a decorator. However, the class must be manually sent to the verification method. With a metaclass, this is just handled when the class is created. Sometimes, an explicit opt-in is preferable; other times, it is not. It simply depends on the use case.
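For comparison, a decorator-based version of the same check might look like the following sketch. The foo_or_bar function is a hypothetical name, and Unchecked is included only to show what happens when a subclass is not decorated.

```python
def foo_or_bar(cls):
    """Hypothetical class decorator performing the same verification."""
    has_foo = hasattr(cls, 'foo')
    has_bar = hasattr(cls, 'bar')
    if has_foo and has_bar:
        raise TypeError('Class %s cannot contain both `foo` and '
                        '`bar` attributes.' % cls.__name__)
    if not has_foo and not has_bar:
        raise TypeError('Class %s must provide either a `foo` '
                        'attribute or a `bar` attribute.' % cls.__name__)
    return cls


@foo_or_bar
class Valid(object):
    foo = 42


# Nothing verifies this subclass, because the decorator was not applied
# to it. It ends up with both attributes and no error is raised.
class Unchecked(Valid):
    bar = 1
```

The check runs only where the decorator is written. That is the explicit opt-in described above, and Unchecked shows the consequence of leaving it off.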
Metaclasses can also be used as a tool to prevent certain attributes of a class from being inherited automatically. The most common scenario in which you might want to do this is in conjunction with other metaclass behavior. For example, suppose that a metaclass provides functionality for its classes, but some classes will be created as abstract classes, and you do not want said functionality to run in this case.
An obvious way to go about this would be to allow the class to set an abstract attribute, and only perform the special functionality of the metaclass if abstract is either not set or set to False.
class Meta(type):
    def __new__(cls, name, bases, attrs):
        # Sanity check: If this is an abstract class, then we do not
        # want the metaclass functionality here.
        if attrs.get('abstract', False):
            return super(Meta, cls).__new__(cls, name, bases, attrs)
        # Perform actual metaclass functionality.
        [...]
There is one problem with this approach, however. The abstract attribute, like any other attribute, will be inherited by subclasses. That means that any subclass would have to explicitly declare itself not to be abstract, which seems strange.
class AbstractClass(metaclass=Meta):
    abstract = True

class RegularClass(AbstractClass):
    abstract = False
Intuitively, you want abstract to have to be declared on all abstract classes, but for that attribute not to be inherited. It turns out that this is very easy: instead of just reading the attrs dictionary, as your metaclass currently does, it can modify that dictionary, disposing of the abstract attribute once it is no longer necessary.
In this case, you can do this by just popping the abstract value off of the attrs dictionary, as shown here:
class Meta(type):
    def __new__(cls, name, bases, attrs):
        # Sanity check: If this is an abstract class, then we do not
        # want the metaclass functionality here.
        if attrs.pop('abstract', False):
            return super(Meta, cls).__new__(cls, name, bases, attrs)
        # Perform actual metaclass functionality.
        [...]
The difference here is subtle, but important. The abstract attribute is being removed entirely from the actual class being created. In this example, AbstractClass would not get the metaclass functionality, but the actual abstract attribute would be gone. Most importantly, this means that subclasses do not inherit the attribute, which is exactly the behavior you want.
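The effect of popping the attribute can be checked with a small, self-contained sketch. The registry list below is a hypothetical stand-in for the "actual metaclass functionality" that the listing above elides; it is an assumption made purely for illustration and simply records the name of every non-abstract class.

```python
class Meta(type):
    # Hypothetical stand-in for real metaclass functionality:
    # remember the name of every concrete (non-abstract) class.
    registry = []

    def __new__(cls, name, bases, attrs):
        # pop() removes the flag, so it never lands on the new class
        # and therefore cannot be inherited by subclasses.
        if attrs.pop('abstract', False):
            return super(Meta, cls).__new__(cls, name, bases, attrs)
        new_class = super(Meta, cls).__new__(cls, name, bases, attrs)
        cls.registry.append(name)
        return new_class


class AbstractClass(metaclass=Meta):
    abstract = True


class RegularClass(AbstractClass):
    # No need to declare abstract = False here.
    pass


print(hasattr(AbstractClass, 'abstract'))  # False
print(hasattr(RegularClass, 'abstract'))   # False
print(Meta.registry)                       # ['RegularClass']
```

Only RegularClass ends up in the registry, even though it never mentions abstract at all.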
Both of the examples provided earlier as potential use cases for metaclasses can be solved without using metaclasses. In fact, essentially any major use case for metaclasses does not explicitly require their use.
A class decorator can easily handle requiring a class to conform to a particular interface, for example. It is a trivial matter to decorate each class, and the decorator is easily capable of ensuring that either foo or bar is set, but not both.
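As a rough sketch of what such a decorator might look like, consider the following. The name requires_foo_or_bar and the wording of the error message are assumptions made for this illustration rather than anything defined earlier in the chapter.

```python
def requires_foo_or_bar(cls):
    """Hypothetical class decorator: the class must set exactly one
    of foo or bar (declared directly or inherited).
    """
    has_foo = hasattr(cls, 'foo')
    has_bar = hasattr(cls, 'bar')
    # The two flags are equal when both or neither attribute is set.
    if has_foo == has_bar:
        raise TypeError('Class %s must set either foo or bar, '
                        'but not both.' % cls.__name__)
    return cls


@requires_foo_or_bar
class Valid:
    foo = 42


try:
    @requires_foo_or_bar
    class Invalid:
        foo = 1
        bar = 2
except TypeError as exc:
    error = exc
    print('Rejected: %s' % exc)
```

The check itself is the same kind of logic a metaclass would run; the difference is only in how it gets attached to each class.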
This raises an important question. What is the value of doing this with a metaclass? What value does a metaclass provide that a class decorator does not?
The answer to this sort of question is largely dependent on how the final classes are being used. The key difference between an approach that uses a metaclass as opposed to an approach that uses a class decorator is that the class decorator must be applied explicitly to each subclass. If the programmer implementing the subclasses forgets to apply it, the check does not happen.
By contrast, metaclasses are automatic and invisible to the programmer declaring the classes that use them. Few (if any) APIs ask a programmer to directly use a metaclass, but many of them ask a programmer to subclass a base class that the API package provides. By assigning a metaclass to that base class, all subclasses receive it, too. This causes that functionality of the metaclass to be applied without the end programmer having to think about it.
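The following sketch illustrates that propagation. ModelMeta, Model, Author, and SpecialAuthor are hypothetical names chosen for this example; Model plays the part of a base class that an API package might provide.

```python
class ModelMeta(type):
    """Hypothetical metaclass: records each class it creates."""
    created = []

    def __init__(cls, name, bases, attrs):
        super().__init__(name, bases, attrs)
        ModelMeta.created.append(name)


class Model(metaclass=ModelMeta):
    """Stands in for the base class shipped by the API package."""


# The end programmer only subclasses; no metaclass is mentioned.
class Author(Model):
    pass


class SpecialAuthor(Author):
    pass


print(type(Author) is ModelMeta)         # True
print(type(SpecialAuthor) is ModelMeta)  # True
print(ModelMeta.created)  # ['Model', 'Author', 'SpecialAuthor']
```

Neither subclass names the metaclass, yet both are created by it.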
Put more simply, one of the first lines in the Zen of Python states, “Explicit is better than implicit.” But, like most things in that document, this adage is true … until it is not. For example, being implicit is better if you are talking about extraneous information or boilerplate. Similarly, sometimes being more explicit just means more maintenance, which is not usually a win.
Metaclasses really start to stand out as the amount of work the metaclass performs grows. It would be neither reasonable nor maintainable to mark every Django model with an explicit decorator.
Similarly, consider meta-coding situations. In this context, the term meta-coding refers to code that inspects other code in the application. For example, consider code that should log itself.
A metaclass that causes all method calls from instances of a class to be logged somehow is quite easy to implement. The following metaclass causes its classes to “log” their function calls (although, rather than performing actual logging, it simply prints to sys.stdout):
class Logged(type):
    """A metaclass that causes classes that it creates to log
    their function calls.
    """
    def __new__(cls, name, bases, attrs):
        for key, value in attrs.items():
            if callable(value):
                attrs[key] = cls.log_call(value)
        return super(Logged, cls).__new__(cls, name, bases, attrs)
    @staticmethod
    def log_call(fxn):
        """Given a function, wrap it with some logging code and
        return the wrapped function.
        """
        def inner(*args, **kwargs):
            print('The function %s was called with arguments %r and '
                  'keyword arguments %r.' % (fxn.__name__, args, kwargs))
            try:
                response = fxn(*args, **kwargs)
                print('The function call to %s was successful.' %
                      fxn.__name__)
                return response
            except Exception as exc:
                print('The function call to %s raised an exception: %r' %
                      (fxn.__name__, exc))
                raise
        return inner
Let's first review what is happening here. Logged is being declared as a subclass of type, which means it is a metaclass. The Logged class has a __new__ method, and what that method does is iterate over all the attributes in the attrs dictionary, check to see if they are callables (using the Python built-in function callable), and wrap them if they are.
The wrapping function itself is very straightforward, especially if you are already familiar with the concept of decorators. It declares a local function that performs some logic (in this case, calling print), and then calls the function that was passed as an argument to the log_call method. To learn more about this pattern, see Chapter 1, “Decorators,” which makes extensive use of this paradigm.
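One side effect of this pattern is that the wrapped method reports the name inner rather than its own name. A common variation, shown here as a standalone sketch and not as part of the Logged metaclass above, applies functools.wraps from the standard library so that the wrapper keeps the original function's name and docstring. The greet function is a hypothetical example.

```python
import functools


def log_call(fxn):
    """Wrap fxn with a print call, preserving its metadata."""
    @functools.wraps(fxn)  # Copies __name__, __doc__, and so on.
    def inner(*args, **kwargs):
        print('The function %s was called.' % fxn.__name__)
        return fxn(*args, **kwargs)
    return inner


def greet(name):
    """Return a greeting."""
    return 'Hello, %s' % name


wrapped = log_call(greet)
print(wrapped.__name__)  # greet
print(wrapped.__doc__)   # Return a greeting.
print(wrapped('world'))  # logs the call, then prints Hello, world
```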
What happens when a class uses this metaclass? Consider the following Python 3 class that has Logged as its metaclass:
class MyClass(metaclass=Logged):
    def foo(self):
        pass
    def bar(self):
        raise TypeError('oh noes!')
When you create an instance of MyClass, you discover that calling methods on it becomes . . . er, loud.
>>> obj = MyClass()
>>> obj.foo()
The function foo was called with arguments (<__main__.MyClass object at
0x1022a37f0>,) and keyword arguments {}.
The function call to foo was successful.
If you try to call obj.bar(), you get an exception.
>>> obj.bar()
The function bar was called with arguments (<__main__.MyClass object at
0x1022a37f0>,) and keyword arguments {}.
The function call to bar raised an exception: TypeError('oh noes!',)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 19, in inner
File "<stdin>", line 5, in bar
TypeError: oh noes!
Astute readers probably noticed something. When MyClass was instantiated, why was there no logging of the call to __init__? After all, __init__ is certainly callable. It seems like it should have been noisy along with foo and bar.
Recall, however, that your metaclass loops over attributes in the attrs dictionary, and you did not explicitly define __init__ in your MyClass class. Rather, it is inherited from object. This is the behavior you really want as well. Otherwise, subclassing would cause the log_call “decorator” to be applied repeatedly on the same callables, which would result in repeated print output for a single call.
By explicitly defining __init__, however, you can observe the noisy behavior there.
>>> class MyClass(metaclass=Logged):
...     def __init__(self):
...         pass
...
>>>
>>> obj = MyClass()
The function __init__ was called with arguments (<__main__.MyClass object
at 0x1022a3550>,) and keyword arguments {}.
The function call to __init__ was successful.
Also, note that, even though __init__ was not explicitly called in the Python shell, it is still the function that is logged, because the Python interpreter calls __init__ under the hood when a new instance is created.
It is worth noting, however, that this behavior only occurs at class creation time. If a method is added to the class after it is created (which usually should not be happening anyway), it will not be wrapped.
>>> MyClass.foo = lambda self: 42
>>> obj.foo()
42
In this case, your call to foo was not noisy, because MyClass had already been created, and so the metaclass had already done its job. Therefore, you just get a plain function call rather than a wrapped one.
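If late additions did need to be wrapped as well, one possible approach (an extension beyond the Logged metaclass shown in this chapter, offered only as a sketch) is to define __setattr__ on the metaclass, because assigning an attribute on a class goes through its metaclass. The log_call used here is a deliberately shortened version, and LoggedAlways is a hypothetical name.

```python
class LoggedAlways(type):
    """Hypothetical variant: also wraps callables assigned later."""
    def __new__(cls, name, bases, attrs):
        for key, value in list(attrs.items()):
            if callable(value):
                attrs[key] = cls.log_call(value)
        return super().__new__(cls, name, bases, attrs)

    def __setattr__(cls, key, value):
        # Runs for assignments such as MyClass.foo = ... made after
        # the class has been created.
        if callable(value):
            value = LoggedAlways.log_call(value)
        super().__setattr__(key, value)

    @staticmethod
    def log_call(fxn):
        def inner(*args, **kwargs):
            print('The function %s was called.' % fxn.__name__)
            return fxn(*args, **kwargs)
        return inner


class MyClass(metaclass=LoggedAlways):
    pass


MyClass.foo = lambda self: 42
obj = MyClass()
result = obj.foo()  # Now noisy: the late addition was wrapped.
print(result)       # 42
```

Whether this extra machinery is worthwhile depends on the use case; as noted above, adding methods to a class after it is created is usually best avoided in the first place.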
Metaclasses are extremely powerful tools in Python. The fact that classes are first-class objects allows for those classes to be manipulated outside of when they are declared. Metaclasses are a way to accomplish this.
The presence of metaclasses in the Python language overcomes many of the limitations of other object-oriented languages, in which classes are statically declared at coding time.
The ultimate result is that Python's object model ends up being the best of all worlds. It combines the simplicity of languages with a traditional class structure and the power of languages that follow other models, such as prototypal inheritance in JavaScript and Lua.
It is a common misconception that metaclasses are difficult to understand. However, some of the power in Python's object model is in its simplicity and consistency. Metaclasses are, in fact, a very straightforward implementation that adds a huge amount of power to the language.
Chapter 6, “Class Factories,” covers another way to make classes, which is by constructing them on-the-fly.