The Zen of Polymorphism: 4 Ways to Write Cleaner Python
Polymorphism is a cornerstone of object-oriented programming, allowing a single interface to represent different underlying forms. But as with any powerful tool, there are multiple ways to wield it. In Python, how do you choose the right approach for your specific problem?
In a talk, Brett Slatkin, author of Effective Python, explores four distinct patterns for implementing polymorphism, each with its own strengths and weaknesses. Using the practical example of building a simple calculator, we can establish some rules of thumb (a “Zen of Polymorphism”) to guide our choices and write cleaner, more maintainable code.
This is a summary of his talk.
The Scenario: Building a Calculator
Imagine you’re creating a calculator program. The first step is to take a formula as a string, like (3 + 5) * (4 + 7), and parse it into an object tree. This tree-like data structure, often called an Abstract Syntax Tree (AST), is then processed to compute the final result.
First, we define the data types to represent our formula’s components:
# Data types to represent the formula tree
class Number:
    def __init__(self, value):
        self.value = value

class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

class Multiply:
    def __init__(self, left, right):
        self.left = left
        self.right = right
With these classes, the formula (3 + 5) * (4 + 7) can be represented as a tree of nested objects. Now, let’s explore four ways to calculate the result.
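As a minimal sketch, the example formula could be built by hand like this. Parsing the string is out of scope here, and the variable name `my_tree` is only borrowed from the later examples for illustration:

```python
class Number:
    def __init__(self, value):
        self.value = value

class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

class Multiply:
    def __init__(self, left, right):
        self.left = left
        self.right = right

# (3 + 5) * (4 + 7) as nested objects: the root is the multiplication,
# and each of its children is an addition of two numbers.
my_tree = Multiply(
    Add(Number(3), Number(5)),
    Add(Number(4), Number(7)),
)
```

The nesting of the constructor calls mirrors the parentheses in the formula, which is exactly what a parser would be expected to produce.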
Approach 1: The “One Big Function”
The most straightforward approach is to write a single function that inspects the type of each node in the tree and decides what to do. This method relies on a series of isinstance() checks.
def calculate(node):
    if isinstance(node, Number):
        return node.value
    elif isinstance(node, Add):
        return calculate(node.left) + calculate(node.right)
    elif isinstance(node, Multiply):
        return calculate(node.left) * calculate(node.right)
    else:
        raise TypeError(f"Unknown node type: {type(node)}")
This function recursively walks the tree. When it sees an Add node, for example, it calls itself on the left and right child nodes and then adds their results.
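To see the recursion in action, here is a self-contained sketch that runs the function on the example formula. The classes and `calculate` are repeated so the snippet runs on its own, and `my_tree` is an illustrative name:

```python
class Number:
    def __init__(self, value):
        self.value = value

class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

class Multiply:
    def __init__(self, left, right):
        self.left = left
        self.right = right

def calculate(node):
    if isinstance(node, Number):
        return node.value
    elif isinstance(node, Add):
        return calculate(node.left) + calculate(node.right)
    elif isinstance(node, Multiply):
        return calculate(node.left) * calculate(node.right)
    else:
        raise TypeError(f"Unknown node type: {type(node)}")

my_tree = Multiply(Add(Number(3), Number(5)), Add(Number(4), Number(7)))

# The Multiply node recurses into both Add nodes first:
# Add(3, 5) -> 8, Add(4, 7) -> 11, then 8 * 11.
result = calculate(my_tree)
print(result)  # 88
```

Anything that is not one of the three known node types falls through to the final `else` branch and raises `TypeError`.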
- The Good: This is simple, the control flow is easy to follow, and it’s straightforward to debug. For a small number of types, it works just fine.
- The Bad: As you add more operations (like subtraction, division, or Power), the function gets longer and more complex. If you have subclasses (e.g., PositiveInteger inheriting from Integer), the order of your isinstance checks suddenly matters, making the code brittle.
- The Ugly: What if you want to add new functionality, like a “pretty printer” to display the formula tree? You’d have to write another, nearly identical “one big function,” duplicating all the traversal logic and isinstance checks. This leads to a maintenance nightmare.
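The ordering pitfall can be sketched as follows. `Integer`, `PositiveInteger`, and both `describe` functions are hypothetical names invented for this illustration; they are not part of the calculator above:

```python
class Integer:
    def __init__(self, value):
        self.value = value

class PositiveInteger(Integer):
    pass

def describe_wrong_order(node):
    # The base-class check comes first, and isinstance() is also true for
    # subclasses, so the PositiveInteger branch can never be reached.
    if isinstance(node, Integer):
        return "integer"
    elif isinstance(node, PositiveInteger):
        return "positive integer"
    raise TypeError(f"Unknown node type: {type(node)}")

def describe_right_order(node):
    # Checking the most specific type first gives the intended result,
    # but nothing enforces that future edits keep this order.
    if isinstance(node, PositiveInteger):
        return "positive integer"
    elif isinstance(node, Integer):
        return "integer"
    raise TypeError(f"Unknown node type: {type(node)}")

print(describe_wrong_order(PositiveInteger(5)))  # integer
print(describe_right_order(PositiveInteger(5)))  # positive integer
```

Both functions contain the same branches; only their order differs, which is why this kind of bug is easy to introduce and hard to spot in a long chain of checks.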
Approach 2: The Object-Oriented Way
To avoid the pitfalls of a single monolithic function, we can turn to classic Object-Oriented Programming (OOP). Here, we place the calculation logic directly on the classes themselves as a calculate method.
# Base class with a default method
class Node:
    def calculate(self):
        raise NotImplementedError

class Number(Node):
    # ... (init)
    def calculate(self):
        return self.value

class Add(Node):
    # ... (init)
    def calculate(self):
        return self.left.calculate() + self.right.calculate()

class Multiply(Node):
    # ... (init)
    def calculate(self):
        return self.left.calculate() * self.right.calculate()
Now, adding a new Power operation is clean. You just create a Power class that inherits from Node and implements its own calculate method. The existing code doesn’t need to change.
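A possible shape for such a class is sketched below. The constructor signature (`left` as the base, `right` as the exponent) is an assumption chosen to match the other binary nodes; the talk summary does not spell it out:

```python
class Node:
    def calculate(self):
        raise NotImplementedError

class Number(Node):
    def __init__(self, value):
        self.value = value

    def calculate(self):
        return self.value

# New node type: nothing above this line has to change.
class Power(Node):
    def __init__(self, left, right):
        self.left = left    # assumed to hold the base
        self.right = right  # assumed to hold the exponent

    def calculate(self):
        return self.left.calculate() ** self.right.calculate()

print(Power(Number(2), Number(10)).calculate())  # 1024
```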
- The Good: This approach is extensible. Adding new data types is easy and doesn’t require modifying existing code. The behavior is located right next to the data it operates on.
- The Bad: The problem is flipped. While adding new types is easy, adding new operations is difficult. To implement our pretty printer, we would have to go back and add a pretty_print() method to every single class (Node, Number, Add, Multiply, etc.).
- The Ugly: This scatters the pretty-printing logic across many files. The code is now organized by data type, not by functionality. This can lead to what’s known as the “wrong axis” problem: all the code for one feature is spread out, making it hard to debug or maintain. It also creates tightly coupled, monolithic classes that depend on numerous other parts of the system.
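The following sketch shows what that change could look like. The output format of `pretty_print()` (fully parenthesized) is an assumption made for illustration:

```python
class Node:
    def calculate(self):
        raise NotImplementedError

    def pretty_print(self):  # new method, touches the base class
        raise NotImplementedError

class Number(Node):
    def __init__(self, value):
        self.value = value

    def calculate(self):
        return self.value

    def pretty_print(self):  # new method, touches Number
        return str(self.value)

class Add(Node):
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def calculate(self):
        return self.left.calculate() + self.right.calculate()

    def pretty_print(self):  # new method, touches Add
        return f"({self.left.pretty_print()} + {self.right.pretty_print()})"

class Multiply(Node):
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def calculate(self):
        return self.left.calculate() * self.right.calculate()

    def pretty_print(self):  # new method, touches Multiply
        return f"({self.left.pretty_print()} * {self.right.pretty_print()})"

my_tree = Multiply(Add(Number(3), Number(5)), Add(Number(4), Number(7)))
print(my_tree.pretty_print())  # ((3 + 5) * (4 + 7))
```

One feature required an edit in every class, and each further operation would repeat that pattern.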
Approach 3: Dynamic Dispatch with functools.singledispatch
What if we could group our functions by behavior instead of by data type? Python’s standard library has a powerful tool for this: functools.singledispatch. This decorator allows you to register different versions of a function that are chosen based on the type of the first argument.
It looks like this:
from functools import singledispatch

@singledispatch
def calculate(node):
    raise TypeError(f"Unknown node type: {type(node)}")

@calculate.register(Number)
def _(node):
    return node.value

@calculate.register(Add)
def _(node):
    return calculate(node.left) + calculate(node.right)

@calculate.register(Multiply)
def _(node):
    return calculate(node.left) * calculate(node.right)
Here, we have a generic calculate function and several specialized versions registered to it. When calculate(node) is called, singledispatch checks the type of node and invokes the correct registered function.
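One consequence worth illustrating: singledispatch picks the implementation registered for the most specific matching class, so registration order does not matter the way isinstance order does. The classes and the `describe` function below are hypothetical and exist only for this sketch:

```python
from functools import singledispatch

class Integer:
    def __init__(self, value):
        self.value = value

class PositiveInteger(Integer):
    pass

class EvenInteger(Integer):  # deliberately has no handler of its own
    pass

@singledispatch
def describe(node):
    raise TypeError(f"Unknown node type: {type(node)}")

# The base class is registered first, the subclass second.
@describe.register(Integer)
def _(node):
    return "integer"

@describe.register(PositiveInteger)
def _(node):
    return "positive integer"

print(describe(PositiveInteger(5)))  # positive integer
print(describe(Integer(-2)))         # integer
print(describe(EvenInteger(4)))      # integer (falls back to the base class)
```

A subclass without its own registration falls back to the nearest registered base class, and an unrelated type reaches the generic function, which raises `TypeError` here.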
- The Good: This solves the “wrong axis” problem. All the calculate logic is in one place. Adding a new operation like pretty_print is trivial: just create a new set of dispatched functions in a separate file. This decouples the operations from the data classes and from each other, leading to cleaner dependencies.
- The Bad: We’ve reintroduced some duplication. Notice that the recursive calls (calculate(node.left)) and the tree traversal logic (node.left, node.right) are repeated in both the calculate and pretty_print implementations. For every new operation, we have to rewrite this boilerplate.
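A pretty printer written in this style might look like the sketch below. The output format is an assumption, and in a real project the data classes would be imported from their own module rather than redefined:

```python
from functools import singledispatch

class Number:
    def __init__(self, value):
        self.value = value

class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

class Multiply:
    def __init__(self, left, right):
        self.left = left
        self.right = right

@singledispatch
def pretty_print(node):
    raise TypeError(f"Unknown node type: {type(node)}")

@pretty_print.register(Number)
def _(node):
    return str(node.value)

@pretty_print.register(Add)
def _(node):
    # Same walk over node.left and node.right as in calculate;
    # only the way the two results are combined differs.
    return f"({pretty_print(node.left)} + {pretty_print(node.right)})"

@pretty_print.register(Multiply)
def _(node):
    return f"({pretty_print(node.left)} * {pretty_print(node.right)})"

my_tree = Multiply(Add(Number(3), Number(5)), Add(Number(4), Number(7)))
print(pretty_print(my_tree))  # ((3 + 5) * (4 + 7))
```

The data classes stay untouched, but every handler for a branch node still spells out the recursion by hand.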
Approach 4: The Ultimate Decoupling with Catamorphism
To eliminate the final piece of duplication—the traversal logic—we can introduce a concept from functional programming called a catamorphism. It sounds intimidating, but it’s essentially a generic way to fold a recursive structure down into a single value.
Think of it as a universal tree-traversal function. This function walks the tree and applies a set of callback functions (which functional programmers call an algebra) at each step.
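As a loose analogy (not taken from the talk), folding a flat list with `functools.reduce` has the same shape: one generic traversal, with the combining function swapped in from outside.

```python
from functools import reduce
import operator

numbers = [3, 5, 4, 7]

# The traversal (reduce) is identical in both calls;
# only the combining function and the starting value change.
total = reduce(operator.add, numbers, 0)
product = reduce(operator.mul, numbers, 1)

print(total)    # 19
print(product)  # 420
```

The tree version described next follows the same idea, except that the structure being folded is recursive rather than flat.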
First, we create our generic traverse function that knows how to walk our specific tree structure:
@singledispatch
def traverse(node, func):
    # Default for unknown types
    raise TypeError(f"Unknown node type: {type(node)}")

@traverse.register(Number)
def _(node, func):
    # For a leaf node, just call the function on the node itself
    return func(node)

# Power is assumed to be defined like Add and Multiply (left and right children)
@traverse.register(Add)
@traverse.register(Multiply)
@traverse.register(Power)
def _(node, func):
    # For a branch, traverse the children first
    left_val = traverse(node.left, func)
    right_val = traverse(node.right, func)
    # Then call the function on the node and the results
    return func(node, left_val, right_val)
Now, the calculate and pretty_print logic becomes incredibly simple. They are just “algebras” (another set of dispatched functions) that contain only the operational logic, with no traversal or recursion.
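# The "algebra" for calculating results
@singledispatch
def calculate_algebra(node, *children):
    raise TypeError(f"Unknown node type: {type(node)}")

@calculate_algebra.register(Number)
def _(node):
    return node.value

@calculate_algebra.register(Add)
def _(node, left, right):
    return left + right

@calculate_algebra.register(Multiply)
def _(node, left, right):
    return left * right

# The "algebra" for pretty printing
@singledispatch
def pretty_print_algebra(node, *children):
    raise TypeError(f"Unknown node type: {type(node)}")

@pretty_print_algebra.register(Number)
def _(node):
    return str(node.value)

@pretty_print_algebra.register(Add)
def _(node, left, right):
    return f"({left} + {right})"

@pretty_print_algebra.register(Multiply)
def _(node, left, right):
    return f"({left} * {right})"

# --- Putting it all together ---
# my_tree is a formula tree built from the Number, Add, and Multiply classes
# To calculate, we traverse the tree with the calculation algebra
result = traverse(my_tree, calculate_algebra)
# To pretty print, we traverse with the pretty printing algebra
pretty_string = traverse(my_tree, pretty_print_algebra)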
- The Good: This is the pinnacle of decoupling. The traversal logic is written once. The operational logic (calculating, printing, type checking, etc.) is completely separate and incredibly simple. These functions are pure, easy to test, and have no knowledge of how the tree is traversed.
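To make that point concrete, here is a self-contained sketch that adds a third operation without touching the traversal. `count_operations_algebra` is a hypothetical algebra invented for this illustration, and `Power` is left out to keep the snippet short:

```python
from functools import singledispatch

class Number:
    def __init__(self, value):
        self.value = value

class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

class Multiply:
    def __init__(self, left, right):
        self.left = left
        self.right = right

# --- Traversal: written once ---
@singledispatch
def traverse(node, func):
    raise TypeError(f"Unknown node type: {type(node)}")

@traverse.register(Number)
def _(node, func):
    return func(node)

@traverse.register(Add)
@traverse.register(Multiply)
def _(node, func):
    return func(node, traverse(node.left, func), traverse(node.right, func))

# --- Algebra 1: evaluate the formula ---
@singledispatch
def calculate_algebra(node, *children):
    raise TypeError(f"Unknown node type: {type(node)}")

@calculate_algebra.register(Number)
def _(node):
    return node.value

@calculate_algebra.register(Add)
def _(node, left, right):
    return left + right

@calculate_algebra.register(Multiply)
def _(node, left, right):
    return left * right

# --- Algebra 2 (hypothetical): count the operator nodes ---
@singledispatch
def count_operations_algebra(node, *children):
    raise TypeError(f"Unknown node type: {type(node)}")

@count_operations_algebra.register(Number)
def _(node):
    return 0

@count_operations_algebra.register(Add)
@count_operations_algebra.register(Multiply)
def _(node, left, right):
    # left and right are already the counts for the two subtrees
    return 1 + left + right

my_tree = Multiply(Add(Number(3), Number(5)), Add(Number(4), Number(7)))
print(traverse(my_tree, calculate_algebra))         # 88
print(traverse(my_tree, count_operations_algebra))  # 3
```

Each handler receives the already-computed results for its children, so an algebra can be unit-tested by calling it directly with plain values, for example `calculate_algebra(Add(None, None), 2, 3)`.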
Conclusion: The Zen of Polymorphism
So, which approach should you use? Here are the rules of thumb:
- Start with a “One Big Function.” For simple problems, this is often all you need. Don’t over-engineer from the beginning.
- Refactor to OOP when you need cohesion. If you are primarily adding new types of data and the behavior is tightly coupled to that data (like in many UI frameworks), OOP is a great fit.
- Use singledispatch to decouple behaviors. When you have a stable set of data structures but are constantly adding new, independent operations on them, singledispatch helps organize your code along the right axis: by functionality.
- Use a Catamorphism (Visitor Pattern) to separate traversal. For complex, recursive data structures like trees, abstracting the traversal logic into a generic function can dramatically simplify your operational code, making it cleaner, more reusable, and easier to test.
Page last modified: 2025-07-30 12:01:38