The Zen of Polymorphism: 4 Ways to Write Cleaner Python

Polymorphism is a cornerstone of object-oriented programming, allowing a single interface to represent different underlying forms. But as with any powerful tool, there are multiple ways to wield it. In Python, how do you choose the right approach for your specific problem?

In a talk, Brett Slatkin, author of Effective Python, explores four distinct patterns for implementing polymorphism, each with its own strengths and weaknesses. Using the practical example of a simple calculator, the talk establishes some rules of thumb, a “Zen of Polymorphism,” to guide our choices and help us write cleaner, more maintainable code.

This is a summary of his talk.

The Scenario: Building a Calculator

Imagine you’re creating a calculator program. The first step is to take a formula as a string, like (3 + 5) * (4 + 7), and parse it into an object tree. This tree-like data structure, often called an Abstract Syntax Tree (AST), is then processed to compute the final result.

First, we define the data types to represent our formula’s components:

# Data types to represent the formula tree
class Number:
    def __init__(self, value):
        self.value = value

class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

class Multiply:
    def __init__(self, left, right):
        self.left = left
        self.right = right

With these classes, the formula (3 + 5) * (4 + 7) can be represented as a tree of nested objects. Now, let’s explore four ways to calculate the result.
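As a minimal sketch, the example formula could be built by hand like this. A real calculator would get the tree from a parser, which is not shown here, and the variable name `tree` is an assumption used only for illustration:

```python
# Data types repeated from above so this snippet runs on its own
class Number:
    def __init__(self, value):
        self.value = value

class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

class Multiply:
    def __init__(self, left, right):
        self.left = left
        self.right = right

# (3 + 5) * (4 + 7) as nested objects; `tree` is a hypothetical name
tree = Multiply(
    Add(Number(3), Number(5)),
    Add(Number(4), Number(7)),
)

# The root is the multiplication; its children are the two additions
print(type(tree).__name__)       # Multiply
print(type(tree.left).__name__)  # Add
print(tree.left.left.value)      # 3
```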

Approach 1: The “One Big Function”

The most straightforward approach is to write a single function that inspects the type of each node in the tree and decides what to do. This method relies on a series of isinstance() checks.

def calculate(node):
    if isinstance(node, Number):
        return node.value
    elif isinstance(node, Add):
        return calculate(node.left) + calculate(node.right)
    elif isinstance(node, Multiply):
        return calculate(node.left) * calculate(node.right)
    else:
        raise TypeError(f"Unknown node type: {type(node)}")

This function recursively walks the tree. When it sees an Add node, for example, it calls itself on the left and right child nodes and then adds their results.

  • The Good: This is simple, the control flow is easy to follow, and it’s straightforward to debug. For a small number of types, it works just fine.
  • The Bad: As you add more operations (like subtraction, division, or exponentiation), the function gets longer and more complex. If you have subclasses (e.g., PositiveInteger inheriting from Integer), the order of your isinstance checks suddenly matters, because a check against a base class also matches every subclass. That makes the code brittle.
  • The Ugly: What if you want to add new functionality, like a “pretty printer” to display the formula tree? You’d have to write another, nearly identical “one big function,” duplicating all the traversal logic and isinstance checks. This leads to a maintenance nightmare.
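The ordering problem can be sketched as follows. `Integer` and `PositiveInteger` are the types named above; the two `describe_*` functions are hypothetical and exist only to show how the order of checks changes the result:

```python
class Integer:
    def __init__(self, value):
        self.value = value

class PositiveInteger(Integer):
    pass

def describe_wrong_order(node):
    # The base-class check comes first, so it also matches every subclass
    if isinstance(node, Integer):
        return "integer"
    elif isinstance(node, PositiveInteger):
        return "positive integer"  # unreachable for any PositiveInteger
    raise TypeError(f"Unknown node type: {type(node)}")

def describe_right_order(node):
    # The most specific type has to be checked first
    if isinstance(node, PositiveInteger):
        return "positive integer"
    elif isinstance(node, Integer):
        return "integer"
    raise TypeError(f"Unknown node type: {type(node)}")

print(describe_wrong_order(PositiveInteger(5)))  # integer
print(describe_right_order(PositiveInteger(5)))  # positive integer
```

Nothing in the language warns about the unreachable branch, so a reordering during a refactor can silently change behavior.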

Approach 2: The Object-Oriented Way

To avoid the pitfalls of a single monolithic function, we can turn to classic Object-Oriented Programming (OOP). Here, we place the calculation logic directly on the classes themselves as a calculate method.

# Base class with a default method
class Node:
    def calculate(self):
        raise NotImplementedError

class Number(Node):
    # ... (init)
    def calculate(self):
        return self.value

class Add(Node):
    # ... (init)
    def calculate(self):
        return self.left.calculate() + self.right.calculate()

class Multiply(Node):
    # ... (init)
    def calculate(self):
        return self.left.calculate() * self.right.calculate()

Now, adding a new Power operation is clean. You just create a Power class that inherits from Node and implements its own calculate method. The existing code doesn’t need to change.
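A minimal sketch of that extension, assuming `Power` stores its operands as `left` (the base) and `right` (the exponent) like the other binary nodes:

```python
class Node:
    def calculate(self):
        raise NotImplementedError

class Number(Node):
    def __init__(self, value):
        self.value = value

    def calculate(self):
        return self.value

# New type: none of the classes above has to be edited.
# Using left/right for base/exponent is an assumption of this sketch.
class Power(Node):
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def calculate(self):
        return self.left.calculate() ** self.right.calculate()

print(Power(Number(2), Number(10)).calculate())  # 1024
```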

  • The Good: This approach is extensible. Adding new data types is easy and doesn’t require modifying existing code. The behavior is located right next to the data it operates on.
  • The Bad: The problem is flipped. While adding new types is easy, adding new operations is difficult. To implement our pretty printer, we would have to go back and add a pretty_print() method to every single class (Node, Number, Add, Multiply, etc.).
  • The Ugly: This scatters the pretty-printing logic across many files. The code is now organized by data type, not by functionality. This can lead to what’s known as the “wrong axis” problem: all the code for one feature is spread out, making it hard to debug or maintain. It also creates tightly coupled, monolithic classes that depend on numerous other parts of the system.
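To make the cost concrete, here is a sketch of what adding the pretty printer looks like in the OOP version. The `pretty_print` bodies are assumptions about what such a printer might return; the point is that every class has to be touched:

```python
class Node:
    def calculate(self):
        raise NotImplementedError

    def pretty_print(self):  # edit 1: base class
        raise NotImplementedError

class Number(Node):
    def __init__(self, value):
        self.value = value

    def calculate(self):
        return self.value

    def pretty_print(self):  # edit 2
        return str(self.value)

class Add(Node):
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def calculate(self):
        return self.left.calculate() + self.right.calculate()

    def pretty_print(self):  # edit 3
        return f"({self.left.pretty_print()} + {self.right.pretty_print()})"

class Multiply(Node):
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def calculate(self):
        return self.left.calculate() * self.right.calculate()

    def pretty_print(self):  # edit 4
        return f"({self.left.pretty_print()} * {self.right.pretty_print()})"

tree = Multiply(Add(Number(3), Number(5)), Add(Number(4), Number(7)))
print(tree.pretty_print())  # ((3 + 5) * (4 + 7))
print(tree.calculate())     # 88
```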

Approach 3: Dynamic Dispatch with functools.singledispatch

What if we could group our functions by behavior instead of by data type? Python’s standard library has a powerful tool for this: functools.singledispatch. This decorator allows you to register different versions of a function that are chosen based on the type of the first argument.

It looks like this:

from functools import singledispatch

@singledispatch
def calculate(node):
    raise TypeError(f"Unknown node type: {type(node)}")

@calculate.register(Number)
def _(node):
    return node.value

@calculate.register(Add)
def _(node):
    return calculate(node.left) + calculate(node.right)

@calculate.register(Multiply)
def _(node):
    return calculate(node.left) * calculate(node.right)

Here, we have a generic calculate function and several specialized versions registered to it. When calculate(node) is called, singledispatch checks the type of node and invokes the correct registered function.
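One consequence worth sketching: `singledispatch` picks the most specific registered implementation by following the class's method resolution order, so registration order does not matter the way the order of isinstance checks does. The `describe` function and the two integer classes below are hypothetical stand-ins:

```python
from functools import singledispatch

class Integer:
    def __init__(self, value):
        self.value = value

class PositiveInteger(Integer):
    pass

@singledispatch
def describe(node):
    # Fallback for types with no registered implementation
    raise TypeError(f"Unknown node type: {type(node)}")

# The base class is registered first on purpose
@describe.register(Integer)
def _(node):
    return "integer"

@describe.register(PositiveInteger)
def _(node):
    return "positive integer"

print(describe(PositiveInteger(5)))  # positive integer
print(describe(Integer(-2)))         # integer
```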

  • The Good: This solves the “wrong axis” problem. All the calculate logic is in one place. Adding a new operation like pretty_print is trivial: just create a new set of dispatched functions in a separate file. This decouples the operations from the data classes and from each other, leading to cleaner dependencies.
  • The Bad: We’ve reintroduced some duplication. Notice that the recursive calls (calculate(node.left)) and the tree traversal logic (node.left, node.right) are repeated in both the calculate and pretty_print implementations. For every new operation, we have to rewrite this boilerplate.
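For illustration, a pretty printer written in this style might look like the sketch below. The output format is an assumption; note how the recursive calls on `node.left` and `node.right` mirror the ones in `calculate`:

```python
from functools import singledispatch

class Number:
    def __init__(self, value):
        self.value = value

class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

class Multiply:
    def __init__(self, left, right):
        self.left = left
        self.right = right

@singledispatch
def pretty_print(node):
    raise TypeError(f"Unknown node type: {type(node)}")

@pretty_print.register(Number)
def _(node):
    return str(node.value)

@pretty_print.register(Add)
def _(node):
    # Same traversal shape as calculate: recurse left, recurse right
    return f"({pretty_print(node.left)} + {pretty_print(node.right)})"

@pretty_print.register(Multiply)
def _(node):
    return f"({pretty_print(node.left)} * {pretty_print(node.right)})"

tree = Multiply(Add(Number(3), Number(5)), Add(Number(4), Number(7)))
print(pretty_print(tree))  # ((3 + 5) * (4 + 7))
```

The data classes stay untouched, and this whole block could live in its own module.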

Approach 4: The Ultimate Decoupling with Catamorphism

To eliminate the final piece of duplication—the traversal logic—we can introduce a concept from functional programming called a catamorphism. It sounds intimidating, but it’s essentially a generic way to fold a recursive structure down into a single value.

Think of it as a universal tree-traversal function. This function walks the tree and applies a set of callback functions (which functional programmers call an algebra) at each step.

First, we create our generic traverse function that knows how to walk our specific tree structure:

@singledispatch
def traverse(node, func):
    # Default for unknown types
    raise TypeError(f"Unknown node type: {type(node)}")

@traverse.register(Number)
def _(node, func):
    # For a leaf node, call the function on the node itself (there are no child results)
    return func(node)

@traverse.register(Add)
@traverse.register(Multiply)
@traverse.register(Power)  # assumes a Power class with left/right, defined like Add and Multiply
def _(node, func):
    # For a branch, traverse the children first
    left_val = traverse(node.left, func)
    right_val = traverse(node.right, func)
    # Then call the function on the node and the results
    return func(node, left_val, right_val)

Now, the calculate and pretty_print logic becomes incredibly simple. They are just “algebras”—another set of dispatched functions—that contain only the operational logic, with no traversal or recursion.

# The "algebra" for calculating results
@singledispatch
def calculate_algebra(node, *children):
    ...

@calculate_algebra.register(Number)
def _(node):
    return node.value

@calculate_algebra.register(Add)
def _(node, left, right):
    return left + right

# The "algebra" for pretty printing
@singledispatch
def pretty_print_algebra(node, *children):
    ...

@pretty_print_algebra.register(Add)
def _(node, left, right):
    return f"({left} + {right})"

# --- Putting it all together ---
# To calculate, we traverse the tree with the calculation algebra
result = traverse(my_tree, calculate_algebra)

# To pretty print, we traverse with the pretty printing algebra
pretty_string = traverse(my_tree, pretty_print_algebra)

  • The Good: This is the pinnacle of decoupling. The traversal logic is written once. The operational logic (calculating, printing, type checking, etc.) is completely separate and incredibly simple. These functions are pure, easy to test, and have no knowledge of the underlying tree structure.
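Putting the pieces together, a complete runnable version might look like the sketch below. The fragments above leave some cases out, so the `Multiply` and `Number` handlers, the fallback bodies that raise `TypeError`, and the omission of `Power` are assumptions of this sketch rather than part of the talk:

```python
from functools import singledispatch

class Number:
    def __init__(self, value):
        self.value = value

class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

class Multiply:
    def __init__(self, left, right):
        self.left = left
        self.right = right

# --- Traversal, written once ---
@singledispatch
def traverse(node, func):
    raise TypeError(f"Unknown node type: {type(node)}")

@traverse.register(Number)
def _(node, func):
    return func(node)  # leaf: no child results to pass along

@traverse.register(Add)
@traverse.register(Multiply)
def _(node, func):
    left_val = traverse(node.left, func)
    right_val = traverse(node.right, func)
    return func(node, left_val, right_val)

# --- Algebra for calculating (fallback body is an assumption) ---
@singledispatch
def calculate_algebra(node, *children):
    raise TypeError(f"No calculation for {type(node)}")

@calculate_algebra.register(Number)
def _(node):
    return node.value

@calculate_algebra.register(Add)
def _(node, left, right):
    return left + right

@calculate_algebra.register(Multiply)
def _(node, left, right):
    return left * right

# --- Algebra for pretty printing (output format is an assumption) ---
@singledispatch
def pretty_print_algebra(node, *children):
    raise TypeError(f"No pretty printer for {type(node)}")

@pretty_print_algebra.register(Number)
def _(node):
    return str(node.value)

@pretty_print_algebra.register(Add)
def _(node, left, right):
    return f"({left} + {right})"

@pretty_print_algebra.register(Multiply)
def _(node, left, right):
    return f"({left} * {right})"

my_tree = Multiply(Add(Number(3), Number(5)), Add(Number(4), Number(7)))
result = traverse(my_tree, calculate_algebra)
pretty_string = traverse(my_tree, pretty_print_algebra)
print(result)         # 88
print(pretty_string)  # ((3 + 5) * (4 + 7))
```

Neither algebra contains a recursive call: each handler only sees the already-computed results for its children.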

Conclusion: The Zen of Polymorphism

So, which approach should you use? Here are the rules of thumb:

  1. Start with a “One Big Function.” For simple problems, this is often all you need. Don’t over-engineer from the beginning.
  2. Refactor to OOP when you need cohesion. If you are primarily adding new types of data and the behavior is tightly coupled to that data (like in many UI frameworks), OOP is a great fit.
  3. Use singledispatch to decouple behaviors. When you have a stable set of data structures but are constantly adding new, independent operations on them, singledispatch helps organize your code along the right axis—by functionality.
  4. Use a Catamorphism (Visitor Pattern) to separate traversal. For complex, recursive data structures like trees, abstracting the traversal logic into a generic function can dramatically simplify your operational code, making it cleaner, more reusable, and easier to test.

Page last modified: 2025-07-30 12:01:38