# Architecture Note: Runtime Dependency Injection with Plugin Discovery
Pluggy (discovery/hooks) and Dishka (dependency injection) can be combined to create a modular, recursive component system similar to [[Trac Component Architecture]], but strictly adhering to modern Inversion of Control (IoC) principles.
## 1. Executive Summary
This architecture enables a modular application where functionality is composed at runtime via plugins. It leverages Pluggy to discover “recipes” (Providers) from external modules and Dishka to execute those recipes and manage component lifecycles.
Unlike legacy systems (like Trac) where components self-register as singletons and fetch dependencies via a Service Locator (self.env), this architecture is purely Dependency Injection based. Components remain simple Python classes unaware of the plugin system, making them easier to test and maintain.
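To make the contrast concrete, here is a small illustrative sketch; the `Report`/`Database` names are hypothetical, not taken from Trac:

```python
from typing import Protocol

class Database(Protocol):
    def query(self, sql: str) -> list: ...

# Legacy style (service locator): the component pulls its dependency
# out of a shared environment at call time, hiding the coupling.
class LegacyReport:
    def __init__(self, env: dict):
        self.env = env

    def run(self) -> list:
        return self.env[Database].query("SELECT 1")

# This architecture (constructor injection): the dependency is handed in,
# so the class knows nothing about any container or plugin system.
class Report:
    def __init__(self, db: Database):
        self.db = db

    def run(self) -> list:
        return self.db.query("SELECT 1")
```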
## 2. Core Concepts

### The Roles
- The Host (Composition Root): The entry point that bootstraps Pluggy, filters configuration, and initializes the Dishka container.
- Pluggy (The Registry): Defines the “Extension Points.” It is responsible for answering the question: “Who is available to provide functionality?”
- Dishka (The Factory): Responsible for the “Lifecycle” and “Wiring.” It answers the question: “How do I instantiate this service and what does it need?”
- Plugins: Bundles that contribute Providers to the system.
### The Workflow

1. Discovery: The Host uses `setuptools` entry points to find available plugins.
2. Filtering: The Host compares discovered plugins against a configuration (whitelist) to ensure only the desired implementations (e.g., Postgres vs. SQLite) are loaded.
3. Collection: The Host invokes the root Pluggy hook (`get_providers`) to gather all Dishka `Provider` objects.
4. Containerization: A Dishka `Container` is built using these providers.
5. Runtime: When the App requests a service, Dishka creates it. Crucially, if that service defines a new extension point, its Provider triggers a secondary Pluggy hook to satisfy dependencies dynamically.
## 3. Detailed Design Patterns

### A. The Basic Service Pattern (Polymorphism)
Addresses: “I want to choose a specific database implementation via config.”
In this pattern, the Application defines an interface (Protocol). A plugin provides the implementation.
- The Contract (`specs.py`):

  ```python
  class Database(Protocol):
      def query(self, sql: str) -> list: ...
  ```

- The Plugin (`plugin_pg.py`): returns a `Provider` that maps the concrete class to the protocol.

  ```python
  class PostgresProvider(Provider):
      @provide(scope=Scope.APP, provides=Database)
      def provide_db(self) -> Database:
          return PostgresDB(dsn="...")
  ```
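Putting the two pieces together, a runnable end-to-end sketch might look like this; the `PostgresDB` stub is a placeholder for a real driver:

```python
from typing import Protocol

from dishka import Provider, Scope, make_container, provide

class Database(Protocol):
    def query(self, sql: str) -> list: ...

class PostgresDB:
    def __init__(self, dsn: str):
        self.dsn = dsn

    def query(self, sql: str) -> list:
        return []  # stand-in for a real driver call

class PostgresProvider(Provider):
    @provide(scope=Scope.APP, provides=Database)
    def provide_db(self) -> Database:
        return PostgresDB(dsn="...")

container = make_container(PostgresProvider())
db = container.get(Database)  # resolves to PostgresDB behind the Protocol
```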
### B. The Recursive Extension Pattern (Plugins extending Plugins)
Addresses: “I have a Web Plugin, and I need other plugins to register Routes for it.”
This is the equivalent of Trac’s ExtensionPoint. Instead of the component iterating over extensions, the Dishka Provider iterates over hooks to inject the result into the component.
- The Definition (Web Plugin): defines a new HookSpec (`register_routes`).
- The Consumer (Web Provider): injects the `PluginManager` to call the hook during object creation.

  ```python
  class WebProvider(Provider):
      def __init__(self, pm: PluginManager):
          super().__init__()
          self.pm = pm

      @provide(scope=Scope.APP)
      def provide_server(self) -> WebServer:
          # 1. Trigger the hook defined by this plugin
          routes = self.pm.hook.register_routes()
          # 2. Flatten results (one list per implementer)
          flat_routes = [r for sublist in routes for r in sublist]
          # 3. Inject into the service
          return WebServer(routes=flat_routes)
  ```

- The Extender (Admin Plugin): simply implements the hook. It does not need to know about Dishka.

  ```python
  @hookimpl
  def register_routes() -> List[Route]:
      return [Route("/admin", AdminHandler)]
  ```
## 4. Implementation Reference

### Step 1: Define Contracts

Use `typing.Protocol` to decouple interface from implementation.

```python
# domain.py
from typing import Protocol

class Handler(Protocol):
    def handle(self): ...

class Route:
    def __init__(self, path: str, handler: Handler): ...
```
### Step 2: Define Hook Specifications

Define the “slots” where plugins can attach.

```python
# hooks.py
import pluggy
from typing import List

from dishka import Provider

hookspec = pluggy.HookspecMarker("my_app")

class CoreSpecs:
    """The bootstrap hook."""

    @hookspec
    def get_providers(self) -> List[Provider]:
        """Plugins return DI providers here."""

class WebSpecs:
    """An extension point defined by the Web module."""

    @hookspec
    def register_routes(self) -> List['Route']:
        """Plugins return routes here."""
```
### Step 3: The “Composition Root” (Main)

This is the only place where pluggy and dishka are explicitly glued together.

```python
# main.py
import pluggy
from dishka import make_container

from hooks import CoreSpecs

def bootstrap_application(config: dict):
    # 1. Initialize the plugin manager
    pm = pluggy.PluginManager("my_app")
    pm.add_hookspecs(CoreSpecs)

    # 2. Discovery & config filtering:
    #    identify plugins via entry_points, but only register
    #    those allowed by 'config'
    valid_plugins = load_and_filter_entry_points(config)
    for plugin in valid_plugins:
        # Pass 'pm' to the plugin if it needs to resolve recursive hooks
        instance = plugin(pm) if expects_pm(plugin) else plugin()
        pm.register(instance)

    # 3. Harvest providers: flatten the list of lists returned by the hook
    providers_lists = pm.hook.get_providers()
    all_providers = [p for sub in providers_lists for p in sub]

    # 4. Build the container
    container = make_container(*all_providers)
    return container
```
## 5. Addressing Use Cases

### Use Case 1: Activation/Deactivation via Config
Problem: We have PluginA and PluginB installed, but we only want PluginA active.
Solution: The bootstrap_application function reads a config file (e.g., settings.toml). It iterates over importlib.metadata.entry_points(). If the entry point name is not in the config’s “enabled_plugins” list, it is skipped. pm.register() is never called for PluginB, so its providers are never passed to Dishka.
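A minimal sketch of that filtering step, assuming Python 3.10+ `importlib.metadata` and a `my_app` entry-point group; the group name and helper are illustrative, not fixed by the design:

```python
from importlib.metadata import entry_points

def load_and_filter_entry_points(config: dict) -> list:
    """Load only the plugins whitelisted in config['enabled_plugins']."""
    enabled = set(config.get("enabled_plugins", []))
    plugins = []
    for ep in entry_points(group="my_app"):  # hypothetical group name
        if ep.name in enabled:                # skip anything not whitelisted
            plugins.append(ep.load())         # import the plugin class
    return plugins
```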
### Use Case 2: The “Trac” Recursive Scenario
Problem: A NotificationPlugin needs Transport implementations (Email, SMS) provided by other plugins.
Solution:
1. NotificationPlugin defines a hook spec: get_transports.
2. EmailPlugin implements get_transports returning an EmailTransport object.
3. NotificationPlugin has a Dishka Provider. Inside provide_notifier, it calls pm.hook.get_transports().
4. It injects the list of transports into the Notifier service constructor.
5. Result: The Notifier class is clean (just receives a list). The complexity is handled entirely in the wiring layer.
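A runnable sketch of that wiring; the hook name follows the use case above, while the concrete `Transport`, `Notifier`, and plugin classes are illustrative:

```python
from typing import List, Protocol

import pluggy
from dishka import Provider, Scope, make_container, provide

hookspec = pluggy.HookspecMarker("my_app")
hookimpl = pluggy.HookimplMarker("my_app")

class Transport(Protocol):
    def send(self, msg: str) -> None: ...

class NotificationSpecs:
    @hookspec
    def get_transports(self) -> List[Transport]:
        """Plugins return transport instances here."""

class EmailTransport:
    def send(self, msg: str) -> None:
        print(f"EMAIL: {msg}")

class EmailPlugin:
    @hookimpl
    def get_transports(self):
        return [EmailTransport()]

class Notifier:
    """Clean service: it just receives a list of transports."""
    def __init__(self, transports: List[Transport]):
        self.transports = transports

class NotificationProvider(Provider):
    def __init__(self, pm: pluggy.PluginManager):
        super().__init__()
        self.pm = pm

    @provide(scope=Scope.APP)
    def provide_notifier(self) -> Notifier:
        # Each hook implementation returns its own list; flatten them
        results = self.pm.hook.get_transports()
        return Notifier([t for sub in results for t in sub])

pm = pluggy.PluginManager("my_app")
pm.add_hookspecs(NotificationSpecs)
pm.register(EmailPlugin())
container = make_container(NotificationProvider(pm))
notifier = container.get(Notifier)  # transports resolved via the hook
```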
### Use Case 3: Lifecycle Management (Singleton vs Transient)

Problem: The Database connection must be shared (Singleton), but the Request Context must be unique per request.

Solution: Dishka handles this via scopes within the Provider:

- `@provide(scope=Scope.APP)`: behaves like a Trac Component (Singleton).
- `@provide(scope=Scope.REQUEST)`: created anew for every web request.
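A small sketch showing both scopes in one provider; the `Database` and `RequestContext` stubs are placeholders, and entering a nested scope with `container()` follows Dishka's scope model:

```python
from dishka import Provider, Scope, make_container, provide

class Database:
    """Stands in for a shared connection pool."""

class RequestContext:
    """Stands in for per-request state."""

class InfraProvider(Provider):
    @provide(scope=Scope.APP)
    def provide_db(self) -> Database:
        return Database()          # created once for the app's lifetime

    @provide(scope=Scope.REQUEST)
    def provide_ctx(self) -> RequestContext:
        return RequestContext()    # created anew inside each REQUEST scope

container = make_container(InfraProvider())
db = container.get(Database)                 # APP-scoped singleton
with container() as request_scope:           # enter a REQUEST scope
    ctx = request_scope.get(RequestContext)  # unique to this scope
```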
## 6. Comparison: Legacy (Trac) vs. Modern (Dishka/Pluggy)

| Feature | Trac Component Architecture | Modern (Dishka + Pluggy) |
|---|---|---|
| Component Definition | Class inherits from `Component`. | Plain Python class (POJO). |
| Dependencies | Fetched via Service Locator (`self.env[Interface]`). | Injected via constructor (`__init__`). |
| Extension Points | `ExtensionPoint(Interface)` attribute. | Pluggy HookSpec + Provider orchestration. |
| Coupling | High: code is bound to the Trac framework. | Low: business logic is framework-agnostic. |
| Testing | Hard: requires mocking the `ComponentManager`. | Easy: just instantiate the class with mocks. |
## 7. Conclusion
This architecture successfully creates a Microkernel system. The “Core” is minimal. Features are added via plugins that define both the Service Logic (the Classes) and the Wiring Logic (the Providers). By using Dishka to mediate the creation of objects, we maintain strict separation of concerns, ensuring that while the system is highly dynamic and extensible, the individual components remain testable and clean.
What follows is a complete, runnable implementation guide for a Modular Data Processing System (like a simplified ETL pipeline).
This example demonstrates:
1. Core: Defines interfaces (DataSource, Processor).
2. Plugin A (Recursive): A “Core Plugin” that aggregates other processors.
3. Plugin B (Implementation): A concrete Data Source (e.g., CSV).
4. Plugin C (Extension): Adds a text processing step (e.g., “Uppercase”).
5. Bootstrap: Config-driven loading.
## 1. Project Structure

```
/my_project
├── main.py                  # Entry point (Bootstrap)
├── core/
│   ├── __init__.py
│   ├── contracts.py         # Protocols/Interfaces
│   ├── hooks.py             # Pluggy HookSpecs
│   └── service.py           # The Logic (Pipeline Engine)
└── plugins/
    ├── __init__.py
    ├── core_provider.py     # Wiring logic (the recursive part)
    ├── source_csv.py        # A specific data source
    └── proc_uppercase.py    # An extension processor
```
## 2. The Core Layer (Contracts & Hooks)

This layer defines what the system does, not how.

`core/contracts.py`

```python
from typing import Protocol, List

# 1. The data source interface
class DataSource(Protocol):
    def read_data(self) -> List[str]:
        ...

# 2. The processor interface (the extension point)
class Processor(Protocol):
    def process(self, data: List[str]) -> List[str]:
        ...

# 3. The main service interface
class Pipeline(Protocol):
    def run(self) -> None:
        ...
```
`core/hooks.py`

```python
import pluggy
from typing import List

from dishka import Provider

from .contracts import Processor

# Namespace for hooks
hookspec = pluggy.HookspecMarker("data_etl")

class AppSpecs:
    """System-level hooks."""

    @hookspec
    def get_providers(self) -> List[Provider]:
        """Return Dishka Providers to register."""

class PipelineSpecs:
    """Extension-level hooks."""

    @hookspec
    def register_processors(self) -> List[Processor]:
        """Plugins return processor instances here."""
```
`core/service.py`

Pure Python logic. No Dishka, no Pluggy.

```python
from typing import List

from .contracts import DataSource, Processor

class EtlPipeline:
    def __init__(self, source: DataSource, processors: List[Processor]):
        self.source = source
        self.processors = processors

    def run(self) -> None:
        print("--- Starting Pipeline ---")
        data = self.source.read_data()
        print(f"Raw Data: {data}")
        for proc in self.processors:
            data = proc.process(data)
        print(f"Final Data: {data}")
        print("--- Finished ---")
```
## 3. The Plugins

### Plugin A: The Core Wiring (Recursive Logic)
This plugin provides the EtlPipeline. It acts as the bridge. It asks Pluggy for “processors” and feeds them into Dishka.
`plugins/core_provider.py`

```python
import pluggy
from dishka import Provider, Scope, provide

from core.contracts import DataSource, Pipeline
from core.hooks import PipelineSpecs
from core.service import EtlPipeline

# Hook implementation marker
hookimpl = pluggy.HookimplMarker("data_etl")

class CoreWiringProvider(Provider):
    def __init__(self, pm: pluggy.PluginManager):
        super().__init__()
        self.pm = pm

    @provide(scope=Scope.APP)
    def provide_pipeline(self, source: DataSource) -> Pipeline:
        # --- RECURSIVE MAGIC HERE ---
        # 1. Ask Pluggy for anyone who registered a processor
        results = self.pm.hook.register_processors()
        # 2. Flatten the list of lists
        all_processors = [p for sublist in results for p in sublist]
        print(f"DEBUG: CoreProvider found {len(all_processors)} processors.")
        # 3. Inject into the service
        return EtlPipeline(source=source, processors=all_processors)

class CorePlugin:
    def __init__(self, pm):
        self.pm = pm
        # Register the extension spec so other plugins can implement it
        self.pm.add_hookspecs(PipelineSpecs)

    @hookimpl
    def get_providers(self):
        # We pass 'pm' to the provider so it can call hooks later
        return [CoreWiringProvider(self.pm)]
```
### Plugin B: A Concrete Data Source
This provides the DataSource dependency required by the pipeline.
`plugins/source_csv.py`

```python
import pluggy
from dishka import Provider, Scope, provide

from core.contracts import DataSource

hookimpl = pluggy.HookimplMarker("data_etl")

class CsvSource:
    def read_data(self):
        return ["apple", "banana", "cherry"]

class CsvProvider(Provider):
    @provide(scope=Scope.APP)
    def provide_source(self) -> DataSource:
        return CsvSource()

class CsvPlugin:
    @hookimpl
    def get_providers(self):
        return [CsvProvider()]
```
### Plugin C: An Extension (Processor)
This plugin doesn’t provide a Service; it hooks into the register_processors slot defined by the Core.
`plugins/proc_uppercase.py`

```python
import pluggy
from typing import List

from core.contracts import Processor

hookimpl = pluggy.HookimplMarker("data_etl")

class UppercaseProcessor:
    def process(self, data: List[str]) -> List[str]:
        print("-> Running Uppercase Processor")
        return [s.upper() for s in data]

class UppercasePlugin:
    # Notice: this plugin returns NO providers.
    # It only contributes logic to the pipeline hook.
    @hookimpl
    def get_providers(self):
        return []

    @hookimpl
    def register_processors(self) -> List[Processor]:
        return [UppercaseProcessor()]
```
## 4. The Bootstrap (Main)

This connects everything based on configuration.

`main.py`

```python
import pluggy
from dishka import make_container

from core.contracts import Pipeline
from core.hooks import AppSpecs

# Simulating imports (in real life, use importlib.metadata)
from plugins.core_provider import CorePlugin
from plugins.proc_uppercase import UppercasePlugin
from plugins.source_csv import CsvPlugin

# Configuration: we select the CSV source and enable the Uppercase processor
CONFIG = {
    "enabled_plugins": ["core", "csv_source", "uppercase_processor"]
}

PLUGIN_MAP = {
    "core": CorePlugin,
    "csv_source": CsvPlugin,
    "uppercase_processor": UppercasePlugin,
    "disabled_plugin": object,  # Represents an installed but disabled plugin
}

def expects_pm(plugin_cls) -> bool:
    """True if the plugin class defines an __init__ that accepts 'pm'."""
    code = getattr(plugin_cls.__init__, "__code__", None)
    return code is not None and "pm" in code.co_varnames

def main():
    # 1. Set up Pluggy
    pm = pluggy.PluginManager("data_etl")
    pm.add_hookspecs(AppSpecs)

    # 2. Load & filter plugins
    print(f"Loading plugins: {CONFIG['enabled_plugins']}")
    for name in CONFIG["enabled_plugins"]:
        plugin_cls = PLUGIN_MAP.get(name)
        if not plugin_cls:
            continue
        # Instantiate the plugin. If it expects 'pm', inject it.
        # (Simple dependency injection for the plugin wrapper itself.)
        instance = plugin_cls(pm) if expects_pm(plugin_cls) else plugin_cls()
        pm.register(instance)

    # 3. Harvest Dishka providers.
    # This calls 'get_providers' on all registered plugins.
    providers_nested = pm.hook.get_providers()
    all_providers = [p for sub in providers_nested for p in sub]

    # 4. Create the container
    container = make_container(*all_providers)

    try:
        # 5. Run the application.
        # Requesting 'Pipeline' triggers CoreWiringProvider, which fires the
        # 'register_processors' hook; UppercasePlugin responds.
        pipeline = container.get(Pipeline)
        pipeline.run()
    finally:
        container.close()

if __name__ == "__main__":
    main()
```
## 5. Running the Example

Output of `python main.py`:

```
Loading plugins: ['core', 'csv_source', 'uppercase_processor']
DEBUG: CoreProvider found 1 processors.
--- Starting Pipeline ---
Raw Data: ['apple', 'banana', 'cherry']
-> Running Uppercase Processor
Final Data: ['APPLE', 'BANANA', 'CHERRY']
--- Finished ---
```
## 6. Addressing Specific Use Cases

### Use Case: Swapping implementations (Config)

If you change the main.py config to:

```python
# Assume we have a SqlPlugin
CONFIG = {"enabled_plugins": ["core", "sql_source", "uppercase_processor"]}
```
Dishka will now receive a SqlProvider instead of CsvProvider. The CoreWiringProvider doesn’t care; it just asks for DataSource.
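Such a `SqlPlugin` could simply mirror `CsvPlugin`. A hypothetical sketch, where the module name and stub data are illustrative:

```python
# plugins/source_sql.py (hypothetical)
import pluggy
from dishka import Provider, Scope, provide

from core.contracts import DataSource

hookimpl = pluggy.HookimplMarker("data_etl")

class SqlSource:
    def read_data(self):
        return ["row1", "row2"]  # stand-in for a real SQL query

class SqlProvider(Provider):
    @provide(scope=Scope.APP)
    def provide_source(self) -> DataSource:
        return SqlSource()

class SqlPlugin:
    @hookimpl
    def get_providers(self):
        return [SqlProvider()]
```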
### Use Case: Adding more processing steps

If you create a ReversePlugin and add it to the config:

```python
class ReverseProcessor:
    def process(self, data):
        return [s[::-1] for s in data]

class ReversePlugin:
    @hookimpl
    def register_processors(self):
        return [ReverseProcessor()]
```
The pipeline automatically picks up the new step: Source -> Uppercase -> Reverse -> Output. (One caveat: pluggy calls hook implementations in LIFO registration order, so an order-sensitive pipeline should sort or prioritize its processors explicitly rather than relying on plugin load order.)
### Use Case: Testing (Why this architecture rocks)

Because we separated Logic from Wiring, we can test the EtlPipeline without Dishka or Pluggy.

`tests/test_service.py`

```python
from core.service import EtlPipeline

class MockSource:
    def read_data(self):
        return ["foo"]

class MockProcessor:
    def process(self, data):
        return ["bar"]

def test_pipeline_logic():
    # Pure Python unit test
    pipe = EtlPipeline(MockSource(), [MockProcessor()])
    # ... capture stdout or return value ...
    # Assert that the logic flows correctly
```
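A completed version of that test, assuming pytest's built-in `capsys` fixture and the mocks defined in the same file:

```python
def test_pipeline_prints_final_data(capsys):
    pipe = EtlPipeline(MockSource(), [MockProcessor()])
    pipe.run()
    out = capsys.readouterr().out
    # The mock processor rewrites every batch to ["bar"]
    assert "Final Data: ['bar']" in out
```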
We can also test the integration by assembling a container with Dishka manually:

`tests/test_integration.py`

```python
from dishka import make_container

from core.contracts import Pipeline
from plugins.core_provider import CoreWiringProvider
from plugins.source_csv import CsvProvider

# ... mock the PM ...

def test_container_assembly():
    # Validate that the providers play nicely together
    container = make_container(CsvProvider(), MockCoreProvider())
    assert container.get(Pipeline) is not None
```