SMO Optimization Capabilities¶
This note analyzes the Synergetic Meta-Orchestrator's (SMO) current optimization capabilities, as implemented in its codebase (notably within `utils/placement.py` and `utils/scaling.py`). We describe how these optimization functions are invoked, the specific data they consume, and how their outputs drive SMO's orchestration actions.
The SMO employs mathematical optimization, primarily using the CVXPY library, to make intelligent decisions regarding the deployment and ongoing management of Hyper Distributed Application Graphs (HDAGs). Its current optimization logic is focused on two key areas: service placement and replica scaling.
1. Service Placement Optimization¶
SMO uses two distinct placement algorithms: a simpler “naive” algorithm for initial deployment and a more complex CVXPY-based optimization for re-evaluating placement.
A. Initial Placement (utils.placement.calculate_naive_placement
)¶
This algorithm determines the first placement of services when an HDAG is initially deployed.
- Invocation: Called only once per HDAG, within the `services.hdag.graph_service.deploy_graph` function immediately after the `Graph` object is created in the database.
- Data Flow (Inputs):
  - `cluster_capacities`: List of available CPU cores per cluster, sourced from SMO's database (`Cluster.available_cpu`).
  - `cluster_accelerations`: List of GPU capabilities per cluster (boolean/int), sourced from SMO's database (`Cluster.acceleration`).
  - `cpu_limits`: List of CPU core requirements for each service, translated from the HDAG descriptor's `intent.compute.cpu` field using `utils.intent_translation.tranlsate_cpu`.
  - `accelerations`: List of GPU requirements for each service (boolean/int), derived from the HDAG descriptor's `intent.compute.gpu.enabled`.
  - `replicas`: Hardcoded to `[1, 1, ..., 1]` for all services during initial deployment.
- Logic: Implements a first-fit-like heuristic, attempting to place each service onto the first available cluster that meets its resource (CPU, GPU) requirements without exceeding capacity (see the sketch after this list).
- Data Flow (Outputs):
  - Returns a `placement` matrix (2D list: `[[0, 1, 0], [1, 0, 0], ...]`).
  - This matrix is immediately stored in the `graph.placement` field (JSONB) in the database.
  - It is converted (via `utils.placement.convert_placement`) into a `service_placement` dictionary (e.g., `{'serviceA': 'cluster2', 'serviceB': 'cluster1'}`).
  - This dictionary is used to populate the `clustersAffinity` and `serviceImportClusters` fields within the Helm `values_overwrite` data for each service before the initial `helm install` command is executed via `services.hdag.graph_service.helm_install_artifact`.
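To make the heuristic concrete, here is a minimal sketch of a first-fit placement together with the matrix-to-dictionary conversion. The function names and internal details are illustrative, not SMO's actual implementation; only the documented inputs and the first-fit idea come from the source.

```python
def calculate_naive_placement_sketch(cluster_capacities, cluster_accelerations,
                                     cpu_limits, accelerations, replicas):
    """First-fit: put each service on the first cluster with room for it."""
    remaining_cpu = list(cluster_capacities)
    placement = []  # placement[s][e] == 1 iff service s goes to cluster e
    for s, cpu in enumerate(cpu_limits):
        row = [0] * len(cluster_capacities)
        demand = cpu * replicas[s]  # replicas are all 1 on initial deployment
        for e in range(len(cluster_capacities)):
            gpu_ok = not accelerations[s] or cluster_accelerations[e]
            if gpu_ok and remaining_cpu[e] >= demand:
                row[e] = 1
                remaining_cpu[e] -= demand
                break
        placement.append(row)
    return placement


def convert_placement_sketch(placement, service_names, cluster_names):
    """Turn the 0/1 matrix into e.g. {'serviceA': 'cluster2', ...}."""
    return {
        service_names[s]: cluster_names[row.index(1)]
        for s, row in enumerate(placement)
        if 1 in row
    }
```

Because each service is matched against the first feasible cluster in order, the heuristic is fast but makes no attempt to balance load or minimize spread; that is left to the CVXPY-based re-placement described next.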
B. Re-Placement Optimization (`utils.placement.decide_placement`)¶
This CVXPY-based optimization is used when explicitly triggered to re-evaluate the placement of an existing, deployed HDAG.
- Invocation: Called within the `services.hdag.graph_service.trigger_placement` function, which is executed when a request hits the `GET /graphs/<name>/placement` API endpoint.
- Data Flow (Inputs):
  - `cluster_capacities`, `cluster_acceleration`: Sourced from SMO's database (`Cluster` model), reflecting the current known state.
  - `cpu_limits`, `acceleration`: Sourced from the persisted SMO database records for each service within the HDAG (`Service.cpu`, `Service.gpu`).
  - `replicas`: Fetched live from the Karmada control plane using `utils.karmada_helper.KarmadaHelper.get_replicas` to reflect the actual current replica count of each deployed service.
  - `current_placement`: The existing placement matrix, retrieved from the SMO database (`graph.placement`). This is used as the `y` variable in the objective function to penalize changes.
- Logic (CVXPY Model; see the sketch after this list):
  - Decision Variables: Boolean matrix `x[s, e]` (1 if service `s` is placed on cluster `e`).
  - Objective Function: Minimizes `w_dep * cp.sum(x)` (deployment cost/spread) + `w_re * cp.sum(cp.multiply(y, (y - x)))` (re-optimization cost/stability penalty). The weights `w_dep` and `w_re` are currently hardcoded constants (1).
  - Constraints: Ensures each service is placed on exactly one cluster, respects cluster CPU capacity based on current live replica counts, respects GPU requirements, and potentially handles service dependencies (the `d[i-1]` constraint; its exact behaviour depends on `d`).
- Data Flow (Outputs):
  - Returns a new optimized `placement` matrix.
  - This new matrix updates the `graph.placement` field in the database.
  - The code then compares the new placement to the old one for each service. If a service's placement changes:
    - Its corresponding Helm `values_overwrite` data (stored in `Service.values_overwrite`) is updated with the new `clustersAffinity` and recalculated `serviceImportClusters`.
    - The `helm upgrade` command is executed for that specific service via `services.hdag.graph_service.helm_install_artifact` to apply the change.
  - The updated cluster assignments (`cluster_placement`) are used to restart the background scaling threads with the correct context.
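The objective and constraints above map almost one-to-one onto CVXPY. The following is a self-contained sketch under stated assumptions: `w_dep = w_re = 1`, CPU demand weighted by live replica counts, and the dependency (`d`) constraint omitted because its exact form is not documented here. It is illustrative, not the actual `decide_placement` code.

```python
import cvxpy as cp
import numpy as np


def decide_placement_sketch(cluster_capacities, cluster_acceleration,
                            cpu_limits, acceleration, replicas,
                            current_placement, w_dep=1, w_re=1):
    num_services, num_clusters = len(cpu_limits), len(cluster_capacities)

    x = cp.Variable((num_services, num_clusters), boolean=True)
    y = np.array(current_placement)  # previous placement matrix

    objective = cp.Minimize(
        w_dep * cp.sum(x)                       # deployment cost / spread
        + w_re * cp.sum(cp.multiply(y, y - x))  # penalty for moving services
    )

    demand = np.array(cpu_limits) * np.array(replicas)  # CPU per service
    constraints = [
        cp.sum(x, axis=1) == 1,                      # one cluster per service
        demand @ x <= np.array(cluster_capacities),  # cluster CPU capacity
    ]
    # GPU-requiring services may only be placed on GPU-capable clusters.
    for s in range(num_services):
        if acceleration[s]:
            for e in range(num_clusters):
                if not cluster_acceleration[e]:
                    constraints.append(x[s, e] == 0)

    problem = cp.Problem(objective, constraints)
    problem.solve()  # needs a mixed-integer-capable solver installed
    if problem.status != cp.OPTIMAL:
        return None
    return np.rint(x.value).astype(int).tolist()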
```

Note how the `cp.multiply(y, y - x)` term evaluates to 0 wherever the placement is unchanged and to 1 for each service moved off its current cluster, which is exactly what makes it a stability penalty.

2. Replica Scaling Optimization (`utils.scaling.decide_replicas`)¶
This CVXPY-based optimization runs continuously in background threads for deployed services (if scaling is enabled) to adjust replica counts based on load.
- Invocation: Called repeatedly within the `utils.scaling.scaling_loop` function. This loop runs in a separate thread for each cluster managing services of a given graph, started by `services.hdag.graph_service.spawn_scaling_processes` after initial deployment or re-placement.
- Data Flow (Inputs):
  - `request_rates`: List of current request rates per second for each managed service, fetched live from Prometheus using `utils.prometheus_helper.PrometheusHelper.get_request_rate`. (Note: `image-compression-vo` uses `noise-reduction`'s rate.)
  - `previous_replicas`: The list of replica counts determined in the previous iteration of this loop (or fetched initially from Karmada). Used in the transition cost calculation.
  - `cpu_limits`: List of CPU core limits per service, fetched once at the start of the scaling loop from Karmada using `utils.karmada_helper.KarmadaHelper.get_cpu_limit`.
  - `acceleration`: List of GPU requirements per service (boolean/int), passed in when the thread starts (originating from the HDAG descriptor/Service model). Currently sourced from hardcoded values in `spawn_scaling_processes`.
  - `alpha`, `beta`: Coefficients for the linear service capacity model (`capacity = alpha * replicas + beta`), passed in when the thread starts. Currently sourced from hardcoded dictionaries in `spawn_scaling_processes`, keyed by service name.
  - `cluster_capacity`, `cluster_acceleration`: The capacity and GPU capability of the single cluster this scaling loop is responsible for, passed in when the thread starts (originating from the DB `Cluster` model).
  - `maximum_replicas`: List of maximum allowed replicas per service, passed in when the thread starts. Currently sourced from hardcoded values in `spawn_scaling_processes`.
- Logic (CVXPY Model; see the sketch after this list):
  - Decision Variables: Integer list `r_current[s]` (target replicas for service `s` on this cluster).
  - Objective Function: Minimizes `w_util * cp.sum(...)` (resource utilization cost, favoring fewer replicas/CPU) + `w_trans * cp.sum(...)` (transition cost, favoring fewer replica changes). The weights `w_util` and `w_trans` are currently hardcoded (0.4).
  - Constraints: Respects the capacity of the single cluster this loop manages, ensures deployed capacity meets or exceeds current live request rates (`alpha[s] * r_current[s] + beta[s] >= request_rates[s]`), and enforces minimum (1) and maximum replica counts per service.
- Data Flow (Outputs):
  - Returns `new_replicas`, a list of optimal integer replica counts for the services managed by this thread/cluster.
  - If a solution is found (`problem.status == cp.OPTIMAL`):
    - For each service, `utils.karmada_helper.KarmadaHelper.scale_deployment` is called to instruct Karmada to adjust the deployment to the new replica count.
    - This `new_replicas` list becomes the `previous_replicas` input for the next iteration of the loop.
  - If no optimal solution is found (e.g., the constraints cannot be met and the function returns `None`):
    - SMO makes an HTTP GET request to its own `/graphs/<name>/placement` endpoint, triggering a full graph re-placement evaluation in the hope that moving services might resolve the scaling infeasibility. NB: this seems non-ideal; see the discussion below.
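A sketch of the scaling model follows. The exact `cp.sum(...)` terms of the objective are elided above, so the utilization and transition costs below are plausible stand-ins (CPU share of the cluster, and absolute replica change, respectively); the constraints mirror the documented ones, and everything else is illustrative.

```python
import cvxpy as cp
import numpy as np


def decide_replicas_sketch(request_rates, previous_replicas, cpu_limits,
                           alpha, beta, cluster_capacity, maximum_replicas,
                           w_util=0.4, w_trans=0.4):
    n = len(request_rates)
    r = cp.Variable(n, integer=True)  # target replica count per service

    # Assumed cost terms: the documented objective only shows cp.sum(...).
    utilization = cp.sum(cp.multiply(np.array(cpu_limits), r)) / cluster_capacity
    transition = cp.sum(cp.abs(r - np.array(previous_replicas)))

    objective = cp.Minimize(w_util * utilization + w_trans * transition)
    constraints = [
        np.array(cpu_limits) @ r <= cluster_capacity,  # single-cluster capacity
        cp.multiply(np.array(alpha), r) + np.array(beta)
        >= np.array(request_rates),                    # capacity >= live load
        r >= 1,                                        # minimum one replica
        r <= np.array(maximum_replicas),               # per-service maximum
    ]

    problem = cp.Problem(objective, constraints)
    problem.solve()  # needs a mixed-integer-capable solver installed
    if problem.status != cp.OPTIMAL:
        return None  # caller falls back to triggering re-placement
    return [int(round(v)) for v in r.value]
```

The `None` return on a non-optimal status is what triggers the re-placement fallback discussed in the Notes section below.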
Notes¶
How to replace the REST API call from within the SMO itself?¶
Calling SMO's own REST API from within the same process does not make sense from a software design perspective. It introduces unnecessary overhead and potential reliability issues, and it obscures the code flow. A direct function call (after refactoring the core logic) is likely the most appropriate and efficient way to trigger the re-placement logic from within the scaling loop in the same application process, provided thread safety and application context are handled correctly.
Better Alternatives:
- Direct Function Call (Refactored):
  - How: Refactor the core logic of `trigger_placement` into a separate, internal function (e.g., `_perform_graph_re_placement(graph_name)` within `graph_service.py`). Both the `/graphs/<name>/placement` API route handler and the `scaling_loop` would then call this internal function directly (see the first sketch below).
  - Considerations: The function called from the background thread must handle the Flask application context correctly where needed (e.g., for database access, using `with app.app_context():`), and state modifications must be thread-safe.
  - Benefit: Most efficient, clear code flow, simpler error handling.
- Internal Task Queue:
  - How: When the `scaling_loop` detects failure, it pushes a "re-placement task" (with the graph name) onto an internal queue. This could be a simple Python `queue.Queue` managed by another thread, or a more robust library such as Celery or RQ if SMO needed more advanced background tasking. A dedicated worker processes items from the queue, calling the re-placement logic (see the second sketch below).
  - Benefit: Decouples the scaling loop from the immediate execution of the re-placement, allows for better control (e.g., rate-limiting re-placements), and potentially simplifies concurrency management. However, it adds the complexity of managing the queue system.
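A minimal sketch of the direct-call refactor. Apart from `trigger_placement` and `scaling_loop`, all names and the module layout are assumptions for illustration, not SMO's actual code.

```python
def _perform_graph_re_placement(graph_name: str) -> None:
    """Core re-placement logic, shared by the API route and the scaling loop."""
    # ... load graph from DB, run decide_placement, persist the new placement,
    # helm upgrade the moved services, restart scaling threads ...


def trigger_placement(name):
    # Thin API route handler, e.g. bound to GET /graphs/<name>/placement.
    _perform_graph_re_placement(name)
    return {'status': 'ok'}


def handle_scaling_infeasibility(app, graph_name):
    # Called from the background scaling thread; an application context is
    # needed so database access works outside an HTTP request.
    with app.app_context():
        _perform_graph_re_placement(graph_name)
```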
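And a sketch of the queue-based alternative using the standard library's `queue.Queue` with a single daemon worker (Celery or RQ would replace this machinery for heavier needs). It reuses the hypothetical `_perform_graph_re_placement` from the previous sketch.

```python
import queue
import threading

replacement_tasks: queue.Queue = queue.Queue()


def replacement_worker(app):
    """Dedicated worker thread: drains re-placement tasks one at a time."""
    while True:
        graph_name = replacement_tasks.get()  # blocks until a task arrives
        try:
            with app.app_context():
                _perform_graph_re_placement(graph_name)
        finally:
            replacement_tasks.task_done()


# Started once at application startup:
# threading.Thread(target=replacement_worker, args=(app,), daemon=True).start()

# In scaling_loop, when decide_replicas returns None, replace the HTTP
# self-call with:
# replacement_tasks.put(graph_name)
```

A useful side effect of the queue is that duplicate or rapid-fire infeasibility signals can be coalesced or rate-limited in one place, rather than each scaling thread independently firing a re-placement.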