Docker Swarm & Docker Stacks¶
Docker Swarm and Docker Stacks work together to help you run and manage containerized applications across multiple Docker hosts (machines).
Imagine you have a complex application with multiple parts (e.g., a web frontend, a backend API, a database):

- You want to run multiple copies of some parts for scalability and reliability.
- You want these parts to find and talk to each other easily.
- You want to manage all these containers across several servers as if they were one big system.
This is where Docker Swarm and Docker Stacks come in.
1. Docker Swarm: The Cluster Manager¶
What it is:
Docker Swarm (or Swarm mode) is Docker’s native clustering and orchestration solution. It turns a pool of Docker hosts (physical or virtual machines) into a single, virtual Docker host.
Key Concepts:
- Node: An instance of the Docker Engine participating in the swarm.
  - Manager Nodes: These nodes perform orchestration and cluster management tasks. They decide where to run containers, manage the swarm’s state, and dispatch tasks to worker nodes. For high availability, you typically run multiple manager nodes (e.g., 3 or 5), which use the Raft consensus algorithm to agree on the state of the swarm.
  - Worker Nodes: These nodes execute the containers (tasks) assigned by the manager nodes and report the status of their tasks back to the managers.
- Service: A definition of the tasks to execute on the manager or worker nodes; it is the desired state for a set of containers. When you create a service, you specify:
  - The Docker image to use.
  - The number of replicas (how many instances of the container should run).
  - Port mappings.
  - Network connections.
  - Update policies (how to roll out new versions).
  - Resource constraints, etc.

  Swarm then works to maintain this desired state. If a container (task) dies, Swarm tries to reschedule it.
- Task: A running container that is part of a service and scheduled on a node. A service has one or more tasks.
- Load Balancing: Swarm includes an internal load balancer (the ingress routing mesh). When you publish a port for a service, Swarm makes that port accessible on every node in the swarm. Traffic arriving at that port on any node is routed to an available task for that service, even if the task is running on a different node.
How it works (simplified):
1. You initialize a swarm on a Docker host, making it the first manager node (`docker swarm init`).
2. Other Docker hosts join the swarm as managers or workers using a join token (`docker swarm join`).
3. You define services (`docker service create ...`).
4. The manager nodes schedule tasks (containers) for these services across the available worker nodes based on your service definition.
5. The swarm continuously monitors the cluster and reconciles the actual state with the desired state (e.g., restarts failed tasks, scales services up or down).
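The steps above map onto a short command sequence. This is a minimal sketch; the IP address, join token, and image are placeholder values:

```shell
# Host 1: create the swarm; this host becomes the first manager
docker swarm init --advertise-addr 192.168.1.10

# Other hosts: join with the token printed by `swarm init`
docker swarm join --token <worker-join-token> 192.168.1.10:2377

# On a manager: declare a service with 3 replicas and a published port
docker service create --name web --replicas 3 --publish 80:80 nginx:latest

# Inspect desired vs. actual state
docker service ls
docker service ps web
```

If a node running one of the `web` tasks goes down, the managers reschedule the missing task elsewhere to restore the declared replica count.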
2. Docker Stacks: The Application Definition¶
What it is:
A Docker Stack is a higher-level abstraction for deploying and managing a multi-service application on a Docker Swarm. It allows you to define your entire application (multiple related services, networks, volumes) in a single `docker-compose.yml` file (version 3 and above) and deploy it as a single unit.
Key Concepts:
- `docker-compose.yml` (v3+): This YAML file is the blueprint for your application stack. It defines:
  - `services`: Each service in your application (e.g., `web`, `api`, `db`).
  - `networks`: Custom networks for your services to communicate over.
  - `volumes`: Persistent storage for your services.
  - `configs` and `secrets`: For managing configuration data and sensitive information.
- `deploy` key: This is crucial for Swarm. Inside the `deploy` key for each service, you specify Swarm-specific configurations like:
  - `replicas`: Number of instances.
  - `placement`: Constraints on where tasks can run (e.g., only on nodes with a specific label).
  - `update_config`: How to perform rolling updates (parallelism, delay).
  - `restart_policy`: What to do if a task fails.
  - `resources`: CPU/memory limits and reservations.
How it works (simplified):
1. You create a `docker-compose.yml` file defining all the services, networks, and volumes for your application, including the `deploy` key for Swarm-specific settings.
2. You run `docker stack deploy -c docker-compose.yml <stack_name>` on a manager node.
3. Docker Swarm reads this file and, for each service defined:
   - It creates or updates the corresponding Swarm service.
   - It ensures the defined networks and volumes are available.
   - It deploys the tasks (containers) across the swarm according to the `deploy` configuration.
Docker Swarm + Docker Stacks: How They Work Together¶
- Docker Swarm provides the underlying cluster infrastructure and orchestration engine. It manages the nodes, schedules containers, ensures high availability, and handles networking and load balancing.
- Docker Stacks provide a user-friendly way to define and manage complex, multi-service applications that run on top of the Docker Swarm. The `docker-compose.yml` file (used by `docker stack deploy`) is translated into a set of Swarm service definitions.
Think of it like this:
- Docker Swarm is the Orchestra Pit & Conductor: It’s the environment, the musicians (nodes), and the conductor (manager nodes) making sure everyone plays their part.
- Docker Stack (and its `docker-compose.yml`) is the Musical Score: It defines all the different parts (services like violins, trumpets, drums) and how they should play together to create the full symphony (your application).
Benefits of using Docker Stacks with Docker Swarm:
- Simplified Application Deployment: Manage your entire multi-service application with a single `docker-compose.yml` file and a few `docker stack` commands.
- Declarative Configuration: You define the desired state of your application, and Swarm works to achieve and maintain it.
- Scalability: Easily scale individual services within your stack by changing the `replicas` count in your Compose file and redeploying the stack.
- Rolling Updates: Update your application services with zero downtime using rolling updates configured in the Compose file.
- Readability & Version Control: Your entire application infrastructure is defined in a text file, which can be version-controlled (e.g., with Git).
- Leverages Docker Compose Familiarity: If you’re already using Docker Compose for local development, moving to Swarm with Stacks is a relatively small step.
Example `docker-compose.yml` for a Stack:
```yaml
version: '3.8'  # Use a Swarm-compatible Compose file version

services:
  web:
    image: myapp/web:latest
    ports:
      - "80:80"  # Expose port 80 on the host, mapped to port 80 in the container
    networks:
      - app-net
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure

  api:
    image: myapp/api:v1.2
    networks:
      - app-net
    deploy:
      replicas: 2
      placement:
        constraints: [node.role == worker]  # Only deploy on worker nodes
      resources:
        limits:
          cpus: '0.50'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 256M

networks:
  app-net:
    driver: overlay  # Overlay network for multi-host communication
```
To deploy this:

```shell
docker stack deploy -c docker-compose.yml my_application_stack
```

To list stacks:

```shell
docker stack ls
```

To list services in a stack:

```shell
docker stack services my_application_stack
```

To remove a stack:

```shell
docker stack rm my_application_stack
```
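Day-to-day operations on a running stack follow the same pattern. A sketch using the stack name above; note that Swarm prefixes service names with the stack name:

```shell
# See where each task is running, including failed ones
docker stack ps my_application_stack

# Imperatively scale one service (the declarative alternative is to
# change `replicas` in the Compose file and re-run `docker stack deploy`)
docker service scale my_application_stack_web=5

# Roll out a new image; Swarm applies the service's update_config settings
docker service update --image myapp/api:v1.3 my_application_stack_api
```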
Vs. Nomad¶
Let’s compare Docker Swarm (+ Stacks) with HashiCorp Nomad. Both are workload orchestrators, but they have different philosophies, scopes, and strengths.
Docker Swarm (+ Stacks)
- Focus: Primarily designed for orchestrating Docker containers.
- Ease of Use: Generally considered very easy to set up and use, especially if you’re already familiar with Docker and Docker Compose. A single `docker swarm init` command is all it takes to start a cluster.
- “Batteries Included”: Swarm comes with built-in features for:
  - Service discovery (DNS-based within the swarm).
  - Load balancing (ingress routing mesh).
  - Secrets management (`docker secret`).
  - Configuration management (`docker config`).
  - Overlay networking for multi-host communication.
- Ecosystem: Tightly integrated with the Docker CLI and Docker ecosystem. Docker Stacks use `docker-compose.yml` (v3+) files, making the transition from local development (Docker Compose) to production (Swarm) smoother.
- Scalability: Good for many use cases, from small to reasonably large clusters. While it can scale, it’s generally not considered as massively scalable as Nomad or Kubernetes for extreme loads.
- Flexibility (Workload Types): Limited to Docker containers. You can’t run VMs, raw executables, or Java JARs directly as first-class citizens scheduled by Swarm.
- Architecture: Simpler architecture with Manager and Worker nodes. Managers use Raft for consensus.
- State Management: Handled internally by the manager nodes.
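The “batteries included” point can be made concrete with Swarm’s built-in secrets workflow. A sketch; the secret, service, and image names are illustrative:

```shell
# Create a secret from stdin; it is stored in the managers' Raft log
printf 'supersecret' | docker secret create db_password -

# Grant a service access; Swarm mounts the secret in the container
# at /run/secrets/db_password
docker service create --name db \
  --secret db_password \
  -e POSTGRES_PASSWORD_FILE=/run/secrets/db_password \
  postgres:latest
```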
HashiCorp Nomad
- Focus: A general-purpose workload orchestrator. It’s designed to schedule and manage any type of application, not just Docker containers.
- Ease of Use: Relatively simple to operate, especially when compared to Kubernetes. It aims for operational simplicity with a single binary for both clients and servers. However, for a full-featured setup, you’ll typically integrate it with Consul (for service discovery/networking) and Vault (for secrets).
- “Bring Your Own Components” (for full features):
- Service Discovery & Networking: While Nomad can run services, robust service discovery, health checking, and service mesh capabilities are typically provided by HashiCorp Consul.
- Secrets Management: Securely managed by integrating with HashiCorp Vault.
- Nomad itself focuses purely on scheduling and placement.
- Ecosystem: Deeply integrates with other HashiCorp tools (Consul, Vault, Terraform). This “HashiStack” provides a powerful, cohesive, but more modular solution.
- Scalability: Designed for very high scalability, capable of managing tens of thousands of nodes and millions of containers/tasks. HashiCorp often touts benchmarks showing its scalability.
- Flexibility (Workload Types): Highly flexible. Nomad uses task drivers to run different types of workloads:
- Docker containers
- Virtual Machines (e.g., QEMU/KVM)
- Java applications (JAR files)
- Raw executables / Static binaries
- Batch jobs
- System jobs
- Architecture: Simple architecture with Servers (equivalent to Swarm Managers, using Raft) and Clients (equivalent to Swarm Workers).
- State Management: Handled internally by the server nodes.
- Federation: Strong capabilities for multi-region and multi-cluster deployments, allowing you to manage workloads across different datacenters or cloud providers from a single control plane.
Here’s a table summarizing the key differences:
| Feature | Docker Swarm (+ Stacks) | HashiCorp Nomad |
|---|---|---|
| Primary Goal | Docker container orchestration | General-purpose workload orchestration (flexible task drivers) |
| Workload Types | Docker containers only | Docker, VMs, JARs, binaries, batch jobs, etc. |
| “Batteries Included” | High (networking, service discovery, secrets) | Lower (relies on Consul for networking/discovery, Vault for secrets) |
| Ecosystem | Docker CLI & Docker ecosystem | HashiCorp ecosystem (Consul, Vault, Terraform) |
| Ease of Setup | Very easy | Easy (single binary), but full setup with Consul/Vault is more involved |
| Learning Curve | Lower for Docker users | Moderate; higher if learning Consul/Vault simultaneously |
| Scalability | Good, but less proven for massive scale vs. Nomad | Excellent, designed for very large scale |
| Flexibility | Lower (tied to Docker) | Higher (can run diverse workloads) |
| Operational Simplicity | Very high for its scope | High (single binary), but more moving parts if using Consul/Vault |
| Multi-Region | Limited native capabilities | Strong federation capabilities |
| Job Definition | `docker-compose.yml` (for Stacks) or `docker service create` | HCL (HashiCorp Configuration Language) job files |
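For contrast with the Compose file earlier, a minimal Nomad job file in HCL might look like this. It is a sketch only; the job, group, and task names are illustrative:

```hcl
job "web" {
  datacenters = ["dc1"]
  type        = "service"

  group "frontend" {
    count = 3  # analogous to `replicas` in a Compose deploy block

    network {
      port "http" {
        static = 80
      }
    }

    task "nginx" {
      driver = "docker"  # one of several task drivers (docker, exec, java, qemu, ...)

      config {
        image = "nginx:latest"
        ports = ["http"]
      }

      resources {
        cpu    = 250  # MHz
        memory = 256  # MB
      }
    }
  }
}
```

Running `nomad job run web.nomad.hcl` on any node submits this to the servers, which place the three tasks across eligible clients.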
Vs. Kubernetes¶
Let’s compare Docker Swarm (+ Stacks) with Kubernetes. This is a very common comparison, as Kubernetes is the dominant container orchestrator in the market.
Kubernetes (K8s)
- Focus: A comprehensive platform for automating deployment, scaling, and management of containerized applications. It’s much more than just an orchestrator; it’s an entire ecosystem.
- Ease of Use: Significantly more complex to set up and manage than Swarm. It has a steep learning curve due to its many concepts and components (Pods, Services, Deployments, Ingress, ConfigMaps, Secrets, Namespaces, etc.). Managed Kubernetes services (like GKE, EKS, AKS) greatly simplify cluster operation but not necessarily application deployment and management concepts.
- “Batteries Included” (and much more): Provides a vast array of features:
- Advanced service discovery and DNS.
- Sophisticated load balancing (internal and external).
- Automated rollouts and rollbacks.
- Self-healing (restarts, reschedules, replaces containers).
- Rich secret and configuration management (Secrets, ConfigMaps).
- Storage orchestration (PersistentVolumes, PersistentVolumeClaims, StorageClasses).
- Horizontal Pod Autoscaling (HPA) based on metrics.
- Resource management and QoS.
- Extensibility through Custom Resource Definitions (CRDs) and Operators.
- Ecosystem: Massive, industry-standard, and backed by the Cloud Native Computing Foundation (CNCF). Huge community, extensive tooling, and wide vendor support.
- Scalability: Designed for very high scalability and resilience, capable of managing tens of thousands of nodes and millions of containers. Proven in massive production deployments.
- Flexibility (Workload Types): Primarily designed for containers, but can be extended (e.g., KubeVirt for VMs, though containers are the first-class citizens).
- Architecture: More complex architecture:
  - Control Plane Nodes: Run components like `kube-apiserver`, `etcd` (a distributed key-value store for cluster state), `kube-scheduler`, and `kube-controller-manager`.
  - Worker Nodes: Run `kubelet` (the agent that manages containers on the node) and `kube-proxy` (a network proxy).
- Job Definition: YAML manifest files defining various Kubernetes objects (Deployments, Services, Pods, etc.). Tools like Helm are often used to package and manage these applications.
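As a rough equivalent of the Swarm `web` service from the Compose example, a Kubernetes Deployment plus Service might look like this. It is a sketch; the names and labels are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                 # like `deploy.replicas` in the Compose file
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myapp/web:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service                 # gives the pods a stable virtual IP and DNS name
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
```

Note that what Swarm expresses in one service definition splits into two Kubernetes objects: the Deployment manages replicas and rollouts, while the Service handles discovery and load balancing.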
Here’s a table summarizing the key differences:
| Feature | Docker Swarm (+ Stacks) | Kubernetes (K8s) |
|---|---|---|
| Primary Goal | Docker container orchestration | Comprehensive container application platform |
| Workload Types | Docker containers only | Primarily containers; extensible (e.g., VMs via KubeVirt) |
| Core Feature Set | Service discovery, load balancing, secrets, configs | All of Swarm’s, plus advanced scheduling, storage, self-healing, autoscaling, extensibility |
| Ecosystem & Community | Docker ecosystem; good but smaller | Massive, CNCF-backed, industry standard |
| Ease of Setup/Operation | Very easy | Complex (managed services help, but still complex conceptually) |
| Learning Curve | Low for Docker users | Steep; many concepts and components |
| Scalability | Good, suitable for many use cases | Extremely high, designed for massive scale |
| Flexibility & Extensibility | Lower | Extremely high (CRDs, Operators, API) |
| Resource Requirements (Orchestrator) | Lighter | Heavier (especially the control plane) |
| Application Definition | `docker-compose.yml` (for Stacks) | YAML manifests (Deployments, Services, etc.); Helm charts for packaging |
| Networking Model | Simpler (overlay, ingress routing mesh) | More complex and powerful (CNI, Services, Ingress controllers, Network Policies) |
| Storage Orchestration | Basic (Docker volumes) | Rich and flexible (PV, PVC, StorageClasses, CSI) |
| “Out-of-the-box” Feel | Simpler, more integrated feel for Docker users | More of a powerful framework; many choices and configurations |
When to Choose Which¶
1. Choose Docker Swarm (+ Stacks) if¶
- ✅ Simplicity and Ease of Use are Top Priorities:
  - You want the easiest and fastest way to get a robust container orchestrator up and running.
  - Your team is small, has limited dedicated operations (Ops) resources, or wants to minimize operational overhead. `docker swarm init` is appealingly simple.
- ✅ You are Deeply Invested in the Docker Ecosystem:
  - Your team is already very familiar with Docker, the Docker CLI, and Docker Compose.
  - You want a natural progression from local `docker-compose.yml` development to a clustered production environment.
- ✅ Your Primary Workloads are Docker Containers:
  - You don’t have an immediate or strong need to orchestrate VMs, raw binaries, or Java JARs as first-class citizens within the same orchestrator.
- ✅ “Good Enough” Orchestration is Sufficient:
  - You need solid features like service discovery, load balancing, rolling updates, and secrets management, but don’t require the absolute cutting edge or extreme configurability of Kubernetes.
  - Your scaling needs are moderate to significant, but perhaps not “web-scale” with tens of thousands of nodes.
- ✅ You Prefer an “All-in-One” Feel for Core Docker Orchestration:
  - You like that networking, service discovery, and basic load balancing are tightly integrated without needing to configure separate components for these core tasks.
2. Choose Kubernetes (K8s) if¶
- ✅ You Need a Comprehensive, Future-Proof, Industry-Standard Platform:
  - You’re building for the long term and want to align with the dominant ecosystem and its vast community support, tooling, and available talent.
  - Portability across cloud providers (all of which offer managed K8s) is important.
- ✅ You are Building Complex, Large-Scale, or Highly Available Microservice Architectures:
  - You require advanced scheduling, sophisticated service discovery and networking (e.g., a service mesh), declarative APIs, and robust self-healing capabilities.
  - You anticipate needing fine-grained resource controls, advanced deployment strategies (canary or blue/green, natively or with add-ons), and horizontal pod autoscaling based on custom metrics.
- ✅ Extensibility and a Rich Ecosystem are Critical:
  - You need to integrate with a wide array of CNCF projects and vendor tools (monitoring, logging, security, CI/CD, service mesh, serverless frameworks on K8s).
  - You might need Custom Resource Definitions (CRDs) and Operators to extend Kubernetes to manage custom or stateful applications.
- ✅ Your Organization Has (or is Willing to Invest in) the Requisite Expertise:
  - You understand and accept the steeper learning curve and higher operational complexity (though managed services like GKE, EKS, and AKS significantly reduce the cluster-operation burden).
- ✅ You Require Advanced Storage Orchestration:
  - You need flexible, dynamic provisioning of various storage types for stateful applications.
3. Choose HashiCorp Nomad if¶
- ✅ Workload Flexibility Beyond Containers is a Key Requirement:
  - You need to orchestrate a diverse set of workloads on the same platform: Docker containers, VMs (QEMU), raw executables, Java JARs, and batch jobs.
  - You value task drivers for different application types.
- ✅ Operational Simplicity for the Core Scheduler, with Massive Scalability:
  - You appreciate Nomad’s single binary for clients and servers and its design for managing tens of thousands of nodes and millions of tasks.
  - You prefer a scheduler that does one job (scheduling) exceptionally well and integrates with best-of-breed tools for service discovery (Consul) and secrets (Vault).
- ✅ You are Already Invested in (or Plan to Use) the HashiCorp Ecosystem:
  - You see the value in the tight, designed-to-work-together integration of Nomad, Consul (service discovery, service mesh, health checking), and Vault (secrets management).
- ✅ Multi-Region / Federation is a Core Architectural Need:
  - You need to manage workloads across multiple datacenters or cloud regions from a single control plane. Nomad has strong built-in federation capabilities.
- ✅ You Want a Simpler Alternative to Kubernetes with More Flexibility than Swarm:
  - Nomad offers a middle ground: easier to operate than Kubernetes, but capable of handling more diverse workloads than Swarm.
  - It suits teams that want high performance and flexibility without the full conceptual overhead of Kubernetes.
Quick Decision Tree (Highly Simplified)¶
- Primary need is JUST Docker containers, and simplicity is king?
  - ➡️ Docker Swarm
- Need to orchestrate diverse workloads (Docker, VMs, binaries), already deep in HashiCorp tools, or need extreme scalability with an operationally simple core scheduler?
  - ➡️ HashiCorp Nomad
- Need the most powerful, feature-rich, industry-standard container platform, and willing to handle complexity, build for massive scale, or integrate deeply with its ecosystem?
  - ➡️ Kubernetes
Conclusion¶
Page last modified: 2025-05-23 15:34:32