Container Orchestration
from Theory to Practice
Laura Frank & Stephen Day
Open Source Summit, Los Angeles
September 2017
v0
Stephen Day
Docker, Inc.
stephen@docker.com
@stevvooe
Laura Frank
Codeship
laura@codeship.com
@rhein_wein
Agenda
- Understanding the SwarmKit object model
- Node topology
- Distributed consensus and Raft
SwarmKit
An open source framework for building orchestration systems
https://github.com/docker/swarmkit
SwarmKit Object Model
Orchestration
A control system for your cluster
[Diagram: feedback loop connecting D, Δ, O, the Cluster, and S_t]
D = Desired State
O = Orchestrator
C = Cluster
S_t = State at time t
Δ = Operations to converge S to D
https://en.wikipedia.org/wiki/Control_theory
Convergence
A functional view
D = Desired State
O = Orchestrator
C = Cluster
S_t = State at time t
$f(D, S_{n-1}, C) \to S_n \mid \min(S - D)$
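The functional view above can be sketched as a loop that applies f repeatedly until the state stops changing. This is only an illustration, not SwarmKit code: the state is reduced to a replica count, and the names `step`, `converge`, and `capacity` (standing in for the cluster C) are assumptions made for the example.

```python
def step(desired: int, state: int, capacity: int) -> int:
    """One application of f(D, S_{n-1}, C) -> S_n for a replica count.

    Moves the state one unit toward the desired value, but never above
    what the (hypothetical) cluster capacity allows.
    """
    target = min(desired, capacity)
    if state < target:
        return state + 1
    if state > target:
        return state - 1
    return state


def converge(desired: int, state: int, capacity: int) -> list[int]:
    """Apply step() until the state no longer changes; return the history."""
    history = [state]
    while True:
        nxt = step(desired, history[-1], capacity)
        if nxt == history[-1]:
            return history
        history.append(nxt)
```

When the cluster cannot reach D, the loop still settles on the closest reachable state, which is the sense of min(S - D) in the formula.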
Observability and Controllability
The Problem
[Figure: spectrum from Low Observability to High Observability, placing Failure, Process State, and User Input]
Data Model Requirements
- Represent difference in cluster state
- Maximize observability
- Support convergence
- Do this while being extensible and reliable
Show me your data structures and I’ll show you your orchestration system
Services
- Express desired state of the cluster
- Abstraction to control a set of containers
- Enumerates resources, network availability, placement
- Leave the details of runtime to container process
- Implement these services by distributing processes across a cluster
[Diagram: processes distributed across Node 1, Node 2, and Node 3]
Declarative
$ docker network create -d overlay backend
vpd5z57ig445ugtcibr11kiiz
$ docker service create -p 6379:6379 --network backend redis
pe5kyzetw2wuynfnt7so1jsyl
$ docker service scale serene_euler=3
serene_euler scaled to 3
$ docker service ls
ID            NAME          REPLICAS  IMAGE  COMMAND
pe5kyzetw2wu  serene_euler  3/3       redis
Reconciliation
Spec → Object
Object: Current State
Spec: Desired State
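Reconciliation can be pictured as computing the operations that move the object (current state) toward its spec (desired state). The sketch below is a simplified assumption, not the SwarmKit implementation: `Service`, `reconcile`, and the task naming scheme are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    desired_replicas: int      # from the spec (desired state)
    running_tasks: list[str]   # from the object (current state)


def reconcile(service: Service) -> list[tuple[str, str]]:
    """Return the operations needed to move current state to desired state."""
    ops = []
    missing = service.desired_replicas - len(service.running_tasks)
    if missing > 0:
        # Too few tasks: create new ones, named by position (illustrative only).
        start = len(service.running_tasks)
        for slot in range(start, start + missing):
            ops.append(("create", f"{service.name}.{slot}"))
    elif missing < 0:
        # Too many tasks: remove the ones at the end of the list.
        for task in service.running_tasks[missing:]:
            ops.append(("remove", task))
    return ops
```

An empty result means the current state already matches the spec and nothing needs to change.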
Demo Time!
Task Model
Prepare: set up resources
Start: start the task
Wait: wait until task exits
Shutdown: stop task, cleanly
Runtime
Orchestrator
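The four phases above could be modelled as methods on a task object that a runner calls in order. This is a sketch under assumptions: `EchoTask` and `run` are hypothetical names, and calling shutdown on every path (including failures) is a choice made for the example rather than something the slide states.

```python
class EchoTask:
    """A stand-in task that records which lifecycle phases ran."""

    def __init__(self, fail_on_start: bool = False):
        self.fail_on_start = fail_on_start
        self.log = []

    def prepare(self) -> None:
        self.log.append("prepare")   # set up resources

    def start(self) -> None:
        if self.fail_on_start:
            raise RuntimeError("start failed")
        self.log.append("start")     # start the task

    def wait(self) -> None:
        self.log.append("wait")      # block until the task exits

    def shutdown(self) -> None:
        self.log.append("shutdown")  # stop the task, cleanly


def run(task) -> str:
    """Drive a task through its phases; always shut down, even on failure."""
    try:
        task.prepare()
        task.start()
        task.wait()
        return "complete"
    except RuntimeError:
        return "failed"
    finally:
        task.shutdown()
```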
Task Model
Atomic Scheduling Unit of SwarmKit
[Diagram: the Orchestrator compares the Object (Current State) with the Spec (Desired State) and emits Task0, Task1, …, Taskn to the Scheduler]
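Data Flow
[Diagram: on the Manager, a ServiceSpec (containing a TaskSpec) becomes a Service (holding the ServiceSpec and its TaskSpec), which yields Tasks (each holding a TaskSpec); Tasks flow to the Worker]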
Consistency
Field Ownership
Only one component of the system can write to a field
Consistency
Task State
Manager: New → Allocated → Assigned
Worker:
- Pre-Run: Preparing → Ready → Starting
- Running
- Terminal States: Complete, Shutdown, Failed, Rejected
Field Handoff
Task Status
State         Owner
< Assigned    Manager
>= Assigned   Worker
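The handoff rule in the table can be written as a small ownership check. This is a sketch, not SwarmKit's code: the enum values follow the order on the Task State slide but their numbers are illustrative, and `status_owner` and `write_status` are hypothetical helpers.

```python
from enum import IntEnum


class TaskState(IntEnum):
    # Ordered as on the Task State slide; the numeric values are illustrative.
    NEW = 0
    ALLOCATED = 1
    ASSIGNED = 2
    PREPARING = 3
    READY = 4
    STARTING = 5
    RUNNING = 6
    COMPLETE = 7
    SHUTDOWN = 8
    FAILED = 9
    REJECTED = 10


def status_owner(state: TaskState) -> str:
    """Return which component may write the task status in this state."""
    return "manager" if state < TaskState.ASSIGNED else "worker"


def write_status(current: TaskState, new: TaskState, writer: str) -> TaskState:
    """Apply a status write only if the writer owns the field right now."""
    if writer != status_owner(current):
        raise PermissionError(f"{writer} does not own status in {current.name}")
    return new
```

Because exactly one writer is valid for any given state, two components never race on the same field.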
SwarmKit Doesn’t Quit!
Node Topology in SwarmKit
We’ve got a bunch of nodes… now what?
Push Model
1 - Worker registers with the Discovery System (ZooKeeper)
2 - Manager discovers the Worker
3 - Manager sends the payload to the Worker
Pull Model
Worker connects to the Manager for registration and payload
Push Model
Pros: Provides better control over communication rate
- Managers decide when to contact Workers
Cons: Requires a discovery mechanism
- More failure scenarios
- Harder to troubleshoot
Pull Model
Pros: Simpler to operate
- Workers connect to Managers and don’t need to bind
- Can easily traverse networks
- Easier to secure
- Fewer moving parts
Cons: Workers must maintain connection to Managers at all times
Push vs Pull
• SwarmKit adopted the Pull model
• Favored operational simplicity
• Engineered solutions to provide rate control in pull mode
Rate Control in a Pull Model
Rate Control: Heartbeats
• Manager dictates heartbeat rate to Workers
• Rate is configurable (not by end user)
• Managers agree on same rate by consensus via Raft
• Managers add jitter so pings are spread over time (avoid bursts)
[Diagram: Worker asks “Ping?”; Manager answers “Pong! Ping me back in 5.2 seconds”]
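The jitter idea can be sketched as a manager-side function that picks the delay it hands back to a worker. The function name, the uniform distribution, and the `jitter_fraction` parameter are assumptions for illustration; the slides do not say how SwarmKit computes the value.

```python
import random


def next_heartbeat(base_period: float, jitter_fraction: float,
                   rng: random.Random) -> float:
    """Return the delay, in seconds, a worker is told to wait before its next ping.

    The delay is the base period shifted by a random amount of up to
    +/- jitter_fraction of the period.
    """
    jitter = base_period * jitter_fraction
    return base_period + rng.uniform(-jitter, jitter)
```

With a 5 second period and 10% jitter, workers are told values such as 4.8 or 5.2 seconds, so their pings do not arrive in a single burst.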
Rate Control: Workloads
• Worker opens a gRPC stream to receive workloads
• Manager can send data whenever it wants to
• Manager will send data in batches
• Changes are buffered and sent in batches of 100 or every 100 ms, whichever occurs first
• Adds little delay (at most 100 ms) but drastically reduces amount of communication
[Diagram: Worker asks “Give me work to do”; Manager streams batches over time]
100ms - [Batch of 12 ]
200ms - [Batch of 26 ]
300ms - [Batch of 32 ]
340ms - [Batch of 100]
360ms - [Batch of 100]
460ms - [Batch of 42 ]
560ms - [Batch of 23 ]
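The "100 items or 100 ms, whichever occurs first" rule can be sketched as a small buffer with two flush triggers. This is an assumption-laden illustration: the `Batcher` class is hypothetical, time is passed in explicitly to keep it deterministic, and the timer here starts at the first buffered change, which may differ from how the real dispatcher schedules its flushes.

```python
from typing import Optional


class Batcher:
    """Buffer changes and flush at max_items or after max_delay seconds."""

    def __init__(self, max_items: int = 100, max_delay: float = 0.1):
        self.max_items = max_items
        self.max_delay = max_delay
        self.buffer = []
        self.first_at: Optional[float] = None  # arrival time of oldest buffered change

    def add(self, change, now: float) -> Optional[list]:
        """Buffer one change; return a batch if the size limit was reached."""
        if not self.buffer:
            self.first_at = now
        self.buffer.append(change)
        if len(self.buffer) >= self.max_items:
            return self._flush()
        return None

    def tick(self, now: float) -> Optional[list]:
        """Called periodically; return a batch if the oldest change is too old."""
        if self.buffer and now - self.first_at >= self.max_delay:
            return self._flush()
        return None

    def _flush(self) -> list:
        batch, self.buffer, self.first_at = self.buffer, [], None
        return batch
```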
Replication
Running multiple managers for high availability
Replication
[Diagram: three Managers (one Leader, two Followers) and a Worker]
• Worker can connect to any Manager
• Followers will forward traffic to the Leader
Replication
• Followers multiplex all workers to the Leader using a single connection
• Backed by gRPC channels (HTTP/2 streams)
• Reduces Leader networking load by spreading the connections evenly
Example: On a cluster with 10,000 workers and 5 managers, each will only have to handle about 2,000 connections. Each follower will forward its 2,000 workers using a single socket to the leader.
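The arithmetic in the example can be checked with a short calculation. The function below is a rough model, assuming workers spread perfectly evenly and ignoring the managers' own Raft traffic; `connection_load` and its result keys are hypothetical names.

```python
def connection_load(workers: int, managers: int) -> dict[str, int]:
    """Estimate per-manager connections when workers spread evenly."""
    per_manager = workers // managers
    followers = managers - 1
    return {
        "worker_connections_per_manager": per_manager,
        # The leader serves its own workers plus one multiplexed
        # connection from each follower.
        "leader_connections": per_manager + followers,
        # Each follower serves its own workers plus one connection
        # up to the leader.
        "follower_connections": per_manager + 1,
    }
```

For 10,000 workers and 5 managers this gives 2,000 worker connections per manager, and the leader carries only 4 extra connections instead of 8,000.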
Replication
• Upon Leader failure, a new one is elected
• All managers start redirecting worker traffic to the new one
• Transparent to workers
[Diagram: the Leader role moves from one Manager to another while the Workers stay connected]
Replication
• Manager sends list of all managers’ addresses to Workers
- Manager 1 Addr
- Manager 2 Addr
- Manager 3 Addr
• When a new manager joins, all workers are notified
• Upon manager failure, workers will reconnect to a different manager
[Diagram: after a manager fails, the Worker reconnects to a random manager]
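A worker-side view of this reconnect step might look like the following. It is a sketch only: `pick_manager` is a hypothetical helper, the addresses are made up, and the real agent may weight or order its choices differently.

```python
import random


def pick_manager(managers: list[str], failed: str, rng: random.Random) -> str:
    """Pick a random manager address other than the one that just failed."""
    candidates = [m for m in managers if m != failed]
    if not candidates:
        raise RuntimeError("no other manager available")
    return rng.choice(candidates)
```

Choosing at random spreads reconnecting workers across the surviving managers instead of sending them all to the same one.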
Presence
Scalable presence in a distributed environment
Presence
• Leader commits Worker state (Up vs Down) into Raft
  − Propagates to all managers
  − Recoverable in case of leader re-election
• Heartbeat TTLs kept in Leader memory
  − Too expensive to store “last ping time” in Raft
    • Every ping would result in a quorum write
  − Leader keeps worker<->TTL in a heap (time.AfterFunc)
  − Upon leader failover workers are given a grace period to reconnect
    • Workers considered Unknown until they reconnect
    • If they do they move back to Up
    • If they don’t they move to Down
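The in-memory TTL tracking described above can be sketched with a min-heap of deadlines. This is an illustration under assumptions: the slide refers to Go's time.AfterFunc, whereas this version uses Python's `heapq` with explicit timestamps, and the `Presence` class is hypothetical. Only the Up/Down transition would be written to Raft; the deadlines stay in leader memory.

```python
import heapq


class Presence:
    """Track worker liveness with in-memory deadlines kept in a min-heap."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self.deadline = {}   # worker -> current deadline
        self.heap = []       # (deadline, worker) entries, possibly stale
        self.state = {}      # worker -> "up" or "down"

    def heartbeat(self, worker: str, now: float) -> None:
        """Record a ping: refresh the deadline and mark the worker up."""
        expires = now + self.ttl
        self.deadline[worker] = expires
        heapq.heappush(self.heap, (expires, worker))
        self.state[worker] = "up"

    def expire(self, now: float) -> list[str]:
        """Mark workers whose deadline has passed as down; return them."""
        expired = []
        while self.heap and self.heap[0][0] <= now:
            expires, worker = heapq.heappop(self.heap)
            # Skip stale heap entries superseded by a later heartbeat.
            if self.deadline.get(worker) == expires:
                self.state[worker] = "down"
                del self.deadline[worker]
                expired.append(worker)
        return expired
```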
Distributed Consensus
The Raft Consensus Algorithm
Orchestration systems typically use some kind of service to maintain state in a distributed system
- etcd
- ZooKeeper
- …
Many of these services are backed by a consensus algorithm: etcd uses Raft, while ZooKeeper uses the related Zab protocol
SwarmKit and Raft
Docker chose to implement the algorithm directly
- Fast
- Don’t have to set up a separate service to get started with orchestration
- Differentiator between SwarmKit/Docker and other orchestration systems
demo.consensus.group
secretlivesofdata.com
Consistency
Sequencer
● Every object in the store has a Version field
● Version stores the Raft index when the object was last updated
● Updates must provide a base Version; are rejected if it is out of date
● Similar to CAS
● Also exposed through API calls that change objects in the store
Versioned Updates
Consistency
service := getCurrentService()
spec := service.Spec
spec.Image = "my.serv/myimage:mytag"
update(spec, service.Version)
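The versioned update above behaves like a compare-and-swap on the object's Version. The following is a minimal in-memory sketch, not the SwarmKit store API: `Store`, `StaleVersionError`, and the integer counter standing in for the Raft index are all assumptions.

```python
class StaleVersionError(Exception):
    """Raised when an update's base version is out of date."""


class Store:
    """A tiny in-memory store with compare-and-swap style updates."""

    def __init__(self):
        self.index = 0      # stands in for the Raft index
        self.objects = {}   # object id -> (spec, version)

    def create(self, obj_id: str, spec: dict) -> int:
        self.index += 1
        self.objects[obj_id] = (dict(spec), self.index)
        return self.index

    def get(self, obj_id: str) -> tuple[dict, int]:
        spec, version = self.objects[obj_id]
        return dict(spec), version

    def update(self, obj_id: str, spec: dict, base_version: int) -> int:
        """Write spec only if base_version matches the stored version."""
        _, current = self.objects[obj_id]
        if base_version != current:
            raise StaleVersionError(f"base {base_version}, current {current}")
        self.index += 1
        self.objects[obj_id] = (dict(spec), self.index)
        return self.index
```

A caller whose update is rejected would re-read the object, reapply its change to the fresh spec, and retry with the new version.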
Sequencer
Original object:
Service ABC
  Spec
    Replicas = 4
    Image = registry:2.3.0
    ...
  Version = 189 (Raft index when it was last updated)

Update request:
Service ABC
  Spec
    Replicas = 4
    Image = registry:2.4.0
    ...
  Version = 189

The request's base Version (189) matches the stored Version, so the update is applied.

Updated object:
Service ABC
  Spec
    Replicas = 4
    Image = registry:2.4.0
    ...
  Version = 190

Update request:
Service ABC
  Spec
    Replicas = 5
    Image = registry:2.3.0
    ...
  Version = 189

This request's base Version (189) is older than the stored Version (190), so it is out of date and is rejected.
github.com/docker/swarmkit
github.com/coreos/etcd (Raft)
THANK YOU
