The best Docker tools
In recent years, the name ‘Docker’ has become a synonym for container technology. With its cross-platform, self-developed standard, the company has established an alternative to hypervisor-based hardware virtualization. In doing so, Docker has successfully revitalized basic Linux core functions for virtualization at the operating system level.
Read on for the most important extensions for the container platform and an overview of the most popular third-party projects that develop Docker tools on an open-source basis.
- Docker extensions and online services
- Docker tools from external providers
- Docker container as part of a digital infrastructure platform
- Conclusion: How secure is Docker for the future?
Docker extensions and online services
Today, Docker is far more than just a sophisticated platform for managing software containers. Developers have created a range of diverse extensions and online services to make the deployment of applications via distributed infrastructure and cloud environments easier, faster, and more flexible. In addition to clustering and orchestration, these provide users with a central app marketplace and a tool for managing cloud resources.
When users talk of ‘Docker’, they are usually referring to the open source client-server application that underlies the container platform, which is actually called the ‘Docker engine’. The central components of the Docker engine are the Docker daemon, a REST API, and a CLI (command line interface) as the user interface. This design allows you to address the Docker engine through command line commands, and to manage images, Dockerfiles, and containers conveniently from the terminal.
Find a detailed description of the Docker engine in our Docker tutorial for beginners.
The Docker Hub provides users with a cloud-based registry that allows Docker images to be downloaded, centrally managed, and shared with other Docker users. Registered users can store Docker images publicly or in private repositories. Downloading a public image (known as pulling in Docker terminology) does not require a user account. An integrated tag mechanism enables the versioning of images.
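The tag mechanism can be illustrated with a short Python sketch that splits an image reference into repository and tag. The function name is an assumption made for this example; the sketch relies only on the convention that a reference without a tag is treated as `latest`, and it deliberately ignores digest references (`name@sha256:...`).

```python
from typing import Tuple


def split_image_reference(reference: str) -> Tuple[str, str]:
    """Split 'repository[:tag]' into (repository, tag), defaulting to 'latest'."""
    # Only a colon after the last slash separates a tag. An earlier colon
    # belongs to a registry host and port, such as localhost:5000.
    last_slash = reference.rfind("/")
    colon = reference.rfind(":")
    if colon > last_slash:
        return reference[:colon], reference[colon + 1:]
    return reference, "latest"


print(split_image_reference("nginx"))
print(split_image_reference("redis:7.2"))
print(split_image_reference("localhost:5000/team/app"))
```

The registry address `localhost:5000/team/app` is a made-up example of a private registry; the image names `nginx` and `redis` refer to the official images mentioned in this article.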
In addition to the public repositories of other Docker users, the Hub offers many resources that are provided in official image archives by the Docker developer team and well-known open source projects. The most popular Docker images include the NGINX webserver, the Redis in-memory database, the BusyBox Unix tool kit, and the Ubuntu Linux distribution.
Organizations are another important Docker Hub feature. They allow users of the container platform to make private repositories available exclusively to specific people. Access rights are managed within an organization using teams and group memberships.
The Docker machine enables users to provision and manage Docker hosts on almost any infrastructure. The tool automates the implementation of Docker and makes it much easier to provide Docker hosts.
The developer team refers to a virtual host running the Docker engine as a ‘Docker host’ or a ‘Dockerized host’. While Docker runs natively on all Linux distributions, its use on macOS or Windows systems once required an abstraction layer in the form of a Docker machine. This changed fundamentally with release v1.12. Today, Docker is available across all the most popular platforms, including Mac and Windows. The central field of application of the Docker machine has thus shifted to remote scenarios and the management of Docker hosts in cloud environments.
If a large number of Docker nodes are being used in a network or on cloud infrastructures such as Amazon Web Services (AWS) or DigitalOcean, users will eventually turn to the Docker machine. The tool reduces the effort required to create new hosts to a simple docker-machine create command and allows you to manage multiple Docker nodes from the terminal. The Docker machine takes on the task of creating an SSL PKI as well as the distribution of user authentication certificates. The software is used in combination with Docker swarm for the deployment of Docker clusters.
The Docker machine is generally used on the local system. The Docker tool automatically creates new hosts, installs the Docker engine, and configures the locally installed command line program for remote access. This allows you to address hosts in the cloud from a local system terminal and carry out Docker commands on them remotely.
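As a rough illustration of this workflow, the following Python sketch only assembles the command lines involved and prints them; it does not execute anything. The helper names and the machine name `staging-1` are assumptions made for this example, while `docker-machine create --driver` and `docker-machine env` are the tool's own subcommands.

```python
import shlex
from typing import List


def machine_create_command(name: str, driver: str) -> List[str]:
    """Build the argument list that creates a new Docker host."""
    return ["docker-machine", "create", "--driver", driver, name]


def machine_env_command(name: str) -> List[str]:
    """Build the argument list that prints the shell settings for a host."""
    return ["docker-machine", "env", name]


# 'staging-1' is a hypothetical machine name used for illustration.
print(shlex.join(machine_create_command("staging-1", "virtualbox")))
print(shlex.join(machine_env_command("staging-1")))
```

Passing such a list to `subprocess.run` would run the command on a system where Docker machine is installed. Building the arguments as a list rather than a single string avoids quoting problems with unusual host names.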
Since version 1.12, the Docker engine has contained a native function that enables its users to manage Docker hosts in clusters called swarms. The cluster management and orchestration capabilities built into the Docker engine are based on the SwarmKit toolkit. For older versions of the container platform, Swarm is available as a standalone application.
Clusters are made up of any number of Docker hosts and are hosted on the infrastructure of an external IaaS provider or in your own data center.
As a native clustering tool, Swarm gathers a pool of Docker hosts into a single virtual host and serves the Docker REST API. Thus, any Docker tool that communicates with the Docker daemon can access Swarm and scale across any number of Docker hosts. With the Docker engine CLI, users can create swarms, distribute applications in the cluster, and manage the behavior of the swarm with no need for additional orchestration software.
Docker engines that have been combined into clusters run in swarm mode. Select this mode if you want to create a new cluster or add a Docker host to an existing swarm. Individual Docker hosts in a cluster are referred to as "nodes". The nodes of a cluster can run as virtual hosts on the same local system, but a cloud-based design is more common, where the individual nodes of the Docker swarm are distributed across different systems and infrastructures.
The software is based on a master-slave architecture. When tasks are to be distributed in the swarm, users pass a service to the manager node, which acts as the cluster’s master. This master is then responsible for scheduling containers in the cluster and serves as the primary user interface for accessing swarm resources.
The manager node sends individual units, known as tasks, to the subordinate slaves, which in Docker terminology are referred to as ‘worker nodes’.
- Services: services are central structures in Docker clusters. A service is a group of containers based on the same image. Its job is to define the tasks that are executed in the cluster. When creating a service, the user specifies which image and commands are used. In addition, services offer the possibility to scale applications. Users of the Docker platform simply define how many containers are to be started for a service.
- Tasks: to distribute services in the cluster, they are divided into individual work units (tasks) by the manager node. Each task includes a Docker container as well as the commands that are executed in it.
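The relationship between services, tasks, and worker nodes can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the scheduler used by swarm mode: the class and function names are invented for this example, and the round-robin assignment is a simple stand-in for whatever placement strategy the manager node actually applies.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, List


@dataclass(frozen=True)
class Service:
    name: str
    image: str
    command: str
    replicas: int  # how many containers the user wants for this service


@dataclass(frozen=True)
class Task:
    service: str
    slot: int
    image: str
    command: str


def split_into_tasks(service: Service) -> List[Task]:
    """Divide a service into one task per requested container."""
    return [
        Task(service.name, slot, service.image, service.command)
        for slot in range(1, service.replicas + 1)
    ]


def assign_round_robin(tasks: List[Task], workers: List[str]) -> Dict[str, List[Task]]:
    """Hand the tasks to the worker nodes in turn (illustrative strategy only)."""
    assignment: Dict[str, List[Task]] = {worker: [] for worker in workers}
    for task, worker in zip(tasks, cycle(workers)):
        assignment[worker].append(task)
    return assignment


web = Service(name="web", image="nginx", command="nginx -g 'daemon off;'", replicas=5)
plan = assign_round_robin(split_into_tasks(web), ["worker-1", "worker-2"])
for worker, tasks in plan.items():
    print(worker, [task.slot for task in tasks])
```

Scaling the service then amounts to changing `replicas` and computing the plan again.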
In addition to cluster management and the orchestration of containers, manager nodes by default also perform worker node functions, unless you restrict the tasks of such master nodes to management alone.
An agent program runs on every worker node. It accepts tasks and sends the respective master node status reports on the progress of the transferred task. The following graphic shows a schematic representation of a Docker Swarm:
When implementing a Docker Swarm, users generally rely on the Docker machine.
Compose is Docker’s solution for multi-container applications. Originally developed as Fig by the software developer Orchard, the lean orchestration tool now expands the Docker portfolio.
Docker Compose enables you to combine multiple containers and run them with a single command. The open source tool is implemented in the scripting language Python. The basic element of Compose is a central control file based on the data serialization language YAML. The syntax of this compose file is similar to that of the open source software Vagrant, which is used when creating and provisioning virtual machines.
In the docker-compose.yml file, you define any number of software containers, including all dependencies, as well as their relationships with each other. Such multi-container applications are controlled according to the same pattern as individual software containers. Use the docker-compose command in combination with the desired subcommand to manage the entire life cycle of the application.
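How dependencies between containers translate into a start order can be sketched in Python. The dictionary below stands in for the parsed content of a docker-compose.yml file; `services`, `image`, and `depends_on` are keys of the compose file format, whereas the service names, the image `example/app`, and the function `start_order` are assumptions made for this illustration. The sketch shows one way to derive such an order and does not claim to be how Compose implements it.

```python
from typing import Dict, List, Set

# Stand-in for a parsed docker-compose.yml (service names are hypothetical).
compose = {
    "services": {
        "web": {"image": "nginx", "depends_on": ["app"]},
        "app": {"image": "example/app", "depends_on": ["db", "cache"]},
        "db": {"image": "postgres"},
        "cache": {"image": "redis"},
    }
}


def start_order(compose_file: Dict) -> List[str]:
    """Return the services so that every dependency precedes its dependents."""
    services = compose_file["services"]
    order: List[str] = []
    visiting: Set[str] = set()

    def visit(name: str) -> None:
        if name in order:
            return
        if name in visiting:
            raise ValueError(f"circular dependency involving {name!r}")
        visiting.add(name)
        for dependency in services[name].get("depends_on", []):
            visit(dependency)
        visiting.discard(name)
        order.append(name)

    for name in services:
        visit(name)
    return order


print(start_order(compose))
```

A depth-first traversal is used here because it also detects circular dependencies, which a plain sort by name would silently accept.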
The Docker tool can be easily integrated into a cluster based on Swarm. In this way, you can run multi-container applications created with Compose just as easily on distributed systems as on a single Docker host.
Another feature of Docker Compose is an integrated scaling mechanism. With the orchestration tool, you can conveniently use the command line program to define how many containers you would like to start for a particular service.
Docker Cloud (previously Tutum)
At the end of 2015, Docker took over the container deployment and management service Tutum and converted it to Docker Cloud. While the Docker Hub serves as a central storage platform for images, the Docker Cloud provides you with a web-based platform that allows you to create, test, monitor, and distribute Docker containers across a variety of infrastructures.
The Docker Cloud offers native integration with all Docker services and supports several cloud providers, such as Microsoft Azure, Amazon Web Services, DigitalOcean, IBM SoftLayer, Packet.net, or Google Container Engine. In addition, users have the opportunity to connect the Docker Cloud to their own infrastructure.
Please note: The Docker Cloud doesn’t provide hosting services. All resources for nodes, services, and stacks that you create via the deployment platform have to be provided by external providers or your own data center.
A stack is a set of multiple Docker services that together create an application.
The native binding to established infrastructure-as-a-service (IaaS) providers allows you to distribute applications across various hosting platforms or to combine them with local nodes in hybrid architectures. Docker nodes on various infrastructures are managed centrally via a graphical user interface, the Docker Cloud Dashboard. Using this, you can connect with a click to your data center or an external hosting provider and create individual Docker nodes or entire node clusters.
A single node cluster in the Docker Cloud cannot span several external providers.
In addition to the cross-platform deployment and scaling of applications in the cloud, the deployment platform offers interfaces for various registries such as the Docker Hub, as well as automated build and test functions. An overview of the core functions and possible applications of the Docker Cloud is offered in the following video from the development team:
The Docker Toolbox
Since the release of v1.12, Docker has also been available for newer macOS and Windows computers as a native desktop application. Users of older systems, however, have to rely on the Docker toolbox to be able to install the container platform.
The Docker toolbox is an installer that automates the setup of a Docker environment on older Windows and macOS systems. The software contains the following components:
- Docker Engine
- Docker Machine
- Docker Compose
- Oracle VirtualBox
- Docker GUI Kitematic
- Command line interface configured for Docker
The core of the Docker toolbox is the open source software Kitematic. This integrates with the Docker machine to provide a virtual machine based on Oracle VirtualBox for installing the container platform on Windows and Mac. Users get a graphical user interface that allows containers to be created, started, and stopped with a click of the mouse. An interface to the Docker Hub also allows you to search the registry directly from the desktop app and download images.
In addition to the GUI, there is a preconfigured command line interface. All changes that users make via one of these input paths are also applied in the other, so users can switch seamlessly between GUI and CLI when running and controlling a container. In addition, Kitematic offers functions that automatically assign ports, define environment variables, and configure drives (volumes).
Although the Docker toolbox is marked as legacy in the Docker documentation, the software is still supported. However, the development team recommends using the native desktop apps, Docker for Windows or Docker for Mac, provided that the system requirements are met.
Docker tools from external providers
In addition to the in-house developments from Docker Inc., there are various software tools and platforms from external providers that provide interfaces for the Docker Engine or are specially developed for the popular container platform. The Docker ecosystem’s most popular open source projects include the orchestration tool Kubernetes, the cluster management tool Shipyard, the multi-container shipping solution Panamax, the continuous integration platform Drone, the cloud operating system OpenStack, and the Mesosphere DC/OS datacenter operating system, based on the cluster manager Mesos.
Docker has not always offered its own orchestration tools such as Swarm and Compose. For this reason, various companies have been investing in their own development work for years to create tailor-made tools designed to facilitate the operation of the container platform in large, distributed infrastructures. Among the most popular solutions of this type, in addition to Spotify’s open source orchestration platform Helios, is the open source project Kubernetes.
Kubernetes is a cluster manager for container-based applications, largely developed by Google and today under the patronage of the Cloud Native Computing Foundation (CNCF).
The Cloud Native Computing Foundation (CNCF) is a sub-organization of the Linux Foundation (LF). Crucial to the founding of the organization in 2015 was a cooperation between the Linux Foundation and the software company Google, in which the Google project Kubernetes was donated to the CNCF. The goal of the organization is to promote container technology in open source projects. Besides Google, the other members are Docker, Twitter, Huawei, Intel, Cisco, IBM, Univa, and VMware.
Kubernetes 1.0 was released in the middle of 2015 for use in production systems and is used, for example, in the Google Container Engine (GKE).
The goal of Kubernetes is to automate the operation of applications in a cluster. To do this, the orchestration tool uses a REST API, a command line program, and a graphical web interface as control interfaces, via which the automations can be triggered and status reports can be requested. Use Kubernetes to:
- run container images on a cluster,
- install and manage applications in distributed systems,
- scale applications, and
- use the available hardware as efficiently as possible.
To this end, Kubernetes combines containers into logical parts – so-called pods. Pods represent the basic units of the cluster manager, which can be distributed in the cluster by scheduling.
Like Docker’s Swarm, Kubernetes is based on a master-slave architecture. A cluster is composed of a Kubernetes master and a variety of slaves, so-called Kubernetes nodes (also workers or minions). The Kubernetes master functions as the central control plane in the cluster and consists of four basic components that make it possible to direct communication in the cluster and to distribute tasks: an API server, the configuration store etcd, a scheduler, and a controller manager.
- API server: All automations in the Kubernetes cluster are initiated through a REST API via the API server. This functions as the central administration interface in the cluster.
- etcd: You can think of the open source configuration store etcd as the memory of a Kubernetes cluster. The key-value store, developed by CoreOS specifically for distributed systems, stores configuration data and makes it available to every node in the cluster. The current state of the cluster can be managed at any time via etcd.
- Scheduler: The scheduler is responsible for distributing container groups (pods) in the cluster. For this, it determines the resource requirements of a pod and matches this with the available resources of the individual nodes in the cluster.
- Controller manager: The controller manager is a service of the Kubernetes master and regulates the state of the cluster, performs routine tasks, and so controls orchestration. The main task of the controller manager is to ensure that the state of the cluster corresponds to the defined target state.
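The idea of keeping the actual cluster state in line with a defined target state can be sketched as a simple comparison. The following Python example is a toy model, not code from Kubernetes: the function name, the application names, and the representation of state as replica counts per application are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def reconcile(desired: Dict[str, int], actual: Dict[str, int]) -> List[Tuple[str, str, int]]:
    """Compare target and actual replica counts and list the corrective actions."""
    actions: List[Tuple[str, str, int]] = []
    for app in sorted(set(desired) | set(actual)):
        difference = desired.get(app, 0) - actual.get(app, 0)
        if difference > 0:
            actions.append(("start", app, difference))
        elif difference < 0:
            actions.append(("stop", app, -difference))
    return actions


# Target state: three 'web' and two 'worker' replicas; 'old' should no longer run.
desired_state = {"web": 3, "worker": 2}
actual_state = {"web": 1, "worker": 2, "old": 1}
print(reconcile(desired_state, actual_state))
```

A controller would run such a comparison repeatedly, so that a failed container shows up as a difference in the next pass and is replaced.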
All components of the Kubernetes master can be located on the same host or distributed over several master hosts within a high-availability cluster.
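The scheduler's job of matching a pod's resource requirements against the free resources of the nodes can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: the function name, the resource keys `cpu_millis` and `memory_mib`, the node names, and the rule of preferring the node with the most free CPU are all chosen for this example and are not taken from the Kubernetes scheduler.

```python
from typing import Dict, Optional


def schedule(request: Dict[str, int], free_by_node: Dict[str, Dict[str, int]]) -> Optional[str]:
    """Place a pod on a node that can satisfy its request, or return None."""
    # Step 1: keep only the nodes with enough free capacity for every resource.
    fitting = [
        name
        for name, free in free_by_node.items()
        if all(free.get(resource, 0) >= amount for resource, amount in request.items())
    ]
    if not fitting:
        return None
    # Step 2: among those, prefer the node with the most free CPU.
    chosen = max(fitting, key=lambda name: free_by_node[name].get("cpu_millis", 0))
    # Step 3: reserve the requested resources on the chosen node.
    for resource, amount in request.items():
        free_by_node[chosen][resource] -= amount
    return chosen


nodes = {
    "node-a": {"cpu_millis": 500, "memory_mib": 1024},
    "node-b": {"cpu_millis": 2000, "memory_mib": 4096},
}
pod = {"cpu_millis": 1000, "memory_mib": 512}
print(schedule(pod, nodes))
print(nodes["node-b"])
```

Returning `None` instead of raising an error models a pod that stays unscheduled until resources become available.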
While the Kubernetes master is responsible for the orchestration, the pods distributed in the cluster are run on hosts, the Kubernetes nodes, which are subordinate to the master. To do this, a container engine needs to run on each Kubernetes node. Docker represents the de facto standard. But in principle, Kubernetes is not committed to a specific container engine.
In addition to the container engine, Kubernetes nodes include the following components:
- kubelet: kubelet is an agent that runs on each Kubernetes node and is used to control and manage the node. As the central point of contact of each node, kubelet is connected to the Kubernetes master and ensures that information is passed on to and received from the control plane. The agent accepts commands and tasks from the Kubernetes master and monitors the execution on the respective node.
- kube-proxy: In addition to the container engine and the kubelet agent, the proxy service kube-proxy runs on every Kubernetes node. This ensures that requests from the outside are forwarded to the respective containers, and provides services to users of container-based applications. The kube-proxy also offers rudimentary load balancing.
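What rudimentary load balancing means in practice can be shown with a round-robin sketch in Python. This illustrates the general technique only; it is not a description of how kube-proxy selects a backend, and the class name and the endpoint addresses are assumptions made for this example.

```python
from itertools import cycle
from typing import Iterator, List


class RoundRobinBalancer:
    """Forward each incoming request to the next endpoint in turn."""

    def __init__(self, endpoints: List[str]) -> None:
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._endpoints: Iterator[str] = cycle(endpoints)

    def pick(self) -> str:
        """Return the endpoint that should receive the next request."""
        return next(self._endpoints)


# Hypothetical addresses of three containers that back the same service.
balancer = RoundRobinBalancer(["10.1.0.4:8080", "10.1.0.5:8080", "10.1.0.6:8080"])
for _ in range(4):
    print(balancer.pick())
```

Round robin needs no knowledge of the load on each container, which keeps it simple but also means that a slow container receives just as many requests as a fast one.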
The following graphic shows a schematic representation of the master-slave architecture on which the orchestration platform Kubernetes is based:
In addition to the core project Kubernetes, there are numerous tools and extensions that make it possible to add functions to the orchestration platform. The most popular are the monitoring and error diagnosis tools Prometheus, Weave Scope, and Sysdig, as well as the package manager Helm. Plugins also exist for Apache Maven and Gradle, as well as a Java API for the remote control of Kubernetes.
Shipyard is a management solution developed by the Docker user community and based on Swarm. It allows users to maintain Docker resources such as containers, images, hosts, and private registries via a graphical user interface. This is available as a web application in the browser, which sets it apart from the desktop app Kitematic.
The software is 100% compatible with the Docker remote API and uses the open source NoSQL database RethinkDB to store data on user accounts, addresses, and events.
Besides the cluster management functions via a central web interface, Shipyard also offers user authentication as well as role-based access control.
The software is based on the cluster management toolkit Citadel and essentially consists of three main components: controller, API, and UI.
- Shipyard controller: The controller is the core component of the management tool Shipyard. The Shipyard controller integrates with the RethinkDB as part of the data storage and makes it possible to address individual hosts in a Docker cluster and to control events.
- Shipyard API: The Shipyard API is based on REST. All functions of the management tool are controlled via the Shipyard API.
- Shipyard user interface (UI): The Shipyard UI is an AngularJS app, which presents users with a graphical user interface for the management of Docker clusters in the web browser. All interactions in the user interface take place via the Shipyard API.
Further information about Shipyard can be found on the official website of the open source project.
The developers of the open source software project Panamax have set themselves the goal of simplifying the deployment of multi-container apps. The free tool offers users a graphical user interface via which they can comfortably develop, deploy, and distribute complex applications based on Docker containers using drag-and-drop.
Panamax makes it possible to save complex multi-container applications as application templates, and distribute them in cluster architectures with just a click. Via an integrated app marketplace hosted by GitHub, templates for self-made applications can be stored in repositories and made available to other users. The following animation demonstrates the Panamax concept with the motto “Docker Management for Humans”:
Panamax is offered by CenturyLink as an open source software under the Apache 2 license. The basic components of the Panamax architecture are divided into two groups: the Panamax local client, and any number of remote deployment targets.
The Panamax local client is the core component of the Docker tool. This is executed on the local system and allows complex container-based applications to be created. The local client comprises the following components:
- CoreOS: Installation of the Panamax local client requires the Linux distribution CoreOS, which is specifically designed for software containers, as its host system. The development team recommends that users install CoreOS on the local system via Vagrant and Oracle VirtualBox. The classic software structure of a Panamax installation is as follows: The base is a local computer with any operating system. Users install CoreOS on this with the help of the virtualization software VirtualBox. The Panamax client is then run as a Docker container in CoreOS. The software was specially developed as a Docker client for the lean Linux distribution. In addition to the Docker features, users have access to various CoreOS functions. These include Fleet and Journalctl, among others:
- Fleet: Instead of integrating directly with Docker, the Panamax Client uses the cluster manager Fleet to orchestrate its containers. Fleet is a cluster manager that controls the Linux daemon systemd in computer clusters.
- Journalctl: The Panamax client uses Journalctl to request log messages from the Linux system manager systemd from the journal.
- Local client installer: The local client installer contains all components necessary for installing the Panamax client on a local system.
- Panamax local agent: The central component of the local client is the local agent. This is linked to various other components and dependencies via the Panamax API. Among others, these include the local Docker host, the Panamax UI, external registries, and the remote agents on the deployment targets in the cluster. The local agent integrates with the following program interfaces on the local system via the Panamax API to exchange information about running applications:
- Docker remote API: Panamax searches for images on the local system via the Docker remote API and obtains information about running containers.
- etcd API: Files are transmitted to the CoreOS Fleet daemon via the etcd API.
- systemd-journal-gatewayd.services: Panamax obtains the journal output of running services via systemd-journal-gatewayd.services.
In addition, the Panamax API also enables interactions with various external APIs.
- Docker registry API: Panamax obtains image tags from the Docker registry via the Docker registry API.
- GitHub API: Panamax loads templates from the GitHub repository using the GitHub API.
- KissMetrics API: The KissMetrics API collects data about templates that users run.
- Panamax UI: The Panamax UI functions as a user interface on the local system and enables users to control the Docker tool via a graphical interface. User input is directly forwarded to the local agent via Panamax API. The Panamax UI is based on the CTL Base UI Kit, a library of UI components for web projects from CenturyLink.
Each node in a Docker cluster without management tasks is called a remote deployment target in Panamax terminology. Deployment targets consist of a Docker host, which is configured to deploy Panamax templates with the help of the following components:
- Deployment target installer: The deployment target installer starts a Docker host, complete with a Panamax remote agent and orchestration adapter.
- Panamax remote agent: If a Panamax remote agent is installed, then applications can be distributed via the local Panamax client to any desired endpoint in the cluster. The Panamax remote agent runs as a Docker container on every deployment target in the cluster.
- Panamax orchestration adapter: In the orchestration adapter, the program logic is provided for each orchestration tool available for Panamax in an independent adapter layer. Because of this, users have the option to always choose the exact orchestration technology that is supported by their target environment. Pre-configured adapters include Kubernetes and Fleet:
- Panamax Kubernetes adapter: In combination with the Panamax remote agent, the Panamax Kubernetes adapter enables the distribution of Panamax templates in Kubernetes clusters.
- Panamax Fleet adapter: In combination with the Panamax remote agent, the Panamax Fleet adapter enables the distribution of Panamax templates in clusters controlled with the help of the Fleet cluster manager.
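The idea of an independent adapter layer can be illustrated with the classic adapter pattern. The following Python sketch is purely hypothetical: none of the class names, method names, or message strings come from Panamax, and the adapters only return descriptions of what they would do instead of talking to a real cluster.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class OrchestrationAdapter(ABC):
    """Common interface that hides the orchestration tool behind it."""

    @abstractmethod
    def deploy(self, template: Dict[str, List[str]]) -> List[str]:
        """Deploy all images of a template and describe what was done."""


class KubernetesAdapter(OrchestrationAdapter):
    def deploy(self, template: Dict[str, List[str]]) -> List[str]:
        return [f"kubernetes: create pod for {image}" for image in template["images"]]


class FleetAdapter(OrchestrationAdapter):
    def deploy(self, template: Dict[str, List[str]]) -> List[str]:
        return [f"fleet: start unit for {image}" for image in template["images"]]


# The deployment target decides which adapter is used; the caller does not change.
ADAPTERS: Dict[str, Type[OrchestrationAdapter]] = {
    "kubernetes": KubernetesAdapter,
    "fleet": FleetAdapter,
}


def deploy_template(template: Dict[str, List[str]], target: str) -> List[str]:
    return ADAPTERS[target]().deploy(template)


template = {"images": ["nginx", "redis"]}
print(deploy_template(template, "kubernetes"))
print(deploy_template(template, "fleet"))
```

Because the calling code depends only on the shared interface, support for a further orchestration tool would mean adding one more adapter class and one more dictionary entry.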
The following graphic shows the interplay between the individual Panamax components in a Docker cluster:
The CoreOS-based Panamax container management tool provides users with a variety of standard container orchestration technologies through a graphical user interface, as well as a convenient option for managing complex multi-container applications in cluster architectures from any system (e.g. your own laptop).
With the public template repository, Panamax users have access via GitHub to a template library with various resources.
Drone is a lean continuous integration platform with minimal requirements. Use the tool written in Go to automatically load builds (individual development stages of a new software) from Git repositories such as GitHub, GitLab, or Bitbucket, and run them in isolated Docker containers for test purposes. You can run any test suite and send reports and status messages via e-mail. For every software test, a new container based on images from the public Docker registry is created. So, any publicly available Docker image can be used as the environment for the code to be tested.
“Continuous Integration” (CI) refers to a process in software development, in which newly developed software components – builds – are merged and run in test environments at regular intervals (usually at least once a day). Complex applications are often developed by large teams. CI is a strategy to recognize and resolve integration errors early, which can occur during the collaboration of different developers. Drone offers software developers the possibility to realize different test environments with the help of Docker containers.
Drone is integrated with Docker and supports various programming languages, such as PHP, Node.js, Ruby, Go, or Python. The container platform is the only true dependency. You can create your own personal continuous integration platform with Drone on any system on which Docker can be installed. Drone supports various version control repositories. You can find a guide for the standard installation with GitHub integration on the open source project’s website under readme.drone.io.
Control of the continuous integration platform takes place over a web interface. Here you can load software builds from any Git repository, merge them into applications, and run the result in a pre-defined test environment. For this, there is a .drone.yml file defined for each software test which specifies how to create and run the application.
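The principle of running the steps of a build one after another and stopping at the first failure can be sketched in Python. This is a minimal illustration under stated assumptions: it neither parses a .drone.yml file nor starts Docker containers, the function name and step names are invented, and the steps are simulated with small Python one-liners so that the example runs anywhere.

```python
import subprocess
import sys
from typing import List, Tuple


def run_pipeline(steps: List[Tuple[str, List[str]]]) -> List[Tuple[str, bool]]:
    """Run each step as a subprocess and stop after the first failing step."""
    results: List[Tuple[str, bool]] = []
    for name, argv in steps:
        succeeded = subprocess.run(argv).returncode == 0
        results.append((name, succeeded))
        if not succeeded:
            break
    return results


# Hypothetical steps: 'test' exits with a non-zero status to simulate a failure.
steps = [
    ("build", [sys.executable, "-c", "pass"]),
    ("test", [sys.executable, "-c", "raise SystemExit(1)"]),
    ("notify", [sys.executable, "-c", "pass"]),
]
print(run_pipeline(steps))
```

In a real continuous integration setup, each step would run inside a freshly created container rather than as a local process, so that no state leaks from one build to the next.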
In the following video, Drone co-developer Brad Rydzewski gives some insight into the open source continuous integration platform:
When it comes to building and operating open source-based cloud structures, OpenStack is the software solution of choice. The open source cloud operating system emerged in 2010 from a cooperation between the US web host Rackspace and the space agency NASA. The free software project, released under the Apache license, is supported by companies such as AT&T, SUSE, Canonical, HP, Intel, Red Hat, and IBM.
With OpenStack you can manage compute, storage, and network resources in a data center from a central dashboard and make them available to end users via a web interface.
The cloud operating system is based on a modular architecture consisting of six core components:
- Nova (central computation component): The central computing component of OpenStack goes by the nickname “Nova” and is basically the brain of the software architecture. Nova is responsible for the control of all the system’s computing entities and enables users to start virtual machines (VMs) based on images, run them in cloud environments, and manage them in clusters. VMs can be distributed over any number of nodes. As a cloud computing fabric controller (controlling entity for the computer network), Nova is the basic component for IaaS services based on OpenStack. Nova supports various hypervisor and virtualization technologies, as well as bare metal architectures where VMs are installed directly on the hardware. Commonly used are KVM, VMware, Xen, Hyper-V, or the Linux container software LXC.
- Neutron (network component): Neutron (formerly Quantum) is a portable, scalable, and API-supported system component used for network control. The module provides an interface for complex network topologies and supports various plugins through which extended network functions can be integrated.
- Cinder (block storage): Cinder is the nickname of a component in the OpenStack architecture that provides persistent block storage for the operation of virtual machines. The module provides virtual storage via a self-service API. Through this, end users can make use of storage resources without being aware of which device is providing the storage.
- Keystone (identity service): Keystone provides OpenStack users with a central identity service. The module functions as an authentication and permissions system between the individual OpenStack components. Access to projects in the cloud is regulated by tenants. Each tenant represents a user, and several user accesses with different rights can be defined for each.
- Glance (image service): With the Glance module, OpenStack provides a service that allows images of virtual machines to be stored and retrieved. The computation component Nova uses images provided by Glance as a template to create virtual machine instances.
- Swift (object storage): The OpenStack module Swift is used by the computation component Nova to save unstructured data objects and retrieve them via a REST-API. Swift is characterized by a redundant, scalable architecture and a high error tolerance.
Since the Havana release in 2013, the core component Nova has offered a hypervisor driver for the Docker engine. This embeds an HTTP client in the OpenStack architecture with which the internal Docker API can be addressed via a Unix socket.
The Docker driver for Nova makes it possible to retrieve images from the image service Glance and load them directly into the Docker file system. In addition, files can be exported from the container platform via the Docker standard command (docker save) and stored in Glance. A step-by-step manual about how to integrate Docker into an existing OpenStack architecture is provided on the OpenStack wiki.
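What addressing an HTTP API via a Unix socket looks like can be sketched with Python's standard library. The example is an illustration of the general technique, not the code of the Nova driver: the class and function names are assumptions, and the sketch only defines the client without connecting anywhere. It requires a Unix-like system.

```python
import http.client
import json
import socket


class UnixSocketHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a local Unix socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 5.0) -> None:
        # The host name is only used for the Host header, not for routing.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def get_json(socket_path: str, path: str) -> dict:
    """Send a GET request over the socket and decode the JSON response."""
    connection = UnixSocketHTTPConnection(socket_path)
    try:
        connection.request("GET", path)
        response = connection.getresponse()
        return json.loads(response.read().decode("utf-8"))
    finally:
        connection.close()
```

On a host with a running Docker daemon that listens on the default socket, a call such as `get_json("/var/run/docker.sock", "/version")` would return the daemon's version information, provided the calling user is allowed to access the socket.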
The OpenStack Foundation also offers a 90-minute workshop on Docker integration on YouTube.
In addition to the six core components, the OpenStack architecture can be extended by various optional modules:
- Horizon (dashboard): The OpenStack dashboard is nicknamed “Horizon” and offers users a web-based, graphical user interface for controlling other components like Nova or Neutron.
- Heat (orchestration): The optional module Heat provides a service for orchestrating multiple components of existing cloud applications via templates. To do this, Heat offers a native REST API and the HOT template format. The AWS CloudFormation template format and a corresponding query API are also supported.
- Ceilometer (telemetry service): Ceilometer is OpenStack’s central telemetry service, with which data can be collected for statistical surveys on the subject of billing or benchmarking.
- Trove (database-as-a-service): Trove is a database-as-a-service component that provides both relational and non-relational databases.
- Zaqar (messaging): The optional module Zaqar (formerly Marconi) extends OpenStack with a multi-tenant cloud messaging service for web developers. The software offers a REST API for sending messages between different components of a SaaS or mobile app.
- Designate (DNS service): Designate offers DNS-as-a-Service for OpenStack. Domain records are managed via a REST API. The module is multi-tenant capable and relies on Keystone for authentication.
- Barbican (key management): The Barbican module provides a REST API for storing, managing, and provisioning passwords, keys, and certificates.
- Sahara (elastic map reduce): The optional component Sahara enables users to create and manage Hadoop clusters based on OpenStack.
- Ironic (bare metal driver): The additional module Ironic is a fork of the baremetal driver of the computational component Nova, and gives OpenStack users the possibility to provide bare metal machines instead of virtual machines.
- Murano (application catalog): Murano gives developers and cloud administrators the ability to deploy applications in a searchable catalog sorted by category.
- Manila (shared file system): Manila is an additional module for OpenStack that extends the architecture of the cloud operating system with shared and distributed file systems.
- Magnum (container API service): An important extra component is the API service Magnum, which provides container orchestration tools such as Docker Swarm, Kubernetes, or Apache Mesos as OpenStack components.
Because of its modular expandability, OpenStack has established itself as one of the leading operating systems for the construction and operation of open source-based cloud structures. For Docker users, OpenStack and its Nova Docker driver are a powerful platform that allows containers to be run and managed professionally in complex distributed systems.
DC/OS (Data Center Operating System) is an open source software for the operation of distributed systems developed by Mesosphere Inc. The project is based on the open source cluster manager Apache Mesos and is an operating system for data centers. The source code is available to users under the Apache license Version 2 on GitHub. The developers also market an enterprise version of the software at mesosphere.com. Extensive project documentation can be found on dcos.io.
Think of DC/OS as a kind of Mesos distribution that provides all the functions of the cluster manager via a central control interface. It also extends Mesos with a comprehensive infrastructure whose components make it much easier to operate applications on distributed systems, in the cloud, or in on-premises environments. The development team also offers a short video demonstration of the basic DC/OS functions.
DC/OS uses the distributed system core of the Mesos platform. This makes it possible to bundle the resources of an entire data center and manage them in the form of an aggregated system like a single logical server. This way, you can control entire clusters of physical or virtual machines with the same ease with which you operate a single computer.
The software simplifies the installation and management of distributed applications and automates tasks such as resource management, scheduling, and inter-process communication. The management of a cluster based on Mesosphere DC/OS, as well as its included services, takes place over a central command line program (CLI) or web interface (GUI).
DC/OS isolates the resources of the cluster and provides shared services, such as service discovery or package management. The core components of the software run in a protected area – the kernel space. This includes the master and agent programs of the Mesos platform, which are responsible for resource allocation, process isolation, and security functions.
- Mesos master: The Mesos master is a master process that runs on a master node. The purpose of the Mesos master is to control resource management and orchestrate tasks (abstract work units) that are carried out on an agent node. To do this, the Mesos master distributes resources to registered DC/OS services and accepts resource reports from Mesos agents.
- Mesos agents: Mesos agents are slave processes that run on agent nodes and are responsible for executing the tasks distributed by the master. Mesos agents deliver regular reports about the available resources in the cluster to the Mesos master. The Mesos master forwards these reports to a scheduler (e.g. Marathon, Chronos, or Cassandra), which decides which task to run on which node. To run a task, the Mesos agent starts a so-called executor process, which is run and managed in isolation in a container. The process is isolated using either the Mesos Universal Container Runtime or the Docker container platform.
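The interplay between resource reports and scheduling decisions described above can be illustrated with a small Python sketch. It uses a simple first-fit strategy, which is an assumption made for illustration and not the algorithm used by Mesos, Marathon, or any other scheduler. The names `Offer`, `Task`, and `schedule` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Resources that one agent reports as available."""
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    """Abstract work unit with its resource demand."""
    name: str
    cpus: float
    mem_mb: int


def schedule(offers, tasks):
    """First-fit placement: each task goes to the first agent with enough room."""
    remaining = {o.agent: [o.cpus, o.mem_mb] for o in offers}
    placement = {}
    for task in tasks:
        for agent, (cpus, mem) in remaining.items():
            if task.cpus <= cpus and task.mem_mb <= mem:
                remaining[agent][0] -= task.cpus
                remaining[agent][1] -= task.mem_mb
                placement[task.name] = agent
                break
        else:
            # No agent can currently host this task.
            placement[task.name] = None
    return placement


offers = [Offer("agent-1", 2.0, 2048), Offer("agent-2", 4.0, 8192)]
tasks = [
    Task("web", 1.5, 1024),
    Task("db", 2.0, 4096),
    Task("cache", 0.5, 512),
    Task("huge", 16.0, 1024),
]
placement = schedule(offers, tasks)
```

In this example, `web` and `cache` end up on `agent-1`, `db` on `agent-2`, and `huge` stays unplaced because no agent offers 16 CPUs.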
All other system components as well as applications run by the Mesos agents via executor run in the user space. The basic components of a standard DC/OS installation are the admin router, the Mesos DNS, a distributed DNS proxy, the load balancer Minuteman, the scheduler Marathon, Apache ZooKeeper, and Exhibitor.
- Admin router: The admin router is a specially configured webserver based on NGINX that provides DC/OS services as well as central authentication and proxy functions.
- Mesos DNS: The system component Mesos DNS provides service discovery functions that enable individual services and applications in the cluster to identify each other through a central domain name system (DNS).
- Distributed DNS proxy: The distributed DNS proxy is an internal DNS dispatcher.
- Minuteman: The system component Minuteman functions as an internal load balancer that works on the transport layer (Layer 4) of the OSI reference model.
- DC/OS Marathon: Marathon is a central component of the Mesos platform that functions in the Mesosphere DC/OS as an init system (similar to systemd). Marathon starts and supervises DC/OS services and applications in cluster environments. In addition, the software provides high-availability features, service discovery, load balancing, health checks, and a graphical web interface.
- Apache ZooKeeper: Apache ZooKeeper is an open source software component that provides coordination functions for the operation and control of applications in distributed systems. ZooKeeper is used in Mesosphere DC/OS for the coordination of all installed system services.
- Exhibitor: Exhibitor is a system component that is automatically installed and configured with ZooKeeper on every master node. Exhibitor also provides a graphical user interface for ZooKeeper users.
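As an illustration of how a Docker-based service might be described for Marathon, the following Python sketch builds an app definition and serializes it to JSON. The helper `marathon_docker_app`, the app ID, and the image name are hypothetical. The field names follow the Marathon app definition format as commonly documented and should be checked against the documentation of the Marathon version in use.

```python
import json


def marathon_docker_app(app_id, image, instances=2, cpus=0.5, mem_mb=256):
    """Build a Marathon-style app definition for a Docker container."""
    return {
        "id": app_id,
        "instances": instances,  # number of copies Marathon keeps running
        "cpus": cpus,            # CPU share per instance
        "mem": mem_mb,           # memory per instance in MB
        "container": {
            "type": "DOCKER",
            "docker": {"image": image},
        },
        "healthChecks": [
            {
                "protocol": "HTTP",
                "path": "/",
                "intervalSeconds": 30,
                "maxConsecutiveFailures": 3,
            }
        ],
    }


# Hypothetical app ID and image, chosen only for illustration.
app = marathon_docker_app("/demo/web", "nginx", instances=3)
app_json = json.dumps(app, indent=2)
```

The resulting JSON document is the kind of payload that would be submitted to Marathon through its REST API or web interface.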
Diverse workloads can be executed at the same time on cluster resources aggregated via DC/OS. For example, big data systems, microservices, and container platforms such as Hadoop, Spark, and Docker can run in parallel on the cluster operating system.
With Mesosphere Universe, a public app catalog is available for DC/OS. With this, you can install applications like Spark, Cassandra, Chronos, Jenkins, or Kafka easily with a click on the graphical user interface.
Docker container as part of a digital infrastructure platform
Docker has successfully managed to revamp the IT industry from the ground up. If you take a look at the statistics of the open source project, it’s hard to believe that the first release of the container platform was only four years ago, on March 13, 2013.
Docker users have downloaded more than 8 billion container images since then, and around half a million different Docker apps are available to users in the registry. The project is based on a developer community of around 3,000 contributors. Today, the Docker ecosystem encompasses a vast array of tools, extensions, and software platforms that make working with Docker containers ever more efficient. According to the company's own figures, the container technology developed by Docker is currently involved in more than 100,000 third-party projects. But what is the success of the Docker platform and its side projects based on?
In times when more and more software users are being drawn into the cloud, the value of an application increases with its degree of networking and distribution. This trend is also reflected in the way software is developed, tested, and executed today. Cloud infrastructure offers companies the attractive option of managing IT resources centrally and providing location-independent access. Concepts like infrastructure-as-a-service (IaaS) make business processes independent of the hardware resources of a company's own server rooms, and so offer maximum flexibility and scalability through on-demand billing models.
Professionally operated data centers or large hosting providers also offer strategies for failure safety, high availability, and business continuity, which small and medium-sized companies can rarely implement in their own facilities. Against this backdrop, a lightweight virtualization technology that allows applications with minimal overhead to be transferred to a portable, platform-independent format and distributed with a single command line directive in cloud infrastructures and hybrid environments really hits the mark.
Docker capitalizes on the “digital infrastructure platform” trend (infrastructure as a cloud-based service) like no other technology. The Docker ecosystem makes it possible to break up complex applications into microservices, which can then be distributed via APIs over various nodes in a cluster. Companies gain not just agility and flexibility, but also stability. Multi-container apps whose processes run on various hosts in a distributed system can be scaled horizontally and enjoy fail-safe operation thanks to redundancies. The extensive independence of containers in encapsulated microservices also ensures that updates and patches are only applied to parts of an application, and never interfere with the application as a whole. Software errors can be resolved with much less effort as a result.
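How redundancy leads to fail-safe operation can be sketched in a few lines of Python. The example distributes requests round-robin across the replicas of one microservice and skips replicas that are marked as down. The class `ReplicaSet` and the host names are hypothetical, and real orchestration tools use more elaborate health checks and balancing strategies.

```python
import itertools


class ReplicaSet:
    """Round-robin selection over the healthy replicas of one microservice."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.healthy = set(self.replicas)
        self._counter = itertools.count()

    def mark_down(self, replica):
        self.healthy.discard(replica)

    def mark_up(self, replica):
        if replica in self.replicas:
            self.healthy.add(replica)

    def pick(self):
        """Return the next healthy replica, or raise if none is left."""
        candidates = [r for r in self.replicas if r in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy replica available")
        return candidates[next(self._counter) % len(candidates)]


service = ReplicaSet(["host-a", "host-b", "host-c"])
before_failure = [service.pick() for _ in range(3)]
service.mark_down("host-b")  # simulate the failure of one host
after_failure = [service.pick() for _ in range(4)]
```

As long as at least one replica remains healthy, requests continue to be served. Scaling horizontally amounts to adding further entries to the replica list.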
Even when changing from traditional infrastructures to a digital infrastructure platform, container technology can still play a key role: Most of all, the ability to pack applications and all of their dependencies into a portable container and distribute them across platforms (deployment) helps administrators to transfer old systems (i.e. legacy applications) from static IT to dynamic IT.
IT environments can be divided into two basic areas, based on the increasing popularity of cloud-based infrastructures: Static IT includes enterprise and backend applications that operate on classic infrastructures in on-premise environments or on a private cloud. If companies create new applications and systems, these are more and more often implemented directly on public cloud infrastructure platforms in order to ensure maximum scalability, flexibility, and location independence. For this area, the term dynamic IT was established.
The use of a container platform like Docker also offers the chance to optimize applications in regard to speed, performance, and change management. But there are still concerns about the security of container technology.
Given that you have the container engine installed, you can run container-based applications on any system. Containers include not only the code of the actual application, but also all binary files and libraries required for the runtime of the application. This primarily simplifies the deployment, or distribution of software on various systems – and not only for newly developed applications. The container technology is also effectively used when it comes to legacy applications – aka out-of-date applications still considered essential for daily operations – and transferring them to the cloud.
Older software components are often difficult to integrate into cloud infrastructures because of their dependencies on particular operating systems, frameworks, or classes. Here, a lean container engine like Docker can provide an abstraction layer that compensates for code dependencies without the need for administrators to take the overhead of a virtual operating system into account.
Speed and performance
If you compare the container technologies that the Docker platform is based on with classic hardware virtualization via VM, the differences in terms of speed and performance are very clear.
Since applications encapsulated in containers rely directly on the kernel of the host system, no separate systems have to be started for this type of virtualization. Instead of a hypervisor, Docker uses a lightweight container engine. This eliminates not only the boot-up time for a virtual machine, but also the resource consumption of the additional operating system. Container apps are ready more quickly and need fewer resources than applications based on virtual machines. This enables administrators to start significantly more containers on one host system than would be possible with guest systems on a comparable hardware platform. For example, a server can host from 10 to 100 virtual machines, but around 1,000 containers.
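A rough back-of-the-envelope calculation shows where this order-of-magnitude difference can come from. All figures in the following Python sketch are assumptions chosen purely for illustration (256 GiB of host memory, 256 MiB per application, 2 GiB of overhead per guest operating system). They are not measurements.

```python
def instances_per_host(host_mem_mib, app_mem_mib, overhead_mem_mib=0):
    """How many instances fit into host memory, ignoring CPU and I/O limits."""
    return host_mem_mib // (app_mem_mib + overhead_mem_mib)


HOST_MEM_MIB = 256 * 1024   # assumed host with 256 GiB RAM
APP_MEM_MIB = 256           # assumed memory demand of the application itself
GUEST_OS_MEM_MIB = 2048     # assumed overhead of a full guest operating system

vms = instances_per_host(HOST_MEM_MIB, APP_MEM_MIB, GUEST_OS_MEM_MIB)
containers = instances_per_host(HOST_MEM_MIB, APP_MEM_MIB)
```

Under these assumptions, the host has room for 113 virtual machines but 1,024 containers, because the containers do not each carry their own operating system.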
There’s also the added fact that containers which include only binary files and libraries in addition to the application code need far less storage space than a virtual machine, including the application and operating system.
DevOps and change management
Digitalization and the trend toward mobile internet use have significantly accelerated the life cycle of applications. It’s not only updates, patches, and bug fixes that need to be provided as soon as possible. New software versions are also being released at shorter and shorter intervals.
But deploying updates presents challenges for developers and administrators. While software manufacturers would like to present their customers with new functions for an application as quickly as possible, administrators fear the risk of failure that any change to the IT infrastructure and its software components brings with it. Strategies for solving difficulties of this type are grouped under the keyword DevOps. The term goes back to DevOpsDays, an international conference launched in 2009, at which approaches for improving the cooperation between development and IT operations were discussed for the first time.
The goal of DevOps is to improve the quality of new software versions as well as to accelerate development, distribution, and implementation through more effective cooperation between all involved parties and extensive automation. The DevOps tasks that can be automated include build processes from the repository, static and dynamic code analysis, as well as module, integration, system, and performance tests. At the heart of the DevOps approach are the concepts of continuous integration (CI) and continuous delivery (CD), two central application fields of the Docker platform.
Continuous delivery is an approach meant to optimize software distribution using various technologies, processes, and tools at the touch of a button. Central to this is an automation of the so-called deployment pipeline.
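The idea of an automated deployment pipeline can be sketched as a sequence of stages that run in a fixed order and stop as soon as one of them fails. The following Python example is a simplified illustration only. The function `run_pipeline` and the stage names are hypothetical and do not reflect the API of any particular CI/CD tool.

```python
def run_pipeline(stages):
    """Run (name, step) pairs in order and stop at the first failing step.

    Each step is a callable returning True on success. The function returns
    the list of completed stage names and the name of the failed stage
    (or None if every stage passed).
    """
    completed = []
    for name, step in stages:
        if not step():
            return completed, name
        completed.append(name)
    return completed, None


executed = []


def make_step(name, succeeds):
    """Create a dummy step that records its execution."""
    def step():
        executed.append(name)
        return succeeds
    return step


stages = [
    ("build", make_step("build", True)),
    ("test", make_step("test", False)),    # simulated test failure
    ("deploy", make_step("deploy", True)),
]
completed, failed = run_pipeline(stages)
```

Here the simulated test failure stops the pipeline before the deploy stage is reached, which is the behavior that protects production systems from faulty releases.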
Docker offers integration possibilities for established CI/CD tools such as Jenkins, Travis, or Drone and makes it possible to automatically load code from the Docker Hub or version control repositories like GitHub, GitLab, or Bitbucket. The container platform represents a foundation for DevOps workflows in which developers can create new application components together and run them in any test environment.
Docker also supports on-premise, cloud, and virtual environments as well as various operating systems and distributions. One of the key benefits of a Docker-based CI/CD architecture is that companies are no longer slowed down by inconsistencies between different IT environments.
Even though containers run encapsulated processes on the same kernel, Docker uses a number of isolation techniques to shield them from one another. These rely on core features of the Linux kernel, such as cgroups and namespaces. Each container gets its own host name, its own process IDs, and its own network interface. Each container also only sees the part of the file system assigned to it. System resources like memory, CPU, and network bandwidth are allocated via the cgroups mechanism. This ensures that each container can only claim the share allocated to it.
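On a Linux system, the cgroup and namespace membership of a process can be inspected through the /proc file system. The following Python sketch parses lines in the `hierarchy-ID:controller-list:cgroup-path` format used by `/proc/<pid>/cgroup` and reads the namespace links under `/proc/<pid>/ns`. The function names and the sample text are hypothetical, and on non-Linux systems the namespace lookup simply returns an empty result.

```python
import os


def parse_cgroup_lines(text):
    """Parse the content of /proc/<pid>/cgroup into a list of dictionaries."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append({
            "hierarchy": int(hierarchy_id),
            # The controller list is empty for cgroup v2 entries ("0::/path").
            "controllers": controllers.split(",") if controllers else [],
            "path": path,
        })
    return entries


def current_namespaces(pid="self"):
    """Map namespace names (pid, net, mnt, ...) to their identifiers."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return result  # not on Linux, or /proc is not available
    for name in names:
        try:
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            pass
    return result


# Made-up sample content, as it might look inside a container.
sample = (
    "12:memory:/docker/abc123\n"
    "3:cpu,cpuacct:/docker/abc123\n"
    "0::/system.slice/docker.service\n"
)
entries = parse_cgroup_lines(sample)
```

Two processes that run in different containers typically show different identifiers for namespaces such as `pid` and `net`, while processes in the same container share them.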
Containers still don’t offer the same degree of isolation that can be accomplished with virtual machines. If an attacker hijacks a virtual machine, they still have little chance of interacting with the kernel of the underlying host system. Containers, as encapsulated instances on the same host kernel, give attackers significantly more freedom.
Despite the described isolation techniques, important kernel subsystems such as cgroups as well as kernel interfaces in the /sys and /proc directories can be reached from containers. This gives attackers the ability to circumvent the host’s security functions. In addition, all containers on a host system run in the same user namespace. As a result, a container that’s granted root privileges retains them even when interacting with the host kernel. Administrators should therefore make sure that all containers start with only restricted rights.
The Docker daemon, which is responsible for managing containers on the host system, also has root privileges. A user who has access to the Docker daemon automatically obtains access to all of the directories that the daemon can access, as well as the ability to communicate over a REST API via HTTP. The Docker documentation recommends granting daemon access only to trustworthy users.
The Docker development team has also recognized these security concerns as an obstacle to the establishment of container technology on production systems. In addition to the fundamental isolation techniques of the Linux kernel, newer versions of the Docker engine also support the frameworks AppArmor, SELinux, and Seccomp, which function as a type of firewall for kernel resources.
- AppArmor: AppArmor regulates the access rights of containers to the file system.
- SELinux: SELinux provides a complex regulatory system with which access control to kernel resources can be implemented.
- Seccomp: Seccomp (Secure Computing Mode) supervises the invoking of system calls.
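To give an impression of what supervising system calls looks like in practice, the following Python sketch generates a small allowlist profile in the JSON layout used by Docker seccomp profiles. The helper `allowlist_profile` is hypothetical, and the handful of system calls listed is far too short to run a real application. It only serves to illustrate the structure.

```python
import json


def allowlist_profile(syscall_names, architectures=("SCMP_ARCH_X86_64",)):
    """Build a seccomp profile that denies everything except the given calls."""
    return {
        # Every system call not listed below fails with an error.
        "defaultAction": "SCMP_ACT_ERRNO",
        "architectures": list(architectures),
        "syscalls": [
            {
                "names": sorted(set(syscall_names)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


# Illustrative list only; real workloads need many more system calls.
profile = allowlist_profile(["read", "write", "exit", "exit_group", "read"])
profile_json = json.dumps(profile, indent=2)
```

A profile saved in this form can be passed to a container at start time with the option `--security-opt seccomp=<path to profile>`.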
Docker also uses Linux capabilities to restrict the root permissions with which the Docker engine starts containers.
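Starting containers with restricted rights can be illustrated with a small Python helper that assembles a `docker run` command line. The function `hardened_run_command`, the user ID `1000:1000`, and the image name are hypothetical. The options used (`--user`, `--cap-drop`, `--cap-add`, `--security-opt no-new-privileges`, `--read-only`) are standard `docker run` flags, but which of them an application tolerates has to be tested case by case.

```python
import shlex


def hardened_run_command(image, user="1000:1000", add_caps=()):
    """Assemble a docker run command that starts a container with reduced rights."""
    cmd = [
        "docker", "run", "--rm",
        "--user", user,                          # do not run as root inside the container
        "--cap-drop", "ALL",                     # drop all Linux capabilities ...
        "--security-opt", "no-new-privileges",   # ... and forbid regaining privileges
        "--read-only",                           # mount the root file system read-only
    ]
    for cap in add_caps:
        cmd += ["--cap-add", cap]                # re-add only what is really needed
    cmd.append(image)
    return cmd


# Hypothetical example: a web server that only needs to bind to a low port.
command = hardened_run_command("nginx", add_caps=["NET_BIND_SERVICE"])
command_line = shlex.join(command)
```

The list form can be handed to `subprocess.run`, while `command_line` is a shell-ready string for logs or documentation.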
Other security concerns relate to software vulnerabilities within application components that are distributed via the Docker registry. Since basically anyone can create Docker images and make them publicly accessible to the community in the Docker Hub, there’s the risk of introducing malicious code to your system through an image download. Before deploying an application, Docker users should make sure that the entire code provided in an image for the execution of containers stems from a trustworthy source. As part of the container platform’s enterprise edition (EE), Docker has been offering a certification program since the beginning of 2017 through which infrastructure, container, and plugin providers can test and distinguish their software. To obtain a certificate, the following requirements must be fulfilled:
- Infrastructure certification: Software developers who would like to provide certified infrastructure components for the Docker ecosystem have to prove, according to the appropriate tests, that their product is optimized for collaboration with the Docker platform.
- Container certification: A container will only be awarded with the official Docker certificate if it’s created in accordance with best practices and has passed all software tests, vulnerability checks, and security audits.
- Plugins: A plugin for Docker EE can only be adorned with the Docker certificate if it’s developed in accordance with best practices and has passed all API compliance tests and vulnerability checks.
In addition to boosting security for users, Docker certifications are designed to give software developers a way to make their projects stand out from the large number of available resources on the market.
Conclusion: How secure is Docker for the future?
Container technology is no longer a niche topic. A number of market-leading companies, such as Amazon, Red Hat, and Cisco, are already converting parts of their IT to container systems. The technology is a resource-conserving alternative to classic hardware virtualization and is particularly suitable for companies that need to provide interactive web applications or handle large volumes of data. Increasing demand is expected in the coming years across all e-commerce industries. Global players like eBay and Amazon already use containers as their standard technology. Smaller online retailers will soon follow suit.
At the same time, within just a few years the fledgling software company Docker has established itself as a technological market leader in operating system level virtualization, thanks to its open source concept and its standardization efforts. But the competition in the container industry never sleeps. Users benefit from the speed of innovation that the intense competition between providers of container solutions and administration tools brings with it.
In the past, security concerns were the primary obstacle to the spread of container technology. The central challenge for developers of container-based virtualization technology is improving process isolation in the Linux kernel – an area which has seen progress through projects such as SELinux, AppArmor, and Seccomp.
Given the high acceptance rate for containers, it can be assumed that operating system level virtualization will establish itself as an alternative to virtual machines in the long term as well. The Docker project and its rapidly growing ecosystem in particular offer companies a future-proof basis for running software containers in both development and operations.