Using Gluster for a Distributed Docker Storage Volume

Learn how to install and use Gluster to provide distributed storage for Docker containers on a Linux server. Gluster is a distributed file system which allows you to create a single storage volume spanning multiple hosts.

Thanks to the Docker Volume plug-in for Gluster, Gluster is a natural choice for creating a distributed data storage volume for Docker containers. It provides an easy solution for situations where you need to run containers on multiple hosts (servers) which all need access to the same shared storage volume.

Requirements

  • At least two Linux servers running either Ubuntu 14.04 or CentOS 7.
  • Docker installed and running.
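
You can verify that Docker is installed and that the daemon is running with:

sudo docker info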

Firewall Rules

To begin, you will need to allow firewall access for the ports used by Gluster. Go to your Cloud Panel and click on Network -> Firewall Policies on the menu on the left.

If you have previously created a firewall policy, click to select that policy and scroll down to edit it. Otherwise, click Create to create your new firewall policy.

Give your policy a name, then fill in your firewall rules.

After filling in each rule, click the green + button to add it and open a new line.

Add the following rules:

TCP/UDP, from port 111 to port 111, ALL 
TCP/UDP, from port 24007 to port 24007, ALL
TCP/UDP, from port 24008 to port 24008, ALL 
TCP, from port 2049 to port 2049, ALL  
TCP, from port 38465 to port 38465, ALL    
TCP, from port 38466 to port 38466, ALL     
TCP, from port 38467 to port 38467, ALL

In addition to this list, you need to open one TCP/UDP port, starting at 49152, for each node (brick) that will be added to the shared storage pool.

This tutorial will use two nodes, so you will need to add the following two rules at a minimum:

TCP/UDP, from port 49152 to port 49152, ALL  
TCP/UDP, from port 49153 to port 49153, ALL   

IMPORTANT: If this is a new firewall policy, you will also want to add any firewall rules which apply to your existing services.

Click the Add Predefined Values button and click to select any services which apply. At the very least, be sure to add a rule for SSH, so that you can SSH to your server.

When you have finished adding your new firewall rules, click the Create button.

Next, you need to assign your servers to this firewall policy. Scroll down and click the Assign button.

Click to select the server(s) you want to assign to this firewall policy, then click Save changes.

Install Gluster on Ubuntu 14.04

To install Gluster on Ubuntu 14.04, start by using the command:

sudo apt-get update

Once all the packages have updated, run the command:

sudo apt-get install glusterfs-server

This will install Gluster and start the service.

You can check the status of Gluster with the command:

sudo service glusterfs-server status

Should you need to restart, stop, or start Gluster, the usual service commands will apply:

sudo service glusterfs-server restart
sudo service glusterfs-server stop
sudo service glusterfs-server start

Install Gluster on CentOS 7

To install Gluster on CentOS 7, start by using the command:

sudo yum update

Next, get the latest glusterfs-epel repository:

sudo wget -P /etc/yum.repos.d/ http://download.gluster.org/pub/gluster/glusterfs/LATEST/CentOS/glusterfs-epel.repo

And the latest EPEL repository:

sudo yum install http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

Install Gluster and Samba with the command:

sudo yum install glusterfs-server samba

Once Gluster and Samba have finished installing, start the Gluster service with the command:

sudo systemctl start glusterd.service

You can check the status of Gluster with the command:

sudo systemctl status glusterd.service

Should you need to restart, stop, or start Gluster, the usual systemctl commands will apply:

sudo systemctl restart glusterd.service
sudo systemctl stop glusterd.service
sudo systemctl start glusterd.service

Creating a Cluster

After installing Gluster, you can verify the installed version with the command:

sudo glusterfs --version

We will be using two servers for this tutorial, designated server1 and server2. You can use as many servers as you like; simply run the same commands on each one.

Gluster can address servers either by IP address or by hostname, provided the servers can resolve each other's hostnames. For this tutorial we will be using:

  • server1 IP address: 192.0.0.1
  • server2 IP address: 192.0.0.2

Substitute your own hostnames or IP addresses in the examples which follow.
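
If you prefer to work with hostnames, one option is to add entries like the following (a sketch using this tutorial's example addresses) to /etc/hosts on every server:

192.0.0.1   server1
192.0.0.2   server2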

To begin, have Gluster probe the other server(s) which you will be adding. From server1 probe server2:

sudo gluster peer probe 192.0.0.2

Gluster should respond with peer probe: success.

Then switch to server2 and probe server1:

sudo gluster peer probe 192.0.0.1

Gluster should again respond with peer probe: success.

Now that you have told Gluster which servers are in the cluster, test the server cluster with the command:

sudo gluster peer status

This should return a message like:

Number of Peers: 1

Hostname: 192.0.0.2
Uuid: 5d77763c-3043-4a1d-bdb7-e07a4d6dc46b
State: Peer in Cluster (Connected)

This message tells us that the two-server cluster is working correctly. There is 1 other peer in the cluster, IP address 192.0.0.2.

If you run the same command from server2 you will get a similar response, but it will list the IP address of server1 instead:

Number of Peers: 1

Hostname: 192.0.0.1
Uuid: 8186baf0-ef76-4b40-ae7d-6a3bd1957a3c
State: Peer in Cluster (Connected)

Creating a Gluster Volume

Now that the cluster is up and running, it's time to assign the storage volume.

To begin, create a storage directory on server1 which will be used by Gluster to store the data:

sudo mkdir -p /data/media

Create the same storage directory on server2.

To create the Gluster volume, use the command:

sudo gluster volume create [volume name] replica [replica number] transport tcp [Server 1 IP address or hostname]:[path to storage directory] [Server 2 IP address or hostname]:[path to storage directory] 

If you are using more servers, simply add their information to the end in the same format:

[server IP address or hostname]:[path to storage directory]

We are creating a two-replica volume, so we will use replica 2. If you are using more servers, change the replica number to match the number of servers (bricks) you are adding.

For this example we will create the volume with the command below. Note that the force option is required here because the bricks are located on the root partition, which Gluster would otherwise refuse with a warning:

sudo gluster volume create media replica 2 transport tcp 192.0.0.1:/data/media 192.0.0.2:/data/media force
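
If you were using three servers, the equivalent command (with 192.0.0.3 as a hypothetical third address) would be:

sudo gluster volume create media replica 3 transport tcp 192.0.0.1:/data/media 192.0.0.2:/data/media 192.0.0.3:/data/media force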

Now that the volume has been created, start the volume using the command:

sudo gluster volume start [volume name]

In this case the command will be:

sudo gluster volume start media

You can check the status of the volume with the command:

sudo gluster volume info

This will return results which look something like:

Volume Name: media
Type: Replicate
Volume ID: c5cadbea-0054-4c41-853e-b50733aacebf
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.0.0.1:/data/media
Brick2: 192.0.0.2:/data/media
Options Reconfigured:
performance.readdir-ahead: on
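
You can also check which process and TCP port each brick is using with:

sudo gluster volume status media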

You will need to mount the Gluster volume before you can use it properly. Although it is possible to write files directly to the brick directory (/data/media in our example), it is strongly recommended that you only work with the Gluster volume through a mount point.

The command to mount the Gluster volume is:

sudo mount -t glusterfs [hostname or IP address]:/[volume name] [mount point]

Note: By common convention, storage volumes are mounted to the /mnt directory. However, if you are already using that mount point, you will need to create a new one for Gluster. To do so, simply create the directory, then use the path to that directory in the command.
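
For example, to create a hypothetical /mnt/gluster mount point:

sudo mkdir -p /mnt/gluster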

This command will need to be repeated for every host which will be accessing the Gluster volume.

On server1 the command will be:

sudo mount -t glusterfs 192.0.0.1:/media /mnt

On server2 the command will be:

sudo mount -t glusterfs 192.0.0.2:/media /mnt
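
If you want the volume to be remounted automatically at boot, one approach is to add a line like the following to /etc/fstab on each server (a sketch for server1; the _netdev option waits for the network to come up before mounting):

192.0.0.1:/media /mnt glusterfs defaults,_netdev 0 0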

Next, test to make sure the volume is working. From server1 run the following command. (A plain sudo echo with a >> redirect would fail with a permission error, because the redirection is performed by your non-root shell; piping through sudo tee writes the file as root.)

echo "Hello from Server1" | sudo tee -a /mnt/hello-from-server1

If you list the files in the /mnt directory, you will see the file there:

ls /mnt

Switch to server2 and run the equivalent command:

echo "Hello from Server2" | sudo tee -a /mnt/hello-from-server2

If you now list the files in /mnt on either server, you will see both files, confirming that the volume is replicating correctly.

Install the Gluster Plug-In for Docker on Ubuntu 14.04

To begin, install Ethereum using the commands below (its PPA also carries a newer Go toolchain than the stock Ubuntu 14.04 repositories):

sudo apt-get install software-properties-common
sudo add-apt-repository ppa:ethereum/ethereum
sudo add-apt-repository ppa:ethereum/ethereum-dev
sudo apt-get update
sudo apt-get install ethereum

Next, install Go and Mercurial with the commands:

sudo apt-get install golang  
sudo apt-get install gccgo
sudo apt-get install mercurial

Update your environment variables by running the commands:

export GOPATH=$HOME/go
export PATH=$PATH:$HOME/.local/bin:$HOME/bin:$GOPATH/bin
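
These variables only apply to your current shell session. To make them persistent, you could append them to your ~/.bashrc (a sketch):

echo 'export GOPATH=$HOME/go' >> ~/.bashrc
echo 'export PATH=$PATH:$HOME/.local/bin:$HOME/bin:$GOPATH/bin' >> ~/.bashrc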

Finally, install the Docker Volume Plug-In for Gluster with the command:

go get github.com/calavera/docker-volume-glusterfs

Install the Gluster Plug-In for Docker on CentOS 7

To begin, install Go and set your Go environment variables with the commands:

sudo yum install golang
export GOPATH=$HOME/go
export PATH=$PATH:$HOME/.local/bin:$HOME/bin:$GOPATH/bin

Install Git and Mercurial using the commands:

sudo yum install git
sudo yum install mercurial

Finally, install the Docker Volume Plug-In for Gluster with the command:

go get github.com/calavera/docker-volume-glusterfs

Starting the Docker Volume Plug-In

To begin, start the plug-in on server1 with the command:

docker-volume-glusterfs -servers [Server 1 hostname or IP address]:[Server 2 hostname or IP address] &

Note: sudo will not work for the docker-volume-glusterfs command by default. You will either need to su to root, or add /root/go/bin/docker-volume-glusterfs to the list of available sudo commands on your system.
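
For example, assuming the plug-in binary was installed to /root/go/bin as noted above, you could switch to root and launch it directly:

sudo su -
/root/go/bin/docker-volume-glusterfs -servers 192.0.0.1:192.0.0.2 &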

At the end of this command, list all of the servers you are using, by IP address or hostname, separated by colons (no spaces).

For this example we are using two servers, so we will begin with the command:

docker-volume-glusterfs -servers 192.0.0.1:192.0.0.2 &

Switch to server2 and run the same command to start the plug-in. Repeat for any other server you are adding to the cluster.
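
You can confirm that the plug-in is running on each server with:

ps aux | grep docker-volume-glusterfs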

Using the Docker Volume Plug-In

To use the Gluster volume, simply add --volume-driver glusterfs to the docker run command, then use --volume to specify the remote volume:

sudo docker run --volume-driver glusterfs -v [name of Gluster volume]:[path to storage directory] [other flags and commands as wanted]

The Gluster plug-in must be running on each of the servers where you want to connect containers to the Gluster volume.

To begin, on server1 launch a container from the official CentOS 7 image and have it create a test file in the shared volume. Note that the echo command and its redirection are wrapped in sh -c so that they run inside the container rather than in your host shell:

sudo docker run --volume-driver glusterfs -v media:/mnt centos sh -c 'echo "Hello from a container" >> /mnt/hello-from-container'

Once the container has run, if you list the files in the host's /mnt directory, you will see the file hello-from-container:

ls /mnt

Next, switch to server2, launch a container from the official CentOS 7 image, and attach to it in a terminal:

sudo docker run -it --volume-driver glusterfs -v media:/mnt centos /bin/bash

Once you are at the container's command prompt, list the files in the shared storage volume with the command:

ls /mnt

You will see all three of the files we have created so far: hello-from-server1, hello-from-server2, and hello-from-container.
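
You can also see the replicated copies directly in each server's brick directory (although, as noted above, you should never write to the brick directly):

ls /data/media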

Basic Gluster Troubleshooting

If you have issues with Gluster, be sure to check the following:

1. Is Gluster running?

You can check the status of the Gluster daemon by using the commands:

  • Ubuntu 14.04: sudo service glusterfs-server status
  • CentOS 7: sudo systemctl status glusterd.service

If the daemon is not running, start it with the command:

  • Ubuntu 14.04: sudo service glusterfs-server start
  • CentOS 7: sudo systemctl start glusterd.service

2. Is the volume running?

Use the command:

sudo gluster volume info

You should see a line which reads:

Status: Started

If not, you may need to start the volume with the command:

sudo gluster volume start [volume name]

3. Are all the peers connected?

You can check the status of all of your cluster's peers with the command:

sudo gluster peer status

This will return the number of other peers in the cluster. For example, if you have two servers in a cluster, the output will say:

Number of Peers: 1

It will also list the IP address or hostname of the other peers.

4. Are all the ports open?

Gluster needs a number of ports open in order to operate, including one port for each host (brick) you add to the cluster. These rules need to be added in your Cloud Panel and to any other firewall you may be using on your server(s).
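
One way to check which ports Gluster is actually listening on (assuming the net-tools package is installed) is:

sudo netstat -tulpn | grep gluster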

5. Tell Gluster to resync ("self-heal")

Gluster's daemon will automatically resync files ("self-heal") when there are problems. However, sometimes it can be useful to manually trigger this operation.

To force Gluster to resync, use the command:

sudo gluster volume heal [volume name]

For example, to resync the volume we created in this tutorial named media the command would be:

sudo gluster volume heal media
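
To check the progress of a heal and see which files are still pending, you can use:

sudo gluster volume heal media info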