Sathish Kumar

Docker Networking - Part 3 (Overlay Driver)

Updated: Jan 15, 2021



In the previous article, I gave an introduction to Docker Swarm. Docker Swarm enables users to run microservices (containers) across multiple hosts, thereby providing redundancy and protection in case of hardware failure. The manager nodes in a swarm cluster ensure that the required number of replicas is always running: if a host reboots, the manager nodes spin up the missing replicas on hosts that are healthy.


So, how do containers running on different swarm hosts communicate with each other? In this article, I will try to answer this question.



Nodes in a swarm cluster need not share the same subnet; they can sit anywhere as long as they can reach each other over IP. To become a member of the swarm cluster, all a node needs is the manager node's IP address and a join token. Given this, if we were to deploy a web service in a cluster, the following network requirements come into play:

  1. The web service should be accessible to the external world through the host IP address of any swarm node. This could be used for load balancing.

  2. Containers running on different cluster nodes should be able to communicate with each other.

Requirement "1" is taken care of by the docker swarm implementation. When a service is created with a published port, docker swarm automatically opens that port on all swarm nodes, so the service can be accessed with any swarm node's IP address. Further, docker implements something called the "routing mesh" along with a load balancer, which internally balances traffic between the containers running the service on different nodes.
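
The routing mesh behaviour described above can be pictured with a small model. The Python sketch below is only an illustration: the RoutingMesh class and its method names are hypothetical, and round-robin selection is an assumption made for simplicity, not a statement of how docker's internal load balancer is implemented. It shows that a request arriving on any node's IP address can be answered by a replica running on any node.

```python
from itertools import cycle


class RoutingMesh:
    """Toy model: every node accepts traffic on the published port and
    hands each request to the next replica, wherever that replica runs."""

    def __init__(self, nodes, replicas):
        # nodes: node IP addresses that listen on the published port
        # replicas: (task name, node hosting it) pairs
        self.nodes = set(nodes)
        self._next_replica = cycle(replicas)  # round-robin is an assumption

    def handle(self, node_ip):
        """Return the replica that serves a request received on node_ip."""
        if node_ip not in self.nodes:
            raise ValueError(f"{node_ip} is not a swarm node")
        return next(self._next_replica)


if __name__ == "__main__":
    # Illustrative values only: two node addresses and two replicas.
    mesh = RoutingMesh(
        nodes=["192.168.68.109", "192.168.68.110"],
        replicas=[("web.1", "sathish-vm1"), ("web.2", "sathish-vm2")],
    )
    # All requests enter through the same node, yet both replicas serve them.
    for _ in range(4):
        print(mesh.handle("192.168.68.109"))
```

The point of the model is that the entry node and the serving node are independent of each other, which is why any node's IP address can be handed to an external load balancer.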


Docker provides a network driver called overlay, which makes requirement "2" possible.


Before creating an overlay network, we need to ensure the docker swarm cluster is created. You can look at this article for a brief overview of how to create a swarm cluster.


I have already created a 2-node cluster



root@sathish-vm1:/home/sathish# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
dtbd1uwpw2q958e9jkf5nnd52 *   sathish-vm1         Ready               Active              Leader              19.03.12
9ea38z6jayedmyh813ekzoo80     sathish-vm2         Ready               Active                                  19.03.12

Let's create the overlay network

root@sathish-vm1:/home/sathish# docker  network create web --driver overlay
s7uhj2h4m6talob0z6yxl0l4t

root@sathish-vm1:/home/sathish# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
01f45c79f691        bridge              bridge              local
bf422f628443        docker_gwbridge     bridge              local
a2eeae8a2473        host                host                local
szmt721xjbw5        ingress             overlay             swarm
62ca27e1244c        none                null                local
s7uhj2h4m6ta        web                 overlay             swarm

The --driver overlay option in the above command creates an overlay network and, as we can see in the SCOPE column, its scope is the swarm rather than a single host.


Let's create a service with 6 web replicas in the swarm cluster, attached to the newly created web overlay network:



root@sathish-vm1:/home/sathish# docker service create --replicas 6 --network web --name web -p 80:80 httpd
thb128sv9vx5u5ln26bwiohoy
overall progress: 0 out of 6 tasks
overall progress: 6 out of 6 tasks
1/6: running   [==================================================>]
2/6: running   [==================================================>]
3/6: running   [==================================================>]
4/6: running   [==================================================>]
5/6: running   [==================================================>]
6/6: running   [==================================================>]
verify: Service converged

root@sathish-vm1:/home/sathish# docker service ps web
ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE               ERROR               PORTS
jaliyp3uy95b        web.1               httpd:latest        sathish-vm1         Running             Running about an hour ago
mwhpczmpiy6n        web.2               httpd:latest        sathish-vm2         Running             Running about an hour ago
jnayg3jrhvmi        web.3               httpd:latest        sathish-vm1         Running             Running about an hour ago
93j50f81fp2d        web.4               httpd:latest        sathish-vm2         Running             Running about an hour ago
h1ovrklo2ras        web.5               httpd:latest        sathish-vm1         Running             Running about an hour ago
ntp9u8ljk9ij        web.6               httpd:latest        sathish-vm2         Running             Running about an hour ago

Now that 6 instances are created and distributed across VM1 and VM2, let's check how they are attached to the "web" network with docker network inspect.



On VM1

root@sathish-vm1:/home/sathish# docker network  inspect  web
[
    {
        "Name": "web",
        "Id": "uecxc14ur67nvolfqtsimuodz",
        "Created": "2020-08-26T05:10:35.625292416Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.1.0/24",
                    "Gateway": "10.0.1.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "57720052786d5181f833898285c50ec95df57a55bb74814b30eb2ca16ba0f01a": {
                "Name": "web.5.h1ovrklo2ras0chpvku37qbe8",
                "EndpointID": "ea9a043b4c6e27cad92be2542a66b7779ba70a9d7ad5a1e39d7f751c52a8aba3",
                "MacAddress": "02:42:0a:00:01:04",
                "IPv4Address": "10.0.1.4/24",
                "IPv6Address": ""
            },
            "613d727ca1069fb557dd96f3b8a5aec50ba6891890221a331e2e9e3fe09bea3b": {
                "Name": "web.1.jaliyp3uy95bae70y07rrcsjw",
                "EndpointID": "9153d6dcd59b832d40630e2d71f1eef2ad150cc2d274fd25130bd98a35047d1f",
                "MacAddress": "02:42:0a:00:01:06",
                "IPv4Address": "10.0.1.6/24",
                "IPv6Address": ""
            },
            "effff3309e01f9cf7d93964fd9e9c7353a3667c701db24d3c91e552e61708aee": {
                "Name": "web.3.jnayg3jrhvmio4t7q25ksv5yp",
                "EndpointID": "16cad75311929bf0d19d9ba1ad529dc8b0ee6ff3fdb86812f5c521ccac4339c1",
                "MacAddress": "02:42:0a:00:01:08",
                "IPv4Address": "10.0.1.8/24",
                "IPv6Address": ""
            },
            "lb-web": {
                "Name": "web-endpoint",
                "EndpointID": "c05a2798b05513f6195a93b28eec2ce7c3b12e8866210a3ab45005baa7aa56bb",
                "MacAddress": "02:42:0a:00:01:0a",
                "IPv4Address": "10.0.1.10/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4097"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "1fc3633ec5ac",
                "IP": "192.168.68.109"
            },
            {
                "Name": "7147a50520de",
                "IP": "192.168.68.110"
            }
        ]
    }
]

and on VM2



root@sathish-vm2:/home/sathish# docker network  inspect web
[
    {
        "Name": "web",
        "Id": "uecxc14ur67nvolfqtsimuodz",
        "Created": "2020-08-26T05:10:35.621205724Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.1.0/24",
                    "Gateway": "10.0.1.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "5f8954fd3645e29252d77b6324290ad5c43d449a53d12a8f300e1ac85680a0eb": {
                "Name": "web.4.93j50f81fp2deru56arz0uhjx",
                "EndpointID": "2af0d83e92cb33bcb167120973f0838602785287ca1a7d6cb0d4cdb7696c64e6",
                "MacAddress": "02:42:0a:00:01:03",
                "IPv4Address": "10.0.1.3/24",
                "IPv6Address": ""
            },
            "857e10dde48fe938f466d581d732ed6282723c5cc53f0a96272acb7dc4d2f61d": {
                "Name": "web.6.ntp9u8ljk9ijj0ld96382k627",
                "EndpointID": "b31559d19d6de83534730b2b3fa33d557d308d9ed74d0eddb4b644d619877217",
                "MacAddress": "02:42:0a:00:01:05",
                "IPv4Address": "10.0.1.5/24",
                "IPv6Address": ""
            },
            "9a7e34cca22ad78ec9850970cda50ac0f93b364fbd0b4340ca391f8d69c5f285": {
                "Name": "web.2.mwhpczmpiy6nvrqdhrgxcgwr4",
                "EndpointID": "e339bb448daaf5c5efb5040c7db481122e7b48c811cc51510a5a66aac83fc4a4",
                "MacAddress": "02:42:0a:00:01:07",
                "IPv4Address": "10.0.1.7/24",
                "IPv6Address": ""
            },
            "lb-web": {
                "Name": "web-endpoint",
                "EndpointID": "b1115a371c6141db627677fada4b6f1104bc616a5b3722f44131d39e0fef0c5a",
                "MacAddress": "02:42:0a:00:01:09",
                "IPv4Address": "10.0.1.9/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4097"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "7147a50520de",
                "IP": "192.168.68.110"
            },
            {
                "Name": "1fc3633ec5ac",
                "IP": "192.168.68.109"
            }
        ]
    }
]



From the output we can see that there are 2 nodes in the swarm cluster: 192.168.68.109 and 192.168.68.110. Each of the containers has a name and an IP address. For example, web.2.mwhpczmpiy6nvrqdhrgxcgwr4 running on VM2 has an IP address of 10.0.1.7. Containers running on VM1 can talk to this web.2 container using either its IP address or its name.
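
One detail worth noticing in the inspect output is that each container's MAC address appears to be derived from its overlay IP address: the last four octets of 02:42:0a:00:01:07 are 10.0.1.7 written in hexadecimal. The short Python sketch below checks this pattern and confirms that the addresses fall inside the 10.0.1.0/24 subnet. The function name mac_from_ip is my own, and the 02:42 prefix rule is inferred from the output above rather than taken from docker documentation.

```python
import ipaddress

OVERLAY_SUBNET = ipaddress.ip_network("10.0.1.0/24")  # from "docker network inspect web"


def mac_from_ip(ip):
    """Build the MAC address pattern seen above: 02:42 followed by the
    four IPv4 octets written in hexadecimal."""
    octets = ipaddress.IPv4Address(ip).packed  # 4 raw bytes
    return ":".join(f"{b:02x}" for b in b"\x02\x42" + octets)


# (IPv4 address, MAC address) pairs copied from the inspect output
observed = {
    "web.5": ("10.0.1.4", "02:42:0a:00:01:04"),
    "web.3": ("10.0.1.8", "02:42:0a:00:01:08"),
    "web.2": ("10.0.1.7", "02:42:0a:00:01:07"),
    "web.4": ("10.0.1.3", "02:42:0a:00:01:03"),
}

if __name__ == "__main__":
    for task, (ip, mac) in observed.items():
        in_subnet = ipaddress.ip_address(ip) in OVERLAY_SUBNET
        print(task, ip, "in subnet:", in_subnet, "MAC matches:", mac_from_ip(ip) == mac)
```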


Name resolution is possible because docker runs an internal name resolution (DNS) service.



root@sathish-vm1:/home/sathish# cat /etc/resolv.conf
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.11
options edns0

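Docker generates a resolv.conf for each container on the docker host's filesystem and bind-mounts it inside the container at /etc/resolv.conf. Its nameserver entry, 127.0.0.11, is the address on which docker's embedded DNS server listens inside the container's network namespace.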


Let's get a shell inside the container and check this out



root@sathish-vm1:/home/sathish# docker container  exec -it effff3309e01 /bin/bash
root@effff3309e01:/usr/local/apache2# cat /etc/resolv.conf
nameserver 127.0.0.11
options ndots:0
root@effff3309e01:/usr/local/apache2# mount | grep resolv.conf
/dev/mapper/ubuntu--vg-ubuntu--lv on /etc/resolv.conf type ext4 (rw,relatime)
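
To make the container's resolver configuration concrete, here is a minimal Python sketch that parses resolv.conf text and pulls out the nameserver and options entries. The function name parse_resolv_conf is hypothetical and the parser handles only the two directives shown above; the sample text is the container's file from the transcript.

```python
def parse_resolv_conf(text):
    """Return the nameserver addresses and options found in resolv.conf text."""
    nameservers, options = [], []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            nameservers.append(values[0])
        elif keyword == "options":
            options.extend(values)
    return {"nameservers": nameservers, "options": options}


# Contents of /etc/resolv.conf as printed inside the container above
CONTAINER_RESOLV_CONF = """\
nameserver 127.0.0.11
options ndots:0
"""

if __name__ == "__main__":
    print(parse_resolv_conf(CONTAINER_RESOLV_CONF))
```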

Now let's test our deployment


The web service can be accessed with any docker swarm host's IP address.


root@sathish-vm1:/home/sathish# curl 192.168.68.109
<html><body><h1>It works!</h1></body></html>

root@sathish-vm1:/home/sathish# curl 192.168.68.110
<html><body><h1>It works!</h1></body></html>
root@sathish-vm1:/home/sathish#

Let's get a shell inside the container and try to ping containers running on the other host. Note that ping is not available in the httpd container by default, so you must first install it with apt-get from the container shell.


root@sathish-vm1:/home/sathish# docker container  exec -it effff3309e01 /bin/bash
root@effff3309e01:/usr/local/apache2# apt-get update
....output deleted for clarity........
root@effff3309e01:/usr/local/apache2# apt-get install iputils-ping
....output deleted for clarity........


Now let's ping the web.4 container running on VM2:

root@effff3309e01:/usr/local/apache2# ping web.4.93j50f81fp2deru56arz0uhjx
PING web.4.93j50f81fp2deru56arz0uhjx (10.0.1.3) 56(84) bytes of data.
64 bytes from web.4.93j50f81fp2deru56arz0uhjx.web (10.0.1.3): icmp_seq=1 ttl=64 time=0.428 ms
64 bytes from web.4.93j50f81fp2deru56arz0uhjx.web (10.0.1.3): icmp_seq=2 ttl=64 time=3.59 ms

As we can see, name resolution works. This is because resolv.conf points to docker's embedded DNS server at 127.0.0.11. But how does a ping from one container reach a container on another host?


Traffic between containers on different swarm hosts uses VxLAN. VxLAN is an encapsulation protocol that carries layer 2 Ethernet frames inside UDP/IP packets. Each VxLAN segment is identified by a 24-bit identifier called the VNI (VxLAN Network Identifier), and only endpoints on the same VNI can talk to each other. The VNI used for the "web" network is 4097, as indicated by the "com.docker.network.driver.overlay.vxlanid_list": "4097" line in the docker network inspect web output.
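
To show where the 24-bit VNI sits, here is a minimal Python sketch that packs and unpacks the 8-byte VxLAN header. It follows the layout in RFC 7348 as I understand it: one flags byte, three reserved bytes, three VNI bytes and one reserved byte. The function names build_vxlan_header and parse_vni are my own, and this is not code taken from docker.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid value


def build_vxlan_header(vni):
    """Pack the 8-byte VxLAN header: flags (1 byte), reserved (3 bytes),
    VNI (3 bytes), reserved (1 byte)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Each 32-bit word keeps its field in the upper bits, reserved bits are zero.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header):
    """Extract the VNI from an 8-byte VxLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8


if __name__ == "__main__":
    header = build_vxlan_header(4097)  # VNI of the "web" network
    print(header.hex())
    print(parse_vni(header))
```

Because the VNI is 24 bits wide, a little over 16 million segments can be told apart, compared with 4094 usable IDs for classic VLANs.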


With the ping running in the container on VM1, let's capture packets on VM2 and examine their contents.


root@sathish-vm2:/home/sathish# tcpdump -eni  eth0 -w vxlan.pcap
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C1571 packets captured
1574 packets received by filter
0 packets dropped by kernel





Opening the capture shows that the ICMP request from the container (10.0.1.8) was encapsulated with a VxLAN header carrying a VNI of 4097. The outer IP header uses the IP addresses of the VMs as source and destination. The destination host decapsulates the packet and hands it over to the docker container with an IP address of 10.0.1.3. The ping response (ICMP reply) follows a similar path back to 10.0.1.8.



Note: I was running the docker swarm hosts as VMs that share the same subnet (Hyper-V virtual switch). This is a requirement of neither docker swarm nor VxLAN. You can run a swarm cluster with an overlay network across different IP subnets.

That's all for today folks, thanks for your time.
