Rachel's Yard

A New Continuation
Tags: OpenNebula
Aug 2, 2016

For the sake of humanity, let's point out the caveats first:

  1. If you ever change the CA or the apiserver TLS certificate, remember to delete the default secret tokens in ALL namespaces, and recreate your services/pods. The applications in each pod access the API with service account credentials, and those are only generated once; if you change the TLS certs, the old secrets become invalid (see the sketch after this list).
  2. If you are running multiple master components, remember to add the --apiserver-count=<count> flag to kube-apiserver. Otherwise, the apiservers will fight for control of the service endpoints.
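
For caveat 1, here is a minimal sketch of rotating the default tokens with kubectl (assuming kubectl is configured against your cluster; the controller manager will mint fresh tokens once the old ones are gone):

# List the default service account tokens in every namespace
kubectl get secrets --all-namespaces | grep default-token

# Delete the default token in each namespace so a new one is
# generated against the new CA/TLS certificate
for ns in $(kubectl get namespaces -o name | cut -d/ -f2); do
  kubectl --namespace "$ns" delete secret \
    $(kubectl --namespace "$ns" get secrets -o name | grep default-token | cut -d/ -f2)
done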

Now, let's get to the topic.

Billions of $$$ are awesome, but how do you invest? (joke)

Docker is awesome (late to the party again), but how to manage?

TL;DR: Kubernetes is (sort of) a management tool for containers.

I will just skip the introduction, since there are many articles out there already.

Installing Kubernetes

The best way of running Kubernetes is to deploy it on CoreOS, period.

https://coreos.com/kubernetes/docs/latest/getting-started.html

See, all other OSs are too heavyweight. CoreOS (besides SmartOS) is highly specialized for running containers, so there's that. Plus, the folks at Quay.io are kind enough to have a kubelet image ready. Kubelet is a binary that contains all the components you will need to run Kubernetes (Golang is awesome), and the Quay/CoreOS team runs it in a rkt container, which makes updates/upgrades easy.
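
For reference, a minimal systemd unit for running the kubelet through CoreOS's kubelet-wrapper script, sketched from the CoreOS docs of this era; the version tag, apiserver address, and kubelet flags are placeholders for your own setup:

[Unit]
Description=Kubelet via kubelet-wrapper (runs in a rkt container)

[Service]
# Tag of the kubelet image that kubelet-wrapper pulls from Quay
Environment=KUBELET_VERSION=v1.3.6_coreos.0
ExecStart=/usr/lib/coreos/kubelet-wrapper \
  --api-servers=https://<master-ip> \
  --allow-privileged=true \
  --config=/etc/kubernetes/manifests
Restart=always

[Install]
WantedBy=multi-user.target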

Of course, all components are stateless. Persistent state is stored in an etcd cluster (that seems to be the trend).
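
You can see this for yourself: by default, the apiserver keeps its state under /registry in etcd. A quick peek, assuming etcdctl is pointed at the cluster your apiserver uses (the endpoint is a placeholder):

# Kubernetes objects live under /registry by default (etcd2 API)
etcdctl --endpoint=http://<etcd-ip>:2379 ls /registry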

What's my environment?

An obligatory architecture diagram:

[diagram: Kubernetes]

Here are my cloud-config files for my deployment on OpenNebula: https://coreos-opennebula.s3.fmt01.sdapi.net/cloud-config/

What am I running?

Currently only a project written for an econ professor, but I will containerize more of my projects.

Ingress

The nginx controller in kubernetes/contrib kind of blows. So I (sort of) compiled the newest nginx with CHACHA20 ciphers:

Docker: https://hub.docker.com/r/zllovesuki/nginx-slim/

Git: https://git.fm/zllovesuki/nginx-slim/

You will need to recompile the controller with this tag.
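
Once the controller is serving traffic, you can sanity-check that a CHACHA20 suite is actually negotiated (the host and the exact cipher name are illustrative and depend on the OpenSSL build):

# Force a CHACHA20 suite and see whether the handshake succeeds
openssl s_client -connect example.com:443 \
  -cipher ECDHE-RSA-CHACHA20-POLY1305 < /dev/null 2>/dev/null | grep Cipher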

Continuing from the last post... Seriously, CoreOS is like the perfect blend for my taste.

SmartOS

In short, SmartOS

  1. Boots up a kernel from PXE/USB, and the rootfs stays in memory
  2. Provides persistent storage via ZFS (mounted on the fly)
  3. Uses Zones to isolate resources
  4. Has awesome people from the community and Joyent who ported QEMU, so you can run Windows
  5. But also LX branded zones, which use syscall translation to run your Linux binaries on the Illumos kernel

However, as my coworker said, SmartOS was not designed to accommodate "the least common denominator", meaning that it has a very steep learning curve for those who were born into the world of Linux (well, SmartOS has Solaris blood in it).

CoreOS

In short, CoreOS

  1. Boots up a kernel from PXE/USB/disk, and /usr is read-only
  2. Provides persistent storage locally or remotely (NFS, etc.)
  3. Uses Linux namespaces to isolate resources (hence Docker, rkt, etc.)
  4. Has awesome people from the community and the CoreOS team making it dead simple (for example, flannel for a dead simple overlay network, etcd2 for a dead simple distributed key-value store, and fleet for dead simple cluster management... well, not quite)
  5. Just feels easy to use

But how exactly easy?

For the purpose of this post, I will create five (5) machines: three will run the central etcd2 services, and two will be workers.

etcd2 central

First, we will create three central machines:

[screenshot: OpenNebula]

...and now we have our lovely three-node etcd2 cluster: [screenshot: three nodes]

...and the journal should tell us something interesting: [screenshot: etcd2 journal]

etcd2 worker

Now we will create two worker machines: [screenshot: OpenNebula]

In the cloud-config, we will also configure flannel:

- name: flanneld.service
  drop-ins:
    - name: 50-network-config.conf
      content: |
        [Service]
        ExecStartPre=/usr/bin/etcdctl set /coreos.com/network/config '{ "Network": "10.160.0.0/16", "Backend": { "Type": "vxlan", "VNI": 160, "Port": 8472 } }'
  command: start
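
After flanneld comes up, you can verify both the network config it read from etcd and the subnet each host leased (flannel writes the lease to /run/flannel/subnet.env):

# The overlay network config flanneld reads at startup
etcdctl get /coreos.com/network/config

# The subnet this particular host leased for its containers
cat /run/flannel/subnet.env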

...and fleetctl is going to show us some very interesting things: [screenshot: fleetctl]

WTF is "flannel"?

See CoreOS blog post and github repo.

Basically, it is a simplified version of OVS. Well, that's an over-simplification, but the idea is that flannel reads its network config from etcd2 and creates an overlay network, with udp or vxlan (or other backends), for the containers running on each host, so containers can communicate across hosts. With Docker's default networking, containers on one host can't talk to containers on another host without some crazy setup. flannel simplifies everything.

In my case, I'm using vxlan, since udp encapsulates in userspace, whereas vxlan encapsulates in the kernel.
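
You can also inspect the kernel-side vxlan device that flannel creates; assuming the interface is named flannel.<VNI>, it should be flannel.160 with the config above:

# Show vxlan details (VNI, port) for flannel's interface
ip -d link show flannel.160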

... but the CoreOS private network on OpenNebula is already running VXLAN... So VXLAN inside VXLAN... :P (Thank you, 10G switch for saving my life.)

Let's try launching an instance with fleet

Create a file apache@.service:

[Unit]
Description=My Apache Frontend
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill apache%i
ExecStartPre=-/usr/bin/docker rm apache%i
ExecStartPre=/usr/bin/docker pull coreos/apache
ExecStart=/usr/bin/docker run --rm --name apache%i -p ${COREOS_PUBLIC_IPV4}:%i:80 coreos/apache /usr/sbin/apache2ctl -D FOREGROUND
ExecStop=/usr/bin/docker stop apache%i

[X-Fleet]
MachineMetadata=role=worker
Conflicts=apache@*.service

The [X-Fleet] section instructs fleet to deploy only on worker nodes, and each node will run at most one instance of this service (see the quick check below).
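
To see which machines carry the metadata the unit schedules against, list them with fleetctl (the output shape here is illustrative; my workers carry role=worker in addition to the region metadata from the cloud-config):

eduroam-169-233-215-73:coreos Jerry$ fleetctl --driver=etcd --endpoint=http://[redacted]:2379 list-machines
MACHINE     IP          METADATA
12bc3357... [redacted]  region=fmt01,role=worker
218714a4... [redacted]  region=fmt01,role=worker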

Then we submit the .service:

eduroam-169-233-215-73:coreos Jerry$ fleetctl --driver=etcd --endpoint=http://[redacted]:2379 submit apache\@.service 
Unit apache@.service inactive
eduroam-169-233-215-73:coreos Jerry$ fleetctl --driver=etcd --endpoint=http://[redacted]:2379 list-unit-files
UNIT HASH DSTATE STATE TARGET
apache@.service de71c97 inactive inactive -

Now let's try starting an Apache instance on port 1222:

eduroam-169-233-215-73:coreos Jerry$ fleetctl --driver=etcd --endpoint=http://[redacted]:2379 start apache@1222
Unit apache@1222.service inactive
Unit apache@1222.service launched on 12bc3357.../[redacted]
eduroam-169-233-215-73:coreos Jerry$ fleetctl --driver=etcd --endpoint=http://[redacted]:2379 status apache@1222
● apache@1222.service - My Apache Frontend
   Loaded: loaded (/run/fleet/units/apache@1222.service; linked-runtime; vendor preset: disabled)
   Active: activating (start-pre) since Tue 2016-07-26 23:05:05 UTC; 10s ago
  Process: 1692 ExecStartPre=/usr/bin/docker rm apache%i (code=exited, status=1/FAILURE)
  Process: 1674 ExecStartPre=/usr/bin/docker kill apache%i (code=exited, status=1/FAILURE)
  Control: 1702 (docker)
    Tasks: 9
   CGroup: /system.slice/system-apache.slice/apache@1222.service
           └─control
             └─1702 /usr/bin/docker pull coreos/apache

Jul 26 23:05:05 coreos-worker-1 docker[1674]: Failed to kill container (apache1222): Error response from daemon: Cannot kill container apache1222: No such container: apache1222
Jul 26 23:05:05 coreos-worker-1 docker[1692]: Failed to remove container (apache1222): Error response from daemon: No such container: apache1222
Jul 26 23:05:06 coreos-worker-1 docker[1702]: Using default tag: latest
Jul 26 23:05:07 coreos-worker-1 docker[1702]: latest: Pulling from coreos/apache
Jul 26 23:05:07 coreos-worker-1 docker[1702]: a3ed95caeb02: Pulling fs layer
Jul 26 23:05:07 coreos-worker-1 docker[1702]: 5e160ca0bb5a: Pulling fs layer
Jul 26 23:05:07 coreos-worker-1 docker[1702]: 1f92e2761bfd: Pulling fs layer
Jul 26 23:05:08 coreos-worker-1 docker[1702]: a3ed95caeb02: Download complete
Jul 26 23:05:08 coreos-worker-1 docker[1702]: a3ed95caeb02: Pull complete
Jul 26 23:05:08 coreos-worker-1 docker[1702]: a3ed95caeb02: Pull complete
eduroam-169-233-215-73:coreos Jerry$ fleetctl --driver=etcd --endpoint=http://[redacted]:2379 ssh apache@1222 docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7904bc4ae5bf coreos/apache "/usr/sbin/apache2ctl" 16 seconds ago Up 15 seconds 0.0.0.0:1222->80/tcp apache1222

Now let's curl it:

eduroam-169-233-215-73:coreos Jerry$ curl http://[worker-1-ip]:1222
<html><body><h1>It works!</h1>
<p>This is the default web page for this server.</p>
<p>The web server software is running but no content has been added, yet.</p>
</body></html>

Let's try launching another one on port 1229:

eduroam-169-233-215-73:coreos Jerry$ fleetctl --driver=etcd --endpoint=http://[redacted]:2379 start apache@1229
Unit apache@1229.service inactive
Unit apache@1229.service launched on 218714a4.../[redacted]
eduroam-169-233-215-73:coreos Jerry$ fleetctl --driver=etcd --endpoint=http://[redacted]:2379 ssh apache@1229 docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b611e41e4481 coreos/apache "/usr/sbin/apache2ctl" 17 seconds ago Up 16 seconds 0.0.0.0:1229->80/tcp apache1229

...and curl that again:

eduroam-169-233-215-73:coreos Jerry$ curl http://[worker-2-ip]:1229
<html><body><h1>It works!</h1>
<p>This is the default web page for this server.</p>
<p>The web server software is running but no content has been added, yet.</p>
</body></html>

Testing inter-host communication between containers

Let's get the IP address of the container running apache1229:

eduroam-169-233-215-73:coreos Jerry$ fleetctl --driver=etcd --endpoint=http://[redacted]:2379 ssh apache@1229 docker inspect b611e41e4481 | grep -w "IPAddress" | awk '{ print $2 }' | head -n 1 | cut -d "," -f1
"10.160.8.2"

And the IP of apache1222:

eduroam-169-233-215-73:coreos Jerry$ fleetctl --driver=etcd --endpoint=http://[redacted]:2379 ssh apache@1222 docker inspect 7904bc4ae5bf | grep -w "IPAddress" | awk '{ print $2 }' | head -n 1 | cut -d "," -f1
"10.160.10.2"

Let's try pinging from apache1222 to apache1229:

eduroam-169-233-215-73:coreos Jerry$ fleetctl --driver=etcd --endpoint=http://[redacted]:2379 ssh apache@1222 docker exec 7904bc4ae5bf ping -c 3 10.160.8.2
PING 10.160.8.2 (10.160.8.2) 56(84) bytes of data.
64 bytes from 10.160.8.2: icmp_req=1 ttl=62 time=3.60 ms
64 bytes from 10.160.8.2: icmp_req=2 ttl=62 time=0.914 ms
64 bytes from 10.160.8.2: icmp_req=3 ttl=62 time=0.870 ms

--- 10.160.8.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 0.870/1.797/3.609/1.281 ms

Awesome

High availability is a topic that I really want to explore. My last project, Dermail, was designed with failover in mind. However, I'm more interested in the infrastructure side of things.

This is my attempt at trying CoreOS. I know, I'm very late to the party.

Environment

System Information
Operating System Linux 3.16.0-4-amd64 x86_64
Model Supermicro Super Server
Motherboard Supermicro X10SDV-16C+-TLN4F
Processor Intel(R) Xeon(R) CPU D-1587 @ 1.70GHz @ 2.30 GHz
1 Processor, 16 Cores, 32 Threads
Processor ID GenuineIntel Family 6 Model 86 Stepping 4
L1 Instruction Cache 32.0 KB x 16
L1 Data Cache 32.0 KB x 16
L2 Cache 256 KB x 16
L3 Cache 24.0 MB
Memory 126 GB
BIOS American Megatrends Inc. 1.1a
Compiler Clang 3.3 (tags/RELEASE_33/final)

Storage

root@puma:/tmp/igb-5.3.5.3/src# zpool status
  pool: Storage
 state: ONLINE
  scan: none requested
config:

        NAME                                               STATE     READ WRITE CKSUM
        Storage                                            ONLINE       0     0     0
          mirror-0                                         ONLINE       0     0     0
            ata-WDC_WD1003FZEX-00MK2A0_WD-WCC3F1VPH0LZ     ONLINE       0     0     0
            ata-WDC_WD1003FZEX-00MK2A0_WD-WCC3F1VPH3NT     ONLINE       0     0     0
          mirror-1                                         ONLINE       0     0     0
            ata-WDC_WD1003FZEX-00MK2A0_WD-WCC3F7PDCJL6     ONLINE       0     0     0
            ata-WDC_WD1003FZEX-00MK2A0_WD-WCC3FH3Y2ZT5     ONLINE       0     0     0
        logs
          nvme0n1                                          ONLINE       0     0     0
        cache
          ata-Samsung_SSD_850_PRO_128GB_S1SMNWAF720227X    ONLINE       0     0     0

errors: No known data errors
root@puma:/tmp/igb-5.3.5.3/src# zpool list
NAME      SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
Storage  1.81T  53.9G  1.76T         -     0%     2%  1.00x  ONLINE  -

Wait, how are you running it on a single machine?

With 16 cores, 32 threads, and 128GB of RAM, this machine is a perfect home lab machine for testing. It will be running OpenNebula in a VM, as OpenNebula crashes on architectures newer than Haswell for some reason.

OpenNebula and CoreOS

Compiling CoreOS Image for OpenNebula

This repo has a very good script for compiling CoreOS images for OpenNebula. You need Packer for the script, and QEMU.

Since the author is on vacation right now, I forked the repo and merged the PRs from other contributors so that I could build the latest image. I also changed the channel from alpha to stable.

At the time of writing, the CoreOS stable channel is at version 1068.8.0. In builds/coreos-stable-1068.8.0-qemu/ we find coreos-stable-1068.8.0.

That script does not work, though: coreos-install with the install.yml did nothing.

Therefore, I re-created a similar script on my private git (or on GitHub). You need Packer and QEMU. Run make latest to build the latest CoreOS image for OpenNebula.

This will make sure that OpenNebula contextualization works.
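
A sketch of the full workflow (the repo URL, image path, and datastore name are illustrative placeholders):

# Build the latest stable CoreOS image for OpenNebula
git clone <repo-url> && cd <repo>
make latest

# Register the resulting QEMU image with OpenNebula
oneimage create --name "CoreOS Stable 1068.8.0" \
  --path builds/coreos-stable-1068.8.0-qemu/coreos-stable-1068.8.0 \
  --datastore default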

Your VM template will look like this:

CONTEXT = [
  EC2_USER_DATA = "$EC2_USER_DATA",
  NETWORK = "YES",
  SET_HOSTNAME = "$NAME",
  SSH_PUBLIC_KEY = "$USER[SSH_PUBLIC_KEY]" ]
CPU = "2"
DISK = [
  IMAGE = "CoreOS Stable 1068.8.0" ]
FEATURES = [
  ACPI = "yes",
  APIC = "yes",
  HYPERV = "yes",
  PAE = "yes" ]
GRAPHICS = [
  LISTEN = "0.0.0.0",
  TYPE = "VNC" ]
HYPERVISOR = "kvm"
MEMORY = "8192"
NIC = [
  NETWORK = "Internal" ]
NIC = [
  NETWORK = "External" ]
NIC_DEFAULT = [
  MODEL = "virtio" ]
OS = [
  ARCH = "x86_64" ]
USER_INPUTS = [
  EC2_USER_DATA = "M|text64|cloud-config" ]
VCPU = "4"

The image supports etcd2 configuration. Note, however, that you will use $COREOS_PUBLIC_IPV4 and $COREOS_PRIVATE_IPV4 as the IP variables. For example, this is your cloud-config:

#cloud-config

coreos:
  units:
    - name: etcd2.service
      runtime: true
      drop-ins:
        - name: 10-oem.conf
          content: |
            [Service]
            Environment=ETCD_ELECTION_TIMEOUT=1200
    - name: reload.service
      command: start
      content: |
        [Unit]
        Description=reload systemd

        [Service]
        Type=oneshot
        ExecStart=/usr/bin/systemctl daemon-reload
    - name: start.service
      command: start
      content: |
        [Unit]
        Description=start etcd2

        [Service]
        Type=oneshot
        ExecStart=/usr/bin/systemctl start etcd2

  etcd2:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new?size=3
    discovery: "https://discovery.etcd.io/<token>"
    advertise-client-urls: "http://$COREOS_PUBLIC_IPV4:2379"
    initial-advertise-peer-urls: "http://$COREOS_PRIVATE_IPV4:2380"
    listen-client-urls: "http://0.0.0.0:2379,http://0.0.0.0:4001"
    listen-peer-urls: "http://$COREOS_PRIVATE_IPV4:2380,http://$COREOS_PRIVATE_IPV4:7001"

  fleet:
    public-ip: "$COREOS_PRIVATE_IPV4"
    metadata: "region=fmt01"

  update:
    reboot-strategy: "best-effort"

Spin Them Up

[screenshot: Sunstone]

[screenshot: etcd2]
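
Once the VMs boot, a quick way to confirm the cluster actually formed (the endpoint is a placeholder; run this from anywhere that can reach the nodes):

# etcd2-era health check: every member should report healthy
etcdctl --endpoint=http://<node-ip>:2379 cluster-health

# fleet should list all five machines with their metadata
fleetctl --driver=etcd --endpoint=http://<node-ip>:2379 list-machines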

(To be continued)
