[Howto] Installing Cilium with Minikube on Fedora

Cilium is a networking plugin for Kubernetes based on eBPF. If you want to give it a try, Minikube is a good option to get started.

Background

I just started with Isovalent – and since I am very much a beginner regarding everything related to Kubernetes I decided to get some hands-on experience with the technology I am going to work with for the foreseeable future.

Isovalent’s offering is an Enterprise version of Cilium, which basically manages and secures connections between containers and adds observability on top. It all runs on eBPF and thus is pretty performant. eBPF can run sandboxed programs in Linux kernel space without the need to recompile the kernel – a tiny bit like a “kernel VM”. I always wanted to get my hands dirty with eBPF anyway, and Cilium is a very good way to approach it. But where to start? The answer: with a small Kubernetes setup based on Minikube, a tiny Kubernetes distribution for testing and fooling around which leaves your main system almost unchanged.

Preparing the environment

Minikube runs itself in a tightly confined environment to not disturb the rest of your system. This abstraction is done via containers or VMs realized via so-called “drivers”. Drivers are available for Docker, VMware, KVM, Podman and others. I decided to go with the KVM driver, so the virtualization bits need to be installed first:

❯ sudo dnf install @virtualization
[...]
❯ sudo systemctl start libvirtd
❯ sudo systemctl enable libvirtd
❯ sudo usermod --append --groups libvirt ( whoami )

Note that the last of the above commands only works in Nushell and has to be slightly adjusted for Bash or Zsh – see the sketch below.
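
For Bash or Zsh, the same command would use command substitution instead – a quick sketch:

$ sudo usermod --append --groups libvirt $(whoami)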

Next we have to install Minikube itself:

❯ curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-latest.x86_64.rpm
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 15.1M  100 15.1M    0     0  8304k      0  0:00:01  0:00:01 --:--:-- 8300k

❯ sudo rpm -Uvh minikube-latest.x86_64.rpm
Verifying...                          ################################# [100%]
Preparing...                          ################################# [100%]
Updating / installing...
   1:minikube-1.22.0-0                ################################# [100%]

Also, to install and manage Cilium easily it makes sense to use the Cilium CLI. Unfortunately the CLI is currently not available as an RPM package for Fedora, so we have to download the binary archive and extract it to /usr/local/bin:

❯ curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz{,.sha256sum}
[...]
❯ sha256sum --check cilium-linux-amd64.tar.gz.sha256sum
cilium-linux-amd64.tar.gz: OK
❯ sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin

Starting Minikube with CNI

We now need to start our Kubernetes cluster – and in a way that lets us install and use Cilium in it. So we set the network plugin to CNI:

❯ minikube start --network-plugin=cni
😄  minikube v1.22.0 on Fedora 34
✨  Automatically selected the kvm2 driver. Other choices: podman, ssh
💾  Downloading driver docker-machine-driver-kvm2:
    > docker-machine-driver-kvm2....: 65 B / 65 B [----------] 100.00% ? p/s 0s
    > docker-machine-driver-kvm2: 11.47 MiB / 11.47 MiB  100.00% 12.50 MiB p/s 
❗  With --network-plugin=cni, you will need to provide your own CNI. See --cni flag as a user-friendly alternative
💿  Downloading VM boot image ...
    > minikube-v1.22.0.iso.sha256: 65 B / 65 B [-------------] 100.00% ? p/s 0s
    > minikube-v1.22.0.iso: 242.95 MiB / 242.95 MiB [ 100.00% 20.05 MiB p/s 12s
👍  Starting control plane node minikube in cluster minikube
🔥  Creating kvm2 VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
🐳  Preparing Kubernetes v1.21.2 on Docker 20.10.6 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
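
With that, kubectl should already point at the new cluster. A quick check never hurts – something like the following should report the minikube context and list the single Minikube node (output omitted here):

❯ minikube kubectl -- config current-context
❯ minikube kubectl -- get nodes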

Installing Cilium into Kubernetes

Since Cilium CLI is already installed, it is fairly easy to install Cilium into the cluster itself. The installation is done into the current kubectl context, so make sure you are running in the right context for example with kubectl get nodes. Afterwards, fire up the installation:

❯ cilium install
🔮 Auto-detected Kubernetes kind: minikube
✨ Running "minikube" validation checks
✅ Detected minikube version "1.22.0"
ℹ️  using Cilium version "v1.10.2"
🔮 Auto-detected cluster name: minikube
🔮 Auto-detected IPAM mode: cluster-pool
🔮 Auto-detected datapath mode: tunnel
🔮 Custom datapath mode: tunnel
🔑 Generating CA...
2021/07/13 14:09:33 [INFO] generate received request
2021/07/13 14:09:33 [INFO] received CSR
2021/07/13 14:09:33 [INFO] generating key: ecdsa-256
2021/07/13 14:09:33 [INFO] encoded CSR
2021/07/13 14:09:33 [INFO] signed certificate with serial number 122640105911298337607907666763746132599853501126
🔑 Generating certificates for Hubble...
2021/07/13 14:09:33 [INFO] generate received request
2021/07/13 14:09:33 [INFO] received CSR
2021/07/13 14:09:33 [INFO] generating key: ecdsa-256
2021/07/13 14:09:33 [INFO] encoded CSR
2021/07/13 14:09:33 [INFO] signed certificate with serial number 459020519400202498147292503280351877404424824247
🚀 Creating Service accounts...
🚀 Creating Cluster roles...
🚀 Creating ConfigMap...
🚀 Creating Agent DaemonSet...
🚀 Creating Operator Deployment...
⌛ Waiting for Cilium to be installed...
⌛ Waiting for Cilium to become ready before restarting unmanaged pods...
♻️  Restarting unmanaged pods...
♻️  Restarted unmanaged pod kube-system/coredns-558bd4d5db-8s4f6
♻️  Restarted unmanaged pod kubernetes-dashboard/dashboard-metrics-scraper-7976b667d4-ctq4p
♻️  Restarted unmanaged pod kubernetes-dashboard/kubernetes-dashboard-6fcdf4f6d-5wkbx
✅ Cilium was successfully installed! Run 'cilium status' to view installation health

The installation went through flawlessly. But does it really work? As mentioned in the last line of the above listing, we can check the status of Cilium easily:

❯ cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:         OK
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         disabled
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/

DaemonSet         cilium             Desired: 1, Ready: 1/1, Available: 1/1
Deployment        cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
Containers:       cilium             Running: 1
                  cilium-operator    Running: 1
Image versions    cilium-operator    quay.io/cilium/operator-generic:v1.10.2: 1
                  cilium             quay.io/cilium/cilium:v1.10.2: 1

We can even go one step further and check the connectivity of the cluster – after all, Cilium is all about proper networking:

❯ cilium connectivity test
ℹ️  Single-node environment detected, enabling single-node connectivity test
ℹ️  Monitor aggregation detected, will skip some flow validation steps
[...]
..
✅ All 11 tests (76 actions) successful, 0 tests skipped, 0 scenarios skipped.

As you can see, Cilium creates a set of pods and a service in a dedicated namespace and runs tests on them afterwards.
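
If you are curious, you can peek into that dedicated namespace yourself – the connectivity test deploys its workloads into a namespace called cilium-test (the command below assumes that default name):

❯ minikube kubectl -- -n cilium-test get pods,services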

Interacting with the Cilium agent

Let’s have a first look at our installed Cilium environment by running a few commands on the local Cilium agent. First we have to figure out the name of the actual Cilium pod:

❯ minikube kubectl -- -n kube-system get pods -l k8s-app=cilium
NAME           READY   STATUS    RESTARTS   AGE
cilium-8hx2v   1/1     Running   0          35m

With the name of the pod we can now reach into the pod and execute the Cilium command right inside, for example querying the list of endpoints:

❯ minikube kubectl -- -n kube-system exec cilium-8hx2v -- cilium endpoint list
Defaulted container "cilium-agent" out of: cilium-agent, ebpf-mount (init), clean-cilium-state (init)
ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                                                           IPv6   IPv4         STATUS   
           ENFORCEMENT        ENFORCEMENT                                                                                                                            
208        Disabled           Disabled          7182       k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=kube-system                   10.0.0.70    ready   
                                                           k8s:io.cilium.k8s.policy.cluster=minikube                                                                         
                                                           k8s:io.cilium.k8s.policy.serviceaccount=coredns                                                                   
                                                           k8s:io.kubernetes.pod.namespace=kube-system                                                                       
                                                           k8s:k8s-app=kube-dns                                                                                              
452        Disabled           Disabled          4506       k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=cilium-test                   10.0.0.254   ready   
                                                           k8s:io.cilium.k8s.policy.cluster=minikube                                                                         
                                                           k8s:io.cilium.k8s.policy.serviceaccount=default                                                                   
                                                           k8s:io.kubernetes.pod.namespace=cilium-test                                                                       
                                                           k8s:kind=client                                                                                                   
                                                           k8s:name=client2                                                                                                  
                                                           k8s:other=client                                             
[...]  

This list is long, detailed and only really makes sense on a wide monitor. But it already tells us a lot about the current enforcement of ingress and egress policies (here, none are enforced yet).

But there is more: since Cilium is eBPF based, we can go one layer deeper and, for example, look at the policy-related eBPF maps:

❯ minikube kubectl -- -n kube-system exec cilium-8hx2v -- cilium bpf policy get --all
Defaulted container "cilium-agent" out of: cilium-agent, ebpf-mount (init), clean-cilium-state (init)
/sys/fs/bpf/tc/globals/cilium_policy_00208:

POLICY   DIRECTION   LABELS (source:key[=value])   PORT/PROTO   PROXY PORT   BYTES     PACKETS   
Allow    Ingress     reserved:unknown              ANY          NONE         16959     183       
Allow    Ingress     reserved:host                 ANY          NONE         1098509   4452      
Allow    Egress      reserved:unknown              ANY          NONE         393706    4204  
[...]

Note that the number in the policy map path (cilium_policy_00208) corresponds to the endpoint ID (208) in the Cilium endpoint list above.
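
Based on that relation we can also look at a single endpoint in more detail – a quick sketch, using the endpoint ID 208 of the CoreDNS pod from the list above:

❯ minikube kubectl -- -n kube-system exec cilium-8hx2v -- cilium endpoint get 208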

We now have a running Cilium setup which can be used to run tests and examples!

Next: write and enforce policies, add observability

Doing a policy enforcement test goes beyond the scope of this blog post – but it certainly is worth a look in the future. Also, with all the data already shown above, it makes sense to do a deep dive into the topic of observability in the future.

If you already want to check out policy enforcement on your own, the Cilium documentation has a beautiful example prepared which walks through some policy challenges and how those can be solved with Cilium.
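
To give a first impression of what such policies look like, here is a minimal, purely illustrative CiliumNetworkPolicy sketch – the labels app=myapp and app=frontend are made up and not taken from the documentation example:

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-to-myapp
spec:
  endpointSelector:
    matchLabels:
      app: myapp
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend

Saved to a file, it could be applied like this:

❯ minikube kubectl -- apply -f allow-frontend-to-myapp.yaml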

The same is true for observability: if you wonder how deep the rabbit hole really is, there is Hubble, which provides serious observability into the Kubernetes network, services and security, comes with a UI and can be installed quickly since it is tightly integrated with Cilium.
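
Getting a first glimpse of Hubble should be as easy as enabling it via the Cilium CLI – roughly like this (treat it as a sketch, I have not enabled it in this setup yet):

❯ cilium hubble enable --ui
❯ cilium hubble ui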

And if you have stories to share around eBPF, Cilium and similar topics I am finally getting an idea of what you are talking about. 😉

Image by stux from Pixabay

[Howto] Create your own cloud gaming server to stream games to Fedora

A few months back I wanted to give a game a try which only runs on Windows and requires a dedicated GPU. Since I have neither of those, I decided to set up my own Windows cloud gaming server to stream the game to my Linux machine.

Decades ago there was one game I played day and night. For weeks, months, maybe even years. Till today I can remember the distinct soundtrack which makes the hair on the back of my neck stand up: UFO: Enemy Unknown. I loved the game! A few years ago I also played one of the open source games inspired by UFO for quite some time, UFO: AI. That was fun.

Sequels to the original game were released, two of them over the last couple of years. But they never really were an option since they required Windows (or so I thought) and, above all, time. However, a few months ago I realized that one of the sequels, XCOM: Enemy Unknown, was available for Android. Since I have a brand new flagship Android tablet I gave it a shot – and it was great! But since the Android version was seriously limited, I played it again on Linux. That barely worked with my limited Intel GPU. But it was playable, and I had fun.

I was infected with the urge to play the game more – and when a third sequel was announced, I at least wanted to play the second one, XCOM 2. But how? My GPU was too limited, and eGPUs are expensive and often involve a lot of hassle – even if I were willing to buy a Windows license. So I researched whether cloud gaming could do the trick.

Cloud Gaming Services

The idea of cloud gaming is that heavy machines in the data center do the rendering, and the client machine only displays the end result. That shifts the burden of the powerful GPU towards the data center, and the client only needs simple graphics to show a stream of images. This does, however, require a rather responsive broadband connection between the client and the data center.

This principle is not new, but got new attention recently when Google announced their cloud gaming offer Stadia. I checked if any cloud gaming services offered my game of choice – and were available on Linux. Unfortunately, the results were disappointing:

  • Stadia: no XCOM2; a Linux client is available via the Chrome browser (thanks to zesoup)
  • GeForce Now: no XCOM2, no Linux client
  • Playstation Now: XCOM2 available, but no Linux client
  • Vortex: no XCOM2, no Linux client

Some of the above can be used on Linux with the help of Lutris, which uses Wine in the background. But for me that would only count as a last resort. I was not that desperate yet.

However, not all was lost yet: some services are not tied to a certain game catalog, but instead offer a generic server and client onto which you can install your own games. At first the research results were promising: shadow.tech offers machines for just that and a working Linux client! However, they are not available in my region.

The solution: Parsec

So with all ready-to-consume options out of the picture, I was almost willing to give up (or give Lutris and Playstation Now a chance, or even buy an eGPU). But then I stumbled upon something interesting: Parsec, a client for interactive game streaming.

Parsec is a high performance, low latency 60 FPS remote access product connecting you to your computer from anywhere.

Parsec features

That in itself didn’t solve my problem. But it opened a window to a new solution: in the past, the company offered cloud hosted game servers of their own. Players could connect to them with their Parsec client and play games on them together – or on their own. The Parsec promise is that their client is fast enough for a reasonably good experience.

The server offer was canceled some time ago – but there was no one stopping me from launching my own server and connecting the Parsec client to it. And that is what I did. Read on to learn how to do that yourself.

Step 1: Getting a Windows cloud server with a reasonable GPU

What is needed is a cloud hosted Windows machine with a reasonable GPU. In the best case, the data center hosting the machine should not be on the other side of the planet. AWS, Azure, GCP and others have such offerings. But there is an even better route: during my research I found Paperspace, a company specialized in providing access to GPU and AI cloud platforms. That is perfect for this use case!

Paperspace does not really advertise their support for gaming platforms. But after I signed up and looked at what was needed to create my first cloud server, I found a Parsec template:

That makes the entire process very easy!

  • Sign up with Paperspace, get billing sorted out (yes, this stuff costs money)
  • Get to Core -> Compute -> Machines, create a new machine
  • From Public Templates, get the Parsec cloud gaming template
  • Pick the right size for your games; for me a P4000 was enough.
  • Make sure to add a public IP and enough storage. Many of today’s games easily consume dozens of GB.
  • Set the auto-shutdown timer. No need to waste money.
  • Start the machine.

And that’s it already. Once the machine starts, you will notice a Parsec icon on the home screen. Time to get that working.

Step 2: Get Parsec

Parsec has clients for Linux based operating systems such as Ubuntu and Raspberry Pi OS. There is even an AppImage or a Snap – unfortunately not a Flatpak yet. Update: there is now even a Flatpak package available! Thanks Sheogorath for the hint!

And if you are not willing to use Flatpak, AppImage or Snap for whatever reason, you can download the Ubuntu deb and create an RPM out of it. There is even a handy script for that. Anyway, get it installed.
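
If you go the Flatpak route, the installation should boil down to a single command – the application ID below is the one I found on Flathub, so better double-check it there:

$ flatpak install flathub com.parsecgaming.parsec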

Sign up to Parsec, start the client, log in, and you are almost there:

Step 3: Play

After Parsec is all set, just start the cloud server, start Parsec there (maybe log in to your Parsec account), connect to the session on your client – and you are good to go: You can start playing!

For a first test I just watched some Youtube videos and was surprised by the quality. Next I logged in to my Steam account, got my XCOM2 installed and played along happily!

Performance and user experience

But how good is the performance? Well, that depends mostly on one factor: network. Due to unfortunate circumstances I was “able” to test this setup with three very distinct networks in a short time frame:

  • A rather slowish, unstable WiFi with a lot of jitter
  • A LTE connection, provided to me via WiFi hotspot
  • A top-notch, high performance mesh WiFi

When you have slow pings (everything above 25 ms) and/or a lot of jitter, I cannot recommend going down this path. Otherwise it can be a serious option!

The first network I was on was horribly slow, and the experience was horrible as well. XCOM2 has basically permanent background music, and the constant interruptions in the music and audio sequences were in fact the worst part for me.

The LTE based network was slightly better, but still far from a native feeling. I was able to get a good experience out of it and have fun, but that was about it.

However, the third option, WiFi of almost wired quality, was so good that at times I forgot that I was not playing the game natively. There was no visible lag, the graphics were crystal clear, the music was never interrupted, and so on. I was impressed – and had great sessions that way!

I can only recommend always keeping an eye on the connection quality reported in the Parsec overlay:

As Parsec mentions:

At 60 frames per second, 1 frame is around 16ms. By combining decode, encode and network, you’ll have the amount of frames the client lags behind.

Parsec about lag latency

Having this in mind, the above screenshot shows a connection with an unfortunate lag, leading to a not-that-good experience.

Recap

If you don’t have the hardware and/or software to play your favorite game, cloud gaming can be a solution for your problem. And if there is no proper offering out there, it is possible to get this working on your own.

Running your own cloud gaming server is surprisingly easy and not too expensive. It does feel somewhat weird in the beginning especially if you usually only use clouds for your professional work. But it is a fun experience, and the results can be staggering – if your network is up for the job!

Featured image by Martin Str from Pixabay

[Howto] Launch traefik as a docker container in a secure way

Traefik is a great reverse proxy solution, and a perfect tool to direct traffic in container environments. However, to do that, it needs access to docker – and that is very dangerous and must be tightly secured!

The problem: access to the docker socket

Containers offer countless opportunities to improve the deployment and management of services. However, having multiple containers on one system, often re-deploying them on the fly, requires a dynamic way of routing traffic to them. Additionally, there might be reasons to have a front end reverse proxy to sort the traffic properly anyway.

In comes traefik – “the cloud native edge router”. Among many supported backends it knows how to listen to docker and create dynamic routes on the fly when new containers come up.

To do so traefik needs access to the docker socket. Many people decide to just provide that as a volume to traefik. This usually does not work because SELinux prevents it for a reason. The apparent workaround for many is to run traefik in a privileged container. But that is a really bad idea:

Docker currently does not have any Authorization controls. If you can talk to the docker socket or if docker is listening on a network port and you can talk to it, you are allowed to execute all docker commands. […]
At which point you, or any user that has these permissions, have total control on your system.

http://www.projectatomic.io/blog/2014/09/granting-rights-to-users-to-use-docker-in-fedora/

The solution: a docker socket proxy

But there are ways to securely provide traefik the access it needs – without exposing too many permissions. One way is to provide limited access to the docker socket via TCP, through another container which cannot be reached from the outside that easily.

Meet Tecnativa’s docker-socket-proxy:

What?
This is a security-enhaced proxy for the Docker Socket.
Why?
Giving access to your Docker socket could mean giving root access to your host, or even to your whole swarm, but some services require hooking into that socket to react to events, etc. Using this proxy lets you block anything you consider those services should not do.

https://github.com/Tecnativa/docker-socket-proxy/blob/master/README.md

It is a container which connects to the docker socket and exposes the API features in a secured and configurable way via TCP. At container startup it is configured with boolean flags defining which API sections access is granted to.

So basically you set up a docker proxy to support your proxy for docker containers. Well…

How to use it

The docker socket proxy is a container itself. Thus it needs to be launched as a privileged container with access to the docker socket. Also, it must not publish any ports to the outside. Instead it should run on a dedicated docker network shared with the traefik container. The Ansible code to launch the container that way looks for example like this:

- name: ensure privileged docker socket container
  docker_container:
    name: dockersocket4traefik
    image: tecnativa/docker-socket-proxy
    log_driver: journald
    env:
      # only grant access to the container related API endpoints
      CONTAINERS: "1"
    state: started
    # privileged access is needed to talk to the docker socket itself
    privileged: yes
    # expose the API only inside the docker network, never publish it
    exposed_ports:
      - 2375
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:z"
    networks:
      - name: dockersocket4traefik_nw

Note the env section right in the middle: that is where the exported permissions are configured. CONTAINERS: "1" provides access to container related information. There are also SERVICES: "1" and SWARM: "1" to manage access to docker services and swarm.

Traefik needs to have access to the same network. Also, the traefik configuration needs to point to the docker socket proxy container via TCP:

[docker]
endpoint = "tcp://dockersocket4traefik:2375"
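
For completeness, the traefik container itself can be started with a similar Ansible task – a rough sketch, assuming traefik 1.x and that the configuration above is stored on the host as /etc/traefik/traefik.toml (image tag and paths are illustrative):

- name: ensure traefik container
  docker_container:
    name: traefik
    image: traefik:1.7
    log_driver: journald
    state: started
    published_ports:
      - "80:80"
      - "443:443"
    volumes:
      - "/etc/traefik/traefik.toml:/etc/traefik/traefik.toml:z"
    networks:
      - name: dockersocket4traefik_nw

With both containers on the dockersocket4traefik_nw network, traefik can reach the proxy under its container name, exactly as used in the endpoint above.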

Conclusion

This setup is surprisingly easy to get working. And it allows traefik to access the docker socket for the things it needs, without exposing critical permissions that could be used to take over the system. At the same time, full access to the docker socket is restricted to a non-public container, which makes it harder for attackers to exploit it.

If you have a simple container setup and use Ansible to start and stop the containers, I’ve written a role to get the above mentioned setup running.

[Howto] Automated DNS resolution for KVM/libvirt guests with a local domain [Update]

I often run demos on my laptop with the help of libvirt. Managing 20+ machines that way is annoying when you have no DNS resolution for those. Luckily, with libvirt and NetworkManager, that can be easily solved.

The problem

Imagine you want to test something in a demo setup with 5 machines. You create the necessary VMs in your local KVM/libvirt environment – but you cannot address them properly by name. With 5 machines you also need to write down the appropriate IP addresses – that’s hardly practical.

It is possible to create static entries in the libvirt network configuration – however, that is still very inflexible, difficult to automate and only works for name resolution inside the libvirt environment. When you want to ssh into a running VM from the host, you again have to look up the IP.

Name resolution in the host network would be possible by additionally adding each entry to /etc/hosts. But that would require managing two lists at the same time. Not automated, far from dynamic, and very cumbersome.

The solution

Luckily, there is an elegant solution: libvirt comes with its own built-in DNS server, dnsmasq. Configured properly, it can be used to serve DHCP and DNS to the guests, respecting a previously defined domain. Additionally, NetworkManager can be configured to use its own dnsmasq instance to resolve DNS entries – forwarding requests to the libvirt instance if needed.

That way, the only thing which has to be done is setting a proper host name inside the VMs. Everything else just works out of the box (with a recent Linux, see below).

The solution presented here is based on a great post by Dominic Cleal.

Configuring libvirt

First of all, libvirt needs to be configured. Given that the network “default” is assigned to the relevant VMs, the configuration should look like this:

$ sudo virsh net-dumpxml default
<network connections='1'>
  <name>default</name>
  <uuid>158880c3-9adb-4a44-ab51-d0bc1c18cddc</uuid>
  <forward mode='nat'>
    <nat>
      <port start='1024' end='65535'/>
    </nat>
  </forward>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:fa:cb:e5'/>
  <domain name='qxyz.de' localOnly='yes'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.128' end='192.168.122.254'/>
    </dhcp>
  </ip>
</network>

You can modify the network for example with the command virsh net-edit default. The interesting part is below the mac address: a local domain is defined and marked as localOnly. That domain will be the authoritative domain for the relevant VMs, and libvirt will configure dnsmasq to act as a resolver for that domain. The localOnly attribute makes sure that DNS requests regarding that domain will never be forwarded upstream. This is important to avoid forwarding loops.

Note, however: as mentioned in the comment by taurus, your domain should not be named “local” because this might cause trouble in relation to mDNS.

Configuring the VM guests

When the domain is set, the host names of the guests inside the VMs need to be defined. With recent Linux releases this is as simple as setting the host name:

$ sudo hostnamectl set-hostname neon.qxyz.de

There is no need to enter the host name anywhere else: the command above takes care of that. And the default DHCP client configuration of recent Linux releases sends this host name along with the DHCP request – dnsmasq automatically picks the host name up if the domain matches.

If you are on a Linux where the hostnamectl command does not work, or where the DHCP client does not send the host name with the request – switch to a recent version of Fedora or RHEL 😉

With such systems the host name must be set manually. To do so, follow the documentation of your OS. Just ensure that the resolution of the name works locally. Additionally, besides the host name itself, the DHCP configuration must be altered to send the host name along. For example, in older RHEL and Fedora versions the option

DHCP_HOSTNAME=neon.qxyz.de

has to be added to /etc/sysconfig/network-scripts/ifcfg-eth0.

At this point automatic name resolution between VMs should already work after a restart of libvirt.
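
If you wonder what such a restart can look like: restarting the default network is usually enough – note that this briefly disrupts the guests' network connections:

$ sudo virsh net-destroy default
$ sudo virsh net-start default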

Configuring NetworkManager

The last missing piece is the configuration of the actual KVM/libvirt host, so that the local domain, here qxyz.de, is properly resolved. Adding another name server to /etc/resolv.conf might work for a workstation with a fixed network connection, but certainly does not work for laptops which have changing network connections and DNS servers all the time. In such cases, NetworkManager is often used anyway, so we take advantage of its capabilities.

First of all, NetworkManager needs to start its own version of dnsmasq. That can be achieved with a simple configuration option:

$ cat /etc/NetworkManager/conf.d/localdns.conf 
[main]
dns=dnsmasq

This second dnsmasq instance just works out of the box. All DNS requests will automatically be forwarded to DNS servers acquired by NetworkManager via DHCP, for example. The only notable difference is the entry in /etc/resolv.conf:

# Generated by NetworkManager
search whatever
nameserver 127.0.0.1

As a second step, the NetworkManager dnsmasq instance needs to know that for all requests regarding qxyz.de the libvirt dnsmasq instance has to be queried. This can be achieved with another rather simple configuration option, given the domain and the IP from the libvirt network configuration at the top of this blog post:

$ cat /etc/NetworkManager/dnsmasq.d/libvirt_dnsmasq.conf 
server=/qxyz.de/192.168.122.1

And that’s it, already. Restart NetworkManager and everything should be working fine.
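
The restart and a quick test can look like this – neon.qxyz.de being the example guest from above:

$ sudo systemctl restart NetworkManager
$ ping -c 1 neon.qxyz.de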

As a side note: if the localOnly attribute had not been set in the libvirt network configuration, queries for unknown qxyz.de entries would be forwarded from the libvirt dnsmasq to the NetworkManager dnsmasq – which would again forward them to the libvirt dnsmasq, and so on. That would quickly overload your dnsmasq servers, resulting in error messages:

dnsmasq[15426]: Maximum number of concurrent DNS queries reached (max: 150)

Summary

With these rather few and simple changes a local domain is established for both guest and host, making it easy to resolve their names everywhere. There is no need to maintain one or even two lists of static IP entries, everything is done automatically.

For me this is a huge relief, making it much easier in the future to set up demo and test environments. Also, it looks much nicer during a demo if you have FQDNs and not IP addresses. I can only recommend this setup to everyone who often uses libvirt/KVM on a local machine for test/demo environments.

[Short Tip] Fix mount problems in RHV during GlusterFS mounts

When using Red Hat Virtualization or oVirt together with GlusterFS, there might be a strange error during the first creation of a storage domain:

Failed to add Storage Domain xyz.

One of the reasons that is rather easy to fix might be a permission problem: an initially exported Gluster file system belongs to the user root. However, the virtualization manager (ovirt-m or RHV-M, respectively) does not have root rights and thus needs a different ownership.

In such cases, the fix is to mount the exported volume and change the ownership to the RHV-M user (UID and GID 36):

$ sudo mount -t glusterfs 192.168.122.241:my-vol /mnt
# cd /mnt/
# chown -R 36.36 .
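
Once the ownership is changed, the temporary mount is not needed anymore and can be removed again – for example:

# cd /
# umount /mnt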

Afterwards, the volume can be mounted properly. Some more general details can be found in RH KB 78503.