[Howto] Using the new Podman API

Podman is a daemonless container engine to develop, run and manage OCI containers. In a recent version the API was rewritten and now offers a REST interface as well as a Docker-compatible endpoint.

In case you have never heard of Podman before, it is certainly worth a look. Besides offering a more secure drop-in replacement for many Docker functions, it can also manage pods and thus provides a container experience more aligned with what Kubernetes uses. It can even understand Kubernetes YAML (see podman-play-kube), easing the transition from single-host container development to fully fledged container management environments. Last but not least, it is among the tools supporting the newest features in the container space, like cgroups v2.

Background: Podman API

Of course Podman is not perfect – due to the focus on Kubernetes YAML there is no support for docker-compose files (though alternatives exist), networking and routing based on names is not as simple as with Docker (read more about Podman container networking) and, last but not least, the API was different – making it hard to migrate solutions that depend on the Docker API.

This changed: recently, a new API was merged:

The new API is a simpler implementation based on HTTP/REST. We provide two basic groups of endpoints. The first one is for libpod; the second is for Docker compatibility, to ease adoption. 

New API coming for Podman

So how can I access the new API and fool around with it?

If you are familiar with Podman, or read carefully, the first question is: where is this API running if Podman is daemonless? And in fact, an API service needs to be started explicitly:

$ podman system service --timeout 5000

This starts the API on a UNIX socket. Other options, like binding to a TCP socket or running without a timeout, are also possible; the documentation provides examples.
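For instance, binding to a local TCP port or disabling the timeout can look roughly like this – the exact URI syntax may differ between Podman versions, so check podman-system-service(1):

$ podman system service --timeout 0 tcp:localhost:8888
$ podman system service --timeout 0

Note that the TCP variant is neither authenticated nor encrypted, so it should only be used on localhost or a trusted network.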

How to use the Docker API endpoint

Let’s use the Docker API endpoint. To talk to a REST API over a UNIX socket, a recent curl (version >= 7.40, which introduced the --unix-socket option) is quite helpful:

$ curl --unix-socket /$XDG_RUNTIME_DIR/podman/podman.sock http://localhost/images/json
[{"Containers":1,"Created":1583300892,"Id":"8c2e0da7c436e45be5ebf2adf26b41d13939190bd186214a4d45c30485071f9f","Labels":{"license":"MIT","name":"fedora","vendor":"Fedora Project","version":"31"},"ParentId":...

Note that here we are talking to the rootless service, so the UNIX domain socket is located in the user runtime directory. Also, localhost has to be provided in the URL with recent curl versions, otherwise curl does not output anything!

The answer is a JSON listing, which is not easy to read. Pretty-print it with the help of Python (and silence curl’s progress output with the -s flag):

$ curl -s --unix-socket /$XDG_RUNTIME_DIR/podman/podman.sock http://localhost/containers/json|python -m json.tool
[
    {
        "Id": "4829e030ab1beb83db07dbc5e51481cb66562f57b79dd9eb3069dfcde91019ed",
        "Names": [
            "/87faf76aea6a-infra"
...

So what can you do with the API? Podman tries to recreate most of the Docker API, so you can basically use the Docker API documentation to see what should be possible. Note though that not all API endpoints are supported, since Podman does not provide all the functions Docker offers.
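If you want to consume the Docker-compatible endpoint from a program instead of curl, any HTTP client that can talk to a UNIX domain socket works. Here is a minimal Python 3 sketch using only the standard library; it assumes the rootless service started above is running and queries the same /images/json endpoint:

import http.client
import json
import os
import socket

class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection variant that connects to a UNIX domain socket."""
    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self.socket_path)
        self.sock = sock

# rootless service: the socket lives in the user runtime directory
socket_path = os.path.join(os.environ["XDG_RUNTIME_DIR"], "podman", "podman.sock")

conn = UnixHTTPConnection(socket_path)
conn.request("GET", "/images/json")  # Docker-compatible endpoint
images = json.loads(conn.getresponse().read())
for image in images:
    print(image["Id"][:12], image.get("RepoTags"))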

How to use the Podman API endpoint

As mentioned, the API provides two groups of endpoints: the Docker-compatible one, and a Podman-specific one. This second API is necessary for multiple reasons: first, Podman has functions which are alien to Docker and thus not part of the Docker API – pods being the most notable example. Another reason is that an independent API enables the Podman developers to innovate at their own pace, and to change the API when needed or wanted.

The API for Podman can be reached via curl as mentioned above. However, there are two notable differences: first, the Podman endpoint is marked via an additional “libpod” string in the API URI, and second, the Podman API is always versioned. To list the images as shown above, but via Podman’s own API, the following call is necessary:

$ curl -s --unix-socket /$XDG_RUNTIME_DIR/podman/podman.sock http://localhost/v1.24/libpod/images/json
[{"Id":"8c2e0da7c436e45be5ebf2adf26b41d13939190bd186214a4d45c30485071f9f","RepoTags":["registry.fedoraproject.org/fedora:latest"],"Created":1583300892,"Size":199632198,"Labels":{"license":"MIT","name":"fedora","vendor":"Fedora ...

For pods, the endpoint is for example /pods instead of /images:

$ curl -s --unix-socket /$XDG_RUNTIME_DIR/podman/podman.sock http://localhost/v1.24/libpod/pods/json|python -m json.tool
[
    {
        "Cgroup": "user.slice",
        "Containers": [
            {
                "Id": "1510dca23d2d15ae8be1eeadcdbfb660cbf818a69d5780705cd6535d97a4a578",
                "Names": "wonderful_ardinghelli",
                "Status": "running"
            },
            {
                "Id": "6c05c20a42e6987ac9f78b277a9d9152ab37dd05e3bfd5ec9e675979eb93bf0e",
                "Names": "eff81a37b4b8-infra",
                "Status": "running"
            }
        ],
        "Created": "2020-04-19T21:45:17.838549003+02:00",
        "Id": "eff81a37b4b85e92916613239001cddc2ba42f3595236586f7462492be0ac5fc",
        "InfraId": "6c05c20a42e6987ac9f78b277a9d9152ab37dd05e3bfd5ec9e675979eb93bf0e",
        "Name": "testme",
        "Namespace": "",
        "Status": "Running"
    }
]

Currently there is no documentation of the Podman API available – or at least none at the level of the current Docker API documentation. But hopefully that will change soon.

Takeaways

Podman providing a Docker-compatible API is a great step for people who depend on the Docker API but nevertheless want to switch to Podman. And providing a dedicated, simple-to-consume REST API for Podman itself is equally great, because it makes it easy to integrate Podman into existing tools and frameworks.

Just don’t forget that the API is still in development!

[Howto] Using toolbox in Fedora / RHEL 8 for easy management of CLI tools

Running CLI tools like Ansible often requires a specific environment with dependencies on the core operating system’s libraries. That makes it hard to run different versions in parallel – or to test the newest updates. And it might clutter the OS. Toolbox offers simple container management to avoid these shortcomings.

The recent development of Linux distributions has seen a shift away from all-purpose distributions towards stable core distributions with a limited package set and additional sandboxed tooling running on top to manage applications. One of the most advanced distributions here is certainly Fedora Silverblue, but even the enterprise distribution Red Hat Enterprise Linux 8 brings a lot of changes which aim in the right direction. Technologies in this context are, for example, rpm-ostree for the management of immutable OS images and Flatpak for the management of GUI applications. Additionally, RHEL 8 comes along with so-called application streams – and of course there is always the option of using containers with, for example, Podman.

In this blog post I want to focus on the last one: using containers to manage your CLI tools, thus keeping them independent of your operating system packaging and libraries. With Fedora and RHEL, there is tooling provided which makes this even easier: Toolbox.

The rationale

The basic idea behind using containers, and especially Toolbox, is similar to the one behind Flatpak: it solves many of the problems of classic Linux packaging. This means essentially:

  • Independence from OS libraries and their versions
  • Sand-boxing, meaning better protection of the OS
  • Multi-version support
  • Less OS clutter through isolated installation of dependencies
  • Easy to recreate environments (think of “works on my machine”)
  • Immutable environments possible

Think of it that way: with complex applications, behavior sometimes depends on certain versions of some libraries. When those are managed by the OS packaging system, it is hard to keep them up to date or just at the same version across multiple machines, let alone across multiple distributions. Also, I don’t want my OS to be cluttered with weird dependencies which I might not even trust, just to satisfy one application’s requirements. And I might want to install different versions of a tool to test them – with different libraries as well – which is often impossible with OS package management.

Toolbox

In comes Toolbox:

Toolbox is a tool that offers a familiar package based environment for developing and debugging software that runs fully unprivileged using Podman.

The toolbox container is a fully mutable container; when you see yum install ansible for example, that’s something you can do inside your toolbox container, without affecting the base operating system.

Toolbox on Github

While Toolbox is particularly interesting for immutable systems like Fedora Silverblue, it also makes sense on other distributions. I started using it on my regular Fedora, for example, just to have certain tools available in specific versions for tests.

And why use Toolbox, and not just the usual container tools? Toolbox takes care of volume mounting and all the other necessary bits of container management, and lets you use a very basic set of commands to create – and reuse – your tool containers. It is simpler than typing fully fledged podman or docker commands all the time.

You can read more about Toolbox in the Fedora Silverblue Toolbox docs or the Red Hat Enterprise Linux 8 Toolbox docs.

Getting started

It is very easy to get started with Toolbox. First, it needs to be installed on the system. For example, on Fedora 31, this can be done via:

$ sudo dnf install toolbox

After that, you are good to go. Since the idea is to have re-usable containers, let’s create the first one. In my example I want a container with the newest Ansible version to run some automation, so we create a new container called ansible:

$ toolbox create --container ansible
Image required to create toolbox container.
Download registry.fedoraproject.org/f31/fedora-toolbox:31 (500MB)? [y/N]: y
Created container: ansible

As you see, a base image for my distribution was downloaded, and the container created. Next, let’s access it and look around:

$ toolbox enter --container ansible

Welcome to the Toolbox; a container where you can install and run
all your tools.

 - Use DNF in the usual manner to install command line tools.
 - To create a new tools container, run 'toolbox create'.

For more information, see the documentation.

⬢[liquidat@toolbox ~]$

We are greeted with a short message and then dropped to a shell. Note the bubble at the start of the command prompt – a nice touch to show whether you are inside a toolbox or not. Next, let’s look at our environment:

⬢[liquidat@toolbox ~]$ pwd
/home/liquidat
⬢[liquidat@toolbox ~]$ ls
bin  development  documents  downloads  ...
⬢[liquidat@toolbox ~]$ ls /
README.md  bin  boot  dev  etc  home  lib  lib64  lost+found  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
⬢[liquidat@toolbox ~]$ cat /README.md 
# Toolbox — Unprivileged development environment

[Toolbox](https://github.com/debarshiray/toolbox) is a tool that offers a
[...]

As you see, the toolbox has access to the file system: we can use the tools just like normal shell tools and interact with the files in our environment. At the same time, however, we only have limited access to the root file system, since we see the container’s root file system (as identified by the README), not the host’s.

Getting my first tool ready

As mentioned I’d like to have a container with the newest Ansible. Let’s install it:

⬢[liquidat@toolbox ~]$ pip install --user ansible
Collecting ansible
Using cached https://files.pythonhosted.org/packages/ae/b7/c717363f767f7af33d90af9458d5f1e0960db9c2393a6c221c2ce97ad1aa/ansible-2.9.6.tar.gz
Collecting jinja2 (from ansible)
[...]
Running setup.py install for ansible … done
Successfully installed MarkupSafe-1.1.1 PyYAML-5.3 ansible-2.9.6 cffi-1.14.0 cryptography-2.8 jinja2-2.11.1 pycparser-2.20
⬢[liquidat@toolbox ~]$ ansible --version
ansible 2.9.6
config file = /home/liquidat/.ansible.cfg
configured module search path = ['/home/liquidat/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /home/liquidat/.local/lib/python3.7/site-packages/ansible
executable location = /home/liquidat/.local/bin/ansible
python version = 3.7.6 (default, Jan 30 2020, 09:44:41) [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)]

As you see, Ansible was properly installed. And with this we are already done – we have our first tool ready, named “ansible”.

Using our tool

Now let’s assume I use the container for some things, exit it – and want to reuse it later on. This is no problem at all, since that is exactly what Toolbox was built for. And we have a name, which makes it fairly easy to remember how to access it. But even if we do not remember the name, we can easily list all available tools:

$ toolbox list
IMAGE ID      IMAGE NAME                                        CREATED
64e68e194389  registry.fedoraproject.org/f31/fedora-toolbox:31  2 weeks ago

CONTAINER ID  CONTAINER NAME  CREATED         STATUS             IMAGE NAME
8ec117845e06  ansible         47 minutes ago  Up 47 minutes ago  registry.fedoraproject.org/f31/fedora-toolbox:31
$ toolbox enter -c ansible
⬢[liquidat@toolbox ~]$ ansible --version
ansible 2.9.6
  config file = /home/liquidat/.ansible.cfg
  configured module search path = ['/home/liquidat/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/liquidat/.local/lib/python3.7/site-packages/ansible
  executable location = /home/liquidat/.local/bin/ansible
  python version = 3.7.6 (default, Jan 30 2020, 09:44:41) [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)]

As you see, the container is in the same state we left it in: Ansible is still properly installed and ready to be used. And we can do this with all kinds of other tools: another version of Ansible, or even some daemon we want to experiment with. It can all be easily installed, run and re-used, without worrying about cluttering the OS, having the wrong library versions installed, or being unable to update some library because of a system dependency.
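By the way, if you only need a single command from such a container, you do not even have to enter it interactively – toolbox can run a command in an existing container directly. A short sketch (site.yml is just a hypothetical playbook here):

$ toolbox run --container ansible ansible --version
$ toolbox run --container ansible ansible-playbook site.yml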

Summary

Toolbox is an interesting approach to simplifying container management for fooling around with CLI-based tools. If you have an immutable environment like Fedora Silverblue, it might become a crucial piece of your daily operations, since installing additional packages on top of Silverblue’s ostree infrastructure is a pain. But even for “normal” distributions it is worth a try!

Of debugging Ansible Tower and underlying cloud images

Recently I was experimenting with Tower’s isolated nodes feature – but somehow it did not work in my environment. Debugging told me a lot about Ansible Tower – and also why you should not trust arbitrary cloud images.

Background – Isolated Nodes

Ansible Tower has a nice feature called “isolated nodes”. Those are dedicated Tower instances which can manage nodes in separated environments – basically an Ansible Tower Proxy.

An Isolated Node is an Ansible Tower node that contains a small piece of software for running playbooks locally to manage a set of infrastructure. It can be deployed behind a firewall/VPC or in a remote datacenter, with only SSH access available. When a job is run that targets things managed by the isolated node, the job and its environment will be pushed to the isolated node over SSH, where it will run as normal.

Ansible Tower Feature Spotlight: Instance Groups and Isolated Nodes

Isolated nodes are especially handy when you set up your automation in security-sensitive environments. Think of DMZs here, of network separation and so on.

I was fooling around with a clustered Tower installation on RHEL 7 VMs in a cloud environment when I ran into trouble, though.

My problem – Isolated node unavailable

Isolated nodes – like instance groups – have a status inside Tower: if things are problematic, they are marked as unavailable. And this is what happened with my instance isonode.remote.example.com running in my lab environment:

Ansible Tower showing an instance node as unavailable

I tried to turn it “off” and “on” again with the button in the control interface. That made the node available, and it was even able to execute jobs – but it became unavailable again soon after.

Analysis

So what happened? The Tower logs showed a Python error:

# tail -f /var/log/tower/tower.log
fatal: [isonode.remote.example.com]: FAILED! => {"changed": false,
"module_stderr": "Shared connection to isonode.remote.example.com
closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n
File \"/var/lib/awx/.ansible/tmp/ansible-tmp-1552400585.04
-60203645751230/AnsiballZ_awx_capacity.py\", line 113, in <module>\r\n
_ansiballz_main()\r\n  File \"/var/lib/awx/.ansible/tmp/ansible-tmp
-1552400585.04-60203645751230/AnsiballZ_awx_capacity.py\", line 105, in
_ansiballz_main\r\n    invoke_module(zipped_mod, temp_path,
ANSIBALLZ_PARAMS)\r\n  File \"/var/lib/awx/.ansible/tmp/ansible-tmp
-1552400585.04-60203645751230/AnsiballZ_awx_capacity.py\", line 48, in
invoke_module\r\n    imp.load_module('__main__', mod, module, MOD_DESC)\r\n
File \"/tmp/ansible_awx_capacity_payload_6p5kHp/__main__.py\", line 74, in
<module>\r\n  File \"/tmp/ansible_awx_capacity_payload_6p5kHp/__main__.py\",
line 60, in main\r\n  File
\"/tmp/ansible_awx_capacity_payload_6p5kHp/__main__.py\", line 27, in
get_cpu_capacity\r\nAttributeError: 'module' object has no attribute
'cpu_count'\r\n", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact
error", "rc": 1}

PLAY RECAP *********************************************************************
isonode.remote.example.com : ok=0    changed=0    unreachable=0    failed=1  

Apparently a Python function was missing. If we check the code, we see that indeed in line 27 of the file awx_capacity.py the function psutil.cpu_count() is called:

def get_cpu_capacity():
    env_forkcpu = os.getenv('SYSTEM_TASK_FORKS_CPU', None)
    cpu = psutil.cpu_count()

Support for this function was added in version 2.0 of psutil:

2014-03-10
Enhancements
424: [Windows] installer for Python 3.X 64 bit.
427: number of logical and physical CPUs (psutil.cpu_count()).

psutil history

Note the date here: 2014-03-10 – pretty old! I checked the version of the installed package, and indeed it was pre-2.0:

$ rpm -q --queryformat '%{VERSION}\n' python-psutil
1.2.1

To be really sure, and also to rule out any weird function backporting, I checked for the function directly on the Tower machine:

# python
Python 2.7.5 (default, Sep 12 2018, 05:31:16) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import inspect
>>> import psutil as module
>>> functions = inspect.getmembers(module, inspect.isfunction)
>>> functions
[('_assert_pid_not_reused', <function _assert_pid_not_reused at
0x7f9eb10a8d70>), ('_deprecated', <function deprecated at 0x7f9eb38ec320>),
('_wraps', <function wraps at 0x7f9eb414f848>), ('avail_phymem', <function
avail_phymem at 0x7f9eb0c32ed8>), ('avail_virtmem', <function avail_virtmem at
0x7f9eb0c36398>), ('cached_phymem', <function cached_phymem at
0x7f9eb10a86e0>), ('cpu_percent', <function cpu_percent at 0x7f9eb0c32320>),
('cpu_times', <function cpu_times at 0x7f9eb0c322a8>), ('cpu_times_percent',
<function cpu_times_percent at 0x7f9eb0c326e0>), ('disk_io_counters',
<function disk_io_counters at 0x7f9eb0c32938>), ('disk_partitions', <function
disk_partitions at 0x7f9eb0c328c0>), ('disk_usage', <function disk_usage at
0x7f9eb0c32848>), ('get_boot_time', <function get_boot_time at
0x7f9eb0c32a28>), ('get_pid_list', <function get_pid_list at 0x7f9eb0c4b410>),
('get_process_list', <function get_process_list at 0x7f9eb0c32c08>),
('get_users', <function get_users at 0x7f9eb0c32aa0>), ('namedtuple',
<function namedtuple at 0x7f9ebc84df50>), ('net_io_counters', <function
net_io_counters at 0x7f9eb0c329b0>), ('network_io_counters', <function
network_io_counters at 0x7f9eb0c36500>), ('phymem_buffers', <function
phymem_buffers at 0x7f9eb10a8848>), ('phymem_usage', <function phymem_usage at
0x7f9eb0c32cf8>), ('pid_exists', <function pid_exists at 0x7f9eb0c32140>),
('process_iter', <function process_iter at 0x7f9eb0c321b8>), ('swap_memory',
<function swap_memory at 0x7f9eb0c327d0>), ('test', <function test at
0x7f9eb0c32b18>), ('total_virtmem', <function total_virtmem at
0x7f9eb0c361b8>), ('used_phymem', <function used_phymem at 0x7f9eb0c36050>),
('used_virtmem', <function used_virtmem at 0x7f9eb0c362a8>), ('virtmem_usage',
<function virtmem_usage at 0x7f9eb0c32de8>), ('virtual_memory', <function
virtual_memory at 0x7f9eb0c32758>), ('wait_procs', <function wait_procs at
0x7f9eb0c32230>)]
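By the way, the same check can be done more compactly by asking psutil for its version and for the attribute directly – a small convenience sketch, not what I used back then:

>>> import psutil
>>> psutil.__version__
'1.2.1'
>>> hasattr(psutil, 'cpu_count')
False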

Searching for a package origin

So how to solve this issue? My first idea was to get this working by porting the code in question to the multiprocessing library:

# python
Python 2.7.5 (default, Sep 12 2018, 05:31:16) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import multiprocessing
>>> cpu = multiprocessing.cpu_count()
>>> cpu
4

But while I was filing a bug report I wondered why RHEL shipped such an ancient library. After all, RHEL 7 was released in June 2014, and psutil had offered cpu_count since early 2014! And indeed, a quick search via the Red Hat package search showed a weird result: python-psutil was never part of base RHEL 7! It was only shipped as part of some very, very old OpenStack channels:

access.redhat.com package search, results for python-psutil

Newer OpenStack channels in fact come along with newer versions of python-psutil.

So how did this outdated package end up on this RHEL 7 image? Why was it never updated?

The cloud image is to blame! The package was installed on it – most likely during the creation of the image: python-psutil is needed for OpenStack Heat, so I assume that these RHEL 7 images were once created for OpenStack and then used as the default image in this demo environment.

And after the initial creation of the image the Heat packages were forgotten. In the meantime the image was updated to newer RHEL versions, snapshots were created as new defaults and so on. But since the package in question was never part of the main RHEL repos, it was never changed or removed. It just stayed there. Waiting, apparently, for me 😉

Conclusion

This issue showed me how tricky cloud images can be. Think about your own cloud images: have you really checked all of them and verified that no package, no start-up script, no configuration was changed from the Linux distribution vendor’s base setup?

With RPMs this is still manageable: you can track whether packages are installed which are not present in the available channels. But did someone install something with pip? Or in any other way?
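A rough first check on such an image is to compare what pip knows about with what RPM actually owns – a hedged sketch, and the site-packages path depends on the Python version in use:

$ pip freeze
$ rpm -qf /usr/lib/python2.7/site-packages/psutil/__init__.py

Files that are not owned by any package are a good hint that something was installed outside of the package manager.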

Take my case: an outdated version of a library was called instead of a much, much more recent one. If there had been a serious security issue with that library in the meantime, I would have been exposed, even though my update management did not report any library as needing an update.

I learned my lesson to be more critical with cloud images and to check them in more detail in the future, to avoid nasty surprises in production. And I can only recommend that you do the same.

[Howto] Adopting Ansible Galaxy roles for Solaris

It is pretty easy to manage Solaris with Ansible. However, the Ansible roles available at Ansible Galaxy usually target Linux-based operating systems only. Luckily, adopting them is rather simple.

Background

As mentioned earlier, Solaris machines can be managed via Ansible pretty well: it works out of the box, and many existing modules are incredibly helpful in managing Solaris installations.

At the same time, the Ansible Best Practices guide strongly recommends using roles to organize your IT with Ansible. Many roles are already available on Ansible Galaxy, ready to be used by any admin in need. Ansible Galaxy is a central repository for various roles written by the community.

However, Ansible Galaxy only recently added support for Solaris. There are currently hardly any roles with Solaris platform support available.

Luckily expanding existing Ansible roles towards Solaris is not that hard.

Example: Apache role

For example, the Apache role from geerlingguy is one of the highest rated roles on Ansible Galaxy. It installs Apache, starts the service, has support for vhosts and custom ports, and is above all pretty well documented. Yet there is no Solaris support right now… although geerlingguy just accepted a pull request, so it won’t be long until the new version surfaces on Ansible Galaxy.

The best way to adopt a given role for another OS is to extend the current role for an additional OS – in contrast to deleting the original OS support and replacing it with new, again OS-specific configuration. This keeps the role re-usable on other operating systems and enables the community to maintain and improve a shared, common role.

With a bit of knowledge about how services are started and stopped on Linux as well as on Solaris, one major difference quickly comes up: on Linux, the name of the controlled service is usually exactly the same as the name of the binary behind the service. The same string is also part of the path to the program’s files under /usr and, for example, to the configuration files. On Solaris that is often not the case!

So the best approach is to check whether the given role starts or stops the service at any point, whether a variable is used there, and whether this variable is also used elsewhere – for example to create a path name or to identify a binary. A quick grep over the role, as sketched below, helps with that.
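For example (run from the role’s top-level directory; the exact directory layout may differ per role):

$ grep -rn 'apache_daemon' tasks/ vars/ handlers/ templates/
$ grep -rn 'service:' tasks/ handlers/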

The given example indeed controls a service. Thus we add another variable, the service name:

tasks/main.yml
@@ -41,6 +41,6 @@
 - name: Ensure Apache has selected state and enabled on boot.
   service:
-    name: "{{ apache_daemon }}"
+    name: "{{ apache_service }}"
     state: "{{ apache_state }}"
     enabled: yes

Next, we need to add the new variable to the existing OS support:

vars/Debian.yml
@@ -1,4 +1,5 @@
 ---
+apache_service: apache2
 apache_daemon: apache2
 apache_daemon_path: /usr/sbin/
 apache_server_root: /etc/apache2
vars/RedHat.yml
@@ -1,4 +1,5 @@
 ---
+apache_service: httpd
 apache_daemon: httpd
 apache_daemon_path: /usr/sbin/
 apache_server_root: /etc/httpd

Now would be a good time to test the role – it should still work on the already supported platforms.

The next step is to add the necessary variables for Solaris. The best way is to copy an already existing variable file and to modify it afterwards to fit Solaris:

vars/Solaris.yml
@@ -0,0 +1,19 @@
+---
+apache_service: apache24
+apache_daemon: httpd
+apache_daemon_path: /usr/apache2/2.4/bin/
+apache_server_root: /etc/apache2/2.4/
+apache_conf_path: /etc/apache2/2.4/conf.d
+
+apache_vhosts_version: "2.2"
+
+__apache_packages:
+  - web/server/apache-24
+  - web/server/apache-24/module/apache-ssl
+  - web/server/apache-24/module/apache-security
+
+apache_ports_configuration_items:
+  - regexp: "^Listen "
+    line: "Listen {{ apache_listen_port }}"
+  - regexp: "^#?NameVirtualHost "
+    line: "NameVirtualHost *:{{ apache_listen_port }}"

This specific role provides two task files to set up and configure each supported platform. The easiest way to create these two files for a new platform is again to copy existing ones and to modify them afterwards according to the specifics of Solaris.

The configuration looks like:

tasks/configure-Solaris.yml
@@ -0,0 +1,19 @@
+---
+- name: Configure Apache.
+  lineinfile:
+    dest: "{{ apache_server_root }}/conf/{{ apache_daemon }}.conf"
+    regexp: "{{ item.regexp }}"
+    line: "{{ item.line }}"
+    state: present
+  with_items: apache_ports_configuration_items
+  notify: restart apache
+
+- name: Add apache vhosts configuration.
+  template:
+    src: "vhosts-{{ apache_vhosts_version }}.conf.j2"
+    dest: "{{ apache_conf_path }}/{{ apache_vhosts_filename }}"
+    owner: root
+    group: root
+    mode: 0644
+  notify: restart apache
+  when: apache_create_vhosts

The setup task file can thus look like:

tasks/setup-Solaris.yml
@@ -0,0 +1,6 @@
+---
+- name: Ensure Apache is installed.
+  pkg5:
+    name: "{{ item }}"
+    state: installed
+  with_items: apache_packages

Last but not least, the platform support must be activated in the tasks/main.yml file:

tasks/main.yml
@@ -15,6 +15,9 @@
 - include: setup-Debian.yml
   when: ansible_os_family == 'Debian'
 
+- include: setup-Solaris.yml
+  when: ansible_os_family == 'Solaris'
+
 # Figure out what version of Apache is installed.
 - name: Get installed version of Apache.
   shell: "{{ apache_daemon_path }}{{ apache_daemon }} -v"

When you now run the role on a Solaris machine, it should install Apache right away.

Conclusion

Adopting a given role from Ansible Galaxy for Solaris is rather easy – if the given role is already prepared for multi-OS support. In such cases adding another OS is a trivial task.

If the role is not prepared for multi-OS support, try to get in contact with the developers; they often appreciate feedback and pull requests adding multi-OS support.

[Howto] Managing Solaris 11 via Ansible

Ansible can be used to manage various kinds of server operating systems – among them Solaris 11.

Managing Solaris 11 servers via Ansible from my Fedora machine is actually less exciting than previously thought. Since the number of blog articles covering that is limited, I thought it might be a nice challenge.

However, the opposite is the case: it just works. On a fresh Solaris installation, out of the box. There is not even a need for additional configuration or additional software. Of course, SSH access must be available – but the same is true for Linux machines as well. It’s almost boring 😉

Here is an example of installing and removing software on Solaris 11, using IPS, the new packaging system introduced in Solaris 11:

$ ansible solaris -s -m pkg5 -a "name=web/server/apache-24"
$ ansible solaris -s -m pkg5 -a "state=absent name=/text/patchutils"

While Ansible uses a special module, pkg5, to manage Solaris packages, managing services is even easier, because the usual service module works for Linux as well as Solaris machines:

$ ansible solaris -s -m service -a "name=apache24 state=started"
$ ansible solaris -s -m service -a "name=apache24 state=stopped"

So far so good – of course things get really interesting when playbooks perform tasks on Solaris and Linux machines at the same time. For example, imagine Apache needs to be deployed and started on Linux as well as on Solaris. Here conditionals come in handy:

---
- name: install and start Apache
  hosts: clients
  vars_files:
    - "vars/{{ ansible_os_family }}.yml"
  sudo: yes

  tasks:
    - name: install Apache on Solaris
      pkg5: name=web/server/apache-24
      when: ansible_os_family == "Solaris"

    - name: install Apache on RHEL
      yum:  name=httpd
      when: ansible_os_family == "RedHat"

    - name: start Apache
      service: name={{ apache }} state=started

Since the service name is not the same on different operating systems (or even on different Linux distributions), it is kept in a variable defined in an OS-family-specific YAML file, as sketched below.
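The two variable files loaded by the playbook could look like this – a minimal sketch based on the service names used above:

vars/RedHat.yml
---
apache: httpd

vars/Solaris.yml
---
apache: apache24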

It’s also interesting to note that the same Ansible module behaves differently on different operating systems: when a service is told to stop but is not even available – because the corresponding package, and thus the service definition, is not installed – the return code on Linux is OK, while Solaris returns an error:

TASK: [stop Apache on Solaris] ************************************************
failed: [argon] => {"failed": true}
msg: svcs: Pattern 'apache24' doesn't match any instances

FATAL: all hosts have already failed -- aborting

It would be nice to catch the error; however, as far as I know, error handling in Ansible can only specify when to fail, not which messages or errors should be ignored.
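For completeness: more recent Ansible versions get reasonably close to this with failed_when on a registered result – a hedged sketch, not something I verified on the Solaris setup described here:

- name: stop Apache on Solaris
  service: name=apache24 state=stopped
  register: apache_stop
  # only fail if the error is something other than the "unknown service" pattern
  failed_when: apache_stop is failed and 'match any instances' not in apache_stop.msg | default('')
  when: ansible_os_family == "Solaris"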

But besides this problem, managing Solaris via Ansible works smoothly for me. And it even works with Ansible Tower, of course:

Tower-Ansible-Solaris.png

I haven’t tried to install Ansible on Solaris itself, but since packages are available that shouldn’t be much of an issue.

So in case you have a mixed environment including Solaris and Linux machines (Red Hat, Fedora, Ubuntu, Debian, Suse, you name it), I can only recommend starting to use Ansible as soon as possible. It simply works and can substantially ease the pain of day-to-day tasks.