Insights into Ansible: environments of executed playbooks

Usually, when Ansible Tower executes a playbook, everything works just as on the command line. However, in some corner cases the behavior differs, because Ansible Tower runs its playbooks in a specific environment.

Different playbook results in Tower vs CLI

Ansible is a great tool for automation, and Ansible Tower enhances these capabilities by adding centralization, a UI, role based access control and a REST API. To take advantage of Tower, just import your playbooks and press start – it just works.

At least most of the time: lately I was playing around with the Google Compute Engine, GCE. Ansible provides several GCE modules, so writing playbooks to control the setup was pretty easy. But while the GCE-related playbooks worked on the plain command line, they failed in Tower:

PLAY [create node on GCE] ******************************************************

TASK [launch instance] *********************************************************
task path: /var/lib/awx/projects/_43__gitolite_gce_node_tower_pem_file/gce-node.yml:13
An exception occurred during task execution. The full traceback is:
Traceback (most recent call last):
  File "/var/lib/awx/.ansible/tmp/ansible-tmp-1461919385.95-6521356859698/gce", line 2573, in <module>
    main()
  File "/var/lib/awx/.ansible/tmp/ansible-tmp-1461919385.95-6521356859698/gce", line 506, in main
    module, gce, inames)
  File "/var/lib/awx/.ansible/tmp/ansible-tmp-1461919385.95-6521356859698/gce", line 359, in create_instances
    external_ip=external_ip, ex_disk_auto_delete=disk_auto_delete, ex_service_accounts=ex_sa_perms)
TypeError: create_node() got an unexpected keyword argument 'ex_can_ip_forward'

fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "invocation": {"module_name": "gce"}, "parsed": false}

NO MORE HOSTS LEFT *************************************************************
	to retry, use: --limit @gce-node.retry

PLAY RECAP *********************************************************************
localhost                  : ok=0    changed=0    unreachable=0    failed=1

To me that didn’t make sense at all: the exact same playbook ran fine on the command line. How could it fail in Tower when Tower is only a UI to Ansible itself?

Environment variables during playbook runs

The answer is that Tower runs playbooks with a specific set of environment variables. For example, the GCE login credentials are provided to the playbook, and thus to the modules, via environment variables:

GCE_EMAIL
GCE_PROJECT
GCE_PEM_FILE_PATH

That means, if you want to debug a playbook and want to provide the login credentials just the way Tower does, the shell command has to be:

GCE_EMAIL=myuser@myproject.iam.gserviceaccount.com GCE_PROJECT=myproject GCE_PEM_FILE_PATH=/tmp/mykey.pem ansible-playbook myplaybook.yml

The error at hand was caused by an environment variable as well, though a different one: PYTHONPATH. Tower comes with a set of Python libraries needed for Ansible, among them some that are required only by specific modules. In this case, the GCE modules require Apache libcloud, which is installed with the Ansible Tower bundle. The libraries are installed at /usr/lib/python2.7/site-packages/awx/lib/site-packages, which is not a typical Python path.

For that reason, each playbook is run from within Tower with the environment variable PYTHONPATH="/usr/lib/python2.7/site-packages/awx/lib/site-packages:". Thus, to run a playbook just the same way it is run from within Tower, the shell command needs to be:

PYTHONPATH="/usr/lib/python2.7/site-packages/awx/lib/site-packages:" ansible-playbook myplaybook.yml

This way the GCE error shown above could be reproduced on the command line. So the environment provided by Tower was the problem, while the environment of plain Ansible (and thus plain Python) caused no errors. Tower bundles the library because it cannot be expected to be available elsewhere, for example in the RHEL default repositories.
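When debugging, it can be handy to assemble the Tower-like environment in one place instead of typing it on every call. The following is a minimal sketch under assumptions: the function names tower_like_env and playbook_command are made up for illustration, the variable names and the path are the ones quoted above, and nothing here is Tower's own code.

```python
import os

# Path as quoted in the article; it may differ between Tower versions.
TOWER_SITE_PACKAGES = "/usr/lib/python2.7/site-packages/awx/lib/site-packages:"


def tower_like_env(email, project, pem_file, base_env=None):
    """Return a copy of the environment plus the variables described above."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "GCE_EMAIL": email,
        "GCE_PROJECT": project,
        "GCE_PEM_FILE_PATH": pem_file,
        "PYTHONPATH": TOWER_SITE_PACKAGES,
    })
    return env


def playbook_command(playbook):
    """Build the argument list for ansible-playbook; nothing is executed here."""
    return ["ansible-playbook", playbook]
```

The two results can then be handed to subprocess.call(playbook_command("myplaybook.yml"), env=tower_like_env(...)) on a machine where ansible-playbook is installed.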

The root cause is that right now Tower still ships with an older version of the libcloud library, which is not fully compatible with GCE anymore (GCE is a fast-moving target). If you run Ansible on the command line, you most likely install libcloud via pip or RPM, which in most cases provides a pretty current version.
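To see why an older bundled library can shadow a newer one, it helps to watch how the order of PYTHONPATH entries decides which copy gets imported. The following is a minimal sketch, not Tower code: fakecloud is a made-up stand-in for libcloud, and the two temporary directories play the roles of the bundled and the system-wide site-packages directories.

```python
import os
import subprocess
import sys
import tempfile


def write_fake_library(directory, version):
    """Create a tiny stand-in module named fakecloud that only carries a version."""
    with open(os.path.join(directory, "fakecloud.py"), "w") as handle:
        handle.write("__version__ = %r\n" % version)


def imported_version(pythonpath):
    """Start a fresh interpreter with the given PYTHONPATH and report what it imports."""
    env = dict(os.environ, PYTHONPATH=pythonpath)
    result = subprocess.run(
        [sys.executable, "-c", "import fakecloud; print(fakecloud.__version__)"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def demo():
    """Return the versions seen with the bundled directory first, then second."""
    with tempfile.TemporaryDirectory() as bundled, \
            tempfile.TemporaryDirectory() as system:
        write_fake_library(bundled, "0.1-old")
        write_fake_library(system, "1.0-new")
        bundled_first = imported_version(bundled + os.pathsep + system)
        system_first = imported_version(system + os.pathsep + bundled)
    return bundled_first, system_first
```

Whichever directory comes first in PYTHONPATH wins, which is why a task-level PYTHONPATH pointing at a newer installation changes which library a module picks up.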

Workaround for Tower

While upgrading the library makes sense in the mid term, a short term workaround is needed as well. The best way is to first install a recent version of libcloud, then identify the task that actually fails, and point that exact task to the new library.

In case of RHEL, enable the EPEL repository, install python-libcloud, and then set PYTHONPATH: "/usr/lib/python2.7/site-packages" for that task via the environment option:

- name: launch instance
  gce:
    name: "{{ node_name }}"
    zone: europe-west1-c
    machine_type: "{{ machine_type }}"
    image: "{{ image }}"
  environment:
    PYTHONPATH: "/usr/lib/python2.7/site-packages"

[Short Tip] Query all registered repositories in Red Hat Satellite

The idea of RESTful APIs is pretty appealing: using the basic components of the WWW as APIs to bring together services. Operations like HTTP GET and POST, base URIs and media types like JSON are supported almost everywhere, simply because the web is supported almost everywhere. As a result, it is pretty easy to provide REST-enabled servers, services and clients with a few clicks and calls. For this reason the API of Red Hat Satellite – and most of the other Red Hat products – is built as a REST API.

I’ve already written an article about how to access the Satellite REST API via Ansible. Today I came across a rather handy example: sometimes you need to know the URLs of the repositories provided by Satellite. This can of course be queried via the API. But in contrast to my old article, we do not query the Foreman part of the API ($SATELLITE_URL/api/) but the Katello part: /katello/api/.

All repositories can be shown via the URL /katello/api/repositories?organization_id=1. To query URLs on the command line I recommend Ansible:

$ ansible localhost -m uri -a "method=GET user=admin password=$PASSWORD force_basic_auth=yes validate_certs=no url=https://satellite-server.example.com/katello/api/repositories?organization_id=1&full_results=true"
localhost | SUCCESS => {
    "apipie_checksum": "7cd3aad709af2f1ae18a3daa0915d712", 
    "cache_control": "must-revalidate, private, max-age=0", 
    "changed": false,
...
    "id": 45, 
    "label": "EPEL_7_-_x86_64", 
...
    "product": {
      "cp_id": "1452001252604", 
      "id": 127, 
      "name": "EPEL", 
      "sync_plan": [
        "name", 
        "description", 
        "sync_date", 
        "interval", 
        "next_sync"
      ]
    }, 
    "relative_path": "Platin/Library/custom/EPEL/EPEL_7_-_x86_64", 
    "url": "http://dl.fedoraproject.org/pub/epel/7/x86_64/"
...

The option full_results just ensures that the entire result is shown even if it is pretty long. Note that the product ID can be used to query the entire product information:

$ ansible localhost -m uri -a "method=GET user=admin password=$PASSWORD force_basic_auth=yes validate_certs=no url=https://satellite-server.example.com/katello/api/products/127"
localhost | SUCCESS => {
...
  "id": 127, 
  "label": "EPEL", 
  "last_sync": "2016-01-05 13:43:38 UTC", 
  "last_sync_words": "about 1 month", 
  "name": "EPEL", 
  "organization": {
...

The id of the repository can be used to query the full repository information, including a full repo path:

$ ansible localhost -m uri -a "method=GET user=admin password=$PASSWORD force_basic_auth=yes validate_certs=no url=https://satellite-server.example.com/katello/api/repositories/45"      
localhost | SUCCESS => {
...
  "content_type": "yum", 
  "full_path": "http://satellite-server.example.com/pulp/repos/Platin/Library/custom/EPEL/EPEL_7_-_x86_64",
...

If you want to skip figuring out the IDs manually but have a name you could search for, it is possible to filter the results. The search URL for this case would be /katello/api/repositories?organization_id=1&full_results=true&search=*EPEL*, as shown in the following example:

$ ansible localhost -m uri -a "method=GET user=admin password=$PASSWORD force_basic_auth=yes validate_certs=no url=https://satellite-server.example.com/katello/api/repositories?organization_id=1&full_results=true&search=*EPEL*"
localhost | SUCCESS => {
...
  "relative_path": "Platin/Library/custom/EPEL/EPEL_7_-_x86_64", 
...
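If the same queries are needed from a script, the URLs can be assembled programmatically and the interesting fields picked out of the answer. The following is a minimal sketch under assumptions: the helper names are made up, and the top-level "results" list is an assumption about the shape of the JSON answer, since the output above is shortened. It only builds URLs and filters data; sending the request is left to whatever HTTP client is at hand.

```python
from urllib.parse import urlencode


def repositories_url(base_url, organization_id, search=None):
    """Build the Katello repositories URL used in the examples above."""
    params = {"organization_id": organization_id, "full_results": "true"}
    if search:
        # urlencode escapes characters like '*' so the URL is safe to pass on.
        params["search"] = search
    return "%s/katello/api/repositories?%s" % (base_url.rstrip("/"), urlencode(params))


def summarize_repositories(payload):
    """Reduce each repository entry to the fields discussed in the article.

    Assumes the repositories sit in a list under the key "results".
    """
    return [
        {
            "id": repo["id"],
            "label": repo["label"],
            "product_id": repo.get("product", {}).get("id"),
            "url": repo.get("url"),
        }
        for repo in payload.get("results", [])
    ]
```

Called with the base URL https://satellite-server.example.com, organization 1 and the search term *EPEL*, the first helper yields the search URL from the last example, with the asterisks percent-encoded.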

[Howto] Keeping temporary Ansible scripts

Ansible tasks are executed locally on the target machine via generated Python scripts. For debugging it might make sense to analyze the scripts – so Ansible must be told not to delete them.

When Ansible executes a command on a remote host, usually a Python script is copied, executed and removed immediately. For each task, a script is copied and executed, as shown in the logs:

Feb 25 07:40:44 ansible-demo-helium sshd[2395]: Accepted publickey for liquidat from 192.168.122.1 port 54108 ssh2: RSA 78:7c:4a:15:17:b2:62:af:0b:ac:34:4a:00:c0:9a:1c
Feb 25 07:40:44 ansible-demo-helium systemd[1]: Started Session 7 of user liquidat
Feb 25 07:40:44 ansible-demo-helium sshd[2395]: pam_unix(sshd:session): session opened for user liquidat by (uid=0)
Feb 25 07:40:44 ansible-demo-helium systemd-logind[484]: New session 7 of user liquidat.
Feb 25 07:40:44 ansible-demo-helium systemd[1]: Starting Session 7 of user liquidat.
Feb 25 07:40:45 ansible-demo-helium ansible-yum[2399]: Invoked with name=['httpd'] list=None install_repoquery=True conf_file=None disable_gpg_check=False state=absent disablerepo=None update_cache=False enablerepo=None exclude=None
Feb 25 07:40:45 ansible-demo-helium sshd[2398]: Received disconnect from 192.168.122.1: 11: disconnected by user
Feb 25 07:40:45 ansible-demo-helium sshd[2395]: pam_unix(sshd:session): session closed for user liquidat

However, for debugging it might make sense to keep the script and execute it locally. Ansible can be persuaded to keep a script by setting the environment variable ANSIBLE_KEEP_REMOTE_FILES to true on the command line:

$ ANSIBLE_KEEP_REMOTE_FILES=1 ansible helium -m yum -a "name=httpd state=absent"

The actually executed command – and the created temporary file – is revealed when ansible is executed with the verbose option -vvv:

$ ANSIBLE_KEEP_REMOTE_FILES=1 ansible helium -m yum -a "name=httpd state=absent" -vvv
...
<192.168.122.202> SSH: EXEC ssh -C -vvv -o ForwardAgent=yes -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -tt 192.168.122.202 'LANG=de_DE.UTF-8 LC_ALL=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8 /usr/bin/python -tt /home/liquidat/.ansible/tmp/ansible-tmp-1456498240.12-1738868183958/yum'
...

Note that here the script is executed directly via Python. If the “become” flag is set, the Python execution is routed through a shell, and the command looks like /bin/sh -c 'sudo -u $SUDO_USER /bin/sh -c "/usr/bin/python $SCRIPT"'.
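Since the verbose output is long, the path of the kept script is easy to miss. The following is a minimal sketch for fishing it out of captured -vvv output with a regular expression; the pattern is an assumption derived from the single path shown above and may need adjusting for other Ansible versions, and the function name is made up.

```python
import re

# Matches paths like <home>/.ansible/tmp/ansible-tmp-<timestamp>-<number>/<module>
TMP_SCRIPT = re.compile(r"(\S+/\.ansible/tmp/ansible-tmp-[0-9.]+-[0-9]+/[\w.-]+)")

# Shortened copy of the SSH line from the verbose output above.
SAMPLE_LINE = (
    "<192.168.122.202> SSH: EXEC ssh -C -vvv -tt 192.168.122.202 "
    "'LANG=de_DE.UTF-8 /usr/bin/python -tt "
    "/home/liquidat/.ansible/tmp/ansible-tmp-1456498240.12-1738868183958/yum'"
)


def find_kept_scripts(verbose_output):
    """Return every temporary module path mentioned in the given output."""
    return TMP_SCRIPT.findall(verbose_output)
```

Feeding the saved output of a run through find_kept_scripts gives the list of files to look at on the target machine.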

The temporary file is a Python script, as the header shows:

$ head yum 
#!/usr/bin/python -tt
# -*- coding: utf-8 -*-
# -*- coding: utf-8 -*-

# (c) 2012, Red Hat, Inc
# Written by Seth Vidal <skvidal at fedoraproject.org>
# (c) 2014, Epic Games, Inc.
#
# This file is part of Ansible
...

The script can afterwards be executed by /usr/bin/python yum or /bin/sh -c 'sudo -u $SUDO_USER /bin/sh -c "/usr/bin/python yum"' respectively:

$ /bin/sh -c 'sudo -u root /bin/sh -c "/usr/bin/python yum"'
{"msg": "", "invocation": {"module_args": {"name": ["httpd"], "list": null, "install_repoquery": true, "conf_file": null, "disable_gpg_check": false, "state": "absent", ...

More detailed information about debugging Ansible can be found at Will Thames’ article “Debugging Ansible for fun and no profit”.

[Short Tip] Use Red Hat Satellite 6 as an inventory resource in Ansible

Besides static file inventories, Ansible can use custom scripts to dynamically generate inventories or access other sources, for example a CMDB or a system management server – like Red Hat Satellite.
Luckily, Nick Strugnell has already written a custom script to use Satellite as an inventory source in Ansible.
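For context, the contract between Ansible and such a script is small: called with --list it prints a JSON document describing groups and hosts, and called with --host followed by a name it prints the variables of that host. The following is a minimal sketch of that contract, not Nick Strugnell's script; the host list is hard-coded and the group name satellite is an assumption made for illustration.

```python
import json
import sys

# Hypothetical static data; a real script would ask Satellite for this.
HOSTS = ["argon.example.com", "helium.example.com"]


def build_inventory(hosts):
    """Return the structure Ansible expects as the answer to --list."""
    return {
        "satellite": {"hosts": list(hosts)},
        # With _meta.hostvars present, Ansible does not need to call
        # the script again with --host for every single host.
        "_meta": {"hostvars": {host: {} for host in hosts}},
    }


def main(argv):
    """Answer --list with the inventory and anything else with empty vars."""
    if "--list" in argv:
        return json.dumps(build_inventory(HOSTS), indent=2)
    # This sketch defines no per-host variables, so --host yields {}.
    return json.dumps({})


if __name__ == "__main__":
    print(main(sys.argv[1:]))
```

Saved as an executable file, such a script can be passed to ansible or ansible-playbook via -i just like a static inventory file.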

After checking out the git repository, the hammer.ini needs to be adjusted: at least host, username, password and organization must be set.

Afterwards, the script can be invoked directly to show the available hosts:

$ ansible -i ~/Github/ansible-satellite6/satellite-inventory.py all --list-hosts
    argon.example.com
    satellite-server.example.com
    helium.example.com
...

This works with ansible CLI and playbook calls:

$ ansible-playbook -i ~/Github/ansible-satellite6/satellite-inventory.py apache-setup.yml
PLAY [apache setup] *********************************************************** 

GATHERING FACTS *************************************************************** 
...

The script works quite well – as long as the certificate you use on the Satellite server is trusted. Otherwise the value for self.ssl_verify must be set to False. Besides, it is a nice and simple way to access already existing inventory stores. This is important because Ansible is all about integration, and not about “throwing away and making new”.

[Howto] Look up of external sources in Ansible

Part of Ansible’s power comes from an easy integration with other systems. In this post I will cover how to look up data from external sources like DNS or Redis.

Background

A tool for automation is only as good as its ability to integrate with the already existing environment, and thus with other tools. Among various ways, Ansible offers the possibility to look up Ansible variables from external stores like DNS, Redis, etcd or even generic INI or CSV files. This enables Ansible to easily access data which is stored, changed and managed outside of Ansible.

Setup

Ansible’s lookup feature is already installed by default.

Queries are executed on the host where the playbook is executed – in case of Tower this would be the Tower host itself. Thus that node needs access to the resources which need to be queried.

Some lookup functions, for example those for DNS or Redis servers, require additional Python libraries – on the host actually executing the queries! On Fedora, the python-dns package is necessary for DNS queries and the package python-redis for Redis queries.

Generic usage

The lookup function can be used the exact same way variables are used: double curly brackets surround the lookup function, and the result is placed where the variable would be. That means lookup functions can be used in the head of a playbook, inside the tasks, and even in templates.

The lookup command itself has to list the plugin as well as the arguments for the plugin:

{{ lookup('plugin','arguments') }}

Examples

Files

Entire files can be used as content of a variable. This is simply done via:

vars:
  content: "{{ lookup('file','lorem.txt') }}"

As a result, the variable has the entire content of the file. Note that the lookup of files always searches the files relative to the path of the actual playbook, not relative to the path where the command is executed.

Also, the lookup might fail when the file itself contains quote characters.

CSV

While the file lookup is pretty simple and generic, the CSV lookup module gives the ability to access values of given keys in a CSV file. An optional parameter can identify the appropriate column. For example, if the following CSV file is given:

$ cat gamma.csv
daytime,time,meal
breakfast,7,soup
lunch,12,rice
tea,15,cake
dinner,18,noodles

Now the lookup function for CSV files can access the lines identified by keys which are compared to the values of the first column. The following example looks up the key dinner and gives back the entry of the third column (columns are counted from zero, hence col=2): {{ lookup('csvfile','dinner file=gamma.csv delimiter=, col=2') }}.

Inserted in a playbook, this looks like:

$ ansible-playbook examples/lookup.yml

PLAY [demo lookups] *********************************************************** 

GATHERING FACTS ***************************************************************
ok: [neon]

TASK: [lookup of a csv file] **************************************************
ok: [neon] => {
    "msg": "noodles"
}

PLAY RECAP ********************************************************************
neon                       : ok=2    changed=0    unreachable=0    failed=0

The corresponding playbook gives out the variable via the debug module:

---
- name: demo lookups
  hosts: neon

  tasks:
    - name: lookup of a csv file
      debug: msg="{{ lookup('csvfile','dinner file=gamma.csv delimiter=, col=2') }}"
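The matching rule behind this lookup, key against the first column and a zero-based column index for the answer, can be sketched in a few lines. This is a plain re-implementation for illustration under stated assumptions, not the code of the csvfile plugin; the function name csv_lookup is made up and the data is the gamma.csv from above.

```python
import csv
import io

GAMMA_CSV = """daytime,time,meal
breakfast,7,soup
lunch,12,rice
tea,15,cake
dinner,18,noodles
"""


def csv_lookup(csv_text, key, col, delimiter=","):
    """Return column `col` (counted from zero) of the first row whose
    first column equals `key`, or None if no row matches."""
    for row in csv.reader(io.StringIO(csv_text), delimiter=delimiter):
        if row and row[0] == key:
            return row[col]
    return None
```

With this data, csv_lookup(GAMMA_CSV, "dinner", 2) picks the same cell as the playbook above, while col=1 would give the time column instead.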

DNS

The DNS lookup is particularly interesting in cases where the local DNS provides a lot of information like SSH fingerprints or the MX record.

The DNS lookup plugin is called dig – like the command line client dig. As arguments, the plugin takes a domain name and the DNS type: {{ lookup('dig', 'redhat.com. qtype=MX') }}. Another way to hand over the type argument is via slash: {{ lookup('dig', 'redhat.com./MX') }}

The result for this example is:

TASK: [lookup of dns dig entries] *********************************************
ok: [neon] => {
    "msg": "10 int-mx.corp.redhat.com."
}

Redis

It gets even more interesting when existing databases are queried. Ansible lookup supports for example Redis databases. The plugin takes as argument the entire URL: redis://$URL:$PORT,$KEY.

For example, to query a local Redis server for the key dinner:

---
tasks:
  - name: lookup of redis entries
    debug: msg="{{ lookup('redis_kv', 'redis://localhost:6379,dinner') }}"

The result is:

TASK: [lookup of redis entries] ***********************************************
ok: [neon] => {
    "msg": "noodles"
}
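The argument format shown above packs two things into one string: the server URL and, after the comma, the key. The following minimal sketch takes such a string apart; it only illustrates the format and is not how the redis_kv plugin is implemented, and the function name is made up.

```python
from urllib.parse import urlparse


def split_redis_term(term):
    """Split 'redis://host:port,key' into (host, port, key)."""
    url, _, key = term.partition(",")
    parsed = urlparse(url)
    return parsed.hostname, parsed.port, key
```

Applied to the string from the playbook above, this yields the host localhost, the port 6379 and the key dinner.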

Template

As already mentioned, lookups can not only be used in Playbooks, but also directly in templates. For example, given the template code:

$ cat template.j2
...
Red Hat MX: {{ lookup('dig', 'redhat.com./MX') }}
$ cat template.conf
...
Red Hat MX: 10 mx2.redhat.com.,5 mx1.redhat.com.

Conclusion

As shown, the lookup plugin of Ansible provides many possibilities to integrate Ansible with existing tools and environments which already contain valuable data about the systems. It is easy to use, fits well with the existing Ansible concepts and can be set up quickly. Just drop it where a variable would be dropped, and it already works.

I am looking forward to support for more lookup modules in the future – I’d love to see a generic “http” and a generic “SQL” plugin, even with the ability to provide credentials, although these features can be somewhat realized with already existing modules.