world-1264062_1920

Insights into Ansible: environments of executed playbooks

Ansible LogoUsually when Ansible Tower executes a playbook everything works just as on the command line. However, in some corner cases the behavior might be different: Ansible Tower runs its playbooks in a specific environment.

Different playbook results in Tower vs CLI

Ansible is a great tool for automation, and Ansible Tower enhances these capabilities by adding centralization, a UI, role based access control and a REST API. To take advantage of Tower, just import your playbooks and press start – it just works.

At least most of the time: lately I was playing around with the Google Cloud Engine, GCE. Ansible provides several GCE modules thus writing playbooks to control the setup was pretty easy. But while GCE related playbooks worked on the plain command line, they failed in Tower:

PLAY [create node on GCE] ******************************************************

TASK [launch instance] *********************************************************
task path: /var/lib/awx/projects/_43__gitolite_gce_node_tower_pem_file/gce-node.yml:13
An exception occurred during task execution. The full traceback is:
Traceback (most recent call last):
  File "/var/lib/awx/.ansible/tmp/ansible-tmp-1461919385.95-6521356859698/gce", line 2573, in <module>
    main()
  File "/var/lib/awx/.ansible/tmp/ansible-tmp-1461919385.95-6521356859698/gce", line 506, in main
    module, gce, inames)
  File "/var/lib/awx/.ansible/tmp/ansible-tmp-1461919385.95-6521356859698/gce", line 359, in create_instances
    external_ip=external_ip, ex_disk_auto_delete=disk_auto_delete, ex_service_accounts=ex_sa_perms)
TypeError: create_node() got an unexpected keyword argument 'ex_can_ip_forward'

fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "invocation": {"module_name": "gce"}, "parsed": false}

NO MORE HOSTS LEFT *************************************************************
	to retry, use: --limit @gce-node.retry

PLAY RECAP *********************************************************************
localhost                  : ok=0    changed=0    unreachable=0    failed=1 

To me that didn’t make sense at all: the exact same playbook was running on command line. How could that fail in Tower when Tower is only a UI to Ansible itself?

Environment variables during playbook runs

The answer is that playbooks are run by Tower within specific environment variables. For example, the GCE login credentials are provided to the playbook and thus to the modules via environment variables:

GCE_EMAIL
GCE_PROJECT
GCE_PEM_FILE_PATH

That means, if you want to debug a playbook and want to provide the login credentials just the way Tower does, the shell command has to be:

GCE_EMAIL=myuser@myproject.iam.gserviceaccount.com GCE_PROJECT=myproject GCE_PEM_FILE_PATH=/tmp/mykey.pem ansible-playbook myplaybook.yml

The error at hand was also caused by an environment variable, though: PYTHONPATH. Tower comes along with a set of Python libraries needed for Ansible. Among them some which are required by specific modules. In this case, the GCE modules require the Apache libcloud, and that one is installed with the Ansible Tower bundle. The libraries are installed at /usr/lib/python2.7/site-packages/awx/lib/site-packages – which is not a typical Python path.

For that reason, each playbook is run from within Tower with the environment variable PYTHONPATH="/usr/lib/python2.7/site-packages/awx/lib/site-packages:". Thus, to run a playbook just the same way it is run from within Tower, the shell command needs to be:

PYTHONPATH="/usr/lib/python2.7/site-packages/awx/lib/site-packages:" ansible-playbook myplaybook.yml

This way the GCE error shown above could be reproduced on the command line. So the environment provided by Tower was a problem, while the environment of plain Ansible (and thus plain Python) caused no errors. Tower does bundle the library because you cannot expect the library for example in the RHEL default repositories.

The root cause is that right now Tower still ships with an older version of the libcloud library which is not fully compatible with GCE anymore (GCE is a fast moving target). If you run Ansible on the command line you most likely install libcloud via pip or RPM which in most cases provides a pretty current version.

Workaround for Tower

While upgrading the library makes sense in the mid term, a short term workaround is needed as well. The best way is to first install a recent version of libcloud and second identify the actual task which fails and point that exact task to the new library.

In case of RHEL, enable the EPEL repository, install python-libcloud and then add the environment path PYTHONPATH: "/usr/lib/python2.7/site-packages" to the task via the environment option.

- name: launch instance
  gce:
    name: "{{ node_name }}"
    zone: europe-west1-c
    machine_type: "{{ machine_type }}"
    image: "{{ image }}"
  environment:
    PYTHONPATH: "/usr/lib/python2.7/site-packages"

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s