[Short Tip] Exit bad/broken/locked ssh sessions

Sometimes SSH connections lock up. For example, due to a weird SSH server configuration or bad connectivity on your side, your SSH connection suddenly breaks. You cannot send any more commands via the SSH connection. The terminal just doesn’t react.

And that includes the typical exit commands: Ctrl+z or Ctrl+d do not work anymore. So you are only left with the choice to close the terminal – right? In fact, no, you can just exit the SSH session.

The trick is: press Enter, then type ~ followed by . (a tilde, then a period).

Why does this work? Because it is one of the defined escape sequences:

The supported escapes (assuming the default ‘~’) are:
~. Disconnect.
~^Z Background ssh.
~# List forwarded connections.
~& Background ssh at logout when waiting for forwarded connection / X11 sessions to terminate.


To many of you this is probably nothing new – but I never knew that, even after years of using SSH on a daily basis, so I had the urge to share this.


[Howto] Adding SSH keys to Ansible Tower via tower-cli [Update]

The tool tower-cli is often used to pre-configure Ansible Tower in a scripted way. It provides a convenient way to boot-strap a Tower configuration. But adding SSH keys as machine credentials is far from easy.

Boot-strapping Ansible Tower can become necessary for testing and QA environments where the same setup is created and destroyed multiple times. Other use cases are when multiple Tower installations need to be configured in the same way or share at least a larger part of the configuration.

One of the necessary tasks in such setups is to create machine credentials in Ansible Tower so that Ansible is able to connect properly to a target machine. In a Linux environment, this is often done via SSH keys.

However, tower-cli calls the Tower API in the background – and a string value in the JSON POST data must fit on one line. But SSH keys span multiple lines, so providing the file via $(cat ssh_file) does not work:

tower-cli credential create --name "Example Credentials" \
                     --organization "Default" --credential-type "Machine" \
                     --inputs="{\"username\":\"ansible\",\"ssh_key_data\":\"$(cat .ssh/id_rsa)\",\"become_method\":\"sudo\"}"

Multiple workarounds can be found on the net, like manually editing the file to remove the new lines or creating a dedicated variables file containing the SSH key. There is even a bug report discussing that.

But for my use case I needed to read an existing SSH file directly, and did not want to add another manual step or create an additional variables file. The trick is a rather complex piece of sed:

$(sed -E ':a;N;$!ba;s/\r{0,1}\n/\\n/g' /home/ansible/.ssh/id_rsa)

This basically reads in the entire file (instead of just line by line) and replaces each line break with the two-character string \n. To be precise:

  • we first create a label "a"
  • append the next line to the pattern space ("N")
  • find out if this is the last line or not ("$!"), and if not
  • branch back to label a ("ba")
  • after that, we search for the line breaks, each optionally preceded by a carriage return ("\r{0,1}\n")
  • and replace them with the literal string for a new line, "\n"
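
To see the effect without touching a real key, the sed program can be tried on a throwaway file. The following is a minimal sketch, assuming GNU sed (the \r escape and the one-line label syntax are GNU extensions); the sample file and its content are made up for illustration:

```shell
# Hypothetical sample file standing in for a private key (not a real key).
sample_file="$(mktemp)"
printf 'line one\nline two\nline three\n' > "$sample_file"

# Same sed program as in the article: collect all lines in the pattern
# space, then turn every line break (with an optional carriage return)
# into the two characters backslash and n.
escaped="$(sed -E ':a;N;$!ba;s/\r{0,1}\n/\\n/g' "$sample_file")"

# Print the result: a single line containing literal \n sequences.
printf '%s\n' "$escaped"

rm -f "$sample_file"
```

The result is one line in which each line break has become the two characters \n, which is the form a JSON string can carry.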

Note that this needs to be accompanied by proper line endings and quotation marks. The full call of tower-cli with the sed command inside is:

tower-cli credential create --name "Example Credentials" \
                     --organization "Default" --credential-type "Machine" \
                     --inputs="{\"username\":\"ansible\",\"ssh_key_data\":\"$(sed -E ':a;N;$!ba;s/\r{0,1}\n/\\n/g' /home/ansible/.ssh/id_rsa)\n\",\"become_method\":\"sudo\"}"

Note all the escaped quotation marks.


Another way to add the keys is to provide YAML in the shell command:

tower-cli credential create --name "Example Credentials" \
                     --organization "Default" --credential-type "Machine" \
                     --inputs='username: ansible
become_method: sudo
ssh_key_data: |
'"$(sed 's/^/    /' /home/ansible/.ssh/id_rsa)"

This method is appealing since the corresponding sed call is a little bit easier to understand. But make sure to indent the variables exactly as shown above.
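
The indentation step can be checked in isolation before handing the string to tower-cli. This is only a sketch: the sample file is hypothetical and stands in for the key, and the variable name inputs is chosen for illustration:

```shell
# Hypothetical sample file standing in for a private key (not a real key).
sample_file="$(mktemp)"
printf 'line one\nline two\n' > "$sample_file"

# Build the YAML document: the key lines are indented by four spaces so
# that they belong to the block scalar that starts with "ssh_key_data: |".
inputs='username: ansible
become_method: sudo
ssh_key_data: |
'"$(sed 's/^/    /' "$sample_file")"

# Show the YAML that would be passed to --inputs.
printf '%s\n' "$inputs"

rm -f "$sample_file"
```

Every line of the file should appear below ssh_key_data with the same four-space indent; a line that starts in the first column would end the block scalar early.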

Thanks to @ericzolf of the Red Hat Automation Community of Practice for pointing me to that solution. If you are interested in the Red Hat Communities of Practice, you can read more about them in the blog “Communities of practice: Straight from the open source”.

[Howto] Fix ldap “protocol error” in Gitea (and other Go based apps)

I prefer self-hosted solutions for some tasks. But this also means that I have to troubleshoot my problems on my own. Recently a go-ldap error gave me a headache. Here is the analysis of the protocol error – and how to solve it.

For certain projects I prefer a self-hosted Git server. Solutions like Gitea, the fast-developing and thriving fork of Gogs, make this painless and easy to do – especially in a containerized environment.

My users are managed in FreeIPA, and Gitea connects to it via LDAP. And this is a constant source of trouble. Gitea is written in Go, and the Go LDAP libraries seem to be far from perfect.

For example, after a recent update of my environment, login at Gitea stopped working:

[...gitea/models/user.go:1544 SyncExternalUsers()] [E] LDAP Search failed unexpectedly! (LDAP Result Code 2 "Protocol Error": )

At the same time, the FreeIPA server indeed showed malformed requests:

[170978469] fd=112 slot=112 connection from to
[171199824] op=0 BIND dn="uid=system,cn=sysaccounts,cn=etc,dc=bayz,dc=de" method=128 version=3
[223472706] op=0 RESULT err=0 tag=97 nentries=0 etime=0.0052434415 dn="uid=system,cn=sysaccounts,cn=etc,dc=bayz,dc=de"
[223738210] op=1 SRCH base="cn=users,cn=accounts,dc=bayz,dc=de" scope=2 filter="(&(objectClass=person)(uid=rwo))" attrs=ALL
[225467030] op=1 RESULT err=0 tag=101 nentries=1 etime=0.0001797298
[226078299] op=2 BIND dn="uid=rwo,cn=users,cn=accounts,dc=bayz,dc=de" method=128 version=3
[278423889] op=2 RESULT err=0 tag=97 nentries=0 etime=0.0052380180 dn="uid=rwo,cn=users,cn=accounts,dc=bayz,dc=de"
[278705323] op=3 SRCH base="(null)" scope=2 filter="(&(objectClass=person)(uid=rwo))", invalid attribute request
[278722888] op=3 RESULT err=2 tag=101 nentries=0 etime=0.0000084787
[279051788] op=-1 fd=112 closed - B1

Since LDAP login still worked fine with other tools I assumed a problem in the new Gitea version and filed a bug report. Other users with the same problem joined soon after, but no one was able to provide a solution.

After some research I figured out that the problem appeared to be related to an update in the FreeIPA server: a security update in the underlying 389 server led to protocol errors when empty attributes were part of the request.

An updated version of the go-ldap library was supposed to fix this – and indeed, after Gitea updated the library other users reported that the issue was fixed for them.

However, not for me: I still had the problem, and got frustrated over this for weeks.

It took me another evening of research until I found the important missing detail: a Grafana user had the same problem. The updated library did not help there either. But reducing the number of empty attributes, by providing values for attributes that are otherwise left empty by default, did the trick:

For me the error occurs when I have less than 4 attributes.


In the end they figured out that one empty attribute was ok to be sent, but not two. With this information, the fix for my problem was easy: I previously had not set the attributes for First Name and Surname. I added those and was immediately able to log in again.

Gitea with LDAP attributes

So: if you ever run into the same problem with Go and LDAP, check if you are indeed sending more than one empty attribute!

This was the second LDAP problem I encountered using Gitea. Due to this experience I do not really feel comfortable with using this combination – and will not be surprised if it breaks again with the next update.

On the other hand LDAP is a rather complicated protocol and probably a bit overkill for a simple use case like this. If I ever re-do my setup I might change over to mail based authentication.

[Short Tip] Provide dictionaries as default in Ansible variables

Ansible uses the Jinja2 template engine to handle variables. This includes the default filter, which sets a default value if a referenced variable is not explicitly defined somewhere else.

With Ansible it might happen that instead of a scalar variable a dictionary of key-value pairs is needed. If you just paste the plain text in there, you might run into trouble:

fatal: [test.example.com]: FAILED! => {"changed": false, "msg": "argument env is of type and we were unable to convert to dict: dictionary requested, could not parse JSON or key=value"}

The key-value pair needs to be properly formatted:

"{{ my_variable|default({'key':'value'}) }}"

Thanks to @bcoca for his post about this.

[Short Tip] Identify supported platforms of Ansible Galaxy

Ansible Galaxy recently got a fresh update and now has many more features worth a look. Among those are automatic quality scores.

In a recent role upload my score was only 4.5. One of the problems was an “invalid platform”. I wondered which platforms are supported, and what the strings for those look like, but the documentation is sparse in this regard.

However, Ansible Galaxy does feature an API to query those things. And in fact galaxy.ansible.com/api/v1/platforms/ shows the appropriate Fedora versions:

    {
        "id": 143,
        "url": "/api/v1/platforms/143/",
        "related": {},
        "summary_fields": {},
        "created": "2018-01-15T11:54:54.212531Z",
        "modified": "2018-01-15T11:54:54.212560Z",
        "name": "Fedora",
        "release": "27",
        "active": true
    },
    {
        "id": 162,
        "url": "/api/v1/platforms/162/",
        "related": {},
        "summary_fields": {},
        "created": "2018-04-30T16:35:24.066120Z",
        "modified": "2018-04-30T16:35:24.066153Z",
        "name": "Fedora",
        "release": "28",
        "active": true
    },
    {
        "id": 61,
        "url": "/api/v1/platforms/61/",
        "related": {},
        "summary_fields": {},
        "created": "2016-02-04T06:29:41.226911Z",
        "modified": "2016-02-04T06:29:41.226980Z",
        "name": "FreeBSD",
        "release": "10.0",
        "active": true
    }

So Fedora 29 is not supported right now, but there is even a bug report already.
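
If the API response has been saved locally, the release strings for one platform can be pulled out with a few lines of awk. This is a rough sketch, assuming the pretty-printed layout shown above with one field per line and "name" listed before "release"; the sample file is a hypothetical excerpt, and a real JSON parser would be more robust:

```shell
# Hypothetical saved excerpt of the API response (fields shortened).
response_file="$(mktemp)"
cat > "$response_file" <<'EOF'
        "name": "Fedora",
        "release": "27",
        "active": true
        "name": "Fedora",
        "release": "28",
        "active": true
        "name": "FreeBSD",
        "release": "10.0",
        "active": true
EOF

# Split each line on double quotes: field 2 is the key, field 4 the value.
# Remember the most recent name and print the release of Fedora entries.
fedora_releases="$(awk -F'"' '
    $2 == "name"                        { name = $4 }
    $2 == "release" && name == "Fedora" { print $4 }
' "$response_file")"

printf '%s\n' "$fedora_releases"

rm -f "$response_file"
```

The printed values are the exact strings Galaxy accepts for that platform, so a release that is missing from the list is one the role metadata cannot reference yet.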

[Short Tip] Use Ansible with managed nodes running Python3

Python 3 is becoming the default Python version on more and more distributions. Fedora 28 ships Python 3, and RHEL 8 is expected to ship Python 3 as well.

With Ansible this can lead to trouble: some of these distributions do not ship a default /usr/bin/python but instead insist on picking either /usr/bin/python2 or /usr/bin/python3, which leads to errors when Ansible is called to manage such machines:

TASK [Gathering Facts] 
fatal: []: FAILED! => {"changed": false, "module_stderr": "Connection to closed.\r\n", "module_stdout": "/bin/sh: /usr/bin/python: No such file or directory\r\n", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 127}

The fix is to define the Python interpreter in additional variables. They can even be provided on the command line:

$ ansible-playbook -i, mybook.yml -e ansible_python_interpreter="/usr/bin/python3"

[Short Tip] Call Ansible or Ansible Playbooks without an inventory

Ansible is a great tool to automate almost anything in IT. One of the core concepts of Ansible is the inventory, where the nodes to be managed are listed. However, in some situations setting up a dedicated inventory is overkill.

For example, there are many situations where admins just want to ssh to a machine or two to figure something out. Ansible modules can often do the same job much more efficiently, making such SSH calls unnecessary – but creating an inventory first is a waste of time for such short tasks.

In such cases it is handy to call Ansible or Ansible playbooks without an inventory. In the case of plain Ansible this can be done by addressing all nodes while at the same time limiting them to an actual host list:

$ ansible all -i jenkins.qxyz.de, -m wait_for -a "host=jenkins.qxyz.de port=8080"
jenkins.qxyz.de | SUCCESS => {
    "changed": false,
    "elapsed": 0,
    "path": null,
    "port": 8080,
    "search_regex": null,
    "state": "started"
}

The comma is needed since Ansible expects a list of hosts – and a list of one host still needs the comma.

For Ansible playbooks the syntax is slightly different:

$ ansible-playbook -i neon.qxyz.de, my_playbook.yml

Here the “all” is missing since the playbook already contains a hosts directive. But the comma still needs to be there to mark a list of hosts.
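
When the host names come from a script, the trailing comma is easy to forget. A small helper can build the inline host list; this is a sketch only, the function name inline_inventory is made up, and it relies on the assumption that Ansible ignores the empty entry after the final comma when several hosts are given:

```shell
# Build an inline inventory string from the host names passed as arguments.
# Every host is followed by a comma, so even a single host forms a list.
inline_inventory() {
    inventory=""
    for host in "$@"; do
        inventory="${inventory}${host},"
    done
    printf '%s' "$inventory"
}

single="$(inline_inventory jenkins.qxyz.de)"
several="$(inline_inventory jenkins.qxyz.de neon.qxyz.de)"

# Show both host lists as they would be passed to -i.
printf '%s\n%s\n' "$single" "$several"
```

The result can then be passed along the lines of: ansible all -i "$(inline_inventory jenkins.qxyz.de)" -m ping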