Monitoring OpenVPN ports and the ways of Open Source

Years ago I wrote a small OpenVPN port monitoring script for my former employer. Over time, it received contributions from various users and evolved into quite a sophisticated piece of software. For me, this is a powerful example of how Open Source works even in small ways.

Years ago I created a small Python script to monitor OpenVPN ports for my employer at the time, credativ – under an Open Source license, of course. To be frank: it was one of my first Nagios/Icinga monitoring scripts and one of my first serious Python attempts, so the code was rather simple. Others probably would have done it in half the time with much better code. But it worked, it met the requirements, and the monitoring people were happy.

Over the years, even after I left credativ and stopped working regularly on monitoring environments, I continued to maintain the code base.

And in fact over time it evolved quite a bit and got many more features:

  • IPv6 support
  • Python 3 support
  • UDP retries
  • response validation
  • dynamic HMAC digests
  • proper Python packaging structure

There were even packages created for Gentoo.

All those features and additions were not written by me, but by multiple contributors. This was only possible because the script was released under an Open Source license, here MIT, to begin with.

For me this rather small, simple example shows one particular way in which Open Source can work: different people had rather similar problems. Instead of re-inventing the wheel and writing their own scripts each time, they picked something which (I) already existed and (II) solved parts of their problems, in this case my script. They extended it to fulfil their needs and submitted the changes. Over time, this led to a surprisingly sophisticated and powerful script which can be used by many others to solve an even broader range of similar problems. The process was neither coordinated nor planned, but it created a worthwhile result from which all parties benefit.

This way of developing Open Source software is quite common – the Linux kernel is arguably the most prominent example, but a broad range of other projects are developed that way as well: Ansible, PostgreSQL, Apache, Kubernetes, etc. But as shown above, this development model does not only benefit the really large, well known projects, but works for small, specialized solutions as well.

To me, this is one of my preferred ways to show and explain the benefits of Open Source to others: different parties working together – not even necessarily at the same time – on the same source to solve similar problems, improving the quality and capabilities of the solution over time, and creating value for all parties involved and even for everyone else who just wants to use the solution.

[Howto] Monitoring Puppet agent run with NRPE-Plugin [Update]

Centralized configuration management like Puppet is a blessing. If it runs properly. So it makes sense to monitor the run of the Puppet agent, and I wrote an NRPE plugin to do just that.

Puppet is great, and many of our customers use it, often in combination with an Icinga monitoring setup. However, it can happen that the Puppet agent, for some reason, does not run, or does not run properly. If the infrastructure is large enough, that might slip through your fingers. Thus it makes sense to monitor the Puppet client.

There are already several solutions out there to do just that. Since the Puppet agent writes plenty of status information to /var/lib/puppet/state/last_run_summary.yaml, current solutions check the last run time stamp of the file, or try to verify the validity of the Yaml structure. However, a correct Yaml structure does not tell you anything about when the Puppet agent actually last ran. And the time stamp is written even if the Puppet agent run fails in the end. There is even a Bash script which does both – but it is a difficult-to-maintain piece of code and cannot really speak Yaml, it just greps for certain elements.

Thus I wrote my own script inspired by the solutions mentioned above, but checking the last run state as well as verifying that the Yaml file has proper content – and written in Python, by the way. The script can be tested on command line:

$ sudo /usr/local/lib/nagios/plugins/check_puppetagent -w 3600 -c 9000
OK: Puppet was last run 13 minutes and 21 seconds ago

If the Yaml file is not properly formatted, the script throws an error:

$ python check_puppetagent -w 3600 -c 9000
CRIT: Yaml file not properly formatted, last puppet run failed.

The script does not support any further options or functions. Since the Yaml file does contain much more information it might make sense to give more information back to the monitoring server, or for example also give back the number of failures given in the status file if there are any. But for now, that is not implemented.
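The logic described above can be sketched roughly as follows. This is a minimal illustration and not the published plugin: the function names, the `WARN:` prefix and the assumption that the last run time stamp is stored as epoch seconds under `time` → `last_run` in the summary file are mine. Parsing the file needs the third-party PyYAML package, so it is kept in a separate helper.

```python
import time

# Nagios/Icinga plugin exit codes
OK, WARNING, CRITICAL = 0, 1, 2


def load_summary(path="/var/lib/puppet/state/last_run_summary.yaml"):
    """Parse the summary file (hypothetical helper, needs PyYAML)."""
    import yaml  # imported lazily so evaluate() works without PyYAML

    with open(path) as handle:
        return yaml.safe_load(handle)


def evaluate(summary, warn, crit, now=None):
    """Return (exit_code, message) for an already parsed summary."""
    now = time.time() if now is None else now
    try:
        # Assumed layout: {"time": {"last_run": <epoch seconds>}, ...}
        last_run = int(summary["time"]["last_run"])
    except (KeyError, TypeError, ValueError):
        return CRITICAL, ("CRIT: Yaml file not properly formatted, "
                          "last puppet run failed.")

    age = int(now - last_run)
    minutes, seconds = divmod(age, 60)
    text = "Puppet was last run %d minutes and %d seconds ago" % (
        minutes, seconds)

    # Thresholds are given in seconds, as with -w and -c on the command line.
    if age >= crit:
        return CRITICAL, "CRIT: " + text
    if age >= warn:
        return WARNING, "WARN: " + text
    return OK, "OK: " + text
```

A wrapper would call `evaluate(load_summary(), 3600, 9000)`, print the message and exit with the returned code so that Nagios/Icinga can pick up the state.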

The script was also uploaded to Monitoringexchange. Since my employer strongly supports the ideas behind Open Source, I was able to publish the script under the MIT licence. I also wrote a blog post about the script on my German company’s blog.

Update
What I totally forgot: there is also a Ruby check script which does mostly the same as the Python script I wrote, and which was a good inspiration for my code.

[Howto] Monitoring OpenVPN ports with Nagios/Icinga

An OpenVPN server is usually a crucial part of the IT infrastructure, and thus should be monitored properly. But monitoring UDP is sometimes not that easy, so I wrote a script which can be used in Nagios/Icinga.

OpenVPN is usually accessed via UDP. Since UDP ports are not as easy to monitor as TCP ports, many administrators restrict themselves to checking whether an OpenVPN process is running on the OpenVPN server. However, that does not reveal network problems, and it only works on machines you have proper access to: 3rd party machines or appliances are out of reach with this approach. Another approach is to monitor the management port. However, that requires the port to be reachable by the monitoring server, which might not be the best idea in case of distributed monitoring. And it is still no option for 3rd party machines or other black boxes.

A customer of my employer credativ GmbH had exactly that kind of problem, so I wrote a script in Python. It checks the UDP port of a given server. If the server does respond, the script gives back the state “OK” together with the hex form of the response. The script can be tested on command line:

$ python check_openvpn openvpn.example.com
OK: OpenVPN server response (hex): 4018062d97f85c21d50000000000
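The idea behind the UDP check can be sketched like this. It is a simplified illustration, not the code of check_openvpn itself: the function name, the IPv4-only socket, the default port 1194 and the exact probe bytes are assumptions. The probe is modelled on an OpenVPN client hard-reset packet (opcode 7 in the upper five bits of the first byte), to which a server without additional HMAC protection is expected to answer.

```python
import binascii
import socket

# Nagios/Icinga plugin exit codes
OK, CRITICAL = 0, 2

# Assumed probe: opcode P_CONTROL_HARD_RESET_CLIENT_V2 (7 << 3 = 0x38) with
# key id 0, an arbitrary 8-byte session id, an empty ack array (one length
# byte) and a 4-byte packet id of 0.
PROBE = b"\x38" + b"\x01" * 8 + b"\x00" + b"\x00" * 4


def check_udp(host, port=1194, timeout=5.0):
    """Send one probe via UDP and return (exit_code, message)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # IPv4 only
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        sock.send(PROBE)
        data = sock.recv(1024)
    except socket.timeout:
        return CRITICAL, "CRIT: Request timed out"
    except OSError as err:  # e.g. name resolution or port unreachable
        return CRITICAL, "CRIT: %s" % err
    finally:
        sock.close()
    hexdata = binascii.hexlify(data).decode("ascii")
    return OK, "OK: OpenVPN server response (hex): %s" % hexdata
```

Because UDP is connectionless, the only evidence of a working server is an answer to the probe, which is why the sketch treats a missing reply within the timeout as critical.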

The port can be changed by the flag “-p”:

$ python check_openvpn -h
usage: check_openvpn [-h] [-p PORT] [-t] host

positional arguments:
  host                  the OpenVPN host name or ip

optional arguments:
  -h, --help            show this help message and exit
  -p PORT, --port PORT  set port number
  -t, --tcp             use tcp instead of udp

As you see, it also supports testing TCP ports. However, in that case no server response is evaluated; we effectively just test whether the given TCP port can be reached. Here we switch on TCP support and also change the port to 443:

$ python check_openvpn -t -p 443 openvpn-tcp.example.com
OK: OpenVPN tcp port reachable.
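A plain reachability test like this comes down to a single connection attempt. The following sketch shows the idea; the function name and the wording of the error message are assumptions, not taken from the script.

```python
import socket

# Nagios/Icinga plugin exit codes
OK, CRITICAL = 0, 2


def check_tcp(host, port, timeout=5.0):
    """Try to open a TCP connection and return (exit_code, message)."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError as err:  # covers timeouts and refused connections
        return CRITICAL, "CRIT: %s" % err
    conn.close()
    return OK, "OK: OpenVPN tcp port reachable."
```

Note that a successful connect only proves that something listens on the port, not that it is actually an OpenVPN server.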

If the server does not respond within a given time period – 5 seconds – the script throws an error:

$ python check_openvpn slowserver.example.com
CRIT: Request timed out

The script was also uploaded to Monitoringexchange. Since my employer strongly supports the ideas behind Open Source, I was able to publish the script under the MIT licence. I also wrote a blog post about the script on my German company’s blog.