[Howto] Launch traefik as a docker container in a secure way

Traefik is a great reverse proxy solution, and a perfect tool to direct traffic in container environments. However, to do that, it needs access to docker – and that is very dangerous and must be tightly secured!

The problem: access to the docker socket

Containers offer countless opportunities to improve the deployment and management of services. However, having multiple containers on one system, often re-deploying them on the fly, requires a dynamic way of routing traffic to them. Additionally, there might be reasons to have a front end reverse proxy to sort the traffic properly anyway.

In comes traefik – “the cloud native edge router”. Among many supported backends it knows how to listen to docker and create dynamic routes on the fly when new containers come up.

To do so traefik needs access to the docker socket. Many people decide to just provide that as a volume to traefik. This usually does not work because SELinux prevents it for a reason. The apparent workaround for many is to run traefik in a privileged container. But that is a really bad idea:

Docker currently does not have any Authorization controls. If you can talk to the docker socket or if docker is listening on a network port and you can talk to it, you are allowed to execute all docker commands. […]
At which point you, or any user that has these permissions, have total control on your system.

http://www.projectatomic.io/blog/2014/09/granting-rights-to-users-to-use-docker-in-fedora/

The solution: a docker socket proxy

But there are ways to give traefik the access it needs securely, without exposing too many permissions. One way is to provide limited access to the docker socket over TCP through another container that cannot easily be reached from the outside.

Meet Tecnativa’s docker-socket-proxy:

What?
This is a security-enhanced proxy for the Docker Socket.
Why?
Giving access to your Docker socket could mean giving root access to your host, or even to your whole swarm, but some services require hooking into that socket to react to events, etc. Using this proxy lets you block anything you consider those services should not do.

https://github.com/Tecnativa/docker-socket-proxy/blob/master/README.md

It is a container which connects to the docker socket and exports the API features in a secured and configurable way via TCP. At container startup, boolean environment variables define which API sections can be accessed.

So basically you set up a docker proxy to support your proxy for docker containers. Well…

How to use it

The docker socket proxy is a container itself. Thus it needs to be launched as a privileged container with access to the docker socket. Also, it must not publish any ports to the outside. Instead it should run on a dedicated docker network shared with the traefik container. The Ansible code to launch the container that way is for example:

- name: ensure privileged docker socket container
  docker_container:
    name: dockersocket4traefik
    image: tecnativa/docker-socket-proxy
    log_driver: journald
    env:
      CONTAINERS: 1
    state: started
    privileged: yes
    exposed_ports:
      - 2375
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:z"
    networks:
      - name: dockersocket4traefik_nw

Note the env right in the middle: that is where the exported permissions are configured. CONTAINERS: 1 provides access to container-relevant information. There are also SERVICES: 1 and SWARM: 1 to manage access to docker services and swarm.
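To make the idea of per-section booleans more concrete, here is a minimal Python sketch of the filtering decision. It is a simplified model and not Tecnativa's implementation (which, as far as I know, is built on HAProxy rules); the function name is_allowed and the path matching are assumptions made for this illustration, and the model only looks at the first path segment of a request.

```python
import re


def is_allowed(path, env):
    """Return True if the API section of `path` is enabled in `env`.

    Simplified model: the section is the first path segment after an
    optional version prefix such as /v1.40, and it is allowed only if
    the upper-cased section name is set to 1 in `env`.
    """
    match = re.match(r"^(?:/v[\d.]+)?/([a-z_]+)", path)
    if match is None:
        return False
    section = match.group(1).upper()
    return str(env.get(section, 0)) == "1"


env = {"CONTAINERS": 1}  # same setting as in the Ansible task above
print(is_allowed("/v1.40/containers/json", env))  # True
print(is_allowed("/v1.40/images/json", env))      # False
```

With only CONTAINERS enabled, a request for the container list passes while a request for the image list is rejected, which is the behaviour traefik's docker backend needs here.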

Traefik needs to have access to the same network. Also, the traefik configuration needs to point to the docker container via tcp:

[docker]
endpoint = "tcp://dockersocket4traefik:2375"

Conclusion

This setup is surprisingly easy to get working. It allows traefik to access the docker socket for the things it needs without exposing critical permissions that could be used to take over the system. At the same time, full access to the docker socket is restricted to a non-public container, which makes it harder for attackers to exploit it.

If you have a simple container setup and use Ansible to start and stop the containers, I’ve written a role to get the above mentioned setup running.

[Howto] Automated DNS resolution for KVM/libvirt guests with a local domain [Update]

I often run demos on my laptop with the help of libvirt. Managing 20+ machines that way is annoying when you have no DNS resolution for those. Luckily, with libvirt and NetworkManager, that can be easily solved.

The problem

Imagine you want to test something in a demo setup with 5 machines. You create the necessary VMs in your local KVM/libvirt environment – but you cannot address them properly by name. With 5 machines you also need to write down the appropriate IP addresses – that’s hardly practical.

It is possible to create static entries in the libvirt network configuration – however, that is still very inflexible, difficult to automate and only works for name resolution inside the libvirt environment. When you want to ssh into a running VM from the host, you again have to look up the IP.

Name resolution in the host network would be possible by additionally adding each entry to /etc/hosts. But that would require managing two lists at the same time: not automated, far from dynamic, and very cumbersome.

The solution

Luckily, there is an elegant solution: libvirt comes with its own built-in DNS server, dnsmasq. Configured properly, it can serve DHCP and DNS to machines in a previously defined domain. Additionally, NetworkManager can be configured to use its own dnsmasq instance to resolve DNS entries – forwarding requests to the libvirt instance if needed.

That way, the only thing which has to be done is setting a proper host name inside the VMs. Everything else just works out of the box (with a recent Linux, see below).

The solution presented here is based on a great post from Dominic Cleal.

Configuring libvirt

First of all, libvirt needs to be configured. Given that the network “default” is assigned to the relevant VMs, the configuration should look like this:

$ sudo virsh net-dumpxml default
<network connections='1'>
  <name>default</name>
  <uuid>158880c3-9adb-4a44-ab51-d0bc1c18cddc</uuid>
  <forward mode='nat'>
    <nat>
      <port start='1024' end='65535'/>
    </nat>
  </forward>
  <bridge name='virbr0' stp='on' delay='0'/>
  <mac address='52:54:00:fa:cb:e5'/>
  <domain name='qxyz.de' localOnly='yes'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.128' end='192.168.122.254'/>
    </dhcp>
  </ip>
</network>

You can modify the network, for example, with the command virsh net-edit default. The interesting part is below the mac address: a local domain is defined and marked as localOnly. That domain will be the authoritative domain for the relevant VMs, and libvirt will configure dnsmasq to act as a resolver for that domain. The localOnly attribute makes sure that DNS requests regarding that domain will never be forwarded upstream. This is important to avoid forwarding loops.

Note, however: as mentioned in the comment by taurus, your domain should not be named “local” because this might cause trouble in relation to mDNS.

Configuring the VM guests

When the domain is set, the guests inside the VMs need to be defined. With recent Linux releases this is as simple as setting the host name:

$ sudo hostnamectl set-hostname neon.qxyz.de

There is no need to enter the host name anywhere else: the command above takes care of that. The default configuration of DHCP clients in recent Linux releases sends this hostname together with the DHCP request, and dnsmasq picks the host name up automatically if the domain matches.

If you are on a Linux where the hostnamectl command does not work, or where the DHCP client does not send the host name with the request – switch to a recent version of Fedora or RHEL 😉

On such systems the host name must be set manually. To do so, follow the documentation of your OS, and make sure that the name resolves locally. In addition to the hostname itself, the DHCP configuration must be altered to send the hostname along. For example, in older RHEL and Fedora versions the option

DHCP_HOSTNAME=neon.qxyz.de

has to be added to /etc/sysconfig/network-scripts/ifcfg-eth0.

At this point automatic name resolution between VMs should already work after a restart of libvirt.
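A quick way to verify this from inside one of the guests is a small resolver check. The following Python sketch is only an illustration: the helper name resolve is made up for this example, and neon.qxyz.de is the host name set above.

```python
import socket


def resolve(name):
    """Return the sorted IP addresses `name` resolves to, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); the
    # address string is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


addresses = resolve("neon.qxyz.de")
print(addresses if addresses else "neon.qxyz.de is not resolvable from here")
```

If the list is empty on a guest, it is worth checking whether the guest actually sent its host name with the DHCP request.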

Configuring NetworkManager

The last missing piece is the configuration of the actual KVM/libvirt host, so that the local domain, here qxyz.de, is properly resolved. Adding another name server to /etc/resolv.conf might work for a workstation with a fixed network connection, but certainly does not work for laptops which have changing network connections and DNS servers all the time. In such cases, the NetworkManager is often used anyway so we take advantage of its capabilities.

First of all, NetworkManager needs to start its own version of dnsmasq. That can be achieved with a simple configuration option:

$ cat /etc/NetworkManager/conf.d/localdns.conf 
[main]
dns=dnsmasq

This second dnsmasq instance just works out of the box. All DNS requests will automatically be forwarded to DNS servers acquired by NetworkManager via DHCP, for example. The only notable difference is that the entry in /etc/resolv.conf is different:

# Generated by NetworkManager
search whatever
nameserver 127.0.0.1

Now as a second step the second dnsmasq instance needs to know that for all requests regarding qxyz.de the libvirt dnsmasq instance has to be queried. This can be achieved with another rather simple configuration option, given the domain and the IP from the libvirt network configuration at the top of this blog post:

$ cat /etc/NetworkManager/dnsmasq.d/libvirt_dnsmasq.conf 
server=/qxyz.de/192.168.122.1

And that’s it, already. Restart NetworkManager and everything should be working fine.
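Since the domain and the IP address in that line have to match the libvirt network definition, the line can also be derived from the XML instead of being typed by hand. The following Python sketch shows the idea; the function name dnsmasq_server_line is hypothetical, and the XML is a trimmed-down copy of the network definition from the top of this post. In practice you would feed it the output of virsh net-dumpxml default.

```python
import xml.etree.ElementTree as ET


def dnsmasq_server_line(network_xml):
    """Build the dnsmasq 'server=' line from a libvirt network definition."""
    root = ET.fromstring(network_xml)
    domain = root.find("domain").get("name")
    address = root.find("ip").get("address")
    return f"server=/{domain}/{address}"


# Trimmed-down version of the network definition shown earlier
network_xml = """
<network>
  <name>default</name>
  <domain name='qxyz.de' localOnly='yes'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'/>
</network>
"""
print(dnsmasq_server_line(network_xml))  # server=/qxyz.de/192.168.122.1
```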

As a side note: if the attribute localOnly had not been set in the libvirt network configuration, queries for unknown qxyz.de entries would be forwarded from the libvirt dnsmasq to the NetworkManager dnsmasq – which would again forward them to the libvirt dnsmasq, and so on. That would quickly overload your dnsmasq servers, resulting in error messages:

dnsmasq[15426]: Maximum number of concurrent DNS queries reached (max: 150)

Summary

With these few, simple changes a local domain is established for both guests and host, making it easy to resolve their names everywhere. There is no need to maintain one or even two lists of static IP entries; everything is done automatically.

For me this is a huge relief, making it much easier in the future to set up demo and test environments. Also, it looks much nicer during a demo if you have FQDNs and not IP addresses. I can only recommend this setup to everyone who often uses libvirt/KVM on a local machine for test/demo environments.

[Short Tip] Fix mount problems in RHV during GlusterFS mounts

When using Red Hat Virtualization or oVirt together with GlusterFS, there might be a strange error during the first creation of a storage domain:

Failed to add Storage Domain xyz.

One of the rather easy-to-fix reasons might be a permission problem: a freshly exported Gluster file system belongs to the user root. However, the virtualization manager (ovirt-m or RHV-M) does not have root rights and thus needs a different ownership.

In such cases, the fix is to mount the exported volume and set the ownership to the user and group the virtualization stack uses: UID and GID 36 (vdsm:kvm on oVirt/RHV hosts).

$ sudo mount -t glusterfs 192.168.122.241:my-vol /mnt
$ sudo chown -R 36:36 /mnt

Afterwards, the volume can be mounted properly. Some more general details can be found at RH KB 78503.
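To check whether a mounted volume still contains entries with the wrong owner, a few lines of Python are enough. This is just a sketch: the function name wrong_owner is made up for this example, and the defaults of 36 for UID and GID are taken from the chown call above.

```python
import os


def wrong_owner(root, uid=36, gid=36):
    """Return all paths at or below `root` that are not owned by uid:gid."""
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, name) for name in dirnames + filenames)
    mismatches = []
    for path in paths:
        info = os.lstat(path)  # lstat: do not follow symlinks
        if (info.st_uid, info.st_gid) != (uid, gid):
            mismatches.append(path)
    return mismatches
```

Calling wrong_owner("/mnt") after the chown should return an empty list; anything it does return is a path that still belongs to another user or group.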

Impressions from #AnsibleFest London 2016 [Update]

The #AnsibleFest took place in London, and I was lucky enough to attend. This post shares some impressions from the event, together with interesting announcements and stories.

Update: The slides of the various presentations are now available.

Preface

The #AnsibleFest London 2016 took place near the O2 Arena and lasted the entire day. The main highlight of the conference was the network automation now coming along with Ansible. Other very interesting talks covered helpful tips about managing Windows Servers, the 101 on modules, how to implement continuous deployment, the journey of a French bank towards DevOps, how Cisco devices can be managed and how to handle immutable infrastructure. All focused on Ansible, of course.

But while the conference took place on Thursday, the #AnsibleFest already started the evening before, at the social event Ansible Social.
And it was a wonderful evening: many people from Ansible, partners, coworkers from Red Hat and others were there to enjoy drinks, food and chatting through the evening. Getting to know many of the people there went pretty well; it was a friendly bunch meeting at a pretty nice place.

Keynote

Upon arrival at the conference area one of the sponsor desks immediately caught the eye: Cisco!
For everyone following Ansible news closely it was obvious that networking would be a big topic, especially since it was about to be featured twice during the day, once by Peter Sprygada from Ansible and later on by Fabrizio Maccioni from Cisco.

And this impression was confirmed when Todd Barr came to the stage and talked about the current state of Ansible and what to expect in the near future: networking is a big topic for Ansible right now, they are pushing resources into the topic and already hinted that there would be a larger announcement during the #AnsibleFest. During the presentation the strengths of Ansible were of course emphasized again: that it is simple to set up, to understand and to deploy. And that it does not require agents. While I do have my past with Puppet and still like it as a tool in certain circumstances, I must admit that I had to smile at the slide about agents.
I have to admit, for many customers and many setups this is in fact true: they do not want agents for various reasons. And Ansible can deliver actual results without any need for a client.

The future of Ansible

Next up was Bill Nottingham, laying out the road map for the future of Ansible. A focus is certainly better integration of Windows (no beta tag anymore!), better testing – and Python 3 support! It was acknowledged that more and more distributions out there no longer provide Python 2 and that they need to be catered for.
Ansible Tower was also covered, of course, and has very promising improvements coming up as well: the interface will be streamlined, the credentials and rights system will be improved, and there will be (virtual) appliances to get Ansible Tower out of the box in an instant. But the really exciting part is more large-scale, enterprise focused: Ansible Tower will be able to cater for federated setups, meaning distributed replication of Ansible Tower commands via proxy Towers.
Don’t expect all of this in the next weeks, but we might see many of these features already in Ansible Tower 3.0. And it was mentioned that there might be a release in early fall.

Scaling abilities are indeed needed – many data centers these days have more than one location, or are spread over several departments and thus need partially independent setups to manage the infrastructure. At the same time, there are Ansible customers who are using Ansible with 50k nodes and more out there, and they have a demand for fine grained, federated infrastructure setups as well.

Networking with Ansible

While the upcoming Ansible Tower had some exciting news, the talk about networking support by Peter Sprygada really blew everyone away. Right at the moment of presentation Red Hat issued a press release that they bring DevOps to the network via Ansible:

[Red Hat] is bringing DevOps to networking by extending Ansible – its powerful IT automation and DevOps platform – to include native agentless support for automating heterogeneous network infrastructure devices using the same simple human and machine readable automation language that Ansible provides to IT teams.

Peter picked that up and presented a whole lot of technical details. The most important one was that there are now several networking core modules for commands, configuration and templates.
They cover a wide range of devices:

  • Arista EOS
  • Cisco NXOS
  • Cisco IOS
  • Cisco IOSXR
  • Cumulus Linux
  • Juniper Junos
  • OpenSwitch

While some of these devices were already supported by the raw module or some libraries out there, fully integrated modules supported by Ansible and the network device manufacturers themselves take networking automation to a new level. If you are interested, get the latest Ansible networking right away.

Ansible in a visual effects studio

The next talk was by the customer “Industrial Light and Magic”, a visual effects studio using Ansible to handle their massive setup. It showed in particular how many obstacles you face in your daily routine running data centers and deploying software all the time – and how to tackle them using Ansible’s features. I forgot to take a photo, though…

Ansible & Windows

John Hawkesworth from M*Modal came up to the stage next and delivered a brilliant speech about all the things needed to know when managing Windows with Ansible. Talking about the differences of Ansible 1.9 vs 2.0 briefly, he went over lessons learned like why the backslash should be escaped every time just to be sure (\t …) but also gave his favourite development and modules quite some attention. Turns out the registration module can come in very handy!

Writing modules, 101

Next up, James Cammarata introduced how to write modules for Ansible. Impressively, this was demonstrated live with a module he had written in the days before to control his Philips Hue lights. They could be controlled via Ansible live on stage.
Besides the great live demo the major points of the presentation were:

  • It is quite easy to develop modules.
  • Start off simple, get more complex the further you go down the road.
  • Write a module when your Playbook for a single task exceeds ten lines of code.
  • Write in Python/Powershell when you want it to be integrated with Ansible Core.
  • Write in any language you want if you won’t share it anyway.

While I am sure that other module developers might see some of these points differently, it gives a rather good idea of what to keep in mind when the topic is approached.

Of course, the code for the Philips Hue Ansible module is available on Github.

Continuous deployment

Continuous integration is a huge topic in DevOps, and thus especially with Ansible. Steve Smith of Atlassian picked up the topic and discussed what needs to be taken into account when Ansible is used to enable continuous integration.
And there were many memorable quotes during the talk which made it simply fun to watch. I particularly like this one:

Release features, not dumps.

It means: release when you have something worth releasing – not at arbitrary dates. It is a strong statement against release or maintenance windows and does make sense: after all, why should you release when it’s not worth it? And you certainly will not wait if it is important!

Also, since many maintenance windows are implemented because doing maintenance is hard:

Everything which is hard should be done more often, not less.

Combined with the fact that very complex, but successful enterprises do 300 releases an hour it is clear that continuous deployment is possible – but what is often needed is the right culture and probably at some point a great, simple to use tool able to cater to the needs of complex infrastructure.

Ansible accelerates deployment

The next talk focused on a vertical which might usually say that they are too regulated and “special” to integrate DevOps: financial. Fabrice Bernhard presented the journey of the Bank Société Générale introducing DevOps principles with the help of Ansible to become more agile, more flexible and to be able to respond quicker to changes. The reason for that was summarized in a very good quote:

It’s not the big that eat the small. It’s the fast that eat the slow.

This is true for all the enterprises out there: software enabled companies have attacked almost any given business out there (Amazon vs Walmart, Uber vs cabs, Airbnb vs hotels and hostels, etc.). And there are enough analysts right now who see the banking market as the next big thing which might be seriously disrupted due to mobile payment, blockchain technology and other IT based developments.

But that also shows what the actual change must be about: the new companies do not take over because they have the better technology. They take over because they have a different culture and approach problems totally differently. Thus, to keep up with the development, change your culture. Or, as said on stage:

Automation is about cultural change. Move fast and break things!

DevOps discussion

After these two powerful talks the audience had a chance to catch some breath during the interactive DevOps discussion. It mainly picked up the topics from the previous talks, and it showed that everyone in the room was pretty sure that DevOps as such is more or less a name for the underlying change that enterprises need to adopt – or they will fail in the long term, no matter how big they are.

Managing your Cisco data center – with Ansible

As already mentioned, Fabrizio Maccioni from Cisco had the second talk about managing networks with Ansible.
Interestingly enough, he mentioned that the interest to support Ansible was brought to them by customers who were already managing part of their infrastructure with Ansible. A key point is that Ansible does not require an agent. While Cisco does support some configuration management agents on their hardware, it seems that most of the customers would not do that.

Immutable infrastructure

The last presentation was held by Vik Bhatti from Beamly. Their problem is that sometimes they have to massively scale in seconds. Literally, in seconds. That requires them to have images of machines up and running in no time. They do this with Ansible, having the playbooks right on the images on one hand, and using Ansible to control their image build process on the other. Actually, the image builder is Packer and it uses Ansible to partially build the image.

As a result, down the line they have images ready to deploy and can extend their environment very, very, very quickly. Since they are able to respond that fast, they were able to cut down hardware costs massively.

Final discussions, happy hour

The final panel dealt mainly with questions about Open Source Tower (it will be there eventually, but no fixed date) and similar questions. After that, everyone went to enjoy drinks and a beautiful skyline.

Conclusion

In conclusion the #AnsibleFest was a great success, in terms of the people I met as well as in terms of the technical discussions. I can’t wait to get my hands on the networking modules. I’d like to thank the people from Ansible for making this event possible, and of course my employer Red Hat for making it possible to visit this event.

[Short Tip] Debug Spamassassin within Amavisd

Filtering e-mail for spam and viruses can be done efficiently with Amavisd-New. Besides its own technologies to identify and filter out Spam it can also make use of Spamassassin and its results. However, since Amavisd starts Spamassassin itself, it sometimes is hard to debug when something is not working.

For example in a recent case I saw that the Bayes database was not used when checking for spam characteristics, although the database was properly trained with ham and spam.

Thus first I checked Spamassassin itself:

$ su -s /bin/bash mailuser -c "spamassassin -D -t < ExampleSpam.eml 2>&1"  | tee sa.out

That worked well, the Bayes database was queried, results were shown.

Next, I added $sa_debug = '1,all'; to the Amavisd configuration and ran Amavisd in debug mode:

$ amavisd -c /etc/amavisd/amavisd.conf debug

And that showed the problem: one of the Bayes files had wrong permissions. After fixing those, the filter ran properly again.
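Permission problems like this one are easy to spot once owner and mode of all files in the Bayes directory are listed side by side. The following Python sketch does that; the function name describe_files is made up for this example, and the location of the Bayes directory depends on your setup and is therefore left to the caller.

```python
import os
import stat


def describe_files(directory):
    """Return (name, mode string, uid, gid) for every entry in `directory`."""
    rows = []
    for name in sorted(os.listdir(directory)):
        info = os.lstat(os.path.join(directory, name))
        # stat.filemode() renders the mode like 'ls -l' does, e.g. '-rw-------'
        rows.append((name, stat.filemode(info.st_mode), info.st_uid, info.st_gid))
    return rows
```

A file whose mode or owner differs from its neighbours, or which the user running Amavisd cannot read, is the first candidate to fix.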

[Howto] Solaris 11 on KVM

Recently I had to test a few things on Solaris 11 and wondered how well it works virtualized with KVM. It does – with a few tweaks.

Preface

Testing various different versions of operating systems is easy these days thanks to virtualization. However, I’m mainly used to Linux variants and hardly ever install any other kind of UNIX based OS. Thus I was curious if an installation of Solaris 11 on KVM / libvirt works.

For the test I actually used virt-manager since it does provide neat defaults during the VM setup. But the same comments and lessons learned are true for the command line tool as well.

Setting up the VM

virt-manager usually does not provide Solaris as an operating system type by default in the VM setup dialog. You first have to click on “OS Type”, “Show all OS options” as shown here:
virt-manager Solaris guest picker

Note that a Solaris 11 VM should have at least 2 GB RAM, otherwise the installation and also booting might take very long or run into problems of their own.

The installation runs through – although quite some errors clutter the screen (see below).

Errors and problems

As soon as the machine is started several error messages are shown:

WARNING: /pci@0,0/pci1af4,1100@6,1 (uhci1): No SOF interrupts have been received, this USB UHCI host controller is unusable
WARNING: /pci@0,0/pci1af4,1100@6,2 (uhci2): No SOF interrupts have been received, this USB UHCI host controller is unusable

This shows that something is wrong with the interrupts and thus with the “hardware” of the machine – or at least with the way the guest machine discovers the hardware.

Additionally, even if DHCP is configured, the machine is unable to obtain the networking configuration. A fixed IP address and gateway do not help here, either. The host system might even report that it provides DHCP data, but the guest system continues to request these:

Dez 23 11:11:05 liquidat dnsmasq-dhcp[13997]: DHCPDISCOVER(virbr0) 52:54:00:31:31:4b
Dez 23 11:11:05 liquidat dnsmasq-dhcp[13997]: DHCPOFFER(virbr0) 192.168.122.205 52:54:00:31:31:4b
Dez 23 11:11:09 liquidat dnsmasq-dhcp[13997]: DHCPDISCOVER(virbr0) 52:54:00:31:31:4b
Dez 23 11:11:09 liquidat dnsmasq-dhcp[13997]: DHCPOFFER(virbr0) 192.168.122.205 52:54:00:31:31:4b
Dez 23 11:11:17 liquidat dnsmasq-dhcp[13997]: DHCPDISCOVER(virbr0) 52:54:00:31:31:4b
Dez 23 11:11:17 liquidat dnsmasq-dhcp[13997]: DHCPOFFER(virbr0) 192.168.122.205 52:54:00:31:31:4b
...

Also, when the machine is shutting down and ready to be powered off, the CPU usage spikes to 100 %.

The solution: APIC

The solution for the “hardware” problems mentioned above and also for the networking trouble is to deactivate an APIC feature inside the VM: x2APIC, Intel’s programmable interrupt controller. Some more details about the problem can be found in the Red Hat Bugzilla entry #1040500.

To apply the fix, the virtual machine definition needs to be edited to disable the feature. The XML definition can be edited with the command sudo virsh edit, with the machine name as command line option; the change needs to be made in the cpu section as shown below. Make sure the VM is stopped before the changes are made.

$ sudo virsh edit krypton
...
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>Broadwell</model>
    <feature policy='disable' name='x2apic'/>
  </cpu>
$ sudo virsh start krypton

After these changes Solaris does not report any interrupt problems anymore and DHCP works without flaws. Note however that the CPU still spikes at power off. If anyone knows a solution to that problem I would be happy to hear about it and add it to this post.
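If several guests need this tweak, it can be handy to check a domain definition programmatically. The following Python sketch parses a domain XML and reports whether x2apic is disabled; the function name x2apic_disabled is hypothetical, and the XML is a shortened stand-in for what sudo virsh dumpxml krypton would print.

```python
import xml.etree.ElementTree as ET


def x2apic_disabled(domain_xml):
    """Return True if the cpu section disables the x2apic feature."""
    root = ET.fromstring(domain_xml)
    for feature in root.findall("./cpu/feature"):
        if feature.get("name") == "x2apic" and feature.get("policy") == "disable":
            return True
    return False


# Shortened domain definition, reduced to the part that matters here
domain_xml = """
<domain type='kvm'>
  <name>krypton</name>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>Broadwell</model>
    <feature policy='disable' name='x2apic'/>
  </cpu>
</domain>
"""
print(x2apic_disabled(domain_xml))  # True
```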

[Howto] OpenSCAP – basics and how to use in Satellite

Security compliance policies are common in enterprise environments and must be evaluated regularly. This is best done automatically – especially if you talk about hundreds of machines. The Security Content Automation Protocol provides the necessary standards around compliance testing – and OpenSCAP implements these in Open Source tools like Satellite.

Background

Security can be ensured by various means. One of the processes in enterprise environments is to establish and enforce sets of default security policies to ensure that all systems at least follow the same set of IT baseline protection.

Part of such a process is to check the compliance of the affected systems regularly and document the outcome, positive or negative.

To avoid checking each system manually – repeating the same steps again and again – a defined method to describe policies and how to test these was developed: the Security Content Automation Protocol, SCAP. In simple words, SCAP is a protocol that describes how to write security compliance checklists. In the real world, the concept behind SCAP is a little bit more complicated, and it is worth reading through the home page to understand it.

OpenSCAP is a certified Open Source implementation of the Security Content Automation Protocol and enables users to run the mentioned checklists against Linux systems. It is developed in the broader ecosystem of the Fedora Project.

How to use OpenSCAP on Fedora, RHEL, etc.

Checking the security compliance of systems requires, first and foremost, a given set of compliance rules. In a real world environment the requirements of the given business would be evaluated and the necessary rules would be derived. Some industries also have pre-defined rules.

For a start it is sufficient to utilize one of the existing rule sets. Luckily, the OpenSCAP packages in Fedora, Red Hat Enterprise Linux and related distributions are shipped with a predefined set of compliance checks.

So, first install the necessary software and compliance checks:

$ sudo dnf install scap-security-guide openscap-scanner

Check which profiles (checklists, more or less) are installed:

$ sudo oscap info /usr/share/xml/scap/ssg/content/ssg-fedora-ds.xml
Document type: Source Data Stream
Imported: 2015-10-20T09:01:27

Stream: scap_org.open-scap_datastream_from_xccdf_ssg-fedora-xccdf-1.2.xml
Generated: (null)
Version: 1.2
Checklists:
Ref-Id: scap_org.open-scap_cref_ssg-fedora-xccdf-1.2.xml
Profiles:
xccdf_org.ssgproject.content_profile_common
Referenced check files:
ssg-fedora-oval.xml
system: http://oval.mitre.org/XMLSchema/oval-definitions-5
Checks:
Ref-Id: scap_org.open-scap_cref_ssg-fedora-oval.xml
No dictionaries.
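When this output is needed in a script, for example to run every available profile in turn, the profile IDs can be picked out of the text. The following Python sketch assumes that profile IDs start with xccdf_ and contain _profile_, as in the output above; the function name list_profiles is made up for this example, and the sample is an excerpt of that output.

```python
def list_profiles(oscap_info_output):
    """Collect all XCCDF profile IDs mentioned in 'oscap info' output."""
    profiles = []
    for line in oscap_info_output.splitlines():
        for word in line.split():
            if word.startswith("xccdf_") and "_profile_" in word:
                profiles.append(word)
    return profiles


# Excerpt of the output shown above
sample = """\
Checklists:
Ref-Id: scap_org.open-scap_cref_ssg-fedora-xccdf-1.2.xml
Profiles:
xccdf_org.ssgproject.content_profile_common
Referenced check files:
ssg-fedora-oval.xml
"""
print(list_profiles(sample))  # ['xccdf_org.ssgproject.content_profile_common']
```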

Run a test with the available profile:

$ sudo oscap xccdf eval \
--profile xccdf_org.ssgproject.content_profile_common \
--report /tmp/report.html \
/usr/share/xml/scap/ssg/content/ssg-fedora-ds.xml

In this example, the result will be written to /tmp/report.html and roughly looks like this:

Report

If a report is clicked, more details are shown:

Details

The details are particularly interesting if a test fails. They contain rich information about the test itself: the rationale behind the compliance policy to help auditors understand the severity of the failing test, as well as detailed technical information about what was actually checked so that sysadmins can verify the test on their own. Also, linked identifiers provide further information like CVEs and other sources.

Usage in Satellite

Red Hat Satellite, Red Hat’s system management solution to deploy and manage RHEL instances, has the ability to integrate OpenSCAP. The same is true for Foreman, one of the Open Source projects Satellite is based upon.

While the OpenSCAP packages need to be installed separately on a Satellite server, the procedure is fairly simple:

$ sudo yum install ruby193-rubygem-foreman_openscap puppet-foreman_scap_client -y
...
$ sudo systemctl restart httpd && sudo systemctl restart foreman-proxy

Afterwards, SCAP policies can be configured directly in the web interface, under Hosts -> Policies:

Satellite-SCAP

Beforehand you might want to check if proper SCAP content is provided already under Hosts -> SCAP Contents. If no content is shown, change the Organization to “Any Context” – there is currently a bug in Satellite making this step necessary.

When a policy has been created, hosts need to be assigned to the policy. Also, the given hosts should be supplied with the appropriate Puppet modules:

SCAP-Puppet

Due to the Puppet class the given host will be configured automatically, including the SCAP content and all necessary packages. There is no need to do any task on the host.

However, SCAP policies are checked usually once a week, and shortly after installation the admin probably would like to test the new capabilities. Thus there is also a manual way to start a SCAP run on the hosts. First, Puppet must be triggered to run at least once to download the new module, install the packages, etc. Afterwards, the configuration must be checked for the internal policy id, and the OpenSCAP client needs to be run with the id as argument.

$ sudo puppet agent -t
...
$ sudo cat /etc/foreman_scap_client/config.yaml
...
# policy (key is id as in Foreman)

2:
:profile: 'xccdf_org.ssgproject.content_profile_stig-rhel7-server-upstream'
...
$ sudo foreman_scap_client 2
DEBUG: running: oscap xccdf eval --profile xccdf_org.ssgproject.content_profile_stig-rhel7-server-upstream --results-arf /tmp/d20151211-2610-1h5ysfc/results.xml /var/lib/openscap/content/96c2a9d5278d5da905221bbb2dc61d0ace7ee3d97f021fccac994d26296d986d.xml
DEBUG: running: /usr/bin/bzip2 /tmp/d20151211-2610-1h5ysfc/results.xml
Uploading results to ...

If a Capsule is involved as well, the proper command to upload the report to the central server is smart-proxy-openscap-send.

After these steps Satellite provides a good overview of all reports, even on the dashboard:

SCAP-Reports

As you see: my demo system is certainly out of shape! =D

Conclusion

SCAP is a very convenient and widely approved way to evaluate security compliance policies on given systems. The SCAP implementation OpenSCAP is not only compatible with the SCAP standards and even a certified implementation, it also provides appealing reports which can be used to document the compliance of systems while at the same time incorporating enough information to help sysadmins do their job.

Last but not least, the integration with Satellite is quite nice: proper checklists are already provided for RHEL and others, Puppet makes sure everything just falls into place, and there is a neat integration into the management interface which offers RBAC for example to enable auditors to access the reports.

So: if you are dealing with security compliance policies in your IT environment, definitely check out OpenSCAP. And if you have RHEL instances, take your Satellite and start using it – right away!!