Our GHOSTbusting DevOps Engineer, Matt Macdonald-Wallace (@ProfFalken), reflects on the recent Linux security scare. He talks us through how an innovative approach to DevOps can solve these problems and prevent them from happening in the future.
The GHOST vulnerability struck fear into the hearts of many Linux systems administrators recently, when it was revealed that one of the core libraries in the Linux operating system (glibc, the GNU C Library) had been vulnerable to attack for years.
The vulnerability meant that an attacker could potentially take control of an internet-facing Linux system by sending a carefully crafted message to a web server or email server running on that host.
At DevOpsGuys we’re big fans of test-driven development (TDD) and we’re always looking for ways to share knowledge between Dev and Ops. So when GHOST gave us an opportunity to try out a new way of managing system upgrades across a large number of customers, we leapt at the chance!
The plan was to follow TDD methods to create a monitoring check that would fail (the test) and then patch systems until it passed (the “development”).
The monitoring scripts were written as a “minimum-viable” check for those Linux distributions that we use in house and then added to our monitoring system.
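To make the idea concrete, here is a minimal sketch of what such a check could look like. It is not the script we published. It assumes the monitoring agent understands Nagios-style exit codes (0 for OK, 2 for critical, 3 for unknown), and it assumes that comparing against upstream glibc 2.18, the first upstream release containing the fix, is good enough. That second assumption is a real simplification: distributions that backport the fix keep an older version number, so a production check would need to compare distribution package versions instead. The function names are hypothetical.

```python
import platform

# Assumption: upstream glibc 2.18 is the first release containing the GHOST fix.
# Distributions that backport the fix keep an older version number, so this
# simple comparison would wrongly flag a patched host on those systems.
FIRST_FIXED = (2, 18)

# Conventional Nagios plugin exit codes (assumed to be what the agent expects).
OK, CRITICAL, UNKNOWN = 0, 2, 3


def parse_version(text):
    """Turn a version string such as '2.17' into a tuple of ints, here (2, 17)."""
    return tuple(int(part) for part in text.split(".")[:2])


def evaluate(lib, version):
    """Return (exit_code, message) for a libc name and version string."""
    if lib != "glibc" or not version:
        return UNKNOWN, "UNKNOWN - could not determine glibc version"
    try:
        parsed = parse_version(version)
    except ValueError:
        return UNKNOWN, "UNKNOWN - unparseable glibc version %r" % version
    if parsed < FIRST_FIXED:
        return CRITICAL, "CRITICAL - glibc %s is older than 2.18" % version
    return OK, "OK - glibc %s" % version


def main():
    """Inspect the libc the Python interpreter is linked against and report."""
    lib, version = platform.libc_ver()
    code, message = evaluate(lib, version)
    print(message)
    return code
```

Saved as a plugin script, this would finish with `sys.exit(main())` (after `import sys`) so that the exit code reaches the monitoring agent. In TDD terms, a critical result is the failing test and patching the host is the change that makes it pass.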
We use Dataloop.io for monitoring, and this meant that rolling out our scripts to all our hosts was easy: write the script, add it as a Dataloop plugin, add that plugin to the relevant tag and wait for Dataloop to deploy the check. In total it took less than three minutes to roll out our check to all of our systems and have it reporting back.
The next step was to create a custom dashboard to see all the hosts that had the plugin enabled. Again, this was trivial to do using Dataloop’s drag-and-drop dashboard creation tool, and within minutes we had a complete overview of all the servers under our control that were being monitored for GHOST. Many of them were showing as “critical”, meaning that they still needed the update.
The great thing about approaching the problem this way was that we were able to see whether any hosts were already secure (anything running Ubuntu 14.10 or later was not affected by this vulnerability) and confirm that the check worked correctly.
We then ran our update procedures as normal. This time, however, we were able to prove that all of our servers had received the update by watching the dashboard turn green as each server updated and the check was executed again.
We’re going to keep the checks in place so that we know in future if any of our servers for some reason revert to a “bad” version and from now on, any major vulnerability like this will be dealt with in the same way.
Special thanks to Steven Acreman of Dataloop.io for helping us modify the scripts and for ensuring the data showed up on the dashboards in the way we required.
If you want to follow our example and check your systems in the same way, we’ve open-sourced the checks and you can download them from our Public Checks Git repository.