How to build a Puppet repo using r10k with roles and profiles

Puppet has evolved a lot over the years. Battle-worn and weary operations people have come up with methods and patterns for organising Puppet repos, and new tools have emerged that take care of some of the annoying bits of managing Puppet. Avoiding the age-old problem of monolithic, scary piles of code that everyone's afraid to touch is now possible, as long as you take care to apply well-worn patterns and ideas. In this article we'll take a look at r10k and some common design patterns for Puppet.

r10k for repo management

If you haven't heard of r10k yet, please take a look at the project's homepage and also at Adrien Thebo's introductory post on the project. r10k tries to solve several problems that have cropped up over years of Puppet use in most ops teams, among them managing modules, managing environments and deployment.

r10k comes in the form of a Ruby gem that provides several commands for managing a Puppet repo. To use it we must split a Puppet repo into two parts: a control repo that contains some configuration, including an r10k config file, and a second repo, which the r10k config points to and which contains our actual Puppet code. Each branch of the second repo is checked out by r10k into a separate folder and functions as a directory environment, so the set of environments available to you is dictated by the branches of your Puppet repo. You can assign agents to an environment through their own local puppet.conf file, or on your Puppet master using an ENC.
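
To make the branch-to-environment mapping concrete, here is a minimal shell sketch that reproduces the resulting directory layout by hand. It is an illustration only and not how r10k itself is implemented; the function name deploy_branches and both of its arguments are hypothetical.

```shell
# Illustration only: export every branch of a local Git repo into its own
# directory, mimicking the "one directory environment per branch" layout.
# deploy_branches, $repo and $basedir are hypothetical names for this sketch.
deploy_branches() {
    repo="$1"      # path to a local clone of the Puppet repo
    basedir="$2"   # directory that will hold one subdirectory per branch

    git -C "$repo" for-each-ref --format='%(refname:short)' refs/heads |
    while read -r branch; do
        mkdir -p "$basedir/$branch"
        # git archive writes a tar stream of the branch's files to stdout
        git -C "$repo" archive "$branch" | tar -xf - -C "$basedir/$branch"
    done
}

# Example: deploy_branches ~/example.domain.com-inf/puppet /tmp/environments
```

A repo with production and dev branches would end up as /tmp/environments/production and /tmp/environments/dev, which is the shape Puppet expects for directory environments.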

A Puppetfile is used to declare third party modules within your Puppet repo; r10k will download and manage these modules for you, much as Librarian Puppet does. This makes life much easier when working with version control, as the Puppetfile plays the same role as the dependency files used by Bundler, Composer, Pip and the like.

Roles and profiles for config

Let's consider how we are going to structure our Puppet repo. I prefer to use the roles and profiles pattern that has become popular within the Puppet community; to get a full idea of the pattern please read Craig Dunn's post on it. To summarise, our own Puppet code will reside in role and profile modules. Each node is assigned a 'role' class (such as webserver or vpnserver), and each role class includes the 'profile' classes that apply purpose or technology specific config, such as VPN server config, MySQL config and the like.

This method allows you to keep your own modules nicely organised and named in such a way that they shouldn't clash with any third party modules and should hopefully make your repo easy to read and understand, rather than being a winding mess of modules glued together in an incomprehensible fashion.

Hiera for node classification

We are nearly done with the theory; the last piece of the puzzle is to consider how we will assign config to each node. Rather than having node definition files in a nodes folder, or an ENC, we are going to keep things simple and just use Hiera for node classification. We'll have a yaml file named after each node's FQDN, and in each yaml file we will assign a role class from our role module.

Attach the Puppet strings

It's time to get our hands dirty and start Puppetising some infrastructure.

If you've been through my previous tutorial on creating a VPN using Terraform then you may already have a folder to house infrastructure code. If not, you may want to create a folder to hold the various parts of code that will make up your infrastructure, such as Puppet, Vagrant and Terraform code. I tend to name my main infrastructure folder after a domain name belonging to the organisation that the infrastructure is for; for instance I would use 'techpunch.co.uk-inf' as a base folder. In the following examples replace the example.domain.com-inf placeholder as needed.

Let's create the Puppet repo, we will create the control repo afterwards:

cd ~/example.domain.com-inf
mkdir -p puppet/manifests
cd puppet
touch environment.conf Puppetfile manifests/site.pp

In environment.conf we will direct Puppet to the main manifest file (manifests/site.pp) and to our module directories, which will be 'modules' and 'site'. The modules directory will contain third party modules and the site directory will contain local/custom modules, which will be the roles and profiles modules mentioned earlier.

Go ahead and set the contents of environment.conf like so:

# ~/example.domain.com-inf/puppet/environment.conf
manifest = manifests/site.pp
modulepath = modules:site

Since we are going to use Hiera to assign classes and config to our nodes, the main manifest file, site.pp, is going to be very simple: it will just include any classes that have been assigned by Hiera. Set the contents like so:

# ~/example.domain.com-inf/puppet/manifests/site.pp
hiera_include('classes')

Puppetfiles are used to define the third party modules that we want to use; r10k will read a Puppetfile and download the listed modules for us so we don't have to worry about managing them. At the time of writing r10k does not automatically pull in module dependencies, so you will need to add these yourself. Most modules list any required modules in their readme file.

First off let's add a few third party modules that we are going to need into our repo's Puppetfile:

# ~/example.domain.com-inf/puppet/Puppetfile
forge "http://forge.puppetlabs.com"

mod 'mthibaut/users'
mod 'puppetlabs/concat'
mod 'puppetlabs/firewall'
mod 'puppetlabs/stdlib'
mod 'saz/locales'

As mentioned earlier we are going to employ the roles and profiles pattern. Each node will be assigned a class from our roles module by Hiera, and each role will include various classes from the profiles module. These profile classes will configure particular packages (such as OpenVPN) along with any related packages or configuration files.

Create the profiles and roles modules like so:

cd ~/example.domain.com-inf/puppet
mkdir -p site/profiles/manifests site/roles/manifests
touch site/roles/manifests/default.pp

The default.pp role class will just configure some common packages and config for us; later on more detailed roles can be added, e.g. vpn.pp or webserver.pp. For now let's just include a common profiles class that we will create shortly:

# ~/example.domain.com-inf/puppet/site/roles/manifests/default.pp
# default server role
class roles::default {
    include profiles::common
}

Let's go ahead and create the common profile class:

cd ~/example.domain.com-inf/puppet
touch site/profiles/manifests/common.pp

Set the contents like so:

# ~/example.domain.com-inf/puppet/site/profiles/manifests/common.pp
# Config common to all nodes
class profiles::common {
    # common users
    users { 'common': }

    # sshd config
    include profiles::ssh::server

    # base firewall config
    include profiles::firewall::setup

    # common packages needed everywhere
    package {[
            'vim',
            'sudo',
            'screen'
        ]:
        ensure => present,
    }

    # set locale
    class { 'locales':
        default_locale => 'en_GB.UTF-8',
        locales        => ['en_GB.UTF-8 UTF-8'],
    }
}

The idea behind the common profile class is to take care of basic config that we want on every node. This class sets up users (using the 'mthibaut/users' module that we added to the Puppetfile), ssh config, basic firewall settings, a few common packages and finally locale settings. We still need to set up a few things and create a few more profile classes to get everything in place for the common profile.

The users module that we are employing allows user config to be drawn from Hiera, which I find to be a convenient way of creating user accounts. Let's set up the Hiera data folder and create the ubiquitous common.yaml file, where we will place a user account.

cd ~/example.domain.com-inf/puppet
mkdir hiera
touch hiera/common.yaml

Add user details as needed into common.yaml. In this example I am just going to add one user for myself. I am giving this user sudo access along with a password, and I am also adding an SSH key as we are going to turn off password based ssh logins completely.

To generate a password hash you can use the mkpasswd utility; you will need to install the whois package in order to use it:

sudo apt-get install whois
mkpasswd -m sha-512

For the SSH key you will need the contents of the public part of the key that you want to use to log in to your nodes; leave off the ssh-rsa part at the front and the username/email address part at the end:

# ~/example.domain.com-inf/puppet/hiera/common.yaml
---
users_common:
  {username}:
    ensure: present
    uid: 500
    groups:
      - sudo
    comment: {your name}
    managehome: true
    password: '{password hash}'
    shell:  /bin/bash
    ssh_authorized_keys:
      userkey:
        type: 'ssh-rsa'
        key:  '{public key hash}'
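
If you would rather not trim the public key by hand, a small shell helper can pull out just the middle part. This is a sketch that assumes a standard single-line OpenSSH public key file (key type, key body, comment); pubkey_body is a hypothetical helper name.

```shell
# Print only the body of an OpenSSH public key file, i.e. the second
# whitespace-separated field, dropping the key type at the front and the
# username/email comment at the end. pubkey_body is a hypothetical helper.
pubkey_body() {
    awk '{ print $2; exit }' "$1"
}

# Example: pubkey_body ~/.ssh/id_rsa.pub
```

The printed value is what goes into the key field of common.yaml.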

Later on we will set up Hiera in our control repo; we'll do that after we have created the other profile classes used in common.pp.

Let's move on to the ssh profile class. I am going to put this into its own subfolder just to keep things neat and tidy:

cd ~/example.domain.com-inf/puppet
mkdir site/profiles/manifests/ssh
touch site/profiles/manifests/ssh/server.pp

Set the contents of server.pp like so:

# ~/example.domain.com-inf/puppet/site/profiles/manifests/ssh/server.pp
# Sets ssh config for all instances
class profiles::ssh::server {
    package { 'ssh':
        ensure => present,
    } ->
    file { '/etc/ssh/sshd_config':
        ensure  => present,
        owner   => 'root',
        group   => 'root',
        mode    => '0600',
        content => file('profiles/ssh/sshd_config'),
    } ~>
    service { 'ssh':
        ensure     => running,
        hasstatus  => true,
        hasrestart => true,
        enable     => true,
    }
}

The chaining arrows between the declarations indicate dependencies and notifications: -> applies the package before the file, and ~> additionally refreshes the service whenever the file changes. The key element here is the sshd_config file, in which we are going to turn off password based logins and remote root access. Puppet's file function will expect to find files in a 'files' subdirectory of the profiles module; I have popped the sshd_config file into an ssh subdirectory of this files directory to keep things organised. Let's create the folders and the sshd_config file:

cd ~/example.domain.com-inf/puppet
mkdir -p site/profiles/files/ssh
touch site/profiles/files/ssh/sshd_config

Set the contents of the sshd_config file like so (this file is taken from Debian 8, you may want to use a different one depending on your distro):

# ~/example.domain.com-inf/puppet/site/profiles/files/ssh/sshd_config
Port 22
Protocol 2
HostKey /etc/ssh/ssh_host_rsa_key
HostKey /etc/ssh/ssh_host_dsa_key
HostKey /etc/ssh/ssh_host_ecdsa_key
UsePrivilegeSeparation yes

# Lifetime and size of ephemeral version 1 server key
KeyRegenerationInterval 3600
ServerKeyBits 768

# Logging
SyslogFacility AUTH
LogLevel INFO

# Authentication:
LoginGraceTime 20
# No remote root login
PermitRootLogin no
StrictModes yes
# No remote password login
PasswordAuthentication no

RSAAuthentication yes
PubkeyAuthentication yes

IgnoreRhosts yes
RhostsRSAAuthentication no
HostbasedAuthentication no
PermitEmptyPasswords no
ChallengeResponseAuthentication no
X11Forwarding yes
X11DisplayOffset 10
PrintMotd no
PrintLastLog yes
TCPKeepAlive yes

# Allow client to pass locale environment variables
AcceptEnv LANG LC_*

Subsystem sftp /usr/lib/openssh/sftp-server
UsePAM yes
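
Since a mistake in this file can lock you out of a node, it can be worth checking the two settings we care about before committing. The following is a minimal sketch rather than a full validator: it only greps for the two directives, and check_sshd_hardening is a hypothetical helper name.

```shell
# Return success only if the given sshd_config disables both remote root
# logins and password logins. check_sshd_hardening is a hypothetical helper;
# it does not validate the rest of the file.
check_sshd_hardening() {
    conf="$1"
    grep -Eiq '^[[:space:]]*PermitRootLogin[[:space:]]+no[[:space:]]*$' "$conf" &&
    grep -Eiq '^[[:space:]]*PasswordAuthentication[[:space:]]+no[[:space:]]*$' "$conf"
}

# Example:
# check_sshd_hardening site/profiles/files/ssh/sshd_config && echo "sshd_config looks hardened"
```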

We are nearly done with getting everything in place for our common profile class; the last piece that we need is the firewall profile class. Puppet Labs have a nifty firewall module that we are going to use. Create the folder and files for our firewall profile class like so:

cd ~/example.domain.com-inf/puppet
mkdir -p site/profiles/manifests/firewall
touch site/profiles/manifests/firewall/post.pp site/profiles/manifests/firewall/pre.pp site/profiles/manifests/firewall/setup.pp

We are going to use the recommended setup method for the Puppet Labs firewall module. The pre.pp class will set the basic firewall rules and post.pp will be used to block all traffic unless explicitly allowed. A parent class (setup.pp) will be used to set up these two classes and clear out any firewall rules that are not declared in our Puppet repo. Package specific firewall rules (such as allowing http traffic) will live within the profile class for a particular package, so if we create an apache profile class later on it will contain a firewall rule for http traffic, as opposed to putting everything into the firewall profile class.

Set the contents of the firewall profile classes like so:

# ~/example.domain.com-inf/puppet/site/profiles/manifests/firewall/setup.pp
# Clears rules and sets up pre and post classes
class profiles::firewall::setup {
    resources { 'firewall':
        purge => true
    }

    Firewall {
        before  => Class['profiles::firewall::post'],
        require => Class['profiles::firewall::pre'],
    }

    class { ['profiles::firewall::pre', 'profiles::firewall::post']: }

    class { 'firewall': }
}

# ~/example.domain.com-inf/puppet/site/profiles/manifests/firewall/pre.pp
# First off, basic firewall rules
class profiles::firewall::pre {
    Firewall {
      require => undef,
    }

    # Default firewall rules
    firewall { '000 accept all icmp':
        proto  => 'icmp',
        action => 'accept',
    }

    firewall { '001 accept all to lo interface':
        proto   => 'all',
        iniface => 'lo',
        action  => 'accept',
    }

    firewall { '002 reject local traffic not on loopback interface':
        iniface     => '! lo',
        proto       => 'all',
        destination => '127.0.0.1/8',
        action      => 'reject',
    }

    firewall { '003 accept related established rules':
        proto  => 'all',
        state  => ['RELATED', 'ESTABLISHED'],
        action => 'accept',
    }

    firewall { '004 ssh 22':
        dport  => '22',
        proto  => 'tcp',
        action => 'accept',
    }
}

# ~/example.domain.com-inf/puppet/site/profiles/manifests/firewall/post.pp
# Last in firewall rules
class profiles::firewall::post {
    firewall { '998 drop all':
        chain  => 'INPUT',
        proto  => 'all',
        action => 'drop',
        before => undef,
    }

    firewall { '999 drop all':
        chain  => 'FORWARD',
        proto  => 'all',
        action => 'drop',
        before => undef,
    }
}

These are fairly standard rules: pretty much all inbound traffic is blocked apart from pings and ssh access.

So that's our common profile class and default role class finished; we just need to assign the default role class to a node. As mentioned previously we are going to use Hiera for this, using the FQDN of a node to load a Hiera yaml file that will assign classes. We will be using Vagrant to test our repo shortly, so we can just use a dummy FQDN for now:

cd ~/example.domain.com-inf/puppet
mkdir -p hiera/nodes
touch hiera/nodes/default.role.com.yaml

Set the contents of default.role.com.yaml like so:

---
environment: production
classes:
  - roles::default
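
As the number of node files grows it is easy to forget the role in one of them. Here is a rough shell sketch that lists node files which do not assign any class from the roles module; it relies on a simple text match rather than real yaml parsing, and nodes_without_role is a hypothetical helper name.

```shell
# Print every node yaml file that has no "- roles::..." list entry.
# nodes_without_role is a hypothetical helper using a plain text match,
# so it assumes the classes list is written one item per line as above.
nodes_without_role() {
    nodes_dir="$1"
    for f in "$nodes_dir"/*.yaml; do
        [ -e "$f" ] || continue   # directory contains no yaml files
        if ! grep -Eq '^[[:space:]]*-[[:space:]]*roles::' "$f"; then
            echo "$f"
        fi
    done
}

# Example: nodes_without_role ~/example.domain.com-inf/puppet/hiera/nodes
```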

Before moving on you will need to add your Puppet repo into version control, so that our Puppet control repo can manage it. I have only used r10k with Git, so this tutorial assumes a Git based provider; I use Bitbucket since they allow free private repos.

r10k will make each branch of a repo into an environment. Because of this you might want to avoid creating the standard master branch and instead use traditional environment names such as production, qa, stage, dev or test. In my example Puppet repo I have used production as my branch name and done away with the master branch completely.
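
One way to end up with a production branch and no master branch is to point HEAD at production before making the first commit. The sketch below assumes your Git identity (user.name and user.email) is already configured; init_production_repo is a hypothetical helper name and the remote address in the example is a placeholder.

```shell
# Sketch: put an existing directory under Git with 'production' as its only
# branch, so that r10k later creates a 'production' environment from it.
# init_production_repo is a hypothetical helper name.
init_production_repo() {
    repo_dir="$1"
    git -C "$repo_dir" init -q
    # Point HEAD at 'production' before the first commit, so that no
    # master branch is ever created
    git -C "$repo_dir" symbolic-ref HEAD refs/heads/production
    git -C "$repo_dir" add -A
    git -C "$repo_dir" commit -q -m 'Initial Puppet repo'
}

# Example:
# init_production_repo ~/example.domain.com-inf/puppet
# git -C ~/example.domain.com-inf/puppet remote add origin '{git repo address}'
# git -C ~/example.domain.com-inf/puppet push -u origin production
```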

Puppet control

Now we will create the control repo that will use r10k to manage the puppet repo that we just created.

Let's create the structure of the puppet control repo like so:

cd ~/example.domain.com-inf
mkdir -p puppet-ctrl
cd puppet-ctrl
touch Gemfile r10k.yaml hiera.yaml

We also need to install a few things on your local machine to work with r10k. r10k is a Ruby gem, so we are going to use Bundler to install it into our repo locally. This means that we will need to install Bundler globally, for which you'll need Ruby; pretty much every distro comes with Ruby pre-installed nowadays, so it should already be present. Some people may prefer to install r10k globally, but personally I like to keep as many dependencies within a project as possible, so I prefer to bundle r10k into the control repo.

In addition to Bundler we will also need to install Puppet itself, as r10k uses Puppet's module functions to install modules for you.

Let's go ahead and install the prerequisites:

sudo gem install bundler
sudo apt-get install puppet

Set the contents of Gemfile to include r10k like so:

# ~/example.domain.com-inf/puppet-ctrl/Gemfile
source "https://rubygems.org"
gem 'r10k'

The r10k.yaml file is where we set up the Puppet repo that will be managed. Set its contents like so:

# ~/example.domain.com-inf/puppet-ctrl/r10k.yaml
:cachedir: 'cache/r10k'

:sources:
  :base:
    remote: '{github repo address}'
    basedir: 'environments'

Install r10k locally and then run the deploy command to pull in our Puppet repo and its dependencies. The -p flag makes this command pull in any modules from each environment's Puppetfile, and -v just switches on verbose output:

# ~/example.domain.com-inf/puppet-ctrl
bundle install --binstubs --path vendor
bin/r10k deploy environment -p -v

In the output you should see your Puppet repo and the various third party modules getting pulled down into the relevant directories. Note that r10k will create the environments and cache directories for you, so there is no need to check these into Git. In fact I would create a .gitignore file to make sure that they don't get checked in by accident (along with the gems folders):

# ~/example.domain.com-inf/puppet-ctrl/.gitignore
cache/
.bundle/
bin/
environments/
vendor/

If you check the environments folder you should see your Puppet repo in there along with the third party modules in environments/production/modules.

Finally let's set up Hiera and point it to the hiera data folder in our Puppet repo. hiera.yaml should look like this:

# ~/example.domain.com-inf/puppet-ctrl/hiera.yaml
---
:backends: yaml
:yaml:
  :datadir: /etc/puppet/environments/%{environment}/hiera
:hierarchy:
  - "nodes/%{::fqdn}"
  - common

As you can see, Puppet will attempt to load a node specific yaml file based on the node's FQDN and will also include common.yaml for all nodes.
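
The lookup order can be sketched in a few lines of shell. This is an illustration of the hierarchy in hiera.yaml and not Hiera itself; hiera_candidates is a hypothetical helper that prints the data files that exist for a node, most specific first. Because hiera_include performs an array lookup, a classes key in any of the listed files contributes to the node's class list.

```shell
# Illustration of the hierarchy in hiera.yaml (nodes/<fqdn>, then common).
# hiera_candidates is a hypothetical helper, not part of Hiera: it prints
# the yaml files that exist for a node, most specific first.
hiera_candidates() {
    datadir="$1"   # e.g. environments/production/hiera
    fqdn="$2"      # e.g. default.role.com
    for level in "nodes/$fqdn" common; do
        if [ -f "$datadir/$level.yaml" ]; then
            echo "$datadir/$level.yaml"
        fi
    done
}

# Example:
# hiera_candidates ~/example.domain.com-inf/puppet-ctrl/environments/production/hiera default.role.com
```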

Testing with Vagrant

Now that we have a basic r10k Puppet project set up we need a place to test any changes that we make before pushing them to a live server, and Vagrant is the perfect tool for this. In this example I will be using Vagrant with VirtualBox, so go ahead and install both if you don't have them installed already.

Once you're set up, create a vagrant directory in your infrastructure folder, initialise a Vagrantfile, and bring up the box like so:

mkdir ~/example.domain.com-inf/vagrant
cd ~/example.domain.com-inf/vagrant
vagrant init -m debian/jessie64
vagrant up

After a few minutes the box should boot up, hopefully without any errors. I am using a Debian 8 box here as it matches the distro that I chose in my previous tutorial on creating a VPN server. Feel free to use another distro as needed; you can find other Vagrant boxes over at the Vagrant cloud site.

Now that the basics are set up let's configure the box to test our Puppet repo. Vagrant does come with Puppet integration out of the box, however at the time of writing it doesn't play nicely with r10k and directory environments, so we are going to use a short shell provisioner that applies our repo to the Vagrant box.

Let's make our Puppet control repo (which contains the deployed environments) available to Vagrant by symlinking it into the vagrant directory:

cd ~/example.domain.com-inf/vagrant
ln -s ~/example.domain.com-inf/puppet-ctrl puppet

Set the contents of your Vagrantfile like so:

# ~/example.domain.com-inf/vagrant/Vagrantfile
Vagrant.configure(2) do |config|
  config.vm.box = "debian/jessie64"
 
  # match the hiera node file from earlier
  config.vm.hostname = "default.role.com"

  config.vm.synced_folder "puppet", "/etc/puppet"

  config.vm.provision "shell",
    inline: "
        puppet apply /etc/puppet/environments/production/manifests/site.pp --confdir=/etc/puppet/ --environment=production --environmentpath=/etc/puppet/environments/
    "
end

As you can see we are manually using 'puppet apply' to provision the box, passing in the necessary configuration settings to let Puppet know about our environment settings. Running 'vagrant provision' will kick off the Puppet run. You should see Puppet install all the packages and config that we've specified, hopefully without a hitch.

Deployment ideas

Before moving on you may be thinking about how to go about deploying your repo. If you take a look through r10k's documentation, the recommended method is to run r10k's deploy command on the remote servers themselves (your Puppet masters). I am not a big fan of this approach, however. I prefer to keep my servers dumb: I don't even install Git on them unless I really have to, and there's a high chance that my Puppet repo will be private and require a key for access. My preferred method for deploying repos is to use a task runner such as Ansible. It's simple to use and you can easily write a task that will pull in the latest version of your Puppet control repo, use r10k to build it, tar it up and then push it to a remote server.
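
To give a feel for the "tar it up" step, here is a rough sketch in plain shell rather than Ansible. It assumes the control repo has already been built with 'bin/r10k deploy environment -p'; package_control_repo, the host name and the target paths in the example are all hypothetical placeholders.

```shell
# Sketch: bundle a built control repo into a tarball ready to be copied to a
# Puppet master. package_control_repo and its arguments are hypothetical.
package_control_repo() {
    ctrl_dir="$1"   # control repo after 'bin/r10k deploy environment -p'
    out="$2"        # tarball to create
    # Ship only what the master needs and skip the Git metadata that the
    # environment checkouts contain
    tar -czf "$out" --exclude='.git' -C "$ctrl_dir" hiera.yaml environments
}

# Example (host name and paths are placeholders):
# package_control_repo ~/example.domain.com-inf/puppet-ctrl /tmp/puppet.tar.gz
# scp /tmp/puppet.tar.gz puppet.example.com:/tmp/
# ssh puppet.example.com 'sudo tar -xzf /tmp/puppet.tar.gz -C /etc/puppet'
```

Because the tarball already contains the deployed environments and their modules, the remote server needs neither Git nor r10k, which fits the "dumb server" approach described above.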

Puppet lint

On a final note, it's worth taking the time to lint your Puppet repo; it's easy to do and will make your code much easier to read. Puppet lint is a Ruby gem that you can install locally into your repo, allowing you to lint on the command line. If you use Sublime Text there is also a linter plugin that you can use.

Let's add the puppet-lint gem to our Puppet repo:

cd ~/example.domain.com-inf/puppet
vi Gemfile

Set the Gemfile to include puppet-lint:

# ~/example.domain.com-inf/puppet/Gemfile
source "https://rubygems.org"
gem 'puppet-lint'

Install the gem:

bundle install --binstubs --path vendor

You can run the linter like so:

# lint and output errors within the site folder
bin/puppet-lint site/
# lint and auto fix errors within the site folder
bin/puppet-lint -f site/

Done and done

There you go, you now have a well structured, linted Puppet repo. If you want to write tests for your repo you can look into using rspec and hook your repo into Travis or another CI server. If you're interested in Terraform you may want to take a look at my shiny tutorial on it and also my follow up tutorial on using Puppet with Terraform.