
Writing Ansible Roles with Confidence

Headshot of Matthew Heckathorn

You’ve been automating your repetitive sysadmin tasks with Ansible for a while now. You’ve bounced troublesome services, deployed objects to Kubernetes, updated systems, and performed rolling restarts. Maybe you have even deployed and configured services using roles you found on Ansible Galaxy. However, it’s finally happened. You can’t find a role on Ansible Galaxy that does what you want it to (you checked, right?), or you have been tasked with writing a role for your organization’s app. Whatever the reason, it’s time for you to write your own role. How do you do that in a way where you can be confident your role works as intended? This post will answer that question by providing guidance on how to best begin developing Ansible roles.

Infrastructure as Code and DevOps

Before we dive into specifics, it would be helpful to define some terminology. Ansible falls under the infrastructure as code (IaC) tool umbrella. IaC is the process of managing and provisioning IT resources through machine-readable definition files. This approach allows infrastructure configuration to be stored in Git, which brings all the benefits of version control, including branching, history, and review and approval policies.

DevOps is the set of practices that combines software development (dev) and IT operations (ops) to shorten the development lifecycle and provide continuous delivery of software. IaC tools are one piece of the puzzle that allow organizations to adopt a DevOps approach. The DevOps process is commonly depicted as follows:

Illustration of the DevOps process.
Figure 1: The DevOps Infinity Loop.

It is helpful to have this process in mind while developing your Ansible roles, because as software they can also benefit from a DevOps approach.

Ansible Roles

In Ansible, roles are a method of automatically loading certain variables, tasks, files, templates, and handlers based on a known file structure. Grouping content by roles allows for easy sharing and reuse. The Ansible documentation on roles outlines the file structure and other considerations.

When developing roles, you will need to deal with various concerns, including which operating system(s) and version(s) you will support and whether you need to target a single node or a cluster of machines. It is also often important to start from a fresh state every time you rerun your role while developing it to ensure that (1) your role completes successfully on its first run, and (2) changes performed on previous runs aren’t affecting the outcome. You should also verify that your role is idempotent, so that no matter how many times it is executed, you achieve the same result. It is equally important to verify that the role changes things only when they need to be changed. For example, a service should only be restarted if configuration changes warrant a restart.
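The idempotence and change-only-when-needed ideas above can be illustrated outside Ansible with a minimal Python sketch. The function name ensure_line, the temporary file, and the port=8080 setting are hypothetical; they only stand in for what an Ansible task and its handler would do.

```python
import os
import tempfile


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be changed and False if it was
    already in the desired state (similar to Ansible's changed/ok).
    """
    existing = []
    if os.path.exists(path):
        with open(path) as handle:
            existing = handle.read().splitlines()
    if line in existing:
        return False  # desired state already reached, nothing to do
    with open(path, "a") as handle:
        handle.write(line + "\n")
    return True


# Hypothetical config file used only for this illustration.
config_path = os.path.join(tempfile.mkdtemp(), "example.conf")

first_run_changed = ensure_line(config_path, "port=8080")
second_run_changed = ensure_line(config_path, "port=8080")

for changed in (first_run_changed, second_run_changed):
    # Only a real change warrants a (simulated) service restart.
    print("restart service" if changed else "nothing to do")
```

The second call leaves the file untouched and reports no change, which is the behavior you want every task in your role to have on a rerun.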

The final thing to consider is how to check that your role has done what you intend it to do. Logging in to a target node and manually checking is certainly one way to do it. However, it is better to write tests that can be automatically run after your Ansible role to verify the actual state.

To automate these types of tests, you’ll need a target host (or set of hosts) and you’ll need to destroy and recreate that host (or set of hosts) constantly during the development process. You will also be responsible for managing connections to the host (or hosts) in one of the Ansible-supported methods. In addition, you must hand off to your chosen testing tool and handle its connection to the node(s). Managing this development and testing infrastructure can be tedious and will eat up a lot of the time that could be better spent on role features.

Enter Molecule

Molecule is a project for facilitating the development and testing of Ansible roles by handling the previously outlined set of concerns and streamlining the entire role development process. Pairing Molecule with Docker as its driver enables you to quickly and easily develop your roles against any number of freshly deployed operating systems and versions simultaneously. Molecule also has built-in idempotence checking and support for a variety of verification testing methods.

To get started, first install Docker. Then, assuming you have pip3, install Molecule and its supporting dependencies:

pip3 install yamllint ansible "molecule[docker]" docker pytest-testinfra

With Molecule and its dependencies installed, it is time to begin developing an Ansible role. Molecule can assist here by building out your role’s file structure and Molecule’s own required config files for you:

$ molecule init role maheckathorn.example -d docker
$ tree
.
├── README.md
├── defaults
│   └── main.yml
├── files
├── handlers
│   └── main.yml
├── meta
│   └── main.yml
├── molecule
│   └── default
│       ├── converge.yml
│       ├── molecule.yml
│       └── verify.yml
├── tasks
│   └── main.yml
├── templates
├── tests
│   ├── inventory
│   └── test.yml
└── vars
    └── main.yml

10 directories, 11 files

Within the Molecule folder, the following files have specific purposes:

  • converge.yml is the playbook file that contains the call for your role. Molecule will invoke this playbook with ansible-playbook and run it against an instance created by the driver, which is Docker in our scenario.
  • molecule.yml is the central configuration entrypoint for Molecule. With this file, you can configure each tool that Molecule will employ when testing your role.
  • verify.yml is the Ansible playbook used for testing. Because Ansible is the default verifier, this file lets you write specific tests against the state of the container after your role has finished executing. Other verifier tools are available. (Note that Testinfra was the default verifier prior to Molecule version 3.)

The molecule.yml file contains different sections for configuring how Molecule components behave:

  • The dependency manager: Molecule uses Galaxy by default to resolve your role dependencies.
  • The driver provider: Molecule uses Docker by default. Molecule uses the driver to delegate the task of creating instances.
  • The lint command: Molecule can call external commands to ensure that best practices are encouraged. Note: Ansible-lint is not included with molecule or molecule[lint].
  • The platforms definitions: Molecule relies on this to know which instances to create and name and to identify which group each instance belongs in. If you need to test your role against multiple popular distributions (CentOS, Fedora, Debian), you can specify that in this section.
  • The provisioner: Molecule only provides an Ansible provisioner. Ansible manages the lifecycle of the instance based on this configuration.
  • The scenario definition: Molecule relies on this configuration to control the scenario sequence order.
  • The verifier framework: Molecule uses Ansible by default to provide a way to write specific state checking tests (such as deployment smoke tests) on the target instance.

There are many options for configuring these sections to meet your needs. However, sticking to a common configuration file across projects helps to set standard expectations. The following is based on Jeff Geerling’s common molecule.yml file:

https://github.com/cmu-sei/ansible-role-silk/blob/master/molecule/default/molecule.yml

---
dependency:
  name: galaxy
driver:
  name: docker
platforms:
  - name: instance
    image: "geerlingguy/docker-${MOLECULE_DISTRO:-centos7}-ansible:latest"
    command: ${MOLECULE_DOCKER_COMMAND:-""}
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: true
    env:
      http_proxy: "${http_proxy}"
      https_proxy: "${https_proxy}"
      no_proxy: "${no_proxy}"
provisioner:
  name: ansible
  playbooks:
    converge: ${MOLECULE_PLAYBOOK:-converge.yml}
verifier:
  name: testinfra
  options:
    v: 1

The key things to note in this file are the use of Jeff Geerling’s Docker images as the image source, the configuration of Testinfra as the verifier, and the MOLECULE_DISTRO environment variable with centos7 as the default. The custom premade Docker images already contain Python, Ansible, and systemd. They help to speed up test runs because Molecule needs to do nothing other than fetch an image in order to use it. The MOLECULE_DISTRO environment variable allows you to easily test against other OS types and versions by setting it on the command line:

$ MOLECULE_DISTRO=ubuntu1804 molecule test

Other prebuilt images are published under the same geerlingguy namespace on Docker Hub.
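The ${MOLECULE_DISTRO:-centos7} expression follows the shell convention of falling back to a default when the variable is unset or empty. As a rough sketch of how the image name resolves (the helper resolve_image is hypothetical and is not part of Molecule):

```python
def resolve_image(environ):
    """Mimic ${MOLECULE_DISTRO:-centos7}: fall back to the default
    when the variable is unset or empty."""
    distro = environ.get("MOLECULE_DISTRO") or "centos7"
    return f"geerlingguy/docker-{distro}-ansible:latest"


# No variable set: the centos7 default is used.
print(resolve_image({}))

# Variable set, as in the command shown above.
print(resolve_image({"MOLECULE_DISTRO": "ubuntu1804"}))
```

Passing a dictionary instead of reading os.environ directly keeps the sketch easy to try with different values.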

Finally, the verifier section configures Molecule to use Testinfra for verification testing, which was the default prior to Molecule version 3. If you’ve been developing roles with verification tests for a while, being able to configure this setting is useful. It also means the tests directory that Molecule creates is unneeded. You can also edit your converge.yml file to look like the following:

https://github.com/cmu-sei/ansible-role-silk/blob/master/molecule/default/converge.yml

---
- name: Converge
  hosts: all
  roles:
    - role: "{{ lookup('env', 'MOLECULE_PROJECT_DIRECTORY') | basename }}"
  environment:
    http_proxy: "{{ lookup('env', 'http_proxy') }}"
    https_proxy: "{{ lookup('env', 'https_proxy') }}"
    no_proxy: "{{ lookup('env', 'no_proxy') }}"

This configuration helps to avoid issues on continuous integration/continuous deployment (CI/CD) systems and also deals with Ansible project namespace issues.
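To make the role lookup in converge.yml concrete: the basename filter keeps only the last component of the project directory path, and that component is used as the role name. A minimal sketch in Python, assuming the project directory is the one that appears in the log output in this post (Python's os.path.basename gives the equivalent result for this path):

```python
import os

# Assumed value of MOLECULE_PROJECT_DIRECTORY, taken from the paths
# shown in the log output in this post.
project_dir = "/Users/maheckathorn/test/example"

# Keep only the last path component, as the basename filter does.
role_name = os.path.basename(project_dir)
print(role_name)
```

Because the role name is derived from the directory rather than hard-coded, the same converge.yml can be copied between roles unchanged.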

With Molecule configured, we can run through the complete default Molecule scenario, which is a test suite for your new role:

$ molecule test
INFO     default scenario test matrix: dependency, lint, cleanup, destroy, syntax, create, prepare, converge, idempotence, side_effect, verify, cleanup, destroy
INFO     Performing prerun...
INFO     Set ANSIBLE_LIBRARY=/Users/maheckathorn/.cache/ansible-compat/50d858/modules:/Users/maheckathorn/.ansible/plugins/modules:/usr/share/ansible/plugins/modules
INFO     Set ANSIBLE_COLLECTIONS_PATH=/Users/maheckathorn/.cache/ansible-compat/50d858/collections:/Users/maheckathorn/.ansible/collections:/usr/share/ansible/collections
INFO     Set ANSIBLE_ROLES_PATH=/Users/maheckathorn/.cache/ansible-compat/50d858/roles:/Users/maheckathorn/.ansible/roles:/usr/share/ansible/roles:/etc/ansible/roles
INFO     Using /Users/maheckathorn/.cache/ansible-compat/50d858/roles/maheckathorn.example symlink to current repository in order to enable Ansible to find the role using its expected full name.
INFO     Running default > dependency
WARNING  Skipping, missing the requirements file.
WARNING  Skipping, missing the requirements file.
INFO     Running default > lint
INFO     Lint is disabled.
INFO     Running default > cleanup
WARNING  Skipping, cleanup playbook not configured.
INFO     Running default > destroy
INFO     Sanity checks: 'docker'
 
PLAY [Destroy] *****************************************************************
 
TASK [Destroy molecule instance(s)] ********************************************
changed: [localhost] => (item=instance)
 
TASK [Wait for instance(s) deletion to complete] *******************************
FAILED - RETRYING: [localhost]: Wait for instance(s) deletion to complete (300 retries left).
ok: [localhost] => (item=instance)
 
TASK [Delete docker networks(s)] ***********************************************
 
PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
 
INFO     Running default > syntax
 
playbook: /Users/maheckathorn/test/example/molecule/default/converge.yml
INFO     Running default > create
 
PLAY [Create] ******************************************************************
 
TASK [Log into a Docker registry] **********************************************
skipping: [localhost] => (item=None)
skipping: [localhost]
 
TASK [Check presence of custom Dockerfiles] ************************************
ok: [localhost] => (item={'command': '', 'env': '', 'image': 'geerlingguy/docker-centos7-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:ro']})
 
TASK [Create Dockerfiles from image names] *************************************
skipping: [localhost] => (item={'command': '', 'env': '', 'image': 'geerlingguy/docker-centos7-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:ro']})
 
TASK [Discover local Docker images] ********************************************
ok: [localhost] => (item={'changed': False, 'skipped': True, 'skip_reason': 'Conditional result was False', 'item': {'command': '', 'env': '', 'image': 'geerlingguy/docker-centos7-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:ro']}, 'ansible_loop_var': 'item', 'i': 0, 'ansible_index_var': 'i'})
 
TASK [Build an Ansible compatible image (new)] *********************************
skipping: [localhost] => (item=molecule_local/geerlingguy/docker-centos7-ansible:latest)
 
TASK [Create docker network(s)] ************************************************
 
TASK [Determine the CMD directives] ********************************************
ok: [localhost] => (item={'command': '', 'env': '', 'image': 'geerlingguy/docker-centos7-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:ro']})
 
TASK [Create molecule instance(s)] *********************************************
changed: [localhost] => (item=instance)
 
TASK [Wait for instance(s) creation to complete] *******************************
FAILED - RETRYING: [localhost]: Wait for instance(s) creation to complete (300 retries left).
changed: [localhost] => (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': '429858788464.21737', 'results_file': '/Users/maheckathorn/.ansible_async/429858788464.21737', 'changed': True, 'item': {'command': '', 'env': '', 'image': 'geerlingguy/docker-centos7-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:ro']}, 'ansible_loop_var': 'item'})
 
PLAY RECAP *********************************************************************
localhost                  : ok=5    changed=2    unreachable=0    failed=0    skipped=4    rescued=0    ignored=0
 
INFO     Running default > prepare
WARNING  Skipping, prepare playbook not configured.
INFO     Running default > converge
 
PLAY [Converge] ****************************************************************
 
TASK [Gathering Facts] *********************************************************
ok: [instance]
 
PLAY RECAP *********************************************************************
instance                   : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
 
INFO     Running default > idempotence
 
PLAY [Converge] ****************************************************************
 
TASK [Gathering Facts] *********************************************************
ok: [instance]
 
PLAY RECAP *********************************************************************
instance                   : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
 
INFO     Idempotence completed successfully.
INFO     Running default > side_effect
WARNING  Skipping, side effect playbook not configured.
INFO     Running default > verify
WARNING  Skipping, no tests found.
INFO     Running default > cleanup
WARNING  Skipping, cleanup playbook not configured.
INFO     Running default > destroy
 
PLAY [Destroy] *****************************************************************
 
TASK [Destroy molecule instance(s)] ********************************************
changed: [localhost] => (item=instance)
 
TASK [Wait for instance(s) deletion to complete] *******************************
FAILED - RETRYING: [localhost]: Wait for instance(s) deletion to complete (300 retries left).
changed: [localhost] => (item=instance)
 
TASK [Delete docker networks(s)] ***********************************************
 
PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=2    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
 
INFO     Pruning extra files from scenario ephemeral directory

Since we showed a lot of output above, let’s break down the interesting things happening here. Immediately after running molecule test, we get some output telling us which steps in the test process are going to be run:

INFO     default scenario test matrix: dependency, lint, cleanup, destroy, syntax, create, prepare, converge, idempotence, side_effect, verify, cleanup, destroy

As this output shows, by default Molecule runs these steps as part of the test matrix in the order shown. Any time we see a

INFO     Running default >

line in the output, we are looking at a different step in the matrix being run. If you dig through the output, you’ll see that many of the steps are actually skipped by default. For example:

INFO     Running default > dependency
WARNING  Skipping, missing the requirements file.
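To see at a glance which steps did real work and which were skipped, the log lines can be grouped by step. The following is a minimal sketch that assumes the log format shown in this post; the function name summarize_steps is hypothetical, and the sample is a shortened excerpt of the output above.

```python
def summarize_steps(log_text, scenario="default"):
    """Map each step name to True if a 'Skipping' warning followed it.

    Steps that run twice (cleanup, destroy) share one entry, which is
    marked True if any of the runs was skipped.
    """
    marker = f"Running {scenario} > "
    skipped = {}
    current = None
    for line in log_text.splitlines():
        if marker in line:
            # A new step begins; remember its name.
            current = line.split(marker, 1)[1].strip()
            skipped.setdefault(current, False)
        elif current and line.startswith("WARNING") and "Skipping" in line:
            skipped[current] = True
    return skipped


sample = """\
INFO     Running default > dependency
WARNING  Skipping, missing the requirements file.
INFO     Running default > destroy
INFO     Running default > verify
WARNING  Skipping, no tests found.
"""

print(summarize_steps(sample))
```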

In our example, the first step in the matrix where Molecule actually does something is the destroy step. At this step, Molecule interacts with the configured driver, in our case Docker, and attempts to destroy any previous test environments to ensure a new clean testing environment is used. Molecule interacts with the Docker daemon and destroys any running container with our defined name from our molecule.yml file:

platforms:
- name: instance

If a running container with that name does not currently exist, as in our case, it simply moves on. The next step at which Molecule actually does something is the create step. At this point in the process, Molecule interacts with the driver and attempts to create a test environment, using the driver we specified and configuring it in the manner outlined in the platforms section of molecule.yml:

driver:
  name: docker
platforms:
  - name: instance
    image: "geerlingguy/docker-${MOLECULE_DISTRO:-centos7}-ansible:latest"
    command: ${MOLECULE_DOCKER_COMMAND:-""}
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:ro
    privileged: true
    pre_build_image: true
    env:
      http_proxy: "${http_proxy}"
      https_proxy: "${https_proxy}"
      no_proxy: "${no_proxy}"

Since we are using Docker, Molecule handles pulling the container image we defined, setting any Docker runtime options, and running the container in the background. The following lines show the successful creation of our desired test environment:

TASK [Wait for instance(s) creation to complete] *******************************
FAILED - RETRYING: [localhost]: Wait for instance(s) creation to complete (300 retries left).
changed: [localhost] => (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': '429858788464.21737', 'results_file': '/Users/maheckathorn/.ansible_async/429858788464.21737', 'changed': True, 'item': {'command': '', 'env': '', 'image': 'geerlingguy/docker-centos7-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:ro']}, 'ansible_loop_var': 'item'})

With our test environment now in place, Molecule moves on to running the converge step of the process, which conveniently runs the playbook named converge.yml that we defined earlier. This playbook runs our role. As of right now, our role does nothing, as this output shows:

INFO     Running default > converge
 
PLAY [Converge] ****************************************************************
 
TASK [Gathering Facts] *********************************************************
ok: [instance]
 
PLAY RECAP *********************************************************************
instance                   : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

What this does show, however, is that Molecule was able to successfully connect to our test environment. Immediately after a successful converge step, Molecule automatically tests our role for idempotence. This consists of rerunning our converge.yml playbook and making sure nothing was changed.
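The idea behind the idempotence check can be sketched by inspecting the PLAY RECAP of the second run: if any host reports a nonzero changed count, the role is not idempotent. This is only an illustration that assumes the recap format shown in this post. It is not how Molecule itself implements the check, and the function names are hypothetical.

```python
import re


def changed_counts(recap_text):
    """Return {host: changed_count} parsed from PLAY RECAP lines."""
    pattern = re.compile(
        r"^(\S+)[ \t]*:[ \t].*\bchanged=(\d+)", re.MULTILINE
    )
    return {host: int(count) for host, count in pattern.findall(recap_text)}


def is_idempotent(recap_text):
    """True when hosts were reported and none of them changed anything."""
    counts = changed_counts(recap_text)
    return bool(counts) and all(count == 0 for count in counts.values())


# Recap line from the second converge run shown in the output above.
second_run = (
    "instance                   : ok=1    changed=0    unreachable=0    "
    "failed=0    skipped=0    rescued=0    ignored=0"
)

print(is_idempotent(second_run))
```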

If we had defined Testinfra tests, the verify step would have run them:

INFO     Running default > verify
WARNING  Skipping, no tests found.

An example of some simple Testinfra code can be seen in the Ansible SiLK GitHub repository:

import os
 
import testinfra.utils.ansible_runner
 
testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
    os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')
 
 
def test_silk_version(host):
    version = "3.19.2"
    command = """/usr/local/bin/silk_config --silk-version"""
 
    cmd = host.run(command)
 
    assert version in cmd.stdout

In this case, Molecule would run this Python code on the Docker host, and Testinfra would handle connecting to the test environment, running the tests, and providing output back to Molecule. The test we are running simply checks whether the version of SiLK installed in the testing environment is 3.19.2. If our tests pass, Molecule considers the verify step a success and moves on. The last thing Molecule does during a test run is clean up after itself by rerunning the destroy step.

Automating Confidence

As highlighted by our example above, Molecule is capable of streamlining your Ansible role development process. It stands up and tears down configurable test environments quickly and easily and handles idempotence and verification testing. This blog posting only scratches the surface of what Molecule is capable of, however.

For example, handling a clustered test environment is as simple as adding another named instance to the platforms section of the molecule.yml file. Testing against different operating systems and different OS versions is also a simple command tweak away. Preparations that, for one reason or another, need to occur on the test nodes outside the role can be added through the prepare step (by adding a prepare.yml playbook). It is also easy to add role dependencies by creating a requirements.yml file in the Molecule directory.

Finally, the whole process is simple to move to a CI/CD system. The ci.yml in the Ansible SiLK repository shows how to do this with GitHub Actions, but the process is portable enough to be recreated easily on your CI/CD platform of choice. If you are not using Molecule for developing your Ansible roles, you may be severely slowing down your development cadence and reducing the quality of your Ansible code. Writing Ansible roles with the help of Molecule gives you good grounds for confidence that your role does what you want it to, leading to better quality code and reduced user frustration.

Additional Resources

Learn more about the SEI’s work in DevSecOps - https://www.sei.cmu.edu/our-work/devsecops/

Read the SEI blog post A Framework for DevSecOps Evolution and Achieving Continuous-Integration/Continuous-Delivery (CI/CD) Capabilities - https://insights.sei.cmu.edu/blog/a-framework-for-devsecops-evolution-and-achieving-continuous-integrationcontinuous-delivery-cicd-capabilities/
