
From Vagrant to Victory

Tim Palko

A few years ago, my team took on the task of designing and writing a new (and fairly large) web application that required us to collaborate on features, deploy to unfamiliar environments, and work with other teams to complete those deployments. Does this sound like DevOps yet? Our task was to make these deployments happen with limited resources; however, we didn't want to sacrifice environment parity or automation, knowing that both would lower risk for the project and give us a better handle on the security of our process. We considered Chef, a leading suite of platform-independent infrastructure management and provisioning tools; however, the work required to implement the full Chef ecosystem was not in scope for the project. We realized, though, that we were already using Vagrant to provision our local environments in the same way we wanted to provision our remote environments. That is, our Vagrant-based workflow was already applying Infrastructure as Code concepts, provisioning with just a single component of Chef rather than depending on the larger Chef suite. To solve our remote deployment problem, we built a solution that maintained environment parity by sharing the existing Chef configuration between Vagrant and our remote servers, keeping the deployment of any size system to a single, automatable command. In this blog post, I describe the transformation of a vanilla Vagrant setup into a stable and testable Infrastructure as Code solution.

If you aren't familiar with Vagrant, check out my blog post here.

If you are already using Vagrant, and you need to take your application to deployment, there is a short hop from provisioning your local virtual machine to provisioning multiple remote servers. That is, with a single step as easy as "vagrant up," your entire test or production environment can be realized, from scratch.

The first incarnation of our deployment actually used Fabric, which let us remotely manage our test and production servers with Python. What we found, though, was that writing idempotent Python code to provision a server was messy: hard to read and hard to maintain. As the project grew to require seven unique application servers, we found we were doing a lot of work hand-rolling something that was available out of the box with Chef. After all, our Vagrant provisioning was already using Chef cookbooks and its own crafted Chef JSON, and calling chef-solo on a virtual machine in VirtualBox. Why not leverage these same tools and configurations to provision our test and production application servers as well?

First, let's outline the basic things that Chef does, and see how Vagrant uses them. We'll want to copy the procedure. Let's start with the actual line of code we want: the execution of the chef-solo binary:

chef-solo -c [Chef configuration] -j [JSON attributes] -e [environment name]

Now, let's compare this to a standard Vagrant provision block that uses Chef:

config.vm.provision :chef_solo do |chef|
  chef.log_level = "warn"
  chef.add_recipe "postgresql::server"
  chef.add_recipe "postgresql::client"
  chef.add_recipe "nginx"
  chef.json.merge!({
    # ... some overriding attributes
  })
end

Basically, Vagrant takes information from the provision block, reconstructs the chef-solo call, and executes it on your local virtual machine. To provide the -c [Chef configuration], Vagrant pulls items such as log_level and other parameters that may be set within the provision block. To provide the -j [JSON attributes], Vagrant uses whatever you pass to chef.json.merge!(). To provide the -e [environment name], Vagrant uses its own default value, which can be set with chef.environment, but really has no bearing on the local provisioning, because Vagrant assumes that this configuration is just for your local, development virtual machine. In our example, we will set our own environment.
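Conceptually, Vagrant's work here can be sketched in a few lines: serialize the merged attributes to a JSON file, then assemble the chef-solo invocation. The sketch below is an illustrative model of that behavior, not Vagrant's actual internals; the function name and the dna.json default are hypothetical.

```python
import json

def build_chef_solo_command(config_path, attributes, environment,
                            json_path="dna.json"):
    """Illustrative model of what Vagrant does at provision time: dump the
    merged attributes to a JSON file and assemble the chef-solo command line.
    (Names and paths are hypothetical, not Vagrant's internals.)"""
    with open(json_path, "w") as f:
        json.dump(attributes, f)
    return "chef-solo -c {0} -j {1} -e {2}".format(
        config_path, json_path, environment)
```

The same three inputs appear again later when we call chef-solo on the remote servers ourselves.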

We want to do what Vagrant is doing, except apply it to remote servers. To do this, we will need all of the information required by chef-solo:

  • a Chef configuration file written in Ruby
  • JSON that describes the configuration we want to have on the remote server, and
  • an environment name, along with some small configuration for it

Now, we will take each in turn.

The Chef Configuration

Chef can be configured in a few lines of Ruby code. Here is a quick example:

File: /solo.rb

file_backup_path "/var/chef/backup"

root = File.absolute_path(File.dirname(__FILE__))
file_cache_path root

role_path        root + "/roles"
environment_path root + "/environments"
cookbook_path    root + "/cookbooks"

log_level :info
verbose_logging true

All of the values set here are things that could be set in the provision block of a Vagrantfile. Therefore, anything that is set in the Vagrantfile, aside from the recipes and the JSON, should go into your solo.rb file. OK, on to the server configuration.

Chef Roles, Nodes, and Environments

First, if you are new to Chef, I would recommend reading up on their constructs of nodes and roles. Basically, a role is a responsibility, and a node is a machine you want to task with one or more responsibilities. In a Vagrant-based development environment, there is typically one node, the virtual machine, and each responsibility you ask it to perform, say to be a web server or a database server, is a role.

The concept behind reusing the Vagrant configuration for remote provisioning is to pull it from the Vagrantfile, which is decidedly Vagrant's domain, and place it into a common workspace that can be used by both local and remote provisioning. You are not going to be running Vagrant remotely, so the Vagrantfile cannot be the source of configuration on remote servers. Therefore, any configuration you have in the Vagrantfile that you want to apply to your test or production environment has to be removed.

Let's continue with the example provision block above.

config.vm.provision :chef_solo do |chef|
  chef.log_level = "warn"
  chef.add_recipe "postgresql::server"
  chef.add_recipe "postgresql::client"
  chef.add_recipe "nginx"
  chef.json.merge!({
    :postgresql => {
      :password => {
        :postgres => "idontlikerandompasswords"
      },
      :pg_hba => [{
        :type => 'host',
        :db => 'mydb',
        :user => 'mydbuser',
        :addr => '127.0.0.1/32',
        :method => 'md5'
      }]
    }
  })
end

Here, we are declaring individual recipes that we want to run on our local virtual machine and override attributes we want to use in place of default attributes in those recipes. Our first step will be to define one or more roles (Chef's term) to describe our remote application servers. Looking at this example, we can identify two roles, web and database, based on the fact that we are installing a database server (PostgreSQL) and a web server (Nginx). (Your case may be different, and you can be as fine-grained as you like.) Now, we will define those roles, assigning the appropriate recipes and override attributes to each.

Web role:

Recipes: nginx, postgresql::client
Default attributes: (none)

Database role:

Recipes: postgresql::server, postgresql::client
Default attributes: everything listed under 'postgresql'

Now, we need to take these role definitions off the back of the napkin and turn them into something that Chef can use. Then, we can reference them in the Vagrantfile, replacing the chef.add_recipe and chef.json calls.

File: /roles/web.json

{
  "name": "web",
  "default_attributes": {},
  "json_class": "Chef::Role",
  "description": "Web application",
  "chef_type": "role",
  "run_list": [
    "recipe[postgresql::client]",
    "recipe[nginx::default]"
  ]
}

File: /roles/webdb.json

{
  "name": "webdb",
  "default_attributes": {
    "postgresql": {
      "password": {
        "postgres": "idontlikerandompasswords"
      },
      "pg_hba": [{
        "type": "host",
        "db": "mydb",
        "user": "mydbuser",
        "addr": "127.0.0.1/32",
        "method": "md5"
      }]
    }
  },
  "json_class": "Chef::Role",
  "description": "Web Application Database",
  "chef_type": "role",
  "run_list": [
    "recipe[postgresql::client]",
    "recipe[postgresql::server]"
  ]
}

In the Vagrantfile, our provision block becomes:

config.vm.provision :chef_solo do |chef|
    chef.log_level = "warn"
    chef.roles_path = "./roles"
    chef.add_role "webdb"
    chef.add_role "web"
end

Basically, we moved the server configuration off to a non-Vagrant space and pointed Vagrant to that configuration. As a bonus, the Vagrantfile is much cleaner. If you have this set up, go ahead and test it now.

Our next step in provisioning a remote environment is to define nodes (again, Chef's term) that represent virtual machines in your remote environment. You may be asking yourself,

Wait, we just created these roles with all the information we had in the Vagrantfile. Isn't that enough?

It would be, except that Chef allows a many-to-many relationship between roles and the servers to which they are applied, so that roles can be reused across servers and one server can have many roles. To support this, your servers also need to be abstracted (into nodes) so that we can express this relationship in our configuration. In our case, which nodes we need to define is environment-dependent. That is, in our local development environment, we have one node: the virtual machine in VirtualBox. In our test and production environments, we have two: a web server and a database server. Even then, the two nodes in test are different machines than the two nodes in production. If we had multiple web servers behind a load balancer in production, we would have that many 'web server' nodes to model the infrastructure. For now, let's go with the single web server model.

File: /nodes/web_test.json

{
  "hostname": "web_test",
  "run_list": [
    "role[web]"
  ]
}

File: /nodes/webdb_test.json

{
  "hostname": "webdb_test",
  "run_list": [
    "role[webdb]"
  ]
}

File: /nodes/web.json

{
  "hostname": "web",
  "run_list": [
    "role[web]"
  ]
}

File: /nodes/webdb.json

{
  "hostname": "webdb",
  "run_list": [
    "role[webdb]"
  ]
}

* Note that the hostname value matches the filename. Chef expects this match.

This is a very simplistic example, but you can see that if we needed to add functionality to either or both servers, it would be a matter of creating a role with the appropriate recipes to encapsulate that functionality, and adding that role to the appropriate node or nodes (and also to your Vagrantfile for your development "node"). It is also trivial to move a role, with all of its recipes and configurations, from one node to another.
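To make the role/node relationship concrete, here is a small sketch of how a node's run_list expands into the recipes chef-solo will actually execute. The resolver function is hypothetical and purely illustrative; Chef performs this expansion internally.

```python
def expand_run_list(node_run_list, roles):
    """Expand a node's run_list by replacing each role[...] entry with the
    recipes from that role's own run_list. `roles` maps a role name to its
    definition dict, shaped like the /roles/*.json files above."""
    recipes = []
    for item in node_run_list:
        if item.startswith("role[") and item.endswith("]"):
            role = roles[item[len("role["):-1]]
            recipes.extend(expand_run_list(role["run_list"], roles))
        else:
            recipes.append(item)
    return recipes
```

With the web role defined above, expanding a run_list of role[web] yields the postgresql::client and nginx::default recipes; moving a role between nodes is just an edit to one run_list.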

Since we are talking about deploying to different remote environments, i.e., test and production, we should now explore how Chef handles the modeling of environments. You might guess that if we put role configurations in a roles folder, and node configurations in a nodes folder, we would put environment configurations in an environments folder, and you would be correct.

Here is a simple environment configuration:

File: /environments/test.rb

name "test"
description "Test environment"
default_attributes()
override_attributes()

The default_attributes and override_attributes calls are where we place attribute values for recipes executed in these environments. Notice that we already have attributes in our role JSON files, labeled default_attributes; these form the base configuration. When you provision with a given environment, the attributes from that environment file are layered in as well, and its override_attributes take precedence over anything set at the default level. In this way, you can manage all of your local development attribute values in the role JSON files and override them as necessary for each remote environment.
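That layering can be modeled with a small deep-merge sketch. This is a deliberately simplified picture (hypothetical helpers; Chef's real precedence table has more levels), showing only that environment override_attributes win over a role's default_attributes:

```python
def deep_merge(base, overlay):
    """Recursively merge overlay into a copy of base; overlay wins conflicts."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

def effective_attributes(role_defaults, env_overrides):
    """Simplified model of the layering: role default_attributes form the
    base, and the environment's override_attributes are laid on top."""
    return deep_merge(role_defaults, env_overrides)
```

For example, a production environment could override just the postgres password while inheriting the rest of the role's postgresql attributes untouched.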

We have modeled our system with roles and nodes. We have created a space for environment-specific values. How does this translate back to the chef-solo command we need to call?

Test environment:

chef-solo -c solo.rb -j nodes/web_test.json -e test
chef-solo -c solo.rb -j nodes/webdb_test.json -e test

Production environment:

chef-solo -c solo.rb -j nodes/web.json -e production
chef-solo -c solo.rb -j nodes/webdb.json -e production

Of course, for this to work, these commands must be executed on the remote machines, and the codebase must be available there, along with the cookbooks, roles, and node configurations. This is where Fabric steps back into the spotlight, because it allows us to execute commands on remote servers. Before we write our Fabric scripts, note that Chef must be installed on those remote servers. Vagrant takes care of this installation for you on your local virtual machine, so we must emulate this behavior for our remote machines. So, our Fabric code should:

  1. package our code and configuration into an archive
  2. copy that archive to the target machine
  3. expand the archive on the target machine
  4. ensure Chef is installed, or install it
  5. execute chef-solo
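Sketched as plain shell commands, the five steps look like the following; a Fabric script would run them through its local/put/run operations. The host, archive name, and remote directory are illustrative assumptions, not values from the project:

```python
def deploy_commands(host, node, environment,
                    archive="chef-repo.tar.gz", remote_dir="/tmp/chef-repo"):
    """Build the shell command sequence for provisioning one remote node.
    All names and paths here are illustrative assumptions."""
    return [
        # 1. package our code and configuration into an archive
        "tar czf {0} cookbooks roles nodes environments solo.rb".format(archive),
        # 2. copy that archive to the target machine
        "scp {0} {1}:{0}".format(archive, host),
        # 3. expand the archive on the target machine
        "ssh {0} 'mkdir -p {1} && tar xzf {2} -C {1}'".format(host, remote_dir, archive),
        # 4. ensure Chef is installed, or install it
        "ssh {0} 'which chef-solo || curl -L https://www.chef.io/chef/install.sh | sudo bash'".format(host),
        # 5. execute chef-solo for this node and environment
        "ssh {0} 'cd {1} && sudo chef-solo -c solo.rb -j nodes/{2}.json -e {3}'".format(
            host, remote_dir, node, environment),
    ]
```

Running these five commands for each node in an environment is the entire deployment.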

The first three steps translate into shell scripting in a straightforward manner: tar, scp, and tar again. The last one is the call to the chef-solo executable. The Chef installation in step 4 can be done with the following command, as listed in the Chef docs:

curl -L https://www.chef.io/chef/install.sh | sudo bash

Writing the Fabric scripts is outside the scope of this post, but since this is just implementing the procedure we designed in this example, it should be straightforward. What is an adventure without a little mystery? Good luck!

Caveat: the example I provided in my blog post on Vagrant, the one we worked from today, is not sufficient to configure your application and serve it. For that, Nginx still needs to be configured with details about your specific application. If it is a Python app, a WSGI server that Nginx can proxy to must also be running.

I welcome your feedback and questions in the comments section below.

Additional Resources

View the webinar DevOps Panel Discussion featuring Kevin Fall, Hasan Yasar, and Joseph D. Yankel.

View the webinar Culture Shock: Unlocking DevOps with Collaboration and Communication with Aaron Volkmann and Todd Waits.

View the webinar What DevOps is Not! with Hasan Yasar and C. Aaron Cois.

Listen to the podcast DevOps: Transform Development and Operations for Fast, Secure Deployments featuring Gene Kim and Julia Allen.

Read all of the blog posts in our DevOps series.
