Using Cucumber For Testing Operations Tickets

We system administrators handle many tasks each day for our customers. These tasks usually come in as tickets, asking us to build a new system or to change an existing system. The "standard" workflow for this is: Receive ticket, do ticket, lob ticket back over to customer asking them to test. While we're good at our jobs, we're not infallible. Every so often, the customer lobs the ticket right back saying "It doesn't work!" We then spend more time readdressing the ticket and usually find that it doesn't work because we omitted a step in our haste. Chagrined, we fix our work, test it, and then tell our customer that it should work now.

I would like to show you a better way. I've written before about how I've been using Cucumber for testing tickets. Today, I'd like to show you how you can too.

Intended audience

This is primarily aimed at system administrators but should benefit anyone who has operations duties. You should be familiar with programming or scripting. Specific familiarity with Ruby will be helpful but is not required.

Installing Cucumber

Cucumber is distributed as a Ruby gem. This means you'll need Ruby. On some platforms, you can install a package for Cucumber that will also install its prerequisites (for example, rubygem-cucumber on Fedora or cucumber on Debian). On others, you will need to install Ruby and rubygems support (on RHEL/CentOS, this would be the ruby and rubygems packages) and then install Cucumber and its prerequisites with this command: gem install cucumber

Getting started

To start work with Cucumber, you will need a directory structure like this:

features
|- steps
`- support

  • Tests for tickets go in the features directory. These files have the .feature extension. Each logical piece of work (e.g. a ticket, service, or system) will have its own file. (Cucumber has its origins in software testing so it calls these "features," corresponding to features in software being built.)

  • Step definitions (discussed below) go in the steps directory. These files have the .rb extension. Step definitions should be grouped logically in files so you can easily find them later.

  • Support files, e.g. functions or classes that are used in the step definitions, go in the support directory. These files also have the .rb extension.

To tell Cucumber you're using Ruby, it's customary to create a file named env.rb in the support directory. (You can use other languages, e.g. Python, with Cucumber. I will provide an example of this in the future.)

Anatomy of a Cucumber feature

Here is an example ticket written as a Cucumber feature:

Feature: New A record for QUUX.EXAMPLE.COM (20130708001)
  Joe from Marketing wants us to add a host record
  for quux.example.com so they can interact with a mockup
  for a new website.
 
  Scenario: A record for QUUX.EXAMPLE.COM
    Given my DNS servers are:
      |172.22.0.23|
      |172.22.0.24|
      |172.22.0.25|
    When I query my DNS servers for the A record for "quux.example.com"
    Then I should see that the A record response is "192.168.205.203"

Each feature has a title that should reflect the purpose of the ticket. I like to include the ticket number in the title so I can easily refer back to the ticket. (I also include the date in the filename so I can easily find the feature or figure out when I did the work.)

Each feature may also have a description. The description should reflect the reason behind what you're testing for. It is optional and can be omitted. I tend to leave this out but you may find it useful if you share your features with your co-workers.

Each feature has one or more scenarios. These are the actual tests for the feature. How many scenarios you use will depend on personal judgment and the exact request you're handling. I tend to use a scenario for each discrete element of the ticket. For example, for an email account, I would have one scenario for logging in and one for any specific properties for the account.

Each scenario is composed of steps. A step is a discrete element of the text. A step usually starts with one of three keywords, Given, When, and Then. These keywords roughly equate to separate stages in the test:

  • Given steps are for items that are already in place that impact the test. For example, if your test relies on specific pieces of infrastructure or on a specific state in your environment, these would be Givens.
  • When steps represent actions you would take to verify the ticket. These might include running commands on a system or going to a web page.
  • Then steps represent the outcome of the actions taken in the When steps.

(Whens and Thens are generally written as first person sentences to reflect that these are actions you would take or outcomes you would see.)

Step definitions

Once you've specified the feature, its scenarios, and its scenarios' steps, you then need to tell Cucumber how to execute the steps. You do this via step definitions. As noted above, step definitions are stored in Ruby files in the steps subdirectory.

A step definition will look like this:

Given(/^a step definition$/) do
  # code goes here
end

A step definition starts with one of the three keywords, Given, When, or Then. This is followed by a regular expression that matches the step for this definition. Following this is the block of code to run for the step.

You can use grouping in the regular expression to capture any data embedded in the step. Any captured groups are then passed along to the block. For example, given this step definition:

When(/^I query my DNS servers for the A record for "(.*?)"$/) do |host|
  # code goes here
end

If a scenario has the step "When I query my DNS servers for the A record for "quux.example.com"", it would execute using the above step definition with the host variable set to "quux.example.com".

Walkthrough

Let's see this in action. To follow along locally, you'll need to have Vagrant installed. To get started, clone this git repo and follow the instructions in the README.md file.

The environment consists of four systems: a system to run Cucumber on (tester), a master DNS server (dns_master), and two slave DNS servers (dns_slave1 and dns_slave2). The two slave DNS servers transfer zones from the master DNS server.

First, we'll need to set up the directories. Connect to tester using SSH (vagrant ssh tester). Once there, create the features directory and then the steps and support directories underneath it.

In the support directory, create a blank file named env.rb. This tells Cucumber that we will be using Ruby for step definitions.

Now we're ready to work through a scenario. Let's say we receive the ticket below:

From: Joe in Marketing
Subject: Need a new hostname
 
Our web design firm has created a mockup of our new website.  We'd like
to test the mockup with our marketing tools.  Would you please set up
QUUX.EXAMPLE.COM to point to our server at 192.168.205.203?

From this, we need to figure out an appropriate test for it. We want to make sure that quux.example.com resolves to 192.168.205.203. Fortunately, we already have a test for this: the example feature from above. Let's use that.

In the features directory, create a file named 20130708_example.feature. Copy the example feature from above and paste it into that file.

Our next goal is to get the test to fail. We need to prove that the test fails without the change in place. Otherwise, how do we know that our work actually causes the test to pass?

At the shell prompt, run this command while in the features directory:

cucumber ../features/20130708_example.feature

You should see output like:

[vagrant@tester features]$ cucumber ../features/20130708_example.feature
Feature: New A record for QUUX.EXAMPLE.COM (20130708001)
  Joe from Marketing wants us to add a host record
  for quux.example.com so they can interact with a mockup
  for a new website.
 
  Scenario: A record for QUUX.EXAMPLE.COM                               # ../features/20130708_example.feature:6
    Given my DNS servers are:                                           # ../features/20130708_example.feature:7
      | 172.22.0.23 |
      | 172.22.0.24 |
      | 172.22.0.25 |
    When I query my DNS servers for the A record for "quux.example.com" # ../features/20130708_example.feature:11
    Then I should see that the A record response is "192.168.205.203"   # ../features/20130708_example.feature:12
 
1 scenario (1 undefined)
3 steps (3 undefined)
0m0.002s
 
You can implement step definitions for undefined steps with these snippets:
 
Given(/^my DNS servers are:$/) do |table|
  # table is a Cucumber::Ast::Table
  pending # express the regexp above with the code you wish you had
end
 
When(/^I query my DNS servers for the A record for "(.*?)"$/) do |arg1|
  pending # express the regexp above with the code you wish you had
end
 
Then(/^I should see that the A record response is "(.*?)"$/) do |arg1|
  pending # express the regexp above with the code you wish you had
end

As you can see in the output above, Cucumber doesn't know how to run our test because there are no step definitions. We need to fix this before we can get the test to fail.

Cucumber helpfully provides skeletal step definitions. Let's copy these and then paste them into a new file in the steps directory named dns_steps.rb.

Let's rerun the Cucumber command. You should see output like:

[vagrant@tester features]$ cucumber ../features/20130708_example.feature
Feature: New A record for QUUX.EXAMPLE.COM (20130708001)
  Joe from Marketing wants us to add a host record
  for quux.example.com so they can interact with a mockup
  for a new website.
 
  Scenario: A record for QUUX.EXAMPLE.COM                               # features/20130708_example.feature:6
    Given my DNS servers are:                                           # features/steps/dns_steps.rb:1
      | 172.22.0.23 |
      | 172.22.0.24 |
      | 172.22.0.25 |
      TODO (Cucumber::Pending)
      ./features/steps/dns_steps.rb:3:in `/^my DNS servers are:$/'
      features/20130708_example.feature:7:in `Given my DNS servers are:'
    When I query my DNS servers for the A record for "quux.example.com" # features/steps/dns_steps.rb:6
    Then I should see that the A record response is "192.168.205.203"   # features/steps/dns_steps.rb:10
 
1 scenario (1 pending)
3 steps (2 skipped, 1 pending)
0m0.004s

We see that Cucumber now knows how to run the steps and that the first one is in the state "pending." In order to get this step to pass, we'll need to define it. Let's go ahead and do that.

Recall that our first step is:

Given my DNS servers are:
  |172.22.0.23|
  |172.22.0.24|
  |172.22.0.25|

We use this step to define the DNS servers we want to deal with.

Open dns_steps.rb. Find the skeleton for the first step:

Given(/^my DNS servers are:$/) do |table|
  # table is a Cucumber::Ast::Table
  pending # express the regexp above with the code you wish you had
end

Change that to:

Given(/^my DNS servers are:$/) do |servers|
  @nameservers = servers.raw.map {|row| row.first}
end

The new code for the step definition takes the IPs in the table and stores them as an array in the class variable @nameservers. Since class variables are shared between steps in a scenario, this lets us use the IPs in later steps.

(I haven't discussed tables since they're a more advanced topic. I will talk about them in a future post. The curious can look at Cucumber's documentation.)

Save the file and rerun the feature. You should see output like:

[vagrant@tester features]$ cucumber ../features/20130708_example.feature
Feature: New A record for QUUX.EXAMPLE.COM (20130708001)
  Joe from Marketing wants us to add a host record
  for quux.example.com so they can interact with a mockup
  for a new website.
 
  Scenario: A record for QUUX.EXAMPLE.COM                               # ../features/20130708_example.feature:6
    Given my DNS servers are:                                           # steps/dns_steps.rb:1
      | 172.22.0.23 |
      | 172.22.0.24 |
      | 172.22.0.25 |
    When I query my DNS servers for the A record for "quux.example.com" # steps/dns_steps.rb:5
      TODO (Cucumber::Pending)
      ./steps/dns_steps.rb:6:in `/^I query my DNS servers for the A record for "(.*?)"$/'
      ../features/20130708_example.feature:11:in `When I query my DNS servers for the A record for "quux.example.com"'
    Then I should see that the A record response is "192.168.205.203"   # steps/dns_steps.rb:9
 
1 scenario (1 pending)
3 steps (1 skipped, 1 pending, 1 passed)
0m0.003s

Our first step passes! Now Cucumber is saying that the second step is pending. Let's define that step now.

Recall that our second step is:

When I query my DNS servers for the A record for "quux.example.com"

Reopen dns_steps.rb. Find the skeleton for the second step:

When(/^I query my DNS servers for the A record for "(.*?)"$/) do |arg1|
  pending # express the regexp above with the code you wish you had
end

Change that to:

When(/^I query my DNS servers for the A record for "(.*?)"$/) do |host|
  responses = {}
  @nameservers.each do |server|
    resolver = Net::DNS::Resolver.new(:nameservers => server, :recursive => false, :udp_timeout => 15)
    responses[server] = resolver.query(host, Net::DNS::A)
  end
 
  @nameservers.each do |server|
    next if @nameservers.first == server
 
    responses[@nameservers.first].answer.map {|r| r.to_s}.sort.should \
      eq(responses[server].answer.map {|r| r.to_s}.sort),
      "DNS responses from #{@nameservers.first} and #{server} don't match!"
  end
 
  @response = responses[@nameservers.first]
  @host     = host
end

This code does four things:

  1. It queries each of the DNS servers specified in @nameservers for the A record(s) for the name stored in the host variable (for this example, "quux.example.com").
  2. It then verifies that the response from all of the DNS servers is identical. If it's not, it fails the step with the reason "DNS servers don't match."
  3. It then stores the response of the first DNS server in the @response class variable.
  4. It stores the host variable in the @host class variable so we can use it in error messages in later steps.

Let's look closer at the second block. Do you see this code in the middle?

ns1.should eql(ns2)

This is an example of an RSpec expectation. We use expectations to verify conditions that should be true (the should expectation) or that should not be true (the should_not expectation). If the expectation is not correct (e.g. the condition is false but it's expected to be true), it will cause the step to fail.

(RSpec is another BDD testing framework. For now, all you need to know is that Cucumber uses RSpec's expectations and matchers. This is worth knowing if you need to do something complex but isn't as important for most tasks. Thoughtbot has a PDF of common RSpec matchers.)

Why do we test validity in a When step? Here, we're conducting a sanity check on the DNS server responses. If the responses aren't the same, there's an issue with the DNS servers and any later tests we conduct will be invalid. Since we want to get feedback as soon as possible, we want to fail as soon as possible and so we do a sanity check here rather than in a later step.

Add the following to the top of dns_steps.rb:

require 'rubygems'
require 'net/dns'

Save the file and rerun the Cucumber command. You should see output like:

[vagrant@tester features]$ cucumber ../features/20130708_example.feature
Feature: New A record for QUUX.EXAMPLE.COM (20130708001)
  Joe from Marketing wants us to add a host record
  for quux.example.com so they can interact with a mockup
  for a new website.
 
  Scenario: A record for QUUX.EXAMPLE.COM                               # ../features/20130708_example.feature:6
    Given my DNS servers are:                                           # steps/dns_steps.rb:4
      | 172.22.0.23 |
      | 172.22.0.24 |
      | 172.22.0.25 |
    When I query my DNS servers for the A record for "quux.example.com" # steps/dns_steps.rb:8
    Then I should see that the A record response is "192.168.205.203"   # steps/dns_steps.rb:26
      TODO (Cucumber::Pending)
      ./steps/dns_steps.rb:27:in `/^I should see that the A record response is "(.*?)"$/'
      ../features/20130708_example.feature:12:in `Then I should see that the A record response is "192.168.205.203"'
 
1 scenario (1 pending)
3 steps (1 pending, 2 passed)
0m2.703s

Now the second step passes! On to the third step. Recall that our third step is:

Then I should see that the A record response is "192.168.205.203"

Reopen dns_steps.rb. Find the skeleton for the third step:

Then(/^I should see that the A record response is "(.*?)"$/) do |arg1|
  pending # express the regexp above with the code you wish you had
end

Change that to:

Then(/^I should see that the A record response is "(.*?)"$/) do |ip|
  @response.should_not be_nil, "There is no DNS response.  Did you run a query?"
  @response.answer.should_not be_empty, "There are no records in the response."
  @response.answer.length.should eq(1),
    "There should only be one record.  Instead, there are #{@response.answer.length} records."
  @response.answer.first.address.should eq(ip),
    "The current A record is #{@response.answer.first.address}, not #{ip}."
end

This code performs the following checks:

  1. It verifies that @response is defined. If it's not, something undesirable has happened (for example, there is a bug in a When step or an appropriate When step had not been used) and any of our tests will fail.
  2. It verifies that @response's field answer is not empty. The resolve method used in the previous step sets answer to an array of resource records. If the array is empty, there were no records and we should fail the step now since there isn't an A record pointing to the IP.
  3. It verifies that @response's field answer only has one element. If there are multiple records, there will be multiple elements. Since we only want a single record, this should also fail.
  4. It verifies that the first (and only) element in @response's field answer is an A record pointing to the desired IP.

Note that each different condition has its own failure message. This helps us determine exactly where the step failed.

Save the file and rerun the Cucumber command. You should see output like:

[vagrant@tester features]$ cucumber ../features/20130708_example.feature
Feature: New A record for QUUX.EXAMPLE.COM (20130708001)
  Joe from Marketing wants us to add a host record
  for quux.example.com so they can interact with a mockup
  for a new website.
 
  Scenario: A record for QUUX.EXAMPLE.COM                               # ../features/20130708_example.feature:6
    Given my DNS servers are:                                           # steps/dns_steps.rb:4
      | 172.22.0.23 |
      | 172.22.0.24 |
      | 172.22.0.25 |
    When I query my DNS servers for the A record for "quux.example.com" # steps/dns_steps.rb:8
    Then I should see that the A record response is "192.168.205.203"   # steps/dns_steps.rb:27
      There are no records in the response. (RSpec::Expectations::ExpectationNotMetError)
      ./steps/dns_steps.rb:29:in `/^I should see that the A record response is "(.*?)"$/'
      ../features/20130708_example.feature:12:in `Then I should see that the A record response is "192.168.205.203"'
 
Failing Scenarios:
cucumber ../features/20130708_example.feature:6 # Scenario: A record for QUUX.EXAMPLE.COM
 
1 scenario (1 failed)
3 steps (1 failed, 2 passed)
0m0.388s

As you can see, our test has failed because the third step failed. That step failed because there are no A records for quux.example.com. Our feature has failed in the expected manner. Success! (Of sorts, anyway.) We can now implement the change needed for the ticket.

In another window, connect to dns_master using SSH (vagrant ssh dns_master). Go to /home/vagrant/named and edit the file example.com. Add the following A record at the end of the file:

quux     IN      A   192.168.205.203

This is important: Do not increment the serial number for now. Our zone file should now look like this (your serial number may differ):

$TTL 300
example.com.  IN  SOA   ns1.example.com. hostmaster.example.com. (
                      1375624788 ; Serial
                      3h ; Refresh
                      15 ; Retry
                      1w ; Expire
                      300 ; Minimum
                      )
              IN  A     192.168.205.199
              IN  NS    ns1.example.com.
              IN  NS    ns2.example.com.
              IN  TXT   "U3VuIEF1ZyAwNCAxMDoxOTowOSAtMDQwMCAyMDEz"
; Hosts
www           IN  CNAME example.com.
foo           IN  A     192.168.205.198
ns1           IN  A     172.22.0.24
ns2           IN  A     172.22.0.25
quux          IN  A     192.168.205.203

Save the file and close your editor. At the command prompt, run this command: sudo rndc reload

The change made to the zone file will now be loaded by the master DNS server. However, since we did not increment the serial number, the change will not be picked up by the slave DNS servers. (This is probably the most common error I make when updating zone files.)

Go back to the first window and rerun the Cucumber command. You should see output like:

[vagrant@tester features]$ cucumber ../features/20130708_example.feature
Feature: New A record for QUUX.EXAMPLE.COM (20130708001)
  Joe from Marketing wants us to add a host record
  for quux.example.com so they can interact with a mockup
  for a new website.
 
  Scenario: A record for QUUX.EXAMPLE.COM                               # ../features/20130708_example.feature:6
    Given my DNS servers are:                                           # steps/dns_steps.rb:4
      | 172.22.0.23 |
      | 172.22.0.24 |
      | 172.22.0.25 |
    When I query my DNS servers for the A record for "quux.example.com" # steps/dns_steps.rb:8
      DNS responses from 172.22.0.23 and 172.22.0.24 don't match! (RSpec::Expectations::ExpectationNotMetError)
      ./steps/dns_steps.rb:18
      ./steps/dns_steps.rb:15:in `each'
      ./steps/dns_steps.rb:15:in `/^I query my DNS servers for the A record for "(.*?)"$/'
      ../features/20130708_example.feature:11:in `When I query my DNS servers for the A record for "quux.example.com"'
    Then I should see that the A record response is "192.168.205.203"   # steps/dns_steps.rb:27
 
Failing Scenarios:
cucumber ../features/20130708_example.feature:6 # Scenario: A record for QUUX.EXAMPLE.COM
 
1 scenario (1 failed)
3 steps (1 failed, 1 skipped, 1 passed)
0m0.449s

As you can see, the feature has failed because the DNS server responses don't match.

Go back to the second window and edit the zone file again. This time, increment the serial number. The zone file should now look like this:

$TTL 300
example.com.  IN  SOA   ns1.example.com. hostmaster.example.com. (
                      1375624789 ; Serial
                      3h ; Refresh
                      15 ; Retry
                      1w ; Expire
                      300 ; Minimum
                      )
              IN  A     192.168.205.199
              IN  NS    ns1.example.com.
              IN  NS    ns2.example.com.
              IN  TXT   "U3VuIEF1ZyAwNCAxMDoxOTowOSAtMDQwMCAyMDEz"
; Hosts
www           IN  CNAME example.com.
foo           IN  A     192.168.205.198
ns1           IN  A     172.22.0.24
ns2           IN  A     172.22.0.25
quux          IN  A     192.168.205.203

Save the file and exit your editor. Run this command to reload the zone file: sudo rndc reload

Since we incremented the serial number this time, the changes to the zone file are pushed out to the slave DNS servers.

Go back to the first window and rerun the Cucumber command. You should see output like:

[vagrant@tester features]$ cucumber ../features/20130708_example.feature
Feature: New A record for QUUX.EXAMPLE.COM (20130708001)
  Joe from Marketing wants us to add a host record
  for quux.example.com so they can interact with a mockup
  for a new website.
 
  Scenario: A record for QUUX.EXAMPLE.COM                               # ../features/20130708_example.feature:6
    Given my DNS servers are:                                           # steps/dns_steps.rb:4
      | 172.22.0.23 |
      | 172.22.0.24 |
      | 172.22.0.25 |
    When I query my DNS servers for the A record for "quux.example.com" # steps/dns_steps.rb:8
    Then I should see that the A record response is "192.168.205.203"   # steps/dns_steps.rb:27
 
1 scenario (1 passed)
3 steps (3 passed)
0m0.395s

As you can see, the test passed. This means we're done! Now we can tell Joe that we've added the A record for him and we can move on to the next ticket.

Wrap-up

You should now have an idea of how to use Cucumber for use with tickets (or other tasks you have to do). If there are things about Cucumber that you need more information on, please see the Cucumber documentation. If something about this process confuses you, leave a comment and I'll try to help.

To learn more

There's a fair bit I haven't mentioned. I'll mention some of this in future blog posts.

If you want to see other examples, the features directory in the git repository for the demo environment contains the Cucumber features I used when building it and making sure it worked as expected.

If you're impatient and want to learn more now, you can read the Cucumber documentation and examples. The Pragmatic Bookshelf has published books on Cucumber, The Cucumber Book, and RSpec, The RSpec Book.

More on acceptance testing

A couple days ago, I wrote about writing tests for tickets. (Didn’t read that post? Go on and read it. All done? Good.) I figured I should say more about the theory behind the practice

Theory

The tests I write for tickets are based on the concept of “acceptance tests” from agile development. Acceptance tests are usually written by customers and developers together to determine when a given feature is done. A feature cannot be considered finished (and should not be released to the customer) until all of the acceptance tests for the feature pass.

Acceptance tests are written using the language of the business domain and are understandable by both the customers and the developers. They often follow the pattern of “Given/When/Then”, i.e. “Given something is true, When I do something, Then I should see something.” Implementation-specific language is not used in these tests.

My testing process is based on the practice of “acceptance test-driven development” (ATDD). Under ATDD, acceptance tests are written before any code is written. Once the test is verified to fail, the developer writes the code. (By verifying that the test failed first, you ensure that you know that the code you wrote caused the test to pass.) When the developer believes they are done or wants to check their work so far, they run the tests. Once, and only once, the tests pass, they an consider their work done.

Sysadmin reality

As a system administrator, I do not have the luxury to work out these tests directly with my customer. I have to rely on what they have written in their ticket (or said over the phone) to figure out what they want. Sometimes, I have to ask for clarification to make sure I understand their request well enough to write the tests.

Since I have to work out on my own what my customer wants, the tests I write are not guaranteed to be as effective for establishing when I am truly done. After all, my understanding of what they want may be incomplete and, therefore while my tests say I am done with the request, I’m not really done: I have not done everything my customer wanted.

One benefit of having to devise these tests myself is that I can use implementation-specific language. This lets me simplify the implementations of tests although maybe not the tests themselves. For example, I can write implementation-specific steps for getting the list of email accounts from mailservers (e.g. “When I get the list of accounts in Postfix”, “When I get the list of accounts in exchange”, etc.) without having to write conditional and abstraction logic within the test implementation.

To learn more

ATDD is covered in detail in ATDD By Example by Markus Gärtner. In addition to discussing the idea behind ATDD, he provides some coverage of two tools, Cucumber and FitNesse. The Cucumber Book by Matt Wynne and Aslak Hellesøy covers Cucumber in more detail.

The second part of The RSpec Book by David Chelimsky et al covers behavior-driven development (BDD) which is similar to and incorporates aspects of ATDD. (One of the foundations of BDD is “acceptance test-driven planning.”) I do not believe my testing methodology follows BDD since I am not always testing behavior. (See my comments about indirect tests in my last post.)

Finally, in chapter seven of The Clean Coder, Robert Martin discusses acceptance testing and the importance of establishing a correct definition of done.

Tickets and testing

Some time ago, I received a ticket from a customer to change an A record in a zone, a quick, easy thing to do. I made the change and told them that it was done. A few days later, I got a followup ticket saying “This doesn’t appear to be done.” I checked and, sure enough, the change didn’t appear when I queried the DNS servers. Apparently, in my haste, I had forgotten to do one simple but necessary step: increment the serial number in the zone file. Oops. Chagrined, I incremented the serial number, reloaded the zone file, verified that the change did indeed appear, and apologized profusely to the customer.

If only I tested the change before I decided I was done with the work, I could have found my error before my customer.

Mistakes like this don’t happen often, but they do happen. I know I have made many errors like this one throughout my time as a system administrator.

One Friday about a year ago, I handled a particularly hairy ticket. I spent the entire weekend half-expecting, dreading, I would get a phone call from the person on call telling me that I had screwed up. That call never came so I guess I did it right. On the following Monday, I decided to avoid weekends like that one and adopted a new policy: Test work I do before saying that I’m done. Before I do something, I would set up a test for the desired outcome, verify that the test fails, do the work, and then verify that the test passes.

An Example

To give you an idea of how this works, let’s do an example based on the scenario I mentioned earlier. Since I use Cucumber for my tests, I write a feature that looks like:

Feature: Change A record for foo.example.com
  Scenario:
    When I query for the A record for “foo.example.com”
    Then I should get a response with the A record “10.1.1.2”

I then run this feature and verify that it fails:
Cucumber says it failed!

I make the change on the DNS server and run the test again:
Cucumber says it failed again!

What? It failed? Oh, I didn’t increment the serial number again. Let’s fix that and:
Cucumber says it passed!

It passed! That means I’m done and can tell that to my customer, confident that an error hasn’t slipped through.

What I’ve learned

Having done this for about a year, I’ve learned several things.

  1. There is a cost to testing. If I already have step definitions set up for my test, it only adds a few minutes. If I have to write new step definitions, it takes a lot longer. (Some step definitions, particularly those that have to scrape websites, have taken hours to write.) Even if it only takes a few minutes to set up and run the test, that’s still a few minutes.
  2. Some things cannot or should not be tested directly. In these cases, indirect tests are needed. Instead of verifying the desired behavior, you might need to test the configuration that specifies the desired behavior. Care must be taken when using indirect tests since they are not as reliable as direct tests.

    For example, most tasks involving email delivery or routing should not be tested directly because emails sent for testing would be seen as unwanted noise by the customer or because you do not have the login credentials for the email account. If your customer wants to emails to forwarded to , you can test the mailserver configuration to verify that the forwarding is set up.

  3. Make sure that the test checks everything that is being done. (This is especially true for indirect tests.). If your test only checks part of the functionality, your test may pass but you’re not actually done. For example, if you’re supposed to set up a website that pulls data from a database, make sure you test for content that’s in the database. If you only test for the presence of a string from static content (say, from the template for the site), your test will pass even though you haven’t set up the database privileges correctly.

I believe in testing. Why should you?

Every error has costs. The obvious cost is time: It takes time to fix an error. This time may exceed the original time it took to do the work.

The more important cost is in customer respect and goodwill. Every error that your customers see takes away from the respect and goodwill you’ve built up. Sadly, it’s a lot easier to lose respect than to gain it. If your customers see enough errors from you that you deplete that respect, there will be consequences. Even if they don’t quit our services entirely, they are sure to tell their friends and colleagues that you’re incompetent, untrustworthy, and, perhaps most damning, unprofessional.

“But I don’t have time to test!” you might say. That’s precisely when you should be testing! If you’re doing everything at a breakneck pace, you’re sure to make mistakes. When you’re stressed and overloaded is precisely when you should slow down and do things carefully. Every error you catch now saves you from paying the cost of that error later.

How has this worked out over the past year?

Over the past year, I have tested most of the work I have done. Various reasons have prevented me from testing all of it.

I feel like my customer-visible error rate has decreased significantly. (I don’t have accurate statistics so I don’t know for sure.) Of all of the mistakes that were reported by my customers in the past year, only one has been for work I had tested and that was because the test was not complete. All other reported errors were for untested work. Said another way: My customer-visible error rate for properly tested work is 0%. I think that’s a pretty good track record.

Closing thoughts

Creating tests for checking your work will help you decrease your customer-visible error rate. They won't help you make less errors but they will help you prevent your customers from seeing them.

I recommend trying it out. Write a test for a particular work item, do it, make your test pass. Now do the same for your next work item. And then the one after that. And so on.

Calling myself on hypocrisy

"You know what you believe in by observing yourself in a crisis."
— Robert C. Martin, The Clean Coder

Almost three years ago, I wrote a rather presumptuous post about moving away from Perl. I specifically cited Perl's object system as a major reason for dropping it. chromatic called me out on that and for good reason. I also cited maintainability issues with my Perl scripts but those are a result of my own coding and not anything inherent in Perl. (Allowing me to write sloppy code does not mean the same thing as making me write sloppy code.)

While I talked about moving on to Python, I have made almost no progress on this. And when I feel stressed or pressed for time (which is just about always), I always fall back to Perl. It is simply what I know best.

I'm sure I could still correct this. I could immerse myself in Python (or Ruby) and learn either to the point of being able to dream in it. But I haven't done it in the past three years. What likelihood is there that I'll be able to do this in the next three years?

Through observation, and no matter how what rational or irrational reasons I could put forth otherwise, I can only conclude that I believe in Perl.

A thought on using RSpec for behavior-driven system administration

In lieu of a better term, behavior-driven system administration is the application of BDD principles to system administration, primarily building and maintaining systems.

The RSpec Book by David Chelimsky et al presents BDD as using the tools Cucumber and RSpec. Cucumber is used to describe and test the feature to be implemented and then RSpec is used to test the implementation while it is being built.

In Test-Driven Infrastructure with Chef, Stephen Nelson-Smith states that RSpec is not used in his example in the book because "there's no point in unit testing a declarative system." On one hand, I can agree with the statement. Somewhere, I read software developers saying that you should test your application but not worry about testing the platform.1

On the other hand, I disagree. I think RSpec can be useful when testing a declarative system but my rationale has less to do with testing than with auditing.2 I see Cucumber and RSpec tests as filling two different roles: Cucumber verifies the system's behavior. RSpec verifies the system's state.3 This also lets me use the Cucumber features to document the system's behavior and RSpec scenarios to document the system's state. (As pointed out in one of the BoF sessions at LISA 2011, Puppet manifests don't themselves work as documentation of a system.)

I am currently working on building a new system where I hope to test this. I have RSpec scripts written for making sure Puppet is set up and correctly configured but I haven't completed the Cucumber feature for Puppet to verify it's behavior so I can't show how this works in practice. When I have a full example, including a working Cucumber feature and Puppet config, I'll make another post and walk through it. (Or if it doesn't work out, I'll point that out too.)

  • 1. I don't like this amorphous "somewhere" but I can't remember or find where I read this. If I find it, I'll add a reference.
  • 2. Although, yes, I do appreciate knowing whether or not the change I have made to Puppet's manifests was the correct change to make. This may be something that goes away as I get more experience with Puppet.
  • 3. This may be a misuse of RSpec since it's also intended to verify behavior rather than state. I use RSpec since I'm more familiar with it than Test::Unit or other testing methods.

Pages

Subscribe to Ithiriel RSS