Cucumber – Jon Kruger

Agile, ATDD, Cucumber, Quality, TDD, unit testing

Does a whole team approach to testing change how developers should test?

Lately I’ve been thinking about a whole team approach to testing, where we decide as a team how features will be tested and where we use the skillsets of the whole team to automate testing. We do this on our project, and this has led to a regression testing suite of ~2500 SpecFlow acceptance tests that automate almost all scripted QA testing and regression testing for our application.

We didn’t always do this. Originally there was no automated acceptance testing, but developers were diligently writing unit tests. Those unit tests are still around, but we don’t write many unit tests anymore. We start with acceptance tests now, and the acceptance tests cover all of the testing scenarios that need to be covered. Our application has well-defined design patterns that we follow, so the idea of TDD driving the design of our code doesn’t really apply. If the unit tests fail, we often just delete them because it’s not worth fixing all of the mocks in the unit tests that are causing them to fail, and we have acceptance testing coverage around all of it.

This approach does not line up with the conventional wisdom on automated testing. They say that you’re supposed to write lots of unit tests that run really fast to give you fast feedback, help design your code, and ensure the internal quality of your code. In the past, this is how I’ve always done it. In fact, many of them dislike Cucumber.

Cucumber makes no sense to me unless you have clients reading the tests. Why would you build a test-specific parser for English?

— DHH (@dhh) March 29, 2011

While TDD isn’t as mainstream as I would like, TDD is nothing new. Kent Beck was writing about it 10 years ago, and the original XP guys valued such things as unit testing, the SOLID principles, and things like that.

Automated acceptance testing still feels like a relatively new phenomenon. I’m sure people were doing it 10 years ago, but back then we didn’t have Cucumber and SpecFlow and the Gherkin language. Now I see a lot more people using tools like that to automate QA testing in way that uses business language and more maintainable code, rather than the old “enterprise” solutions like QTP.

Here’s what I’m getting at – I wasn’t there 10 years ago when Kent Beck was writing his books and the XP movement was starting, but it seems to me to be primarily an effort by developers to ensure the quality of their code through the effort of developers. I see very little talk of where QA fits into that process. There is some mention of QA for sure, but the general gist seems to be that developers need to write tests in order to ensure quality, and the best way to do that is to write unit tests. QA people typically don’t say that unit testing is enough because it doesn’t test end-to-end, so then what do they do? Manually test? Use QTP?

My question is this – if we think of testing as whole-team activity and not just a QA activity or a developer activity, will we arrive at the same conclusions as we did before?

I’m not ready to discount unit testing as a valuable tool, and I’m also not ready to say that everyone should do it my way because it worked for us on one project. But we have largely abandoned unit testing in favor of acceptance testing, and other teams in our department are doing it too. I write unit tests for things like extension methods and some classes that have important behavior on their own and I want to ensure that those classes work independent of the larger system.

We have 3 Amigos meetings in which one of the things we do is develop a set of acceptance tests for a feature before any code is written. We usually decide at this point (before any code is written) that most or all of these scenarios will be automated. We write the acceptance tests in SpecFlow, I watch them all fail, and them I write the code to make them pass. I follow the patterns and framework that we have set up in our application, so there aren’t many design decisions to make. When my acceptance tests pass, I am done.

Where do unit tests fit in there? If my acceptance tests pass, then I’m done, so why spend more time writing duplicate tests? Also, with acceptance tests, I’m not dealing with mocks, and more importantly, I’m not fixing broken unit tests because of broken mocks. If you follow the Single Responsibility Principle (which we try to do), you end up with lots of small classes, and unit tests for those classes would be mostly useless because those classes do so little that it’s hard to write bugs and each class does such a small part of the larger activity.

There is an obvious trade-off here – my acceptance tests are not fast. I’m just testing web services (no driving a browser), so all ~2500 tests will run in about an hour. But we accepted this trade-off because we were able to get things done faster by just writing the acceptance tests, which we were going to do anyway to automate QA testing. The end result is high quality software with few bugs, not just because we have tests, but also because we communicate as a team and decide on the best way to test each feature and what it is that needs to be tested, and then we find the best way to automate the testing as a team.

Again, I’m not ready to say that this way is the best way for every project, and I’ve seen each approach work extremely well. I just wonder if the conventional wisdom on testing would be the same if we thought of it from the perspective of the whole team.

October 15, 2012 by Jon Kruger

.NET, BDD, Cucumber, Ruby, TDD, unit testing

Why do we group our tests by file?

Most people I know put their unit tests in files that mirror the folder structure and filename of the actual class that is being tested. So if I have app/models/order.rb, I’ll have spec/models/order_spec.rb. The tests in the order_spec.rb file will test the code in the order.rb file. Every project I have been on, whether Ruby or .NET, has done it this way.

But have you ever thought about why we do it this way?

We do it this way on our project and I keep running into two problems. The first problem is when I am about to modify some existing code and I want to know what tests will test that portion of the code. The first place I look is the corresponding test file (based on the folder structure or filename), but that doesn’t always give me all of the tests for that functionality.

The second problem is when I refactor some existing code and the refactoring spans multiple classes. Now I have broken tests all over the place and it’s hard to reconstruct the tests so that they test the same business concepts that they were testing before. Often times the tests were testing a portion of the system of classes that I am refactoring, but were not as encompassing as they should’ve been.

Here’s the thing — when I write tests, I’m typically using the Given/When/Then style of writing tests, even in unit tests using frameworks like RSpec. I’ll have test code that looks like this:

describe "When completing an order" do
  it "should set the status to In Process" do
     # test code here
  end
end

That code tests functionality having to do with orders and is going to help me ensure that my code performs some business function correctly. But if you look at that code snippet, you don’t know what classes I’m testing. Yeah I know, it’s just an example and I left out those details. But the point is that it doesn’t matter what classes I’m testing. What matters is that my tests are testing that my code performs a certain business function.

That being said, why are we grouping our tests by file? Wouldn’t it make much more sense to group our tests by business function instead?

I already have ways to find tests for a given class. I can search my code for the class name, or if I’m in .NET I can do Find Usages and have a little window pop up that tells me everywhere a class (or method) is used.

If my classes were grouped in folders by business function instead, I get the following benefits:

I can see what tests exist for a given business function
It encourages me to write tests that test business functionality, not test data structures (classes, methods, etc.)
I can put all kinds of tests in there (unit tests, Cucumber tests, even manual test plans) — all in one place, all checked into source control
My tests document business functionality instead of documenting a class

Remember, code constructs like classes and methods are just a means to an end, our goal is to write software that provides business value and performs specific functions. So I might as well organize my tests accordingly.

(Disclaimer: I have never actually tried organizing tests this way. It makes sense to me and I think it would work great, but I might try it and find out that it doesn’t work. But if anything, maybe I’ll start a good discussion.)

June 27, 2011 by Jon Kruger

BDD, Cucumber, Ruby, TDD, unit testing

Using Cucumber for unit tests… why not?

It seems that the accepted way to test in Ruby is to use Rspec for unit tests and to use Cucumber for acceptance tests (higher level functional testing). After doing a little bit of Cucumber, I’ve started to fall in love with the format of Cucumber tests.

Most Rubyists would probably agree that behavior-driven development is good (in other words, writing tests in a Given/When/Then format). We obviously do this in Cucumber (there isn’t much choice), but I’ve also written tests in this format in Rspec and in .NET.

I like BDD for two main reasons. First, I believe that software development is a series of translations. I want to translate business requirements into readable, executable specifications, then translate that into tests, then translate that into implementation code. Second, before I implement a feature and even before I write my tests, I try to write out what I want the code to do in English. If I can’t write out what I want to do in English, how and I supposed to know what I’m supposed to write in code?

Here’s my theory: if we agree that BDD is good, why don’t we write our unit tests in a format that is more amenable to BDD, that being the Cucumber format of tests? I’m not saying that we write acceptance level tests instead of unit tests, I’m saying that maybe we should write unit tests in a different format. Not only that, Cucumber tables give us a nice way to write more readable, data-driven tests. Here are a couple examples from the supermarket pricing kata (in Rspec and Cucumber).

Cucumber:

Feature: Checkout

  Scenario Outline: Checking out individual items
    Given that I have not checked anything out
    When I check out item 
    Then the total price should be the  of that item

  Examples:
    | item | unit price |
    | "A"  | 50         |
    | "B"  | 30         |
    | "C"  | 20         |
    | "D"  | 15         |

  Scenario Outline: Checking out multiple items
    Given that I have not checked anything out
    When I check out 
    Then the total price should be the  of those items

  Examples:
    | multiple items | expected total price | notes                |
    | "AAA"          | 130                  | 3 for 130            |
    | "BB"           | 45                   | 2 for 45             |
    | "CCC"          | 60                   |                      |
    | "DDD"          | 45                   |                      |
    | "BBB"          | 75                   | (2 for 45) + 30      |
    | "BABBAA"       | 205                  | order doesn't matter |
    | ""             | 0                    |                      |

  Scenario Outline: Rounding money
    When rounding "" to the nearest penny
    Then it should round it using midpoint rounding to ""

    Examples:
      | amount | rounded amount |
      | 1      | 1              |
      | 1.225  | 1.23           |
      | 1.2251 | 1.23           |
      | 1.2249 | 1.22           |
      | 1.22   | 1.22           |

Rspec:

require 'spec_helper'

describe "Given that I have not checked anything out" do
  before :each do
    @check_out = CheckOut.new
  end

  [["A", 50], ["B", 30], ["C", 20], ["D", 15]].each do |item, unit_price|
  describe "When I check out an invididual item" do
    it "The total price should be the unit price of that item" do
      @check_out.scan(item)
      @check_out.total.should == unit_price
    end
  end
end

  [["AAA", 130], # 3 for 130
    ["BB", 45],  # 2 for 45
    ["CCC", 60],
    ["DDD", 45],
    ["BBB", 75], # (2 for 45) + 30
    ["BABBAA", 205], # order doesn't matter
    ["", 0]].each do |items, expected_total_price|
    describe "When I check out multiple items" do
      it "The total price should be the expected total price of those items" do
        individual_items = items.split(//)
        individual_items.each { |item| @check_out.scan(item) }
        @check_out.total.should == expected_total_price
      end
    end
  end
end

class RoundingTester
  include Rounding
end

[[1, 1],
  [1.225, 1.23],
  [1.2251, 1.23],
  [1.2249, 1.22],
  [1.22, 1.22]].each do |amount, rounded_amount|
  describe "When rounding an amount of money to the nearest penny" do
    it "Should round the amount using midpoint rounding" do
      RoundingTester.new.round_money(amount).should == rounded_amount
    end
  end
end

A couple things stand out to me when you compare these two. First, if I want to run data-driven tests with different values, the Cucumber syntax is so much cleaner and more descriptive. Second, the “Given I have not checked anything out” section in the Rspec version is really long and contains two nested “describe” sections (many times you end up with many more than this). When you nest sections like this, it’s really hard to see the context of things or read the tests because the “Given” text is nowhere near the nested “When” sections in the code.

Rspec follows in the footsteps of previous unit testing frameworks that write test methods in test classes (or in the case of Rspec, something that resembles test classes and methods. But is this the best way, or just the way that we’re used to? We have been writing unit tests this way for years and years because we had no other choice. But that doesn’t mean that it’s the best way.

Here are the benefits I see of using the Cucumber syntax over Rspec:

The tests are much easier to read (especially when doing data-driven “scenario outline” tests).
The Given/When/Then text is all in one place (as opposed to spread out and mixed in with code).
It forces me to be able to write out in English what I want the code to do.
Any step definition that I write can easily be reused anywhere in any other Cucumber test.
The code just looks cleaner. I’ve seen a lot of messy Rspec tests.
Rspec doesn’t have a method that corresponds to the “When” step (unless I’m missing something), so you have to shoehorn it into before(:each) or the “it” method. (I’m not sure why this is, we figured this out in the .NET world long ago.)

To be fair, there are more BDD-friendly flavors of Rspec (like rspec-given). This helps you write tests in Given/When/Then format, but I still feel like all of the underscores and symbols and syntax is getting in the way of the actual test verbiage.

Favoring Cucumber is my personal preference and I know that there are some people that would probably disagree with my opinion on this, and that’s fine. But I’m really enjoying what Cucumber brings to the table, both in terms of functionality and the syntax.

December 13, 2010 by Jon Kruger

Does a whole team approach to testing change how developers should test?

Why do we group our tests by file?

Using Cucumber for unit tests… why not?

About Me