Patterns for Structuring a Unit Test

Christian Emmer
Feb 3, 2022 · 6 min read

Unit tests should test acceptance criteria, not internal behavior. Here are some patterns to help accomplish that.

Given-when-then

Given-when-then is a way to write acceptance criteria scenarios in business requirements that can then be translated 1:1 into unit tests. This style stems from behavior-driven development (BDD), which specifies that tests should be written to test behavior (as opposed to state). Tests should map so well to scenarios that they practically write themselves.

It's possible to use the given-when-then test structure with acceptance criteria written in other ways, but there is a lot of benefit to be gained from tests matching acceptance criteria word for word.

Here are the three parts to each given-when-then scenario (and therefore test):

  • Given some context or state of the world
  • When we invoke some action or behavior
  • Then we should get some expected change

Some examples of good given-when-then acceptance criteria using the Gherkin syntax:

Given a user that is logged out
 When the user tries to authenticate with valid credentials
 Then the user should be logged in

Given a user that is logged out
 When the user tries to authenticate with invalid credentials
 Then the user should be given an error

You can combine multiple expressions within each clause using "ands" like this:

Given a user with $10 in gift card balance
  And the price of an orange is $2
 When that user places an order for 2 oranges
 Then an order containing 2 oranges should be created
  And that user should have $6 left on their gift card

Let's translate these scenarios into unit tests:

const User = (username, password) => {
  let loggedIn = false;
  return {
    isLoggedIn: () => loggedIn,
    login: (usernameInput, passwordInput) => {
      if (usernameInput !== username || passwordInput !== password) {
        // Throw an Error object (not a bare string) so callers get a stack trace
        throw new Error('invalid username or password');
      }
      loggedIn = true;
    }
  };
};

const test = (func) => func();

test(() => {
  // Given a user that is logged out
  const user = User("admin", "letmein");

  // When the user tries to authenticate with valid credentials
  user.login("admin", "letmein");

  // Then the user should be logged in
  if (!user.isLoggedIn()) {
    throw new Error('User was not logged in!');
  }
});

test(() => {
  // Given a user that is logged out
  const user = User("admin", "letmein");

  // When the user tries to authenticate with invalid credentials
  try {
    user.login("admin", "password123");
  } catch(err) {
    // Then the user should be given an error
    return;
  }

  throw new Error('An error was not thrown!');
});

console.log('All tests passed!');

const User = (giftCardBalance) => {
  return {
    giftCardBalance: () => giftCardBalance,
    purchase: (products) => {
      const order = Order(products);
      giftCardBalance -= order.total;
      return order;
    }
  };
};

const Product = (name, price) => ({name, price});

const Order = (products) => ({
  products,
  total: products.reduce((total, product) => total + product.price, 0)
});

const test = (func) => func();

test(() => {
  // Given a user with $10 in gift card balance
  const user = User(10);

  // And the price of an orange is $2
  const orange = Product("Orange", 2);

  // When that user places an order for 2 oranges
  const order = user.purchase([orange, orange]);

  // Then an order containing 2 oranges should be created
  if (order.products.length !== 2 || order.total !== 4) {
    throw new Error('Invalid order produced!');
  }

  // And that user should have $6 left on their gift card
  if (user.giftCardBalance() !== 6) {
    throw new Error('User has an invalid gift card balance!');
  }
});

console.log('All tests passed!');

Notice how each of these scenarios has only one "when". If we limit ourselves to only one behavior under test, we're forced to give the test a clearer purpose.

There's a secondary benefit to writing tests this way: all of a domain's acceptance criteria live on in the code's tests, creating living documentation. It can be hard to know what the intended behavior of a system is, but if that behavior is spelled out in specific language in tests, it should help alleviate some of the confusion.

The given-when-then test style can be used with any testing framework, but some explicitly encourage it as part of BDD: Cucumber, JBehave, and SpecFlow, to name a few.

Arrange-act-assert

Arrange-act-assert is mostly synonymous with given-when-then, but stems from test-driven development (TDD) rather than BDD.

The three parts to an arrange-act-assert test are:

  • Arrange the data and state necessary for the test
  • Act on the object or method under test
  • Assert the expected results

The main drawback of arrange-act-assert is that it tends to describe technical behavior and test internal state, while given-when-then encourages testing functional behavior. Otherwise, there isn't a significant difference, other than that three "A" words are harder to distinguish at a glance.

Arrange-act-assert should follow the same rules as given-when-then, such as testing only one behavior at a time and having only one "act" expression.
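
As a rough sketch of what that looks like, here is the same hand-rolled test style applied to arrange-act-assert. The Cart factory and its add and total methods are hypothetical names made up for this illustration.

```javascript
// Hypothetical cart, invented for this example only
const Cart = () => {
  const items = [];
  return {
    add: (name, price) => items.push({ name, price }),
    total: () => items.reduce((sum, item) => sum + item.price, 0),
  };
};

const test = (func) => func();

test(() => {
  // Arrange: a cart that already contains one orange
  const cart = Cart();
  cart.add('Orange', 2);

  // Act: the single behavior under test
  cart.add('Apple', 3);

  // Assert: the total reflects both items
  if (cart.total() !== 5) {
    throw new Error('Cart total was not updated!');
  }
});
```

The first cart.add call is part of "arrange" because it only builds the starting state; the second is the one "act" the test exists to check.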

I have seen a lot of arrange-act-assert articles encourage mocking during "arrange", which can lead to the pitfall of testing implementation rather than behavior. The debate of spies vs. mocks is too large to cover here.
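
To make the distinction concrete, here is a minimal sketch that tests the same code two ways. UserService, mockRepository, and inMemoryRepository are all hypothetical, and the "mock" is a hand-rolled call recorder rather than one from any particular mocking library.

```javascript
// Hypothetical service under test; it depends only on the repository's interface
const UserService = (repository) => ({
  register: (username) => repository.save({ username }),
  exists: (username) => repository.findByUsername(username) !== undefined,
});

// Hand-rolled mock: records how it was called, stores nothing
const mockRepository = () => {
  const calls = [];
  return {
    calls,
    save: (user) => calls.push(user),
    findByUsername: () => undefined,
  };
};

// In-memory fake: a small but working repository
const inMemoryRepository = () => {
  const users = new Map();
  return {
    save: (user) => users.set(user.username, user),
    findByUsername: (username) => users.get(username),
  };
};

const test = (func) => func();

// Tests implementation: asserts on *how* the service talked to the repository
test(() => {
  const repository = mockRepository();
  const service = UserService(repository);

  service.register('admin');

  if (repository.calls.length !== 1) {
    throw new Error('save() was not called exactly once!');
  }
});

// Tests behavior: asserts on what a caller can observe afterwards
test(() => {
  // Given a service with no registered users
  const service = UserService(inMemoryRepository());

  // When a user registers
  service.register('admin');

  // Then that user should exist
  if (!service.exists('admin')) {
    throw new Error('Registered user was not found!');
  }
});
```

The first test would fail if register() were refactored to persist users through a different repository method, even though nothing changed for callers. The second test only fails when the observable behavior changes.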

Setup-exercise-verify-teardown

Also known as the "four-phase test", setup-exercise-verify-teardown is nearly the same as arrange-act-assert, except that it explicitly calls out a "teardown" phase to reset the system under test to its pre-setup state.
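
A minimal sketch of the four phases follows, assuming a hypothetical sharedStore that stands in for state outliving a single test (a database, a file, a global). The saveSetting function is likewise invented for the example.

```javascript
// Hypothetical shared state that survives between tests
const sharedStore = new Map();

// Hypothetical function under test
const saveSetting = (key, value) => sharedStore.set(key, value);

const test = (func) => func();

test(() => {
  // Setup: put the shared store into a known state
  sharedStore.set('theme', 'light');

  try {
    // Exercise: the single behavior under test
    saveSetting('theme', 'dark');

    // Verify: the new value replaced the old one
    if (sharedStore.get('theme') !== 'dark') {
      throw new Error('Setting was not saved!');
    }
  } finally {
    // Teardown: runs even if verification throws, so later tests start clean
    sharedStore.clear();
  }
});
```

Putting the teardown in a finally block is one way to make sure a failing verification doesn't leak state into the next test.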

If we try our best to create a test pyramid where we have noticeably more unit tests than integration tests, or if we apply the principles of hexagonal architecture and use test doubles for our repositories, then the "teardown" step shouldn't be needed in most cases.
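
Here is a rough sketch of why a test double removes the need for teardown. InMemoryOrderRepository and placeOrder are hypothetical names, and the repository is assumed to be a port that the domain code receives as an argument.

```javascript
// Hypothetical in-memory test double for an order repository port
const InMemoryOrderRepository = () => {
  const orders = [];
  return {
    add: (order) => { orders.push(order); },
    count: () => orders.length,
  };
};

// Hypothetical use case that only knows the repository's interface
const placeOrder = (repository, product) => repository.add({ product });

const test = (func) => func();

test(() => {
  // Setup: a fresh repository that no other test can see
  const repository = InMemoryOrderRepository();

  // Exercise
  placeOrder(repository, 'Orange');

  // Verify
  if (repository.count() !== 1) {
    throw new Error('Order was not stored!');
  }

  // No teardown: the repository goes out of scope with the test
});

test(() => {
  // A new repository starts empty, regardless of what earlier tests did
  const repository = InMemoryOrderRepository();

  if (repository.count() !== 0) {
    throw new Error('State leaked from a previous test!');
  }
});
```

Because each test builds its own repository, there is no shared state to reset between tests.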

Summary

Overall, the goal of these patterns is to help you write more readable and maintainable tests, which is one of the most important goals in software engineering. Each pattern was established at a different time and came from a different school of thought, but all share the same objective.