Finding Unknown Bugs With Property-Based Testing

by Hit Subscribe 19. December 2018 02:12

There are many ways of testing your application or library. The test pyramid provides a good starting point for the most common types of tests: unit tests, integration tests, end-to-end tests, and manual tests. But there are other types of tests, like contract tests, load tests, smoke tests, and the subject of this article: property-based tests.

What's the Idea?

Property-based testing is where you test that your code satisfies a set of predefined properties, using a wide range of input. What sets it apart from unit testing and other types of tests? That last part: using a wide range of input.

A library for property-based testing allows you to write a test that will accept randomly generated data. The idea is that the library runs your tests many times, each time with different data. The library is actually trying to make your test fail. We'll see an example below.

The difference with unit testing is that with unit testing, you are responsible for thinking about the input. If you forget a specific case that could make your test fail, you'll ship a bug. With property-based testing, you have more data fired at your code, and there's a higher chance that you will discover the bug.
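To make the idea concrete, here is a minimal hand-rolled sketch of what such a library does at its core: call a property many times with random input and report the first input that breaks it. This is not FsCheck code. The class TinyPropertyRunner, its method names, and the choice of small int arrays as input are all assumptions made for illustration.

```csharp
using System;
using System.Linq;

public static class TinyPropertyRunner
{
    // Hypothetical helper: runs the property against `runs` random int arrays
    // and returns the first input for which it is false, or null if none is found.
    public static int[] FindCounterexample(Func<int[], bool> property, int runs = 100, int seed = 42)
    {
        var random = new Random(seed);
        for (var i = 0; i < runs; i++)
        {
            var length = random.Next(0, 10); // 0 to 9 elements
            var input = new int[length];
            for (var j = 0; j < length; j++)
            {
                input[j] = random.Next(-100, 101); // values from -100 to 100
            }

            if (!property(input))
            {
                return input;
            }
        }

        return null;
    }

    public static void Demo()
    {
        // A property that holds for every array: sorting keeps the length.
        var none = FindCounterexample(xs => xs.OrderBy(x => x).Count() == xs.Length);
        Console.WriteLine(none == null ? "no counterexample found" : "unexpected failure");

        // A property that is wrong: "every array is already sorted".
        var found = FindCounterexample(xs => xs.SequenceEqual(xs.OrderBy(x => x)));
        Console.WriteLine(found == null
            ? "no counterexample found"
            : "counterexample: [" + string.Join(", ", found) + "]");
    }
}
```

A real library does far more than this loop: it ships generators for many types, shrinks failing input, and reports the result through your test runner.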

On a side note, the original library for property-based testing was QuickCheck, written in and for Haskell. It was initially released in 1999, so it's not a new idea.

Let's Write Our First Test

Let's start simple and assume we want to write a method that can import a user into an existing system. In my case, I was writing an Orchard module and I needed to import a list of email addresses. I will simplify things here and only import one email address. I'll also remove some extra details that aren't important to get my point across.

I will be using .NET in this article. But if you're not using .NET, just search the internet and you should be able to find a library for your language. There are property-based testing libraries for many different languages. For .NET, FsCheck is the go-to library. It's written in F#, but we can use it for any .NET language.

Create a new class library and add the following NuGet packages:

  • FsCheck.Xunit
  • xunit
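  • xunit.runner.visualstudio (because I'm working in Visual Studio)
  • Moq (for the mock of IUserService used in the tests)
  • FluentAssertions (for the Should() assertions used in the tests)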

Now, define the property of the system that we're implementing:

using FluentAssertions;
using FsCheck.Xunit;
using Moq;

public class UserImportServiceTests
{
    [Property]
    public void ShouldImportValidEmail(string email)
    {
        var mockUserService = new Mock<IUserService>();
        var service = new UserImportService(mockUserService.Object);

        var result = service.ImportUser(email);

        result.Should().BeTrue();
        mockUserService.Verify(x => x.CreateUser(email));
    }
}

Then, in another class library, we can start implementing. In the spirit of TDD, we'll just make it compile first:

public class UserImportService
{
    private readonly IUserService _userService;

    public UserImportService(IUserService userService)
    {
        _userService = userService;
    }

    public bool ImportUser(string email)
    {
        throw new NotImplementedException();
    }
}

When we run our tests, we immediately see what we expect. They fail.

Now, let's implement it in what we think is a correct way:

public class UserImportService
{
    private readonly IUserService _userService;

    public UserImportService(IUserService userService)
    {
        _userService = userService;
    }

    private const string EmailPattern =
        @"^(?![\.@])(""([^""\r\\]|\\[""\r\\])*""|([-\p{L}0-9!#$%&'*+/=?^_`{|}~]|(?<!\.)\.)*)(?<!\.)"
        + @"@([a-z0-9][\w-]*\.)+[a-z]{2,}$";

    public bool ImportUser(string email)
    {
        if (!string.IsNullOrEmpty(email) && !Regex.IsMatch(email, EmailPattern))
        {
            return false;
        }

        _userService.CreateUser(email);
        return true;
    }
}

That should do it. We validate the email using a regular expression and then call the UserService if it validates. This UserService is a built-in service in Orchard but doesn't provide a way of importing users in bulk. That's the reason I was writing this UserImportService.

When we run our test, FsCheck will generate various sets of input data and use it in the test. In our case, FsCheck will generate random strings. FsCheck managed to make our test fail.

But it's failing because FsCheck is feeding in values that aren't valid emails. It first tried "\003wf" as an email address. Because that failed, it tried simpler values. This is called shrinking. FsCheck succeeded in "shrinking" to the simplest value, "a", and the test still failed. But that defeats the purpose of our test, because we want it to use valid emails.
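Shrinking is easier to picture with a small sketch. The code below is a hand-rolled illustration and not FsCheck's actual algorithm: the class ShrinkSketch and its rules for what counts as "simpler" (drop one character, or replace one character with 'a') are assumptions chosen to keep the example short.

```csharp
using System;
using System.Collections.Generic;

public static class ShrinkSketch
{
    // Hypothetical notion of "simpler": drop one character,
    // or replace one character with 'a'.
    private static IEnumerable<string> Candidates(string s)
    {
        for (var i = 0; i < s.Length; i++)
        {
            yield return s.Remove(i, 1);
        }

        for (var i = 0; i < s.Length; i++)
        {
            if (s[i] != 'a')
            {
                yield return s.Substring(0, i) + "a" + s.Substring(i + 1);
            }
        }
    }

    // Greedily moves to the first simpler candidate that still fails the
    // property, and stops when no simpler candidate fails anymore.
    public static string Shrink(string failing, Func<string, bool> property)
    {
        var current = failing;
        var improved = true;
        while (improved)
        {
            improved = false;
            foreach (var candidate in Candidates(current))
            {
                if (!property(candidate))
                {
                    current = candidate;
                    improved = true;
                    break;
                }
            }
        }

        return current;
    }
}
```

As a stand-in for the test above, take the property s => string.IsNullOrEmpty(s) || s.Contains("@"), which fails for any non-empty string without an @ sign. Calling ShrinkSketch.Shrink("\u0003wf", ...) with that property walks from "\u0003wf" to "wf" to "f" and ends at "a". The empty string passes the property, so it can't be shrunk any further.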

What we need is a way of telling FsCheck to only use valid emails. This is where generators fit in. Generators are responsible for generating the random data that is injected into our tests. Luckily, FsCheck already supports the System.Net.Mail.MailAddress class so we can change our tests to look like this:

[Property]
public void ShouldImportValidEmail(MailAddress mailAddress)
{
    var mockUserService = new Mock<IUserService>();
    var service = new UserImportService(mockUserService.Object);

    var result = service.ImportUser(mailAddress.Address);

    result.Should().BeTrue();
    mockUserService.Verify(x => x.CreateUser(mailAddress.Address));
}

Notice how we now accept a MailAddress instance in our test. We still pass a string to our service because that is what is sent from the client to our UserImportService. Let's run our tests again.

Whoa! FsCheck created a crazy-looking email address. That can't be right, can it? Well, it actually is, because "com" is a domain just like "ncrunch.net" is. It's just a top-level domain, and usually the maintainers of TLDs won't create email addresses in that domain. But they could. So let's change our code:

public bool ImportUser(string email)
{
    try
    {
        var mailAddress = new MailAddress(email);
        _userService.CreateUser(mailAddress.Address);
    }
    catch (FormatException)
    {
        return false;
    }

    return true;
}

Now our test passes.

Let's take a moment to think about this. When we write a test, we're already thinking about the happy flow and the edge cases, which means our tests and our implementation are more tied together than we would like to admit. If I had written unit tests, I would have written a specific set of tests and possibly not have thought about certain cases. In the above case, I would have written a test for a valid email address and one for an invalid email address. Or at least for what I think is an invalid email address! I didn't know that foo@com is a valid email address.
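The disagreement between the two validation approaches can be checked outside of any test framework. The sketch below puts the regular expression from the first implementation next to the MailAddress constructor. The class EmailChecks and its method names are hypothetical and only serve this comparison.

```csharp
using System;
using System.Net.Mail;
using System.Text.RegularExpressions;

public static class EmailChecks
{
    // Same pattern as in the first UserImportService implementation.
    private const string EmailPattern =
        @"^(?![\.@])(""([^""\r\\]|\\[""\r\\])*""|([-\p{L}0-9!#$%&'*+/=?^_`{|}~]|(?<!\.)\.)*)(?<!\.)"
        + @"@([a-z0-9][\w-]*\.)+[a-z]{2,}$";

    // True if the regular expression considers the input a valid email address.
    public static bool RegexAccepts(string email)
    {
        return Regex.IsMatch(email, EmailPattern);
    }

    // True if System.Net.Mail.MailAddress can parse the input.
    public static bool MailAddressAccepts(string email)
    {
        try
        {
            new MailAddress(email);
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
    }
}
```

For "foo@com", RegexAccepts returns false because the pattern insists on at least one dot in the domain, while MailAddressAccepts returns true. That gap is what the property-based test exposed.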

In my real-world case, a similar bug made it to production and it only surfaced after a few months when the admin tried to import an email address like peter.Morlion@example.com (notice the capital M). This led to a bug report, which I reproduced in a unit test and then fixed. The special case is now covered. But with the property-based testing approach, I would have found this bug before it was released.

That's the advantage of property-based testing—FsCheck will "think" about edge cases that you didn't imagine. As I've mentioned above, it's trying hard to make your code fail. It's a good thing if it succeeds. We'd rather have our test find a bug than have a user find one.

Characteristics of Property-based Testing Tools

Let's back away from the code now. You might be thinking that my test found the bug only because I got lucky. On another run, the generator might not produce any fancy emails, or it might not use capital letters. That's a valid concern. Property-based tests can only work if the generators do their job well. A generator could be missing something, but I've shown that your unit tests can miss certain cases too. With a good generator, your code is tested many more times, and with far more varied data, than you could cover by writing unit tests by hand. So you're actually testing your code more thoroughly than with traditional unit tests.

There are some more points that we can look at, but that would take us beyond a simple introduction. Here is a summary of what a good property-based testing library must do:

  • Generate random data to use in your tests.
  • Show the simplest set of data that fails the test. We call this shrinking.
  • Allow you to create custom generators.
  • Provide a way of defining how many iterations of a test should be performed.
  • Allow filtering data from those generators.

FsCheck provides all of this and a lot more. The documentation is F#-centric but provides some useful C# examples too.
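As a small illustration of the point about iterations, the sketch below raises the number of generated cases for a single property. It assumes FsCheck.Xunit 2.x, where the Property attribute exposes a MaxTest setting. The class name and the trivial property are made up for this example.

```csharp
using FsCheck.Xunit;

public class IterationCountExamples
{
    // Assumption: FsCheck.Xunit 2.x. MaxTest sets how many generated
    // cases must pass before the property is reported as passing.
    [Property(MaxTest = 500)]
    public bool AdditionIsCommutative(int a, int b)
    {
        return a + b == b + a;
    }
}
```

More iterations give the generators more chances to hit an awkward value, at the cost of a slower test run.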

To finish, let's jump into that last point about filtering.

Filtering

We tested our code with valid email addresses. But we also need to test it with strings that aren't valid email addresses. To do this, we'll use the existing generator for strings but filter the result. With FsCheck, you need to register your generator in the static constructor of your test class. FsCheck then looks for static properties in the class you provided to the register method. An example will clarify this:

static UserImportServiceTests()
{
    // Register all static properties that return a generator in this class
    Arb.Register<UserImportServiceTests>();
}

// Return the string generator, but filter out anything that returns a valid email address
public static Arbitrary<InvalidMail> InvalidMailGenerator => Arb.From<string>().Filter(x =>
{
    try
    {
        new MailAddress(x);
        return false;
    }
    catch (Exception)
    {
        return true;
    }
}).Convert(s => new InvalidMail(s), i => i.Value);

Notice how we register an Arbitrary<InvalidMail> and not an Arbitrary<string>. That's because our generator itself uses Arb.From<string>. If we registered it as an Arbitrary<string>, FsCheck would use our new generator when we call Arb.From<string> inside it, and that endless recursion ends in a StackOverflowException. The InvalidMail class is a simple class that accepts a string and stores it in a Value property.
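The InvalidMail class isn't shown in this article. A minimal sketch that matches the description (a constructor taking a string, and a Value property) could look like the following. The ToString override is an optional extra, added here on the assumption that a readable value is handy when a failing case is printed.

```csharp
public class InvalidMail
{
    public InvalidMail(string value)
    {
        Value = value;
    }

    // The raw string that is not a valid email address (may be null or empty).
    public string Value { get; }

    // Optional: makes null and empty strings visible in printed output.
    public override string ToString()
    {
        return Value == null ? "<null>" : "\"" + Value + "\"";
    }
}
```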

With this, we can now write a new test like this:

[Property]
public void ShouldNotImportInvalidEmail(InvalidMail invalidEmail)
{
    var mockUserService = new Mock<IUserService>();
    var service = new UserImportService(mockUserService.Object);

    var result = service.ImportUser(invalidEmail.Value);

    result.Should().BeFalse();
    mockUserService.Verify(x => x.CreateUser(invalidEmail.Value), Times.Never);
}

This leads to another edge case that I didn't think about: an empty string.

When an empty string is passed, an ArgumentException is thrown. In our UserImportService, we took FormatExceptions into account, but not ArgumentExceptions. So we'll have to change our code to this:

public bool ImportUser(string email)
{
    try
    {
        var mailAddress = new MailAddress(email);
        _userService.CreateUser(mailAddress.Address);
    }
    catch (FormatException)
    {
        return false;
    }
    catch (ArgumentException)
    {
        return false;
    }

    return true;
}

You could also catch the general System.Exception type, but I prefer to catch exceptions that are as specific as possible. Our tests now pass.
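To see which exception type belongs to which kind of input, you can probe the MailAddress constructor directly. The helper below is a hypothetical sketch for exploration only, which is why it catches the general Exception type.

```csharp
using System;
using System.Net.Mail;

public static class MailAddressFailures
{
    // Returns the name of the exception type thrown by the MailAddress
    // constructor for the given input, or "none" if parsing succeeds.
    public static string Probe(string input)
    {
        try
        {
            new MailAddress(input);
            return "none";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }
}
```

Probe("a") returns "FormatException", Probe("") returns "ArgumentException", and Probe(null) returns "ArgumentNullException". Because ArgumentNullException derives from ArgumentException, the second catch block in ImportUser covers the null case as well.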

Property-based or Unit Tests? Both!

Our code now passes more test cases than I could think about. We started with a regular expression that wasn't sufficient. Then we used the System.Net.Mail.MailAddress class but didn't catch all possible exceptions. After using property-based tests, our code is now more robust than it would have been with unit tests. This is not to say that unit tests no longer have a place. It can be easier to control the exact input of a unit test and sometimes this is necessary.

Property-based tests aren't fully deterministic. But consider that many of your unit tests could probably be replaced with property-based tests, and that such a change would lead to more bugs being found earlier. Property-based testing can be another tool in your testing toolbelt.
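The lack of determinism doesn't have to get in the way of reproducing a failure. The sketch below assumes FsCheck.Xunit 2.x, where the Property attribute has a Replay setting that takes the seed pair printed in a failure report. The numbers used here are placeholders and not taken from a real run, and the class name and property are made up for this example.

```csharp
using FsCheck.Xunit;

public class ReplayExample
{
    // Assumption: FsCheck.Xunit 2.x, where Replay accepts the "seed,gamma"
    // pair from a failure report. "123,456" is a placeholder value.
    [Property(Replay = "123,456")]
    public bool AdditionIsCommutative(int a, int b)
    {
        return a + b == b + a;
    }
}
```

With a fixed seed, the same sequence of inputs is generated on every run, which is useful while you work on a fix. Remove the setting afterwards so the property goes back to exploring new input.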

This post was written by Peter Morlion. Peter is a passionate programmer that helps people and companies improve the quality of their code, especially in legacy codebases. He firmly believes that industry best practices are invaluable when working towards this goal, and his specialties include TDD, DI, and SOLID principles.
