Explicit Unit Tests for Data Structures Are a Waste of Time

by Hit Subscribe 27. November 2018 02:34

When you properly unit test your code, you can go quickly and safely. However, it's not always clear what makes unit tests "proper."

In this article, I'll discuss the practice of writing explicit unit tests for data structures and show you how it's a total waste of time.

Unit Test Code Coverage

One of the core metrics associated with unit tests is their code coverage. This metric shows the proportion of an application's code exercised by tests (e.g., 80% coverage).

While there are several types of code coverage (like line, branch, and function coverage), the distinction between them isn't important in the context of our discussion here. What's important is the fact that good developers strive to achieve as much unit test code coverage as possible. At the end of the day, the higher your code coverage, the more confidence you have in your test suite. In my opinion, that's a good goal.

However, sometimes you might be tempted to write unneeded unit tests just for the sake of increasing code coverage when it's not really required. That's exactly the case with explicit unit tests for data structures.

Objects vs. Data Structures

At this point, I want to explain what I mean by "data structures," as this term might have multiple interpretations. To do that, I'll draw a contrast between them and object-oriented objects.

Objects hide implementation details and expose a behavioral API to the outside world. In other words, you can use objects to perform actions without being concerned with how exactly they execute these actions. For example, this object represents a user in the system:

public class User
{
    public void LogIn(String password)
    {
        // some implementation
    }

    public void LogOut()
    {
        // some implementation
    }
}

That's great, but what about situations when you don't need to perform actions and just want to pass data from one part of the application to another? Enter data structures.

In contrast to objects, data structures don't have any behavior. They simply expose their internal data and the structure of that data to the world. For example, this is how you could represent a user's data in the system:

public class User
{
    public String Id { get; private set; }
    public String FirstName { get; private set; }
    public String LastName { get; private set; }
    public String FullName
    {
        get
        {
            return this.FirstName + " " + this.LastName;
        }
    }

    public User(String id, String firstName, String lastName)
    {
        this.Id = id;
        this.FirstName = firstName;
        this.LastName = lastName;
    }
}

Some developers will claim that this distinction doesn't exist in C# (and some other languages) because all classes are objects. After all, every class implicitly extends the object class. However, I don't think that's the case. In my opinion, that's just a conflation of two distinct concepts due to unfortunate naming.

But don't take my word for it. Robert "Uncle Bob" Martin wrote the following in his book Clean Code:

"Mature programmers know that the idea that everything is an object is a myth."

Therefore, you should distinguish between objects and data structures even if language constructs don't help you.

Why You Don't Need Unit Tests for Data Structures

Now that you know what I mean by data structures, let's get to the main premise of this article: you shouldn't write explicit unit tests for data structures in your application.

For example, referring back to the user data structure described above, some developers would write this kind of unit test:

[Test]
public void GetFullName_ReturnsConcatenationOfFirstAndLastNames()
{
    // Arrange
    User user = new User("id", "first", "last");
    // Act
    String fullName = user.FullName;
    // Assert
    // NUnit expects the expected value first and the actual value second
    Assert.AreEqual("first last", fullName);
}

This unit test might seem legitimate at first sight because you do want to cover this code and ensure that you implemented the user data structure correctly. However, in practice, it's just a waste of time. See, as we already established, applications consist of both objects and data structures. Since the user data structure doesn't expose behavior, there must be at least one object that uses this data structure.

Let's assume that this is it:

public class AuthManager
{
    public void LogIn(User user)
    {
        // some implementation
    }

    public void LogOut(User user)
    {
        // some implementation
    }
}

Now, when you write unit tests for AuthManager, you'll use made-up user details, but you'll still use the real implementation of the user data structure. It will look something like this:

[Test]
public void LogIn_UserLoggedIn()
{
    // Arrange
    User user = new User("id", "first", "last");
    // Act
    authManagerUnderTest.LogIn(user);
    // Assert
    // assert according to AuthManager requirements
}

Since the functionality of AuthManager is predicated on the assumption that the user data structure works, this test will fail if the data structure is buggy. In other words, this unit test explicitly covers the object's code and implicitly covers the data structure's code. And that's true for every unit test that makes use of data structures. That's why you don't need to test data structures explicitly.

And it's not just about sparing a one-time effort on a couple of tests. Data structures change very often: fields are added and removed, names are changed, additional hierarchies are introduced. If you have explicit unit tests for data structures, you'll need to invest in their ongoing maintenance.

Conversely, implicit unit tests are much less coupled to data structures' implementation details.
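
To make the implicit coverage described above concrete, here is a minimal sketch. The article doesn't show AuthManager's implementation, so this version assumes, purely for illustration, that LogIn returns a welcome message built from the user's full name. The message text, the ImplicitCoverageDemo class, and the plain exception used in place of a test framework assertion are all hypothetical, chosen so that the snippet stands on its own.

```csharp
using System;

// The user data structure from the article.
public class User
{
    public String Id { get; private set; }
    public String FirstName { get; private set; }
    public String LastName { get; private set; }
    public String FullName
    {
        get { return this.FirstName + " " + this.LastName; }
    }

    public User(String id, String firstName, String lastName)
    {
        this.Id = id;
        this.FirstName = firstName;
        this.LastName = lastName;
    }
}

// Hypothetical implementation: assumed here to return a welcome message.
public class AuthManager
{
    public String LogIn(User user)
    {
        return "Welcome, " + user.FullName + "!";
    }
}

public static class ImplicitCoverageDemo
{
    // Stands in for an AuthManager unit test; there is no test aimed at User.
    public static void LogIn_ReturnsWelcomeMessage()
    {
        // Arrange
        User user = new User("id", "first", "last");
        AuthManager authManagerUnderTest = new AuthManager();
        // Act
        String message = authManagerUnderTest.LogIn(user);
        // Assert: a bug in User.FullName would make this check fail too
        if (message != "Welcome, first last!")
        {
            throw new Exception("Unexpected message: " + message);
        }
    }

    public static void Main()
    {
        LogIn_ReturnsWelcomeMessage();
        Console.WriteLine("AuthManager test passed");
    }
}
```

If FullName were changed to return only the first name, this AuthManager check would fail even though nothing in it targets User directly.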

Complex Data Structures

Until now, I intentionally used a trivial data structure as an example to explain the basics. However, the same argument applies to complex data structures equally well.

Let's say that you need your data to be structured as a tree. No problem. Here's your tree node:

public class DataTreeNode
{
    public String Data { get; private set; }
    public DataTreeNode Parent { get; private set; }
    public List<DataTreeNode> Children { get; set; }

    public DataTreeNode(String data, DataTreeNode parent)
    {
        this.Data = data;
        this.Parent = parent;
        // start with an empty list so that Children is never null
        this.Children = new List<DataTreeNode>();
    }
}

DataTreeNode is much more involved than the previous data structure, but there is still no need to unit test it explicitly. There is an object in your application that depends on it, and you'll implicitly cover DataTreeNode by properly unit testing that object.
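
As a rough sketch of what such an object might look like, the snippet below assumes a hypothetical NodePathFormatter that builds a path string by walking up the Parent links. Neither NodePathFormatter nor TreeCoverageDemo comes from the article, and a plain exception again stands in for a test framework assertion. Testing the formatter exercises the constructor, Data, and Parent of DataTreeNode without any test aimed at the node itself.

```csharp
using System;
using System.Collections.Generic;

// The tree node from the article.
public class DataTreeNode
{
    public String Data { get; private set; }
    public DataTreeNode Parent { get; private set; }
    public List<DataTreeNode> Children { get; set; }

    public DataTreeNode(String data, DataTreeNode parent)
    {
        this.Data = data;
        this.Parent = parent;
    }
}

// Hypothetical object that depends on DataTreeNode.
public class NodePathFormatter
{
    // Builds a path such as "root/docs/readme" by following Parent links.
    public String FormatPath(DataTreeNode node)
    {
        List<String> parts = new List<String>();
        for (DataTreeNode current = node; current != null; current = current.Parent)
        {
            // Prepend, because the walk goes from the node up to the root.
            parts.Insert(0, current.Data);
        }
        return String.Join("/", parts);
    }
}

public static class TreeCoverageDemo
{
    // Stands in for a NodePathFormatter unit test.
    public static void FormatPath_ReturnsPathFromRoot()
    {
        // Arrange
        DataTreeNode root = new DataTreeNode("root", null);
        DataTreeNode docs = new DataTreeNode("docs", root);
        DataTreeNode readme = new DataTreeNode("readme", docs);
        NodePathFormatter formatterUnderTest = new NodePathFormatter();
        // Act
        String path = formatterUnderTest.FormatPath(readme);
        // Assert: fails if the node stores Data or Parent incorrectly
        if (path != "root/docs/readme")
        {
            throw new Exception("Unexpected path: " + path);
        }
    }

    public static void Main()
    {
        FormatPath_ReturnsPathFromRoot();
        Console.WriteLine("NodePathFormatter test passed");
    }
}
```

Note that this particular formatter never touches Children, so that property would still show up as uncovered until some other tested object uses it.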

Algorithms Bundled With Data Structures

Data structures should expose data and shouldn't expose any behavior. That's a rule. However, as with any other rule, sometimes you'll want to make an exception.

One such case is when you put some data-related algorithm into the data structure itself. For example, if you need to find all the leaves of a specific DataTreeNode in many places in your app, it will make sense to add this method to its public API:

public class DataTreeNode
{
    // implementation as in previous example

    public LinkedList<DataTreeNode> GetAllLeafs()
    {
        LinkedList<DataTreeNode> leafs = new LinkedList<DataTreeNode>();
        AddAllLeafNodes(this, leafs);
        return leafs;
    }

    // recursively collects the nodes that have no children
    private void AddAllLeafNodes(DataTreeNode node, LinkedList<DataTreeNode> leafs)
    {
        if (!node.Children.Any()) // Any() requires "using System.Linq;"
        {
            leafs.AddLast(node);
        }
        else
        {
            foreach (var child in node.Children)
            {
                AddAllLeafNodes(child, leafs);
            }
        }
    }
}

While this algorithm isn't very complex, it's also not entirely trivial and even uses recursion. If you're an experienced unit tester, you'll surely want to unit test this piece of logic. That's alright, but keep in mind that you don't really need to because the same argument applies: there is at least one object that calls this method in your application. If you unit test that object properly, you also cover the implementing code of this data structure.

However, given that unit testing (especially test-driven development) can make implementation of nontrivial logic easier, I wouldn't say that writing unit tests for algorithms bundled with data structures is necessarily wasteful. After all, if you decided to make an exception and bundle behavior with data structures, it's alright to make an additional exception and unit test this behavior, even though you don't have to. There is more to unit testing than just code coverage.
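
If you do decide to test the bundled algorithm directly, a minimal sketch could look like the one below. It assumes that Children starts out as an empty list, and it uses two hypothetical helpers, AddChild and LeafDataOf, that are not part of the article's code. As before, plain exceptions stand in for a test framework so that the snippet is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DataTreeNode
{
    public String Data { get; private set; }
    public DataTreeNode Parent { get; private set; }
    public List<DataTreeNode> Children { get; set; }

    public DataTreeNode(String data, DataTreeNode parent)
    {
        this.Data = data;
        this.Parent = parent;
        // Assumption: an empty list, so a fresh node counts as a leaf.
        this.Children = new List<DataTreeNode>();
    }

    public LinkedList<DataTreeNode> GetAllLeafs()
    {
        LinkedList<DataTreeNode> leafs = new LinkedList<DataTreeNode>();
        AddAllLeafNodes(this, leafs);
        return leafs;
    }

    private void AddAllLeafNodes(DataTreeNode node, LinkedList<DataTreeNode> leafs)
    {
        if (!node.Children.Any())
        {
            leafs.AddLast(node);
        }
        else
        {
            foreach (var child in node.Children)
            {
                AddAllLeafNodes(child, leafs);
            }
        }
    }
}

public static class GetAllLeafsDemo
{
    // Hypothetical helper: creates a node and attaches it to its parent.
    public static DataTreeNode AddChild(DataTreeNode parent, String data)
    {
        DataTreeNode child = new DataTreeNode(data, parent);
        parent.Children.Add(child);
        return child;
    }

    // Hypothetical helper: joins the leaves' data so results are easy to compare.
    public static String LeafDataOf(DataTreeNode node)
    {
        return String.Join(",", node.GetAllLeafs().Select(leaf => leaf.Data));
    }

    public static void GetAllLeafs_ReturnsOnlyNodesWithoutChildren()
    {
        // Arrange: root has children a and b; a has child a1
        DataTreeNode root = new DataTreeNode("root", null);
        DataTreeNode a = AddChild(root, "a");
        DataTreeNode b = AddChild(root, "b");
        AddChild(a, "a1");
        // Act and assert: leaves come back in depth-first order
        if (LeafDataOf(root) != "a1,b")
        {
            throw new Exception("Unexpected leaves: " + LeafDataOf(root));
        }
        // A node without children is its own only leaf
        if (LeafDataOf(b) != "b")
        {
            throw new Exception("Unexpected leaves: " + LeafDataOf(b));
        }
    }

    public static void Main()
    {
        GetAllLeafs_ReturnsOnlyNodesWithoutChildren();
        Console.WriteLine("GetAllLeafs test passed");
    }
}
```

Writing the two cases first (a small tree and a single node) is also a cheap way to drive the recursion in a test-first style.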

Summary

In object-oriented design, there are objects and there are data structures. Some languages expose these fundamental concepts as part of their syntax, while others don't. Nevertheless, you should always keep this distinction in mind and avoid mixing object and data structure responsibilities together.

You want to unit test the hell out of your objects to ensure that they behave according to the requirements. Conversely, unit testing of data structures is a waste of time because their functionality is covered implicitly by objects' unit tests. However, when you make educated trade-offs and bundle specific behaviors with data structures, you might want to unit test these, even though it's not required from a pure code coverage standpoint.

But what if you find a coverage gap in one or more of your data structures? Well, this coverage gap indicates either that some objects inside your application aren't tested properly or that this code (or the entire data structure) isn't used at all. In the former case, your time will be better spent on unit testing the objects, and in the latter case, we're talking about dead code that should simply be removed.

This post was written by Vasiliy Zukanov. Vasiliy worked in the semiconductor industry and then became a software engineer specializing in Android. After several years of 9-5 jobs, he decided to go into Android dev freelancing. Currently, he consults and creates Android video courses for developers.
