So, lately we have been striving to make Dabo a Test-Driven Development (TDD) project. This means that we are implementing unit tests for all of the code. This wiki article explains what exactly a unit test is and gives guidelines for creating tests.

Test-Driven Development Tutorials

Probably the best explanation of unit tests can be found in the Dive Into Python book. It gives a well-thought-out structure to unit testing and provides a sample for the unittest module. If you are new to agile development and a TDD environment, it is a wonderful place to start.

What is Unit Testing?

All software has requirements associated with it. These can be formal or informal, but when coding a module you expect methods, properties, and classes to behave a certain way. We can write a test suite that tests the software to make sure that it behaves the way we expect it to. These tests are called unit tests. The "unit" part means that we only test the smallest possible part of the application. For OOP, and Python in particular, this means a class or a function. Note that methods are not units because they can't function without their class.

The goal of unit tests is to isolate every part of the program and show that the individual parts work. The tests provide strict requirements that the units must satisfy. Unit testing is not a replacement for integration testing or high-level systems testing. You must still do those to catch some bugs. But by showing that all of the individual parts work, we can isolate and fix bugs more easily.

Advantages Of Unit Testing

  1. Forces the developer to think about and explicitly define the requirements of a function.
  2. Helps prevent unnecessary code, because the developer knows that the function is complete when all tests pass.
  3. A developer can be confident that the function still behaves the same way after the code has been changed.
  4. In a team environment, you can be sure your new code doesn't break anyone else's code because the tests all pass (assuming all of the tests have been shared with the team).

How Do I Know If A Test Is A Unit Test?

Here is one of the more applicable quotes about unit testing. Note that the quote applies to units whose requirements don't explicitly involve these things. For example, if you are testing a C system function that accesses a file on a FAT32 file system, of course the test needs to access the file system. But a function that takes a string and counts the number of words shouldn't have a test that reads the string from the file system, because file access is not a requirement of the function.

"A test is not a unit test if:

  • It talks to the database
  • It communicates across the network
  • It touches the file system
  • It can't run at the same time as any of your other unit tests
  • You have to do special things to your environment (such as editing config files) to run it.

Tests that do these things aren't bad. Often they are worth writing, and they can be written in a unit test harness. However, it is important to be able to separate them from true unit tests so that we can keep a set of tests that we can run fast whenever we make our changes." - Michael Feathers

This is a good quote that I found on his blog. Basically, a test is a unit test only if it tests one and only one unit. You shouldn't have to access a database through a servlet to get data for a test. Instead, unit tests use dummy data and mock objects to isolate the tests completely from the outside world. A mock object is a simulated object that mimics a real object in a way the test can control (think of a crash test dummy in a car), which makes it possible to test the behavior of the unit on its own. Your unit tests should test only one unit at a time. Anything else is more of an integration test.
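As a rough illustration of a mock object, the sketch below tests a word-counting helper without touching a file. The helper count_words_from and its source parameter are hypothetical names invented for this example, and it assumes Python 3, where mock support ships in the standard library as unittest.mock.

```python
import unittest
from unittest import mock


def count_words_from(source):
    """Hypothetical unit: count the words in whatever source.read() returns."""
    return len(source.read().split())


class CountWordsFromTest(unittest.TestCase):
    def test_counts_words_supplied_by_source(self):
        # The mock stands in for a real file, socket, or database cursor,
        # so the test never leaves the process.
        source = mock.Mock()
        source.read.return_value = "three short words"

        self.assertEqual(count_words_from(source), 3)
        # The mock also records how it was used.
        source.read.assert_called_once_with()
```

Because the test controls exactly what read() returns, a failure can only point at count_words_from itself.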

Implementation of Unit Testing

Normally in TDD we write these tests before we write any code. What?!? You're probably wondering why you should write the tests before the code. Well, look back at the Advantages Of Unit Testing section for the answers. So, we have this class and we want to test it. The unit test for the class should be stored in one file. The unit test is then broken down into its most fundamental elements, the test cases. A test case has the job of answering a single question about the code. Not two, three, four... ONE AND ONLY ONE. If you find yourself testing a class method that answers more than one question, it is a good bet that you need to refactor that code and break it into smaller units.

A test case should be able to run completely by itself without any human input. Unit testing is about automation. The test case should determine by itself whether the function it is testing has passed or failed, without a human interpreting the results. Lastly, the test case needs to run in isolation, separate from any other test cases (even if they test the same functions). Each test case is an island.
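One way to get this isolation with the unittest module is to build a fresh fixture in setUp, which runs before every test method. The following is only a minimal sketch; the WordList class is a hypothetical stand-in for whatever unit is being tested.

```python
import unittest


class WordList:
    """Hypothetical unit under test: a tiny container of words."""

    def __init__(self):
        self.words = []

    def add(self, word):
        self.words.append(word)

    def count(self):
        return len(self.words)


class WordListTest(unittest.TestCase):
    def setUp(self):
        # Called before each test method, so no state leaks between cases.
        self.word_list = WordList()

    def test_new_list_is_empty(self):
        # One question: does a new list start with no words?
        self.assertEqual(self.word_list.count(), 0)

    def test_add_increases_count_by_one(self):
        # One question: does adding a word raise the count by one?
        self.word_list.add("dabo")
        self.assertEqual(self.word_list.count(), 1)
```

Each method passes or fails on its own assertion, and the two cases can run in any order because neither sees the other's WordList.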

So how do we do this? Let's look at an example class that we want to create. The example class has the following requirements:

  1. Count the number of words in a piece of text
  2. Count the number of sentences in a piece of text

So how many test cases do we have? Those of you who said 2 are wrong. Remember that a test case answers a SINGLE question about the code. Each of those requirements has several test cases. We will now break the requirements down further into the rules that the test cases will be based on.

  • Rules for counting number of words in a piece of text.
    1. Delimiters that mark a division between words are the space, tab, newline, and carriage return characters.
    2. A word can have any number of delimiter characters before or after it.
    3. Adjacent non-delimiter characters are part of the same word.
    4. A word can have 0 delimiter characters directly in front of it if it is the first word in the string.
    5. A word can have 0 delimiter characters directly after it if the last character in the word is the last character in the string.
  • Rules for counting number of sentences in a piece of text.
    1. A sentence is marked by a capital letter at the beginning and a full-stop at the end.
    2. Full-stops can be periods, question marks, and exclamation points.
    3. A full-stop must come at the end of a word.
    4. A sentence can be one word.
    5. Consecutive full-stop characters are not treated as separate sentences.
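The sentence rules above could be approximated with a single regular expression. This is only one possible reading of the rules, not a finished implementation: the function name count_sentences is hypothetical, and the sketch treats any capital letter that is eventually followed by a word-final full-stop as the start of a sentence.

```python
import re

# A capital letter, optionally followed by text that ends in a
# non-delimiter character (the end of a word), then a run of one or more
# full-stop characters that is counted as a single sentence ending.
_SENTENCE = re.compile(r"[A-Z](?:[^.?!]*[^ \t\n\r.?!])?[.?!]+")


def count_sentences(text):
    """Hypothetical helper: count sentences under the simplified rules."""
    return len(_SENTENCE.findall(text))


print(count_sentences("Hello world. How are you? Fine!"))  # 3
print(count_sentences("Wait... What?!"))                   # 2
print(count_sentences("no capital here."))                 # 0
```

Text that begins in lower case but contains a capitalized word before the full-stop would still be counted here, which is the kind of edge case the test cases need to pin down.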

For simplicity, we will ignore the use of periods in acronyms, abbreviations, and name prefixes. Now that we have the rules, we can define the API. The class will have 2 methods, countWords and countSentences. Next we put the above rules into requirements that match the API.

  • countWords should return the correct number of words for all strings with no consecutive delimiter characters.
  • countWords should return 0 if the string is all delimiter characters.
  • countWords should return 0 if the string length is 0.
  • If you take a string and insert a delimiter character next to another delimiter character, the number of words should be the same.
  • If you take a string and insert any number of delimiter characters at the beginning of the string, the number of words should be the same.
  • If you take a string and insert any number of delimiter characters at the end of the string, the number of words should be the same.
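The countWords requirements above map naturally onto one test method each. The sketch below is an assumption-laden illustration: the class name TextCounter and the regex-based implementation are invented here so the tests have something to run against, and only the method name countWords comes from the article.

```python
import re
import unittest

# Delimiters named in the rules: space, tab, newline, and carriage return.
_DELIMITERS = re.compile(r"[ \t\n\r]+")


class TextCounter:
    """Hypothetical class name; one possible implementation of countWords."""

    def countWords(self, text):
        # Splitting on runs of delimiters leaves empty strings at the edges
        # when the text starts or ends with a delimiter, so drop them.
        return len([word for word in _DELIMITERS.split(text) if word])


class CountWordsTest(unittest.TestCase):
    def setUp(self):
        self.counter = TextCounter()

    def test_single_delimiters_between_words(self):
        text = "one two\tthree\nfour\rfive"
        self.assertEqual(self.counter.countWords(text), 5)

    def test_all_delimiters_returns_zero(self):
        self.assertEqual(self.counter.countWords(" \t\n\r "), 0)

    def test_empty_string_returns_zero(self):
        self.assertEqual(self.counter.countWords(""), 0)

    def test_doubled_delimiter_keeps_count(self):
        self.assertEqual(self.counter.countWords("one  two"),
                         self.counter.countWords("one two"))

    def test_leading_delimiters_keep_count(self):
        self.assertEqual(self.counter.countWords("\t \n one two"),
                         self.counter.countWords("one two"))

    def test_trailing_delimiters_keep_count(self):
        self.assertEqual(self.counter.countWords("one two \r\n\t"),
                         self.counter.countWords("one two"))
```

In a strict TDD workflow the six test methods would be written first and would all fail until countWords is implemented.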

I need to finish this. Stay tuned....It gets interesting.

Where To Place Test Code

The developers of the unittest module we are using say the following: "You can place the definitions of test cases and test suites in the same modules as the code they are to test (e.g. widget.py), but there are several advantages to placing the test code in a separate module, such as 'widgettests.py':

  • The test module can be run standalone from the command line
  • The test code can more easily be separated from shipped code
  • There is less temptation to change test code to fit the code it tests without a good reason
  • Test code should be modified much less frequently than the code it tests
  • Tested code can be refactored more easily
  • Tests for modules written in C must be in separate modules anyway, so why not be consistent?
  • If the testing strategy changes, there is no need to change the source code"
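A separate test module along the lines of the quoted 'widgettests.py' might look roughly like the sketch below. The Widget class is a stand-in defined inline only so the example runs on its own; in a real layout it would live in widget.py and be imported, and its attributes here are assumptions made for illustration.

```python
# widgettests.py (hypothetical): test code kept apart from the code it tests.
import unittest


# In a real layout this would be "from widget import Widget".
class Widget:
    """Stand-in for the shipped class; not part of the test module proper."""

    def __init__(self, name):
        self.name = name
        self.size = (50, 50)

    def resize(self, width, height):
        self.size = (width, height)


class WidgetTest(unittest.TestCase):
    def setUp(self):
        self.widget = Widget("The widget")

    def test_default_size(self):
        self.assertEqual(self.widget.size, (50, 50))

    def test_resize_changes_size(self):
        self.widget.resize(100, 150)
        self.assertEqual(self.widget.size, (100, 150))
```

With the real import in place, a module like this can be run from the command line with python -m unittest widgettests, and nothing in widget.py has to change if the tests do.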

There will be a separate test directory for all of the Dabo tests. Please follow the code structure and put the test code in there. The placement of mock objects and adapters has not been decided yet, so use your best judgment for those.