Extra lesson: This topic was not covered during our bootcamp. The material below was taught by Dr. Gavin Simpson at the SWC Bootcamp @ Dalhousie University (NS, Canada).
Materials: If you have not already done so, please download the lesson materials for this bootcamp, unzip them, go to the directory testing, and open (double click) the file testing.Rproj to open RStudio.
Unless you are going to write code without testing it, you are always going to incur the cost of testing.
The difference between having unit tests and not having them is the difference between the cost of writing the test and the cost of running it compared to the cost of testing by hand.
If the cost of writing a unit test is 2 minutes and the cost of running the unit test is practically 0, but the cost of manually testing the code is 1 minute, then you break even when you have run the test twice. (source)
Testing is often introduced as a last-minute thing, but most scientists who write code do an informal version of testing as they develop.
Software testing is a process by which one or more expected behaviours and results from a piece of software are exercised and confirmed. Well-chosen tests will confirm expected code behaviour for the extreme boundaries of the input domains, output ranges, parametric combinations, and other behavioural edge cases.
Unless you write flawless, bug-free, perfectly accurate, fully precise, and predictable code every time, you must test your code in order to trust it.
So when should you test? The three right answers are: always, early, and often. The longer answer is that testing either before or after your software is written will improve your code, but testing after your program is used for something important is too late.
It seems like extra work, but testing will save you time:

* Decreased frustration. Bugs tend to appear very close to hard deadlines; testing allows you to quickly identify where the problem is and fix it.
* More confidence in the code.
* Better code structure. Code that is easy to test is usually better designed; tests sometimes make you see that large, complicated functions should be broken down into smaller, more manageable chunks.
* Freedom to make changes or updates without worrying too much, because you know your tests will catch any issues.
Getting more serious about testing has totally changed my approach towards software development over the last year. I find that I now write programs that are better separated into component parts, that define their roles more clearly, that have fewer bugs or unexpected behaviours and that are easier to modify as I go along.
We'll use the testthat package to make testing easy and intuitive. This is a brilliant package that scales up from one-off tests to detailed suites well suited to large packages.
library(testthat)
In the previous section we created a function that linearly rescales values.
rescale <- function(x, r.out) {
    p <- (x - min(x)) / (max(x) - min(x))       # proportion of the way along the range of x
    r.out[[1]] + p * (r.out[[2]] - r.out[[1]])  # map that proportion onto the output range
}
This is a simple function, and one that we could use elsewhere. But especially if we do use it elsewhere, we want to know how it behaves, so we write tests partly to document how it will react in particular edge cases.
Tests also mean that if we depend on the function, we are free to change how it is implemented internally (adding a new argument, changing the underlying algorithm, etc.): if the tests still pass, and we have written them well, then the code that depends on the function will still behave correctly.
Behaving correctly means:

* the range of the output is r.out;
* if r.out is the same as the range of the input data (range(x)), the data should be unchanged.

There are also corner cases to consider; we already ran through some of these when developing the function the first time.
x <- rnorm(20)
r.out <- c(0.1, 1.4)
range(rescale(x, r.out)) == r.out
expect_that(range(rescale(x, r.out)), equals(r.out))
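We can test the second property the same way: if r.out is just range(x), the output should equal the input. A minimal sketch, using equals() for its numerical tolerance:

expect_that(rescale(x, range(x)), equals(x)) # identity when r.out is range(x)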
Note that this does not produce output! It will only produce output if the test fails, in which case it will appear as an error. Alternatively, when running non-interactively, we'll see indications that individual tests have passed.
That is the idea. There are some issues around where to store the tests, but that's not hard to sort out.
Recall the skewness() function you wrote yesterday:
variance <- function(x) {
    n <- length(x)   # number of observations
    xbar <- mean(x)  # mean of x
    sum((x - xbar)^2) / (n - 1)  # var(x)
}
skewness <- function(x) {
    n <- length(x)    # number of observations
    xbar <- mean(x)   # mean of x
    skew <- sum((x - xbar)^3) / (n - 2)  # scaled third central moment
    skew <- skew / var(x)^(3/2)          # standardise by variance^(3/2)
    skew
}
What things could you do to test that skewness() returns correct values or behaves appropriately?
We could check that the function gets the sign right for data with known skewness:
set.seed(42)
x <- rlnorm(100)
hist(x)
expect_more_than(skewness(x), 0) ## ok
expect_less_than(skewness(x), 0) ## throws error
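We could also check that roughly symmetric data gives a skewness close to zero. Exact equality won't hold for a random sample, so here is a sketch using a loose bound (the seed, sample size, and threshold are arbitrary choices):

set.seed(1)                             # reproducible sample
x <- rnorm(10000)                       # symmetric distribution
expect_less_than(abs(skewness(x)), 0.1) # ok: sample skewness is near zero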
testthat provides a family of expectations. equals() checks equality with a numerical tolerance. Don't use == for comparisons involving floating point operations!
sqrt(2)^2 == 2
In base R use all.equal(sqrt(2)^2, 2); in testthat use expect_that(foo, equals(bar)) or expect_equal(foo, bar).
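The comparison that was FALSE with == above passes once a tolerance is allowed:

expect_equal(sqrt(2)^2, 2) # passes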
expect_that(10, equals(10)) # passes
expect_that(10, equals(10 + 1e-7)) # passes
expect_that(10, equals(10 + 1e-6)) # fails
expect_that(10, equals(11)) # fails
expect_equal(variance(1:10), var(1:10)) # passes
is_identical_to() checks exact equality with identical() (this can be surprising with decimal numbers).

expect_that(10, is_identical_to(10))         # passes
expect_that(10, is_identical_to(10 + 1e-10)) # fails
is_a() checks that an object inherits from a specified class (via inherits()).

model <- lm(mpg ~ cyl, mtcars)
expect_that(model, is_a("lm"))
matches() matches a character vector against a regular expression.

string <- "Testing is fun!"
expect_that(string, matches("Testing")) # passes
prints_text() matches the printed output from an expression against a regular expression.

a <- list(1:10, letters)
expect_that(str(a), prints_text("List of 2"))     # passes
expect_that(str(iris), prints_text("data.frame")) # passes
shows_message() checks that an expression shows a message.

expect_that(library(mgcv), shows_message("This is mgcv"))
gives_warning() expects that you get a warning.

expect_that(log(-1), gives_warning())
expect_that(log(-1), gives_warning("NaNs produced"))
expect_that(log(0), gives_warning()) # fails
throws_error() verifies that the expression throws an error. You can also supply a regular expression which is applied to the text of the error. This one is very useful.

expect_that(1 / 2, throws_error())             # fails: no error is thrown
expect_that(seq_along(1:NULL), throws_error()) # passes
is_true() is a useful catch-all if none of the other expectations does what you want; it checks that an expression is true.

x <- require(plyr)
expect_that(x, is_true())
To run a whole suite of tests at once:

* Put your functions in a file (e.g., functions.R).
* Put your tests in files whose names begin with test- (e.g., test-rescale.R).
* Load testthat and run all the test files in the directory:

library(testthat)
test_dir(".")
Storing the tests in a separate directory ends up being the better long-term bet, but you can run into pathname issues when doing that.
Start with the rescale function from before:
rescale <- function(x, r.out) {
    p <- (x - min(x)) / (max(x) - min(x))
    r.out[[1]] + p * (r.out[[2]] - r.out[[1]])
}
Write tests to check that:

* the range of the output is r.out;
* invalid r.out inputs (wrong length, wrong type) are handled sensibly.

Instructors: Code that works through this is available in exercises.R.
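For the invalid-input checks, one possible shape (a sketch: as currently written, rescale() fails with unhelpful subscript or type errors for a bad r.out, and part of the exercise is to turn these into deliberate, informative errors, e.g. with stopifnot()):

x <- rnorm(20)
expect_that(rescale(x, 0.1), throws_error())         # wrong length
expect_that(rescale(x, c("a", "b")), throws_error()) # wrong type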
Acknowledgements: This material was developed by Rich FitzJohn and modified by Gavin Simpson, drawing on material developed by Katy Huff, Rachel Slaybaugh, Anthony Scopatz and Karthik Ram.