Need for software testing
By Dmitry Kabanov
This post discusses how a lack of tests can make it difficult to make progress in a software project: at later stages it becomes hard to build on components developed earlier if those components are unreliable.
Recently I attended a class on uncertainty quantification for partial differential equations. As a postdoc, I participate in the class only partially; however, I have a group of students whom I help with the homework.
The first homework required using Monte Carlo methods with differential equations with random coefficients to estimate the uncertainty in the solutions of these equations. As the person with experience in writing PDE solvers, I wrote the solver for the PDE used in this homework and shared it with the others. Of course, after writing the solver, I wrote a test checking that the observed order of accuracy is correct (second order, in this case), as I strongly believe that automated testing is extremely important in software, and the test of the observed order of accuracy is the best test for numerical solvers.
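To make the idea concrete, here is a minimal sketch of such an observed-order-of-accuracy test (the function names are illustrative, not the ones from my code). It uses a second-order central difference for the derivative as a stand-in for a solver, measures the error on two grids, and checks that halving the step reduces the error by about a factor of \(2^2\):

```python
import numpy as np

def central_diff_error(n):
    """Max error of the central-difference derivative of sin on [0, 1]."""
    x = np.linspace(0.0, 1.0, n + 1)
    h = x[1] - x[0]
    u = np.sin(x)
    du_approx = (u[2:] - u[:-2]) / (2 * h)  # second-order central difference
    return np.max(np.abs(du_approx - np.cos(x[1:-1])))

def observed_order(coarse_error, fine_error):
    """Observed order p from errors on grids with spacing h and h/2."""
    return np.log2(coarse_error / fine_error)

e_coarse = central_diff_error(64)
e_fine = central_diff_error(128)
p = observed_order(e_coarse, e_fine)
assert abs(p - 2.0) < 0.1, f"expected second order, got p = {p:.2f}"
```

The same pattern applies to a full PDE solver: run it on a sequence of refined grids against a known solution and assert that the observed order matches the theoretical one.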
So, after writing the solver, I was sure that it worked correctly, and we started to do the Monte Carlo exercises with it. It quickly turned out that we were getting unexpected results. In particular, I was doing an exercise on rare-event simulation, but the event proposed in the exercise was not rare at all! I spent some time trying to understand what could be wrong with the importance-sampling Monte Carlo that I used for this rare-event simulation but could not find any bugs. I did not suspect the PDE solver, as I was sure it worked correctly: I had a test showing that the order of convergence was correct, hence there could be no problem there.
However, another student who used my solver told me that while estimating the convergence rate of the bias error, she observed not the expected second-order rate but only first order. That finally convinced me that the problem was with the PDE solver, as the convergence rate of the bias error depends directly on the convergence rate of the PDE solver.
After a two-hour investigation of the solver function, during which I saw no apparent mistakes, I realized that the problem was that my test suite for the PDE solver was too weak. Indeed, a single test that checks convergence only for a particular PDE with constant coefficients is not enough to ensure the correctness of the solver.
The solver solves an equation of the elliptic type:
\[ \begin{aligned} \mathrm{div} \left( a(x, \omega) \, \nabla u(x, \omega) \right) &= f(x), \quad x \in D, \\ u(x \in \partial D) &= 0, \end{aligned} \]
where the source term \(f(x)\) was given, so I hardcoded it into the solver function.
As it was now clear that I needed a test with a nonconstant coefficient \(a(x, \omega)\), I used the method of manufactured solutions to develop this more stringent test. Specifically, I chose the form of the coefficient \(a(x, \omega)\) and the solution \(u(x, \omega)\) and then found the required source term \(f(x)\) by substituting them into the equation. Consequently, the solver function was generalized by adding an argument for a source-term callback.
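As an illustration of the method (with a coefficient and solution chosen here for the example, not necessarily the ones I used), one can let `sympy` do the substitution, which avoids algebra mistakes when deriving the source term:

```python
import sympy as sp

x = sp.symbols("x")
a = 1 + x**2                       # chosen nonconstant coefficient a(x)
u = sp.sin(sp.pi * x)              # chosen exact solution; u(0) = u(1) = 0
f = sp.simplify(sp.diff(a * sp.diff(u, x), x))  # f = (a u')' implied by the PDE
print(f)
```

The solver is then run with this `f` and compared against the chosen `u`; any correct second-order discretization of \(\mathrm{div}(a \nabla u)\) must converge at rate two on this problem.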
After adding this test, it was clear that the solver did not give the correct order of convergence: it was one instead of two! Now, having the test, I started to analyze the solver again. As the problem occurred only with the variable coefficient and not with the constant one, it was clear that there was some problem with the construction of the matrix \(A\): elliptic equations are solved numerically by building a system of linear algebraic equations \[ A u = f, \] where \(u\) and \(f\) are functions discretized on the grid.
Usually, the matrix \(A\) is sparse, and I used the function `scipy.sparse.spdiags` to build it.
What I had not realized initially, when programming the solver, is that this function has some unintuitive behavior when constructing upper diagonals. Precisely, if you have the following vectors to build a tridiagonal matrix:

```python
e1 = [-1, -2, -3, -4]
e2 = [10, 20, 30, 40]
e3 = [ 1,  2,  3,  4]
```
then the call to `spdiags` produces the following result:

```python
from scipy import sparse
print(sparse.spdiags([e1, e2, e3], [-1, 0, 1], 4, 4).todense())
```

```
[[10  2  0  0]
 [-1 20  3  0]
 [ 0 -2 30  4]
 [ 0  0 -3 40]]
```
Note that the first value of the upper-diagonal vector is not used!
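(If one had to stay with `spdiags`, a workaround would be to pre-shift the upper-diagonal vector by one entry, since `spdiags` discards the leading entries of upper diagonals. This is just a sketch of the idea; the placeholder value `0` is never used.)

```python
import numpy as np
from scipy import sparse

e1 = np.array([-1, -2, -3, -4])
e2 = np.array([10, 20, 30, 40])
e3 = np.array([1, 2, 3, 4])

# Shift e3 right by one: spdiags ignores the first entry of an upper diagonal.
e3_shifted = np.concatenate(([0], e3[:-1]))   # [0, 1, 2, 3]
A = sparse.spdiags([e1, e2, e3_shifted], [-1, 0, 1], 4, 4).toarray()
```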
Fortunately, `scipy` provides another function, `diags`, which has more intuitive behavior. The call to `diags` produces the expected result:
```python
print(sparse.diags([e1, e2, e3], [-1, 0, 1], shape=(4, 4)).todense())
```

```
[[10.  1.  0.  0.]
 [-1. 20.  2.  0.]
 [ 0. -2. 30.  3.]
 [ 0.  0. -3. 40.]]
```
After switching from `spdiags` to `diags`, the PDE solver started to work correctly and passed both tests: for constant and for variable coefficients. We were then able to use it for the Monte Carlo simulations and finish the homework.
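To tie the pieces together, here is a compact sketch of the repaired workflow (illustrative code, not the actual homework solver): a standard second-order finite-difference discretization of \((a u')' = f\) assembled with `sparse.diags` and verified against a manufactured solution.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def solve(a, f, n):
    """Solve (a(x) u'(x))' = f(x) on [0, 1] with u(0) = u(1) = 0."""
    x = np.linspace(0.0, 1.0, n + 1)
    h = x[1] - x[0]
    am = a(0.5 * (x[:-1] + x[1:]))     # coefficient sampled at cell midpoints
    # Tridiagonal operator for the n - 1 interior unknowns.
    lower = am[1:-1]
    main = -(am[:-1] + am[1:])
    upper = am[1:-1]
    A = sparse.diags([lower, main, upper], [-1, 0, 1]) / h**2
    u = np.zeros(n + 1)
    u[1:-1] = spsolve(A.tocsc(), f(x[1:-1]))
    return x, u

# Manufactured solution: a = 1 + x, u = sin(pi x), hence
# f = (a u')' = pi cos(pi x) - pi^2 (1 + x) sin(pi x).
a = lambda x: 1 + x
u_exact = lambda x: np.sin(np.pi * x)
f = lambda x: np.pi * np.cos(np.pi * x) - np.pi**2 * (1 + x) * np.sin(np.pi * x)

errors = []
for n in (32, 64):
    x, u = solve(a, f, n)
    errors.append(np.max(np.abs(u - u_exact(x))))
order = np.log2(errors[0] / errors[1])    # should be close to 2
```

Because the coefficient here is nonconstant, this test would have caught the `spdiags` bug immediately: with a misplaced upper diagonal, `order` drops to about one.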
The point of this post is that when one works on multicomponent code like in this case (a Monte Carlo simulator plus a PDE solver), it is crucial to test the individual components independently. Moreover, it is crucial to write good tests: if tests are too “easy”, they give a false sense of safety. Only when tests are strict enough can one be sure that the different software components work correctly and can be combined to solve a computational task.