White-Box Testing: Symbolic Execution and Search-Based Testing

White-box testing generates test inputs from the internal structure of the program. Unlike black-box testing, which uses only the specification, this method uses the code itself to generate tests. The goal is to generate a test suite that achieves high structural coverage.

Two techniques for this are Symbolic Execution and Search-Based Testing.

1. Symbolic Execution

Symbolic execution explores multiple program execution paths by executing the program with symbolic inputs rather than concrete values.

As the symbolic executor analyzes the code, it builds algebraic expressions for the program's variables. For example, if a variable x starts as the symbolic value X_0 and the code is x = x + 1;, the executor records the new state of x as the expression X_0 + 1.

The Three Components of a Symbolic State

The symbolic executor maintains a "symbolic abstraction state" with three parts:

Location State: The program counter, or what line of code is being executed (e.g., l3).
Symbolic State (Store): A map of all variables to their current symbolic expressions (e.g., $x \to X_0 + 1$ ).
Path Condition (PC): A formula that represents the conjunction of all branch conditions that must be true to reach the current line of code. It starts as true.

How It Works: Stepping Through an Example

When the executor encounters a condition like if (x > 10), it forks, exploring both possibilities:

Path 1 (True Branch): It assumes the condition is true and adds $$X_0 > 10$$ to its path condition.
Path 2 (False Branch): It assumes the condition is false and adds $\neg(X_0 > 10)$ (or $X_0 \le 10$ ) to its path condition.

When a path terminates, the executor uses a constraint solver to find a concrete value for the symbolic input $$X_0$$ that satisfies the final path condition. This solution is the test input.

For example, if a path terminates with the PC $(X_0 > 10) \land (X_0 \cdot 2 = 30)$ , the solver would return $$X_0 = 15$$ . This generates a test case, x = 15, that executes that path.

Challenges and Solutions

This technique faces path explosion. An if statement in a loop can create an infinite number of paths. To manage this, tools use search heuristics (like Depth-First Search or Breadth-First Search) to prioritize which path to explore next.

2. Search-Based Software Testing (SBST)

A different white-box approach frames test generation as an optimization problem. This is used for generating test data to cover difficult branches.

The process uses a metaheuristic search algorithm, like Hill Climbing.

Goal: The search targets a specific branch.
Fitness Function: A fitness function evaluates how good a candidate solution (a test input) is. This function guides the search.
Search: The algorithm starts with a random test input. It explores neighboring solutions (e.g., changing an input value by +1 or -1) and moves to the neighbor with better fitness. This repeats until it finds a solution that covers the target.

The fitness function is a combination of Approach Level (how close the code got to the target branch) and Branch Distance (how close the condition was to evaluating to the desired outcome).