Regression Testing
Software changes. Programmers fix bugs, add features, and refactor code. Every change risks breaking existing behavior.
Regression testing retests functionality after a change to provide confidence that the new code does not have unexpected side effects.
The simplest strategy is retest-all: run the entire test suite after every commit. As the number of tests grows, this becomes slow and costly.
Three families of techniques reduce that cost: test suite minimization, test selection, and test prioritization.
1. Test Suite Minimization
- Idea: Permanently reduce the test suite size by removing redundant tests.
- Goal: Find a minimal subset \(TS \subseteq T\) that satisfies the same adequacy criterion (e.g., 100% branch coverage) as the original suite \(T\).
- Key Concepts:
* Redundant Test: A test \(t\) is redundant if its removal does not reduce the adequacy of the test suite. Formally, for an adequacy metric \(m\), \(t\) is redundant if \(m(T \setminus \{t\}) = m(T)\).
* Essential Test: A test \(t\) is essential if it is the only test that covers a specific test goal (like a branch).
- Risk: Minimization trades suite size against fault-detection ability. A test that is redundant with respect to coverage may still be the only test that reveals a specific fault, and removing it loses that fault permanently.
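The definitions above can be sketched in a few lines of Python. This is only an illustration under assumptions not stated in the notes: coverage is given as a hypothetical dictionary from test names to sets of covered goals (for example branch ids), and the reduction uses a simple greedy heuristic. Finding a truly minimum subset is a set-cover problem, so the greedy result is small but not guaranteed to be the smallest.

```python
from typing import Dict, Set

# Assumed input format: test name -> set of covered goals (e.g. branch ids).
Coverage = Dict[str, Set[str]]


def all_goals(coverage: Coverage) -> Set[str]:
    """Every goal covered by at least one test in the suite."""
    return set().union(*coverage.values())


def essential_tests(coverage: Coverage) -> Set[str]:
    """Tests that are the only ones covering at least one goal."""
    owners: Dict[str, Set[str]] = {}
    for test, goals in coverage.items():
        for goal in goals:
            owners.setdefault(goal, set()).add(test)
    return {next(iter(tests)) for tests in owners.values() if len(tests) == 1}


def is_redundant(test: str, coverage: Coverage) -> bool:
    """True if removing `test` leaves the set of covered goals unchanged."""
    others = {t: g for t, g in coverage.items() if t != test}
    return all_goals(others) == all_goals(coverage)


def minimize(coverage: Coverage) -> Set[str]:
    """Greedy reduction: keep essential tests, then add the test that
    covers the most still-uncovered goals until coverage is preserved."""
    target = all_goals(coverage)
    selected = essential_tests(coverage)
    covered = set().union(*(coverage[t] for t in selected))
    while covered != target:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(coverage), key=lambda t: len(coverage[t] - covered))
        selected.add(best)
        covered |= coverage[best]
    return selected


suite = {
    "t1": {"b1", "b2"},
    "t2": {"b2", "b3"},
    "t3": {"b3", "b4"},
    "t4": {"b2"},
}
print(sorted(essential_tests(suite)))  # ['t1', 't3']
print(is_redundant("t4", suite))       # True
print(sorted(minimize(suite)))         # ['t1', 't3']
```

Here `t1` is essential because only it covers `b1`, and `t3` because only it covers `b4`. Together they already cover every branch, so `t2` and `t4` are dropped, which is exactly the situation in which the risk described above applies.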
2. Test Selection
- Idea: Temporarily select a subset of tests from the full suite to run for a specific change.
- Goal: Run only the tests that are relevant to the code that was modified.
- How it Works (CFA-based): One approach analyzes the program's Control-Flow Automaton (CFA). The algorithm compares the CFA of the original program (\(P\)) with the CFA of the modified program (\(P'\)).
1. It traverses the CFAs of both versions in lockstep, following matching edges from the program entry.
2. If it finds an edge in the original program \(P\) that has been modified or deleted in \(P'\), it adds all tests that covered that edge to the selected test suite \(TS\).
* In effect, the algorithm selects every test whose execution passes through changed code and can therefore behave differently in \(P'\). Tests that never reach a changed edge are skipped.
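A rough sketch of this comparison follows. It rests on simplifying assumptions that are not part of the notes: each CFA is a hypothetical dictionary mapping a node to its outgoing edges (operation label to successor node), an edge counts as unchanged only if the same operation label leaves the corresponding node in \(P'\), and `edge_tests` records which tests covered each edge of \(P\).

```python
from typing import Dict, Set, Tuple

# Assumed representation: node -> {operation label -> successor node}.
CFA = Dict[str, Dict[str, str]]
# An edge of P is identified by (source node, operation label).
Edge = Tuple[str, str]


def select_tests(
    cfa_old: CFA,
    cfa_new: CFA,
    edge_tests: Dict[Edge, Set[str]],
    entry_old: str = "n0",
    entry_new: str = "n0",
) -> Set[str]:
    """Walk both CFAs in lockstep and collect the tests that covered
    an edge of the old program that is modified or missing in the new one."""
    selected: Set[str] = set()
    visited: Set[Tuple[str, str]] = set()
    worklist = [(entry_old, entry_new)]
    while worklist:
        pair = worklist.pop()
        if pair in visited:
            continue
        visited.add(pair)
        n_old, n_new = pair
        new_edges = cfa_new.get(n_new, {})
        for op, succ_old in cfa_old.get(n_old, {}).items():
            if op in new_edges:
                # Same operation in both versions: keep comparing downstream.
                worklist.append((succ_old, new_edges[op]))
            else:
                # Edge was modified or deleted: select its covering tests
                # and stop exploring behind it.
                selected |= edge_tests.get((n_old, op), set())
    return selected


# P computes an absolute value; P' changes the negative branch.
p_old = {
    "n0": {"x > 0": "n1", "!(x > 0)": "n2"},
    "n1": {"y = x": "n3"},
    "n2": {"y = -x": "n3"},
    "n3": {},
}
p_new = {
    "n0": {"x > 0": "n1", "!(x > 0)": "n2"},
    "n1": {"y = x": "n3"},
    "n2": {"y = 0": "n3"},
    "n3": {},
}
covered_by = {
    ("n0", "x > 0"): {"t_pos"},
    ("n1", "y = x"): {"t_pos"},
    ("n0", "!(x > 0)"): {"t_neg", "t_zero"},
    ("n2", "y = -x"): {"t_neg", "t_zero"},
}
print(sorted(select_tests(p_old, p_new, covered_by)))  # ['t_neg', 't_zero']
```

Only the tests that took the modified branch are selected; `t_pos` never reaches the changed edge and is skipped. The `visited` set keeps the traversal finite when the CFAs contain loops.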
3. Test Prioritization
- Idea: Re-order the test suite so that the tests most likely to find faults run first.
- Goal: Maximize the rate of fault detection. This provides faster feedback to developers.
- How it Works: Tests are sorted based on a heuristic.
* Greedy Prioritization: Sort the test suite by a previously computed metric, such as branch coverage. The test that covered the most branches runs first.
* Additional Greedy Prioritization: Prioritize based on new information. The algorithm repeatedly selects the test that covers the most not-yet-covered goals. This ensures the first tests cover a wide breadth of the code.
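The two heuristics can be compared with a small sketch. The coverage dictionary, the function names, and the alphabetical tie-breaking are assumptions made for illustration. The sketch also assumes that once no remaining test adds new coverage, the rest are simply appended by name; other variants reset the covered set at that point and continue.

```python
from typing import Dict, List, Set

# Assumed input format: test name -> set of branches covered in a previous run.
Coverage = Dict[str, Set[str]]


def greedy_order(coverage: Coverage) -> List[str]:
    """Greedy: sort by total number of covered branches, largest first."""
    return sorted(coverage, key=lambda t: (-len(coverage[t]), t))


def additional_greedy_order(coverage: Coverage) -> List[str]:
    """Additional greedy: repeatedly pick the test that covers the most
    branches not yet covered by the tests already chosen."""
    remaining = dict(coverage)
    covered: Set[str] = set()
    order: List[str] = []
    while remaining:
        # Most new branches first; ties are broken alphabetically.
        best = min(remaining, key=lambda t: (-len(remaining[t] - covered), t))
        order.append(best)
        covered |= remaining.pop(best)
    return order


suite = {
    "t1": {"b1", "b2", "b3"},
    "t2": {"b1", "b2", "b3", "b4"},
    "t3": {"b5", "b6"},
    "t4": {"b5"},
}
print(greedy_order(suite))             # ['t2', 't1', 't3', 't4']
print(additional_greedy_order(suite))  # ['t2', 't3', 't1', 't4']
```

Plain greedy runs `t1` second even though `t2` already covers all of its branches. Additional greedy moves `t3` forward because it contributes two new branches, so all six branches are covered after only two tests.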