Fuzzing and Random Testing
While techniques like equivalence partitioning are systematic, other black-box techniques rely on randomness. These methods find bugs and test system robustness.
1. Random Testing
Random testing generates random input values. This is easy to automate and scales well. The goal is to generate well-formed input, meaning the inputs satisfy the program's type system or other invariants.
There are two approaches to generating these random values:
- Option 1: Sample from a Fixed Pool: Tools like Randoop maintain a small, fixed set of values for primitive types (e.g.,
-1, 0, 1, 10, 100forint) and randomly pick from this pool. - Option 2: Sample from the Valid Input Space: Tools like QuickCheck choose a value at random from the entire valid domain (e.g., any
intfromInteger.MIN_VALUEtoInteger.MAX_VALUE).
Tools also create complex object inputs by chaining methods. The tool might start with List<Integer> l = new ArrayList<>();, then call a method on that object, like l.add(10);, and then call another, like l.add(20);, to build complex data.
2. Fuzzing: Testing for Robustness
Fuzzing is a type of random testing. Its goal is to test for robustness and security. A fuzzer feeds a program invalid, uncommon, or undefined input data to see if it crashes, throws an unhandled exception, or has a memory violation.
- Basic Fuzzing: The fuzzer generates a string of random characters and feeds it to the program's input.
- Mutational Fuzzing: Instead of starting from scratch, mutational fuzzing starts with a set of initial inputs (seeds). These seeds are valid inputs (e.g., a real PNG file for an image parser). The fuzzer generates new test cases by applying small, random mutations to these seeds, such as:
* Flipping a random bit in the file.
* Deleting a random character.
* Inserting a random character.
* Splicing two seed files together.
If a mutated input causes a crash, the fuzzer saves it as a "relevant input" to be analyzed later.
3. Grammar-Based Fuzzing
A challenge for mutational fuzzing is that most random mutations create invalid inputs. If you are testing a SQL database, most random strings are not valid SQL queries and will be rejected by the parser immediately. This means the deeper logic of the program is never tested.
The solution is grammar-based fuzzing.
This technique uses a grammar (like a Backus-Naur Form definition) that describes the syntax of valid inputs. The fuzzer generates a test case by applying production rules from the grammar.
For example, given a simplified grammar for a URL:
<url> ::= <protocol> "://" <domain>
<protocol> ::= "http" | "https"
<domain> ::= <word> "." <word>
<word> ::= "a" | "b" | "c" | ...
The fuzzer starts with <url>, expands it, and chooses rules until it has generated a complete, valid input like https://google.com.
A problem is ensuring this process terminates, as grammars can be recursive. To solve this, tools calculate the minimal cost (shortest number of steps to produce a terminal string) for each rule and use this to finish a derivation and ensure it terminates.