[ruby/yarp] Move fuzz docs into docs/ directory, reference in README

https://github.com/ruby/yarp/commit/97547b6192
This commit is contained in:
Kevin Newton 2023-06-30 14:28:54 -04:00 коммит произвёл git
Родитель 136fcd5118
Коммит d49a92d036
2 изменённых файлов: 94 добавлений и 1 удалений

Просмотреть файл

@ -146,7 +146,7 @@ The default encoding for a string created with method String.new is:
- For a \String object argument, the encoding of that string.
- For a string literal, the script encoding;
see {Script Encoding}[https://docs.ruby-lang.org/en/master/encodings_rdoc.html#label-Script+Encoding].
see {Script Encoding}[rdoc-ref:encodings.rdoc@Script+Encoding].
In either case, any encoding may be specified:

93
docs/fuzzing.md Normal file
Просмотреть файл

@ -0,0 +1,93 @@
# Fuzzing
We use fuzzing to test the various entrypoints to the library. The fuzzer we use is [AFL++](https://aflplus.plus). All files related to fuzzing live within the `fuzz` directory, which has the following structure:
```
fuzz
├── corpus
│   ├── parse fuzzing corpus for parsing (a symlink to our fixtures)
│   ├── regexp fuzzing corpus for regexp
│   └── unescape fuzzing corpus for unescaping strings
├── dict a AFL++ dictionary containing various tokens
├── docker
│   └── Dockerfile for building a container with the fuzzer toolchain
├── fuzz.c generic entrypoint for fuzzing
├── heisenbug.c entrypoint for reproducing a crash or hang
├── parse.c fuzz handler for parsing
├── parse.sh script to run parsing fuzzer
├── regexp.c fuzz handler for regular expression parsing
├── regexp.sh script to run regexp fuzzer
├── tools
│   ├── backtrace.sh generates backtrace files for a crash directory
│   └── minimize.sh generates minimized crash or hang files
├── unescape.c fuzz handler for unescape functionality
└── unescape.sh script to run unescape fuzzer
```
## Usage
There are currently three fuzzing targets
- `yp_parse_serialize` (parse)
- `yp_regexp_named_capture_group_names` (regexp)
- `yp_unescape_manipulate_string` (unescape)
Respectively, fuzzing can be performed with
```
make fuzz-run-parse
make fuzz-run-regexp
make fuzz-run-unescape
```
To end a fuzzing job, interrupt with CTRL+C. To enter a container with the fuzzing toolchain and debug utilities, run
```
make fuzz-debug
```
# Out-of-bounds reads
Currently, encoding functionality implementing the `yp_encoding_t` interface can read outside of inputs. For the time being, ASAN instrumentation is disabled for functions from src/enc. See `fuzz/asan.ignore`.
To disable ASAN read instrumentation globally, use the `FUZZ_FLAGS` environment variable e.g.
```
FUZZ_FLAGS="-mllvm -asan-instrument-reads=false" make fuzz-run-parse
```
Note, that this may make reproducing bugs difficult as they may depend on memory outside of the input buffer. In that case, try
```
make fuzz-debug # enter the docker container with build tools
make build/fuzz.heisenbug.parse # or .unescape or .regexp
./build/fuzz.heisenbug.parse path-to-problem-input
```
# Triaging Crashes and Hangs
Triaging crashes and hangs is easier when the inputs are as short as possible. In the fuzz container, an entire crash or hang directory can be minimized using
```
./fuzz/tools/minimize.sh <directory>
```
e.g.
```
./fuzz/tools/minimize.sh fuzz/output/parse/default/crashes
```
This may take a long time. In the the crash/hang directory, for each input file there will appear a minimized version with the extension `.min` appended.
Backtraces for crashes (not hangs) can be generated en masse with
```
./fuzz/tools/backtrace.sh <directory>
```
Files with basename equal to the input file name with extension `.bt` will be created e.g.
```
id:000000,sig:06,src:000006+000190,time:8480,execs:18929,op:splice,rep:4
id:000000,sig:06,src:000006+000190,time:8480,execs:18929,op:splice,rep:4.bt
```