Fuzzing libfyaml


I spent some time fuzzing libfyaml, a feature-rich YAML parser/emitter written in C. I ended up finding and reporting 68 bugs, all of which have been fixed by the maintainer. Here’s how it went.

libfyaml is a YAML library that supports YAML 1.2. It provides a pretty extensive API - parsing, emitting, path expressions, document manipulation, a reflection/type system, and more. It’s written in C, which makes it a great candidate for fuzzing with sanitizers.

The library has a lot of surface area. Beyond basic parse/emit, there are features like ypath expressions, document iterators, alias resolution, node manipulation (insert, remove, sort), and a whole reflection system for mapping YAML to C types. Plenty of code paths to explore.

I used libFuzzer as the fuzzing engine, combined with:

  • AddressSanitizer (ASan) - for detecting memory errors (use-after-free, buffer overflows, memory leaks)
  • UndefinedBehaviorSanitizer (UBSan) - for catching undefined behavior (signed integer overflow, invalid casts, etc.)

The build flags looked like this:

-fsanitize=fuzzer,address,signed-integer-overflow,undefined

I also had a separate reproducer binary (without the fuzzer sanitizer linked in) for debugging and triaging crashes outside of the fuzzing loop.
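Concretely, the two builds might look something like this (the file names and the way the reproducer is assembled are illustrative, not the exact layout of my harness sources):

```shell
# Fuzzing binary: libFuzzer provides main() and drives
# LLVMFuzzerTestOneInput in a loop
clang -g -O1 \
    -fsanitize=fuzzer,address,signed-integer-overflow,undefined \
    harness.c -lfyaml -o fuzz_libfyaml

# Reproducer: same sanitizers, no fuzzer runtime; a plain main()
# feeds a single input file to the harness for triage/debugging
clang -g -O1 \
    -fsanitize=address,signed-integer-overflow,undefined \
    harness.c repro_main.c -lfyaml -o repro_libfyaml
```

The only difference is dropping `fuzzer` from `-fsanitize`, which removes the libFuzzer runtime and its `main()`, leaving room for a plain one that reads a file and calls the same test code.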

I went with a single-harness, multi-target approach. Instead of writing separate fuzzers for each API, I wrote one harness that exercises as many code paths as possible in a single run.

Each fuzzer input is split into two parts: a header and the actual fuzz data.

The header occupies the first 36 bytes of the input: nine uint32_t fields, each controlling a different aspect of the library’s configuration (the trailing flags pointer is filled in by the harness at runtime, not read from the input):

struct seed_data_t {
  uint32_t seed1;  // parser flags
  uint32_t seed2;  // emitter flags
  uint32_t seed3;  // node walk flags
  uint32_t seed4;  // path parse flags
  uint32_t seed5;  // node style
  uint32_t seed6;  // extended emitter flags
  uint32_t seed7;  // primitive type selection
  uint32_t seed8;  // type info flags
  uint32_t seed9;  // C generation flags
  struct flags_t *flags;
} __attribute__((aligned(16)));

  • seed1 - parser flags (YAML version, document resolution, recycling, accelerators, depth limits, JSON mode, ypath aliases, duplicate keys, …)
  • seed2 - emitter flags (sort keys, output mode like block/flow/JSON/pretty, indentation, width, doc start/end markers, …)
  • seed3 - node walk flags (follow mode, pointer type like YAML/JSON/ypath, URI encoding, max depth, …)
  • seed4 - path expression parser flags (recycling, accelerators)
  • seed5 - node style (any, flow, block, plain, single/double quoted, literal, folded, alias)
  • seed6 - extended emitter flags (color, visible whitespace, extended indicators, …) with output destination bits masked off to avoid side effects
  • seed7 - primitive type selection (bool, char, int, float, double, etc. - for the reflection system)
  • seed8 - type info flags (const, volatile, restrict, anonymous, incomplete, …)
  • seed9 - C code generation flags (indentation style, comment format)

The setup_flags function translates these raw seeds into the actual flag values used by the library:

void setup_flags(struct seed_data_t *seed_data, struct flags_t *flags) {
  flags->parse_flags      = seed_data->seed1;
  flags->emitter_flags    = seed_data->seed2;
  flags->node_walk_flags  = seed_data->seed3;
  flags->path_parse_flags = seed_data->seed4;
  flags->node_style       = fy_node_style__vals[seed_data->seed5 % array_elements(fy_node_style__vals)];
  flags->extended_emitter_flags = seed_data->seed6 & ~(
        FYEXCF_OUTPUT_STDOUT
      | FYEXCF_OUTPUT_STDERR
      | FYEXCF_OUTPUT_FILE
      | FYEXCF_OUTPUT_FD
      | FYEXCF_NULL_OUTPUT
      | FYEXCF_OUTPUT_FILENAME
    );
  flags->primitive_type = primitive_type_names[seed_data->seed7 % array_elements(primitive_type_names)];
  flags->type_info_flags = seed_data->seed8;
  flags->cgen_flag = cgen_flag_combos[seed_data->seed9 % array_elements(cgen_flag_combos)];
}

Most seeds are used directly as bitmasks - the raw uint32_t value is passed as the flag set. For enum-like fields (node style, primitive type, C generation flags), the seed is taken modulo the number of valid values to pick one from a predefined list. The extended emitter flags mask off output destination bits to avoid side effects like writing to stdout/stderr/files during fuzzing. This means the coverage-guided engine has a direct, transparent mapping between input bytes and every configuration bit. When the fuzzer mutates bytes 0-35, it’s directly toggling library features.

In LLVMFuzzerTestOneInput, the header is extracted at the start of each iteration:

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  if (size <= sizeof(struct seed_data_t)) return 0;

  struct flags_t flags = {0};
  struct seed_data_t _seed;
  struct seed_data_t *seed = &_seed;
  memcpy(seed, (struct seed_data_t *)data, sizeof(struct seed_data_t));
  setup_flags(seed, &flags);
  seed->flags = &flags;
  data += sizeof(struct seed_data_t);
  size -= sizeof(struct seed_data_t);

  // ... run all test functions with remaining data ...
}

Everything after byte 36 is the fuzz data - the YAML content. For string-based APIs, this data is null-terminated before being passed to the library. For binary/file-pointer APIs, the raw bytes are passed directly via fmemopen.

The main reason I went with a header-based design instead of, say, using FuzzedDataProvider to consume bytes on the fly, is transparency for the fuzzing engine. With a fixed-offset header, libFuzzer can directly correlate specific byte positions with specific code coverage changes. If flipping bit 3 at offset 0 enables FYPCF_RESOLVE_DOCUMENT and that opens up a new code path, the fuzzer learns that immediately. With a stream-based approach, the relationship between byte positions and their effects shifts depending on how many bytes were consumed before, which makes it harder for the engine to learn.

It also keeps things simple. The struct layout is fixed, so reproducing a crash is trivial - just look at the first 36 bytes to know exactly what configuration was active.

On each iteration, the harness runs through 35+ test functions sequentially, covering:

  • Parsing - from strings, from file pointers
  • Emitting - to file pointers, strings, buffers, via emitter objects
  • Document operations - cloning, comparing, inserting, removing, scanf
  • Path expressions - building, executing, taking results
  • Iterators - document, node, and token iteration
  • Alias resolution - with various ypath configurations
  • Reflection/type system - packed blobs, type lookups, C code generation, type context parsing and emitting
  • Other operations - composition, checkpoint/rollback, sequence and mapping manipulation

Every test function runs on every input. This means the fuzzer doesn’t need to “discover” how to select a target - it just hammers everything at once.

This approach has real downsides. Running all 35+ test functions on every input is slow. Each fuzzer iteration does a lot of redundant work - most test functions will likely reject or quickly bail out on an input that was really only useful for one or two of them. This directly hurts the executions-per-second metric, which is one of the most important factors in fuzzing effectiveness.

There’s also the fixed header overhead. Every input must be at least 37 bytes long (36 for the header + at least 1 byte of data), and the first 36 bytes are always “spent” on configuration rather than actual YAML content. For a library where interesting bugs can hide in short, carefully crafted inputs, wasting 36 bytes on a header is not ideal. The fuzzer has to work harder to find minimal triggering inputs.

Another issue is that the same fuzz data is shared across all test functions with wildly different expectations. A path expression parser expects something like /foo/bar, while the YAML parser expects valid YAML, and the reflection system expects packed binary blobs. The same input can’t realistically be good for all of them at once, so most test functions end up exercising only their error/early-exit paths for any given input.

A more disciplined approach would be to write separate harnesses per feature group, or at least use the header to dispatch to a single test function per iteration. That said, the brute-force approach worked well enough here - it found 68 bugs, so I can’t complain too much.

In total, I reported 68 issues to the libfyaml GitHub repository. All of them have been acknowledged and fixed by the maintainer. Here’s a breakdown by category:

Use-after-free

By far the most common class of bugs, making up over a third of all findings. These showed up across many different subsystems - node freeing, emitting, path expression execution and walk results, input reference counting, document iterators, alias resolution, accelerator lookups, and list operations. This category also includes a double-free in fy_input_free. ASan caught all of these.

Memory leaks

The second most common category. Various places where allocated memory wasn’t properly freed on error paths or during cleanup. These appeared in parsing, emitting, path expression building, node building from strings and file pointers, alias resolution, atom iteration, input allocation, and the reflection type system.

Buffer overflows and out-of-bounds accesses

Heap buffer overflows, stack buffer overflows, global buffer overflows, and out-of-bounds accesses. These showed up in the accelerator growth function, UTF-8 handling (fy_utf8_get, fy_utf8_get_branch, fy_utf8_get_generic), path traversal with StrtolFixAndCheck, emitter setup, and token text preparation.

Undefined behavior

Caught by UBSan. These included converting infinity to an integer type, shifts or operations on signed integers that triggered overflow, misaligned memory access in the bundled xxhash implementation, and similar issues in the emitter and atom handling code.

Null pointer dereferences

Segmentation faults from null pointer dereferences in path expression building, atom line iteration, and reader input generation - triggered by specific flag combinations like FYPCF_RESOLVE_DOCUMENT with FYPCF_YPATH_ALIASES.

Stack exhaustion via infinite recursion

Infinite recursion triggered by specific combinations of parser flags - particularly when document resolution and ypath aliases were enabled together. Different flag combinations (with disabled recycling vs disabled accelerators) hit different recursion paths.

Hangs

Cases where the library would hang indefinitely - one in fy_document_buildf and one in fy_check_ref_loop during document building with alias resolution.

Unbounded memory consumption

A case where path expression execution could consume unbounded memory.

Documentation

One report was about a function (fy_token_get_utf8_length) referenced in documentation that didn’t actually exist in the library.


All issues were reported with reproducer inputs and sanitizer stack traces. The maintainer was responsive and fixed everything. The full list of reports can be found on the libfyaml issues page.

Every single one of these 68 bugs was found without any AI assistance. The fuzzer was written by hand, the triage was done manually, and the reports were filed by a human. No LLMs, no copilots, no AI-guided fuzzing - just libFuzzer, sanitizers, and patience.

I originally had plans to turn some of the more interesting bugs (particularly the use-after-free ones) into a CTF challenge. But in the current AI era, where LLMs can solve most CTF challenges without much effort, I decided not to waste time on that. It just doesn’t feel worth the effort anymore when the solution process can be shortcut so easily.

Fuzzing libfyaml turned out to be quite productive. 68 bugs across a wide range of categories - from use-after-free and buffer overflows to memory leaks and undefined behavior. The single-harness multi-target approach worked well here because the library has so many interconnected features that benefit from being exercised together.