Write a fuzzer

Fuchsia's toolchain supports fuzzing using LLVM's [libFuzzer]. To create a fuzzer for a particular interface, you need to implement a fuzz target function that uses a provided sequence of bytes to exercise the interface. The sequence of bytes is referred to as a fuzzer "input". The fuzz target function is used by libFuzzer to search for inputs that cause panics or other errors.
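
The contract described above can be sketched in a few lines of Go. This is only an illustration, not libFuzzer's actual loop, which mutates inputs under coverage guidance and does not recover from crashes. The names `fuzzTarget` and `runOne` are hypothetical, and the "bug" prefix is an invented failure condition.

```go
package main

import "fmt"

// fuzzTarget is a hypothetical fuzz target: it receives raw bytes and
// exercises some code. Here the "code" panics on inputs starting with "bug".
func fuzzTarget(data []byte) {
	if len(data) >= 3 && data[0] == 'b' && data[1] == 'u' && data[2] == 'g' {
		panic("simulated defect in the code under test")
	}
}

// runOne calls the target with one input and reports whether it panicked.
// A real fuzzing engine would stop and save the crashing input instead.
func runOne(target func([]byte), input []byte) (crashed bool) {
	defer func() {
		if r := recover(); r != nil {
			crashed = true
		}
	}()
	target(input)
	return false
}

func main() {
	// A fixed list stands in for the inputs an engine would generate.
	inputs := [][]byte{[]byte(""), []byte("ok"), []byte("bug!")}
	for _, in := range inputs {
		fmt.Printf("input %q crashed=%v\n", in, runOne(fuzzTarget, in))
	}
}
```

The target itself never reports success or failure; the engine infers a finding from the panic.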

Sample Code to Fuzz {#samples}

For each of the examples below, assume you want to test code like the following:

  • {C/C++}

    class Parser {
     public:
      Parser(const std::string &name, uint32_t flags);
      virtual ~Parser();
      int Parse(const uint8_t *buf, size_t len);
    };
  • {Rust}

    struct ToyStruct {
        n: u8,
        s: String,
    }
    
    fn toy_example(input: ToyStruct) -> Result<u8, &'static str>;
  • {Go}

    package mypackage
    
    func HandleBytes(s []byte) {
      // Complicated code goes here
    }
    
    type MyStruct struct {
      // Various fields
    }
    
    func HandleStruct(s MyStruct) {
      // Complicated code goes here
    }

Simple Fuzz Target Function {#basic}

For each language, your fuzz target function will use the bytes provided to call the code you want to fuzz. If the interface being fuzzed has documented constraints on its parameters, you can reject inputs that don't meet those constraints. You can also ignore returned errors since failing gracefully on invalid parameters is correct behavior.

  • {C/C++}

    For C and C++, the fuzz target function must have the signature extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) and return 0:

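    #include <stddef.h>
    #include <stdint.h>

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
      Parser parser("", 0);
      // Example constraint: skip inputs shorter than 5 bytes.
      if (size < 5) {
        return 0;
      }
      // The result of Parse is ignored; failing gracefully is correct behavior.
      parser.Parse(data, size);
      return 0;
    }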

    Place this code in a source file adjacent to the code being fuzzed, as you would with a unit test. For example, the code above might be in parser-fuzztest.cc.

  • {Rust}

    For Rust, the recommended approach is to use the Arbitrary trait discussed in the next section.

    It is also possible to create a "manual" fuzz target function that is analogous to the simple fuzz target functions in other languages. This function must take a reference to a byte slice as its single parameter, return nothing, and have the #[fuzz] attribute:

    use fuzz::fuzz;
    
    #[fuzz]
    fn toy_example_u8(input: &[u8]) {
        if input.is_empty() {
            return;
        }
        let n = input[0];
        if let Ok(s) = std::str::from_utf8(&input[1..]) {
            let _ = toy_example(ToyStruct { n, s: s.to_string() });
        }
    }

    As with unit tests, this code can be placed in the same file as the code it is testing. For example, the code above might be in toy_example/src/lib.rs.

  • {Go}

    For Go, the fuzz target function must have the signature func Fuzz(s []byte) and return nothing.

    func Fuzz(s []byte) {
      mypackage.HandleBytes(s)
    }

    This function can be added to and exported from either an existing Go package, or a new Go package if no existing package is a good fit.
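
The two guidelines at the start of this section, rejecting inputs that violate documented constraints and ignoring returned errors, can be sketched as follows. `HandleRecord`, `minRecordLen`, and the magic byte are hypothetical stand-ins invented for this example; they are not part of the sample code above.

```go
package main

import (
	"errors"
	"fmt"
)

// minRecordLen is a hypothetical documented constraint: callers must pass
// at least this many bytes.
const minRecordLen = 4

// HandleRecord is a hypothetical function under test. It returns an error
// for malformed input instead of panicking.
func HandleRecord(b []byte) (int, error) {
	if len(b) < minRecordLen {
		return 0, errors.New("record too short")
	}
	if b[0] != 0x7f {
		return 0, errors.New("bad magic byte")
	}
	return len(b) - 1, nil
}

// Fuzz skips inputs that violate the documented constraint and discards
// returned errors, since a graceful failure is not a bug.
func Fuzz(s []byte) {
	if len(s) < minRecordLen {
		return
	}
	_, _ = HandleRecord(s)
}

func main() {
	Fuzz(nil)                          // rejected by the length check
	Fuzz([]byte{0x00, 0x01, 0x02, 3})  // error from HandleRecord is ignored
	Fuzz([]byte{0x7f, 0x01, 0x02, 3})  // well-formed input
	fmt.Println("all inputs handled without a panic")
}
```

Only a panic or similar fatal error inside `HandleRecord` would surface as a finding; ordinary error returns are expected for most generated inputs.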

Support for Fuzzing More Complex Types {#advanced}

Each language has utilities to facilitate making more complicated fuzz target functions:

  • {C/C++}

    The FuzzedDataProvider class provided by LLVM can help you map portions of the provided data to more complex types.

    For example:

      #include <fuzzer/FuzzedDataProvider.h>
    
      extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        FuzzedDataProvider provider(data, size);
        auto flags = provider.ConsumeIntegral<uint32_t>();
        auto name = provider.ConsumeRandomLengthString();
        Parser parser(name, flags);
        auto buf = provider.ConsumeRemainingBytes<uint8_t>();
        parser.Parse(buf.data(), buf.size());
        return 0;
      }

    There are two notable advantages to using this library:

    • First, it makes it easier to rapidly write a fuzzer.
    • Second, it is designed to dynamically split inputs in such a way that the fuzzer can efficiently create new inputs from coverage data.

    There is one notable disadvantage:

    • Since inputs are dynamically split, it is more difficult to provide a pre-existing corpus. It is still feasible to provide a dictionary.
  • {Rust}

    You can create a fuzz target function that takes one or more inputs with the Arbitrary trait from the arbitrary crate. This is the recommended approach.

    To write a fuzz target function that automatically transforms arbitrary inputs:

    1. If needed, implement the Arbitrary trait for the types used by your test code. If possible, the recommended way to do this is by automatically deriving the trait. Otherwise, this can be done "by hand" by following the crate's instructions.

      For example, in your src/lib.rs:

      use arbitrary::Arbitrary;
      
      #[derive(Arbitrary)]
      struct ToyStruct { ... }
    2. Create a function with the #[fuzz] attribute that passes the necessary parameters to the code you wish to test.

      For example, in your src/lib.rs:

      use fuzz::fuzz;
      
      #[fuzz]
      fn toy_example_arbitrary(input: ToyStruct) {
          let _ = toy_example(input);
      }
  • {Go}

    You can use the "encoding/binary" package to "cast" or transform bytes to fixed-size types:

    import (
      "bytes"
      "encoding/binary"
    )
    
    func Fuzz(b []byte) {
      var s mypackage.MyStruct
      buf := bytes.NewReader(b)
      if err := binary.Read(buf, binary.LittleEndian, &s); err != nil {
        return
      }
      mypackage.HandleStruct(s)
    }
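
As a further sketch of the "encoding/binary" approach, the example below decodes a fixed-size header from the front of the input and passes the leftover bytes on as a payload, loosely mirroring how the C++ example splits one input into several arguments. The `Header` type and `splitInput` helper are hypothetical. The sketch relies on `binary.Read` accepting only fixed-size values, so the struct uses exported fields of fixed-size types and no strings or slices.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// Header is a hypothetical struct made only of fixed-size fields, which is
// what binary.Read requires. It occupies 8 bytes on the wire.
type Header struct {
	Flags uint32
	Kind  uint8
	Tag   [3]byte
}

// splitInput decodes a Header from the front of b and returns the unread
// remainder as the payload. ok is false when b is too short.
func splitInput(b []byte) (h Header, payload []byte, ok bool) {
	r := bytes.NewReader(b)
	if err := binary.Read(r, binary.LittleEndian, &h); err != nil {
		return Header{}, nil, false
	}
	// r.Len() is the number of bytes binary.Read left unread.
	return h, b[len(b)-r.Len():], true
}

func main() {
	in := []byte{0x01, 0x00, 0x00, 0x00, 0x02, 'a', 'b', 'c', 0xde, 0xad}
	h, payload, ok := splitInput(in)
	fmt.Println(ok, h.Flags, h.Kind, string(h.Tag[:]), payload)
}
```

A fuzz target could call `splitInput` first, return early when `ok` is false, and then hand the header and payload to the code under test.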

Next, you can build your newly created fuzzer using GN and Ninja.