Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reported coverage results do not match corpus #12986

Open
TheOneric opened this issue Jan 28, 2025 · 10 comments
Open

Reported coverage results do not match corpus #12986

TheOneric opened this issue Jan 28, 2025 · 10 comments
Assignees

Comments

@TheOneric
Copy link

TheOneric commented Jan 28, 2025

#12687 brought up abysmal tag coverage and initially set out to fix it by adding sample files with the missing tags. However this approach was abandoned in favour of improving fuzzer performance, since all tags are already part of the fuzzing dictionary.
We since implemented a sample size limit in libass, though it’s not enabled yet in OSS-Fuzz’ build setup.

However, while investigating a good sample size cutoff point, it turned out all tags are already part of the corpus and measuring coverage locally yields much better figures. Yet results reported in the web claim some tags aren’t tested at all, which seems like an OSS-Fuzz bug.

We also noticed the fuzzer produces a bunch of Fontconfig warnings when run in OSS Fuzz environment, due to missing any configuration file which must be located at an absolute path. While the absence of any fonts likely affects coverage of later rendering stages, this has no impact on earlier parsing and thus cannot explain this.
As a tangent: any recommendations how to best include such a configuration file into the runtime environment? I didn’t find anything seeming helpful in the docs.

Btw, the previous, unreproducible memleak also still remains a complete mystery to us: #8598 (appears to be fixed now)

@jonathanmetzman
Copy link
Contributor

What fuzzer and job is thsi?

@jonathanmetzman jonathanmetzman self-assigned this Jan 29, 2025
@TheOneric
Copy link
Author

@TheOneric
Copy link
Author

TheOneric commented Jan 30, 2025

And for comparison; this is a locally generated coverage report using the public libass corpus: coverage_local_nolimit.html.gz. Here all tags in libass/libass/ass_parse.c are covered, while the report from OSS Fuzz omits a fair amount of tags.

@phi-go
Copy link
Contributor

phi-go commented Jan 30, 2025

I'm looking into this same problem for some research we are doing. Still investigating things, but I think the reason that coverage is so low is that the coverage measurement is aborted due to a timeout, see report below [1]. As I understand the code on the oss-fuzz side, a timeout will only cause a larger report but not actually an "error".

So for the report @TheOneric linked, the execution count of this line corresponds to the number of inputs added to the coverage report. So only 40 inputs are considered.

Similarly for this fuzzer harness of sudoers, the max rss memory usage is exceeded causing only a fraction of the corpus to be executed.

Though, I'm not sure why this issue does not show up for you locally @TheOneric. Does this explanation still sound plausible?

[1]

Fontconfig error: Cannot load default config file
Fontconfig error: Cannot load default config file
Fontconfig error: Cannot load default config file
ALARM: working on the last Unit for 139 seconds
       and the timeout value is 100 (use -timeout=N to change)
MS: 0 ; base unit: 0000000000000000000000000000000000000000
artifact_prefix='./'; Test unit written to ./timeout-8d3048bb47f40c9d57dc33757032f6d5c4556326
==57== ERROR: libFuzzer: timeout after 139 seconds
    #0 0x55ffa4ddf444 in __sanitizer_print_stack_trace /src/llvm-project/compiler-rt/lib/ubsan/ubsan_diag_standalone.cpp:31:3
    #1 0x55ffa4d5e498 in fuzzer::PrintStackTrace() /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerUtil.cpp:210:5
    #2 0x55ffa4d417a7 in fuzzer::Fuzzer::AlarmCallback() /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:304:5
    #3 0x7f03fd7a741f  (/lib/x86_64-linux-gnu/libpthread.so.0+0x1441f) (BuildId: 9a65bb469e45a1c6fbcffae5b82a2fd7a69eb479)
    #4 0x55ffa4e5c6e9 in hb_object_destroy<hb_face_t> /work/build/../../src/harfbuzz/src/hb-object.hh:288:7
    #5 0x55ffa4e5c6e9 in hb_face_destroy /work/build/../../src/harfbuzz/src/hb-face.cc:368:8
    #6 0x55ffa4e6f1c5 in hb_font_destroy /work/build/../../src/harfbuzz/src/hb-font.cc:1995:3
    #7 0x55ffa4df3964 in shape_harfbuzz /src/libass/libass/ass_shaper.c:721:9
    #8 0x55ffa4df3964 in ass_shaper_shape /src/libass/libass/ass_shaper.c:1013:16
    #9 0x55ffa4deab5d in ass_render_event /src/libass/libass/ass_render.c:2855:10
    #10 0x55ffa4de986e in ass_render_frame /src/libass/libass/ass_render.c:3396:17
    #11 0x55ffa4de093e in consume_track /src/libass/fuzz/fuzz.c:162:27
    #12 0x55ffa4de093e in LLVMFuzzerTestOneInput /src/libass/fuzz/fuzz.c:411:9
    #13 0x55ffa4d42d40 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:614:13
    #14 0x55ffa4d4c310 in fuzzer::Fuzzer::CrashResistantMergeInternalStep(std::__Fuzzer::basic_string<char, std::__Fuzzer::char_traits<char>, std::__Fuzzer::allocator<char>> const&, bool) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMerge.cpp:239:5
    #15 0x55ffa4d338c5 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:887:8
    #16 0x55ffa4d5ecf2 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #17 0x7f03fd582082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 0702430aef5fa3dda43986563e9ffcc47efbd75e)
    #18 0x55ffa4d2619d in _start (/out/libass_fuzzer+0x2e19d)

DEDUP_TOKEN: __sanitizer_print_stack_trace--fuzzer::PrintStackTrace()--fuzzer::Fuzzer::AlarmCallback()
SUMMARY: libFuzzer: timeout
MERGE-OUTER: attempt 2
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 1983618745
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1048576 bytes
MERGE-INNER: using the control file '/tmp/libFuzzerTemp.Merge54.txt'
MERGE-INNER: '/corpus/libass_fuzzer/regressions/ab3338ceac374f20b7518dd6d0f6b018-2' caused a failure at the previous merge step
MERGE-INNER: 10257 total files; 10219 processed earlier; will process 38 files now

@TheOneric
Copy link
Author

TheOneric commented Jan 30, 2025

Yes, only a small fraction of the corpus actually contributes to the reported coverage perfectly explains the result, thanks!
I’m a bit surprised this doesn't cause a more visible error or warning and also to encounter timeouts during coverage for samples which don't count as a timeout during fuzzing, but i guess there’s always some variance from machine performance which might push some samples previously close to timeouting over the limit.

Let’s hope this will then resolve itself once the input size limit is actually enabled in OSS Fuzz configuration.

@phi-go
Copy link
Contributor

phi-go commented Jan 30, 2025

Another easy performance improvement should be to silence the fontconfig warning as printing to stdout can be quite expensive. If the fontconfig file needs to be read for every execution, would it be possible to have that in memory instead of accessing the filesystem? Maybe it would also be a suitable fuzzer input?

@TheOneric
Copy link
Author

Yes, I’d like to silence the warning by making the config load succeed since otherwise we’re lacking fonts to render later.
I believe there’s no way to pass an in memory config to fontconfig, but suspect fontconfig caches as much as possible for the default system config.
I don’t think it makes much sense to take it as a fuzzer input for libass.

Preferably I’d like to get fontconfig working, since this will allow us to include fontprovider logic in fuzzing, which ime is one of the most likely areas to gain new bugs which stay undetected for a bit due to depending on particular fontname and system details.
If it cannot work at all, as a last resort we could disable all system font providers and only use a small bundled in-memory font. The drawbacks being the lack of provider coverage mentioned above and overhead from processing the memory font for samples which end up never using any font.

@phi-go
Copy link
Contributor

phi-go commented Feb 5, 2025

Sounds like a good plan. As it sounds like the fontconfig setup has little influence on the behavior, other than that it needs to exist, maybe do the setup in the Dockerfile? Or if you want to set it per fuzz target, do that in a LLVMFuzzerInitialize function, to avoid wasting time and having stateful LLVMFuzzerTestOneInput function.

@TheOneric
Copy link
Author

maybe do the setup in the Dockerfile?

Ideally I’d like to, yes. However, it needs to exist in the runtime container and to be able to use the default config path the file needs to be placed on a known absolute path. (Ideally /etc/fonts/fonts.conf to avoid needing to recompile fontconfig).
But afaik the only wa to get files into the runtime container is by copying to $OUT from the build container and i do not know under which path those files are later located in the runtime container or whether this path is stable at all

@phi-go
Copy link
Contributor

phi-go commented Feb 10, 2025

Hey, you are completely right, only the files in $OUT are available in the runtime container. Indeed the files under /etc/fonts are available during the build phase, so you could copy those and the fonts into $OUT as part of the Dockerfile. To get the path, it seems the recommended way is to use argv[0]. In a quick attempt the following code sets the env var but I didn't do the required copying in the Dockerfile, just used my local conf instead which points to non-existent directories. (https://www.freedesktop.org/software/fontconfig/fontconfig-user.html). So getting this the following output at the moment, I hope this help anyway.

INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
/out/fonts.conf
Fontconfig error: Cannot load default config file
Fontconfig error: Cannot load default config file
#elif ASS_FUZZMODE == FUZZMODE_LIBFUZZER
#include <libgen.h>
int LLVMFuzzerInitialize(int *argc, char ***argv) {
  char *exe_path = strdup((*argv)[0]);
  char *dir = dirname(exe_path);
  char fontpath[100];
  sprintf(fontpath, "%s/fonts.conf", dir);
  setenv("FONTCONFIG_FILE", fontpath, true);
  return 0;
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    char *fontpath = getenv("FONTCONFIG_FILE");
    printf("%s\n", fontpath);
    fflush(stdout);
    // OSS Fuzz docs recommend just returning 0 on too large input
    // libFuzzer docs tell us to use -1 to prevent addition to corpus
    // BUT HongFuzz' LLVMFuzzerTestOneInput hook can't handle it and
    // neither does OSS Fuzz convert this in their wrapper, see:
    // https://github.com/google/oss-fuzz/issues/11983
    if (!LEN_IN_RANGE(size))
        return 0;

    ASS_Track *track = NULL;

    // All return values but zero and -1 are reserved
    if (!init())
        return 0;

    track = ass_read_memory(ass_library, (char *)data, size, NULL);
    if (track) {
        consume_track(ass_renderer, track);
        ass_free_track(track);
    }

    ass_renderer_done(ass_renderer);
    ass_library_done(ass_library);
    ass_renderer = NULL;
    ass_library = NULL;

    return 0;
}
#else

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants