Add pass to print pr.sources to a SARIF file #749

Merged 10 commits, Nov 22, 2024
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -59,3 +59,5 @@ Testing/Temporary/*
_CPack_Packages/*

install/*

CMakeUserPresets.json
@@ -2,10 +2,10 @@

The VAST parser detection leverages a dialect-based approach, where program data manipulation is abstracted and reduced to a parser dialect. This results in an MLIR representation that combines control-flow constructs with parser-specific operations.

-To generate this representation, we provide the `detect-parsers` tool—a customized version of `mlir-opt` that converts VAST dialects into the parser dialect.
+To generate this representation, we provide the `vast-detect-parsers` tool—a customized version of `mlir-opt` that converts VAST dialects into the parser dialect.
To use the tool, simply run:
```bash
-detect-parsers -vast-hl-to-parser <input.mlir>
+vast-detect-parsers -vast-hl-to-parser <input.mlir>
```

Parser conversion can be enhanced with the use of function models, which specify how functions in programs should be interpreted. A default set of models is provided in `Conversion/Parser/default-parsers-config.yaml`. Additional configurations can be supplied via a pass parameter.
3 changes: 3 additions & 0 deletions docs/Tools/vast-front.md
@@ -35,6 +35,9 @@ Additional customization options include:
- `-vast-locs-as-meta-ids`
- Uses metadata identifiers instead of file locations for locations.

- `-vast-loc-attrs`
- When used in conjunction with `-vast-show-locs`, emits location data as MLIR attributes.

## Debugging and diagnostics

- `-vast-emit-crash-reproducer="reproducer.mlir"`
1 change: 1 addition & 0 deletions include/vast/Conversion/Parser/Passes.hpp
@@ -12,9 +12,10 @@
namespace vast {

std::unique_ptr< mlir::Pass > createHLToParserPass();
std::unique_ptr< mlir::Pass > createParserSourceToSarifPass();

// Generate the code for registering passes.
#define GEN_PASS_REGISTRATION
#include "vast/Conversion/Parser/Passes.h.inc"

Check failure on line 19 in include/vast/Conversion/Parser/Passes.hpp

GitHub Actions / cpp-linter (19, 22.04)

include/vast/Conversion/Parser/Passes.hpp:19:14 [clang-diagnostic-error]

'vast/Conversion/Parser/Passes.h.inc' file not found

} // namespace vast
11 changes: 11 additions & 0 deletions include/vast/Conversion/Parser/default-parsers-config.yaml
@@ -14,6 +14,17 @@
      - nodata # FILE * restrict stream
    category: source

# size_t fread(void * restrict buffer, size_t size, size_t count, FILE * restrict stream);
- function: fread
  model:
    return_type: nodata
    arguments:
      - data # void * restrict buffer
      - nodata # size_t size
      - nodata # size_t count
      - nodata # FILE * restrict stream
    category: source

# char * gets(char * str);
- function: gets
  model:
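
The `fread` entry above marks only the destination buffer as `data`, while the size, count, and stream arguments are `nodata`. A minimal C++ sketch of how such a model could be represented and queried follows. The names `data_kind`, `function_model`, `fread_model`, and `data_argument_indices` are hypothetical and chosen for illustration only; they are not VAST's actual types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical mirror of the YAML fields; not VAST's real data structures.
enum class data_kind { data, nodata };

struct function_model {
    std::string function;             // name of the modeled function
    data_kind return_type;            // whether the return value carries data
    std::vector<data_kind> arguments; // one entry per parameter, in order
    std::string category;             // e.g. "source"
};

// Model corresponding to the fread entry in the config.
inline function_model fread_model() {
    return {
        "fread",
        data_kind::nodata,
        { data_kind::data, data_kind::nodata, data_kind::nodata, data_kind::nodata },
        "source",
    };
}

// Returns the positions of the arguments that the model marks as `data`.
inline std::vector<std::size_t> data_argument_indices(const function_model &model) {
    std::vector<std::size_t> indices;
    for (std::size_t i = 0; i < model.arguments.size(); ++i) {
        if (model.arguments[i] == data_kind::data) {
            indices.push_back(i);
        }
    }
    return indices;
}
```

For `fread`, the only index returned is 0, the `buffer` parameter.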
3 changes: 2 additions & 1 deletion include/vast/Frontend/Options.hpp
@@ -91,8 +91,9 @@ namespace vast::cc

llvm::Twine disable(string_ref pipeline_name);

-constexpr option_t show_locs = "show-locs";
+constexpr option_t show_locs = "show-locs";
constexpr option_t locs_as_meta_ids = "locs-as-meta-ids";
+constexpr option_t loc_attrs = "loc-attrs";

constexpr option_t disable_unsupported = "disable-unsupported";

1 change: 1 addition & 0 deletions lib/vast/Conversion/Parser/PassesDetails.hpp
@@ -14,6 +14,7 @@ VAST_RELAX_WARNINGS
VAST_UNRELAX_WARNINGS

#include "vast/Dialect/Parser/Dialect.hpp"
#include "vast/Dialect/Parser/Ops.hpp"

#include "vast/Conversion/Passes.hpp"

4 changes: 3 additions & 1 deletion lib/vast/Frontend/Consumer.cpp
@@ -294,7 +294,9 @@ namespace vast::cc {
    void vast_stream_consumer::print_mlir_string_format(owning_mlir_module_ref mod) {
        // FIXME: we cannot roundtrip prettyForm=true right now.
        mlir::OpPrintingFlags flags;
-        flags.enableDebugInfo(vargs.has_option(opt::show_locs), /* prettyForm */ true);
+        flags.enableDebugInfo(
+            vargs.has_option(opt::show_locs), /* prettyForm */ !vargs.has_option(opt::loc_attrs)
+        );

        mod->print(*output_stream, flags);
    }
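
The change above makes the `prettyForm` argument depend on `loc-attrs`, while `show-locs` still decides whether locations are printed at all. A small sketch of that truth table, using a hypothetical `debug_info_flags` struct and `make_debug_info_flags` helper that stand in for the two arguments of `enableDebugInfo` (neither name exists in VAST or MLIR):

```cpp
#include <cassert>

// Hypothetical stand-in for the two arguments passed to enableDebugInfo.
struct debug_info_flags {
    bool enabled;     // print location information at all
    bool pretty_form; // use the pretty location form
};

// Mirrors the expression in the diff: locations are printed only with
// show-locs, and the pretty form is dropped when loc-attrs is given.
constexpr debug_info_flags make_debug_info_flags(bool show_locs, bool loc_attrs) {
    return { show_locs, !loc_attrs };
}
```

Note that `loc-attrs` alone leaves `enabled` false, which agrees with the documentation stating that the option applies in conjunction with `-vast-show-locs`.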
1 change: 1 addition & 0 deletions test/CMakeLists.txt
@@ -11,6 +11,7 @@ set(VAST_TEST_DEPENDS
vast-query
vast-opt
vast-front
vast-detect-parsers
)

add_lit_testsuite(check-vast "Running the VAST regression tests"
1 change: 1 addition & 0 deletions test/lit.cfg.py
@@ -64,6 +64,7 @@
"-nostdsysteminc"
]
),
ToolSubst('%vast-detect-parsers', command = 'vast-detect-parsers'),
ToolSubst('%file-check', command = 'FileCheck'),
ToolSubst('%cc', command = config.host_cc)
]
6 changes: 6 additions & 0 deletions test/parsers/rle-a.c
@@ -1,4 +1,6 @@
// RUN: %vast-front -vast-emit-mlir=hl %s -o - | %file-check %s -check-prefix=HL
// RUN: %vast-front -vast-show-locs -vast-loc-attrs -vast-emit-mlir=hl %s -o - | %vast-detect-parsers -vast-hl-to-parser -parser-source-to-sarif=output=/dev/stdout -o /dev/null | %file-check %s -check-prefix=SARIF
// REQUIRES: sarif

#include <stdio.h>
#include <stdlib.h>
@@ -38,10 +40,14 @@ void parse_binary_file(const char *filename) {

// Read header (4 bytes) - contains length of compressed data
uint32_t compressed_length;
// SARIF: "startColumn": 5,
// SARIF: "startLine": 45
fread(&compressed_length, sizeof(uint32_t), 1, file);

// Allocate memory for compressed data
uint8_t *compressed_data = (uint8_t *)malloc(compressed_length);
// SARIF: "startColumn": 5,
// SARIF: "startLine": 51
fread(compressed_data, sizeof(uint8_t), compressed_length, file);

// Close the file after reading
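
The `SARIF` checks above expect `"startColumn"` before `"startLine"`. FileCheck matches directives in order, and the tool dumps its output through `nlohmann::json`, whose default object type keeps keys sorted, so the alphabetical order is plausibly what the test relies on. The sketch below imitates that ordering with a plain `std::map`; `dump_region` is a hypothetical helper, not part of gap or nlohmann.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: prints a SARIF-like region object with keys in
// sorted order, the way a std::map-backed JSON object would.
inline std::string dump_region(const std::map<std::string, int> &region) {
    std::ostringstream os;
    os << "{";
    bool first = true;
    for (const auto &[key, value] : region) {
        os << (first ? "" : ", ") << '"' << key << "\": " << value;
        first = false;
    }
    os << "}";
    return os.str();
}
```

Inserting `startLine` before `startColumn` still prints `startColumn` first, because the map orders keys lexicographically.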
2 changes: 1 addition & 1 deletion tools/CMakeLists.txt
@@ -1,4 +1,4 @@
-add_subdirectory(detect-parsers)
+add_subdirectory(vast-detect-parsers)
add_subdirectory(vast-front)
add_subdirectory(vast-opt)
add_subdirectory(vast-query)
6 changes: 0 additions & 6 deletions tools/detect-parsers/CMakeLists.txt

This file was deleted.

30 changes: 0 additions & 30 deletions tools/detect-parsers/main.cpp

This file was deleted.

7 changes: 7 additions & 0 deletions tools/vast-detect-parsers/CMakeLists.txt
@@ -0,0 +1,7 @@
add_vast_executable(vast-detect-parsers
main.cpp
ParserSourceDetector.cpp

LINK_LIBS
MLIROptLib
)
31 changes: 31 additions & 0 deletions tools/vast-detect-parsers/ParserSourceDetector.cpp
@@ -0,0 +1,31 @@
// Copyright (c) 2024, Trail of Bits, Inc.

#ifdef VAST_ENABLE_SARIF
#include "SarifPasses.hpp"

#include "vast/Dialect/Parser/Ops.hpp"
#include "vast/Frontend/Sarif.hpp"

namespace vast {
    void ParserSourceDetector::runOnOperation() {
        getOperation().walk([&](pr::Source op) {
            gap::sarif::result result{
                .ruleId{ "pr-source" },
                .ruleIndex = 0,
                .kind = gap::sarif::kind::kInformational,
                .level = gap::sarif::level::kNote,
                .message{
                    .text{ { "Parser source detected" } },
                },
                .locations{},
            };
            if (auto loc = cc::sarif::mk_location(op.getLoc());
                loc.physicalLocation.has_value())
            {
                result.locations.push_back(std::move(loc));
            }
            results.push_back(result);
        });
    }
} // namespace vast
#endif
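
The pass above walks the module, emits one result per `pr::Source` operation, and attaches a location only when a physical location is available. A dependency-free sketch of that walk-and-collect shape is given below. All types (`toy_location`, `toy_op`, `toy_result`) and the function `collect_sources` are hypothetical stand-ins; the traversal here is a simple pre-order recursion and is not meant to reproduce the order of MLIR's `walk`.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for MLIR operations and SARIF results; all names here are
// hypothetical and only model the shape of the pass.
struct toy_location {
    std::string file;
    int line;
    int column;
};

struct toy_op {
    std::string name;                // e.g. "pr.source"
    std::optional<toy_location> loc; // absent when no physical location exists
    std::vector<toy_op> nested;      // child operations
};

struct toy_result {
    std::string rule_id;
    std::vector<toy_location> locations;
};

// Visits every operation and records one result per "pr.source"; a location
// is attached only when the operation has one.
inline void collect_sources(const toy_op &op, std::vector<toy_result> &results) {
    if (op.name == "pr.source") {
        toy_result result{ "pr-source", {} };
        if (op.loc.has_value()) {
            result.locations.push_back(*op.loc);
        }
        results.push_back(std::move(result));
    }
    for (const auto &child : op.nested) {
        collect_sources(child, results);
    }
}
```

A source without a location still produces a result, just with an empty `locations` list, which matches the behavior of the pass in the diff.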
27 changes: 27 additions & 0 deletions tools/vast-detect-parsers/SarifPasses.hpp
@@ -0,0 +1,27 @@
// Copyright (c) 2024, Trail of Bits, Inc.

#pragma once

#ifdef VAST_ENABLE_SARIF
#include "vast/Util/Warnings.hpp"

VAST_RELAX_WARNINGS
#include <mlir/IR/BuiltinOps.h>
#include <mlir/Pass/Pass.h>
#include <mlir/Pass/PassManager.h>
VAST_UNRELAX_WARNINGS

#include <gap/sarif/sarif.hpp>

namespace vast {
    struct ParserSourceDetector
        : mlir::PassWrapper< ParserSourceDetector, mlir::OperationPass< mlir::ModuleOp > >
    {
        std::vector< gap::sarif::result > &results;

        ParserSourceDetector(std::vector< gap::sarif::result > &results) : results(results) {}

        void runOnOperation() override;
    };
} // namespace vast
#endif
99 changes: 99 additions & 0 deletions tools/vast-detect-parsers/main.cpp
@@ -0,0 +1,99 @@
// Copyright (c) 2024, Trail of Bits, Inc.

#include <string>

#include "vast/Util/Warnings.hpp"

VAST_RELAX_WARNINGS
#include "mlir/IR/Dialect.h"
#include "mlir/IR/MLIRContext.h"
#include "mlir/InitAllDialects.h"
#include "mlir/Pass/PassOptions.h"
#include "mlir/Tools/mlir-opt/MlirOptMain.h"

#include <llvm/Support/FileSystem.h>
#include <llvm/Support/raw_ostream.h>
VAST_UNRELAX_WARNINGS

#include "vast/Conversion/Parser/Passes.hpp"
#include "vast/Dialect/Dialects.hpp"

#include "vast/Dialect/Parser/Dialect.hpp"

#include "SarifPasses.hpp"

namespace vast {

#ifdef VAST_ENABLE_SARIF
    struct SarifWriter : mlir::PassWrapper< SarifWriter, mlir::OperationPass< mlir::ModuleOp > >
    {
        std::vector< gap::sarif::result > results;
        std::string path;

        SarifWriter(std::string path) : path(path) {}

        void runOnOperation() override {}

        ~SarifWriter() {
            std::error_code ec;
            llvm::raw_fd_ostream os(path, ec, llvm::sys::fs::OF_None);
            if (ec) {
                VAST_FATAL("Failed to open file for SARIF output: {}", ec.message());
            }
            gap::sarif::root root{
                .version = gap::sarif::version::k2_1_0,
                .runs{
                    {
                        {
                            .tool{
                                .driver{
                                    .name{ "vast-detect-parsers" },
                                },
                            },
                            .results{ results },
                        },
                    }, },
            };

            nlohmann::json root_json = root;

            os << root_json.dump(2);
        }
    };

    struct SarifOptions : mlir::PassPipelineOptions< SarifOptions >
    {
        Option< std::string > out_path{ *this, "output",
                                        llvm::cl::desc("Output SARIF file path.") };
    };

    void registerSarifPasses() {
        mlir::PassPipelineRegistration< SarifOptions >(
            "parser-source-to-sarif", "Dumps all pr.source locations to a SARIF file.",
            [](mlir::OpPassManager &pm, const SarifOptions &opts) {
                auto writer = std::make_unique< SarifWriter >(opts.out_path);
                pm.addPass(std::make_unique< vast::ParserSourceDetector >(writer->results));
                pm.addPass(std::move(writer));
            }
        );
    }
#else

    void registerSarifPasses() {}
#endif
} // namespace vast

int main(int argc, char **argv) {
    mlir::DialectRegistry registry;
    // register dialects
    vast::registerAllDialects(registry);
    mlir::registerAllDialects(registry);

    vast::registerParserConversionPasses();
    vast::registerSarifPasses();
    registry.insert< vast::pr::ParserDialect >();

    return mlir::asMainReturnCode(
        mlir::MlirOptMain(argc, argv, "VAST Parser Detection driver\n", registry)
    );
}
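
In the pipeline registered above, `SarifWriter` owns the results vector and writes it out in its destructor, while `ParserSourceDetector` only holds a reference to that vector. The following sketch models this ownership arrangement with standard-library types only. `toy_writer`, `toy_detector`, and `run_pipeline` are hypothetical names, and the output format is plain lines rather than SARIF.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for SarifWriter: owns the results and writes them
// out when it is destroyed.
struct toy_writer {
    std::vector<std::string> results;
    std::ostream &out;

    explicit toy_writer(std::ostream &out) : out(out) {}

    ~toy_writer() {
        for (const auto &r : results) {
            out << r << '\n';
        }
    }
};

// Hypothetical stand-in for ParserSourceDetector: only borrows the vector,
// so it must not be used after the writer is gone.
struct toy_detector {
    std::vector<std::string> &results;

    explicit toy_detector(std::vector<std::string> &results) : results(results) {}

    void run(const std::string &finding) { results.push_back(finding); }
};

// Runs a detector against a writer and returns what the writer flushed.
inline std::string run_pipeline(const std::vector<std::string> &findings) {
    std::ostringstream os;
    {
        toy_writer writer(os);
        toy_detector detector(writer.results);
        for (const auto &f : findings) {
            detector.run(f);
        }
    } // detector is destroyed first, then the writer flushes into os
    return os.str();
}
```

One consequence of this design, at least in the sketch, is that nothing is written until the writer object is destroyed, so the output appears at teardown rather than when the detector finishes.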
2 changes: 1 addition & 1 deletion www/mkdocs.yml
@@ -24,7 +24,7 @@ nav:
- Optimizer: Tools/vast-opt.md
- Query: Tools/vast-query.md
- REPL: Tools/vast-repl.md
-- Detect Parsers: Tools/detect-parsers.md
+- Detect Parsers: Tools/vast-detect-parsers.md
- Related Projects: Projects/related.md
- Benchmarks:
- LLVM Single Source: Benchmarks/single-source-results.md