Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement ebpf_check_constraints_at_label #729

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 22 additions & 2 deletions src/asm_parse.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,9 @@ using crab::number_t;
#define DOT "[.]"
#define TYPE R"_(\s*(shared|number|packet|stack|ctx|map_fd|map_fd_programs)\s*)_"

// regex to match "require [assertion1, assertion2, ...]"
#define REQUIRE R"_(\s*require\s*\[\s*(.*?)\s*\]\s*)_"

static const std::map<std::string, Bin::Op> str_to_binop = {
{"", Bin::Op::MOV}, {"+", Bin::Op::ADD}, {"-", Bin::Op::SUB}, {"*", Bin::Op::MUL},
{"/", Bin::Op::UDIV}, {"%", Bin::Op::UMOD}, {"|", Bin::Op::OR}, {"&", Bin::Op::AND},
Expand Down Expand Up @@ -133,7 +136,8 @@ static Deref deref(const std::string& width, const std::string& basereg, const s
};
}

Instruction parse_instruction(const std::string& line, const std::map<std::string, label_t>& label_name_to_label) {
InstructionOrConstraintsSet parse_instruction(const std::string& line,
const std::map<std::string, label_t>& label_name_to_label) {
Comment on lines +139 to +140
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to add an instruction (assertion) of the kind you want. Then do the check from inside ebpf_checker like any other check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I think it's better to do it after #787

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to allow an external party to provide concrete state at a label and to then compare that concrete state to the abstract state for that label.

I.e. if verifier believes:
At label 4:-1 -> "r0.uvalue=0"

But execution of the BPF program results in:
At label 4:-1 -> "r0.uvalue=1"

Then this is a bug, in the VM (or possibly the verifier). This is what I am using the ebpf_check_constraints_at_label for and it has been hugely successful in finding bugs (much more than simply comparing the post-conditions after the program executes).

I am fine redoing this work once the code re-org is complete, but the goal here is to be able to reason about the program state post verification (i.e. when the program is actually executing).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the scenario. Does it happen online or offline? Do you want to query the final result of the computation during the execution of the program?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The BPF VM is single stepped through the instructions. After each instruction, a series of assertions is generated from the concrete state of the VM and checked against the abstract state at that label.

Divergence between the two signifies that either the VM is wrong or the verifier is wrong.

The majority of the verifier bugs I have been filing have been found using this technique (using BPF programs generated by specdif or libfuzzer).

// treat ";" as a comment
std::string text = line.substr(0, line.find(';'));
const size_t end = text.find_last_not_of(' ');
Expand Down Expand Up @@ -221,6 +225,16 @@ Instruction parse_instruction(const std::string& line, const std::map<std::strin
}
return res;
}
if (regex_match(text, m, regex(REQUIRE))) {
std::string constraints = m[1];
std::set<std::string> constraint_set;
std::regex re(R"(\s*,\s*)");
std::sregex_token_iterator first{constraints.begin(), constraints.end(), re, -1}, last;
for (; first != last; ++first) {
constraint_set.insert(*first);
}
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
return constraint_set;
}
return Undefined{0};
}

Expand All @@ -246,7 +260,13 @@ static InstructionSeq parse_program(std::istream& is) {
if (line.empty()) {
continue;
}
Instruction ins = parse_instruction(line, {});
auto ins_or_constraints = parse_instruction(line, {});

if (std::holds_alternative<ConstraintsSet>(ins_or_constraints)) {
continue;
}

Instruction ins = convert_to_original<Instruction>(ins_or_constraints).value_or(Undefined{0});
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
if (std::holds_alternative<Undefined>(ins)) {
continue;
}
Expand Down
21 changes: 20 additions & 1 deletion src/asm_parse.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,26 @@
#pragma once

#include <string>
#include <vector>

#include "asm_syntax.hpp"
#include "crab/interval.hpp"
#include "crab/linear_constraint.hpp"

Instruction parse_instruction(const std::string& line, const std::map<std::string, label_t>& label_name_to_label);
InstructionOrConstraintsSet parse_instruction(const std::string& line,
const std::map<std::string, label_t>& label_name_to_label);
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved

/***
* Parse a set of string form linear constraints into a vector of crab::linear_constraint_t
*
* @param[in] constraints A set of string form linear constraints.
* @param[out] numeric_ranges A vector of numeric ranges that are extracted from the constraints.
*
* @return A vector of crab::linear_constraint_t
* Example of string constraints include:
* r0.type=number
* r0.uvalue=5
* r0.svalue=5
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
*/
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
std::vector<crab::linear_constraint_t> parse_linear_constraints(const std::set<std::string>& constraints,
std::vector<crab::interval_t>& numeric_ranges);
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
80 changes: 80 additions & 0 deletions src/asm_syntax.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,49 @@ struct label_t {
explicit label_t(const int index, const int to = -1, std::string stack_frame_prefix = {}) noexcept
: from(index), to(to), stack_frame_prefix(std::move(stack_frame_prefix)) {}

/**
* @brief Construct a new label t object from a string.
*
* @param[in] string_label The string representation of the label.
* @return None (constructor)
*
* @throw std::invalid_argument The label format is invalid.
* @throw std::out_of_range The label value causes numeric overflow.
*
* Format: [prefix/]from[:to]
* - prefix: Optional stack frame prefix
* - from: Source instruction number
* - to: Optional jump target (-1 means next instruction)
*
* Example labels:
* "2:-1" - a label which falls through to the next instruction.
* "2:5" - a label which jumps to instruction 5.
* "2:-1/5:-1" - a label which falls through to the next instruction, with a stack frame prefix denoting where the
* label was called.
*/
explicit label_t(std::string_view string_label) {
auto pos = string_label.find(STACK_FRAME_DELIMITER);
if (pos != std::string_view::npos) {
stack_frame_prefix = std::string(string_label.substr(0, pos));
string_label = string_label.substr(pos + 1);
}

pos = string_label.find(':');
try {
if (pos != std::string_view::npos) {
from = std::stoi(std::string(string_label.substr(0, pos)));
to = std::stoi(std::string(string_label.substr(pos + 1)));
} else {
from = std::stoi(std::string(string_label));
to = -1;
}
} catch (const std::invalid_argument& e) {
throw std::invalid_argument("Invalid label format: " + std::string(string_label));
} catch (const std::out_of_range& e) {
throw std::out_of_range("Label value out of range: " + std::string(string_label));
}
}
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved

static label_t make_jump(const label_t& src_label, const label_t& target_label) {
return label_t{src_label.from, target_label.from, target_label.stack_frame_prefix};
}
Expand Down Expand Up @@ -315,11 +358,48 @@ struct IncrementLoopCounter {
bool operator==(const IncrementLoopCounter&) const = default;
};

// Helper metafunction to append a type to a variant
template <typename Variant, typename NewType>
struct append_to_variant;

template <typename... Types, typename NewType>
struct append_to_variant<std::variant<Types...>, NewType> {
using type = std::variant<Types..., NewType>;
};

// Helper metafunction to check if a type is in a variant
template <typename Variant, typename T>
struct is_type_in_variant;

template <typename T, typename... Types>
struct is_type_in_variant<std::variant<Types...>, T> : std::disjunction<std::is_same<T, Types>...> {};

// Function to convert ExtendedVariant to OriginalVariant if it doesn't contain the new type
template <typename OriginalVariant, typename ExtendedVariant>
std::optional<OriginalVariant> convert_to_original(const ExtendedVariant& extendedVariant) {
std::optional<OriginalVariant> result;

std::visit(
[&result](auto&& arg) {
using T = std::decay_t<decltype(arg)>;
if constexpr (is_type_in_variant<OriginalVariant, T>::value) {
result = arg;
}
},
extendedVariant);
return result;
}

Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
using Instruction = std::variant<Undefined, Bin, Un, LoadMapFd, Call, CallLocal, Callx, Exit, Jmp, Mem, Packet, Atomic,
Assume, IncrementLoopCounter>;

using ConstraintsSet = std::set<std::string>;
using InstructionOrConstraintsSet = append_to_variant<Instruction, ConstraintsSet>::type;
using LabeledInstruction = std::tuple<label_t, Instruction, std::optional<btf_line_info_t>>;
using LabeledConstraints = std::tuple<label_t, ConstraintsSet>;
using InstructionSeq = std::vector<LabeledInstruction>;
using ConstraintsSeq = std::vector<LabeledConstraints>;
using InstructionAndConstraintsSeq = std::tuple<InstructionSeq, ConstraintsSeq>;

/// Condition check whether something is a valid size.
struct ValidSize {
Expand Down
3 changes: 3 additions & 0 deletions src/config.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,9 @@ struct ebpf_verifier_options_t {

// Print the BTF types in JSON format.
bool dump_btf_types_json = false;

// Store pre-invariants for use in ebpf_check_constraints_at_label and ebpf_get_invariants_at_label.
bool store_pre_invariants = false;
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
};

struct ebpf_verifier_stats_t {
Expand Down
71 changes: 71 additions & 0 deletions src/crab_verifier.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
#include "crab/fwd_analyzer.hpp"
#include "crab_utils/lazy_allocator.hpp"

#include "asm_parse.hpp"
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
#include "asm_syntax.hpp"
#include "crab_verifier.hpp"
#include "string_constraints.hpp"
Expand Down Expand Up @@ -162,6 +163,14 @@ static checks_db get_analysis_report(std::ostream& s, const cfg_t& cfg, const cr
return db;
}

static thread_local std::optional<crab::invariant_table_t> saved_pre_invariants = std::nullopt;

static void save_invariants_if_needed(const crab::invariant_table_t& pre_invariants) {
if (thread_local_options.store_pre_invariants) {
saved_pre_invariants = pre_invariants;
}
}
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved

static checks_db get_ebpf_report(std::ostream& s, const cfg_t& cfg, program_info info,
const ebpf_verifier_options_t& options,
const std::optional<InstructionSeq>& prog = std::nullopt) {
Expand All @@ -174,6 +183,7 @@ static checks_db get_ebpf_report(std::ostream& s, const cfg_t& cfg, program_info
// Get dictionaries of pre-invariants and post-invariants for each basic block.
ebpf_domain_t entry_dom = ebpf_domain_t::setup_entry(true);
auto [pre_invariants, post_invariants] = run_forward_analyzer(cfg, std::move(entry_dom));
save_invariants_if_needed(pre_invariants);
return get_analysis_report(s, cfg, pre_invariants, post_invariants, prog);
} catch (std::runtime_error& e) {
// Convert verifier runtime_error exceptions to failure.
Expand Down Expand Up @@ -220,6 +230,7 @@ std::tuple<string_invariant, bool> ebpf_analyze_program_for_test(std::ostream& o
try {
const cfg_t cfg = prepare_cfg(prog, info, options.cfg_opts);
auto [pre_invariants, post_invariants] = run_forward_analyzer(cfg, std::move(entry_inv));
save_invariants_if_needed(pre_invariants);
const checks_db report = get_analysis_report(std::cerr, cfg, pre_invariants, post_invariants);
print_report(os, report, prog, false);

Expand Down Expand Up @@ -262,4 +273,64 @@ void ebpf_verifier_clear_thread_local_state() {
global_program_info.clear();
crab::domains::clear_thread_local_state();
crab::domains::SplitDBM::clear_thread_local_state();
saved_pre_invariants = std::nullopt;
}

bool ebpf_check_constraints_at_label(std::ostream& os, const std::string& label_string,
const std::set<std::string>& constraints) {
try {
label_t label = label_t(label_string);
if (!saved_pre_invariants.has_value()) {
os << "No pre-invariants available\n";
return false;
}
if (saved_pre_invariants.value().find(label) == saved_pre_invariants.value().end()) {
os << "No pre-invariants available for label " << label << "\n";
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
return false;
}
ebpf_domain_t from_inv(saved_pre_invariants.value().at(label));
auto concrete_domain = ebpf_domain_t::from_constraints(constraints, false);

if (concrete_domain.is_bottom()) {
os << "The provided constraints are unsatisfiable and self-contradictory (concrete domain is bottom)\n";
os << concrete_domain << "\n";
return false;
}
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
if (from_inv.is_bottom()) {
os << "The abstract state is unreachable\n";
os << from_inv << "\n";
return false;
}

if ((from_inv & concrete_domain).is_bottom()) {
os << "Concrete state does not match invariant\n";

// Print the concrete state
os << "--- Concrete state ---\n";
os << concrete_domain << "\n";

os << "--- Abstract state ---\n";
os << from_inv << "\n";

Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
return false;
}

return true;
} catch (std::exception& e) {
os << "Error occurred while checking constraints: " << e.what() << "\n";
return false;
}
}
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved

std::set<std::string> ebpf_get_invariants_at_label(const std::string& label) {
// If the label is malformed, throw an exception so the caller can handle it.
label_t l = label_t(label);
elazarg marked this conversation as resolved.
Show resolved Hide resolved

if (!saved_pre_invariants.has_value()) {
return {};
}
if (saved_pre_invariants.value().find(l) == saved_pre_invariants.value().end()) {
return {};
}
return saved_pre_invariants.value().at(l).to_set().value();
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
}
39 changes: 39 additions & 0 deletions src/crab_verifier.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -26,3 +26,42 @@ int create_map_crab(const EbpfMapType& map_type, uint32_t key_size, uint32_t val
EbpfMapDescriptor* find_map_descriptor(int map_fd);

void ebpf_verifier_clear_thread_local_state();

/**
* @brief Given a label and a set of concrete constraints, check if the concrete constraints match the abstract
* verifier constraints at the label. Requires the `store_pre_invariants` option to be set.
*
* Abstract constraints are computed by the verifier and stored if the `store_pre_invariants` option is set.
* These constraints represent the program state at a specific point in the control flow graph,
* as determined by the static analysis performed by the verifier.
*
* If the 'store_pre_invariants' option is not set, this function will always return false along with an error message.
* This is because the verifier did not store the abstract constraints at each label.
*
* For invalid labels, this function will return false along with an error message.
*
* @param[in,out] os Print output to this stream.
* @param[in] label The location in the CFG to check against.
* @param[in] constraints The concrete state to check.
* @retval true The state is valid.
* @retval false The state is invalid.
*
* Note:
* This can also return false if the label is not found in the CFG or if the label is malformed.
*/
bool ebpf_check_constraints_at_label(std::ostream& os, const std::string& label,
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
const std::set<std::string>& constraints);
elazarg marked this conversation as resolved.
Show resolved Hide resolved
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
/**
* @brief Get the invariants at a given label. Requires the `store_pre_invariants` option to be set.
*
* If the 'store_pre_invariants' option is not set, this function will return an empty set
* as no invariants were stored during verification.
*
* @param[in] label The label in the CFG where invariants should be retrieved
* @return The set of invariants at the given label.
* Each invariant represents a constraint on the program state at this point.
* Returns an empty set if no invariants are available.
* @throw std::invalid_argument The label format is invalid
* @throw std::out_of_range The label value causes numeric overflow
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
*/
std::set<std::string> ebpf_get_invariants_at_label(const std::string& label);
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
Alan-Jowett marked this conversation as resolved.
Show resolved Hide resolved
Loading
Loading