The never-ending Linker problem #83

Closed
george-cosma opened this issue Sep 6, 2024 · 5 comments · Fixed by #88 or #87

Comments

@george-cosma
Collaborator

george-cosma commented Sep 6, 2024

The Linker Problem

To fully implement a working wasm interpreter we must be able to resolve imports. The idea of a Linker comes to mind. There are multiple ways to design it.

Design 1: Monolithic Runtime Instance

[Figure: linker_problem_monolith (draw.io diagram)]

This design would collect the validation info of every module into the Linker, which would then produce a single "merged" validation info to be instantiated as one RuntimeInstance. Imports would be resolved once, for all modules: within the merged validation info, a call to an imported function becomes a regular call.

✅ Pros:

  • Import Resolution solved at link-time. No performance overhead.
  • The architecture of the code will remain mostly unchanged
  • Linker will live for a very short amount of time (long enough to process the validation infos)

❌ Cons:

  • Code sections will (likely) need to be modified because the index namespaces are merged. A call to function "12" is trivial with a single module, but what if we merge two modules? Module 1 imports a function as index 1, and Module 2 exports this function with index 8. All function indices then have to be reshuffled, and the operands of the affected call instructions have to be changed.
  • Same as above, but for globals, tables, memories, etc.
  • Import Resolution solved at link-time. This would prevent any sort of design where equivalent modules could be swapped at runtime.
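
The index reshuffling described in the first con can be sketched as follows. This is only an illustration under simplifying assumptions, not the project's design: `Instr`, `Module` and `merge` are hypothetical names, the exporting module is assumed to have no imports of its own, and both modules are assumed to be validated already, so every call operand is in range.

```rust
use std::collections::HashMap;

/// Hypothetical, minimal instruction set: just enough to show operand rewriting.
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    I32Const(i32),
    /// Operand is a function index in the owning module's index space.
    Call(u32),
}

/// Hypothetical module summary. As in Wasm, imported functions occupy the
/// lowest function indices and locally defined functions follow them.
struct Module {
    /// Names of imported functions (indices 0..imports.len()).
    imports: Vec<String>,
    /// Bodies of locally defined functions.
    bodies: Vec<Vec<Instr>>,
    /// Export name -> function index in this module's own index space.
    exports: HashMap<String, u32>,
}

/// Merge `importer` into the index space of `exporter` and return the merged
/// function bodies. Assumption: `exporter` has no imports, so its export
/// indices are also valid indices into the merged space.
fn merge(exporter: &Module, importer: &Module) -> Result<Vec<Vec<Instr>>, String> {
    let base = exporter.bodies.len() as u32;

    // remap[old_index] = index of the same function in the merged space.
    let mut remap: Vec<u32> = Vec::new();
    for name in &importer.imports {
        let target = exporter
            .exports
            .get(name)
            .ok_or_else(|| format!("unresolved import: {name}"))?;
        remap.push(*target);
    }
    for i in 0..importer.bodies.len() as u32 {
        remap.push(base + i);
    }

    // Exporter bodies keep their indices; importer bodies get rewritten calls.
    let mut merged = exporter.bodies.clone();
    for body in &importer.bodies {
        let rewritten = body
            .iter()
            .map(|instr| match instr {
                Instr::Call(old) => Instr::Call(remap[*old as usize]),
                other => other.clone(),
            })
            .collect();
        merged.push(rewritten);
    }
    Ok(merged)
}
```

The same kind of remap table would be needed for every other index space (globals, tables, memories), which is what the second con refers to.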

Design 2: Each validation info has its own runtime instance (Swarm)

[Figure: linker_problem_swarm (draw.io diagram)]

As the name and image suggest, each module gets its own runtime instance. The magic happens inside the Linker, which this time is an entity that lives as long as the runtimes. When a module needs to call an imported function, it does so via the Linker.

✅ Pros:

  • Validation Info & Runtime creation is simple
  • Closer in style to the WASM Component Model (TODO: Check specification)
  • Allows for the ability to "hot swap" modules at runtime. Let's say the infotainment system is decoupled (intentionally or not). Then the Linker could be ordered to replace the module responsible for communicating with it, with a dummy module that fails instantly instead of waiting for a timeout.

❌ Cons:

  • Performance overhead due to import resolution at runtime, especially if we tackle the threading proposal.
  • The fuel mechanic is non-trivial
  • Lots of architectural changes
  • Linker is long-running
  • A very specific problem with resumability (continue reading)
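
To make the runtime lookup and the "hot swap" idea concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: `Linker`, `Instance`, `set_module`, `call` and `single_export` are hypothetical names, and exported functions are modelled as plain closures rather than interpreted Wasm code.

```rust
use std::collections::HashMap;

/// Hypothetical exported function: a closure stands in for interpreted code.
type ExportFn = Box<dyn Fn(i32) -> Result<i32, String>>;

/// Hypothetical stand-in for one module's runtime instance.
struct Instance {
    exports: HashMap<String, ExportFn>,
}

/// Hypothetical long-running Linker that owns every instance by module name.
struct Linker {
    modules: HashMap<String, Instance>,
}

impl Linker {
    fn new() -> Self {
        Linker { modules: HashMap::new() }
    }

    /// Register a module, or replace ("hot swap") the one already stored
    /// under the same name.
    fn set_module(&mut self, name: &str, instance: Instance) {
        self.modules.insert(name.to_string(), instance);
    }

    /// Resolve the import at call time. These two lookups are the runtime
    /// overhead that Design 1 avoids.
    fn call(&self, module: &str, func: &str, arg: i32) -> Result<i32, String> {
        let instance = self
            .modules
            .get(module)
            .ok_or_else(|| format!("unknown module: {module}"))?;
        let f = instance
            .exports
            .get(func)
            .ok_or_else(|| format!("unknown function: {module}.{func}"))?;
        f(arg)
    }
}

/// Helper that builds an instance with a single exported function.
fn single_export(name: &str, f: impl Fn(i32) -> Result<i32, String> + 'static) -> Instance {
    let mut exports: HashMap<String, ExportFn> = HashMap::new();
    exports.insert(name.to_string(), Box::new(f));
    Instance { exports }
}
```

With this shape, replacing a module with a dummy that fails instantly is a second `set_module` call under the same name; callers keep using the same (module, function) pair. Caching the resolved target per call site could reduce the lookup cost, at the price of having to invalidate the cache on every swap.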

With this type of linker, there is an architectural problem we need to solve to maintain resumability. If Module 1 calls an imported function in Module 2, and Module 2 then calls an imported function from Module 1, then at the end of this chain Module 1 must be able to resume its code properly.

Here is an example of how an import call could work:

[Figure: linker_problem_rel1 (draw.io diagram)]

Now, how it would work in the described scenario:

[Figure: linker_problem_rel2 (draw.io diagram)]

Notice that the second Store PC would overwrite the previously stored program counter. An intuitive solution would be to make it a stack, but that feels like it would create more problems than it solves. An alternative solution would have the call instruction create a call frame not only on the caller module, but also on the called module. Or something like that. There are solutions, but I do not know which is the correct one.
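
The overwrite problem, and the stack variant mentioned above, can be illustrated with a small sketch. It is not a recommendation for either solution; `ReturnAddr`, `SingleSlot` and `CallStack` are hypothetical names, and a real call frame would carry locals and more besides the return address.

```rust
/// Hypothetical return address: which module to resume, and at which pc.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ReturnAddr {
    module: usize,
    pc: usize,
}

/// The "Store PC" scheme: one slot for the saved program counter.
struct SingleSlot {
    saved: Option<ReturnAddr>,
}

impl SingleSlot {
    /// Storing a second address silently overwrites the first one.
    fn store(&mut self, addr: ReturnAddr) {
        self.saved = Some(addr);
    }

    /// Take the saved address out on return, leaving the slot empty.
    fn take(&mut self) -> Option<ReturnAddr> {
        self.saved.take()
    }
}

/// The stack variant: one entry per cross-module call still in flight.
struct CallStack {
    frames: Vec<ReturnAddr>,
}

impl CallStack {
    /// Record where the caller has to resume.
    fn push(&mut self, addr: ReturnAddr) {
        self.frames.push(addr);
    }

    /// Return to the most recent caller that has not been resumed yet.
    fn pop(&mut self) -> Option<ReturnAddr> {
        self.frames.pop()
    }
}
```

In the scenario above, Module 1 stores its pc and calls into Module 2, then Module 2 stores its pc and calls back into Module 1. With `SingleSlot`, the first return resumes Module 2 and the second return finds the slot empty, so Module 1's resume point is lost. With `CallStack`, the two returns pop Module 2's address and then Module 1's.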

I'd like to continue this discussion. I want to know your opinions regarding which approach to go with. I personally believe the second "swarm" approach is more appropriate, but that is based more on vibes.

@george-cosma
Collaborator Author

I've made some dummy benchmarks to see how much slower approach 2 ("Swarm") would be: https://github.com/george-cosma/indirection_bench

@george-cosma
Collaborator Author

Proposed API changes:

  • One module example:
// .-----------------------.
// | Single module example |
// '-----------------------'

const ADD_ONE: &'static str = r#"
(module
    (func (export "add_one") (param $x i32) (result i32)
        local.get $x
        i32.const 1
        i32.add
    )
)"#;

use wasm::{validate, RuntimeInstance, DEFAULT_MODULE};

fn main() {
    let wasm_bytes = wat::parse_str(ADD_ONE).unwrap();
    let validation_info = validate(&wasm_bytes).unwrap();
    let mut instance = RuntimeInstance::new(&validation_info).unwrap();

    // `get_fn` will verify that the function "add_one" exists for module <DEFAULT_MODULE>.
    // On success: return the identifier tuple (module_name, function_name, module_id, function_id)
    // On failure: RuntimeError -- couldn't find the function
    let add_one = instance.get_fn(DEFAULT_MODULE, "add_one").unwrap();

    // Also, to maintain compatibility with index-based access (which can be useful in some edge cases, and which we
    // need for integration tests):
    // On success: return the identifier tuple (module_name, function_name, module_id, function_id)
    // On failure: RuntimeError -- couldn't find the function
    let add_one = instance.get_fn_idx(/* module_idx: */ 0, /* function_idx: */ 0).unwrap();

    // `invoke` will verify that the function identifier is still valid (i.e. it wasn't created with one instance and
    // run on another). That is why we also store the module_name and function_name.
    assert_eq!(12, instance.invoke(&add_one, 11).unwrap());
    // Or should we do it this way? Or both?
    assert_eq!(12, add_one.invoke(&instance, 11).unwrap());
}
  • Multiple modules example:
// .--------------------------.
// | Multiple modules example |
// '--------------------------'

const ADD_ONE: &'static str = /* as above */;
const ADD_TWO: &'static str = r#"
(module
    (import "add_one_module" "add_one" (func $add_one (param i32) (result i32)))
    (func (export "add_two") (param $x i32) (result i32)
        local.get $x
        call $add_one
        call $add_one
    )
)"#;

fn main() {
    let wasm_bytes = wat::parse_str(ADD_ONE).unwrap();
    let validation_info = validate(&wasm_bytes).unwrap();
    let mut instance = RuntimeInstance::new_named("add_one_module", &validation_info).unwrap();

    let wasm_bytes = wat::parse_str(ADD_TWO).unwrap();
    let validation_info = validate(&wasm_bytes).unwrap();
    instance.add_module("add_two_module", &validation_info).unwrap();

    let add_two = instance.get_fn("add_two_module", "add_two").unwrap();
    // Alternative:
    let add_two = instance.get_fn_idx(1, 0).unwrap();

    assert_eq!(13, instance.invoke(&add_two, 11).unwrap());
    // Or should we do it this way? Or both?
    assert_eq!(13, add_two.invoke(&instance, 11).unwrap());
}
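
The validity check that the comments attribute to `invoke` could look roughly like the sketch below. It is an assumption about one possible shape, not the crate's API: `FunctionRef`, `LoadedModule`, `Registry` and `is_valid` are hypothetical names, and using the position in a list as the function index is a simplification of how Wasm exports map to indices.

```rust
/// Hypothetical identifier handed out by a `get_fn`-style lookup.
#[derive(Debug, Clone, PartialEq)]
struct FunctionRef {
    module_name: String,
    function_name: String,
    module_idx: usize,
    function_idx: usize,
}

/// Hypothetical stand-in for one loaded module. The position of a name in
/// `functions` plays the role of the function index (a simplification).
struct LoadedModule {
    name: String,
    functions: Vec<String>,
}

/// Hypothetical stand-in for the instance that owns all loaded modules.
struct Registry {
    modules: Vec<LoadedModule>,
}

impl Registry {
    /// Name-based lookup: resolve both names to indices once, up front.
    fn get_fn(&self, module_name: &str, function_name: &str) -> Option<FunctionRef> {
        let module_idx = self.modules.iter().position(|m| m.name == module_name)?;
        let function_idx = self.modules[module_idx]
            .functions
            .iter()
            .position(|f| f == function_name)?;
        Some(FunctionRef {
            module_name: module_name.to_string(),
            function_name: function_name.to_string(),
            module_idx,
            function_idx,
        })
    }

    /// Check to run before invoking: the stored indices must still point at
    /// a module and function with the stored names in *this* registry.
    fn is_valid(&self, f: &FunctionRef) -> bool {
        self.modules.get(f.module_idx).map_or(false, |m| {
            m.name == f.module_name
                && m.functions
                    .get(f.function_idx)
                    .map_or(false, |n| *n == f.function_name)
        })
    }
}
```

Storing the names next to the indices is what lets the check reject an identifier that was created by one instance and then passed to another whose modules happen to have a different layout.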

@george-cosma george-cosma mentioned this issue Sep 23, 2024
5 tasks
@george-cosma george-cosma reopened this Sep 26, 2024
@george-cosma george-cosma mentioned this issue Sep 27, 2024
5 tasks
@george-cosma
Collaborator Author

The never-ending linker problem has ended

@cemonem cemonem reopened this Jan 28, 2025
@cemonem

cemonem commented Jan 28, 2025

We might need to save a pointer to the current store, in addition to the pc and the current module, to account for side effects that an imported function performs on state private to its own module. But how do we handle exported global values then? Yeah, this was also handled by the PR.

@cemonem

cemonem commented Jan 28, 2025

Yeah

@cemonem cemonem closed this as completed Jan 28, 2025