feat(new transform): Introduce WASM Plugins #2006

Merged
merged 100 commits into from
Jun 11, 2020

Conversation

Hoverbear
Contributor

@Hoverbear Hoverbear commented Mar 9, 2020

Signed-off-by: Ana Hobden <[email protected]>
@Hoverbear Hoverbear self-assigned this Mar 9, 2020
@Hoverbear Hoverbear linked an issue Mar 23, 2020 that may be closed by this pull request
@Hoverbear
Contributor Author

Hoverbear commented Mar 31, 2020

@lukesteensen @LucioFranco Could you take an initial look at this (the RFC only)?

I suggest we merge this with just transforms specced and amend it with sinks and sources later. I'll work on getting the POC woven in and the RFC less draft-style.

@Hoverbear Hoverbear changed the title from "DRAFT: Wasm engine" to "feature: RFC - Wasm engine" Mar 31, 2020
@Hoverbear Hoverbear added the domain: sinks, domain: sources, and domain: transforms labels and removed the needs: rfc label Mar 31, 2020
Cargo.toml Outdated
@@ -177,11 +183,11 @@ dirs = "2.0.2"

[features]
# Default features for *-unknown-linux-gnu and *-apple-darwin
-default = ["sources", "transforms", "sinks", "vendored", "unix", "leveldb-plain", "rdkafka-plain"]
+default = ["sources", "transforms", "sinks", "vendored", "unix", "leveldb-plain", "rdkafka-plain", "engine"]
Contributor

@LucioFranco LucioFranco Mar 31, 2020

prob a better name than "engine"?

edit: I guess you do call it the vector engine, but from reading this I wouldn't assume it's the wasm engine.

Contributor Author

Definitely open to new terms

Contributor

@LucioFranco LucioFranco left a comment

> As sinks? For this we should consider how we handle batching, partitioning, etc, more.

I think we will want to expose premade host fn that basically just call out to and configure our current implementations.

> What about the idea of codecs being separate?

I'm not totally following this question, could you explain more?

> We should consider if we are satisfied with the idea of hostcalls being used for get and other API. We could also let
> the host call the Guest allocate function and then pass it a C String pointer to let it work on. This, however, requires
> serializing and deserializing the entire Event each time, which is a huge performance bottleneck.

Is this something where we could expose both? Would be nice to let the user choose the trade offs.
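
The trade-off discussed above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `Event`, `hostcall_get`, `serialize_event`, and `deserialize_event` are hypothetical names, and the NUL-separated encoding is an assumption chosen only to keep the example free of dependencies.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a Vector event: a flat map of string fields.
pub type Event = BTreeMap<String, String>;

/// Style 1 (hostcall-like): the guest asks the host for a single field,
/// so only that field's bytes are copied across the boundary.
pub fn hostcall_get(event: &Event, field: &str) -> Option<Vec<u8>> {
    event.get(field).map(|value| value.as_bytes().to_vec())
}

/// Style 2 (serialized hand-off): the host encodes the whole event as
/// `key\0value\0...` and passes the buffer to the guest.
/// Assumes keys and values never contain a NUL byte.
pub fn serialize_event(event: &Event) -> Vec<u8> {
    let mut buf = Vec::new();
    for (key, value) in event {
        buf.extend_from_slice(key.as_bytes());
        buf.push(0);
        buf.extend_from_slice(value.as_bytes());
        buf.push(0);
    }
    buf
}

/// Guest side of style 2: decode the full buffer back into an event,
/// even if the guest only cares about one field.
pub fn deserialize_event(buf: &[u8]) -> Event {
    let mut parts = buf
        .split(|byte| *byte == 0)
        .map(|part| String::from_utf8_lossy(part).into_owned());
    let mut event = Event::new();
    // Fields come in (key, value) pairs; a trailing empty part ends the loop.
    while let (Some(key), Some(value)) = (parts.next(), parts.next()) {
        event.insert(key, value);
    }
    event
}
```

With style 1 the cost scales with the fields the guest actually touches; with style 2 it scales with the size of the whole event on every call, which is the bottleneck the RFC text is worried about. Exposing both would let a plugin author pick per use case.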

> Do we one day want to support a cross platform cargo install vector option for installing a Vector binary?

To me this isn't that important since we should be having users go through their system package manager or download the binary from our hosted repo.

> Should a user import vector::... to use the Vector guest API in their Rust module?

I would imagine we want to expose a separate smaller crate for Rust users that want to compile to wasm? We probably also want to support Rust as the best language to write in?

> How can we let users see what's happening in WASM modules? Can we use tracing somehow? Lucet supports tracing, perhaps
> we could hook in somehow?

Probably via the same method we have users interact with events via some host fn.
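
One way such a logging host fn might look, as a sketch only: `host_log` is a hypothetical name, guest memory is modeled as a plain byte slice rather than a real Lucet instance, and a real host would presumably forward the message to tracing instead of returning it.

```rust
/// Hypothetical host function: the guest passes a pointer and length into
/// its own linear memory, and the host reads the message out of it.
///
/// `guest_memory` is a stand-in for the instance's linear memory.
pub fn host_log(guest_memory: &[u8], ptr: u32, len: u32) -> Result<String, &'static str> {
    let start = ptr as usize;
    // Guard against overflow and out-of-bounds reads; the guest is untrusted.
    let end = start
        .checked_add(len as usize)
        .ok_or("pointer + length overflows")?;
    let bytes = guest_memory
        .get(start..end)
        .ok_or("range is outside guest memory")?;
    // Reject invalid UTF-8 rather than guessing at the guest's intent.
    std::str::from_utf8(bytes)
        .map(|message| message.to_owned())
        .map_err(|_| "message is not valid UTF-8")
}
```

The bounds and encoding checks matter more than the logging itself: the host should never trust a pointer or length handed to it by a module.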

Overall this looks fantastic! Super excited to get WASM support! Great work :)

rfcs/2020-03-05-wasm-engine.md Outdated Show resolved Hide resolved
We noted that the existing lua runtime was able to accomplish these tasks quite elegantly, however it was an order of
magnitude slower than a native transform.

> TODO: Proof
Contributor

proof is just "lua" :)

Contributor Author

I'd like to do some real testing. :)

@Hoverbear
Contributor Author

OK, let's leave it off by default, and I'll remove the WATs and have them work through /target.

@@ -7,7 +7,7 @@ description: <%= "The Vector `#{component.name}` #{component.type} #{component_s
event_types: <%= component.event_types.to_json %>
function_category: <%= component.function_category.to_json %>
issues_url: <%= metadata.links.fetch("urls.#{component.id}_issues") %>
-<%- if !component.transform? -%>
+<%- if !component.transform? || component.respond_to?(:operating_systems) -%>
Contributor Author

@Hoverbear Hoverbear Jun 10, 2020

@binarylogic I tried just removing this line, but that broke. Is this the correct way to introduce this value (operating_systems) to the transforms page?

src/event/mod.rs Outdated Show resolved Hide resolved
@Hoverbear Hoverbear changed the title from "feature(new transform): Introduce WASM Plugins" to "feat(new transform): Introduce WASM Plugins" Jun 10, 2020
@Hoverbear
Contributor Author

I asked @lukesteensen to take a final peek at this before merge to make sure I didn't miss anything. :)

Member

@lukesteensen lukesteensen left a comment

Looks good! Excited to start experimenting with this 🎉

Comment on lines +16 to +17
let age = modified.duration_since(std::time::UNIX_EPOCH)?;
Ok(Self(age.as_secs()))
Member

nit: Using the time itself instead of duration_since would be a more unique fingerprint. Not that I'd really expect a collision either way.

Contributor Author

Heck, you're right! I wanted to improve the fingerprinter in general later, but it seems sufficient right now.
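
To make the nit concrete, here is a minimal sketch comparing the two approaches. `SecondsFingerprint` mirrors the two reviewed lines; `TimeFingerprint` and `from_path` are hypothetical names for the suggested alternative, not code from this PR.

```rust
use std::path::Path;
use std::time::{SystemTime, SystemTimeError, UNIX_EPOCH};

/// Fingerprint truncated to whole seconds since the Unix epoch,
/// as in the reviewed lines.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct SecondsFingerprint(u64);

impl SecondsFingerprint {
    pub fn new(modified: SystemTime) -> Result<Self, SystemTimeError> {
        let age = modified.duration_since(UNIX_EPOCH)?;
        Ok(Self(age.as_secs()))
    }
}

/// Hypothetical alternative: keep the full `SystemTime`, so sub-second
/// differences between modifications are preserved.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct TimeFingerprint(SystemTime);

impl TimeFingerprint {
    pub fn new(modified: SystemTime) -> Self {
        Self(modified)
    }

    /// Build a fingerprint from a file's last-modified time.
    pub fn from_path(path: &Path) -> std::io::Result<Self> {
        Ok(Self::new(std::fs::metadata(path)?.modified()?))
    }
}
```

Two writes to the same module within one second produce equal `SecondsFingerprint` values but distinct `TimeFingerprint` values, provided the filesystem records sub-second timestamps. The second form also has no error path for times before the epoch.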

@Hoverbear Hoverbear merged commit 5706d2a into master Jun 11, 2020
@binarylogic binarylogic added the type: feature label and removed the type: new feature label Jun 16, 2020
@binarylogic binarylogic deleted the wasm-engine branch July 23, 2020 17:32
mengesb pushed a commit to jacobbraaten/vector that referenced this pull request Dec 9, 2020
* Add wip

Signed-off-by: Ana Hobden <[email protected]>

* wip

Signed-off-by: Ana Hobden <[email protected]>

* wip

Signed-off-by: Ana Hobden <[email protected]>

* WIP

Signed-off-by: Ana Hobden <[email protected]>

* Fix bindings

Signed-off-by: Ana Hobden <[email protected]>

* Integrate Make task

Signed-off-by: Ana Hobden <[email protected]>

* Remove old cruft

Signed-off-by: Ana Hobden <[email protected]>

* More refinements to rfc

Signed-off-by: Ana Hobden <[email protected]>

* Get function passing working

Signed-off-by: Ana Hobden <[email protected]>

* Tracing and proper FFI strings

Signed-off-by: Ana Hobden <[email protected]>

* Wip

Signed-off-by: Ana Hobden <[email protected]>

* 🚀

Signed-off-by: Ana Hobden <[email protected]>

* More RFC notes

Signed-off-by: Ana Hobden <[email protected]>

* Format wasm rfc

Signed-off-by: Ana Hobden <[email protected]>

* Add API notes

Signed-off-by: Ana Hobden <[email protected]>

* Write more of the RFC

Signed-off-by: Ana Hobden <[email protected]>

* Add plan of attack and platform support

Signed-off-by: Ana Hobden <[email protected]>

* Minor fixes.

Signed-off-by: Ana Hobden <[email protected]>

* Command line test passes

Signed-off-by: Ana Hobden <[email protected]>

* Extract foreign module APIs.

Signed-off-by: Ana Hobden <[email protected]>

* Do some documentation.

Signed-off-by: Ana Hobden <[email protected]>

* More RFC writing

Signed-off-by: Ana Hobden <[email protected]>

* Cleanup and RFC updating.

Signed-off-by: Ana Hobden <[email protected]>

* Checkpoint

Signed-off-by: Ana Hobden <[email protected]>

* Passing a Raw registration up works

Signed-off-by: Ana Hobden <[email protected]>

* Update RFC to reflect new simpler APIs.

Signed-off-by: Ana Hobden <[email protected]>

* Various cleaning of structure.

Signed-off-by: Ana Hobden <[email protected]>

* Some refining on docs and roles

Signed-off-by: Ana Hobden <[email protected]>

* Some new events

Signed-off-by: Ana Hobden <[email protected]>

* Remove RFC present in other PR

Signed-off-by: Ana Hobden <[email protected]>

* Fix release builds and add new metric

Signed-off-by: Ana Hobden <[email protected]>

* Rename somethings

Signed-off-by: Ana Hobden <[email protected]>

* Caching and fingerprinting works now.

Signed-off-by: Ana Hobden <[email protected]>

* Add benchmark for wasm protobuf

Signed-off-by: Ana Hobden <[email protected]>

* Add noop bench

Signed-off-by: Ana Hobden <[email protected]>

* Add wasm CI

Signed-off-by: Ana Hobden <[email protected]>

* Update lockfile

Signed-off-by: Ana Hobden <[email protected]>

* Clean up make jobs, benching

Signed-off-by: Ana Hobden <[email protected]>

* Fixup make jobs

Signed-off-by: Ana Hobden <[email protected]>

* Use WATS

Signed-off-by: Ana Hobden <[email protected]>

* Figure out less hazardous memory management (demo)

Signed-off-by: Ana Hobden <[email protected]>

* Responsible memory management

Signed-off-by: Ana Hobden <[email protected]>

* process takes pointer/len, emit works

Signed-off-by: Ana Hobden <[email protected]>

* Better benchmarks

Signed-off-by: Ana Hobden <[email protected]>

* Add docs

* Fixup nits.

Signed-off-by: Ana Hobden <[email protected]>

* Add website files

Signed-off-by: Ana Hobden <[email protected]>

* Rework registration.

Signed-off-by: Ana Hobden <[email protected]>

* Some refining of the protobuf transform and tests.

Signed-off-by: Ana Hobden <[email protected]>

* Fixup benches.

Signed-off-by: Ana Hobden <[email protected]>

* Make emit more flexible.

Signed-off-by: Ana Hobden <[email protected]>

* Add tests for other mods

Signed-off-by: Ana Hobden <[email protected]>

* Add transform changes

Signed-off-by: Ana Hobden <[email protected]>

* Add error handling

Signed-off-by: Ana Hobden <[email protected]>

* Add panic handling

Signed-off-by: Ana Hobden <[email protected]>

* No more stale wats

Signed-off-by: Ana Hobden <[email protected]>

* wasm interop exposed and more protobuf guide

Signed-off-by: Ana Hobden <[email protected]>

* Clean up

Signed-off-by: Ana Hobden <[email protected]>

* Various small fixes

Signed-off-by: Ana Hobden <[email protected]>

* Add cached metric to compilation event

Signed-off-by: Ana Hobden <[email protected]>

* Clean up test spans

* More robust protobuf

Signed-off-by: Ana Hobden <[email protected]>

* Add options calls to WASM modules.

Signed-off-by: Ana Hobden <[email protected]>

* checkpoint

Signed-off-by: Ana Hobden <[email protected]>

* Improve fingerprinter.

Signed-off-by: Ana Hobden <[email protected]>

* Add wasm-timings feature

Signed-off-by: Ana Hobden <[email protected]>

* Add assert config

Signed-off-by: Ana Hobden <[email protected]>

* Extract fingerprint and artifact cache.

Signed-off-by: Ana Hobden <[email protected]>

* Remove protobuf guide.

Signed-off-by: Ana Hobden <[email protected]>

* Make things much more safe.

Signed-off-by: Ana Hobden <[email protected]>

* Fix a whole swack of integer sizing issues.

Signed-off-by: Ana Hobden <[email protected]>

* Update to git master lucet

Signed-off-by: Ana Hobden <[email protected]>

* Add nix env

Signed-off-by: Ana Hobden <[email protected]>

* Update lucet/tracing

Signed-off-by: Ana Hobden <[email protected]>

* Note we support wat as well

Signed-off-by: Ana Hobden <[email protected]>

* Raise some debug to info

Signed-off-by: Ana Hobden <[email protected]>

* fmt

Signed-off-by: Ana Hobden <[email protected]>

* Add artifact_cache knob

Signed-off-by: Ana Hobden <[email protected]>

* Remove WATs

Signed-off-by: Ana Hobden <[email protected]>

* Clean up formatting and some clippy lints

Signed-off-by: Ana Hobden <[email protected]>

* Fixup features in benches

Signed-off-by: Ana Hobden <[email protected]>

* Checker now needs cmake

Signed-off-by: Ana Hobden <[email protected]>

* Fmt

Signed-off-by: Ana Hobden <[email protected]>

* Fixup bench features

Signed-off-by: Ana Hobden <[email protected]>

* Add OS support to transforms on website

Signed-off-by: Ana Hobden <[email protected]>

* Wasm doesn't need wabt anymore

Signed-off-by: Ana Hobden <[email protected]>

* Fixup CI

Signed-off-by: Ana Hobden <[email protected]>

* Remove bench features haha

Signed-off-by: Ana Hobden <[email protected]>

* Correct lib authors

Signed-off-by: Ana Hobden <[email protected]>

* Yank utf-8 handling change

Signed-off-by: Ana Hobden <[email protected]>

* generate

Signed-off-by: Ana Hobden <[email protected]>

* Decouple wasm modules from workspace

Signed-off-by: Ana Hobden <[email protected]>
Signed-off-by: Brian Menges <[email protected]>
Labels
domain: transforms, type: feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WASM plugin architecture RFC
New wasm transform
7 participants