Question: how feasible is it to share Freezed objects between goja.Runtimes #232
Comments
To give you some of the context, without you having to read a whole PR, the situation is like this:
The question is, to avoid data races, do we need to serialize and unserialize (via JSON or some other mechanism) the data pieces we give to each goja runtime, so they are completely separate copies? Or can we directly return pieces of the original |
First of all, freezing a go value wrapped into goja.Object won't work, because it's not possible to store ECMAScript property attributes in the wrapper (as the wrapper is for the entire value, not its properties). If you have a goroutine-safe go value (e.g. sync.Map), it's possible to share it between runtimes as long as you create an individual wrapper for each runtime, i.e.: var m sync.Map
runtime1.Set("m", runtime1.ToValue(&m))
runtime2.Set("m", runtime2.ToValue(&m)) If you use a Proxy, then the Proxy itself and the handler also must be created per runtime. If the proxy handler is native it can be shared as long as it's goroutine-safe. It's important to understand that the wrapped value must be fully recursively goroutine-safe; for example, if the sync.Map contains unsafe values (such as regular maps) there will be data races. The recursive safety could in theory be enforced by the proxy handler: when returning property values it could wrap them into another Proxy which will enforce immutability or synchronisation. Hope this helps. |
I think @na-- confused you somewhat. Or at least I think so ... Rewording the question somewhat: If we have an object that is only:
Would calling The use case is that we want users to be able to load a JSON file (or something of similar data variety) and be able to only have 1 copy in memory (or as close to 1 copy as possible) while being able to access elements of it between multiple Runtimes (read upwards of 100 - usually more). The current solution (which I like a lot more than this idea, to be honest) is simply to get the original object from the first runtime and serialize it as JSON and then unserialize only the parts we need. At this point, this is only done for arrays and technically we serialize each key separately and unserialize it into a new object that is then given to each Runtime when each element is requested. This happens through having a Proxy for each Runtime that "emulates" somewhat the way an array behaves (currently having |
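A minimal sketch of the per-element approach described above, in plain JavaScript rather than through the goja API. The names `makeSharedStore` and `makeArrayView` are hypothetical and are not taken from the k6 implementation. The assumption is that the shared part is only a list of JSON strings, and that each runtime gets its own Proxy that parses an element when it is requested.

```javascript
"use strict";

// Hypothetical shared store: every element is kept as its own JSON string,
// so the only thing shared is a frozen list of immutable strings.
function makeSharedStore(array) {
  return Object.freeze(array.map((item) => JSON.stringify(item)));
}

// Hypothetical per-runtime view: a Proxy that answers `length` and numeric
// indices, parsing a fresh copy of the element on every access.
function makeArrayView(store) {
  return new Proxy({}, {
    get(_target, key) {
      if (key === "length") return store.length;
      if (typeof key === "string" && /^(0|[1-9]\d*)$/.test(key)) {
        const i = Number(key);
        return i < store.length ? JSON.parse(store[i]) : undefined;
      }
      return undefined; // symbols and other keys are not emulated here
    },
    set() { return false; },            // reject writes to the view itself
    deleteProperty() { return false; },
    defineProperty() { return false; },
  });
}

const store = makeSharedStore([{ foo: 1 }, { foo: 2 }]);
const view = makeArrayView(store);

const a = view[0];
a.foo++;
console.log(a.foo);       // prints 2: only the local copy changed
console.log(view[0].foo); // prints 1: a fresh copy parsed from the store
```

The trade-off is the one discussed in this thread: every access pays for a `JSON.parse`, and the returned copies are silently mutable, but nothing mutable is shared.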
Yes. Non-primitive
No, because the proxy handler will still be accessing the original value. |
Each runtime would have its own But if that's still not a good idea, do you have any suggestions how to handle the underlying problem we're trying to solve? We have some huge static data array (e.g. from parsing a huge 10 MB JSON/CSV/etc. file), and we have hundreds of concurrent goja runtimes. We want to give random access to that data to every runtime, without having a full copy of the huge array in every runtime. Streaming JSON/CSV parsers are not an option for this, each runtime should be able to request any element of the array at any time efficiently. Any suggestions? |
What do you mean by 'equivalent of runtime.ToValue(data)'? |
I tried to shorten the explanation a bit, to the detriment of quality, sorry... In reality, say that The current value for And while we can probably optimize the copying (via As a nice UX bonus, it will be very obvious that the result is read-only. Whereas with the current copying solution, the following script will be very strange: var a = jsProxyToTheSharedData[someIndex]; // e.g. this is {foo: 1}
console.log(a.foo); // prints 1
a.foo++;
console.log(a.foo); // prints 2
console.log(jsProxyToTheSharedData[someIndex].foo); // prints 1 |
Then my initial comment stands. You cannot call Object.freeze() on a native object, i.e. the following: func TestFreezeNative(t *testing.T) {
vm := New()
vm.Set("o", map[string]interface{}{})
_, err := vm.RunString(`
o.prop = true;
Object.freeze(o);
`)
if err != nil {
t.Fatal(err)
}
} will fail with What you can do is wrap the value into a Proxy that prevents modifications and also wraps any property into a similar Proxy before returning, thus making it recursive. This is currently the only way to make a native object immutable. But then you have to consider non-ASCII strings. ToValue() will convert them into UTF-16 which means you won't save any memory this way. Even for ASCII strings it will scan them to make sure they are, which means you'll waste some CPU. Strings currently are immutable so you can share them, but at the moment they can only be created in a Runtime. I could make a global function |
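The recursive read-only Proxy suggested above can be sketched in plain JavaScript. This is an illustration under assumptions, not goja's API: `readOnly` is a hypothetical helper, and the data is assumed to be plain JSON-like objects and arrays (no frozen nested objects, getters, or functions that need special handling).

```javascript
"use strict";

// Hypothetical helper: wraps `value` so that writes are rejected and any
// nested object or array is wrapped the same way when it is read.
function readOnly(value) {
  if (value === null || typeof value !== "object") {
    return value; // primitives are already immutable
  }
  return new Proxy(value, {
    get(target, key, receiver) {
      return readOnly(Reflect.get(target, key, receiver));
    },
    set() { return false; },
    defineProperty() { return false; },
    deleteProperty() { return false; },
    setPrototypeOf() { return false; },
  });
}

const data = { user: { name: "a" }, tags: ["x", "y"] };
const ro = readOnly(data);

console.log(ro.user.name);                        // prints a
console.log(Reflect.set(ro.user, "name", "b"));   // prints false
console.log(data.user.name);                      // prints a (unchanged)
```

A plain assignment such as `ro.user.name = "b"` throws a `TypeError` in strict mode and is silently ignored in sloppy mode. Each read of a nested object creates a new wrapper, so `ro.user === ro.user` is false in this sketch; caching wrappers (for example in a `WeakMap`) would be one way to avoid that.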
Sorry for the slow reply and the long post, as the saying goes "If I had more time ..." I actually rewrote the feature to To be perfectly honest I like the latest iteration more even though it is slightly less performant. The whole idea of this feature, again, is that users want to parameterize their scripts with data that they have dumped in some format (JSON and CSV are common). Up until now, they either had to load the whole data set in each VU, which meant holding one copy of the data per VU (a few hundred, but maybe thousands), or resort to hacks. function generateArray() {
return JSON.parse(open("./arr.json"));
}
var s = {};
if (__VU > 0 && __VU < 2) { // We will only load the json for VU 1
s = generateArray();
}
exports.default = function () { // this is just what gets called by k6 on each iteration as long as there are iterations to be done
console.log(s.length, __VU);
} where "arr.json" has 100k lines of
This shows me that the parsed JSON expands from around 4.7 MB to around 90-100 MB once parsed and held in memory. If I load it in 10 VUs I get 1.2+ GB of usage, which seems to confirm the number.
So for me, this is a yak shaving at this point for k6. Given that I will add some things that would've helped (but unfortunately probably all of them will need to be true in order for this to work better):
In a lot of cases, what I find when developing with goja is that I either need to call some number of methods to get some I don't think, though, that either the ReadOnlyWrapper or the above list is as interesting or as needed as simply more ES6 support, as that lets us drop stuff from our babel+corejs combo and get both better performance and lower memory usage. |
I can assure you that using the API is way faster than running JS code: func BenchmarkGetPropertyAPI(b *testing.B) {
vm := New()
b.ResetTimer()
for i := 0; i < b.N; i++ {
vm.Get("Object").(*Object).Get("freeze")
}
}
func BenchmarkGetPropertyJS(b *testing.B) {
vm := New()
prg := MustCompile("test.js", "Object.freeze", false)
b.ResetTimer()
for i := 0; i < b.N; i++ {
vm.RunProgram(prg)
}
}
Other than that I take it that you have found a suitable solution so this can be closed? |
yeah @dop251, the solution we had previously was already ... suitable IMO :). I might try to call things through the API more, but my benchmarking of the code doesn't show it is slow enough for this to matter ... so maybe on a second iteration in the future, or if there is more time. Also, #227 will still be needed to remove most of the code we currently need. Thanks a lot for the discussion and I am going to close the issue now. |
The question is more or less in the title. The specific use case is an array with objects that mostly will contain strings, numbers, or other array/objects containing the same.
Obviously having a function and calling that function will be bad and having a regex can trigger #214, but are there such situations for the other types, and are there other things that need to be taken into account ... apart from me thinking that this is a bad idea and that w/e answer you give will change over time.
For more context, you can look at this discussion grafana/k6#1739 (comment)
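For reference, the kind of recursive freeze the question has in mind can be sketched in plain JavaScript for the data shapes mentioned (arrays and objects holding strings, numbers, and further arrays or objects). `deepFreeze` is a hypothetical helper, not part of goja or k6. The sketch only shows the behavior inside a single runtime and makes no claim about whether the result is safe to share between goja runtimes.

```javascript
"use strict";

// Hypothetical helper: freezes `value` and everything reachable from it.
// Functions and other non-object values are returned untouched.
function deepFreeze(value, seen = new WeakSet()) {
  if (value === null || typeof value !== "object" || seen.has(value)) {
    return value;
  }
  seen.add(value); // guards against cycles
  for (const key of Object.keys(value)) {
    deepFreeze(value[key], seen);
  }
  return Object.freeze(value);
}

const data = deepFreeze([{ id: 1, tags: ["a", "b"], nested: { n: 2 } }]);

console.log(Object.isFrozen(data));             // prints true
console.log(Object.isFrozen(data[0].tags));     // prints true
console.log(Reflect.set(data[0], "id", 2));     // prints false
console.log(data[0].id);                        // prints 1
```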