Extract variable api from ScriptSession, let ScriptSession guard reads #4970

niloc132 · 2023-12-20T22:34:07Z

This patch moves from assuming that the ScriptSession owns the ExecContext for locking (with UpdateGraph) and variable access (with QueryScope). Liveness is also moved to the QueryScope, as is the responsibility of unwrapping objects (but delegated to ScriptSession as needed).

Previously, the groovy script session used a lock to interact with its query scope, but python did not under the incorrect assumption that the GIL would guard those reads. Instead, a read/write lock is used to ensure that reads block while an operation is running that could affect values, but avoiding reads blocking each other.

Partial #4040

This patch moves from assuming that the ScriptSession owns the ExecContext for locking (with UpdateGraph) and variable access (with QueryScope). Previously, the groovy script session used a lock to interact with its query scope, but python did not under the incorrect assumption that the GIL would guard those reads. Instead, a read/write lock is used to ensure that reads block while an operation is running that could affect values, but avoiding reads blocking each other. Partial deephaven#4040

engine/table/src/main/java/io/deephaven/engine/util/AbstractScriptSession.java

Integrations/src/main/java/io/deephaven/integrations/python/PythonDeephavenSession.java

engine/table/src/main/java/io/deephaven/engine/util/AbstractScriptSession.java

nbauernfeind · 2023-12-20T23:52:10Z

engine/table/src/main/java/io/deephaven/engine/util/AbstractScriptSession.java

@@ -141,8 +151,9 @@ public synchronized final Changes evaluateScript(final String script, @Nullable
        final Changes diff;
        // retain any objects which are created in the executed code, we'll release them when the script session
        // closes
+        variableAccessLock.writeLock().lock();


The above comment applies to the try block. Let's add a comment why we hold the writeLock for the duration of the script evaluation (IIRC to avoid mid-script race conditions?).

Let's take a few minutes today and consider the three locks in play?

UG exclusive/shared lock - exclusive lock is held for the duration of any executed script code

ScriptSession state read/write lock, preventing any (well, most) writes while anyone is actively reading

ScriptSession instance lock, appears only used to guard snapshots for scope changes. Probably can be folded into the new read/write lock.

External callers could already hold UG shared/exclusive locks when reading from QueryScope, so with the current code we could hit a lock inversion issue - take the write lock, wait to acquire the UG exclusive lock, then in another thread while holding a either UG lock try and acquire the read lock.

I think we can simply move the writelock inside apply(() -> {...}), plus as above remove the synchronized keyword usage?

One more lock: the GIL. Doesn't appear to impact the above, since we take the GIL inside of the other locks, and if we're in Java, calling in to one of these, we no longer hold the GIL. Only would matter if some code deliberately held the GIL in Java (which happens, but should only be possible when doing GC sorts of stuff).

One more reply from me, then i'll defer to when we do a call: it isnt quite so simple to remove synchronized, since you can't upgrade a read lock to a write lock. If the whole write lock needs to be wider (encompass calls to takeSnapshot, observeScopeChanges, possibly createDiff), then the UG exclusive lock needs to also be correspondingly bigger as well.

engine/table/src/main/java/io/deephaven/engine/util/AbstractScriptSession.java

server/src/main/java/io/deephaven/server/console/completer/PythonAutoCompleteObserver.java

nbauernfeind · 2023-12-21T01:57:54Z

server/src/main/java/io/deephaven/server/console/completer/PythonAutoCompleteObserver.java

@@ -110,7 +113,7 @@ private void handleAutocompleteRequest(AutoCompleteRequest request,
                request.getRequestId() > 0 ? request.getRequestId() : request.getGetCompletionItems().getRequestId();
        try {
            final ScriptSession scriptSession = exportedConsole.get();
-            PyObject completer = (PyObject) scriptSession.getVariable("jedi_settings");
+            PyObject completer = (PyObject) scriptSession.getVariableProvider().getVariable("jedi_settings", null);


I know you didn't do this -- but why are we providing the script session to the observer instead of the settings?

This is the only construction:

try { final ScriptSession scriptSession = scriptSessionProvider.get(); scriptSession.evaluateScript( "from deephaven_internal.auto_completer import jedi_settings ; jedi_settings.set_scope(globals())"); settings[0] = scriptSession.getVariableProvider().getVariable("jedi_settings", null); } catch (Exception err) { log.error().append("Error trying to enable jedi autocomplete").append(err).endl(); } boolean canJedi = settings[0] != null && settings[0].call("can_jedi").getBooleanValue(); log.info().append(canJedi ? "Using jedi for python autocomplete" : "No jedi dependency available in python environment; disabling autocomplete.").endl(); return canJedi ? new PythonAutoCompleteObserver(responseObserver, scriptSessionProvider, session) : new NoopAutoCompleteObserver(session, responseObserver);

We should pass the settings in instead of the script session provider. If the variable is "allowed" to change over time, then I still would rather pass in a Supplier<PyObject> instead of the script session provider.

I have a major problem with scope creep in these refactors, and would rather leave this to someone else who wants to overhaul autocomplete (again).

server/src/test/java/io/deephaven/server/appmode/ApplicationTest.java

server/test/src/main/java/io/deephaven/server/test/FlightMessageRoundTripTest.java

engine/table/src/main/java/io/deephaven/engine/util/AbstractScriptSession.java

* Synchronize access to groovy variables * unmanage items removed from query scope * Ensure the script session's query scope is active when executing code

server/src/test/java/io/deephaven/server/runner/DeephavenApiServerTestBase.java

server/src/test/java/io/deephaven/server/appmode/ApplicationTest.java

nbauernfeind · 2024-01-04T00:33:33Z

Integrations/src/main/java/io/deephaven/integrations/python/PythonDeephavenSession.java

-    public Set<String> getVariableNames() {
-        return Collections.unmodifiableSet(scope.getKeys().collect(Collectors.toSet()));
+    protected Set<String> getVariableNames() {
+        return scope.getKeys().collect(Collectors.toUnmodifiableSet());


As is, we end up making three copies of this key-set! PyDictWrapper#keySet creates a copy of the keys (instead of a view). Then we copy it a second time in PythonDeephavenSession#getVariableNames using a collector. Finally, we copy it a third time in ScriptSessionQueryScope#getParamNames to filter out non-valid names.

We should pass a Predicate to getVariableNames so that the Groovy impl can filter under the param-map's object monitor. That makes python a two-copy impl and the groovy one a single-copy. (instead of three and two respectively)

Do we need to synchronize w/setVariable to avoid changing the map while PyDictWrapper#keySet is making the initial copy? It looks like if you some setVariable to null while iterating that the result set could include the removed variable and not the variable that was next in the dict. Also, note that PyListWrapper#iterator() is not protected from size changes and will appears to return null for the last item in the dict (if the size is decreased).

It looks like this would cause an NPE in the LinkedHashSet construction within PyDictWrapper#keySet.

Good points, will pass a predicate, stream results into the unmodifiable set.

Racing with setVariable is somewhat fraught - after all, setVariable isn't the only way to assign values into the collection, python itself could be assigning/removing something. I think technically I can use PyLib.ensureGil() here, to guarantee that reads of python structures can't change while reading, in place of the synchronized blocks that Groovy's impl uses). Any AbstractScriptSession override that makes more than one call into python should use this (and note that it is not safe to use this to call into arbitrary python, as that python code could end up dropping the GIL despite our efforts - ensureGil only lets us be sure that we have the GIL in Java, JNI, and our own C).

nbauernfeind · 2024-01-04T00:47:05Z

engine/context/src/main/java/io/deephaven/engine/context/QueryScope.java

@@ -21,7 +24,7 @@
 /**
 * Variable scope used to resolve parameter values during query execution.
 */
-public abstract class QueryScope implements LogOutputAppendable {
+public abstract class QueryScope extends LivenessScope implements LogOutputAppendable {


Ryan mentioned that exposing the release method from LivenessScope is concerning to him.

I believe we can appease his concerns by changing the signature to:

public abstract class QueryScope extends ReferenceCountedLivenessNode implements ReleasableLivenessManager, LogOutputAppendable

Then there is no release() exposed to the user. We should probably override destroy in the standalone impl to clear the map for consistency.

Don't forget to add something like this to the constructor:

if (!Liveness.REFERENCE_TRACKING_DISABLED) { incrementReferenceCount(); }

REFERENCE_TRACKING_DISABLED is package-protected - I think the rest makes sense, I'll push a patch shortly with the rest of this for another pass.

This breaks "opening" the queryscope as a liveness scope when executing code:

// retain any objects which are created in the executed code, we'll release them when the script session // closes try (final S initialSnapshot = takeSnapshot(); final SafeCloseable ignored = LivenessScopeStack.open(queryScope, false)) {

We could go back to passing this, but that seems to defeat the purpose of that change...

engine/table/src/main/java/io/deephaven/engine/table/impl/lang/QueryLanguageParser.java

nbauernfeind · 2024-01-04T00:59:06Z

engine/table/src/main/java/io/deephaven/engine/util/AbstractScriptSession.java


            try {
                // Actually evaluate the script; use the enclosing auth context, since AbstractScriptSession's
                // ExecutionContext never has a non-null AuthContext
                executionContext.withAuthContext(ExecutionContext.getContext().getAuthContext())
+                        .withQueryScope(queryScope)


I don't believe this is necessary. Isn't this queryScope already assigned to the executionContext in the constructor?

It might not be necessary today, that's correct, but if not needed, it will be a no-op. Where possible (and straightforward, and obviously not-incorrect), Ryan and I were trying to make this ready for multiple script sessions, so if it does nothing, I'd be inclined to keep it.

This execution context is created in the constructor of this object with this exact query scope. Static analysis shows, even with multiple script sessions, that we're already protected for multiple script sessions.

What am I missing?

engine/table/src/main/java/io/deephaven/engine/util/NoLanguageDeephavenSession.java

nbauernfeind · 2024-01-04T01:06:29Z

engine/table/src/test/java/io/deephaven/engine/table/impl/FuzzerTest.java

@@ -117,7 +117,7 @@ private void testFuzzerScriptFile(final long timeSeed, String s, boolean realtim
            clock.now += DateTimeUtils.SECOND / 10 * timeRandom.nextInt(20);
        }

-        final TimeTable timeTable = (TimeTable) session.getVariable("tt");
+        final TimeTable timeTable = session.getQueryScope().readParamValue("tt", null);


Won't this NPE if it doesn't exist? Probably you meant to leave this as throwing a MissingVarException?

You're not wrong, but the test will fail either way.

nbauernfeind · 2024-01-04T01:08:47Z

open-api/lang-tools/src/main/java/io/deephaven/lang/completion/ChunkerCompleter.java

+        final Object instance = variables.readParamValue(o.getName(), null);
+        if (instance != null) {
+            if (instance instanceof Table) {
+                return Optional.of(Table.class);


I don't get this. Why are we restricting sub-classes of Table to the top-level interface? Does it solve a real problem?

This is inlining the old implementation that VariableProvider did, replicated without additional thought.

I would want to think long and hard before pushing BaseTable or QueryTable methods into auto-complete, to be honest. Let's not change the status quo in this PR.

engine/context/src/main/java/io/deephaven/engine/context/QueryScope.java

nbauernfeind

LGTM; I believe Ryan wants to pass through this, too (but if not, it's good to me).

niloc132 · 2024-01-09T15:15:10Z

Passing nightlies: https://github.com/niloc132/deephaven-core/actions/runs/7455151234

niloc132 · 2024-01-10T14:48:35Z

Another nightly run after merging and dealing with conflicts: https://github.com/niloc132/deephaven-core/actions/runs/7467840686

rcaudy

Partial review, I think I'm covering the most important classes, want to discuss.

engine/context/src/main/java/io/deephaven/engine/context/QueryScope.java

rcaudy · 2024-01-10T22:18:17Z

engine/context/src/main/java/io/deephaven/engine/context/QueryScope.java

@@ -119,6 +114,12 @@ private static Object applyValueConversions(final Object value) {
        return value;
    }

+    protected QueryScope() {
+        super(false);


Since the only things we intend to manage are things we already hold strongly, maybe we should actually use enforceStrongReachability=true. That results in a simpler/cleaner implementation of reference tracking. On the other hand, if we expect query scopes to be leaked (as in garbage), that may not be what we want.

This question could now apply to ScriptSessionQueryScope, do you still want to consider it?

engine/context/src/main/java/io/deephaven/engine/context/QueryScope.java

engine/table/src/main/java/io/deephaven/engine/util/AbstractScriptSession.java

Integrations/src/main/java/io/deephaven/integrations/python/PythonDeephavenSession.java

rcaudy · 2024-01-11T15:29:00Z

Integrations/src/main/java/io/deephaven/integrations/python/PythonDeephavenSession.java

+                } else {
+                    wrapped = PythonObjectWrapper.wrap(newValue);
+                }
+                // This isn't thread safe, we're relying on the GIL being kind to us (as we have historically done).


Is this true, given that we've enclosing with ensureGil?

Technically no, the GIL doesn't guarantee us practically anything even if this was one call instead of two - __setitem__ is just a function that could be redefined in python in a way that lets it drop the gil while invoking it...

Practically, we hope that it continued to be kind to us.

rcaudy · 2024-01-11T15:32:25Z

Integrations/src/main/java/io/deephaven/integrations/python/PythonDeephavenSession.java


        // Observe changes from this "setVariable" (potentially capturing previous or concurrent external changes from
        // other threads)
        observeScopeChanges();
+
+        // This doesn't return the same Java instance of PyObject, so we won't decref it properly, but


I want to make sure I understand this, let's discuss.

Call ScriptSessionQueryScope.putParam("foo", somePyObjectInstance), and then call same = readParamValue("foo") to get it back - the PyObject instances are pointing at the "same" python object, but same == somePyObjectInstance is false - they are separate java object instances, and contain separate borrowed pointers into py/c.

What's the solution? Is @jmao-denver aware?

There isn't a great solution in general, except to cleanly close PyObject basically everywhere - java gc is required to mop up all the unclosed PyObject. This is wrong in enough places in DHC that it isn't worth dedicating effort to it here.

rcaudy

Mostly happy. A bit concerned about atomicity between listing names and iterating values. A few other nits or questions.

engine/context/src/main/java/io/deephaven/engine/context/QueryScope.java

engine/context/src/main/java/io/deephaven/engine/context/StandaloneQueryScope.java

engine/table/src/main/java/io/deephaven/engine/util/AbstractScriptSession.java

rcaudy · 2024-01-18T00:18:28Z

engine/sql/src/main/java/io/deephaven/engine/sql/Sql.java

+        for (String name : queryScope.getParamNames()) {
+            Object paramValue = queryScope.unwrapObject(queryScope.readParamValue(name));


No, to make the readParamValue results be consistent with getParamNames. Otherwise, you could have a MissingVariableException due to concurrent mutation of the QueryScope.
Feels like a missing method to get values, to be honest.

rcaudy · 2024-01-18T00:20:55Z

open-api/lang-tools/src/main/java/io/deephaven/lang/completion/ChunkerCompleter.java

+        final Object instance = variables.readParamValue(o.getName(), null);
+        if (instance != null) {
+            if (instance instanceof Table) {
+                return Optional.of(Table.class);


I would want to think long and hard before pushing BaseTable or QueryTable methods into auto-complete, to be honest. Let's not change the status quo in this PR.

engine/context/src/main/java/io/deephaven/engine/context/QueryScope.java

server/src/main/java/io/deephaven/server/console/ScopeTicketResolver.java

engine/context/src/main/java/io/deephaven/engine/context/QueryScope.java

Integrations/src/main/java/io/deephaven/integrations/python/PythonDeephavenSession.java

engine/context/src/main/java/io/deephaven/engine/context/StandaloneQueryScope.java

server/src/main/java/io/deephaven/server/console/ScopeTicketResolver.java

rcaudy

I'm happy enough, but for the missing unwraps in Sql and ScopeTickerResolver.

rcaudy

Please merge if nightlies pass.

niloc132 added 3 commits December 20, 2023 13:28

Further shift from script session to query scope, including liveness

4e2beee

Restore method needed for python local scope hack

7041746

niloc132 added NoDocumentationNeeded python-server-side groovy ReleaseNotesNeeded Release notes are needed labels Dec 20, 2023

niloc132 added this to the December 2023 milestone Dec 20, 2023

niloc132 requested review from nbauernfeind, rcaudy and devinrsmith December 20, 2023 22:34

niloc132 commented Dec 20, 2023

View reviewed changes

engine/table/src/main/java/io/deephaven/engine/util/AbstractScriptSession.java Outdated Show resolved Hide resolved

niloc132 commented Dec 20, 2023

View reviewed changes

Integrations/src/main/java/io/deephaven/integrations/python/PythonDeephavenSession.java Show resolved Hide resolved

Fix test setup, correct unwrapping in sql

06ce7f0

nbauernfeind reviewed Dec 21, 2023

View reviewed changes

niloc132 commented Dec 21, 2023

View reviewed changes

engine/table/src/main/java/io/deephaven/engine/util/AbstractScriptSession.java Outdated Show resolved Hide resolved

niloc132 added 6 commits December 21, 2023 10:54

Partial review feedback

7b57fa2

Remove VariableProvider, use QueryScope instead

5a1a7cf

Remove dead method, dead cleanup code in flight round trip test

d309e35

Remove more dead code, fix potential lock order issue

8b7c820

Field ordering

bd2abcc

Correct comment layout

8f8102e

niloc132 requested a review from nbauernfeind December 21, 2023 18:55

niloc132 added 2 commits December 22, 2023 09:18

Review feedback:

4efb27e

* Synchronize access to groovy variables * unmanage items removed from query scope * Ensure the script session's query scope is active when executing code

Revert attempt to use optional (incompatible with null returns)

c049476

nbauernfeind reviewed Jan 4, 2024

View reviewed changes

niloc132 added 4 commits January 8, 2024 11:11

Merge branch 'main' into 4040-cleanup-5

ff6a3ba

Review feedback

deb0a77

Old requested rename, and added missing transferTo

c5c1ce7

Finish implementation (don't trust IJ's refactor tools..?)

42ba9be

nbauernfeind reviewed Jan 8, 2024

View reviewed changes

engine/context/src/main/java/io/deephaven/engine/context/QueryScope.java Outdated Show resolved Hide resolved

nbauernfeind previously approved these changes Jan 8, 2024

View reviewed changes

Review feedback

0b6caa8

niloc132 dismissed nbauernfeind’s stale review via 0b6caa8 January 9, 2024 00:55

niloc132 added 2 commits January 9, 2024 10:49

Merge branch 'main' into 4040-cleanup-5

d3f0472

Merge branch 'main' into 4040-cleanup-5

e3a0583

rcaudy reviewed Jan 11, 2024

View reviewed changes

niloc132 modified the milestones: 0. December 2023 (end of month), 1. January 2024 (end of month) Jan 11, 2024

niloc132 added 5 commits January 11, 2024 15:10

Live code review

cb8646d

Live review feedback

abb9dcb

Clean up standalone impl

25e4da7

Merge branch 'main' into 4040-cleanup-5

7843871

spotless

d2208c3

rcaudy reviewed Jan 18, 2024

View reviewed changes

Review feedback

7598e14

niloc132 requested review from rcaudy and nbauernfeind January 19, 2024 13:55

rcaudy reviewed Jan 19, 2024

View reviewed changes

Review feedback

668596d

rcaudy reviewed Jan 19, 2024

View reviewed changes

Add missing queryScope.unwrapObject() calls

1e32a0e

rcaudy approved these changes Jan 19, 2024

View reviewed changes

niloc132 merged commit d3dae15 into deephaven:main Jan 22, 2024
19 checks passed

github-actions bot locked and limited conversation to collaborators Jan 22, 2024

		for (String name : queryScope.getParamNames()) {
		Object paramValue = queryScope.unwrapObject(queryScope.readParamValue(name));

Extract variable api from ScriptSession, let ScriptSession guard reads #4970

Extract variable api from ScriptSession, let ScriptSession guard reads #4970

Conversation

niloc132 commented Dec 20, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nbauernfeind left a comment

Choose a reason for hiding this comment

niloc132 commented Jan 9, 2024

niloc132 commented Jan 10, 2024

rcaudy left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rcaudy left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rcaudy left a comment

Choose a reason for hiding this comment

rcaudy left a comment

Choose a reason for hiding this comment