gh-115999: Specialize STORE_ATTR in free-threaded builds. #127838
base: main
Python/bytecodes.c
Outdated
DEOPT_IF(!LOCK_OBJECT(owner_o));
if (_PyObject_GetManagedDict(owner_o) == NULL) {
The condition here doesn't look right. Previously we had:
assert(_PyObject_GetManagedDict(owner_o) == NULL)
But now the code deopts if that's the case:
if (_PyObject_GetManagedDict(owner_o) == NULL) {
UNLOCK_OBJECT(owner_o);
DEOPT_IF(true);
}
I think we need to lock the object earlier, in `_GUARD_DORV_NO_DICT` or `_GUARD_TYPE_VERSION`. It's a bit awkward to hold the lock across uops, but I don't know of a better way.
I think we want to check the type version tag under the lock, so maybe alongside `_GUARD_TYPE_VERSION` there should also be a `_GUARD_TYPE_VERSION_AND_LOCK`.
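The suggestion above, checking the version tag only once the object lock is held and leaving the lock held for the following uop, can be sketched in isolation. This is a toy model in plain C11, not CPython code: `toy_type`, `toy_object`, `toy_try_lock`, `toy_unlock`, and `toy_guard_type_version_and_lock` are hypothetical names, and the try-lock stands in for whatever `LOCK_OBJECT` does in the real build.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not CPython's real types. */
typedef struct {
    atomic_uint version_tag;    /* plays the role of tp_version_tag */
} toy_type;

typedef struct {
    atomic_bool locked;         /* plays the role of the per-object lock */
    toy_type *type;
} toy_object;

/* Non-blocking lock attempt: true if we now own the lock. */
static bool toy_try_lock(toy_object *o) {
    return !atomic_exchange_explicit(&o->locked, true, memory_order_acquire);
}

static void toy_unlock(toy_object *o) {
    assert(atomic_load(&o->locked));   /* only a holder may unlock */
    atomic_store_explicit(&o->locked, false, memory_order_release);
}

/*
 * Combined guard: returns true with the lock HELD when the version
 * matches, so the next step can store without re-checking.
 * Returns false (the "deopt" path) with the lock NOT held otherwise.
 */
static bool toy_guard_type_version_and_lock(toy_object *o, unsigned expected) {
    if (!toy_try_lock(o)) {
        return false;                  /* contended: deopt, do not block */
    }
    unsigned seen = atomic_load_explicit(&o->type->version_tag,
                                         memory_order_relaxed);
    if (seen != expected) {
        toy_unlock(o);                 /* release before deopting */
        return false;
    }
    return true;
}
```

A caller that gets `true` back is responsible for calling `toy_unlock` once the store is done, which mirrors the awkwardness of holding a lock across uops that the comment mentions.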
Fixed, I think. I didn't use `_GUARD_TYPE_VERSION_AND_LOCK` for the `STORE_ATTR_WITH_HINT` op, only for `STORE_ATTR_INSTANCE_VALUE`. You want to lock the dict in that case. I think the way it's coded now there is a kind of race. If another thread creates a descriptor between `_GUARD_TYPE_VERSION` and the `STORE_ATTR_WITH_HINT`, the descriptor will get ignored. I think that's okay. I think otherwise you would have to lock both the object and object dict before checking `tp_version_tag`. The code in `dictobject.c` doesn't appear to do that.
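For the dict-backed case, the pattern described here (lock the dict rather than the object, then confirm the object still points at that same dict) might look roughly like the sketch below. It is a simplified model under stated assumptions: `toy_dict`, `toy_owner`, and `toy_store_with_hint` are hypothetical, the dict holds a single `int` instead of real entries, and the descriptor race discussed above is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for an instance dict and its owner. */
typedef struct {
    atomic_bool locked;
    int value;                          /* the one "attribute" we store */
} toy_dict;

typedef struct {
    _Atomic(toy_dict *) managed_dict;   /* may be swapped or cleared */
} toy_owner;

static bool toy_dict_try_lock(toy_dict *d) {
    return !atomic_exchange_explicit(&d->locked, true, memory_order_acquire);
}

static void toy_dict_unlock(toy_dict *d) {
    assert(atomic_load(&d->locked));
    atomic_store_explicit(&d->locked, false, memory_order_release);
}

/* Returns true if the store happened, false if the caller should deopt. */
static bool toy_store_with_hint(toy_owner *owner, int value) {
    toy_dict *d = atomic_load_explicit(&owner->managed_dict,
                                       memory_order_acquire);
    if (d == NULL || !toy_dict_try_lock(d)) {
        return false;
    }
    /* Re-read after locking: the dict may have been replaced or
       removed between the first load and acquiring the lock. */
    if (atomic_load_explicit(&owner->managed_dict,
                             memory_order_acquire) != d) {
        toy_dict_unlock(d);
        return false;
    }
    d->value = value;
    toy_dict_unlock(d);
    return true;
}
```

The second load is the whole point of the sketch: without it, a store could land in a dict that is no longer attached to the owner.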
Thanks for taking this on! This looks like a regression on the default build. I haven't had a chance to dig into it, but I suspect it might either be due to the check that Sam flagged in `_STORE_ATTR_INSTANCE_VALUE` or the change to when we read the type version in `_Py_Specialize_StoreAttr`. It looks like the richards benchmark is the most heavily affected, so that might be a good isolated benchmark to use for debugging.
Python/specialize.c
Outdated
@@ -946,55 +1004,28 @@ specialize_dict_access(
SPECIALIZATION_FAIL(base_op, SPEC_FAIL_ATTR_NOT_MANAGED_DICT);
return 0;
}
_PyAttrCache *cache = (_PyAttrCache *)(instr + 1);
if (type->tp_flags & Py_TPFLAGS_INLINE_VALUES &&
_PyObject_InlineValues(owner)->valid &&
I think we need to hold the critical section on `owner` across the `_PyObject_InlineValues(owner)->valid` check and the call to `specialize_dict_access_inline`. Otherwise, I think it's possible that we race with someone invalidating the inline values between our check and when we take the critical section.
I think I've addressed this now, following similar logic to `_PyObject_StoreInstanceAttribute`. Lock the object and then get the managed dict again. If it was created, abort the specialization (we lost the race during locking).
Do we need an additional check to ensure the inline values are still valid? I'm not sure.
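The specialization-side pattern from this exchange (take the lock first, then re-read the managed dict and the inline-values flag before writing the cache) can be illustrated with a toy model. Everything here is hypothetical: `toy_instance`, `toy_attr_cache`, and `toy_specialize_inline_store` are invented for the sketch, and a spin lock stands in for the critical section. The sketch does perform the validity check under the lock, as the review comment suggests; whether the real code needs it is the open question above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an instance with inline values. */
typedef struct {
    atomic_bool locked;
    _Atomic(void *) managed_dict;       /* NULL until a dict is materialized */
    atomic_bool inline_values_valid;
} toy_instance;

/* Hypothetical stand-in for the inline cache entry. */
typedef struct {
    unsigned version;
    bool specialized;
} toy_attr_cache;

/* Blocking lock; a simple spin is enough for a sketch. */
static void toy_lock(toy_instance *o) {
    while (atomic_exchange_explicit(&o->locked, true, memory_order_acquire)) {
        /* spin until the holder releases */
    }
}

static void toy_unlock(toy_instance *o) {
    assert(atomic_load(&o->locked));
    atomic_store_explicit(&o->locked, false, memory_order_release);
}

/* Returns true if the cache was written, false if specialization aborted. */
static bool toy_specialize_inline_store(toy_instance *o,
                                        unsigned type_version,
                                        toy_attr_cache *cache) {
    bool ok = false;
    toy_lock(o);
    /* Both conditions are evaluated while holding the lock, so neither
       can change between the check and the cache write. */
    if (atomic_load(&o->managed_dict) == NULL &&
        atomic_load(&o->inline_values_valid)) {
        cache->version = type_version;
        cache->specialized = true;
        ok = true;
    }
    toy_unlock(o);
    return ok;
}
```

If a dict was materialized while the thread was waiting for the lock, the function simply reports failure and leaves the cache untouched, which corresponds to "we lost the race during locking".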
Based on my benchmarking my second commit, the regression with Regarding the |
This was actually not hard to fix. I was thinking that |
* Fix locking for `STORE_ATTR_INSTANCE_VALUE`. Create `_GUARD_TYPE_VERSION_AND_LOCK` so that object stays locked and `tp_version_tag` cannot change. Fix inverted logic bug that caused erroneous deopt.
* Fix locking for `_STORE_ATTR_WITH_HINT`. Double check that `_PyObject_GetManagedDict()` hasn't disappeared since we locked the dict.
- Pass `tp_version_tag` to `specialize_dict_access()`, ensuring the version we store on the cache is the correct one (in case it changes during the specialize analysis).
- Split `analyze_descriptor` into `analyze_descriptor_load` and `analyze_descriptor_store` since those don't share much logic. Add `descriptor_is_class` helper function.
- In `specialize_dict_access`, double check `_PyObject_GetManagedDict()` in case we race and dict was materialized before the lock.
If the type is new and a version tag hasn't yet been assigned, we would fail to specialize it. Use `_PyType_LookupRefAndVersion()` instead of `type_get_version()`, which will assign a version.
Use provided value of `tp_version` to store in cache.
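The two commit messages above describe a pair of related ideas: obtain the version through a path that assigns one when the type has none yet, and write that same snapshot into the cache instead of re-reading the type later. A minimal sketch of both, assuming a version of 0 means "unassigned": `toy_get_version`, `toy_ensure_version`, and `toy_specialize` are hypothetical names and do not reflect the signatures of `type_get_version()` or `_PyType_LookupRefAndVersion()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical global counter for handing out version tags. */
static atomic_uint toy_next_version = 1;

typedef struct {
    atomic_uint version_tag;    /* 0 means "no version assigned yet" */
} toy_type;

typedef struct {
    unsigned version;
} toy_cache;

/* Read-only lookup: returns 0 for a brand-new type. */
static unsigned toy_get_version(toy_type *t) {
    return atomic_load(&t->version_tag);
}

/* Lookup that assigns a version on first use. */
static unsigned toy_ensure_version(toy_type *t) {
    unsigned v = atomic_load(&t->version_tag);
    if (v == 0) {
        unsigned fresh = atomic_fetch_add(&toy_next_version, 1);
        unsigned expected = 0;
        if (atomic_compare_exchange_strong(&t->version_tag, &expected, fresh)) {
            v = fresh;
        } else {
            v = expected;       /* another thread assigned one first */
        }
    }
    return v;
}

/* Takes one snapshot up front and stores exactly that in the cache. */
static bool toy_specialize(toy_type *t, toy_cache *cache) {
    unsigned snapshot = toy_ensure_version(t);
    if (snapshot == 0) {
        return false;           /* no usable version: do not specialize */
    }
    /* ... analysis would happen here; the type may change meanwhile ... */
    cache->version = snapshot;  /* the snapshot, not a fresh read */
    assert(cache->version != 0);
    return true;
}
```

If the type is modified during the analysis, the cached snapshot no longer matches the live tag, so a later version guard fails and the stale specialization is never used.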
This also fixes the case if the dict is replaced with a different one.
f920dcd to 4c484ab
I rebased on |
`STORE_ATTR_INSTANCE_VALUE`, `STORE_ATTR_SLOT`, `STORE_ATTR_WITH_HINT`). Need a combination of locks and atomics to be safe.
`_Py_Specialize_StoreAttr`. Avoid using borrowed references. Save and store the `tp_version_tag` from the beginning of the specialization process since it might change. Use helper functions to update opcode.
`--disable-gil` builds #115999