TokenCleanupService creates a suboptimal query plan with an expensive key lookup #1304
Thank you for sharing the details of the performance of this query. We rewrote the token cleanup for v7 with the intention of fixing race conditions and also making it more performant. But we do have some limitations:

We are very careful when introducing breaking changes. We considered the fix you propose, to not have to load the full entities, but decided that it didn't motivate a breaking change. The fix we considered was to check if there are any subscribers to the notifications and, if not, simply not load the entities at all. It is very valuable for us to get the actual performance numbers and the query plan details. The conclusions I draw from this are:

Do you have any actual durations for this, i.e. what is the query duration in milliseconds?

Any optimizations to this query would also have to take the next query into account. This query brings all the relevant database pages into RAM and CPU cache. The next query deletes the data, and that operation requires access to the same database pages, so it benefits from the pages already being in RAM/cache.

Regarding the OPENJSON issue: that is really not an IdentityServer issue. It is an incompatibility between EF Core and SQL Server. If it were a major performance issue we would still have to handle it, but our estimate so far has been that, the way our queries are written, the OPENJSON row-estimate issue should not matter. If you have a query plan for the deletion query, that would be valuable to see in order to understand how OPENJSON is handled.
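The select-then-delete pattern being discussed can be sketched roughly like this (a hypothetical illustration, not the actual v7 code; `context`, `batchSize`, and the notification call are assumptions drawn from the surrounding discussion):

```csharp
// Hypothetical sketch of the select-then-delete cleanup pattern discussed above.
// Step 1: load full entities so notification subscribers can receive them.
var expired = await context.PersistedGrants
    .Where(g => g.Expiration < DateTime.UtcNow)
    .OrderBy(g => g.Expiration)
    .Take(batchSize)
    .ToArrayAsync();   // materializes full rows => the expensive key lookup

// Step 2: delete by primary key. This touches the same database pages the
// select just pulled into RAM/CPU cache, which is why the two queries
// cannot be tuned in isolation.
var ids = expired.Select(g => g.Id).ToArray();
await context.PersistedGrants
    .Where(g => ids.Contains(g.Id))
    .ExecuteDeleteAsync();

// Subscribers are then notified with the loaded entities, which is the
// reason the full rows were needed in the first place.
await notifications.PersistedGrantsRemovedAsync(expired);
```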
I agree, the query uses the index and that part is efficient, but the key lookup makes the query plan suboptimal. Here are the performance metrics for:

Delete expired with subquery
Here is the execution plan for the actual delete query: https://www.brentozar.com/pastetheplan/?id=B1SS51dL0

Select expired records (with expensive key lookup)
Query plan: https://www.brentozar.com/pastetheplan/?id=SyUSxtbI0

We are running on Azure SQL, so every unnecessary page retrieval (e.g. a key lookup for data that is not used by X% of the implementations) is always bad for performance. We scaled our Azure SQL up temporarily to mitigate this issue.
Thanks for sharing the numbers. I did some calculations to understand how much of the total DB time is caused by the select. For example, in the first line the delete is 580 seconds and the select is 168 seconds, for a total operation time of 748 seconds, so there the select makes up 22% of the total DB time. If I read the numbers right, overall the select only makes up 11% of the total DB time. I don't think it's worth spending more time optimizing this on the EF abstraction layer. For high volume/performance deployments we recommend tuning the database with the features available on the DB engine used, and possibly replacing/reworking the queries to be optimal for the specific database. If you are using SQL Server, I think you could possibly get performance gains by changing the indexes:

This would make the delete operations efficient, as they would be a range scan at the beginning of the clustered index. The read-by-key operations could be completed from the index alone, without having to do any lookup on the clustered index. The drawbacks would be increased storage requirements and probably slower writes/inserts. As with any DB tuning, you would have to test and measure the performance in your specific environment.
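As a rough illustration of that suggestion (SQL Server syntax; table and column names are assumed from the default EF schema, and the existing clustered primary key on Id would first have to be rebuilt as nonclustered — test and measure before applying anything like this):

```sql
-- Assumed default EF table/column names; an illustrative sketch, not an
-- official migration.
-- Clustered index leading on Expiration: expiry deletes become a range
-- scan at the beginning of the clustered index.
CREATE CLUSTERED INDEX CIX_PersistedGrants_Expiration
    ON PersistedGrants (Expiration, Id);

-- Covering index so read-by-key operations complete from the index alone.
-- Trade-off: more storage and likely slower writes/inserts.
CREATE NONCLUSTERED INDEX IX_PersistedGrants_Key
    ON PersistedGrants ([Key])
    INCLUDE (Type, SubjectId, ClientId, CreationTime, Expiration, Data);
```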
I noticed now that the execution plans look like estimated execution plans. Could you please share the actual execution plans, as those would be needed to find out whether the OPENJSON problem causes a performance issue here.
We will implement our own TokenCleanupService that utilizes the existing indexes. Regarding the actual execution plan: we don't have the actual plan in Query Store.
For those needing the implementation:
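A minimal sketch of such a cleanup (hypothetical, not the poster's actual code; assumes EF Core 7+ `ExecuteDeleteAsync`, an injected `_context`, and an arbitrary batch size) could look like:

```csharp
// Hypothetical sketch: delete expired grants in batches using only the
// Expiration filter served by IX_PersistedGrants_Expiration, without
// materializing any PersistedGrant entities (requires EF Core 7+).
public async Task RemoveExpiredGrantsAsync(CancellationToken ct = default)
{
    const int batchSize = 1000; // assumed batch size

    int deleted;
    do
    {
        var cutoff = DateTime.UtcNow;
        deleted = await _context.PersistedGrants
            .Where(g => g.Expiration < cutoff)
            .OrderBy(g => g.Expiration)
            .Take(batchSize)
            .ExecuteDeleteAsync(ct); // single DELETE, no entity loading
    } while (deleted == batchSize);  // keep going while full batches remain
}
```

Note that skipping entity loading also means `IOperationalStoreNotification` subscribers are not called, which is exactly the trade-off discussed earlier in the thread.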
It looks like you have found a solution that you are happy with, and that is enough to close the issue. If anyone else runs into similar issues, please open a new issue. With more profiling data we would be happy to investigate how our implementation can be improved.
Which version of Duende IdentityServer are you using?
7.0.5
Which version of .NET are you using?
.NET 8
Describe the bug
https://github.com/DuendeSoftware/IdentityServer/blob/e9860c6488f90e8fbc11a4452b9dd111dbfae933/src/EntityFramework.Storage/TokenCleanup/TokenCleanupService.cs#L92
Fetches the whole PersistedGrant object for cleanup, which causes very expensive key lookups.
See execution plan: https://www.brentozar.com/pastetheplan/?id=SyUSxtbI0
This could be fixed by using the existing index
IX_PersistedGrants_Expiration
correctly, if only Id & Expiration are returned. But this requires a change to the public API of
IOperationalStoreNotification
since it works with an IEnumerable<PersistedGrant>.
The fix would be:
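A sketch of what the key-only variant could look like (a hypothetical illustration, not an actual patch; the real fix would also need the IOperationalStoreNotification signature change described above):

```csharp
// Hypothetical: select only Id, filtering on Expiration. Both columns are
// covered by IX_PersistedGrants_Expiration (the clustered key is implicitly
// part of every nonclustered index), so no key lookup is needed.
var expiredIds = await context.PersistedGrants
    .Where(g => g.Expiration < DateTime.UtcNow)
    .OrderBy(g => g.Expiration)
    .Select(g => g.Id)
    .Take(batchSize)
    .ToListAsync();

// Delete by primary key. Notification subscribers would receive ids instead
// of full entities, which is the breaking API change mentioned above.
await context.PersistedGrants
    .Where(g => expiredIds.Contains(g.Id))
    .ExecuteDeleteAsync();
```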
To Reproduce
Expected behavior
Cleanup should not produce an execution plan with a key lookup.
Log output/exception with stacktrace
N/A
Additional context
Adding include columns to the existing index would also solve this, but then it's just a copy of the whole table.
The query plan also contains the OPENJSON issue (DuendeSoftware/IdentityServer#1564), which additionally skews the estimated number of rows, even though the estimate should always equal the batch size. In our case it's not a big deal, but I could imagine this going wrong elsewhere.