-
Notifications
You must be signed in to change notification settings - Fork 294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #38003 - more efficient handle_acs_product_removal #11214
base: master
Are you sure you want to change the base?
Fixes #38003 - more efficient handle_acs_product_removal #11214
Conversation
I dont understand the failed tests:
should be independent.
That seems relevant, but I can not reproduce it - what does the test do differently than me? 1) create a product and a repo with feed URL, 2) create ACS referring to the product, 3) delete the repo - here, some extra debugs show me that Maybe this is a consequence of (unrelated?) error
? |
Test failures are unrelated, restarting them. There was an issue with remote_execution that caused those. |
repo_content_types.add(test_repo.content_type) if (repository.id != test_repo.id) && test_repo.url.present? | ||
end | ||
root_ids = product.repositories.where.not(:id => repository.id).pluck(:root_id).uniq | ||
repo_content_types = ::Katello::RootRepository.where(:id => root_ids).where.not(:url => [nil, '']).pluck(:content_type).uniq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
repo_content_types = ::Katello::RootRepository.where(:id => product.repositories.where.not(:id => repository.id).select(:root_id)).where.not(:url => [nil, '']).distinct.pluck(:content_type)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my original patch :) But wasnt sure about code readability here (or where to split the cmd into multiple lines).
Either patch is fine for me. Let me know if I shall update the PR per this suggestion.
Before PR: [root@ip-10-0-168-118 ~]# top
top - 14:31:25 up 22:44, 2 users, load average: 1.13, 1.17, 1.50
Tasks: 315 total, 2 running, 313 sleeping, 0 stopped, 0 zombie
%Cpu(s): 16.6 us, 0.3 sy, 0.0 ni, 83.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 23776.6 total, 8386.2 free, 14127.7 used, 1854.1 buff/cache
MiB Swap: 4096.0 total, 4096.0 free, 0.0 used. 9648.9 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
162874 foreman 20 0 9849.0m 10.7g 12032 S 100.0 40.9 15:40.76 rails
2024-11-13T14:38:54 [I|app|fedcbe8b] Completed 202 Accepted in 1107208ms (Views: 136.3ms | ActiveRecord: 26627.4ms | Allocations: 346926573) With @parthaa suggestion: |
Was the reason for this code existing tested as well? You would test it by deleting the last repository with an upstream URL in a product that belongs to a simplified ACS. The code should then remove the empty product from the ACS. |
Yes, I tested this scenario on a backport of the patch to my Sat6.15. I noticed log |
@pmoravec Are you able to push up a change with Partha's suggestion in it and rebase(that should fix the test failures) The improvement is so much faster. My satellite UI literally crashed because it took to long to return the task. |
9febdb8
to
b368813
Compare
Sure, here you are ;-) |
b368813
to
45b89be
Compare
Still same/similar errors in CI test..? :-o |
This looks like the test failure: Error: Failure: test_0009_plans ACS product removal when removing the deleting the last repo with URL
Expected: 0
Actual: 1 @ianballou do you know why that would have failed with the new DB query? |
I'm not sure just by looking at the query, the best way to find out would be to debug the test and see what's actually going wrong. |
I'm looking into it a bit, I think the issue might come down to the test data caching old root repository URLs. I was seeing unexpected results during debugging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some debugging, hopefully these results help a little:
# we need to check id because test_repo will still contain the old, non-nil url | ||
repo_content_types.add(test_repo.content_type) if (repository.id != test_repo.id) && test_repo.url.present? | ||
end | ||
repo_content_types = ::Katello::RootRepository.where(:id => product.repositories.where.not(:id => repository.id).select(:root_id)).where.not(:url => [nil, '']).distinct.pluck(:content_type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the failing test, the
::Katello::RootRepository.where(:id => product.repositories.where.not(:id => repository.id).select(:root_id))
part of this query is not filtering out the last root repository with a URL that is about to be destroyed:
[11] pry(#<Actions::Katello::Repository::Destroy>)> product.repositories.where.not(:id => repository.id).select(:root_id)
=> [#<Katello::Repository:0x00007f9e07c16248 id: nil, root_id: 146294567>,
#<Katello::Repository:0x00007f9e07c16180 id: nil, root_id: 355376805>,
#<Katello::Repository:0x00007f9e07c160b8 id: nil, root_id: 355376805>,
#<Katello::Repository:0x00007f9e07c15ff0 id: nil, root_id: 956784787>,
#<Katello::Repository:0x00007f9e07c15f28 id: nil, root_id: 355376805>,
#<Katello::Repository:0x00007f9e07c15e60 id: nil, root_id: 355376805>,
#<Katello::Repository:0x00007f9e07c15d98 id: nil, root_id: 355376805>,
#<Katello::Repository:0x00007f9e07c15cd0 id: nil, root_id: 355376805>]
[12] pry(#<Actions::Katello::Repository::Destroy>)> ::Katello::RootRepository.where(:id => product.repositories.where.not(:id => repository.id).select(:root_id)).pluck(:id)
=> [355376805, 956784787, 146294567]
Root repository with ID 355376805 is the one related to the repository being destroyed.
It remains in the query because it has a URL, so it is not getting removed from the ACS products.
I also edited the failing test to look like the following to be sure that the root repositories were having their URLs cleared properly:
it 'plans ACS product removal when removing the deleting the last repo with URL' do
::Katello::SmartProxyAlternateContentSource.create!(alternate_content_source_id: simplified_acs.id, smart_proxy_id: proxy.id, repository_id: published_repository.id)
action = create_action action_class
action.stubs(:action_subject).with(published_repository)
# manually remove the URLs from all repos in product except repository
repository.product.root_repositories.each do |root|
root.update_column(:url, nil) unless root.id == repository.root.id
end
url_sum = repository.product.root_repositories.count do |root|
root.url.present?
end
assert_equal(1, url_sum) # double check there's only one URL left
# Since there is only one URL remaining, the product should be removed
plan_action action, published_repository, remove_from_content_view_versions: true
simplified_acs.reload
assert_equal(0, simplified_acs.products.length)
end
end
This part of the test:
repository.product.repositories.each do |repo|
repo.root.url = nil unless repo.id == repository.id
end
I believe is flawed because two of those repositories could share a root, which could cause every root url to become nil.
I think the test needs to be analyzed and corrected first so we can be sure that it isn't actually catching a regression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qcjames53 might have some more context, it was a long time ago that this code was written though :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ianballou Thanks for bringing me in on this. Good catch with the possibility of two repositories sharing the same root in the test case. It looks like url_sum
always comes out to 1 with the test's data due to some wacky ruby shenanigans with referencing.
Another issue I found with some byebugging is that the root urls are never actually updated if you're coming at the test from the ::Katello::RootRepository
side of things, so the urls the destroy function tests against are still the old values. This isn't an issue in normal use, just a side effect of how I set up the test.
Here's a corrected version of the test. I luckily was able to repurpose the rhel_7 repo since it is the only repo pointing to the rhel_7 root repository:
class DestroyTest < TestBase
...
let(:published_rhel7_repository) { katello_repositories(:rhel_7_no_arch) }
...
it 'plans ACS product removal when deleting the last repo with URL' do
::Katello::SmartProxyAlternateContentSource.create!(alternate_content_source_id: simplified_acs.id, smart_proxy_id: proxy.id, repository_id: published_rhel7_repository.id)
action = create_action action_class
action.stubs(:action_subject).with(published_rhel7_repository)
# manually remove the URLs from all repos in product except repository
testing_repo = repository.product.repositories[4]
testing_repo.product.repositories.each do |repo|
repo.root.url = nil unless repo.id == testing_repo.id || repo.root.id == testing_repo.root.id
repo.root.save!(validate: false)
end
url_sum = testing_repo.product.repositories.count do |repo|
repo.root.url.present?
end
assert_equal(1, url_sum) # double check there's only one URL left
# Since there is only one URL remaining, the product should be removed
plan_action action, published_rhel7_repository, remove_from_content_view_versions: true
simplified_acs.reload
assert_equal(0, simplified_acs.products.length)
end
end
@@ -70,11 +70,7 @@ def handle_acs_product_removal(repository) | |||
# Remove products from ACS's that contain no repositories which both | |||
# match the ACS content type and have a non-nil URL | |||
product = repository.product | |||
repo_content_types = Set.new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This slow logic also exists in app/lib/actions/katello/repository/update.rb
when a user removes a product's url. Would you mind replacing the old loop with your new lookup there as well? There's a chance plans ACS product removal when removing the last URL for a product
will break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I updated the other file the same way.
45b89be
to
7ad9723
Compare
@pmoravec @qcjames53 it looks like the same test is failing :( |
Yeah.. The own manual test case "add product with last repo to ACS, delete the repo, check the ACS" against the current code (backported to a Satellite) passes to me. Is the test case updated in upstream? It is https://github.com/Katello/katello/blob/master/test/actions/katello/repository_test.rb#L210-L228 ? Also, should not I update the test on my branch? I mean, what test is executed by the CI jobs - from the given PR or from upstream? |
Just to be super safe, does it also work if you have two yum repositories in a product with an ACS, one with a URL and one without a URL, and you delete the repo with the URL? |
You can replace the failing test with the fixed one that @qcjames53 provided in the comments above. The test CI runs the tests that are in your own branch used in this PR. |
7ad9723
to
3a63647
Compare
Running test:
I got:
So here the ACS<->Product relation was properly removed. I updated the test in my PR as well, let see now.. |
I dont want to fight for my change, it can be possible I have a mistake there, just a thought about the slight difference in checks. Original check applied test:
for each repository individually. My approach applies the
and that is the same like my check does, just in different ordering of conditions. I replaced About the modified test case: what is the reference to |
With the testing being done I'm pretty confident that your PR is okay. Since your fix here brings high value, if we can't get the test working in a couple more days I suggest we temporarily disable it. We can get the test working again after the fact to release you Pavel :) |
|
||
# manually remove the URLs from all repos in product except repository | ||
repository.product.repositories.each do |repo| | ||
repo.root.update!(url: nil) unless repo.id == repository.id | ||
testing_repo = repository.product.repositories[4] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qcjames53 I'm not sure if the repository at index [4] will consistently be the same one, we'd probably need to look up by name instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually.... it looks like this was applied to the wrong test? The failing test is plans ACS product removal when removing the deleting the last repo with URL
. I'll see about updating that other test to see if it helps.
@pmoravec this diff should fix all of the tests:
|
Updating also relevant test per qcjames53's and ianballou's comments. Signed-off-by: Pavel Moravec <[email protected]>
3a63647
to
76a17a8
Compare
Cool, that helped, thanks! Maybe I copy&pasted wrong patch from the discussion, sorry for the confusion. Now just rpm-build on RHEL8 is failing - I think due to this:
but I have no idea how to resolve it (though it should be independent on this PR). |
We don't need the EL8 to pass, those are gone now and removed yesterday. Will give it another test and if it looks good will merge it in. Thanks @ianballou @qcjames53 for helping Pavel getting the tests passing. |
What are the changes introduced in this pull request?
Simplified calculation of content types that remain in the product after the repo removal. One (or two) ActiveRecord queries are much faster and much less memory demanding than traversing potentially hundreds or thousands of repos in the product.
Considerations taken when implementing this change?
The PR was tested in one call flow only (CV version deletion).
What are the testing steps for this pull request?
See the foreman issue.