Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #38003 - more efficient handle_acs_product_removal #11214

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

pmoravec
Copy link
Member

What are the changes introduced in this pull request?

Simplified calculation of content types that remain in the product after the repo removal. One (or two) ActiveRecord queries are much faster and much less memory demanding than traversing potentially hundreds or thousands of repos in the product.

Considerations taken when implementing this change?

The PR was tested in one call flow only (CV version deletion).

What are the testing steps for this pull request?

See the foreman issue.

@pmoravec
Copy link
Member Author

I dont understand the failed tests:

Error: Failure: test_0804_route api/v2/job_invocations/hosts should have a permission that grants access

should be independent.

Error: Failure: test_0009_plans ACS product removal when removing the deleting the last repo with URL

That seems relevant, but I can not reproduce it - what does the test do differently than me? 1) create a product and a repo with feed URL, 2) create ACS referring to the product, 3) delete the repo - here, some extra debugs show me that repo_content_types=[] (originally it was <Set: {}>, but inclusion test works the same for lists and sets) and also I do see Removing product zoo_acs with ID 352 from ACS acs_2 with ID 2 log.

Maybe this is a consequence of (unrelated?) error

ERROR: Missing required option for --name HTTP_PROXY_NAMEERROR: Missing required option for --url HTTP_PROXY_URLDEPRECATED: global use of must_include from katello/test/actions/katello/repository_set_test.rb:97. Use _(obj).must_include instead. This will fail in Minitest 6.

?

@chris1984
Copy link
Member

I dont understand the failed tests:

Error: Failure: test_0804_route api/v2/job_invocations/hosts should have a permission that grants access

should be independent.

Error: Failure: test_0009_plans ACS product removal when removing the deleting the last repo with URL

That seems relevant, but I can not reproduce it - what does the test do differently than me? 1) create a product and a repo with feed URL, 2) create ACS referring to the product, 3) delete the repo - here, some extra debugs show me that repo_content_types=[] (originally it was <Set: {}>, but inclusion test works the same for lists and sets) and also I do see Removing product zoo_acs with ID 352 from ACS acs_2 with ID 2 log.

Maybe this is a consequence of (unrelated?) error

ERROR: Missing required option for --name HTTP_PROXY_NAMEERROR: Missing required option for --url HTTP_PROXY_URLDEPRECATED: global use of must_include from katello/test/actions/katello/repository_set_test.rb:97. Use _(obj).must_include instead. This will fail in Minitest 6.

?

Test failures are unrelated, restarting them. There was an issue with remote_execution that caused those.

repo_content_types.add(test_repo.content_type) if (repository.id != test_repo.id) && test_repo.url.present?
end
root_ids = product.repositories.where.not(:id => repository.id).pluck(:root_id).uniq
repo_content_types = ::Katello::RootRepository.where(:id => root_ids).where.not(:url => [nil, '']).pluck(:content_type).uniq
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

 repo_content_types = ::Katello::RootRepository.where(:id => product.repositories.where.not(:id => repository.id).select(:root_id)).where.not(:url => [nil, '']).distinct.pluck(:content_type)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my original patch :) But wasnt sure about code readability here (or where to split the cmd into multiple lines).

Either patch is fine for me. Let me know if I shall update the PR per this suggestion.

@chris1984
Copy link
Member

chris1984 commented Nov 13, 2024

Before PR:

[root@ip-10-0-168-118 ~]# top
top - 14:31:25 up 22:44,  2 users,  load average: 1.13, 1.17, 1.50
Tasks: 315 total,   2 running, 313 sleeping,   0 stopped,   0 zombie
%Cpu(s): 16.6 us,  0.3 sy,  0.0 ni, 83.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  23776.6 total,   8386.2 free,  14127.7 used,   1854.1 buff/cache
MiB Swap:   4096.0 total,   4096.0 free,      0.0 used.   9648.9 avail Mem
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 162874 foreman   20   0 9849.0m   10.7g  12032 S 100.0  40.9  15:40.76 rails

 2024-11-13T14:38:54 [I|app|fedcbe8b] Completed 202 Accepted in 1107208ms (Views: 136.3ms | ActiveRecord: 26627.4ms | Allocations: 346926573)

With @parthaa suggestion:
2024-11-13T10:59:58 [I|app|dfe43680] Completed 202 Accepted in 49289ms (Views: 492.7ms | ActiveRecord: 3497.9ms | Allocations: 23707580)

@ianballou
Copy link
Member

Was the reason for this code existing tested as well? You would test it by deleting the last repository with an upstream URL in a product that belongs to a simplified ACS. The code should then remove the empty product from the ACS.

@pmoravec
Copy link
Member Author

Was the reason for this code existing tested as well? You would test it by deleting the last repository with an upstream URL in a product that belongs to a simplified ACS. The code should then remove the empty product from the ACS.

Yes, I tested this scenario on a backport of the patch to my Sat6.15. I noticed log Removing product zoo_acs with ID 352 from ACS acs_2 with ID 2 and the ACS has empty list of products now (it had just the one product before).

@chris1984
Copy link
Member

@pmoravec Are you able to push up a change with Partha's suggestion in it and rebase(that should fix the test failures) The improvement is so much faster. My satellite UI literally crashed because it took to long to return the task.

@chris1984 chris1984 self-assigned this Nov 13, 2024
@pmoravec pmoravec force-pushed the efficient-handle_acs_product_removal branch from 9febdb8 to b368813 Compare November 13, 2024 21:27
@pmoravec
Copy link
Member Author

Sure, here you are ;-)

@pmoravec pmoravec force-pushed the efficient-handle_acs_product_removal branch from b368813 to 45b89be Compare November 14, 2024 07:13
@pmoravec
Copy link
Member Author

Still same/similar errors in CI test..? :-o

@chris1984
Copy link
Member

This looks like the test failure:

Error: Failure: test_0009_plans ACS product removal when removing the deleting the last repo with URL

Expected: 0
  Actual: 1

@ianballou do you know why that would have failed with the new DB query?

@ianballou
Copy link
Member

This looks like the test failure:

Error: Failure: test_0009_plans ACS product removal when removing the deleting the last repo with URL

Expected: 0
  Actual: 1

@ianballou do you know why that would have failed with the new DB query?

I'm not sure just by looking at the query, the best way to find out would be to debug the test and see what's actually going wrong.

@ianballou
Copy link
Member

I'm looking into it a bit, I think the issue might come down to the test data caching old root repository URLs. I was seeing unexpected results during debugging.

Copy link
Member

@ianballou ianballou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some debugging, hopefully these results help a little:

# we need to check id because test_repo will still contain the old, non-nil url
repo_content_types.add(test_repo.content_type) if (repository.id != test_repo.id) && test_repo.url.present?
end
repo_content_types = ::Katello::RootRepository.where(:id => product.repositories.where.not(:id => repository.id).select(:root_id)).where.not(:url => [nil, '']).distinct.pluck(:content_type)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the failing test, the

::Katello::RootRepository.where(:id => product.repositories.where.not(:id => repository.id).select(:root_id))

part of this query is not filtering out the last root repository with a URL that is about to be destroyed:

[11] pry(#<Actions::Katello::Repository::Destroy>)> product.repositories.where.not(:id => repository.id).select(:root_id)
=> [#<Katello::Repository:0x00007f9e07c16248 id: nil, root_id: 146294567>,
 #<Katello::Repository:0x00007f9e07c16180 id: nil, root_id: 355376805>,
 #<Katello::Repository:0x00007f9e07c160b8 id: nil, root_id: 355376805>,
 #<Katello::Repository:0x00007f9e07c15ff0 id: nil, root_id: 956784787>,
 #<Katello::Repository:0x00007f9e07c15f28 id: nil, root_id: 355376805>,
 #<Katello::Repository:0x00007f9e07c15e60 id: nil, root_id: 355376805>,
 #<Katello::Repository:0x00007f9e07c15d98 id: nil, root_id: 355376805>,
 #<Katello::Repository:0x00007f9e07c15cd0 id: nil, root_id: 355376805>]
[12] pry(#<Actions::Katello::Repository::Destroy>)> ::Katello::RootRepository.where(:id => product.repositories.where.not(:id => repository.id).select(:root_id)).pluck(:id)
=> [355376805, 956784787, 146294567]

Root repository with ID 355376805 is the one related to the repository being destroyed.
It remains in the query because it has a URL, so it is not getting removed from the ACS products.

I also edited the failing test to look like the following to be sure that the root repositories were having their URLs cleared properly:

    it 'plans ACS product removal when removing the deleting the last repo with URL' do
      ::Katello::SmartProxyAlternateContentSource.create!(alternate_content_source_id: simplified_acs.id, smart_proxy_id: proxy.id, repository_id: published_repository.id)
      action = create_action action_class
      action.stubs(:action_subject).with(published_repository)

      # manually remove the URLs from all repos in product except repository
      repository.product.root_repositories.each do |root|
        root.update_column(:url, nil) unless root.id == repository.root.id
      end
      url_sum = repository.product.root_repositories.count do |root|
        root.url.present?
      end
      assert_equal(1, url_sum) # double check there's only one URL left

      # Since there is only one URL remaining, the product should be removed
      plan_action action, published_repository, remove_from_content_view_versions: true
      simplified_acs.reload
      assert_equal(0, simplified_acs.products.length)
    end
  end

This part of the test:

      repository.product.repositories.each do |repo|
        repo.root.url = nil unless repo.id == repository.id
      end

I believe is flawed because two of those repositories could share a root, which could cause every root url to become nil.

I think the test needs to be analyzed and corrected first so we can be sure that it isn't actually catching a regression.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qcjames53 might have some more context, it was a long time ago that this code was written though :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ianballou Thanks for bringing me in on this. Good catch with the possibility of two repositories sharing the same root in the test case. It looks like url_sum always comes out to 1 with the test's data due to some wacky ruby shenanigans with referencing.

Another issue I found with some byebugging is that the root urls are never actually updated if you're coming at the test from the ::Katello::RootRepository side of things, so the urls the destroy function tests against are still the old values. This isn't an issue in normal use, just a side effect of how I set up the test.

Here's a corrected version of the test. I luckily was able to repurpose the rhel_7 repo since it is the only repo pointing to the rhel_7 root repository:

  class DestroyTest < TestBase
...
    let(:published_rhel7_repository) { katello_repositories(:rhel_7_no_arch) }
...
    it 'plans ACS product removal when deleting the last repo with URL' do
      ::Katello::SmartProxyAlternateContentSource.create!(alternate_content_source_id: simplified_acs.id, smart_proxy_id: proxy.id, repository_id: published_rhel7_repository.id)
      action = create_action action_class
      action.stubs(:action_subject).with(published_rhel7_repository)

      # manually remove the URLs from all repos in product except repository
      testing_repo = repository.product.repositories[4]
      testing_repo.product.repositories.each do |repo|
        repo.root.url = nil unless repo.id == testing_repo.id || repo.root.id == testing_repo.root.id
        repo.root.save!(validate: false)
      end
      url_sum = testing_repo.product.repositories.count do |repo|
        repo.root.url.present?
      end
      assert_equal(1, url_sum) # double check there's only one URL left

      # Since there is only one URL remaining, the product should be removed
      plan_action action, published_rhel7_repository, remove_from_content_view_versions: true
      simplified_acs.reload
      assert_equal(0, simplified_acs.products.length)
    end
  end

@@ -70,11 +70,7 @@ def handle_acs_product_removal(repository)
# Remove products from ACS's that contain no repositories which both
# match the ACS content type and have a non-nil URL
product = repository.product
repo_content_types = Set.new
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This slow logic also exists in app/lib/actions/katello/repository/update.rb when a user removes a product's url. Would you mind replacing the old loop with your new lookup there as well? There's a chance plans ACS product removal when removing the last URL for a product will break.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I updated the other file the same way.

@pmoravec pmoravec force-pushed the efficient-handle_acs_product_removal branch from 45b89be to 7ad9723 Compare November 15, 2024 18:30
@chris1984
Copy link
Member

@pmoravec @qcjames53 it looks like the same test is failing :(

@pmoravec
Copy link
Member Author

pmoravec commented Nov 18, 2024

Yeah.. The own manual test case "add product with last repo to ACS, delete the repo, check the ACS" against the current code (backported to a Satellite) passes to me.

Is the test case updated in upstream? It is https://github.com/Katello/katello/blob/master/test/actions/katello/repository_test.rb#L210-L228 ?

Also, should not I update the test on my branch? I mean, what test is executed by the CI jobs - from the given PR or from upstream?

@ianballou
Copy link
Member

Yeah.. The own manual test case "add product with last repo to ACS, delete the repo, check the ACS" against the current code (backported to a Satellite) passes to me.

Just to be super safe, does it also work if you have two yum repositories in a product with an ACS, one with a URL and one without a URL, and you delete the repo with the URL?

@ianballou
Copy link
Member

Also, should not I update the test on my branch? I mean, what test is executed by the CI jobs - from the given PR or from upstream?

You can replace the failing test with the fixed one that @qcjames53 provided in the comments above. The test CI runs the tests that are in your own branch used in this PR.

@pmoravec pmoravec force-pushed the efficient-handle_acs_product_removal branch from 7ad9723 to 3a63647 Compare November 19, 2024 07:54
@pmoravec
Copy link
Member Author

pmoravec commented Nov 19, 2024

Just to be super safe, does it also work if you have two yum repositories in a product with an ACS, one with a URL and one without a URL, and you delete the repo with the URL?

Running test:

hammer product create --organization-id 1 --name acs_test_product
hammer repository create --organization-id 1 --product acs_test_product --name zoo_repo --content-type yum --url https://repos.fedorapeople.org/repos/pulp/pulp/demo_repos/zoo/
hammer repository synchronize --organization-id 1 --product acs_test_product --name zoo_repo
hammer repository create --organization-id 1 --product acs_test_product --name dummy_repo --content-type yum
productid=$(hammer product info --organization-id 1 --name acs_test_product | head -n1 | awk '{ print $2 }')
hammer alternate-content-source create --name acs_repodel_test --alternate-content-source-type simplified --product-ids $productid
hammer alternate-content-source show --name acs_repodel_test
hammer repository delete --organization-id 1 --product acs_test_product --name zoo_repo
hammer alternate-content-source show --name acs_repodel_test

I got:

Product created.
Repository created.
[.....................................................................................................................................................................................] [100%]
Added Rpms: 32, Errata: 4
Total steps: 84/84
--------------------------------
Associating Content: 39/39
Downloading Artifacts: 0/0
Downloading Metadata Files: 6/6
Parsed Advisories: 4/4
Parsed Comps: 3/3
Parsed Packages: 32/32
Skipping Packages: 0/0
Un-Associating Content: 0/0
Repository created.
Alternate Content Source created.
ID:                            4
Name:                          acs_repodel_test
Label:                         acs_repodel_test
Content type:                  yum
Alternate content source type: simplified
Products:                      
 1) Id:              355
    Organization ID: 1
    Name:            acs_test_product
    Label:           acs_test_product
Smart proxies:

Repository deleted.
ID:                            4
Name:                          acs_repodel_test
Label:                         acs_repodel_test
Content type:                  yum
Alternate content source type: simplified
Products:                      

Smart proxies:

So here the ACS<->Product relation was properly removed.

I updated the test in my PR as well, let see now..

@pmoravec
Copy link
Member Author

I dont want to fight for my change, it can be possible I have a mistake there, just a thought about the slight difference in checks. Original check applied test:

.. if (repository.id != test_repo.id) && test_repo.url.present?

for each repository individually. My approach applies the has repo its URL?" per product. BUT since repository's urlmethod is delegated to:root`, the above code is equivalent to:

.. if (repository.id != test_repo.id) && test_repo.root_repository.url.present?

and that is the same like my check does, just in different ordering of conditions.

I replaced .present? method by .where.not(:url => [nil, '']) that is also slightly different (" ".present? or [].present? is evaluated to false but my code ignores that) - but the (original and also updated) test uses nil, as well as katello code in general (I think)..?

About the modified test case: what is the reference to published_rhel7_repository? I can find it in proposed test but nowhere else..?

@ianballou
Copy link
Member

With the testing being done I'm pretty confident that your PR is okay. Since your fix here brings high value, if we can't get the test working in a couple more days I suggest we temporarily disable it. We can get the test working again after the fact to release you Pavel :)


# manually remove the URLs from all repos in product except repository
repository.product.repositories.each do |repo|
repo.root.update!(url: nil) unless repo.id == repository.id
testing_repo = repository.product.repositories[4]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qcjames53 I'm not sure if the repository at index [4] will consistently be the same one, we'd probably need to look up by name instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually.... it looks like this was applied to the wrong test? The failing test is plans ACS product removal when removing the deleting the last repo with URL. I'll see about updating that other test to see if it helps.

@ianballou
Copy link
Member

@pmoravec this diff should fix all of the tests:

diff --git a/test/actions/katello/repository_test.rb b/test/actions/katello/repository_test.rb
index baa80b7774..112112217f 100644
--- a/test/actions/katello/repository_test.rb
+++ b/test/actions/katello/repository_test.rb
@@ -208,23 +208,21 @@ module ::Actions::Katello::Repository
     end
 
     it 'plans ACS product removal when removing the last URL for a product' do
-      ::Katello::SmartProxyAlternateContentSource.create!(alternate_content_source_id: simplified_acs.id, smart_proxy_id: proxy.id, repository_id: published_rhel7_repository.id)
+      ::Katello::SmartProxyAlternateContentSource.create!(alternate_content_source_id: simplified_acs.id, smart_proxy_id: proxy.id, repository_id: repository.id)
       action = create_action action_class
-      action.stubs(:action_subject).with(published_rhel7_repository)
+      action.stubs(:action_subject).with(repository)
 
       # manually remove the URLs from all repos in product except repository
-      testing_repo = repository.product.repositories[4]
-      testing_repo.product.repositories.each do |repo|
-        repo.root.url = nil unless repo.id == testing_repo.id || repo.root.id == testing_repo.root.id
-        repo.root.save!(validate: false)
+      repository.product.repositories.each do |repo|
+        repo.root.update!(url: nil) unless repo.id == repository.id
       end
-      url_sum = testing_repo.product.repositories.count do |repo|
+      url_sum = repository.product.repositories.count do |repo|
         repo.root.url.present?
       end
       assert_equal(1, url_sum) # double check there's only one URL left
 
       # Since there is only one URL remaining, the product should be removed
-      plan_action action, published_rhel7_repository, remove_from_content_view_versions: true
+      plan_action action, repository.root, :url => nil
       simplified_acs.reload
       assert_equal(0, simplified_acs.products.length)
     end
@@ -238,6 +236,7 @@ module ::Actions::Katello::Repository
     let(:published_repository) { katello_repositories(:rhel_6_x86_64) }
     let(:published_fedora_repository) { katello_repositories(:fedora_17_x86_64) }
     let(:simplified_acs) { katello_alternate_content_sources(:yum_alternate_content_source) }
+    let(:published_rhel7_repository) { katello_repositories(:rhel_7_no_arch) }
     def setup
       simplified_acs.products << published_repository.product
       ::Katello::SmartProxyAlternateContentSource.create!(alternate_content_source_id: simplified_acs.id, smart_proxy_id: proxy.id)
@@ -380,22 +379,24 @@ module ::Actions::Katello::Repository
       assert_not_equal(0, simplified_acs.products.length)
     end
 
-    it 'plans ACS product removal when removing the deleting the last repo with URL' do
-      ::Katello::SmartProxyAlternateContentSource.create!(alternate_content_source_id: simplified_acs.id, smart_proxy_id: proxy.id, repository_id: published_repository.id)
+    it 'plans ACS product removal when deleting the last repo with URL' do
+      ::Katello::SmartProxyAlternateContentSource.create!(alternate_content_source_id: simplified_acs.id, smart_proxy_id: proxy.id, repository_id: published_rhel7_repository.id)
       action = create_action action_class
-      action.stubs(:action_subject).with(published_repository)
+      action.stubs(:action_subject).with(published_rhel7_repository)
 
       # manually remove the URLs from all repos in product except repository
-      repository.product.repositories.each do |repo|
-        repo.root.url = nil unless repo.id == repository.id
+      testing_repo = repository.product.repositories[4]
+      testing_repo.product.repositories.each do |repo|
+        repo.root.url = nil unless repo.id == testing_repo.id || repo.root.id == testing_repo.root.id
+        repo.root.save!(validate: false)
       end
-      url_sum = repository.product.repositories.count do |repo|
+      url_sum = testing_repo.product.repositories.count do |repo|
         repo.root.url.present?
       end
       assert_equal(1, url_sum) # double check there's only one URL left
 
       # Since there is only one URL remaining, the product should be removed
-      plan_action action, published_repository, remove_from_content_view_versions: true
+      plan_action action, published_rhel7_repository, remove_from_content_view_versions: true
       simplified_acs.reload
       assert_equal(0, simplified_acs.products.length)
     end

Updating also relevant test per qcjames53's and ianballou's comments.

Signed-off-by: Pavel Moravec <[email protected]>
@pmoravec pmoravec force-pushed the efficient-handle_acs_product_removal branch from 3a63647 to 76a17a8 Compare November 22, 2024 07:36
@pmoravec
Copy link
Member Author

Cool, that helped, thanks! Maybe I copy&pasted wrong patch from the discussion, sorry for the confusion.

Now just rpm-build on RHEL8 is failing - I think due to this:

No matching package to install: 'foreman-assets >= 3.14'
No matching package to install: 'foreman-plugin >= 3.14'
Not all dependencies satisfied
Error: Some packages could not be found.

but I have no idea how to resolve it (though it should be independent on this PR).

@chris1984
Copy link
Member

We don't need the EL8 to pass, those are gone now and removed yesterday. Will give it another test and if it looks good will merge it in. Thanks @ianballou @qcjames53 for helping Pavel getting the tests passing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants