Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Django's image admin view times out #5174

Open
krysal opened this issue Nov 19, 2024 · 1 comment · Fixed by #5175
Open

Django's image admin view times out #5174

krysal opened this issue Nov 19, 2024 · 1 comment · Fixed by #5175
Assignees
Labels
💻 aspect: code Concerns the software code in the repository 🛠 goal: fix Bug fix 🟧 priority: high Stalls work on the project or its dependents 🧱 stack: api Related to the Django API

Comments

@krysal
Copy link
Member

krysal commented Nov 19, 2024

Description

The /admin/api/image/ view doesn't load with production data. You can confirm this by trying to access in staging (duplicates production data) if you have access: https://api-staging.openverse.org/admin/api/image/

I wondered if this was caused due to a heavy DB query and got what was being executed locally (see below). I used the same query in Grafana against the staging and production databases, and the result was returned immediately, so a problem at the DB level was discarded.

query
SELECT "image"."id", "image"."created_on", "image"."updated_on", "image"."identifier", "image"."foreign_identifier", "image"."title", 
    "image"."foreign_landing_url", "image"."creator", "image"."creator_url", "image"."thumbnail", "image"."provider", "image"."url",
    "image"."filesize", "image"."filetype", "image"."watermarked", "image"."license", "image"."license_version", "image"."source",
    "image"."last_synced_with_source", "image"."removed_from_source", "image"."view_count", "image"."tags", "image"."category",
    "image"."meta_data", "image"."width", "image"."height", COUNT("nsfw_reports"."id") AS "total_report_count",
    (COUNT("nsfw_reports"."id") - COUNT("nsfw_reports"."decision_id")) AS "pending_report_count",
    MIN("nsfw_reports"."created_at") AS "oldest_report_date"
FROM "image"
    INNER JOIN "nsfw_reports" ON ("image"."identifier" = "nsfw_reports"."identifier")
WHERE "nsfw_reports"."id" IS NOT NULL
GROUP BY "image"."id"
    HAVING (COUNT("nsfw_reports"."id") - COUNT("nsfw_reports"."decision_id")) > 0
ORDER BY 27 DESC, 28 DESC, 29 ASC, "image"."id" DESC;

The query returns a count of 458, which the admin might be trying to load all at once, causing a rendering issue. Instead, the audio view (admin/api/audio/), with only 38 audio tracks reported, loads fine. We should limit the number of media items shown on these pages and paginate them.

Reproduction

  1. Login into https://api-staging.openverse.org/admin
  2. Go to the view https://api-staging.openverse.org/admin/api/image/
  3. See error.

Screenshots

The image view when trying to access it.

The audio view loading fine.

@krysal krysal added 💻 aspect: code Concerns the software code in the repository 🛠 goal: fix Bug fix 🟧 priority: high Stalls work on the project or its dependents 🧱 stack: api Related to the Django API labels Nov 19, 2024
@openverse-bot openverse-bot moved this to 📋 Backlog in Openverse Backlog Nov 19, 2024
@krysal krysal self-assigned this Nov 20, 2024
@openverse-bot openverse-bot moved this from 📋 Backlog to 📅 To Do in Openverse Backlog Nov 20, 2024
@openverse-bot openverse-bot moved this from 📅 To Do to 🏗 In Progress in Openverse Backlog Nov 20, 2024
@openverse-bot openverse-bot moved this from 🏗 In Progress to ✅ Done in Openverse Backlog Nov 22, 2024
@krysal
Copy link
Member Author

krysal commented Nov 22, 2024

The #5175 PR did not fix the issue, so I'm reopening it.


Edit: My next guess is that the issue can originate from the filters. Looking again at what occurs during these requests, I found the following query:

SELECT DISTINCT "audio"."source" FROM "audio" ORDER BY "audio"."source" ASC

Notices the DISTINCT, which is extremely resource-consuming for the image table with nearly 1 billion rows. This is auto-generated by Django from these lines:

def get_list_filter(self, request):
return (
"source",
"provider",
get_pending_record_filter(self.media_type),
)

The audio table can easily handle it (for now) but it isn't feasible for images. Luckily, we can easily fix it, given we know exactly how many sources there are and which ones they are. We only need to query with the ContentSource model.

@krysal krysal reopened this Nov 22, 2024
@openverse-bot openverse-bot moved this from ✅ Done to 📋 Backlog in Openverse Backlog Nov 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
💻 aspect: code Concerns the software code in the repository 🛠 goal: fix Bug fix 🟧 priority: high Stalls work on the project or its dependents 🧱 stack: api Related to the Django API
Projects
Status: 📋 Backlog
Development

Successfully merging a pull request may close this issue.

1 participant