Error warming up cache: Error 302 #27160

Open · Cerberus112 opened this issue Feb 19, 2024 · 8 comments

Cerberus112 commented Feb 19, 2024

Bug description

Cache warm-up does not work when configured with the latest version (3.1.1rc1) or the previous one (3.1.0) in a Kubernetes environment (with Helm chart version 0.2.15 or earlier).

When the task is triggered, the Superset worker logs show a 308 error when requesting the API endpoint.

Note: reports are working correctly on the same worker.

How to reproduce the bug

  1. Apply the cache warm-up config in a Kubernetes environment
  2. Review the logs of the Superset worker

Screenshots/recordings

No response

Superset version

master / latest-dev

Python version

3.9

Node version

16

Browser

Chrome

Additional context

The values.yaml (cache warm-up config):

  celery_conf: |
    from celery.schedules import crontab
    class CeleryConfig:
      broker_url = f"redis://{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
      imports = (
          "superset.sql_lab",
          "superset.tasks.cache",
          "superset.tasks.scheduler",
      )
      result_backend = f"redis://{env('REDIS_HOST')}:{env('REDIS_PORT')}/0"
      task_annotations = {
          "sql_lab.get_sql_results": {
              "rate_limit": "100/s",
          },
      }
      beat_schedule = {
          "reports.scheduler": {
              "task": "reports.scheduler",
              "schedule": crontab(minute="*", hour="*"),
          },
          "reports.prune_log": {
              "task": "reports.prune_log",
              'schedule': crontab(minute=0, hour=0),
          },
          'cache-warmup-hourly': {
              "task": "cache-warmup",
              "schedule": crontab(minute="*/2", hour="*"), ## for testing
              "kwargs": {
                  "strategy_name": "dummy"
              },
          }
      }
    CELERY_CONFIG = CeleryConfig
    THUMBNAIL_SELENIUM_USER = "admin"

Superset worker logs:

[2024-02-19 14:26:00,227: INFO/ForkPoolWorker-1] fetch_url[ecc6c59f-1a81-472c-bb3c-25daf1ccb203]: Fetching http://url.of.my.site/superset/warm_up_cache/ with payload {"chart_id": 43}
[2024-02-19 14:22:00,263: ERROR/ForkPoolWorker-3] fetch_url[ecc6c59f-1a81-472c-bb3c-25daf1ccb203]: Error warming up cache!
Traceback (most recent call last):
  File "/app/superset/tasks/cache.py", line 242, in fetch_url
    response = request.urlopen(  # pylint: disable=consider-using-with
  File "/usr/local/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/local/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/local/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/usr/local/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 308: Permanent Redirect

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.
craig-rueda (Member) commented:

Could be an issue with the worker's webdriver config or something. Unclear from the logs as to what the initial cause of the 308 is.

Cerberus112 (Author) commented:

Could be an issue with the worker's webdriver config or something. Unclear from the logs as to what the initial cause of the 308 is.

I also considered this possibility, but the reports are being sent correctly, and they utilize the webdriver. For example:

[2024-02-21 09:32:00,077: INFO/ForkPoolWorker-1] Scheduling alert test_report eta: 2024-02-21 09:32:00
Executing alert/report, task id: 4a38bbbe-0f97-4031-afb1-6829668a754b, scheduled_dttm: 2024-02-21T09:32:00
[2024-02-21 09:32:00,082: INFO/ForkPoolWorker-1] Executing alert/report, task id: 4a38bbbe-0f97-4031-afb1-6829668a754b, scheduled_dttm: 2024-02-21T09:32:00
session is validated: id 9, executionid: 4a38bbbe-0f97-4031-afb1-6829668a754b
[2024-02-21 09:32:00,083: INFO/ForkPoolWorker-1] session is validated: id 9, executionid: 4a38bbbe-0f97-4031-afb1-6829668a754b
Running report schedule 4a38bbbe-0f97-4031-afb1-6829668a754b as user admin
[2024-02-21 09:32:00,116: INFO/ForkPoolWorker-1] Running report schedule 4a38bbbe-0f97-4031-afb1-6829668a754b as user admin
Report sent to email, notification content is {'notification_type': 'Report', 'notification_source': <ReportSourceFormat.DASHBOARD: 'dashboard'>, 'notification_format': 'PNG', 'chart_id': None, 'dashboard_id': 3, 'owners': [Superset Admin]}
[2024-02-21 09:32:14,119: INFO/ForkPoolWorker-1] Report sent to email, notification content is {'notification_type': 'Report', 'notification_source': <ReportSourceFormat.DASHBOARD: 'dashboard'>, 'notification_format': 'PNG', 'chart_id': None, 'dashboard_id': 3, 'owners': [Superset Admin]}

Anyway, I'll also include the configuration of the Superset worker:

supersetWorker:
  affinity: {}
  autoscaling:
    enabled: false
    maxReplicas: 100
    minReplicas: 1
    targetCPUUtilizationPercentage: 80
  command:
  - /bin/sh
  - -c
  - |
    # Install chrome webdriver
    # See https://github.com/apache/superset/blob/4fa3b6c7185629b87c27fc2c0e5435d458f7b73d/docs/src/pages/docs/installation/email_reports.mdx
    apt-get update
    apt-get install wget unzip zip -y
    wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
    apt-get install -y --no-install-recommends ./google-chrome-stable_current_amd64.deb
    wget https://edgedl.me.gvt1.com/edgedl/chrome/chrome-for-testing/121.0.6167.85/linux64/chromedriver-linux64.zip
    #unzip chromedriver_linux64.zip
    #chmod +x chromedriver
    #mv chromedriver /usr/bin
    unzip chromedriver-linux64.zip
    chmod +x chromedriver-linux64/chromedriver
    mv chromedriver-linux64/chromedriver /usr/bin
    apt-get autoremove -yqq --purge
    apt-get clean
    #rm -f google-chrome-stable_current_amd64.deb chromedriver-linux64.zip

    # Run
    . {{ .Values.configMountPath }}/superset_bootstrap.sh; celery --app=superset.tasks.celery_app:app worker
  containerSecurityContext: {}
  deploymentAnnotations: {}
  deploymentLabels: {}
  extraContainers: []
  forceReload: false
  initContainers:
  - command:
    - /bin/sh
    - -c
    - dockerize -wait "tcp://$DB_HOST:$DB_PORT" -wait "tcp://$REDIS_HOST:$REDIS_PORT"
      -timeout 120s
    envFrom:
    - secretRef:
        name: '{{ tpl .Values.envFromSecret . }}'
    image: '{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}'
    imagePullPolicy: '{{ .Values.initImage.pullPolicy }}'
    name: wait-for-postgres-redis
  livenessProbe:
    exec:
      command:
      - sh
      - -c
      - celery -A superset.tasks.celery_app:app inspect ping -d celery@$HOSTNAME
    failureThreshold: 3
    initialDelaySeconds: 120
    periodSeconds: 60
    successThreshold: 1
    timeoutSeconds: 60
  podAnnotations: {}
  podLabels: {}
  podSecurityContext: {}
  readinessProbe: {}
  replicaCount: 1
  resources: {}
  startupProbe: {}
  strategy: {}
  topologySpreadConstraints: []

and the worker's startup logs:

Saving to: ‘chromedriver-linux64.zip’
Archive:  chromedriver-linux64.zip
  inflating: chromedriver-linux64/LICENSE.chromedriver  
  inflating: chromedriver-linux64/chromedriver  
logging was configured successfully
2024-02-21 09:21:34,875:INFO:superset.utils.logging_configurator:logging was configured successfully
2024-02-21 09:21:34,878:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>
/usr/local/lib/python3.9/site-packages/flask_limiter/extension.py:293: UserWarning: Using the in-memory storage for tracking rate limits as no storage was explicitly specified. This is not recommended for production use. See: https://flask-limiter.readthedocs.io#configuring-a-storage-backend for documentation about configuring the storage backend.
  warnings.warn(
/usr/local/lib/python3.9/site-packages/celery/platforms.py:840: SecurityWarning: You're running the worker with superuser privileges: this is
absolutely not recommended!

Please specify a different user using the --uid option.

User information: uid=0 euid=0 gid=0 egid=0

  warnings.warn(SecurityWarning(ROOT_DISCOURAGED.format(
Loaded your LOCAL configuration at [/app/pythonpath/superset_config.py]
 
 -------------- celery@superset-worker-7db568d57c-dht8w v5.2.2 (dawn-chorus)
--- ***** ----- 
-- ******* ---- Linux-3.10.0-1160.71.1.el7.x86_64-x86_64-with-glibc2.36 2024-02-21 09:21:36
- *** --- * --- 
- ** ---------- [config]
- ** ---------- .> app:         __main__:0x7f0fd54bbb20
- ** ---------- .> transport:   redis://superset-redis-headless:6379/0
- ** ---------- .> results:     redis://superset-redis-headless:6379/0
- *** --- * --- .> concurrency: 4 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** ----- 
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery

Cerberus112 (Author) commented:

UPDATE:

Finally, I found that the redirection was due to the WEBDRIVER_BASEURL not being configured at the service level.

WEBDRIVER_BASEURL = "http://{{ template "superset.fullname" . }}:8088/"
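
For context, the warm-up task builds its request URLs from this setting, so leaving it unset or pointing it at an external hostname produces the redirects seen above. Outside of Helm templating, the equivalent superset_config.py would look roughly like the sketch below; the service name, port, and public URL are placeholders, not values taken from this issue:

# superset_config.py -- minimal sketch; service name, port and public URL are placeholders.

# Internal address the worker (cache warm-up, thumbnails, reports) should hit.
WEBDRIVER_BASEURL = "http://superset:8088/"

# Public address used when building user-facing links, e.g. in report emails.
WEBDRIVER_BASEURL_USER_FRIENDLY = "https://superset.example.com/"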

However, I now receive 400 errors due to a missing CSRF token when trying to warm up the cache, both from the worker and externally via the API.

{'errors': [{'message': '400 Bad Request: The CSRF session token is missing.', 'error_type': 'GENERIC_BACKEND_ERROR', 'level': 'error', 'extra': {'issue_codes': [{'code': 1011, 'message': 'Issue 1011 - Superset encountered an unexpected error.'}]}}]}

If I disable CSRF:

WTF_CSRF_ENABLED = False

it returns 302:

[2024-02-29 17:00:00,368: INFO/ForkPoolWorker-2] fetch_url[356f2e18-4069-4f16-a8aa-d3bee8323296]: Fetching http://superset:8088/api/v1/chart/warm_up_cache with payload {"chart_id": 49}
[2024-02-29 17:00:00,377: ERROR/ForkPoolWorker-3] fetch_url[ba8a1804-8ea0-460a-8892-da8f8fa9b733]: Error warming up cache!
Traceback (most recent call last):
  File "/app/superset/tasks/cache.py", line 242, in fetch_url
    response = request.urlopen(  # pylint: disable=consider-using-with
  File "/usr/local/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/local/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/local/lib/python3.9/urllib/request.py", line 555, in error
    result = self._call_chain(*args)
  File "/usr/local/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python3.9/urllib/request.py", line 726, in http_error_302
    new = self.redirect_request(req, fp, code, msg, headers, newurl)
  File "/usr/local/lib/python3.9/urllib/request.py", line 664, in redirect_request
    raise HTTPError(req.full_url, code, msg, headers, fp)
urllib.error.HTTPError: HTTP Error 302: FOUND
[2024-02-29 17:00:00,378: ERROR/ForkPoolWorker-2] fetch_url[356f2e18-4069-4f16-a8aa-d3bee8323296]: Error warming up cache!
Traceback (most recent call last):
  File "/app/superset/tasks/cache.py", line 242, in fetch_url
    response = request.urlopen(  # pylint: disable=consider-using-with
  File "/usr/local/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/local/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/local/lib/python3.9/urllib/request.py", line 555, in error
    result = self._call_chain(*args)
  File "/usr/local/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python3.9/urllib/request.py", line 726, in http_error_302
    new = self.redirect_request(req, fp, code, msg, headers, newurl)
  File "/usr/local/lib/python3.9/urllib/request.py", line 664, in redirect_request
    raise HTTPError(req.full_url, code, msg, headers, fp)
urllib.error.HTTPError: HTTP Error 302: FOUND

Related errors reported previously: #24717 (comment), #24579 (comment)
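
A narrower approach than disabling CSRF globally would be to exempt only the warm-up view via WTF_CSRF_EXEMPT_LIST in superset_config.py. This is a sketch only, not a fix verified against this issue, and the dotted view path for the warm-up endpoint is an assumption that may differ between Superset versions:

# superset_config.py -- sketch only, not a confirmed fix for this issue.
WTF_CSRF_ENABLED = True  # keep CSRF protection on for everything else

WTF_CSRF_EXEMPT_LIST = [
    # exemptions Superset ships by default (version-dependent)
    "superset.views.core.log",
    "superset.views.core.explore_json",
    "superset.charts.data.api.data",
    # hypothetical entry for the chart warm-up endpoint -- verify the actual
    # view function path in your Superset version before relying on this
    "superset.charts.api.warm_up_cache",
]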

Cerberus112 changed the title from "Error warming up cache: Permanent Redirect" to "Error warming up cache: Error 302" on Mar 4, 2024
dmuldoonadl commented:

I'm experiencing the same issue.

rusackas (Member) commented Aug 7, 2024

Just a heads-up that while we're trying to get the linked PR merged, we're also no longer supporting 3.0.x, and will stop supporting 3.x.x when Superset 4.1 is released soon. If anyone can confirm this is indeed currently a 4.x.x issue, that'd be appreciated!

sanjaynayak007 (Contributor) commented:

I am experiencing the cache warmup issue in Superset version 4.0.2.

[2024-08-28 10:30:01,206: INFO/ForkPoolWorker-2] fetch_url[d490c4c1-2b0a-4078-8f5a-7abc7f8f96ca]: Fetching http://superset:8088/superset/warm_up_cache/ with payload {"chart_id": 125, "dashboard_id": 22}
[2024-08-28 10:30:01,212: ERROR/ForkPoolWorker-2] fetch_url[d490c4c1-2b0a-4078-8f5a-7abc7f8f96ca]: Error warming up cache!
Traceback (most recent call last):
  File "/app/superset/tasks/cache.py", line 227, in fetch_url
    response = request.urlopen(  # pylint: disable=consider-using-with
  File "/usr/local/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/lib/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/usr/local/lib/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/usr/local/lib/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/usr/local/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 405: METHOD NOT ALLOWED

tamarinkeisari commented Sep 9, 2024

I am facing the same problem, also on version 4.0.2.
Have you found a solution?

I tried adding some code that someone said would solve it, and now it shows me the same error but with code 400.
Thank you so much!

nicholaslimck commented:

Error 405 was due to a change to the API endpoint; the cache code was not updated accordingly. That issue was resolved in #28706 and 4.1.1, but the new Error 400 persists on 4.1.1 due to a missing CSRF token. Waiting on #31173 to fix that.
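
To sanity-check the endpoint outside of Celery, a rough sketch like the one below warms up a single chart through the REST API with a CSRF token attached. It assumes username/password login with the "db" auth provider is enabled and that the endpoint accepts PUT as in recent releases; the base URL, credentials, and chart id are placeholders:

# warm_up_chart.py -- rough sketch; URL, credentials and chart id are placeholders.
import requests

BASE = "http://superset:8088"

session = requests.Session()

# 1. Log in to get a JWT access token.
login = session.post(
    f"{BASE}/api/v1/security/login",
    json={"username": "admin", "password": "admin", "provider": "db", "refresh": True},
)
login.raise_for_status()
session.headers["Authorization"] = f"Bearer {login.json()['access_token']}"

# 2. Fetch a CSRF token; the session keeps the cookie it is paired with.
csrf = session.get(f"{BASE}/api/v1/security/csrf_token/")
csrf.raise_for_status()
session.headers["X-CSRFToken"] = csrf.json()["result"]
session.headers["Referer"] = BASE  # some deployments also validate the Referer

# 3. Ask Superset to warm up the cache for one chart.
resp = session.put(f"{BASE}/api/v1/chart/warm_up_cache", json={"chart_id": 49})
print(resp.status_code, resp.json())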
