KeyError('properties',) in Channel.basic_get #28

Open
LAlbertalli opened this issue Jun 5, 2013 · 11 comments

@LAlbertalli

Using Celery, I found several errors like this one in the log file:

 Task celery.chord_unlock with id 549c898e-ed49-4e7d-8b31-8bcf5c541f46 raised exception:
 "KeyError('properties',)"


 Task was called with args: ('0183df27-0cb3-4da4-8f2c-948d0e70cb88', suggestion.attachImages(<Suggestion: s625724>)) kwargs: {'max_retries': None, 'interval': 1, 'propagate': False, 'result': ['873ae99b-7763-458f-a817-6bde5eb1594a']}.

The contents of the full traceback was:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/celery/task/trace.py", line 228, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/celery/task/trace.py", line 415, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/celery/app/builtins.py", line 95, in unlock_chord
    if deps.ready():
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 375, in ready
    return all(result.ready() for result in self.results)
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 375, in <genexpr>
    return all(result.ready() for result in self.results)
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 169, in ready
    return self.state in self.backend.READY_STATES
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 272, in state
    return self.backend.get_status(self.id)
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/base.py", line 295, in get_status
    return self.get_task_meta(task_id)['status']
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/amqp.py", line 163, in get_task_meta
    prev, latest, acc = latest, acc, binding.get(no_ack=False)
  File "/usr/local/lib/python2.7/dist-packages/kombu/entity.py", line 554, in get
    message = self.channel.basic_get(queue=self.name, no_ack=no_ack)
  File "/usr/local/lib/python2.7/dist-packages/librabbitmq/__init__.py", line 67, in basic_get
    frame['properties'],
KeyError: 'properties'

This error appears only randomly and disappears when using py-amqp, so I think the problem is in librabbitmq.
Modifying the code around line 67 of librabbitmq/__init__.py in this way:

try:
    return self.Message(self,
                        frame['properties'],
                        frame['delivery_info'],
                        frame['body'])
except:
    raise Exception(frame)

I get this error log:

Task celery.chord_unlock with id 39cc3448-34f2-4e79-af55-0cf2f519143e raised exception:
"Exception({'delivery_info': {'redelivered': False, 'exchange': 'celeryresults', 'routing_key': 'acb84c28c0034f6f9f159fa287c7624d', 'delivery_tag': 10L}},)"


Task was called with args: ('f5914b37-7fde-4842-8b81-21a2602a1879', suggestion.attachImages(<Suggestion: s625811>)) kwargs: {'max_retries': None, 'interval': 1, 'propagate': False, 'result': ['acb84c28-c003-4f6f-9f15-9fa287c7624d']}.

The contents of the full traceback was:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/celery/task/trace.py", line 228, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/celery/task/trace.py", line 415, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/celery/app/builtins.py", line 95, in unlock_chord
    if deps.ready():
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 375, in ready
    return all(result.ready() for result in self.results)
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 375, in <genexpr>
    return all(result.ready() for result in self.results)
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 169, in ready
    return self.state in self.backend.READY_STATES
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 272, in state
    return self.backend.get_status(self.id)
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/base.py", line 295, in get_status
    return self.get_task_meta(task_id)['status']
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/amqp.py", line 163, in get_task_meta
    prev, latest, acc = latest, acc, binding.get(no_ack=False)
  File "/usr/local/lib/python2.7/dist-packages/kombu/entity.py", line 554, in get
    message = self.channel.basic_get(queue=self.name, no_ack=no_ack)
  File "/usr/local/lib/python2.7/dist-packages/librabbitmq/__init__.py", line 73, in basic_get
    raise Exception(frame)
Exception: {'delivery_info': {'redelivered': False, 'routing_key': 'acb84c28c0034f6f9f159fa287c7624d', 'delivery_tag': 10L, 'exchange': 'celeryresults'}}

Checking the queue with rabbitmqctl list_queues shows that the queue still contains a message, but trying to read it with librabbitmq from the Python console yields no result. The issue appears randomly (roughly every 10 tasks launched) and only in the production environment, so my ability to test is limited.
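
For reference, the console check was roughly the following (a minimal sketch; host and credentials are placeholders, and the queue name is the result queue from the traceback above):

from librabbitmq import Connection

# Placeholder broker credentials/vhost.
conn = Connection(host='localhost', userid='guest',
                  password='guest', virtual_host='/')
channel = conn.channel()

# The result queue name is the task id without dashes (the queue that
# rabbitmqctl list_queues still reports as holding one message).
msg = channel.basic_get(queue='acb84c28c0034f6f9f159fa287c7624d', no_ack=True)
print(msg)  # prints None here, even though rabbitmqctl shows 1 message
conn.close()
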
The server is a Debian Wheezy with all patches installed, and the configuration is:
Django==1.4.5
MySQL-python==1.2.3
PIL==1.1.7
South==0.7.5
amqp==1.0.11
billiard==2.7.3.28
boto==2.3.0
celery==3.0.19
dnspython==1.10.0
docutils==0.8.1
eventlet==0.12.1
flower==0.4.2
greenlet==0.4.0
httplib2==0.7.7
ipython==0.13.1
jsonfield==0.9.13
kombu==2.5.10
librabbitmq==1.0.1
passlib==1.6.1
pexpect==2.4
pytz==2012c
redis==2.4.13
simplegeneric==0.8.1
simplejson==3.0.7
six==1.3.0
This is identical to the test environment, where the issue never happened. The problem appeared when we added an eventlet node to the Celery configuration, but disabling it has not solved the issue.

I don't know if it is useful for debugging, but we also started seeing another error in a task when getting the result of a starmap.

Task tnlibs.celery.sel_utilities.keySelection with id 1ebc0eb9-da2c-40f8-8ccb-06dc765d9ab3 raised exception:
"ChannelError('Bad frame read',)"
Task was called with args: ({1200128L: {'content': u" ... [details removed]

The contents of the full traceback was:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/celery/task/trace.py", line 228, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tndj-0.1.dev_r2287-py2.7.egg/tnlibs/celery/tncelery.py", line 17, in __call__
    return super(TNTask,self).__call__(*args,**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/celery/task/trace.py", line 415, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tndj-0.1.dev_r2287-py2.7.egg/tnlibs/celery/sel_utilities.py", line 335, in keySelection
    r.get()
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 449, in get
    timeout=timeout, propagate=propagate, interval=interval)
  File "/usr/local/lib/python2.7/dist-packages/celery/result.py", line 531, in join_native
    interval=interval):
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/amqp.py", line 232, in get_many
    r = self.drain_events(conn, consumer, timeout)
  File "/usr/local/lib/python2.7/dist-packages/celery/backends/amqp.py", line 202, in drain_events
    wait(timeout=timeout)
  File "/usr/local/lib/python2.7/dist-packages/librabbitmq/__init__.py", line 202, in drain_events
    self._basic_recv(timeout)
ChannelError: Bad frame read

I'm not sure whether this last error is related, but it has also disappeared with py-amqp (though this task is called only a few times, so I'm not totally sure the problem is solved).

@ask

ask commented Jun 5, 2013

Interesting, I got this in the development version yesterday, didn't think it was in 3.0.19 as well.

@LAlbertalli

Just a few more details.
After one day of running on py-amqp the error never happened again, so I'm fairly sure the problem is in librabbitmq. The second problem that I thought was related (the bad frame on starmaps) also never happened again, so I'm now reasonably confident the two problems were related.

Today or tomorrow I will try to re-enable the eventlet node and see what happens. The error first occurred in a mixed eventlet/multiprocessing environment. Beyond that I probably can't provide much more help, since the problem occurs only in the production environment and causes data corruption, so I cannot risk reproducing it there.

Some more considerations:

  • the biggest difference between the test environment and the production one is the load.
  • when we had the problem I tried restarting RabbitMQ and also resetting all the queues, just in case there was some corrupted data hanging around.
  • the problem happens only on result queues; of all the errors I collected while sorting this out (I analyzed more than 400 error emails), not one was on a new task sent to the execution queue (and we run roughly 50,000 tasks per day).
  • as I said, I suspect some relationship with eventlet, because the problem came out when I introduced eventlet workers into the production environment. That said, production was running on quite an old configuration at the time (celery 3.0.12; I don't remember the other libs, sorry), and after verifying the issue I updated all the libraries to rule out bugs already fixed in newer releases (even though the tests were run on the same configuration).

I hope all this information helps. Let me know if you need more. The performance hit of py-amqp over librabbitmq is around 4-5% of CPU, which may sound like peanuts, but not when you are a resource-constrained small startup running very high-load processes.
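
For anyone else bitten by this, the switch itself is just a matter of forcing the pyamqp transport in the broker URL; as far as I can tell Celery prefers librabbitmq for a plain amqp:// URL whenever the C library is installed. A minimal sketch (app name, host and vhost are placeholders):

from celery import Celery

# 'pyamqp://' forces the pure-Python py-amqp client even when librabbitmq
# is installed; a plain 'amqp://' URL would pick librabbitmq instead.
app = Celery('proj', broker='pyamqp://guest@localhost//')

# The AMQP result backend reuses the broker connection settings, so the
# result reads (where the KeyError shows up) also go through py-amqp.
app.conf.CELERY_RESULT_BACKEND = 'amqp'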

Bye

@skwash

skwash commented Nov 4, 2013

I'm also seeing this occasionally when using group tasks. For now I've had to drop librabbitmq and use py-amqp to get around it.

  File "/opt/python/python-2.7/lib/python2.7/site-packages/celery/result.py", line 376, in ready

  File "/opt/python/python-2.7/lib/python2.7/site-packages/celery/result.py", line 376, in <genexpr>

  File "/opt/python/python-2.7/lib/python2.7/site-packages/celery/result.py", line 169, in ready
    return self.state in self.backend.READY_STATES
  File "/opt/python/python-2.7/lib/python2.7/site-packages/celery/result.py", line 273, in state
    status = state
  File "/opt/python/python-2.7/lib/python2.7/site-packages/celery/backends/base.py", line 295, in get_status
    return self.get_task_meta(task_id)['status']
  File "/opt/python/python-2.7/lib/python2.7/site-packages/celery/backends/amqp.py", line 166, in get_task_meta
    if prev:
  File "/opt/python/python-2.7/lib/python2.7/site-packages/kombu/entity.py", line 559, in get
    return message
  File "/opt/python/python-2.7/lib/python2.7/site-packages/librabbitmq/__init__.py", line 67, in basic_get
KeyError: 'properties'

Versions:

Python 2.7.4
amqp - 1.0.13
billiard - 2.7.3.34
celery - 3.0.24
kombu - 2.5.16
librabbitmq - 1.0.2

@ask

ask commented Nov 4, 2013

I don't see how properties could be left unset, but maybe this is some other type of frame that isn't handled properly. I'll have a look tomorrow.
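
In the meantime, something like the wrapper below would at least log the raw frame instead of blowing up the result poll. This is a debugging sketch only, not a fix; the helper name is made up, and it just catches the KeyError that basic_get currently raises:

import logging

logger = logging.getLogger(__name__)

def safe_basic_get(channel, queue, no_ack=False):
    """Debugging wrapper: treat a frame missing 'properties' as no message."""
    try:
        return channel.basic_get(queue=queue, no_ack=no_ack)
    except KeyError as exc:
        # Only 'properties' is known to go missing (see the tracebacks above).
        logger.error('basic_get got an incomplete frame (%r) on %r', exc, queue)
        return None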

@LAlbertalli

@skwash: does your configuration mix eventlet workers with multiprocessing workers? I have experienced the problem only on the eventlet queue. Using the py-amqp lib only on the eventlet queue solved the issue for me (it's been more than five months in this config and no more errors have happened).
After your comment I tried, out of curiosity, re-enabling librabbitmq on one of the three servers we are running, and the error reappeared immediately.
Versions:
Python 2.7.3
amqp - 1.0.13
billiard - 2.7.3.32
celery - 3.0.23
kombu - 2.5.14
librabbitmq - 1.0.1

Hope this helps.

@skwash

skwash commented Nov 5, 2013

No eventlet here, just multiprocessing workers. Interestingly enough, I've been digging in and noticed that the problem is only being seen on the calling code and not the workers. The workers are running with librabbitmq 1.0.1 and some slightly older versions of everything else and have been running smoothly for many months now. It definitely seems to be something specifically related to result queues.

@LAlbertalli

That's interesting. I've also seen the error only on the result queue. Have you checked the status of the result queue? Usually the result queue contains 1 message while the result has not yet been consumed and 0 afterwards. In my case rabbitmqctl kept listing one message in the queue when I received the error, but using the Python APIs (both py-amqp and librabbitmq) I wasn't able to get any result. I hope that finding the commonalities in the issue helps track down the cause.

@dwurf

dwurf commented Feb 17, 2014

Adding my voice to this.
When using the regular per-process workers (not eventlet) I started getting the following traceback in the celery log; librabbitmq is installed. I have a chain of tasks with a group in the middle, and the error always occurs on the task immediately after the group (a sketch of the workflow shape follows the traceback). I can reproduce the error roughly 50% of the time in my dev environment.

    [2014-02-17 22:40:27,936: ERROR/MainProcess] Task celery.chord_unlock[9ca6ef76-0585-4a42-97dc-76eea4cfdf59] raised unexpected: KeyError('properties',)
    Traceback (most recent call last):
      File "/opt/venv/local/lib/python2.7/site-packages/celery/app/trace.py", line 238, in trace_task
        R = retval = fun(*args, **kwargs)
      File "/opt/venv/local/lib/python2.7/site-packages/celery/app/trace.py", line 416, in __protected_call__
        return self.run(*args, **kwargs)
      File "/opt/venv/local/lib/python2.7/site-packages/celery/app/builtins.py", line 95, in unlock_chord
        if deps.ready():
      File "/opt/venv/local/lib/python2.7/site-packages/celery/result.py", line 445, in ready
        return all(result.ready() for result in self.results)
      File "/opt/venv/local/lib/python2.7/site-packages/celery/result.py", line 445, in <genexpr>
        return all(result.ready() for result in self.results)
      File "/opt/venv/local/lib/python2.7/site-packages/celery/result.py", line 231, in ready
        return self.state in self.backend.READY_STATES
      File "/opt/venv/local/lib/python2.7/site-packages/celery/result.py", line 340, in state
        return self.backend.get_status(self.id)
      File "/opt/venv/local/lib/python2.7/site-packages/celery/backends/base.py", line 235, in get_status
        return self.get_task_meta(task_id)['status']
      File "/opt/venv/local/lib/python2.7/site-packages/celery/backends/amqp.py", line 171, in get_task_meta
        accept=self.accept, no_ack=False,
      File "/opt/venv/local/lib/python2.7/site-packages/kombu/entity.py", line 570, in get
        message = self.channel.basic_get(queue=self.name, no_ack=no_ack)
      File "/opt/venv/local/lib/python2.7/site-packages/librabbitmq/__init__.py", line 67, in basic_get
        frame['properties'],
    KeyError: 'properties'
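
The workflow shape that triggers it is essentially the following (a sketch only; the app configuration and task bodies are placeholders, not my real project):

    from celery import Celery, chain, group

    # Placeholder app/broker; like my real setup, results go to the amqp backend.
    app = Celery('repro', broker='amqp://guest@localhost//', backend='amqp')

    @app.task
    def prepare():
        return 'start'

    @app.task
    def process(i):
        return i * i

    @app.task
    def summarise(results):
        return sum(results)

    # chain -> group -> task: the task after the group is driven by
    # celery.chord_unlock, which polls the result queue and is where
    # KeyError('properties',) shows up.
    workflow = chain(
        prepare.s(),
        group(process.si(i) for i in range(4)),  # .si() ignores the chained result
        summarise.s(),
    )
    workflow.apply_async()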

pip freeze:

    gdal==1.10.1
    jinja2==2.7.2
    markupsafe==0.18
    pyyaml==3.10
    sqlalchemy==0.9.1
    shapely==1.3.0
    amqp==1.4.2
    anyjson==0.3.3
    argparse==1.2.1
    backports.ssl-match-hostname==3.4.0.2
    billiard==3.3.0.14
    celery==3.1.8
    cx-oracle==5.1.2
    flower==0.6.0
    gevent==1.0
    greenlet==0.4.2
    ipython==1.1.0
    kombu==3.0.11
    librabbitmq==1.0.3
    meld3==0.6.10
    mock==1.0.1
    nose==1.3.0
    pysqlite==2.6.3
    pytz==2013.9
    pyzmq==14.0.1
    sh==1.09
    supervisor==3.0
    tornado==3.2
    wheel==0.22.0
    wsgiref==0.1.2

python -V

    Python 2.7.3

@tspike

tspike commented Apr 23, 2014

Has there been any progress on this issue? I'm hitting the same thing with celery 3.0.20, librabbitmq 1.0.3, and amqp 1.0.13.

@chrisseto

Any progress at all? I'm hitting the same error.

@ba1dr

ba1dr commented May 13, 2014

+1 - the same issue:
software -> celery:3.1.11 (Cipater) kombu:3.0.15 py:2.7.5
billiard:3.3.0.17 librabbitmq:1.5.0
We had to switch to pyamqp due to this issue. Vote for fix.
