Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

G9SP Response Queue IndexError Fix #54

Merged
merged 12 commits into from
Feb 27, 2025
Merged

Conversation

mslaffin
Copy link
Collaborator

@mslaffin mslaffin commented Feb 7, 2025

Context from PR#53:
image

I believe this was effectively the error that was happening:

>>> import queue
>>>
>>> q = queue.Queue()
>>> item = q.queue[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: deque index out of range
>>>

@mslaffin mslaffin added the bug Something isn't working label Feb 7, 2025
@mslaffin mslaffin self-assigned this Feb 7, 2025
@bwalkerMIR
Copy link
Collaborator

This error has showed up a lot!!

@AryanSV5
Copy link

Tested branch in lab on 2/10/2025-11:39am and the index error no longer comes up. Neither the console or logs displays the error. Ran tests with SIC individually and all also with other subsystems.

@mark11778
Copy link
Contributor

mark11778 commented Feb 11, 2025

It showed up again, but after ^ those commits, no longer being output to log ... but on disconnection (of the G9) the program stops responding

@mslaffin
Copy link
Collaborator Author

hey @mark11778 looks like you caught another instance of a direct-access of .queue[0]. I have some ideas about what's happening.

Your first change e37df8c changed to get_nowait()

- return self._response_queue.queue[0]
+ return self._response_queue.get_nowait()

this seems problematic because get_nowait actually removes the item from the queue, so each time we check the status, this actually consumes/removes the data instad of just reading the latest status.

Then it looks like you reverted back to looking at .queue[0] but with an empty check first. I think this sort of reintroduced the original thread-safety problem. The fact that the program hangs on disconnect is probably due to the GUI thread blocking because we're still trying to access the queue unsafely..

I'll push some changes to get_interlock_status that aligns this with the pattern Aryan tested in _communication_thread

@mark11778
Copy link
Contributor

mark11778 commented Feb 12, 2025

Ope, I missed your comment from before, but I was actually talking to Brandon about the same thing. I too was saying that I think it is an issue with how we are handling the threading locks.

This might also problem that the DP16 also shares to some extent

I added some of the contents that was originally in the auto comport detection, which closes the com port on disconnection, but I wonder if that is problematic with the thread ... but anyways, I was able to get it to disconnect without crashing (all be it took 30s or so(was not counting)), but on reconnection it stop responding again

@mark11778 mark11778 force-pushed the bugfix/G9-Index-out-of-bounds branch from 8172243 to 7b3e843 Compare February 13, 2025 23:22
@mark11778
Copy link
Contributor

mark11778 commented Feb 13, 2025

After some of the changes on how the bytes are being read in, the indicators now, show disconnection basically instantly, for reconnection(with only the SIC running and connected) it takes 6-7s.

But, no longer throwing index errors or crashing the GUI

Edit: tested it again with all subsystems excluding power supplies, got the same results

@mark11778
Copy link
Contributor

Should be ready to merge into develop now

@mark11778 mark11778 merged commit 5e08061 into develop Feb 27, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants