-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scpi psu reconnect #726
Scpi psu reconnect #726
Conversation
…ill has some errors
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thanks @agthomas-uc! And thanks for including the results of your thorough testing. A few suggestions below.
As far as coordinating with #715, currently that adds a separate class for the direct Ethernet connection, so I think a merge would be quite minimal, though we'd have to re-implement this reconnection logic in that class as well. I pointed out this PR in a recent re-review there, so maybe we can see what @CAWNoodle says?
for more information, see https://pre-commit.ci
…into scpi_psu_reconnect
for more information, see https://pre-commit.ci
I think this snuck in during a git merge, looks like a 'git' command got added to the comment in 4567a0b.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The updates look good, thanks for all the changes! I made one small fix to the drivers imports, which were causing most of the checks to fail. Also one small comment edit. Will merge shortly. Thanks again!
I have implemented a robust reconnecting agent to re-establish the
PsuInterface
, clear the PSU's event register, and begin reading out data. Let me know if I can help work with Daniel to get these PRs synced up. As my code is ignorant to a direct ethernet connection, there will be a compatibility issue aroundself.port
which will require me to modify_initialize_module
agent.py in the same way Daniel did to the main body of in his PR.Description
Implemented in the agent with one new driver function, these changes prevent the agent from crashing due to any disconnect after initialization and from block errors due disconnects during data readout. These changes attempt to reconnect the agent on a 5 second cadence and log each attempt and whether it was successful. Upon reconnecting, the event register is cleared to ensure previously
MEAS?
commands are not outputted to incorrect channels.Long form description (copied from simonsobs/daq-discussions#106).
.splitlines()
and attempting to use this data. However, reading data with a newline character creates a driver bug where measuring ch1 voltage returns 0 (integer zero), measuring ch1 current returns ch1 voltage, measuring ch2 voltage returns ch1 current, ..., measuring ch3 current returns ch3 voltage, and ch3 current is unrecoverable. Repeating this process does not turn additional channels to zero and instances of two \n characters still only "cycled" the data by 1. Querying the power as well did not stop the cycle. I tried many iterations of driver code and read the 2230G user's manual but did not determine why this behavior would occur.socket.timeout()
) caused the data acquisition for loop to break without reading all values to data. Since psu is set to None when a disconnect occurs, I prevent block structure errors by checking that psu is still initialized before publishing.data['data']
to twice the number of channels. However, I believe checking if psu is initialized accomplishes the same goal more efficiently. I can revert if I missed a case.socket.timeout()
occurs after 5 seconds (as set in prologix_interface.py#L10). If the connection has not been established within one minute, OSError [Errno 113] "No route to host" occurs. I deal with both by excepting any Exception to enable reconnect to attempt indefinitely.Motivation and Context
This change is required because our PSU disconnects on a regular basis, between hourly and weekly, and previously needed to be manually power cycled to be restored. This prevented us from using the PSU to heat the focal plane and has been a standing issue in our lab for over 2 years.
This addresses one agent listed here socs/issues/721 and the issue I created to break this out socs/issues/725.
How Has This Been Tested?
I tested routinely during the creation of this branch and documented issues in simonsobs/daq-discussions#69 and (simonsobs/daq-discussions#75. After attempting to resolve these issues, I disconnected and reconnected the ethernet cable irregularly over a 20 minute period. 11/11 Reconnections were successful as shown below.
On the final test I made sure to disconnect the agent during the reconnection process multiple times and disconnected and reconnected the ethernet as fast as possible for 5 seconds. As I figured, this seemed to severely confuse the device. However, it recovered after ~2.5 minutes.
I am now running a duration test to ensure the connection stays reliable for an extended period of time and have remained connected for 5 days. I will update this PR if any disconnects and automatic reconnects occur.
Types of changes
Checklist: