Mpfs improve isr and error handling #225

Merged
jlaitine merged 2 commits into master from mpfs_improve_isr_and_error_handling
Mar 15, 2024

jlaitine

Summary

This rewrites the ISR handling of the CAN driver to improve real-time response and avoid lock-ups.

RX data buffer handling is also made safe: frame consistency is checked on every read from the RX buffer.
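
To illustrate what a per-read consistency check can look like, here is a minimal sketch in C. It is not the driver code from this PR. The FIFO is a plain software model, and the names `sk_rx_fifo`, `sk_rx_mid_frame` and `sk_rx_read_frame`, the `first` flag array (standing in for a "middle of frame" status indication such as the RXMOF bit mentioned in the commit message), and the "word count in the low 4 bits of the header" layout are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of an RX FIFO; not the real register layout. */
struct sk_rx_fifo {
    const uint32_t *words;  /* buffered words */
    const bool *first;      /* first[i] is true if words[i] starts a frame */
    size_t len;             /* number of valid words */
    size_t pos;             /* read position */
};

/* True if the next word to be read is not the first word of a frame. */
static bool sk_rx_mid_frame(const struct sk_rx_fifo *f)
{
    return f->pos < f->len && !f->first[f->pos];
}

/* Read one frame payload into out.
 * Returns the number of payload words, or -1 if no consistent frame
 * could be read. Assumed layout: header word, low 4 bits = payload words.
 */
static int sk_rx_read_frame(struct sk_rx_fifo *f, uint32_t *out, size_t out_len)
{
    assert(f != NULL && out != NULL);

    /* Resynchronize: drop leftovers of a broken frame before reading. */
    while (sk_rx_mid_frame(f)) {
        f->pos++;
    }
    if (f->pos >= f->len) {
        return -1;  /* buffer empty */
    }

    size_t n = f->words[f->pos++] & 0xFu;
    if (n > out_len) {
        return -1;  /* too large; the next call skips the payload */
    }
    for (size_t i = 0; i < n; i++) {
        /* A new frame start inside the payload means the frame was cut short. */
        if (f->pos >= f->len || f->first[f->pos]) {
            return -1;
        }
        out[i] = f->words[f->pos++];
    }
    return (int)n;
}
```

The point of the sketch is that the reader never trusts its own position: it checks the frame-start indication before every frame and while copying the payload, so a single corrupted or truncated frame costs one failed read instead of shifting every later frame.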

Impact

Fixes hangs

Testing

On-desk trials

@jlaitine jlaitine requested review from haitomatic and pussuw March 14, 2024 12:27
@jlaitine jlaitine marked this pull request as draft March 14, 2024 13:04
@jlaitine jlaitine force-pushed the mpfs_improve_isr_and_error_handling branch from 3033286 to f75c0a6 Compare March 14, 2024 13:25
@jlaitine jlaitine marked this pull request as ready for review March 14, 2024 13:28
@jlaitine
Author

Now I guess it is ready for review

…andling

- Remove unnecessary looping in the interrupt handling
- Recover properly from the rx overflow error
- Always check the RXMOF bit to synchronize the RX frames and recover from any errors
- Clear the interrupts in a single place after the irq handler is finished

Signed-off-by: Jukka Laitinen <[email protected]>
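
The structure described in this commit message (no looping in the handler, interrupts cleared in a single place at the end) can be sketched roughly as follows. This is a simplified simulation rather than the actual driver: `sk_can_dev`, `sk_can_isr` and the `SK_INT_*` bits are hypothetical names, and the status register is modeled as a plain variable instead of a hardware write-to-clear register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interrupt bits; not the real register layout. */
#define SK_INT_RX   (1u << 0)  /* frame received */
#define SK_INT_TX   (1u << 1)  /* frame transmitted */
#define SK_INT_OVR  (1u << 2)  /* RX buffer overflow */

/* Simulated device: int_stat stands in for an interrupt status register. */
struct sk_can_dev {
    uint32_t int_stat;
    unsigned rx_done;
    unsigned tx_done;
    unsigned ovr_recovered;
    unsigned clears;
};

static void sk_can_isr(struct sk_can_dev *dev)
{
    assert(dev != NULL);

    /* One snapshot of the pending bits; no "loop until status is zero". */
    uint32_t pending = dev->int_stat;

    if (pending & SK_INT_OVR) {
        /* Placeholder: a real driver would resynchronize the RX buffer here. */
        dev->ovr_recovered++;
    }
    if (pending & SK_INT_RX) {
        dev->rx_done++;
    }
    if (pending & SK_INT_TX) {
        dev->tx_done++;
    }

    /* Single acknowledge point: clear exactly the bits handled above. */
    dev->int_stat &= ~pending;
    dev->clears++;
}
```

Because only the snapshot is cleared, an event that arrives while the handler runs stays pending and triggers the handler again, so the time spent in one invocation stays bounded.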
…when in ERROR_ACTIVE state

Having error interrupts enabled while in ERROR_PASSIVE causes interrupt storms; for example,
the bus error would be asserted continuously if the CAN bus is or gets disconnected.

Signed-off-by: Jukka Laitinen <[email protected]>
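
One way to picture the gating described in this commit message is a function that derives the interrupt enable mask from the fault confinement state. The sketch below is an assumption-laden illustration, not the code in this PR: the `SK_IE_*` bits, `sk_fault_state` and `sk_int_enable_mask` are hypothetical names. It also assumes a separate "state change" interrupt stays enabled so that the driver notices the return to ERROR_ACTIVE. The counter thresholds (128 for error passive, 256 for bus off) are the standard CAN fault confinement limits.

```c
#include <assert.h>
#include <stdint.h>

enum sk_fault_state { SK_ERROR_ACTIVE, SK_ERROR_PASSIVE, SK_BUS_OFF };

/* Hypothetical interrupt enable bits; not the real register layout. */
#define SK_IE_RX      (1u << 0)
#define SK_IE_TX      (1u << 1)
#define SK_IE_BUSERR  (1u << 2)  /* raised for every bus error */
#define SK_IE_STATE   (1u << 3)  /* raised when the fault state changes */

/* Fault confinement state from the transmit/receive error counters. */
static enum sk_fault_state sk_fault_state(unsigned tec, unsigned rec)
{
    if (tec >= 256) {
        return SK_BUS_OFF;
    }
    if (tec >= 128 || rec >= 128) {
        return SK_ERROR_PASSIVE;
    }
    return SK_ERROR_ACTIVE;
}

/* Per-error interrupts are enabled only while the node is error active. */
static uint32_t sk_int_enable_mask(enum sk_fault_state st)
{
    assert(st == SK_ERROR_ACTIVE || st == SK_ERROR_PASSIVE || st == SK_BUS_OFF);

    uint32_t mask = SK_IE_RX | SK_IE_TX | SK_IE_STATE;
    if (st == SK_ERROR_ACTIVE) {
        mask |= SK_IE_BUSERR;
    }
    return mask;
}
```

With a disconnected bus every transmit attempt fails, so a per-error interrupt would fire continuously. Masking it outside ERROR_ACTIVE leaves only the single state-change event to handle.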
@jlaitine jlaitine force-pushed the mpfs_improve_isr_and_error_handling branch from f75c0a6 to 431b72d Compare March 14, 2024 14:11
@jlaitine
Author

After some more testing, this looks promising. According to @vnopanen, the drone has gone through a flight test and has been running on the table without crashes or jams.

On the desk, another unit with saluki-pi has gone through 16 hours of UAV communication:
```
saluki> uptime
15:56:58 up 15:56, load average: 0.00, 0.00, 0.00
saluki> uavcan status
Pool allocator status:
Capacity hard/soft: 500/250 blocks
Reserved: 26 blocks
Allocated: 18 blocks

UAVCAN node status:
Internal failures: 0
Transfer errors: 0
RX transfers: 1378305
TX transfers: 22184190

CAN1 status:
HW errors: 0
IO errors: 0
RX frames: 1837892
TX frames: 22126786
CAN2 status:
HW errors: 0
IO errors: 0
RX frames: 1837892
TX frames: 22126786

ESC outputs:
INFO [mixer_module] Param prefix: UAVCAN_EC
Channel Configuration:
Channel 0: func: 101, value: 65535, failsafe: 65535, disarmed: 65535, min: 1, max: 8191
Channel 1: func: 102, value: 65535, failsafe: 65535, disarmed: 65535, min: 1, max: 8191
Channel 2: func: 103, value: 65535, failsafe: 65535, disarmed: 65535, min: 1, max: 8191
Channel 3: func: 104, value: 65535, failsafe: 65535, disarmed: 65535, min: 1, max: 8191
Channel 4: func: 0, value: 65535, failsafe: 65535, disarmed: 65535, min: 1, max: 8191
Channel 5: func: 0, value: 65535, failsafe: 65535, disarmed: 65535, min: 1, max: 8191
Channel 6: func: 0, value: 65535, failsafe: 65535, disarmed: 65535, min: 1, max: 8191
Channel 7: func: 0, value: 65535, failsafe: 65535, disarmed: 65535, min: 1, max: 8191
Servo outputs:
INFO [mixer_module] Param prefix: UAVCAN_SV
Channel Configuration:
Channel 0: func: 0, value: 0, failsafe: 500, disarmed: 500, min: 0, max: 1000
Channel 1: func: 0, value: 0, failsafe: 500, disarmed: 500, min: 0, max: 1000
Channel 2: func: 0, value: 0, failsafe: 500, disarmed: 500, min: 0, max: 1000
Channel 3: func: 0, value: 0, failsafe: 500, disarmed: 500, min: 0, max: 1000
Channel 4: func: 0, value: 0, failsafe: 500, disarmed: 500, min: 0, max: 1000
Channel 5: func: 0, value: 0, failsafe: 500, disarmed: 500, min: 0, max: 1000
Channel 6: func: 0, value: 0, failsafe: 500, disarmed: 500, min: 0, max: 1000
Channel 7: func: 0, value: 0, failsafe: 500, disarmed: 500, min: 0, max: 1000

Online nodes (Node ID, Health, Mode):
120 OK OPERAT
121 OK OPERAT
122 OK OPERAT
123 OK OPERAT

uavcan: cycle time: 50490729 events, 3433106204us elapsed, 67.99us avg, min 14us max 62608us 215.078us rms
uavcan: cycle interval: 50490729 events, 1137.46us avg, min 18us max 180918us 693.081us rms
```

@haitomatic haitomatic left a comment

Good simplification of the interrupt handling routine 👍. I didn't find any questionable changes that could affect the previous functionality. The one test we still need to conduct is https://github.com/tiiuae/mpfs_canfd_uavcan_example/tree/canfd_driver_test

It tests different bitrates and classic / CAN FD payloads. The UAVCAN test only uses classic CAN.

@jlaitine
Author

Merging this now, as it fixes the rl issues found. We still need to run the tests @haitomatic pointed out and, if any issues are found, open new Jira tickets for those.

@jlaitine jlaitine merged commit 34fa847 into master Mar 15, 2024
6 of 7 checks passed
@jlaitine jlaitine deleted the mpfs_improve_isr_and_error_handling branch March 15, 2024 11:27
@haitomatic

Yes, @vnopanen is conducting that test. We will open a Jira ticket if any issue is found.
