risc-v/backtrace: Support backtrace dump during IRQ #15815

Merged

@fdcavalcanti fdcavalcanti commented Feb 11, 2025

Summary

This PR adds backtrace support on RISC-V for the build settings listed below.
Until now, the backtrace printed nothing useful during crashes.

Impact

Adds backtraces for tasks and IRQs when a RISC-V system crashes.

Testing

Tested on Espressif RISC-V devices and rv-virt.
All tests are based on:

  • defconfig: rv-virt:nsh
  • CONFIG_SCHED_BACKTRACE=y
  • CONFIG_FRAME_POINTER=y
  • CONFIG_DEBUG_SYMBOLS=y
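
CONFIG_FRAME_POINTER matters here because a frame-pointer backtrace follows the chain of saved frame pointers on the stack. The sketch below is a host-side illustration of that walk, not the code from this PR. It assumes the frame layout of the RISC-V calling convention when frame pointers are kept (return address at fp[-1], caller's frame pointer at fp[-2]); the function name fp_backtrace and its bounds parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Walk a chain of frame records, assuming fp[-1] holds the return
 * address and fp[-2] holds the caller's frame pointer.  `base` and
 * `limit` bound the stack (limit is one past the last word) so that a
 * corrupted chain cannot run outside it.  Returns the number of return
 * addresses written to `out`.
 */
size_t fp_backtrace(const uintptr_t *base, const uintptr_t *limit,
                    const uintptr_t *fp, uintptr_t *out, size_t max)
{
  size_t n = 0;

  /* Stop when the buffer is full or fp leaves the stack bounds
   * (a zero frame pointer also fails the lower-bound check).
   */
  while (n < max &&
         (uintptr_t)fp >= (uintptr_t)(base + 2) &&
         (uintptr_t)fp <= (uintptr_t)limit)
    {
      uintptr_t ra = fp[-1];

      if (ra == 0)
        {
          break;  /* No return address recorded: end of the chain */
        }

      out[n++] = ra;
      fp = (const uintptr_t *)fp[-2];  /* Step to the caller's frame */
    }

  return n;
}
```

Given a stack holding three frames that return to 0x8001cad2, 0x80005562 and 0x80001b7a, the function fills out with those addresses in that order, innermost first. The bounds check is the reason a crash inside an IRQ needs care: the walk has to know which stack (interrupt or task) the frame pointer currently belongs to.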

Each test is executed as follows: build, run ostest, then trigger the backtrace in two different ways:

  • Task exception: simply do *p = 0; inside a simple test program.
  • IRQ exception: set up an interrupt handler and trigger the IRQ. Inside the interrupt routine, do *p = 0;.

The backtrace is verified manually using tools/btdecode.sh.
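
Decoding starts from the sched_dumpstack lines in the crash log, each of which carries a task ID followed by a list of addresses. As a rough sketch of that first step (this is not how tools/btdecode.sh is implemented, and parse_backtrace_line is a hypothetical helper), one line can be split into its task ID and addresses like this:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse one log line of the form
 *   "sched_dumpstack: backtrace| 2: 0x8002521c 0x800196ee ..."
 * Stores the task ID in *pid and up to `max` addresses in `addrs`.
 * Returns the number of addresses found, or -1 if the line is not a
 * backtrace line.
 */
int parse_backtrace_line(const char *line, int *pid,
                         uint32_t *addrs, int max)
{
  static const char marker[] = "backtrace|";
  const char *p = strstr(line, marker);
  char *end;
  int n = 0;

  if (p == NULL)
    {
      return -1;
    }

  /* The task ID follows the marker and is terminated by ':' */
  p += sizeof(marker) - 1;
  long id = strtol(p, &end, 10);
  if (end == p || *end != ':')
    {
      return -1;
    }

  *pid = (int)id;
  p = end + 1;

  /* The rest of the line is whitespace-separated hex addresses */
  while (n < max)
    {
      unsigned long addr = strtoul(p, &end, 16);
      if (end == p)
        {
          break;  /* Nothing more to convert */
        }

      addrs[n++] = (uint32_t)addr;
      p = end;
    }

  return n;
}
```

Because a long backtrace is split across several log lines with the same task ID, a caller would append the addresses from consecutive lines per task before resolving them to symbols.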

Test Scenario 1

  • CONFIG_SMP=n
  • CONFIG_ARCH_INTERRUPTSTACK=2048

Command: qemu-system-riscv32 -semihosting -M virt,aclint=on -cpu rv32 -bios none -kernel nuttx -nographic

  • backtrace: PASS
  • ostest: PASS

Test Scenario 2

  • CONFIG_SMP=y
  • CONFIG_SMP_NCPUS=2
  • CONFIG_ARCH_INTERRUPTSTACK=2048

Command: qemu-system-riscv32 -semihosting -M virt,aclint=on -cpu rv32 -smp 2 -bios none -kernel nuttx -nographic

  • backtrace: PASS
  • ostest: PASS

Test Scenario 3

  • CONFIG_SMP=n
  • CONFIG_ARCH_INTERRUPTSTACK=0
  • CONFIG_INIT_STACKSIZE=4096

Command: qemu-system-riscv32 -semihosting -M virt,aclint=on -cpu rv32 -bios none -kernel nuttx -nographic
In this case, CONFIG_INIT_STACKSIZE must be increased because there is no dedicated interrupt stack.

  • backtrace: PASS
  • ostest: PASS

Example output

nsh> backtrace irq
riscv_exception: EXCEPTION: Store/AMO access fault. MCAUSE: 00000007, EPC: 8001ca0a, MTVAL: 00000000
riscv_exception: PANIC!!! Exception = 00000007
dump_assert_info: Current Version: NuttX  10.4.0 6465108038 Feb 11 2025 10:26:06 risc-v
dump_assert_info: Assertion failed panic: at file: :0 task: backtrace process: backtrace 0x8001ca52
up_dump_register: EPC: 8001ca0a
up_dump_register: A0: 00000013 A1: 8003d294 A2: 00000000 A3: 80000003
up_dump_register: A4: 800381d8 A5: 02000000 A6: 00000000 A7: 00000000
up_dump_register: T0: 80038030 T1: 80037830 T2: 00000000 T3: 00000000
up_dump_register: T4: 00000000 T5: 00000000 T6: 00000000
up_dump_register: S0: 80037fd0 S1: 8003a000 S2: 00000013 S3: 8003a000
up_dump_register: S4: 00000000 S5: 00000000 S6: 00000000 S7: 00000000
up_dump_register: S8: 00000000 S9: 00000000 S10: 00000000 S11: 00000000
up_dump_register: SP: 80037fc0 FP: 80037fd0 TP: 00000000 RA: 80000a72
dump_stackinfo: IRQ Stack:
dump_stackinfo:   base: 0x80037830
dump_stackinfo:   size: 00002048
dump_stackinfo:     sp: 0x80037fc0
stack_dump: 0x80037fa0: 800374fc 80037460 80037fd0 00001645 deadbeef 8003a000 80037fd0 80007900
stack_dump: 0x80037fc0: 00000025 8003a000 80037fe0 80037fe0 deadbeef deadbeef 80038010 8000068e
stack_dump: 0x80037fe0: deadbeef deadbeef deadbeef 8003d294 deadbeef deadbeef 00000000 8003d420
stack_dump: 0x80038000: 80000003 8001cad2 80038020 80000630 80000003 8001cad2 80038030 80000186
stack_dump: 0x80038020: deadbeef deadbeef 8003d430 8001cad2 00000000 00000000 00000000 00000000
dump_stackinfo: User Stack:
dump_stackinfo:   base: 0x8003c498
dump_stackinfo:   size: 00004040
stack_dump: 0x8003d294: 8001cad2 8001caca 8003d420 deadbeef 00000000 00000000 00000000 00000000
stack_dump: 0x8003d2b4: 8003d430 00000000 00000013 8001c9fc 00000000 0000007e 00000001 02000000
stack_dump: 0x8003d2d4: 00000000 00000000 8003c478 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x8003d2f4: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x8003d314: 00003880 00000000 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x8003d334: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x8003d354: 00003880 00000000 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x8003d374: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x8003d394: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x8003d3b4: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
stack_dump: 0x8003d3d4: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000098
stack_dump: 0x8003d3f4: 8001c9fc 8003d420 800159b6 00000000 00000000 00000000 00000000 8003c478
stack_dump: 0x8003d414: 00000000 8003d430 8001cac4 8003c050 8001ca52 8003d450 80005562 00000000
stack_dump: 0x8003d434: 00000000 8003c478 00000002 00000000 00000000 8003d460 80001b7a 00000000
stack_dump: 0x8003d454: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
sched_dumpstack: backtrace| 2: 0x8002521c 0x800196ee 0x80015ed4 0x80007d3a 0x80000a72 0x8000068e 0x80000630 0x80000186
sched_dumpstack: backtrace| 2: 0x8001ca0a 0x80037fe0 0x8000068e 0x80000630 0x80000186 0x8001cad2 0x80005562 0x80001b7a
sched_dumpstack: backtrace| 2: 0x8001ca0a 0x80037fe0 0x8000068e 0x80000630 0x80000186 0x8001cad2 0x80005562 0x80001b7a
dump_tasks:    PID GROUP PRI POLICY   TYPE    NPX STATE   EVENT      SIGMASK          STACKBASE  STACKSIZE      USED   FILLED    COMMAND
dump_tasks:   ----   --- --- -------- ------- --- ------- ---------- ---------------- 0x80037830      2048      1632    79.6%    irq
dump_task:       0     0   0 FIFO     Kthread -   Ready              0000000000000000 0x8003a5b0      2032       732    36.0%    Idle_Task
dump_task:       1     1 100 RR       Task    -   Waiting Semaphore  0000000000000000 0x8003b538      1992      1660    83.3%!   nsh_main
dump_task:       2     2 255 RR       Task    -   Running            0000000000000000 0x8003c498      4040       460    11.3%    backtrace irq
sched_dumpstack: backtrace| 0: 0x8000802a 0x8003ad90 0x800005c4 0x80000048
sched_dumpstack: backtrace| 1: 0x80007fd2 0x800164ea 0x800164fe 0x8000a01a 0x800089ee 0x80009736 0x800097d0 0x8000834a
sched_dumpstack: backtrace| 1: 0x80008158 0x800080fc 0x80005562 0x80001b7a
sched_dumpstack: backtrace| 2: 0x8002521c 0x800196ee 0x80015b2c 0x800162d0 0x80015f66 0x80007d3a 0x80000a72 0x8000068e
sched_dumpstack: backtrace| 2: 0x80000630 0x80000186 0x8001ca0a 0x80037fe0 0x8000068e 0x80000630 0x80000186 0x8001cad2
sched_dumpstack: backtrace| 2: 0x80005562 0x80001b7a 0x8001ca0a 0x80037fe0 0x8000068e 0x80000630 0x80000186 0x8001cad2
sched_dumpstack: backtrace| 2: 0x80005562 0x80001b7a

btdecode output:

Backtrace dump for all tasks:

Backtrace for task 2:
0x8002521c: sched_backtrace at sched_backtrace.c:106
0x800196ee: sched_dumpstack at sched_dumpstack.c:71
0x80015b2c: dump_backtrace at assert.c:451
0x800162d0: nxsched_foreach at sched_foreach.c:69 (discriminator 2)
0x80015f66: dump_fatal_info at assert.c:769
 (inlined by) _assert at assert.c:904
0x80007d3a: riscv_exception at riscv_exception.c:135
0x80000a72: irq_dispatch at irq_dispatch.c:160
0x8000068e: riscv_doirq at riscv_doirq.c:113
0x80000630: riscv_dispatch_irq at qemu_rv_irq_dispatch.c:140
0x80000186: exception_common at riscv_exception_common.S:228
0x8001ca0a: assert_on_interrupt_handler at backtrace_qemu_main.c:110
0x80037fe0: _sbss at ??:?
0x8000068e: riscv_doirq at riscv_doirq.c:113
0x80000630: riscv_dispatch_irq at qemu_rv_irq_dispatch.c:140
0x80000186: exception_common at riscv_exception_common.S:228
0x8001cad2: ipi_trigger at backtrace_qemu_main.c:88
 (inlined by) backtrace_main at backtrace_qemu_main.c:142
0x80005562: nxtask_startup at task_startup.c:72 (discriminator 1)
0x80001b7a: nxtask_start at task_start.c:75
0x8001ca0a: assert_on_interrupt_handler at backtrace_qemu_main.c:110
0x80037fe0: _sbss at ??:?
0x8000068e: riscv_doirq at riscv_doirq.c:113
0x80000630: riscv_dispatch_irq at qemu_rv_irq_dispatch.c:140
0x80000186: exception_common at riscv_exception_common.S:228
0x8001cad2: ipi_trigger at backtrace_qemu_main.c:88
 (inlined by) backtrace_main at backtrace_qemu_main.c:142

Backtrace for task 1:
0x80007fd2: sys_call0 at syscall.h:161
 (inlined by) up_switch_context at riscv_switchcontext.c:85
0x800164ea: nxsched_waitpid at sched_waitpid.c:165
0x800164fe: waitpid at sched_waitpid.c:618
0x8000a01a: nsh_builtin at nsh_builtin.c:166
0x800089ee: nsh_execute at nsh_parse.c:552
0x80009736: nsh_parse_command at nsh_parse.c:2791
0x800097d0: nsh_parse at nsh_parse.c:2980
0x8000834a: nsh_session at nsh_session.c:248
0x80008158: nsh_consolemain at nsh_consolemain.c:81
0x800080fc: nsh_main at nsh_main.c:82
0x80005562: nxtask_startup at task_startup.c:72 (discriminator 1)
0x80001b7a: nxtask_start at task_start.c:75

Backtrace for task 0:
0x8000802a: up_idle at riscv_idle.c:77
0x8003ad90: ?? ??:0
0x800005c4: qemu_rv_start at qemu_rv_start.c:233 (discriminator 1)
0x80000048: _init at qemu_rv_head.S:90
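
The first line of the example output pairs MCAUSE 00000007 with the text "Store/AMO access fault". That mapping comes from the synchronous exception codes in the RISC-V privileged specification. A minimal lookup, written here only as an illustration (mcause_exception_name and g_exception_names are hypothetical names, not NuttX symbols, and the interrupt-bit test assumes RV32), could look like:

```c
#include <stddef.h>
#include <stdint.h>

/* Synchronous exception codes from the RISC-V privileged specification
 * (mcause with the interrupt bit clear).  Reserved codes map to NULL.
 */
static const char *const g_exception_names[] =
{
  "Instruction address misaligned",  /* 0 */
  "Instruction access fault",        /* 1 */
  "Illegal instruction",             /* 2 */
  "Breakpoint",                      /* 3 */
  "Load address misaligned",         /* 4 */
  "Load access fault",               /* 5 */
  "Store/AMO address misaligned",    /* 6 */
  "Store/AMO access fault",          /* 7 */
  "Environment call from U-mode",    /* 8 */
  "Environment call from S-mode",    /* 9 */
  NULL,                              /* 10: reserved */
  "Environment call from M-mode",    /* 11 */
  "Instruction page fault",          /* 12 */
  "Load page fault",                 /* 13 */
  NULL,                              /* 14: reserved */
  "Store/AMO page fault",            /* 15 */
};

/* Return the exception name for an RV32 mcause value, or NULL if the
 * value is an interrupt, a reserved code, or outside the table.
 */
const char *mcause_exception_name(uint32_t mcause)
{
  if (mcause & UINT32_C(0x80000000))
    {
      return NULL;  /* Top bit set: interrupt, not an exception */
    }

  if (mcause >= sizeof(g_exception_names) / sizeof(g_exception_names[0]))
    {
      return NULL;
    }

  return g_exception_names[mcause];
}
```

For the crash above, mcause_exception_name(0x00000007) returns "Store/AMO access fault", which is consistent with a faulting store such as *p = 0; and with MTVAL reporting address 00000000.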

@github-actions github-actions bot added Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture Size: M The size of the change in this PR is medium labels Feb 11, 2025

nuttxpr commented Feb 11, 2025

[Experimental Bot, please feedback here]

Yes, this PR description meets the NuttX requirements. It clearly explains the change, its impact, and provides sufficient testing information to evaluate the functionality. The use of specific scenarios and pass/fail results is particularly helpful. It could be slightly improved by including snippets of the before/after backtrace output (even abbreviated) to demonstrate the actual change.

lupyuen commented Feb 11, 2025

@nuttxpr test milkv_duos:nsh

nuttxpr commented Feb 11, 2025

[Experimental Bot, please feedback here]

Build and Test Successful (milkv_duos:nsh)
https://gitlab.com/lupyuen/nuttx-build-log/-/snippets/4805229

$ git clone https://github.com/fdcavalcanti/nuttx nuttx --branch feature/risc-v_int_backtrace_review
$ git clone https://github.com/apache/nuttx-apps apps --branch master
$ pushd nuttx
$ git reset --hard HEAD
HEAD is now at 8aac8a457a risc-v/backtrace: Support backtrace dump during IRQ
$ popd
$ pushd apps
$ git reset --hard HEAD
HEAD is now at fb0c1e10d system/uorb: require that LIBC_FLOATINGPOINT be enabled for DEBUG_UORB
$ popd
NuttX Source: https://github.com/apache/nuttx/tree/8aac8a457a835f042498e497d1e0d38f299791bf
NuttX Apps: https://github.com/apache/nuttx-apps/tree/fb0c1e10ded2a6fb9f066b9893662cbcc86e4646
$ cd nuttx
$ tools/configure.sh milkv_duos:nsh
$ make -j
$ make -j export
$ pushd ../apps
$ ./tools/mkimport.sh -z -x ../nuttx/nuttx-export-12.1.0.tar.gz
$ make -j import
$ popd
$ genromfs -f initrd -d ../apps/bin -V NuttXBootVol
$ head -c 65536 /dev/zero
$ cat nuttx.bin /tmp/nuttx.pad initrd
$ scp Image tftpserver:/tftpboot/Image-sg2000
$ ssh tftpserver ls -l /tftpboot/Image-sg2000
$ cd /home/luppy/nuttx-build-farm
$ ssh tftpserver
OpenSBI v0.9
nsh> uname -a
NuttX 12.1.0 8aac8a457a Feb 11 2025 21:06:37 risc-v milkv_duos
nsh> ostest
arena       81000    81000
ordblks         2        3
mxordblk    7cff8    78ff8
uordblks     2660     4570
fordblks    7e9a0    7ca90
user_main: Exiting
ostest_main: Exiting with status 0
nsh> Now running https://github.com/lupyuen/nuttx-build-farm/blob/main/oz64-power.sh off
----- Power off Oz64
[]

@lupyuen lupyuen left a comment

Regression Test is successful on Real Hardware: Oz64 SG2000 RISC-V SBC. Thanks :-)

@fdcavalcanti fdcavalcanti force-pushed the feature/risc-v_int_backtrace_review branch from 8aac8a4 to 0baab2c Compare February 11, 2025 16:58
@fdcavalcanti
Contributor Author

I have added a backtrace example output on the PR description.

@fdcavalcanti
Contributor Author

About breaking this up in smaller PRs.
Here's the first one #15835 .

@xiaoxiang781216
Contributor

> About breaking this up in smaller PRs. Here's the first one #15835 .

It's enough to split the big change into independent patches.

@fdcavalcanti fdcavalcanti force-pushed the feature/risc-v_int_backtrace_review branch from 6656d9f to 7ebfe0f Compare February 17, 2025 10:55
@fdcavalcanti
Contributor Author

> > About breaking this up in smaller PRs. Here's the first one #15835 .
>
> It's enough to split the big change into independent patches.

Please check if this separation is enough. I can open a new PR if it makes more sense.

tmedicci and others added 2 commits February 17, 2025 07:57
Adds support for backtrace when the system crashes during IRQ for RISC-V.
Tested with SMP, no SMP and no interrupt stack.
@fdcavalcanti fdcavalcanti force-pushed the feature/risc-v_int_backtrace_review branch from 7ebfe0f to 5cf2e43 Compare February 17, 2025 10:58

lupyuen commented Feb 17, 2025

@nuttxpr test milkv_duos:nsh

nuttxpr commented Feb 17, 2025

[Experimental Bot, please feedback here]

Build and Test Successful (milkv_duos:nsh)
https://gitlab.com/lupyuen/nuttx-build-log/-/snippets/4807555

$ git clone https://github.com/fdcavalcanti/nuttx nuttx --branch feature/risc-v_int_backtrace_review
$ git clone https://github.com/apache/nuttx-apps apps --branch master
$ pushd nuttx
$ git reset --hard HEAD
HEAD is now at 5cf2e434d6 arch/risc-v: improve nested interrupt assertion on riscv_doirq
$ popd
$ pushd apps
$ git reset --hard HEAD
HEAD is now at db8542d2b interpreters/python: fix patch to set `_PyRuntime` attribute
$ popd
NuttX Source: https://github.com/apache/nuttx/tree/5cf2e434d685cb75f4b67c57837184909b1f763d
NuttX Apps: https://github.com/apache/nuttx-apps/tree/db8542d2b12a93c39a154ef5d100daecceeb9863
$ cd nuttx
$ tools/configure.sh milkv_duos:nsh
$ make -j
$ make -j export
$ pushd ../apps
$ ./tools/mkimport.sh -z -x ../nuttx/nuttx-export-12.1.0.tar.gz
$ make -j import
$ popd
$ genromfs -f initrd -d ../apps/bin -V NuttXBootVol
$ head -c 65536 /dev/zero
$ cat nuttx.bin /tmp/nuttx.pad initrd
$ scp Image tftpserver:/tftpboot/Image-sg2000
$ ssh tftpserver ls -l /tftpboot/Image-sg2000
$ cd /home/luppy/nuttx-build-farm
$ ssh tftpserver
OpenSBI v0.9
nsh> uname -a
NuttX 12.1.0 5cf2e434d6 Feb 17 2025 19:38:55 risc-v milkv_duos
nsh> ostest
arena       81000    81000
ordblks         2        3
mxordblk    7cff8    78ff8
uordblks     2660     4570
fordblks    7e9a0    7ca90
user_main: Exiting
ostest_main: Exiting with status 0
nsh> Now running https://github.com/lupyuen/nuttx-build-farm/blob/main/oz64-power.sh off
----- Power off Oz64
[]

@lupyuen lupyuen left a comment

Tested OK on Real Hardware: Oz64 SG2000 RISC-V SBC. Thanks :-)

@cederom cederom left a comment

Thank you @fdcavalcanti :-) Just a git commit -s and we are ready to go :-)

@acassis acassis merged commit becba71 into apache:master Feb 19, 2025
17 checks passed