armv8-r/libc: optimize libc string apis with asm #14928

Merged
merged 1 commit into from
Nov 26, 2024

Conversation

jinliangli
Contributor

Optimize the libc string APIs (memcpy/memset/memmove/memchr/strcmp/strlen) with ARM32 assembly instructions, including VFP and NEON. Add arch-related ELF parsing.

Note: Please adhere to Contributing Guidelines.

Summary

Optimize the libc APIs for the ARMv8-R Cortex-R52.

Impact

Affects the libc APIs on ARMv8-R CPUs.
Uses ARM32 assembly and NEON instructions to optimize the libc APIs (memcpy/memset/memmove/memchr/strcmp/strlen).
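
For readers unfamiliar with this kind of optimization, the general idea of handling several bytes per step can be sketched in portable C. This is only an illustration of the word-at-a-time technique under stated assumptions, not the assembly added by this PR; the name `sketch_memchr` is hypothetical, and the real routines use ARM32/NEON instructions rather than 32-bit integer tricks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration only: a word-at-a-time memchr in portable C.
 * It is not the implementation from this PR.
 */
void *sketch_memchr(const void *buf, int c, size_t n)
{
  const unsigned char *p = buf;
  unsigned char ch = (unsigned char)c;

  /* Replicate the target byte into every byte lane of a 32-bit word. */
  uint32_t pattern = 0x01010101u * ch;

  assert(buf != NULL || n == 0);

  while (n >= 4)
    {
      uint32_t w;

      /* Fixed-size memcpy is an alias-safe, alignment-safe 32-bit load. */
      memcpy(&w, p, sizeof(w));

      /* After the XOR, any byte equal to ch becomes 0x00. */
      w ^= pattern;

      /* Classic "has zero byte" test: nonzero iff some byte of w is 0. */
      if (((w - 0x01010101u) & ~w & 0x80808080u) != 0)
        {
          break;
        }

      p += 4;
      n -= 4;
    }

  /* Byte loop: locates the match inside the flagged word, or handles
   * the tail of fewer than four bytes.
   */
  for (; n > 0; n--, p++)
    {
      if (*p == ch)
        {
          return (void *)p;
        }
    }

  return NULL;
}
```

The word loop only decides whether a 4-byte group contains a match; the byte loop then finds the exact position, which keeps the sketch independent of endianness.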

Testing

Tested on an ARMv8-R Cortex-R52 CPU.

github-actions bot added the "Area: OS Components" and "Size: XL" labels Nov 25, 2024
nuttxpr commented Nov 25, 2024

[Experimental Bot, please feedback here]

No, this PR description does not fully meet the NuttX requirements. Here's why and how to fix it:

Missing Information/Insufficient Detail:

  • Summary: While it mentions optimization, it lacks crucial details.
    • Why is this change necessary? Is there a performance bottleneck? Are the current implementations inefficient?
    • What functional part of the code is being changed? Be specific – mention the files/functions modified (e.g., libc/string/memcpy.c).
    • How does the change exactly work? "Use arm32 assemble / neon instructions" is too vague. Briefly explain the optimization techniques used (e.g., loop unrolling, SIMD instructions). Mention if any new functions were added. The mention of "arch related elf parsing" seems unrelated to string functions and needs further explanation or belongs in a separate PR.
  • Impact: The impact descriptions are too brief.
    • Is new feature added? If the ELF parsing is new, say so explicitly. Otherwise, it's likely "Is existing feature changed?" is YES.
    • Impact on user: Probably NO, but if there are any changes to function behavior (however subtle), explain them.
    • Impact on build: Possibly YES, as assembly code is being added. Specify any new build flags, dependencies, or changes to the build system.
    • Impact on hardware: Specify the specific ARM architectures affected (armv8r, cortex-r52 is a good start, but mention if others are impacted). Are there any implications for other architectures?
    • Impact on documentation: Likely YES if new features are added or if the behavior of existing functions changes. State whether documentation updates are included in the PR.
    • Impact on security: Consider potential security implications of optimized code (e.g., buffer overflows if optimizations change memory access patterns). If none, state "NO."
    • Impact on compatibility: Are there any compatibility concerns with older versions of NuttX or other ARM architectures?
  • Testing: The testing description is inadequate.
    • Build Host(s): Provide details about the build host OS, CPU architecture, and compiler version used.
    • Target(s): Be more specific about the target board and configuration.
    • Testing logs: Absolutely essential! Include actual logs demonstrating the functionality before and after the change. Show performance benchmarks if performance improvement is the goal. Demonstrate correctness by showing input and output.
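
One way to produce the correctness evidence requested above is a small harness that runs a candidate copy routine over many lengths and alignments and checks guard bytes around the destination. The following is a minimal sketch under stated assumptions: `check_copy_impl`, `sketch_short_copy`, and the buffer sizes are hypothetical names and values chosen for illustration, not part of NuttX or this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same signature as memcpy, so any candidate routine can be plugged in. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

#define MAX_LEN 64    /* longest copy exercised (assumed value)      */
#define MAX_OFF 8     /* number of start alignments exercised        */
#define GUARD   0xA5  /* fill value used to detect stray writes      */

/* Returns the number of failed checks; 0 means every case passed. */
int check_copy_impl(copy_fn fn)
{
  static unsigned char src[MAX_LEN + MAX_OFF];
  static unsigned char dst[MAX_LEN + 2 * MAX_OFF];
  int failures = 0;
  size_t len, so, doff, i;

  assert(fn != NULL);

  for (i = 0; i < sizeof(src); i++)
    {
      src[i] = (unsigned char)(i * 7 + 1);
    }

  for (len = 0; len <= MAX_LEN; len++)
    for (so = 0; so < MAX_OFF; so++)
      for (doff = 0; doff < MAX_OFF; doff++)
        {
          memset(dst, GUARD, sizeof(dst));

          /* The return value must be the destination pointer. */
          if (fn(dst + doff, src + so, len) != dst + doff)
            failures++;

          /* The copied region must match the source exactly. */
          if (memcmp(dst + doff, src + so, len) != 0)
            failures++;

          /* Bytes before and after the region must be untouched. */
          for (i = 0; i < doff; i++)
            if (dst[i] != GUARD)
              failures++;
          for (i = doff + len; i < sizeof(dst); i++)
            if (dst[i] != GUARD)
              failures++;
        }

  return failures;
}

/* Deliberately broken copy (drops the last byte) to show detection. */
void *sketch_short_copy(void *dst, const void *src, size_t n)
{
  if (n > 0)
    {
      memcpy(dst, src, n - 1);
    }

  return dst;
}
```

Calling `check_copy_impl(memcpy)` should report zero failures, while `check_copy_impl(sketch_short_copy)` should report a nonzero count. The same pattern could be pointed at an optimized routine on the target board, with the result printed to the console as a test log.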

Example of an Improved Summary:

This PR optimizes several libc string functions (memcpy, memset, memmove, memchr, strcmp, strlen) for the armv8r architecture (specifically cortex-r52) to improve performance. The current C implementations are inefficient, particularly for larger data sizes. This optimization utilizes ARM32 assembly language and NEON SIMD instructions to perform operations on multiple bytes simultaneously. Key optimizations include loop unrolling and specialized instructions for memory operations. This PR also adds a new ELF parsing function [...] (Explain its purpose and connection, or remove this if unrelated).

By providing more specific details in each section, the PR will be much clearer and easier for reviewers to evaluate. Remember to follow the contributing guidelines and provide concrete evidence of testing.

Optimize libc string apis (memcpy/memset/memmove/memchr/strcmp/strlen)
with arm32 assembly instructions, including vfp and neon.
Add arch-related elf parsing

Signed-off-by: Jinliang Li <[email protected]>
jinliangli changed the title from "[vcos:bugfix] armv8-r:optimize libc string apis with asm" to "armv8-r/libc: optimize libc string apis with asm" Nov 25, 2024
xiaoxiang781216 merged commit f3213ef into apache:master Nov 26, 2024
27 checks passed