Port Hexagon-specific compiler-rt routines to Zig #21579

Open
alexrp opened this issue Oct 3, 2024 · 10 comments · May be fixed by #22029
Labels: arch-hexagon (Qualcomm Hexagon DSP); compiler-rt; contributor friendly (This issue is limited in scope and/or knowledge of Zig internals.); enhancement (Solving this issue will likely involve adding new logic or components to the codebase.)
Comments

@alexrp
Member

alexrp commented Oct 3, 2024

To complete our Hexagon support, we will need to port the Hexagon-specific compiler-rt routines to naked functions in Zig: https://github.com/llvm/llvm-project/tree/6c25604df2f669a0403a17dbdbe5c081db1e80a1/compiler-rt/lib/builtins/hexagon

For some of these, as a stopgap, we may be able to get away with using the generic routines we already have and just exporting them with the Hexagon-specific names. That may even be preferable unless we have strong evidence that the hand-written routines are significantly better.
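As a rough illustration of the stopgap, the Hexagon-specific names could be derived mechanically from the generic ones. The sketch below is only a name-mapping helper; the list of routines and the helper names (`hexagon_name`, `alias_table`) are assumptions for illustration, and it does not check whether a given generic routine is ABI-compatible with its Hexagon counterpart.

```python
# Illustrative subset of generic integer routines (an assumption, not a
# complete or verified list of what Hexagon needs).
GENERIC_INT_ROUTINES = ["__divsi3", "__udivsi3", "__modsi3", "__umodsi3"]


def hexagon_name(generic: str) -> str:
    """Derive the Hexagon-specific symbol, e.g. __umodsi3 -> __hexagon_umodsi3."""
    if not generic.startswith("__"):
        raise ValueError(f"expected a name starting with '__', got {generic!r}")
    return "__hexagon_" + generic[2:]


def alias_table(names):
    """Map each Hexagon-specific name to the generic routine it would alias."""
    return {hexagon_name(n): n for n in names}


if __name__ == "__main__":
    for alias, target in alias_table(GENERIC_INT_ROUTINES).items():
        print(f"{alias} -> {target}")
```

Each pair in the table would then correspond to one extra export of an existing generic routine under the Hexagon name.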

Here is the actual usage in LLVM:

As far as I can see, all of them except the memcpy helper use the regular C calling convention.

@alexrp alexrp added the arch-hexagon Qualcomm Hexagon DSP label Oct 3, 2024
@alexrp alexrp added this to the unplanned milestone Oct 3, 2024
@alexrp alexrp added enhancement Solving this issue will likely involve adding new logic or components to the codebase. contributor friendly This issue is limited in scope and/or knowledge of Zig internals. compiler-rt labels Oct 3, 2024
@androm3da
Contributor

> For some of these, as a stopgap

Does Zig have an equivalent to -fno-builtin that might be another stopgap worth considering?

Is it useful to link against the clangrt library from the C/C++ toolchain, or do most/all architectures for Zig have these built by/for Zig itself?

@alexrp
Member Author

alexrp commented Oct 9, 2024

> Does Zig have an equivalent to -fno-builtin that might be another stopgap worth considering?

Not at the moment. But we could just make our LLVM backend set that flag for Hexagon specifically. I'm unsure if that's sufficient to make LLVM stop emitting these libcalls though.

> Is it useful to link against the clangrt library from the C/C++ toolchain, or do most/all architectures for Zig have these built by/for Zig itself?

As a rule, the Zig toolchain has to be completely self-contained except in cases where that's outright impossible (think *-windows-msvc). This is a requirement for our ability to cross-compile to most targets out of the box, and is one of the reasons why we maintain our own compiler-rt implementation: https://github.com/ziglang/zig/tree/master/lib/compiler_rt

@androm3da
Contributor

> As a rule, the Zig toolchain has to be completely self-contained except in cases where that's outright impossible (think *-windows-msvc). This is a requirement for our ability to cross-compile to most targets out of the box, and is one of the reasons why we maintain our own compiler-rt implementation: https://github.com/ziglang/zig/tree/master/lib/compiler_rt

Yeah, that makes sense. We contributed something similar to Rust not long ago, though we kind of cheated there and just used a thin wrapper around the assembly. If we need to create Zig implementations of these builtins with inline asm, that might take a bit more doing.

@alexrp
Member Author

alexrp commented Oct 9, 2024

It wouldn't actually be terribly hard since Zig does support naked functions. So you basically just preprocess the assembly files and then paste the resulting assembly into an asm volatile expression in a naked function with the right name, and then @export() it. It's just a bunch of boring grunt work, basically.
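The steps described above (wrap the preprocessed assembly in a naked function, then export it) can be sketched as a small generator that emits Zig source text. This is a minimal sketch, not the project's tooling: the helper names are hypothetical, and the emitted `callconv(.naked)`, `noreturn`, and `@export(&name, ...)` spellings are assumptions that may differ between Zig versions.

```python
def zig_naked_function(name: str, asm_body: str) -> str:
    """Wrap preprocessed assembly in Zig source for a naked function.

    Each assembly line becomes one line of a Zig multiline string
    literal, which is introduced by a leading `\\\\` on every line.
    """
    asm_lines = "\n".join(
        "        \\\\" + line for line in asm_body.strip("\n").splitlines()
    )
    return (
        f"fn {name}() callconv(.naked) noreturn {{\n"
        f"    asm volatile (\n"
        f"{asm_lines}\n"
        f"    );\n"
        f"}}\n"
    )


def zig_export_block(names) -> str:
    """Emit a comptime block exporting each function under its own name.

    The `@export` argument form used here is an assumption.
    """
    lines = ["comptime {"]
    for n in names:
        lines.append(f'    @export(&{n}, .{{ .name = "{n}" }});')
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = " {\n  jumpr r31\n }"
    print(zig_naked_function("__hexagon_example", body))
    print(zig_export_block(["__hexagon_example"]))
```

Emitting text rather than hand-pasting keeps the original `.S` files as the source of truth, at the cost of having to review the generated output.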

@androm3da
Contributor

Yeah - I think my initial contribution for Rust looked more like that version, actually. For a couple of these algorithms, you lose a little bit of maintainability by taking the preprocessor output. But it's doable.

I don't yet have permission to contribute to Zig, but I'll make the request and see how it goes.

@androm3da
Contributor

Here's the script I used to automate the "boring grunt work" for Rust. During review, this approach was rejected for Rust's compiler-builtins. But if it suits Zig, maybe this is a good starting point.

#!/usr/bin/env python3

import re
import sys
from glob import glob
from pprint import pprint

file_text = '''
#![cfg(not(feature = "no-asm"))]
#![allow(unused_imports)]
#![allow(named_asm_labels)]

use core::intrinsics;

intrinsics! {
'''
#DEFS_PAT = re.compile(r'^\s*#define\s+(?P<val>\S+)\s+(?P<repl>\S+)')
DEFS_PAT = re.compile(r'#define\s*(?P<val>\S+)\s*(?P<repl>\S+)')
CPP_INCL_PAT = re.compile(r'\s*#\s*\d+\s*\".*\"')

def get_defs(contents):
    for def_ in DEFS_PAT.finditer(contents):
        gr = def_.groups()
        if '(' in gr[0]:
            continue
        yield gr[0], gr[1]

from subprocess import Popen, PIPE
import shlex

def xform_to_inline_asm(func_name, text, defs = None):
    if defs:
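        # `filename` is not in scope here; report the function name instead.
        print('defs', func_name)
        pprint(defs)
        for def_val, repl in defs.items():
            # Both boundaries must be raw strings: a plain '\b' is a backspace,
            # not a regex word boundary. Escape the macro name as well.
            pat = re.compile(r'\b' + re.escape(def_val) + r'\b')
            text = pat.sub(repl, text)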

    text = text.replace('{', '{{').replace('}', '}}')
    text = text.replace('"', r'\"')
    text = '\n'.join(f'    "{line}\\n",' for line in text.split('\n'))

    extra = r'''
".Lmemcpy_call:\n",
"jump memcpy@PLT\n",''' if func_name and 'likely_aligned_min32bytes_mult8bytes' in func_name else ''


    func_name = func_name if func_name else '__tbd'
    return f'''#[naked]
    pub unsafe extern "C" fn {func_name}() {{
        core::arch::asm!(
        {text}{extra}
        options(noreturn)
        );
    }}'''

PUB_LABEL_PAT = re.compile(r'\s*([^\.][A-Za-z0-9\._]+)\s*:\s*')
#PUB_LABEL_PAT = re.compile(r'^\s*(\S+)\s*:\s*')

def get_asm(dirname):
    # NOTE: dirname is used as a plain prefix, so it must end with a path separator.
    for filename in glob(dirname + '*.S'):
        func = re.compile(r'^FUNCTION_BEGIN\s*(?P<func_name>\S+)$(?P<func_body>.*?)^FUNCTION_END', re.MULTILINE | re.DOTALL)
#       text = open(filename, 'rt').read()
        p = Popen(shlex.split(f'cpp {filename}'), stdout=PIPE)
        text = p.communicate()[0]
        text = text.decode('utf-8')
        text = '\n'.join(l for l in text.splitlines() if not CPP_INCL_PAT.search(l))
#       defs = dict(get_defs(text))
        defs = None

        matches = func.findall(text)
        if len(matches) > 1:
            print('too many: guessing',filename)
#           yield False, xform_to_inline_asm(None, text.strip(), defs)
            continue
        else:
            print('matches:', len(matches), filename)
        m = func.search(text)
        if not m:
            print('guessing', filename)
            l = PUB_LABEL_PAT.search(text)
            label = '__tbd'
            if l:
                label = l.groups()[0]
                print('\tfound', label)
            yield False, xform_to_inline_asm(label, text.strip(), defs)
            continue
#           raise Exception('oh no!')
        gr = m.groupdict()

        func_name = gr['func_name']
        func_text = gr['func_body'].strip()

        yield True, xform_to_inline_asm(func_name, func_text, defs)

if __name__ == '__main__':
#   print('new text')
#   print(new_text)
#   p = Popen(shlex.split(f'rustfmt {filename}'), stdout=PIPE)
    funcs = list(get_asm(sys.argv[1]))
    goodfuncs = [f for good, f in funcs if good]
    with open('src/hexagon.rs', 'wt') as f:
        f.write(file_text)
        for func in goodfuncs:
            f.write(func)
            f.write('\n\n')
        f.write('}\n')
    badfuncs = [f for good, f in funcs if not good]
    with open('src/hexagon_.rs', 'wt') as f:
        f.write(file_text)
        for func in badfuncs:
            f.write(func)
            f.write('\n\n')
        f.write('}\n')
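To show what the extraction step in the script above does, here is a small self-contained sketch that applies the same two patterns to an inline sample instead of running `cpp`. The sample text and the helper name `extract_function` are made up for illustration; the real preprocessor output will be longer.

```python
import re

# Same patterns as in the script: cpp line markers such as `# 1 "file.S"`,
# and the FUNCTION_BEGIN ... FUNCTION_END assembler-macro pair.
CPP_INCL_PAT = re.compile(r'\s*#\s*\d+\s*\".*\"')
FUNC_PAT = re.compile(
    r'^FUNCTION_BEGIN\s*(?P<func_name>\S+)$(?P<func_body>.*?)^FUNCTION_END',
    re.MULTILINE | re.DOTALL,
)

# Hypothetical, heavily shortened preprocessor output.
SAMPLE = '''# 1 "umodsi3.S"
# 1 "<built-in>"
FUNCTION_BEGIN __hexagon_umodsi3
 {
  r2 = cl0(r0)
  jumpr r31
 }
FUNCTION_END __hexagon_umodsi3
'''


def extract_function(cpp_output):
    """Return (name, body) for the first function found, or None."""
    # Drop line markers first so they never end up inside the asm body.
    text = '\n'.join(
        l for l in cpp_output.splitlines() if not CPP_INCL_PAT.search(l)
    )
    m = FUNC_PAT.search(text)
    if m is None:
        return None
    return m.group('func_name'), m.group('func_body').strip()


if __name__ == '__main__':
    print(extract_function(SAMPLE))
```

Note that `.strip()` also removes the leading indentation of the first body line, which is harmless for assembly but worth knowing when comparing against the original file.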

@androm3da
Contributor

> It wouldn't actually be terribly hard since Zig does support naked functions. So you basically just preprocess the assembly files and then paste the resulting assembly into an asm volatile expression in a naked function with the right name, and then @export() it. It's just a bunch of boring grunt work, basically.

@alexrp does this look about right?

fn __hexagon_umodsi3() align(16) callconv(.Naked) void {
    @setRuntimeSafety(false);
    asm volatile (
      \\ {
      \\   r2 = cl0(r0)
      \\   r3 = cl0(r1)
      \\   p0 = cmp.gtu(r1,r0)
      \\  }
      \\  {
      \\   r2 = sub(r3,r2)
      \\   if (p0) jumpr r31
      \\  }
      \\  {
      \\   loop0(1f,r2)
      \\   p1 = cmp.eq(r2,#0)
      \\   r2 = lsl(r1,r2)
      \\  }
      \\  .falign
      \\ 1:
      \\  {
      \\   p0 = cmp.gtu(r2,r0)
      \\   if (!p0.new) r0 = sub(r0,r2)
      \\   r2 = lsr(r2,#1)
      \\   if (p1) r1 = #0
      \\  }:endloop0
      \\  {
      \\   p0 = cmp.gtu(r2,r0)
      \\   if (!p0.new) r0 = sub(r0,r1)
      \\   jumpr r31
      \\  }
    );
    unreachable;
}
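For readers who want to sanity-check the assembly without Hexagon hardware, the routine can be modeled in Python as a shift-and-subtract remainder. This is a reading of the listing above, not an authoritative simulation: it assumes that all reads within a packet see the register values from before the packet, that `.new` predicates use the value computed in the same packet, and that a `loop0` body runs at least once even when the count is zero. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def cl0(x):
    """Count leading zero bits of a 32-bit value (32 for zero)."""
    return 32 - (x & MASK32).bit_length()


def hexagon_umodsi3_model(r0, r1):
    """Model of the packets above: returns r0 mod r1 for 32-bit unsigned inputs."""
    r0 &= MASK32
    r1 &= MASK32
    # Packets 1-2: if the divisor exceeds the dividend, r0 is already the answer.
    if r1 > r0:
        return r0
    shift = cl0(r1) - cl0(r0)
    # Packet 3: loop count, p1 and the aligned divisor all use the old r2.
    p1 = shift == 0
    r2 = (r1 << shift) & MASK32
    # Hardware loop; assumed to execute once when the count is 0.
    for _ in range(max(shift, 1)):
        if not r2 > r0:       # if (!p0.new) r0 = sub(r0, r2)
            r0 -= r2
        r2 >>= 1              # r2 = lsr(r2, #1)
        if p1:                # if (p1) r1 = #0
            r1 = 0
    # Final packet: one last conditional subtraction of r1.
    if not r2 > r0:
        r0 -= r1
    return r0


if __name__ == "__main__":
    for a, b in [(5, 3), (7, 1), (0xFFFFFFFF, 0x80000000)]:
        print(a, b, hexagon_umodsi3_model(a, b), a % b)
```

The `p1` case covers equal leading-zero counts: the single loop iteration already performs the only needed subtraction, so `r1` is zeroed to turn the final packet into a no-op.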

@alexrp
Member Author

alexrp commented Nov 16, 2024

  • I think you can drop the @setRuntimeSafety(false) at the beginning and unreachable at the end. I know a lot of existing compiler-rt routines have these, but I think they're just leftovers from a time when the compiler wasn't as smart about naked functions.
  • .Naked is deprecated; the spelling is .naked now.
  • align(16) shouldn't be necessary anymore.
  • Recommend using noreturn return type for naked functions. Effectively, the signature for such functions doesn't matter because you have to @ptrCast them to a 'proper' function pointer type to call them anyway. (This might even become mandatory in the future; see Proposal: A definition of naked functions based on comptime evaluation #21415.)
  • Finally, you'll need a comptime block with @exports for each function.

I haven't reviewed the actual assembly in depth, but I'm happy to trust you on that. 🙂

@androm3da
Contributor

>   • I think you can drop the @setRuntimeSafety(false) at the beginning and unreachable at the end. I know a lot of existing compiler-rt routines have these, but I think they're just leftovers from a time when the compiler wasn't as smart about naked functions.
>   • .Naked is deprecated; the spelling is .naked now.
>   • align(16) shouldn't be necessary anymore.
>   • Recommend using noreturn return type for naked functions. Effectively, the signature for such functions doesn't matter because you have to @ptrCast them to a 'proper' function pointer type to call them anyway. (This might even become mandatory in the future; see Proposal: A definition of naked functions based on comptime evaluation #21415.)
>   • Finally, you'll need a comptime block with @exports for each function.
>
> I haven't reviewed the actual assembly in depth, but I'm happy to trust you on that. 🙂

OK, excellent. I've made these changes locally and should be able to share them Real Soon Now.

@androm3da
Contributor

> I've made these changes locally and should be able to share them Real Soon Now.

Opened #22029

@alexrp alexrp linked a pull request Nov 22, 2024 that will close this issue
@alexrp alexrp modified the milestones: unplanned, 0.14.0 Nov 23, 2024