proposal: separate crates for peripherals #902

rmsyn · 2025-01-28T04:30:06Z

Split peripheral modules into crates

This proposal is to add a CLI option to split peripheral modules into separate crates in a project workspace.

Motivation

For large PAC crates, compilation on machines with less-powerful cores is excruciatingly slow. The bottleneck for any projects depending on the PAC crate is obviously the PAC, because one core is pinged to full usage, while all other cores are idle.

By splitting the PAC into separate crates for each peripheral, each core could work on a separate peripheral crate. The separate crates could also improve incremental compilation, since a change in one of the peripheral crates would not necessarily require recompilation of other peripheral crates.

Challenges

Much of the plumbing is already there for creating separate crates for each peripheral.

Each peripheral already generates its own module.

Peripheral versioning would inherit from the top-level Cargo.toml version.

The main challenge will be making minimal changes to enable the code generation, without adding too much to maintainability burden.

Example

An example project structure might look like:

- Cargo.toml
- <soc>-generic
  - Cargo.toml
  - src
    - lib.rs
- <soc>-i2c
  - Cargo.toml
  - src
    - lib.rs
    - i2c0
    - i2c1
    - ...
    - i2c<N>
- <soc>-spi
  - Cargo.toml
  - src
    - lib.rs
    - spi0
    - spi1
    - ...
    - spi<N>

Alternative approaches

Once the parallel front-end is stable in rustc, the main motivation for this proposal may be less relevant.

However, there will still be users stuck on older versions of rustc where this proposal will be useful.

The text was updated successfully, but these errors were encountered:

burrbull · 2025-01-28T06:26:02Z

I like the idea. I think it is last chunk of the puzzle.
But it requires more changes then an option.

generic` part must be moved to independent module. (what name to use?). This also makes many other things easier.
Finish this Modules (form) #430

Other notes:

groupName should be used for crate names and existing "feature-group" option.
For folder names hash values could be used instead of peripheral names.
With #[path="hash"] mod spi1 attribute. Like in Hashed peripherals djmcgill/form#24. Advantages:

Allows to reuse identical peripherals across different chips.
You can skip writting (or even generating if hash is calculated from SVD directly) peripheral if folder already exists.

burrbull · 2025-01-29T03:35:09Z

So I've tried to implement 1 step but understood that I don't know how to do this. There are 2 issues:

Encapsulation. Some methods of common structures are made private to prevent manual creating. The only solution is to just hide them from docs.
It uses type aliases so you can't add implementation from "outside" (without traits). It can be solved by using "newtypes", which have also other advantages but it creating many new types overloads compiler and it much slower (3 times and bigger). See version 0.14 and previous.

But without this change splitting on different crates makes much less sense as each one will have it's own copy of register access implementation.

I still like the idea as it gives possibility to create and publish independent crates much easier so HALs can reuse them. Like https://github.com/stm32-rs/bxcan

Now to performance issues. I'd want to see on SVDs you are using to generate rust. The best way to improve performance is to reduce of code. You can do it by using register arrays, clusters and by using derivedFrom.

rmsyn · 2025-01-29T06:27:14Z

creating many new types overloads compiler and it much slower (3 times and bigger).

Wow, that is definitely the opposite of the goal of this proposal. Thank you for doing the experiment.

I still like the idea as it gives possibility to create and publish independent crates much easier so HALs can reuse them

I had a similar thought when you brought up the module hashing, like the peripheral crates could be reused by other PACs and HALs.

That may also require parameterizing the base address in some way (either on a Peripheral::new(base_addr: usize), Peripheral<const BASE: usize>, etc.).

The best way to improve performance is to reduce of code

This was also brought up in the meeting yesterday, and it certainly seems like one of the first, most straightforward steps to take in reducing compile times (especially when one has full control over the SVD).

However, I'm already heavily utilizing as much code re-use as possible. There is maybe a bit more re-use I could gain with the SVD attribute for saying a non-contiguous peripheral is based on another definition. For example, there are peripherals that are based on the same IP, but are not in contiguous memory.

I'd want to see on SVDs you are using to generate rust.

The main PAC that I maintain is jh71xx-pac, and I use svd-generator to generate the SVD (the latter is where I could probably make some additional optimizations for code re-use).

Here is a link to the jh71xx-pac SVD

burrbull · 2025-01-29T06:46:33Z

For example, there are peripherals that are based on the same IP, but are not in contiguous memory.

Yes. This is the first you should fix.

For example UART1 should looks like

    <peripheral derivedFrom="uart0">
      <name>uart1</name>
      <groupName>UART</groupName>
      <description>Synopsys DesignWare APB UART: uart1</description>
      <baseAddress>0x10010000</baseAddress>
      <addressBlock>
        <offset>0x0</offset>
        <size>0x10000</size>
        <usage>registers</usage>
      </addressBlock>
      <interrupt>
        <name>UART1</name>
        <value>33</value>
      </interrupt>
    </peripheral>

No registers block. But derivedFrom attribute.
Also I recommend to add groupName for all peripherals which specifies peripheral type.

Besides code reduce this also gives you possibility to write generic (over peripheral) code in your HAL instead of macros.

P.S. You could also try to use patched version of form I mentioned above. But this is useful only to reduce size of code across chips. As it will not give you faster compilation but only less of code.

rmsyn · 2025-01-31T05:53:05Z

No registers block. But derivedFrom attribute.
Also I recommend to add groupName for all peripherals which specifies peripheral type.

So, I followed your advice, and applied the derivedFrom attribute where applicable.

It shaved off ~72k LoC, and ~17-20 seconds from CI compile time!

Going to try it on a low-resource, multicore SBC now.

This is encouraging. I think if we can further improve with parallelizing the PAC build as much as possible, it will greatly improve compile times.

Locally, there is still a period where there seems to be a single core pinged with other idle cores.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

proposal: separate crates for peripherals #902

proposal: separate crates for peripherals #902

rmsyn commented Jan 28, 2025

burrbull commented Jan 28, 2025

burrbull commented Jan 29, 2025 •

edited

Loading

rmsyn commented Jan 29, 2025

burrbull commented Jan 29, 2025 •

edited

Loading

rmsyn commented Jan 31, 2025

proposal: separate crates for peripherals #902

proposal: separate crates for peripherals #902

Comments

rmsyn commented Jan 28, 2025

Split peripheral modules into crates

Motivation

Challenges

Example

Alternative approaches

burrbull commented Jan 28, 2025

burrbull commented Jan 29, 2025 • edited Loading

rmsyn commented Jan 29, 2025

burrbull commented Jan 29, 2025 • edited Loading

rmsyn commented Jan 31, 2025

burrbull commented Jan 29, 2025 •

edited

Loading

burrbull commented Jan 29, 2025 •

edited

Loading