Introduce a replacement for `base::name` #422

bal-e · 2024-10-28T10:49:49Z

The `base::new_name` module aims to provide a simpler and more ergonomic interface for working with domain names. Its core types (`Name`, `RelName`, etc.) are not generic over an underlying octets type: they are simple unsized byte slices, which can be stored in any container. This greatly simplifies their API and makes their methods more amenable to optimization.

The idea is to gradually replace the use of `base::name` with `base::new_name` across the crate, and eventually to remove `base::name` and let `base::new_name` take its place. Given the large number of modules in the crate, this will likely take a while.

While most of the functionality of `base::name` has been replicated, some things have been explicitly omitted. The `ToName` and `ToRelativeName` traits have been dropped, so most domain name functionality is supported on `Name`, `RelName`, etc., but not on `Chain` or `ParsedName`. Since domain names are so small (usually under 64 bytes and never more than 255), `ParsedName` objects can be copied out into regular `Name`s before being used. By relying on direct byte slice operations, basic methods like domain name comparison have been sped up from quadratic to linear time.
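To illustrate the idea of a name as an unsized byte slice, here is a minimal sketch. It is not the code from this PR: the `from_bytes` constructor and its checks are assumptions made for illustration, and a real implementation would also validate the label structure. Comparison is a single linear pass over the bytes.

```rust
/// Hypothetical stand-in for an absolute domain name in wire format:
/// length-prefixed labels, ending with the zero-length root label.
#[repr(transparent)]
pub struct Name([u8]);

impl Name {
    /// Borrows a byte slice as a `Name` without copying.
    ///
    /// This sketch only checks the overall size limit; it does not
    /// verify that the bytes form well-formed labels.
    pub fn from_bytes(bytes: &[u8]) -> Option<&Name> {
        if bytes.is_empty() || bytes.len() > 255 {
            return None;
        }
        // SAFETY: `Name` is `repr(transparent)` over `[u8]`, so both
        // pointer types have the same layout and metadata.
        Some(unsafe { &*(bytes as *const [u8] as *const Name) })
    }

    /// Returns the underlying wire-format bytes.
    pub fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}

impl PartialEq for Name {
    fn eq(&self, other: &Self) -> bool {
        // Length octets are at most 63, which is below b'A', so ASCII
        // case folding never changes them. Two well-formed names whose
        // bytes match case-insensitively therefore have the same label
        // boundaries, and one linear pass is enough.
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

impl Eq for Name {}
```

Because the type is unsized, callers choose the storage: a `&Name` can borrow from a packet buffer, and owned containers such as `Box` or `Arc` can hold one given suitable conversions.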

The goal of the 'new_name' module is to provide a simpler and more efficient implementation of the basic domain name types. Rather than being generic over the underlying byte sequence, 'Name' and 'RelName' are just unsized byte slices, leaving their allocation up to the user. Their methods are based entirely on slice manipulation rather than generic label iteration. They should be considerably faster, but benchmarks are still lacking.

Unlike the existing 'NameBuilder', this type uses a fixed-size buffer to write the name into. This results in simpler code and it should be more efficient. It provides simple methods to extract domain names by borrowing from the buffer instead of allocating. This is a rewrite of <#394>.
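A builder over a fixed-size buffer could look roughly like the following sketch. The type name `FixedNameBuilder` and its methods are hypothetical and are not taken from this PR; the sketch only shows how names can be borrowed from the buffer without allocating.

```rust
/// Hypothetical builder that writes labels into a fixed 255-byte buffer.
pub struct FixedNameBuilder {
    /// Zero-initialised storage; bytes at and after `len` are never written.
    buf: [u8; 255],
    /// Number of bytes used by the labels written so far.
    len: usize,
}

impl FixedNameBuilder {
    pub fn new() -> Self {
        Self { buf: [0; 255], len: 0 }
    }

    /// Appends one label, rejecting labels or names that are too long.
    pub fn push_label(&mut self, label: &[u8]) -> Result<(), &'static str> {
        if label.is_empty() || label.len() > 63 {
            return Err("label must be 1 to 63 bytes long");
        }
        // Length octet + label bytes + one byte kept free for the root label.
        if self.len + 1 + label.len() + 1 > self.buf.len() {
            return Err("name would exceed 255 bytes");
        }
        self.buf[self.len] = label.len() as u8;
        self.buf[self.len + 1..self.len + 1 + label.len()].copy_from_slice(label);
        self.len += 1 + label.len();
        Ok(())
    }

    /// Borrows the labels written so far as a relative name.
    pub fn relative(&self) -> &[u8] {
        &self.buf[..self.len]
    }

    /// Borrows the labels plus the terminating root label.
    /// The byte at `len` is still zero because it was never written.
    pub fn absolute(&self) -> &[u8] {
        &self.buf[..self.len + 1]
    }
}
```

Pushing `www` and then `example` leaves `\x03www\x07example` in the buffer, and both accessors return slices that borrow from the builder itself.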

bal-e · 2024-10-28T10:55:03Z

@partim, @Philip-NLnetLabs: I'd like to hear your opinion on `base::new_name::octets`. I'm still thinking about what this interface should look like -- at least some names will have to be changed. Do you think it is a good fit for byte slices with bounded sizes? Where should this module be placed (does it fit in `octseq`)? And do we need `Bytes`-like functionality on top of it? `Arc<Name>` works, but it does not offer slicing support.

bal-e added enhancement New feature or request needs-review labels Oct 28, 2024

bal-e self-assigned this Oct 28, 2024

bal-e marked this pull request as draft October 28, 2024 10:49

bal-e added 15 commits October 28, 2024 11:50

[base/new_name] impl 'Hash'

0a37d6d

I was worried I'd have to iterate over the labels, but actually, it's pretty straightforward. I hope the compiler can vectorize it nicely.
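One way to hash a wire-format name without walking its labels is sketched below. This is an assumption about the approach, not the code from the commit, and `WireName` and `hash_of` are hypothetical names. It relies on length octets being at most 63, so lowercasing every byte leaves them unchanged.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical wrapper around the wire-format bytes of a name.
pub struct WireName<'a>(pub &'a [u8]);

impl Hash for WireName<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Feed every byte through ASCII lowercasing. Length octets
        // (0..=63) are not letters and pass through unchanged, so no
        // label parsing is needed and the loop is a plain byte scan.
        for &byte in self.0 {
            state.write_u8(byte.to_ascii_lowercase());
        }
    }
}

/// Helper for trying the impl out with the standard library hasher.
pub fn hash_of(name: &WireName<'_>) -> u64 {
    let mut hasher = DefaultHasher::new();
    name.hash(&mut hasher);
    hasher.finish()
}
```

Names that differ only in ASCII case produce the same hash, which keeps `Hash` consistent with a case-insensitive `Eq`.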

[base/new_name] Split into submodules

04b315b

[base/new_name] Note runtime complexity in docs

f744582

[base/new_name] Augment error types

c4b1b9f

[new_name/label] Add 'is_internationalized()'

06ac4d2

[new_name/absolute] fix rustdoc typo

68d2a5c

[base/new_name] impl conversion traits

5954d7a

[base/new_name/label] fix missing doc links

97030db

[base/new_name] Add 'Labels' for iteration

148dcfe
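A label iterator over wire-format bytes can be sketched as follows. The struct layout and the handling of malformed input are assumptions for illustration; the `Labels` type in this PR may differ.

```rust
/// Hypothetical iterator over the labels of a wire-format name.
pub struct Labels<'a> {
    /// Bytes that have not been yielded yet.
    rest: &'a [u8],
}

impl<'a> Labels<'a> {
    pub fn new(name: &'a [u8]) -> Self {
        Self { rest: name }
    }
}

impl<'a> Iterator for Labels<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<&'a [u8]> {
        let bytes: &'a [u8] = self.rest;
        // The first byte is the length octet of the next label.
        let (&len, tail) = bytes.split_first()?;
        let len = len as usize;
        if len > tail.len() {
            // Malformed input: stop iterating instead of panicking.
            self.rest = &[];
            return None;
        }
        let (label, rest) = tail.split_at(len);
        self.rest = rest;
        Some(label)
    }
}
```

For an absolute name the final item is the empty root label; for a relative name the iterator simply ends after the last label.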

[base/new_name/builder] Fix bug in 'len()'

4e2da52

It was returning the value of 'total_len()'. Also fixed a clippy warning.

[base/new_name] Implement 'UncertainName'

aa70192

I also fixed a few bugs in the existing name methods.

[base/new_name] Try out a custom 'Octets' and 'Owned'

2ca8a33

[new_name/label] Add 'WILDCARD' and 'is_wildcard()'

6614676

bal-e force-pushed the new-name branch from 843a837 to 6614676 Compare October 28, 2024 10:51

[base/new_name] Use 'static for const references

7e995b4

bal-e requested a review from a team October 28, 2024 11:05

bal-e added 6 commits October 28, 2024 22:27

[new_name] Implement basic IDNA awareness

615c2eb

An incomplete and untested Punycode decoder has also been added.

[idna] add tests for A-label decoding

f2761fb

A basic quadratic-time output builder has been implemented. At the moment, no further validation of U-labels is performed; we need a way to represent U-labels (even owned ones) and to validate and encode them from there.
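For reference, a Punycode decoder following the algorithm in RFC 3492 can be sketched as below. This is not the decoder from this PR; `decode_punycode` is a hypothetical function that takes the part of an A-label after the `xn--` prefix and, like the PR, performs no U-label validation. Each decoded code point is inserted into the middle of a `Vec<char>`, which is where the quadratic running time comes from.

```rust
// Parameters from RFC 3492, section 5.
const BASE: u32 = 36;
const T_MIN: u32 = 1;
const T_MAX: u32 = 26;
const SKEW: u32 = 38;
const DAMP: u32 = 700;
const INITIAL_BIAS: u32 = 72;
const INITIAL_N: u32 = 128;

/// Bias adaptation function from RFC 3492, section 6.1.
fn adapt(mut delta: u32, num_points: u32, first: bool) -> u32 {
    delta /= if first { DAMP } else { 2 };
    delta += delta / num_points;
    let mut k = 0;
    while delta > ((BASE - T_MIN) * T_MAX) / 2 {
        delta /= BASE - T_MIN;
        k += BASE;
    }
    k + (BASE * delta) / (delta + SKEW)
}

/// Maps a basic code point to its digit value (a-z = 0..25, 0-9 = 26..35).
fn digit(byte: u8) -> Option<u32> {
    match byte {
        b'a'..=b'z' => Some((byte - b'a') as u32),
        b'A'..=b'Z' => Some((byte - b'A') as u32),
        b'0'..=b'9' => Some((byte - b'0') as u32 + 26),
        _ => None,
    }
}

/// Decodes a Punycode string, returning `None` on invalid input or overflow.
pub fn decode_punycode(input: &str) -> Option<String> {
    // Everything before the last '-' is copied literally.
    let (basic, encoded) = match input.rfind('-') {
        Some(pos) => (&input[..pos], &input[pos + 1..]),
        None => ("", input),
    };
    if !basic.is_ascii() {
        return None;
    }
    let mut output: Vec<char> = basic.chars().collect();
    let (mut n, mut i, mut bias) = (INITIAL_N, 0u32, INITIAL_BIAS);
    let mut bytes = encoded.bytes();
    while let Some(first) = bytes.next() {
        let old_i = i;
        let (mut w, mut k) = (1u32, BASE);
        let mut byte = first;
        // Decode one variable-length integer into `i`.
        loop {
            let d = digit(byte)?;
            i = i.checked_add(d.checked_mul(w)?)?;
            let t = if k <= bias {
                T_MIN
            } else if k >= bias + T_MAX {
                T_MAX
            } else {
                k - bias
            };
            if d < t {
                break;
            }
            w = w.checked_mul(BASE - t)?;
            k += BASE;
            byte = bytes.next()?;
        }
        let len = output.len() as u32 + 1;
        bias = adapt(i - old_i, len, old_i == 0);
        n = n.checked_add(i / len)?;
        i %= len;
        // `Vec::insert` shifts the tail, so this step is O(len).
        output.insert(i as usize, char::from_u32(n)?);
        i += 1;
    }
    Some(output.into_iter().collect())
}
```

With this sketch, the A-label `xn--bcher-kva` would be handled by stripping the prefix and decoding `bcher-kva`.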

fix clippy warnings

ab65fd0

Merge branch 'main' into new-name

69eade9

[new_name/idna] Note 'std' requirement for tests

94371d1

[new_name] Replace 'Owned' with manual '*Buf' types

d25e482

I wasn't comfortable with the 'Owned' paradigm, particularly due to the generic buffer parameter. It's easier to work with an explicit buffer type, and in some cases it may even have useful additional methods.
