APC: split_pattern on slices #457

Open
eduardorittner opened this issue Oct 7, 2024 · 0 comments
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api


Proposal

Add a split_pattern method on slices. It is similar to the existing split on slices, but takes a pattern slice instead of a closure and splits the receiver at every occurrence of that pattern.

Problem statement

Inspiration was taken from rust-lang/rust#49036, which suggests generalising the existing String::split into a slice::split_pattern implemented for any T: PartialEq, and maybe a Vec::split_pattern as well.

Motivating examples or use cases

From the original issue: suppose you have a Vec<u8> of non-UTF-8 data and you want to split on newlines. It would be very nice to be able to simply write my_vec.split_pattern(b"\n") instead of my_vec.split(|x| *x == b'\n'). The closure form of split gets much clunkier when you want to match multi-element patterns, since split runs the closure on one element at a time, not on a slice.
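For comparison, here is a rough sketch of what this looks like with stable APIs today. `split_on_subslice` is a hypothetical helper written only for this illustration, not an existing std function, and its handling of an empty separator (return the input unsplit) is an assumption of the sketch.

```rust
/// Hypothetical helper (not part of std): splits `haystack` on every
/// non-overlapping occurrence of `needle`, scanning left to right.
fn split_on_subslice<'a, T: PartialEq>(haystack: &'a [T], needle: &[T]) -> Vec<&'a [T]> {
    // Assumption of this sketch: an empty needle has no well-defined
    // split points, so the input is returned unsplit.
    if needle.is_empty() {
        return vec![haystack];
    }
    let mut parts = Vec::new();
    let mut start = 0; // start of the current piece
    let mut i = 0; // current scan position
    while i + needle.len() <= haystack.len() {
        if haystack[i..].starts_with(needle) {
            parts.push(&haystack[start..i]);
            i += needle.len();
            start = i;
        } else {
            i += 1;
        }
    }
    // Whatever follows the last match (possibly empty) is the final piece.
    parts.push(&haystack[start..]);
    parts
}

fn main() {
    let data: Vec<u8> = b"one\r\ntwo\r\nthree".to_vec();

    // Single-element separator: the closure form of `split` is enough.
    let by_lf: Vec<&[u8]> = data.split(|b| *b == b'\n').collect();
    assert_eq!(by_lf, [&b"one\r"[..], &b"two\r"[..], &b"three"[..]]);

    // Multi-element separator: needs a hand-written scan today.
    let by_crlf = split_on_subslice(&data, b"\r\n");
    assert_eq!(by_crlf, [&b"one"[..], &b"two"[..], &b"three"[..]]);
}
```

With the proposed method, the second call would presumably shrink to something like data.split_pattern(b"\r\n").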

Solution sketch

I didn't know about APCs, so I made a PR before this with an idea of how it could be implemented for slices, with the following struct:

pub struct SplitPattern<'a, 'b, T>
where
    T: cmp::PartialEq,
{
    v: &'a [T],
    pattern: &'b [T],
    finished: bool,
}

and the most important method:

impl<'a, 'b, T> Iterator for SplitPattern<'a, 'b, T>
where
    T: cmp::PartialEq,
{
    type Item = &'a [T];

    #[inline]
    fn next(&mut self) -> Option<&'a [T]> {
        if self.finished {
            return None;
        }

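        // Scan for the first position where the remainder starts with the pattern.
        for i in 0..self.v.len() {
            if self.v[i..].starts_with(self.pattern) {
                // Yield everything before the match and resume after it.
                let (left, right) = (&self.v[..i], &self.v[i + self.pattern.len()..]);
                self.v = right;
                return Some(left);
            }
        }
        // No match left. `finish` (not shown) is assumed to yield the
        // remaining slice once and set `finished`.
        self.finish()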
    }
}

next_back would be implemented similarly using ends_with instead of starts_with.
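To make the forward and backward directions concrete, here is a standalone sketch written outside of std. The `new` constructor, the body of `finish`, and the choice to panic on an empty pattern are assumptions added for this illustration; they are not taken from the PR.

```rust
/// Standalone sketch of the proposed iterator, written outside of std.
pub struct SplitPattern<'a, 'b, T: PartialEq> {
    v: &'a [T],
    pattern: &'b [T],
    finished: bool,
}

impl<'a, 'b, T: PartialEq> SplitPattern<'a, 'b, T> {
    /// Assumption of this sketch: an empty pattern is rejected, since it
    /// would match at every position without consuming any input.
    pub fn new(v: &'a [T], pattern: &'b [T]) -> Self {
        assert!(!pattern.is_empty(), "pattern must not be empty");
        Self { v, pattern, finished: false }
    }

    /// Yields the remaining slice exactly once, then ends iteration.
    fn finish(&mut self) -> Option<&'a [T]> {
        if self.finished {
            None
        } else {
            self.finished = true;
            Some(self.v)
        }
    }
}

impl<'a, 'b, T: PartialEq> Iterator for SplitPattern<'a, 'b, T> {
    type Item = &'a [T];

    fn next(&mut self) -> Option<&'a [T]> {
        if self.finished {
            return None;
        }
        // Find the first match from the front and yield what precedes it.
        for i in 0..self.v.len() {
            if self.v[i..].starts_with(self.pattern) {
                let left = &self.v[..i];
                self.v = &self.v[i + self.pattern.len()..];
                return Some(left);
            }
        }
        self.finish()
    }
}

impl<'a, 'b, T: PartialEq> DoubleEndedIterator for SplitPattern<'a, 'b, T> {
    fn next_back(&mut self) -> Option<&'a [T]> {
        if self.finished {
            return None;
        }
        // Try every possible end position, from the back towards the front.
        for end in (1..=self.v.len()).rev() {
            if self.v[..end].ends_with(self.pattern) {
                let right = &self.v[end..];
                self.v = &self.v[..end - self.pattern.len()];
                return Some(right);
            }
        }
        self.finish()
    }
}

fn main() {
    let data: Vec<u8> = b"a::b::c".to_vec();

    let forward: Vec<&[u8]> = SplitPattern::new(&data, b"::").collect();
    assert_eq!(forward, [&b"a"[..], &b"b"[..], &b"c"[..]]);

    let backward: Vec<&[u8]> = SplitPattern::new(&data, b"::").rev().collect();
    assert_eq!(backward, [&b"c"[..], &b"b"[..], &b"a"[..]]);
}
```

Both directions rescan the remaining slice with starts_with or ends_with at each position, so the cost is roughly the slice length times the pattern length. That may be acceptable for a first version, but it is a design point worth discussing.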

The implementation for Vec would be pretty similar I think.

Alternatives

I read about SlicePattern. However, the source comments mention generalising core::str::Pattern, so I wasn't sure whether I should use it, and I also thought that doing so would be a little beyond what I could take on.

Links and related work

@eduardorittner eduardorittner added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Oct 7, 2024