Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature for easier test bootstrapping #70

Open
pitaj opened this issue Nov 22, 2020 · 12 comments
Open

Feature for easier test bootstrapping #70

pitaj opened this issue Nov 22, 2020 · 12 comments

Comments

@pitaj
Copy link
Contributor

pitaj commented Nov 22, 2020

While implementing a recent parser, I came upon an annoyance: manually slicing spans when writing test cases was very tedious.

So I modified nom_locate with a feature that allows for much easier testing by ignoring location information during test. It works by utilizing the fact that line should never be zero in normal operation:

    /// Create a span for use in testing
    /// 
    /// Spans created this way will be tested for equality purely on the fragment,
    /// ignoring the `offset` and `line`.
    #[cfg(feature = "easy-test")]
    pub fn test_new_extra(program: T, extra: X) -> LocatedSpan<T, X> {
        LocatedSpan {
            offset: 0,
            line: 0,
            fragment: program,
            extra,
        }
    }

#[cfg(feature = "easy-test")]
impl<T: AsBytes + PartialEq, X> PartialEq for LocatedSpan<T, X> {
    fn eq(&self, other: &Self) -> bool {
        if self.line == 0 || other.line == 0 {
            self.fragment == other.fragment
        } else {
            self.line == other.line && self.offset == other.offset && self.fragment == other.fragment
        }
    }
}

This allows me to define a function to easily use in tests:

function sp(s: &str) -> Span {
    Span::test_new_extra(s, Default::default())
}

And use it in a test like so:

assert_eq!(
    path(sp("a.b.c, what")),
    Ok((
        sp(", what"),
        Expression::Path {
            span: sp("a.b.c"),
            path: vec![
                PathPart::Part(sp("a")),
                PathPart::Part(sp("b")),
                PathPart::Part(sp("c"))
            ]
        }
    ))
);

I've never run into a case where span location was an issue, since it's always automatically derived correctly. So testing it in tests is a lot of tedium for very little if any benefit.

Let me know if there's a way of doing this I'm missing.

@progval
Copy link
Collaborator

progval commented Nov 22, 2020

We could implement PartialEq<T> for LocatedSpan<T>. It should work if your Expression and PathPart are generic in the string type, right?

@pitaj
Copy link
Contributor Author

pitaj commented Nov 22, 2020

Good idea! Those aren't generic in my program, but let me experiment and see if that'll work.

@pitaj
Copy link
Contributor Author

pitaj commented Nov 22, 2020

Okay I got it working, but it requires a custom impl PartialEq for each type you wish to do this on, because PartialEq is not generic over the right-hand side. It can get kinda annoying with large enums

impl<S: PartialEq<O>, O> PartialEq<Expression<O>> for Expression<S> {
    fn eq(&self, other: &Expression<O>) -> bool {
        match (self, other) {
            (Expression::StringLiteral(span), Expression::StringLiteral(other_span)) => {
                span == other_span
            }
            (Expression::Path { span, path }, Expression::Path { span: other_span, path: other_path }) => {
                span == other_span && path == other_path
            }
            (Expression::Negative { span, expr }, Expression::Negative { span: other_span, expr: other_expr }) => {
                span == other_span && **expr == **other_expr
            }
            (Expression::Helper { span, name, args }, Expression::Helper { span: other_span, name: other_name, args: other_args }) => {
                span == other_span && name == other_name && args == other_args
            }
            (Expression::LegacyHelper { span, name, args }, Expression::LegacyHelper { span: other_span, name: other_name, args: other_args }) => {
                span == other_span && name == other_name && args == other_args
            }
            _ => false
        }
    }
}

I also haven't been able to find a crate that provides a derive macro or anything to provide a better experience here, though I'm sure it's possible to create one.

I still think this is a better approach than my suggestion (it's really nice not needing to wrap every string literal in sp()).

@pitaj
Copy link
Contributor Author

pitaj commented Nov 22, 2020

This may be a dealbreaker though:

Because Result, Option, tuples, and other built-in types don't implement PartialEq like this, tests have to use .unwrap().1 all over the place because the following can't work:

        assert_eq!(
            path(sp("@value.c")),
            Ok((
                ".c",
                Expression::Path {
                    span: "@value",
                    path: vec![PathPart::Part("@value")]
                }
            ))
        );

Which means you either have to split checking rest and the parser output into two assertions, or just ignore one side or the other.

@progval
Copy link
Collaborator

progval commented Nov 22, 2020

Hmm yeah.

Did you consider writing a function that converts YourTree<LocatedSpan<T>> into YourTree<T>?
If your parse tree type isn't too large, it shouldn't be too much trouble.

Or alternatively, don't use LocatedSpan at all in most of your tests, by having your parser operate without nom_locate (with a few tests to make sure it still works with nom_locate, ofc)

@pitaj
Copy link
Contributor Author

pitaj commented Nov 22, 2020

How would you go about writing a generic parser like that without specifying tons of trait bounds all over the place? I tried defining one trait which has all of the rest in where Self: ... but I can't figure out how to get that to work for some nom requirements like <Self as nom::InputTakeAtPosition>::Item: nom::AsChar. I also can't figure out a way to allow str methods like .ends_with because span impls Deref<Target = &str> while &str impls Deref<Target = str>

@progval
Copy link
Collaborator

progval commented Nov 23, 2020

Hmm yeah, you're right, I didn't consider you could need those.

@pitaj
Copy link
Contributor Author

pitaj commented Nov 23, 2020

Did you consider writing a function that converts YourTree<LocatedSpan<T>> into YourTree<T>?
If your parse tree type isn't too large, it shouldn't be too much trouble.

I ended up just doing this, and making my own assert_eq! that calls it.

@progval
Copy link
Collaborator

progval commented Nov 23, 2020

Cool! Can I consider this issue solved?

@pitaj
Copy link
Contributor Author

pitaj commented Nov 23, 2020

Well I was able to implement a solution that worked for me, but I think this is still a pain point and warrants further study.

Also, if the default behavior of derive(PartialEq) ever changes to become more generic, it could enable your prior suggested solution.

I'd suggest keeping this issue open as a reminder.

@alixinne
Copy link
Contributor

alixinne commented Dec 1, 2020

I also had to write tests while trying to add nom_locate to an existing parser. I ended up with two traits to add (resp. remove) span information as needed for assertions: https://github.com/vtavernier/glsl/blob/aec07998d7882d92d8eb2d835379321056782aef/glsl/src/parse_tests.rs#L5-L107

/// Trait to add span information. ParseInput is a type alias for nom_locate::Span
trait Span<'c, 'd, 'e>: Sized {
  fn span(self) -> crate::parsers::ParseInput<'c, 'd, 'e>;
}

impl<'c> Span<'c, 'static, 'static> for &'c str {
  fn span(self) -> crate::parsers::ParseInput<'c, 'static, 'static> {
    crate::parsers::ParseInput::new_extra(self, None)
  }
}

// impls for other AST nodes

/// Trait to remove span information, to be implemented on syntax nodes
trait Unspan: Sized {
  type Unspanned: Sized;

  fn unspan(self) -> Self::Unspanned;
}

/// Simplest impl: nom_locate::Span back to str
impl<'c> Unspan for ParseInput<'c, '_, '_> {
  type Unspanned = &'c str;

  fn unspan(self) -> Self::Unspanned {
    self.fragment()
  }
}

// impls for Result, Option, etc.

/// Usage example:
assert_eq!(parser("test".span()).unspan(), Ok("test"))

The rest of the test harness relies on the fact that AST nodes are wrapped in a Node<T> which implements a trait to only compare node contents (regardless of span information): https://github.com/vtavernier/glsl/blob/aec07998d7882d92d8eb2d835379321056782aef/glsl/src/syntax/node.rs#L146

Regardless of the approach chosen I think we still need to have a way to compare span information for tests when needed, so having PartialEq implemented for LocatedSpan "ignoring" the span information wouldn't be the best bet IMO.

@bramp
Copy link

bramp commented Jun 24, 2021

I came across this thread after I solved this (in yet a different way):

    struct EasySpan<T, X> {
        input: LocatedSpan<T, X>,
    }

    impl<T, X> EasySpan<T, X>
    where
        LocatedSpan<T, X>: Slice<RangeFrom<usize>> + Slice<RangeTo<usize>>,
    {
        fn next(&mut self, count: usize) -> LocatedSpan<T, X> {
            let (after, before) = self.input.take_split(count);
            self.input = after;
            before
        }
    }

    #[test]
    fn test_tokenise() {
        let input = Span::new("A  1234");

        let mut span = EasySpan { input };
        let want = Tokens::new(&[
            Token::new(TokenType::Word, span.next(1)),
            Token::new(TokenType::Whitespace, span.next(2)),
            Token::new(TokenType::Number, span.next(4)),
        ]);
        let got = tokenise(input).expect("token parsing failed");
        assert_eq!(got, want);
    }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants