Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support warnings #2086

Merged
merged 8 commits into from
Nov 18, 2024
Merged

Support warnings #2086

merged 8 commits into from
Nov 18, 2024

Conversation

jneem
Copy link
Member

@jneem jneem commented Nov 1, 2024

This adds a bunch of infrastructure for warnings (and for reporting multiple errors, potentially), and also a warning for applying a naked function as a contract.

Most of the PR involves the threading of some extra global state through the program and the vm. I could see a few options:

  1. pass the extra state as a parameter to lots of functions
  2. pass in the extra state as something to be owned by the vm, and then allow it to be taken back out
  3. add a lifetime parameter to the program and vm, so that a &mut <global state> can be passed in.

I went with the third option. The first option would require a lot of churn (and then potentially more churn if we want to change the type of the global state). The second option would require the global state to either be a concrete type (but I wanted different types in the cli and nls) or a type parameter. So then given that it looked like we'd want an additional generic parameter anyway, I went for the lifetime parameter of the third option.

Copy link
Contributor

github-actions bot commented Nov 1, 2024

🐰 Bencher Report

Branch2086/merge
Testbedubuntu-latest

⚠️ WARNING: The following Measure does not have a Threshold. Without a Threshold, no Alerts will ever be generated!

Click here to create a new Threshold
For more information, see the Threshold documentation.
To only post results if a Threshold exists, set the --ci-only-thresholds CLI flag.

Click to view all benchmark results
BenchmarkLatencynanoseconds (ns)
fibonacci 10📈 view plot
⚠️ NO THRESHOLD
493,760.00
foldl arrays 50📈 view plot
⚠️ NO THRESHOLD
1,676,900.00
foldl arrays 500📈 view plot
⚠️ NO THRESHOLD
6,721,000.00
foldr strings 50📈 view plot
⚠️ NO THRESHOLD
7,058,600.00
foldr strings 500📈 view plot
⚠️ NO THRESHOLD
62,395,000.00
generate normal 250📈 view plot
⚠️ NO THRESHOLD
45,063,000.00
generate normal 50📈 view plot
⚠️ NO THRESHOLD
1,947,700.00
generate normal unchecked 1000📈 view plot
⚠️ NO THRESHOLD
3,335,300.00
generate normal unchecked 200📈 view plot
⚠️ NO THRESHOLD
761,090.00
pidigits 100📈 view plot
⚠️ NO THRESHOLD
3,193,400.00
pipe normal 20📈 view plot
⚠️ NO THRESHOLD
1,494,300.00
pipe normal 200📈 view plot
⚠️ NO THRESHOLD
9,996,300.00
product 30📈 view plot
⚠️ NO THRESHOLD
824,190.00
scalar 10📈 view plot
⚠️ NO THRESHOLD
1,515,900.00
sum 30📈 view plot
⚠️ NO THRESHOLD
809,620.00
🐰 View full continuous benchmarking report in Bencher

@@ -117,7 +117,7 @@ impl Files {
// This implementation would be a little nicer if `Vector` supported mutable access.
// unwrap: we're allowed to panic if file_id is invalid
let mut old = self.get(file_id).unwrap().clone();
old.source = source.into();
old = File::new(old.name, source);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes a recently-introduced but where the line_starts cache wasn't updated. Unlikely to trigger in nickel itself, but it caused some problems for nls.

@jneem jneem marked this pull request as ready for review November 4, 2024 08:10
@yannham
Copy link
Member

yannham commented Nov 12, 2024

I haven't looked at it in details yet, but why won't something like Vec<Box<dyn IntoDiagnostic>> work for 2. ? Do you need to access the actual structure of the data? As a first impression, I'm not too keen on adding yet another lifetime parameter to Vm and Program just for warnings if it can be avoided (and as long as it doesn't impact performance on the happy path, but this shouldn't).

@jneem
Copy link
Member Author

jneem commented Nov 12, 2024

Fair enough, I'll explore option 2 a little. I didn't want to tie the implementation to a vec of diagnostics, because I want to allow for the possibility that some diagnostics handlers will want to issue diagnostics immediately while evaluation continues.

I do have another secret motivation for wanting to try out a lifetime parameter on Vm, which is to allow a Cache that isn't owned by the vm, so that it can be re-used.

@yannham
Copy link
Member

yannham commented Nov 12, 2024

Fair enough, I'll explore option 2 a little. I didn't want to tie the implementation to a vec of diagnostics, because I want to allow for the possibility that some diagnostics handlers will want to issue diagnostics immediately while evaluation continues.

By issuing you mean that you might want to print a diagnostic immediately, instead of accumulating and only at the end printing them? In that case I think you can still keep the Renderer idea. But as this isn't expected to be called a lot or to matter that much for performance, I would put it under a Box<dyn _> instead of carrying a lifetime parameter everywhere. In fact, it's a separate subject but that's probably what we should havedone as well for the ImportResolver parameter, because the number of times resolve is called doesn't justify the pollution of the VirtualMachine type with R: ImportResolver everywhere, IMHO.

I do have another secret motivation for wanting to try out a lifetime parameter on Vm, which is to allow a Cache that isn't owned by the vm, so that it can be re-used.

That's a fair motivation, but probably slightly orthogonal (at least in the Box<dyn _> vs lifetime parameter debate). Also, would those lifetime be necessarily the same ? (I guess the lifetime is in fact a way of speaking about 'self in some sense?)

Copy link
Contributor

github-actions bot commented Nov 13, 2024

@jneem
Copy link
Member Author

jneem commented Nov 13, 2024

That's a fair motivation, but probably slightly orthogonal (at least in the Box vs lifetime parameter debate). Also, would those lifetime be necessarily the same ? (I guess the lifetime is in fact a way of speaking about 'self in some sense?)

I think we could get by with a single lifetime parameter, but I guess that's something to look into later anyway.

But as this isn't expected to be called a lot or to matter that much for performance, I would put it under a Box instead of carrying a lifetime parameter everywhere.

The previous iteration did in fact use a Box<dyn Reporter>, but it had a lifetime (i.e. Box<dyn Reporter + 'ctx>) because it was internally using a mutable reference to push errors into something owned by something longer-lived than the Program. The latest version removes the lifetime, and instead uses an mpsc channel to deliver the errors. It feels a little strange because I usually think of channels as involving threads, but it actually worked pretty well I think. (And it's encapsulated, so no one else has to think about the channels)

Copy link
Member

@yannham yannham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I was initially a bit skeptical about the overall size of the diff, in the end there are many things I like in this PR, mainly:

  • We get rid of a lot of juggling caused by the need to include program in CLI errors before; now we can just cheaply clone files and call it a day, which looks much cleaner 👍 (we also got rid of unsatisfying report_xxx methods in Program)
  • Using channels for reporting stuff is actually quite elegant, and corresponds well to the mental model of sending data to a reporter or a terminal.

TODOs that I see for the future:

  • Add more warnings (another obvious one would be unused variable, which can be done currently in the free var transform, and in the future in the compiler)
  • Various flags to control warnings (I suppose if you run this version of Nickel on a large existing code base that use naked contracts you'll get a full load of warnings). Like -Werror and to disable them at least?

if self.customize_mode.is_empty() {
return Ok(program);
}

let evaled = match program.eval_record_spine() {
Ok(evaled) => evaled,
// We need a `return` control-flow to be able to take `program` out
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Not anymore 🙂

cli/src/customize/mod.rs Outdated Show resolved Hide resolved
cli/src/error.rs Show resolved Hide resolved
cli/src/error.rs Outdated
Comment on lines 165 to 168
// This impl is allowed to incorrectly say that two warnings are equal. We allow
// this in order to skip comparing the full `Files` databases, which could be
// large. Since this is only used for deduplicating warnings, the impact of a
// false answer is limited.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this; do you have an example?

Edit: Ok, now I get it - you mean that you don't compare the file database. I think the wording is a bit misleading (it's just that equality ignores the file database, which is ok; initially I wondered if this impl would break the laws of Eq and PartialEq, which is not ok). Maybe saying that directly, that is "This implementation only compare the content of the warnings and ignores the Files database for obvious performance reasons. Since this is only [...]" instead of "is allowed to incorrectly say that two warnings are equal", which is less specific IMHO. Also I guess having two equal warnings (meaning with the same positions stored as well, including file ids and byte offsets) that actually points to different files is highly unlikely.

core/src/program.rs Outdated Show resolved Hide resolved
core/src/program.rs Outdated Show resolved Hide resolved
core/src/error/warning.rs Outdated Show resolved Hide resolved
@jneem
Copy link
Member Author

jneem commented Nov 18, 2024

Thanks! I'm glad the refactors made sense, because it was starting to feel like I lost track of the original goal...

Various flags to control warnings (I suppose if you run this version of Nickel on a large existing code base that use naked contracts you'll get a full load of warnings). Like -Werror and to disable them at least?

Maybe it makes sense to bikeshed some sort of global nickel config? There's a long discussion here about how to integrate layered config into clap, but it appears there's no resolution yet. (I didn't read the whole thing, just scrolled to the end)

@jneem jneem enabled auto-merge November 18, 2024 04:05
@jneem jneem added this pull request to the merge queue Nov 18, 2024
Merged via the queue into master with commit afe47ff Nov 18, 2024
5 checks passed
@jneem jneem deleted the error-files branch November 18, 2024 04:46
@yannham
Copy link
Member

yannham commented Nov 18, 2024

Maybe it makes sense to bikeshed some sort of global nickel config? There's a long discussion clap-rs/clap#2763 about how to integrate layered config into clap, but it appears there's no resolution yet. (I didn't read the whole thing, just scrolled to the end)

Yes, I agree we need a structured approach, with relations and priorities between env vars, CLI arguments and a potential config file, and a consistent option naming scheme, instead of just adding ad hoc flags here and there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants