Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make RestoreCommand handle unexpected exceptions. #5997

Open
wants to merge 18 commits into
base: dev
Choose a base branch
from

Conversation

Nigusu-Allehu
Copy link
Contributor

@Nigusu-Allehu Nigusu-Allehu commented Aug 27, 2024

Bug

Fixes: NuGet/Home#13188

Description

recreates #5878

  • Catch all exceptions, and write them to the projects asset file.
  • This is done in RestoreCommad
  • That means, RestoreCommand.ExecuteAsync never throws an exception.

For the screenshots below, the following synthetic projects were used:
- bad.csproj : A project that throws an exception upon restore.
- good.csproj: A project that restores successfully.
- multiprojectWithOneErrors.csproj: A project

Currently, restoring this solution will result in an exception that interrupts the restore from restoring projects that did not cause the exception.

image

After this change, exception are caught allowing restore to complete for other projects as following:

image

VS side

After the change, here's how the errors will be displayed on VS:

image

PR Checklist

  • Meaningful title, helpful description and a linked NuGet/Home issue
  • Added tests
  • Link to an issue or pull request to update docs if this PR changes settings, environment variables, new feature, etc.

@Nigusu-Allehu Nigusu-Allehu requested a review from a team as a code owner August 27, 2024 23:05
@Nigusu-Allehu Nigusu-Allehu marked this pull request as draft August 27, 2024 23:05
@Nigusu-Allehu Nigusu-Allehu changed the title Dev nyenework restore exceptions Make RestoreCommand handle unexpected exceptions. Aug 27, 2024
@Nigusu-Allehu
Copy link
Contributor Author

Comment from closed PR: #5878 (review)
@nkolev92

 I'm glad we're getting this fixed. 

A few things that'd be interesting to me before we commit to the particular solution. 

* How does Visual Studio behave in the same scenario? Ideally the experience is actionable in all scenarios.
* What kind of error in particular are we trying to solve? Can we add a test for that?
* Ideally for "uncontrolled" errors, we have a code and something telling our customers to file an issue. 

Some design considerations around restore changes that aren't really documented in a single space.

* For each restore, the output is an assets file, nuget.g.props and nuget.g.targets files. We try to generate that regardless of the reason for failure.
* All warnings and errors generated for a project must be part of the assets file. To achieve that, the errors/warnigns need to be generated in the RestoreCommand.
* In .NET SDK based projects, this is needed, because otherwise the warning/error will not even show up in the error list.

@Nigusu-Allehu
Copy link
Contributor Author

Comment from closed PR: #5878 (review) @nkolev92

 I'm glad we're getting this fixed. 

A few things that'd be interesting to me before we commit to the particular solution. 

* How does Visual Studio behave in the same scenario? Ideally the experience is actionable in all scenarios.
* What kind of error in particular are we trying to solve? Can we add a test for that?
* Ideally for "uncontrolled" errors, we have a code and something telling our customers to file an issue. 

Some design considerations around restore changes that aren't really documented in a single space.

* For each restore, the output is an assets file, nuget.g.props and nuget.g.targets files. We try to generate that regardless of the reason for failure.
* All warnings and errors generated for a project must be part of the assets file. To achieve that, the errors/warnigns need to be generated in the RestoreCommand.
* In .NET SDK based projects, this is needed, because otherwise the warning/error will not even show up in the error list.

Currently, the idea is to catch all exceptions in RestoreCOmmand.ExecuteAsync(). This would ensure no unhandled exceptions interrupt the restore operation, at least at the RestoreCommand Level. I am not able to think of a scenario where we would want an exception to be thrown by RestoreCommand.

Regarding the error code, I agree that we should have a general error code for this scenario. However, if the exception thrown has an error code with it, we should use the exceptions error code.

@nkolev92
Copy link
Member

Currently, the idea is to catch all exceptions in RestoreCOmmand.ExecuteAsync(). This would ensure no unhandled exceptions interrupt the restore operation, at least at the RestoreCommand Level. I am not able to think of a scenario where we would want an exception to be thrown by RestoreCommand.

That makes sense to me.
There's already some exceptions being caught, such as FatalProtocolException. In that case, the log message is assumed to have been logged.

Regarding the error code, I agree that we should have a general error code for this scenario. However, if the exception thrown has an error code with it, we should use the exceptions error code.

One thing that might be tricky is that in the current implementation when we catch FatalProtocolException we don't really log the message, we assume it's already been logged.
So we need to make sure we don't double log messages.
It is very possible this is enough, a generic error message indicating that something uncontrolled happened and that people need to file an issue to get us to implement a more concrete error.

@Nigusu-Allehu
Copy link
Contributor Author

Nigusu-Allehu commented Aug 30, 2024

Currently, the idea is to catch all exceptions in RestoreCOmmand.ExecuteAsync(). This would ensure no unhandled exceptions interrupt the restore operation, at least at the RestoreCommand Level. I am not able to think of a scenario where we would want an exception to be thrown by RestoreCommand.

That makes sense to me. There's already some exceptions being caught, such as FatalProtocolException. In that case, the log message is assumed to have been logged.

Regarding the error code, I agree that we should have a general error code for this scenario. However, if the exception thrown has an error code with it, we should use the exceptions error code.

One thing that might be tricky is that in the current implementation when we catch FatalProtocolException we don't really log the message, we assume it's already been logged. So we need to make sure we don't double log messages. It is very possible this is enough, a generic error message indicating that something uncontrolled happened and that people need to file an issue to get us to implement a more concrete error.

Currently, the idea is to catch all exceptions in RestoreCOmmand.ExecuteAsync(). This would ensure no unhandled exceptions interrupt the restore operation, at least at the RestoreCommand Level. I am not able to think of a scenario where we would want an exception to be thrown by RestoreCommand.

That makes sense to me. There's already some exceptions being caught, such as FatalProtocolException. In that case, the log message is assumed to have been logged.

Regarding the error code, I agree that we should have a general error code for this scenario. However, if the exception thrown has an error code with it, we should use the exceptions error code.

One thing that might be tricky is that in the current implementation when we catch FatalProtocolException we don't really log the message, we assume it's already been logged. So we need to make sure we don't double log messages. It is very possible this is enough, a generic error message indicating that something uncontrolled happened and that people need to file an issue to get us to implement a more concrete error.

Erro code:

  • I decided to go with Nu1000

Double logging

  • The idea is to catch all exceptions in RestoreCommand and add the errors to the asset file for the project
  • Which would eventually be logged both in CLI and VS
  • On the other hand, as you said, we do catch exceptions like FatalProtocolException. Here: , We do catch the exception and rethrow it if it is an error. And it would probably end up being caught again in RestoreCommand. I personally think we should create another task item that aims to remove these types of loggings - Caught exception being logged and thrown again. After catching an exception, we should: either rethrow without logging or log an error and not rethrow. THis way we avoid double logging.

@Nigusu-Allehu Nigusu-Allehu marked this pull request as ready for review August 30, 2024 21:45

if (unwrappedLogMessage != null)
{
assetsFile.LogMessages.Add(new AssetsLogMessage(LogLevel.Error, unwrappedLogMessage.Code, unwrappedLogMessage.Message, null));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your CLI restore example shows that MSBuild is attributing the exception to NuGet.targets, and not the csproj file. There's a property on AssetsLogMessage so NuGet tells MSBuild which project file the error is related to, which should be filled in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just updated the CLI example in the description section. The screenshot I had was an old one - before I started adding the error to an assets file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about building the assets file after you log the message to the logger.

Currently this adds another place where we can create a discrepancy.

The pattern currently is:

  • You log warnings to the "collector" logger.
  • When you create the assets file, the "collector" logger creates assets file warnings.
  • It'd also be ideal if the failed restore result is created into fewer places. Same reason as above, fewer chances of a discrepancy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the assets file, as it’s responsible for capturing all log messages. is the suggestion to do _logger.Log(error) instead? or to do _logger.log(error) and then follow it up with asset.LogMessage(error)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, don't add messages to the assets file.
There's a collector logger that ensures that all warnings/errors will be part of the built assets file.

Compare that to how we log errors like NU1008, or warnings like NU1603. You call the logger.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does the collector logger write the assets file? This try-catch block is shallow in the call stack, and I wouldn't be surprised if an exception that bubbles all the way up here will not be written to the assets file by the collector logger.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to find out if the collector logger writes to the asset file. From my understanding, we do the writing into the assets file in RestoreCommand.ExecuteAsync

Copy link
Member

@nkolev92 nkolev92 Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do all the logic restore ever would and you just write to the logger and that's it.
The try catch should end before we do the transformation at line 457.

Similarly, don't create a new RestoreResult and keep it in one place. That reduces the changes we end up with restore results that don't end up getting everything and either the output is broken or the result is misinterpreted.

@aortiz-msft
Copy link
Contributor

Can we please add a test for this change that documents the desired behavior?

@aortiz-msft aortiz-msft self-requested a review September 10, 2024 22:32
@aortiz-msft aortiz-msft added the Merge next release PRs that should not be merged until the dev branch targets the next release label Sep 10, 2024
@Nigusu-Allehu Nigusu-Allehu self-assigned this Sep 10, 2024
@dotnet-policy-service dotnet-policy-service bot added the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Sep 27, 2024
@nkolev92 nkolev92 removed the Merge next release PRs that should not be merged until the dev branch targets the next release label Sep 27, 2024
@dotnet-policy-service dotnet-policy-service bot removed the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Sep 27, 2024
@microsoft-github-policy-service microsoft-github-policy-service bot added the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Oct 4, 2024
@Nigusu-Allehu Nigusu-Allehu reopened this Oct 29, 2024
@microsoft-github-policy-service microsoft-github-policy-service bot removed the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Oct 29, 2024
@Nigusu-Allehu Nigusu-Allehu marked this pull request as draft October 29, 2024 22:01
@microsoft-github-policy-service microsoft-github-policy-service bot added the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Nov 5, 2024
@dotnet-policy-service dotnet-policy-service bot removed the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Nov 5, 2024
@Nigusu-Allehu Nigusu-Allehu marked this pull request as ready for review November 7, 2024 22:32
zivkan
zivkan previously approved these changes Nov 8, 2024
Copy link
Member

@zivkan zivkan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would have saved me half a day of debugging, if it was merged about a month ago 😁


if (unwrappedLogMessage != null)
{
assetsFile.LogMessages.Add(new AssetsLogMessage(LogLevel.Error, unwrappedLogMessage.Code, unwrappedLogMessage.Message, null));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, don't add messages to the assets file.
There's a collector logger that ensures that all warnings/errors will be part of the built assets file.

Compare that to how we log errors like NU1008, or warnings like NU1603. You call the logger.


// Assert
Assert.Contains("The 'packageB 1.0.0' package requires NuGet client version '9.0.0' or above, but the current NuGet version is", ex.Message);
Assert.Contains("The 'packageB 1.0.0' package requires NuGet client version '9.0.0' or above, but the current NuGet version is", result.LogMessages.First().Message);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

none of the tests modified in this PR validate that the assets file contains the log message. I'd expect the in-memory results to contain the log because the logger has scope longer than RestoreCommand. However, RestoreCommand needs to write the assets file, including the logger's messages, and if there's an unhandled exception, there's a decent chance that isn't happening.

@microsoft-github-policy-service microsoft-github-policy-service bot added the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Nov 16, 2024
@microsoft-github-policy-service microsoft-github-policy-service bot removed the Status:No recent activity PRs that have not had any recent activity and will be closed if the label is not removed label Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multi-project restore should be resiliant to one project throwing an unhandled exception
4 participants