[#197] Canonicalize filepaths #230
Cool, looking at your comments I like how you approach all this. Gonna take a thorough look by Monday. Please switch to another ticket meanwhile, unless you have something else to do here before my review. |
Looks pretty good, better than what I expected to be done in this issue 👍
And your code generally looks good, cool.
I'm sorry for the delayed review, hope it wasn't a blocker for you.
On the failing tests: I believe that the switch from displaying absolute paths in most cases is very good. However, I wonder why in case of Windows (this CI is in GitHub Actions) we get relative paths with Also, I find that in case of There are tests that seem broken now while they shouldn't, e.g. "24 Ignore file with broken xrefcheck annotation: config file". And some diff seems to point to places that were invalid but now are valid, not sure how this happened.
Apparently, we had too many "Link targets a local file outside repository" errors, now it seems to work better.
Could you please regenerate the golden files and carefully commit only those changes where we had "Link targets a local file outside repository" illegally, if that does not take much effort? This should give us an opportunity to document the reasoning of this change (in the commit description) and should simplify sorting through the remaining failing tests.
I have moved |
Now the Golden test expected outputs reflect how the program output has changed. I will investigate the couple of diffs that are unexpectedly not being reported as invalid. |
429459e
to
9bc47b2
Compare
The golden test that is failing reports as "local file outside repository" a link that is not really outside the repository, but passes through its root: the root directory is After the refactor, its path is treated as
I like the changes to the golden tests that were applied, looks neat.
A few more comments.
Interesting. So we got more flexible after your changes. I think this may be helpful in case the user has auto-generated documentation, so let's try to preserve this new feature of yours. But, |
9bc47b2
to
7380221
Compare
Another drawback that I have noticed with having file paths canonicalised is that, on case-insensitive systems, if the file This is why I have not fixed the Golden tests yet: some case-related errors that are reported in CI are not reported on my computer.
511e5d0
to
dd17047
Compare
Oooh, that's a very good spot and a severe issue. This is a completely different concern compared to #197, so could you please create a separate ticket for this? And under it investigate how GitHub and GitLab handle the casing of filepaths, is it similar to anchors handling or not. Maybe we will close that new ticket with this PR too. Now on replacing Let's really go with the weakened canonicalization as you have it now, and please create another ticket for trying to get rid of |
src/Xrefcheck/System.hs
Outdated
-- | A FilePath that has been canonicalized. Should be created via
-- 'canonicalizePath'. Currently, canonical paths have been made absolute,
-- normalised regarding the running platform (e.g. Posix or Windows), with
-- indirections syntractically expanded as much as possible and with no
Hm, what is "syntractically"? Is it a typo?
src/Xrefcheck/System.hs
Outdated
Nothing -> []

-- | Get a relative 'FilePath' from the second given path (child) with
-- repect to the first one (root).
*respect
prepare :: FilePath -> FilePath
prepare path = case dropWhile FP.isPathSeparator path of
  "" -> "."
  other -> other
IMO this name is a bit confusing, prepare for what? Wouldn't it be better to call it dropLeadingPathSeparators?
I will give it a proper name. It also handles the empty case specially, returning "." instead of "".
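For reference, a minimal sketch of the helper under the name suggested above, keeping the special case for the empty result. The name dropLeadingPathSeparators is only the reviewer's proposal, not necessarily the name that was finally chosen, and the main function is added here purely for illustration.

```haskell
import qualified System.FilePath as FP

-- Hypothetical renamed version of 'prepare' from the diff above.
-- Strips every leading path separator; if nothing is left, the path
-- refers to the current directory, so "." is returned instead of "".
dropLeadingPathSeparators :: FilePath -> FilePath
dropLeadingPathSeparators path = case dropWhile FP.isPathSeparator path of
  "" -> "."
  other -> other

main :: IO ()
main = do
  putStrLn (dropLeadingPathSeparators "/docs/index.md")
  putStrLn (dropLeadingPathSeparators "///")
  putStrLn (dropLeadingPathSeparators "README.md")
```

A name that mentions only the dropping of separators still hides the "." fallback, which is presumably why the author points out the empty case separately.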
--
-- Within 'Gather', 'seFile' stores the 'FilePath' corresponding
-- to the file in where the error was found.
data ScanError (a :: ScanStage) = ScanError
Do you think that this 'Trees that grow' style fits in here?
Oh, I like your type families solution here, it makes particular sense. At your place, I would probably write it in the same way :)
However, I think that two separate datatypes would work better here, and here's why: our core part serves as a sort of API for potential scanner developers, and later there will be a lot of code that works with ScanError 'Parse.
With the current approach, we win on the core side (fewer datatypes) and lose on the parsers' side (one always has to initialize seFile = ()), and the latter is more significant for the code amount and API-wise.
Would it be an option to use mkParseScanError instead of initialising seFile to ()? I can also give type aliases to ScanError 'Parse and ScanError 'Gather.
Oh right, we already use a smart constructor there, so avoiding () each time won't be a problem.
Okay, let's go with it.
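To make the trade-off in this thread concrete, here is a rough sketch of a stage-indexed error type with a smart constructor. Only ScanError, ScanStage, seFile, mkParseScanError and the 'Parse and 'Gather stages come from the discussion; the type family name ScanErrorFile, the seDescription field and mkGatherScanError are hypothetical and stand in for whatever the real module defines.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)

data ScanStage = Parse | Gather

-- Hypothetical type family: the type of 'seFile' depends on the stage.
type family ScanErrorFile (a :: ScanStage) :: Type where
  ScanErrorFile 'Parse = ()
  ScanErrorFile 'Gather = FilePath

data ScanError (a :: ScanStage) = ScanError
  { seFile :: ScanErrorFile a
  , seDescription :: String -- hypothetical field, for illustration only
  }

-- Smart constructor: parsers never have to write @seFile = ()@ themselves.
mkParseScanError :: String -> ScanError 'Parse
mkParseScanError descr = ScanError { seFile = (), seDescription = descr }

-- Hypothetical promotion step: attach the file once it is known.
mkGatherScanError :: FilePath -> ScanError 'Parse -> ScanError 'Gather
mkGatherScanError file err =
  ScanError { seFile = file, seDescription = seDescription err }

main :: IO ()
main = do
  let parsed = mkParseScanError "unrecognised annotation"
      gathered = mkGatherScanError "docs/index.md" parsed
  putStrLn (seFile gathered <> ": " <> seDescription gathered)
```

With the smart constructor in place, scanner code only deals with mkParseScanError, which is the point that settles the discussion above.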
RIFileLocal -> checkRef rAnchor riRoot file ""
RIFileRelative -> do
  let relativeToRoot = getPosixRelativeOrAbsoluteChild riRoot (takeDirectory file)
        </> toString rLink
Well, AFAIU, this is not always relative, judging from the function name. But it would be even better to call it shownPath, like inside the checkRef function; it seems that it doesn't really matter that it is relative.
Actually so, I agree here. Having relativeToRoot would make the code slightly more readable IMO, but here it is not necessarily relative.
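The naming concern is easier to see with a small sketch of a "relative or absolute" helper. This is not the PR's getPosixRelativeOrAbsoluteChild, whose implementation is not shown in this conversation; relativeOrAbsoluteChild below is a hypothetical stand-in that assumes plain Posix-style string paths and returns the child unchanged when it is not located under the root.

```haskell
import Data.List (stripPrefix)
import qualified System.FilePath.Posix as FPP

-- Hypothetical stand-in: a path relative to the root when the child is
-- inside it, and the child path as given otherwise.
relativeOrAbsoluteChild :: FilePath -> FilePath -> FilePath
relativeOrAbsoluteChild root child =
  case stripPrefix (FPP.splitDirectories root) (FPP.splitDirectories child) of
    Just [] -> "."               -- the child is the root itself
    Just rest -> FPP.joinPath rest
    Nothing -> child             -- not under the root: left untouched

main :: IO ()
main = do
  putStrLn (relativeOrAbsoluteChild "/repo" "/repo/docs/a.md")
  putStrLn (relativeOrAbsoluteChild "/repo" "/other/b.md")
```

Because the second case yields an absolute path, a binding called relativeToRoot would be misleading, whereas shownPath describes how the value is used.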
[ testGroup "Canonicalization"
  [ testCase "Trailing separator" $ do
      path <- canonicalizePath "./example/dir/"
      getPosixRelativeOrAbsoluteChild current path @?= "example/dir",
Please format it like this:
exceptions =
[ InvalidStatusCode
, MissingContentHeader
, InternalServerError
]
I'm perfectly fine with this PR, titanic work! I approved it. |
I will wait also for @Martoon-00 's review. And then, get ready for conflicts in our currently open PRs 🙊 |
Oh, looks like |
I added
Do we need it for something that I have missed? Or should I keep it just in case it may be useful in the future? |
Ah, I didn't express my thought concretely. My comment "about poking files outside of the repository" only affected where But checking for indirections has to be there either way, we cannot throw this check away altogether. My subsequent comment mentioning
@Martoon-00 right! I misunderstood it. Now the Golden tests output shows the same behaviour as before the refactor regarding both virtual files and links that pass through the repository root. |
c095d6e
to
742aa00
Compare
Feel free to prettify the commits history, I'll make the final review pass then. |
Problem: the danger checks were failing because the CI was configured to fetch only the current PR branch, and only partially. Solution: force the danger checks CI to get all the repository branches.
5309f71
to
1d3e863
Compare
Thank you for your titanic work!
I believe we achieved quite a lot, both from the perspective of code cleanliness and in how much better the output now looks.
I'm leaving one more comment about commits history prettification, and going to approve after that.
Problem: our code contains a where statement with a 5-space indentation. Solution: indent it with 6 spaces instead.
Problem: the current usage of filepaths is error-prone and can be simplified. Solution: canonicalize filepaths at the boundaries, so their management will be safer and will simplify the codebase.
1d3e863
to
0886062
Compare
Hmhm, danger complains about But we can actually allow them. You can update |
Description
Problem: The current usage of filepaths is error-prone and can be simplified.
Solution: Canonicalize filepaths at the boundaries, so their management will be safer and will simplify the codebase. This has the side effect of changing the program output regarding file paths, so we should also probably decide how these will be shown after the refactor.
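As an illustration of what canonicalizing at the boundaries can look like, here is a minimal sketch assuming a Posix-like platform. The names CanonicalPath, UnsafeCanonicalPath, unCanonicalPath and collapseIndirections are hypothetical; the real definitions live in src/Xrefcheck/System.hs and may differ. The sketch follows the doc comment quoted in the review: the path is made absolute, normalised, and its indirections are expanded syntactically, without resolving symlinks.

```haskell
import qualified System.Directory as Dir
import qualified System.FilePath as FP

-- Hypothetical wrapper: values should only be built via 'canonicalizePath'.
newtype CanonicalPath = UnsafeCanonicalPath { unCanonicalPath :: FilePath }
  deriving (Eq, Show)

-- Syntactically remove "." and "dir/.." segments. Symlinks are not resolved.
collapseIndirections :: FilePath -> FilePath
collapseIndirections = FP.joinPath . reverse . foldl step [] . FP.splitDirectories
  where
    step acc "." = acc
    step (prev : rest) ".."
      | FP.isAbsolute prev = prev : rest -- cannot go above the root
      | prev /= ".." = rest              -- "dir/.." cancels out
    step acc dir = dir : acc

-- 'Dir.makeAbsolute' also normalises the path; splitting and rejoining
-- the segments drops any trailing separator.
canonicalizePath :: FilePath -> IO CanonicalPath
canonicalizePath path =
  UnsafeCanonicalPath . collapseIndirections <$> Dir.makeAbsolute path

main :: IO ()
main = do
  path <- canonicalizePath "./example/dir/../dir/"
  putStrLn (unCanonicalPath path)
```

Doing this once, where paths enter the program, means the rest of the code can compare and print CanonicalPath values without repeating normalisation logic.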
Related issue(s)
Fixes #197