Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tools/licenses: Fix pattern files and add checks so that it does not happen again. #11330

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

jcarrano
Copy link
Contributor

@jcarrano jcarrano commented Apr 2, 2019

Contribution description

Recently a couple of files without license slipped through in #10290 . Some investigation revealed that they were being incorrectly classified as licensed under "mit-espressif".

License templates are matched with "pcregrep" against a pre-processed version of source files. This pre processing normalizes whitespace, which includes flattening the file to a single line.

Additionally, "pcregrep" pattern files have one pattern per line.

This means that a pattern written in multiple lines will most likely not do what the author thinks it does, and that was the case with "mit-espressif", which was matching all files.

I've fixed pattern files, and added a check so that it does not happen again in the future. I also included a small unrelated fix in check.sh.

Testing procedure

Note that depending on your setup, you may have to prefix the check command with BASE_BRANCH=whatever_the_upstream_master_is_called_on_your_machine

1: observe the issue

Checkout the first commit (the "REMOVE ME!!!" one). Run ./dist/tools/licenses/check.sh (or BASE_BRANCH=your-upstream-master ./dist/tools/licenses/check.sh)

Look how the file without license examples/hello-world/random_file_without_license.c is accepted.

2: Verify that the 1-line sanity check works.

Checkout "tools/licenses: Check that license patterns...". Run the same command. See how you get an error message.

3: Verify that files without license are correctly reported, and that I did not break anything.

Checkout the a last commit. Run the check command again. examples/hello-world/random_file_without_license.c is rejected. In the REMOVEME commit I placed one example file for each of the licenses that I "fixed". This is to verify that I did not break the pattern.

The commit message explains how I did the pattern conversion.

Issues/PRs references

Found in #10290.
Check out #11329 .

jcarrano added 4 commits April 2, 2019 14:45
The file without license is to test that the bug is fixed and these files
are detected.

The other ones are to check that we do not break the other licenses after
converting them to a single line.
The "ls" command is meant for human consumption only. It is bad practice
to parse that command. That is what "find" and globs are for.

For more information, see:

- http://mywiki.wooledge.org/ParsingLs
- https://unix.stackexchange.com/questions/128985/why-not-parse-ls-and-what-do-to-instead
This patch checks that all license patterns (or templates) have exactly
one line. A non-compliant pattern was recently causing all files without
a license to be (erroneously) assigned to that pattern's license.

License templates are matched with "pcregrep" agains a pre processed
version of source files. This pre processing normalizes whitespace, which
includes flattening the file to a single line.

Additionally, "pcregrep" pattern files have one pattern per line.

All of this means that license templates must be written in a single line.
Pattern files must be written in a single line.

To fix these patterns, the following function was used:

fix_pattern () {
	mv "$1" "$1.bkp" && (
		cat "$1.bkp" | sed -e 's/[\/\*'"${TAB_CHAR}"']/ /g' -e 's/$/ /' | tr -d '\r\n' | sed -e 's/  */ /g;s/
\+$//g' && echo
	) > "$1"
}

The code was taken from dist/tools/licenses/check.sh. This was followed by some
manual retouchuing of stops and parentheses.
@jcarrano jcarrano added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Area: tools Area: Supplementary tools labels Apr 2, 2019
@jcarrano jcarrano requested a review from cladmi April 25, 2019 09:37
@cladmi
Copy link
Contributor

cladmi commented Apr 26, 2019

I would prefer if the files in dist/tools/licenses/patterns/ could stay multi-lines and be converted on the fly. This would even remove the need for the new functions.

@jcarrano
Copy link
Contributor Author

jcarrano commented Jun 3, 2019

@cladmi I don't have time to do it now, and in the meanwhile the thing is still broken. Can we go with this solution and then when time allows make it pretty. Until then, developers will have to turn on line wrapping in the editor.

@jcarrano
Copy link
Contributor Author

jcarrano commented Jun 3, 2019

BTW, the PR is smaller than +122 (the first commit is a throwaway and has a lot of lines).

@fjmolinas fjmolinas requested review from fjmolinas and removed request for cladmi January 22, 2020 13:09
@MrKevinWeiss MrKevinWeiss added this to the Release 2021.07 milestone Jun 22, 2021
@MrKevinWeiss MrKevinWeiss removed this from the Release 2021.07 milestone Jul 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: tools Area: Supplementary tools Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants