Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a script to import licenses from SPDX #68

Merged
merged 4 commits into from
Oct 14, 2021

Conversation

goneall
Copy link
Contributor

@goneall goneall commented Sep 26, 2021

Adding a script to import specific licenses from SPDX to this repo per suggestion in issue #64

Signed-off-by: Gary O'Neall [email protected]

@pchestek
Copy link

I'm not following exactly what will be happening. The comment is "Adding a script to import specific licenses from SPDX to this repo." The OSI is the authoritative resource on the language of licenses it has approved (acknowledging that the OSI recordkeeping leaves something to be desired). I also understand that the SPDX text and the OSI text are not always identical, for various legitimate reasons. But if the plan is to import license text from SPDX, that's not appropriate. If there is a question about what the approved license language is, that should be ascertained from OSI records. If the issue instead is to automate adding SPDX identifiers, that's fine.

@webmink
Copy link
Member

webmink commented Sep 29, 2021

Per #64 I believe this is just a convenience function to use while we're cleaning the data - some of the records here lack the details they should have while SPDX has them. I would expect this script to be deprecated once the data here is cleaned. Is that correct, @goneall ?

@goneall
Copy link
Contributor Author

goneall commented Sep 29, 2021

Per #64 I believe this is just a convenience function to use while we're cleaning the data - some of the records here lack the details they should have while SPDX has them. I would expect this script to be deprecated once the data here is cleaned. Is that correct, @goneall ?

@webmink Correct on all points. I'll be using the script to address #62 all of which were uncovered comparing the contents of this repo with the SPDX license list. Once that is cleaned up, the script would only be used if the SPDX license list gets ahead of OSI which I would not expect to happen once everything is up to date.

I added the PR based on this comment: #64 (comment)

@pchestek
Copy link

pchestek commented Sep 29, 2021 via email

@webmink
Copy link
Member

webmink commented Sep 29, 2021

Indeed, totally agree, Pam. We are just observing that, having taken the canonical data from the OSI web site and updated SPDX with it, the fastest way to get the license database in the API to also be updated is to copy the OSI data from SPDX as over there it's already in a machine-readable format. That's what @goneall has done in this script. No-one intends to use this function beyond the scope of that pragmatism.

@pchestek
Copy link

pchestek commented Sep 30, 2021 via email

@goneall
Copy link
Contributor Author

goneall commented Sep 30, 2021

I have no problem with correcting the SPDX data on the OSI website and the SPDX data should be the canonical source for that, assuming there is no question that it is matched to the right license.

When creating my pull requests, I'll run the SPDX license diff against the OSI website to identify any difference in the text itself. I'll add a separate commit to my pull requests to fix any text differences to match the OSI website as closely as is possible when using HTML formatted text (e.g. white space characters are not preserved in most HTML pages).

If I run into a case where the license text on OSI doesn't match the license text in SPDX per the SPDX license matching guidelines using the SPDX online tools check license, I'll create an issue in this repo, the SPDX license-list-XML repo, or both depending on the situation.

BTW - All of this should be documented in CONTRIBUTING.md to make it easier and more reliable for future contributors this this repo. If I get some time this week, I'll make a proposed CONTRIBUTION.md PR for review.

- Add a text link to the OSI web page if it does not exist
- Add required name field
- Use the OSI ID for the text file name rather than the SPDX ID (bug)
- README file IMPORTING_FROM_SPDX.md created with the steps

Signed-off-by: Gary O'Neall <[email protected]>
@goneall
Copy link
Contributor Author

goneall commented Sep 30, 2021

@pchestek @webmink I added a file IMPORTING_FROM_SPDX.md that describes the process and steps one should take to verify the OSI text and license name is not overridden per the above conversations.

I also made a number of changes to the script to resolves issue I ran into on the second license pull request which has a different OSI ID than the SPDX ID.

@webmink
Copy link
Member

webmink commented Oct 1, 2021

Super, thanks. @paultag is in the process of catching up on all the task I have cruelly passed him, which includes merging this one.

@paultag
Copy link
Collaborator

paultag commented Oct 14, 2021

I have some style nits, but they're mostly cosmetic. The code works as expected, so I'm going to land this and send an MR later. Thanks, @goneall

@paultag paultag merged commit 0105364 into OpenSourceOrg:master Oct 14, 2021
@goneall
Copy link
Contributor Author

goneall commented Oct 14, 2021

I have some style nits, but they're mostly cosmetic.

Feel free to comment on any suggested changes. Python is not my native language, so I'm always willing to improve my coding style.

@goneall goneall deleted the importscript branch November 7, 2021 18:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants