Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better support for unary "+" operator in license identifiers #314

Merged
merged 6 commits into from
Jan 29, 2022

Conversation

carmenbianca
Copy link
Member

@carmenbianca carmenbianca commented Jan 28, 2021

Fixes #123
Fixes #275

Example: if Apache-1.0+ appears as a declared license, it should not be identified as missing, bad, or unused if LICENSES/Apache-1.0.txt exists. It is, however, identified separately as a used license.

This does not fix all of #123. GPL-3.0+ and GPL-3.0-or-later are (still) recognised as separate licenses. I guess this is kind of in line with how SPDX treats GPL anyway. Do we want to fix this?

Furthermore, this patch allows for weird stuff like GPL-3.0-or-later+. I guess we can just ignore that weirdness.

@mxmehl
Copy link
Member

mxmehl commented Feb 5, 2021

Thanks! However, I have the feeling that we should avoid conflicts like GPL-2.0-only+ if that's possible somehow.

@carmenbianca
Copy link
Member Author

@mxmehl I agree, but I'm not entirely sure how to go about this. The codebase doesn't currently treat any licenses as special edge cases, which is nice. Now, there's a very good case to be made that the GPL family of licenses should be given special treatment, and it wouldn't be especially difficult to implement this, but I'm not exactly sure what that special treatment should be.

Normally I'd defer to SPDX, but there's currently nothing in the SPDX spec that disallows funny business such as GPL-2.0-only+. Should we open an issue against SPDX for this?

@mxmehl
Copy link
Member

mxmehl commented Feb 5, 2021

Oh yes, SPDX does not seem to be clear about this either. I found this thread, but didn't read it fully yet: https://lists.spdx.org/g/Spdx-tech/topic/or_later_operator_is_not/32049933

@kilbswhitecrow
Copy link

As I understand it, SPDX dropped support for "+" and replaced it with "-or-later", because SPDX identifiers are just strings - if the strings match, they're the same license, otherwise they're different licenses. "+" would have required rules about how to match several different licenses as increments in a family. SPDX also doesn't make any statements about license compatibility, so there's nothing to say "GPL-2.0" is compatible with "GPL-2.0+" or "GPL-2.0-or-later" - that's up to a human to do. (Or a separate tool, but not the SPDX spec.)

@lbruno
Copy link

lbruno commented Apr 2, 2021

The SPDX discussion is a bit detached from how there's a mass of files already using the LGPL-2.0+ and/or LGPL-2.1+ syntaxes1, but there's no backwards compatibility discussion that I can see.

I'd say that including this PR would improve on the current situation without detriment, similar rationale to the above comment.

Footnotes

  1. in the case I'm looking at, in the same project

@mxmehl mxmehl added this to the 0.13.0 milestone Apr 27, 2021
@mxmehl mxmehl removed this from the 0.14.0 milestone Jan 22, 2022
@carmenbianca carmenbianca added this to the v0.15 milestone Jan 22, 2022
@carmenbianca
Copy link
Member Author

I think it would be good to have a review here. This is a pretty significant bug—I'm honestly quite surprised there haven't been more issues about this!

@floriansnow Could you spend some time reviewing this?

@floriansnow
Copy link
Contributor

I have to have a closer look at how you gather the list of used licenses because I'm not yet very familiar with this part of the codebase.

What I definitely think we should do, is not allow the + operator on GNU licenses because that is specifically not allowed by SPDX: https://spdx.dev/ids/
That way, it would be easy to specify a set of licenses that does not allow the + operator and we would at the same time cover cases such as GPL-3.0-or-later+.

@lbruno I get what you're saying, but IMHO, we're linting against SPDX and if SPDX says not to use the + operator for certain licenses, then REUSE should not accept those either.

@carmenbianca
Copy link
Member Author

What I definitely think we should do, is not allow the + operator on GNU licenses because that is specifically not allowed by SPDX

@floriansnow This is not strictly true. The SPDX spec has nothing to say about this case, and unary + expressions are still supported. GPL-3.0 is deprecated-but-supported. History lesson: The FSF (or rather: Stallman) didn't want to support SPDX unless it deprecated its original GPL-3.0 ID (et al) and replaced it with GPL-3.0-or-later and GPL-3.0-only, out of fear that people would write down GPL-3.0 instead of GPL-3.0+. By forcing the choice, this 'error' cannot be made.

Now, it could be argued that MIT+ is also strictly supported by the SPDX spec, but semantically nonsensical.

The question is:

  • Do we want to explicitly disallow semantically nonsensical expressions?
  • If no, must we figure out all licenses for which it is semantically nonsensical?
  • Must we explicitly disallow semantical nonsense for copyright holders? © nobody is valid REUSE.
  • And in any case, must we make a special case for GNU's -or-later/-only licences?

I don't think that 'validate whether the licensing actually makes sense' is part of the purpose of REUSE. Nothing stops someone from writing down SPDX-License-Identifier: GPL-3.0-or-later AND CC-BY-NC-4.0 using REUSE, and if we allow jokers to write down all sorts of nonsense, it's hard to stop jokers from writing down GPL-3.0-only+, given that it's valid SPDX (for good or for ill).

But more importantly, the only reason I mentioned GPL-3.0-or-later+ in the first place is because I thought it was funny 🤡

Apache-1.0+ is not missing or bad if LICENSES/Apache-1.0.txt exists.

Signed-off-by: Carmen Bianca Bakker <carmenbianca@fsfe.org>
Signed-off-by: Carmen Bianca Bakker <carmenbianca@fsfe.org>
Apache-1.0+ is not unused if LICENSES/Apache-1.0.txt exists.

Furthermore, Apache-1.0+ is separately identified as a used license.

Signed-off-by: Carmen Bianca Bakker <carmenbianca@fsfe.org>
Signed-off-by: Carmen Bianca Bakker <carmenbianca@fsfe.org>
Signed-off-by: Carmen Bianca Bakker <carmenbianca@fsfe.org>
@floriansnow
Copy link
Contributor

I'm aware of the history, but I don't feel it has any bearing on this either way.

I know that the link I provided above is not the spec, but still, in the section about allowing later versions, it clearly states:

GNU licenses
For GNU licenses, do not use just the bare license ID, such as “GPL-2.0”.
Instead, always use either the suffix “-only“ or the suffix “-or-later“ with GNU licenses.

For other licenses, it says to use the +.

I think we should respect that. That doesn't mean we validate whether or not an expression makes sense, but we do validate best practice. So at the very least, we should output a deprecation warning.

@mxmehl
Copy link
Member

mxmehl commented Jan 24, 2022

I agree that we should not raise an error on e.g. GPL-3.0 or GPL-3.0+. Many projects still use them, most prominently the Linux kernel (and they probably won't switch anytime soon as I understood @gregkh).

I'd be fine with throwing an informative message though on explicitly deprecated licenses, e.g. something along the lines of "GPL-3.0+ is a supported license identifier but SPDX recommends to replace it with the newer GPL-3.0-or-later", and same with all other (L|A)GPL special cases.

I'd also agree with Carmen that we can ignore MIT+ or Apache-1.0+, and the code for this LGTM.

And yes, REUSE's job is to make licensing and copyright easy for developers, but we cannot check for license compatibility and obviously wrong statements or license expressions.

@carmenbianca carmenbianca merged commit af6abf1 into fsfe:master Jan 29, 2022
@mxmehl mxmehl mentioned this pull request Apr 29, 2022
@mxmehl mxmehl modified the milestones: Backlog, v1.0.0 May 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bad license when operator '+' is used Handle +, -or-later, and -only
5 participants