Authors scan in testfiles finds holder although it should not #216

Closed
nakami opened this issue Mar 9, 2016 · 4 comments

Comments


nakami commented Mar 9, 2016

test_authors.py (scancode-toolkit/tests/cluecode/test_authors.py) expects author_young_c-c.c (scancode-toolkit/tests/cluecode/data/authors/author_young_c-c.c) to have four author hits.

Two apply for Tim Hudson whereas the first one should be only found by the holders scan and not the authors scan.

author_young_c-c.c:

 * lhash, DES, etc., code; not just the SSL code.  The SSL documentation
 * included with this distribution is covered by the same copyright terms
 * except that the holder is Tim Hudson (tjh@mincom.oz.au).

test_authors.py:

    def test_author_young_c(self):
        test_file = self.get_test_loc('authors/author_young_c-c.c')
        expected = [
            u'written by Eric Young (eay@mincom.oz.au).',
            u'Tim Hudson (tjh@mincom.oz.au).',
            u'written by Eric Young (eay@mincom.oz.au)',
            u'written by Tim Hudson (tjh@mincom.oz.au)',
        ]
        check_detection(expected, test_file, what='authors')

Correct me if I'm wrong.

For clarity, I did not run the tests myself, just compared the source and the expected hits in test_authors.py.
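To make the proposal concrete, here is a minimal sketch of what the expected list would look like if the hit coming from "except that the holder is Tim Hudson" were dropped from the authors scan. It only manipulates the list copied from test_authors.py; the names `current_expected`, `holder_only_hit` and `proposed_expected` are hypothetical, and nothing here runs ScanCode itself.

```python
# Expected author hits as currently listed in test_authors.py.
current_expected = [
    u'written by Eric Young (eay@mincom.oz.au).',
    u'Tim Hudson (tjh@mincom.oz.au).',
    u'written by Eric Young (eay@mincom.oz.au)',
    u'written by Tim Hudson (tjh@mincom.oz.au)',
]

# Assumption: this entry is the one produced by the line
# "except that the holder is Tim Hudson (tjh@mincom.oz.au)."
holder_only_hit = u'Tim Hudson (tjh@mincom.oz.au).'

# Proposed expectation: every current hit except the holder-only one.
proposed_expected = [hit for hit in current_expected if hit != holder_only_hit]

for hit in proposed_expected:
    print(hit)
```

With this change the authors test would expect three hits, and the holders scan would be the only place where that line could show up.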

@pombredanne
Contributor

@nakami this is a tough one.
Holders are detected as part of a copyright statement: https://github.com/nexB/scancode-toolkit/blob/c7f808cce1e9a46fe83c529c8cb789dbc57af958/src/cluecode/copyrights.py#L717

Authors on the other hand are detected on their own: https://github.com/nexB/scancode-toolkit/blob/c7f808cce1e9a46fe83c529c8cb789dbc57af958/src/cluecode/copyrights.py#L437

So since "holder is Tim Hudson (tjh@mincom.oz.au)" is not a proper copyright statement, it would not be reported as a holder. But what could be done is to report that as an author?

The related question is: is the phrase "the holder is <Tim Hudson or someone else>" a common occurrence, and if so, does it ever happen in a file where there would not also be a copyright for the same person?
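A toy sketch of the distinction described above, for illustration only. This is not how cluecode/copyrights.py is implemented; the two regular expressions, the function names `toy_holders` and `toy_authors`, and the "Jane Doe" sample are all hypothetical. It only shows why a detector that ties holders to a copyright statement would skip the "holder is ..." sentence, while authors are matched by a separate, independent pattern.

```python
import re

# Hypothetical pattern: a holder is only recognized after a "Copyright"
# keyword, an optional "(c)" and an optional year or year range.
COPYRIGHT_RE = re.compile(
    r'[Cc]opyright\s+(?:\([Cc]\)\s+)?(?:\d{4}(?:-\d{4})?\s+)?'
    r'(?P<holder>[A-Z]\w*(?:\s+[A-Z]\w*)*)'
)

# Hypothetical pattern: an author is recognized on its own, after an
# authorship phrase such as "written by".
AUTHOR_RE = re.compile(
    r'(?:written|developed|created)\s+by\s+'
    r'(?P<author>[A-Z]\w*(?:\s+[A-Z]\w*)*)'
)


def toy_holders(text):
    """Return names found inside a copyright statement."""
    return [m.group('holder') for m in COPYRIGHT_RE.finditer(text)]


def toy_authors(text):
    """Return names found after an authorship phrase."""
    return [m.group('author') for m in AUTHOR_RE.finditer(text)]


line = 'except that the holder is Tim Hudson (tjh@mincom.oz.au).'
print(toy_holders(line))  # no copyright statement on this line
print(toy_authors(line))  # no authorship phrase on this line either
print(toy_holders('Copyright (c) 2016 Jane Doe'))
print(toy_authors('written by Eric Young (eay@mincom.oz.au)'))
```

In this toy model the "holder is Tim Hudson" line matches neither pattern, so reporting it at all needs an extra rule on one side or the other.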

@pombredanne
Contributor

Actually, not a tough one. It just needs some thinking.

pombredanne modified the milestone: v2.0 Aug 5, 2016
pombredanne modified the milestones: v2.0, v2.1 Mar 24, 2017
pombredanne removed this from the v2.1 milestone Oct 4, 2017
@pombredanne
Contributor

@nakami you never replied to this. Shall I close? Did my explanation make sense?

pombredanne added a commit that referenced this issue Jun 8, 2021
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Jun 8, 2021
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Jun 15, 2021
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Jun 15, 2021
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne
Contributor

We now detect this:

"files": [
    {
      "path": "foo5",
      "type": "file",
      "copyrights": [
        {
          "value": "holder is Tim Hudson (tjh@mincom.oz.au)",
          "start_line": 3,
          "end_line": 3
        }
      ],
      "holders": [
        {
          "value": "Tim Hudson",
          "start_line": 3,
          "end_line": 3
        }
      ],
      "authors": [],
      "scan_errors": []
    }

so this is fixed!
Closing now.
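For anyone who wants to check this kind of result programmatically, here is a minimal sketch that loads the fragment above and confirms the name lands in holders and not in authors. The surrounding braces and the closing bracket are added here only so the fragment parses as a complete JSON document, and the variable names are hypothetical; a real scan result may contain more files and fields.

```python
import json

# The fragment from the comment above, wrapped so it is valid JSON.
scan_output = '''
{
  "files": [
    {
      "path": "foo5",
      "type": "file",
      "copyrights": [
        {
          "value": "holder is Tim Hudson (tjh@mincom.oz.au)",
          "start_line": 3,
          "end_line": 3
        }
      ],
      "holders": [
        {
          "value": "Tim Hudson",
          "start_line": 3,
          "end_line": 3
        }
      ],
      "authors": [],
      "scan_errors": []
    }
  ]
}
'''

result = json.loads(scan_output)
entry = result["files"][0]

# Collect the detected values per category.
copyrights = [c["value"] for c in entry["copyrights"]]
holders = [h["value"] for h in entry["holders"]]
authors = [a["value"] for a in entry["authors"]]

print("copyrights:", copyrights)
print("holders:", holders)
print("authors:", authors)
```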
