Describe ScanCode output 'copyrights' type #618

Closed
jdaguil opened this issue May 15, 2017 · 3 comments

@jdaguil
Contributor

jdaguil commented May 15, 2017

The following is an example excerpt from ScanCode's JSON output:

"copyrights": [
  {
    "statements": [
      "Copyright (c) 1991, 1999 Free Software Foundation, Inc."
    ],
    "holders": [
      "Free Software Foundation, Inc."
    ],
    "authors": [],
    "start_line": 4,
    "end_line": 7
  },
  {
    "statements": [
      "copyrighted by the Free Software Foundation"
    ],
    "holders": [
      "Free Software Foundation"
    ],
    "authors": [],
    "start_line": 426,
    "end_line": 433
  }
],
  1. Why are statements arrays instead of simple strings?
  2. Why are the corresponding start_line and end_line not arrays?

@pombredanne
Contributor

  1. Why are statements arrays instead of simple strings?
    Because each item corresponds to a distinct copyright statement.
  2. Why are the corresponding start_line and end_line not arrays?
    Because copyright detection is not very precise about reporting actual lines, so we only get a range of lines, and within this range we report all the statements detected.
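
To illustrate how the current output reads under that explanation, here is a minimal Python sketch. It is not ScanCode code: `iter_statements` and `SCAN_EXCERPT` are hypothetical names, and the data is the excerpt from the first comment wrapped in braces so that it parses as JSON. Every statement in an entry is reported against that entry's shared line range.

```python
import json

# The excerpt from this issue, wrapped in {} so it is valid JSON on its own.
SCAN_EXCERPT = """
{
  "copyrights": [
    {
      "statements": ["Copyright (c) 1991, 1999 Free Software Foundation, Inc."],
      "holders": ["Free Software Foundation, Inc."],
      "authors": [],
      "start_line": 4,
      "end_line": 7
    },
    {
      "statements": ["copyrighted by the Free Software Foundation"],
      "holders": ["Free Software Foundation"],
      "authors": [],
      "start_line": 426,
      "end_line": 433
    }
  ]
}
"""


def iter_statements(copyrights):
    """Yield (statement, start_line, end_line) for every detected statement.

    Hypothetical helper: all statements in one entry share that entry's
    line range, because the range is reported per entry, not per statement.
    """
    for entry in copyrights:
        for statement in entry.get("statements", []):
            yield statement, entry["start_line"], entry["end_line"]


rows = list(iter_statements(json.loads(SCAN_EXCERPT)["copyrights"]))
for statement, start, end in rows:
    print(f"lines {start}-{end}: {statement}")
```

The range tells you where to look in the file, not the exact line of each statement.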

I get that this is a tad confusing; eventually we should instead report one object per statement, such as:

"copyrights": [
  {
    "statement": "Copyright (c) 1991, 1999 Free Software Foundation, Inc.",
    "holder": "Free Software Foundation, Inc.",
    "start_line": 4,
    "end_line": 4
  },
  {
    "statement": "Copyright (c) 2015 The Eclipse Foundation",
    "holder": "The Eclipse Foundation",
    "start_line": 4,
    "end_line": 4
  },
 ....
],
"authors": [
  {
    "name": "John Doe",
    "start_line": 5,
    "end_line": 5
  },
 ....
]

.....

Would this make more sense? The schema would definitely be cleaner this way!
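
For consumers who want the flatter shape today, a rough Python sketch of the conversion could look like the following. This is an illustration only, not ScanCode code: `flatten_copyrights` is a hypothetical helper, and it assumes that `statements` and `holders` line up by position within an entry, which the current output does not guarantee. Each flattened item also inherits the entry's whole line range, so it cannot produce the exact per-statement lines shown in the proposal.

```python
from itertools import zip_longest


def flatten_copyrights(copyrights):
    """Convert grouped entries into per-statement and per-author objects.

    Hypothetical helper. ASSUMPTION: statements and holders are parallel
    lists; a missing counterpart is filled with None.
    """
    flat_copyrights = []
    flat_authors = []
    for entry in copyrights:
        # Every item inherits the entry's range; exact lines are unknown.
        span = {"start_line": entry["start_line"], "end_line": entry["end_line"]}
        pairs = zip_longest(entry.get("statements", []), entry.get("holders", []))
        for statement, holder in pairs:
            flat_copyrights.append({"statement": statement, "holder": holder, **span})
        for name in entry.get("authors", []):
            flat_authors.append({"name": name, **span})
    return {"copyrights": flat_copyrights, "authors": flat_authors}


# First entry of the excerpt at the top of this issue.
current = [
    {
        "statements": ["Copyright (c) 1991, 1999 Free Software Foundation, Inc."],
        "holders": ["Free Software Foundation, Inc."],
        "authors": [],
        "start_line": 4,
        "end_line": 7,
    }
]

flat = flatten_copyrights(current)
print(flat)
```

Note that `end_line` stays at 7 here rather than 4: narrowing the range to a single statement would need more precise detection, which is the change the proposal depends on.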

@jdaguil
Contributor Author

jdaguil commented Sep 19, 2017

@pombredanne Yes, this makes more sense 😄

@pombredanne
Contributor

This is tracked in #112 now
