Refactor and improve copyright holders detection #930

Closed
pombredanne opened this issue Feb 14, 2018 · 2 comments

Comments

@pombredanne
Contributor

There are a few pending tickets on holders and this is going to be a master ticket for these.

And this was originally reported offline by email:

Any codebase with a non-trivial number of contributors often has issues with reporting clean holders.

Some examples of cleanups:

  • Leading and trailing punctuation
  • Inc vs. Incorporated
  • "X and Y" should, in general, be split into two separate entries so you don’t get "X and Y", "X and Z", "Y and Z", …
  • In some cases the output contains stray > and other punctuation.

Some pointers for normalization: https://github.com/alephdata/fingerprints and https://github.com/datamade/probablepeople
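
To make the cleanups listed above concrete, here is a minimal Python sketch. It is not the ScanCode implementation and does not use either library linked above; the names `clean_holder`, `split_holders`, `normalize_holders`, `SUFFIXES` and `EDGE_JUNK` are hypothetical and exist only for illustration.

```python
import re

# Hypothetical suffix map (an assumption for this sketch); a real one
# would need many more entries.
SUFFIXES = {
    "incorporated": "Inc.",
    "inc": "Inc.",
    "corporation": "Corp.",
    "corp": "Corp.",
}

# Characters stripped from both ends of a holder string.
EDGE_JUNK = " \t.,;:<>*/'\"-"

# "X and Y" or "X & Y", surrounded by whitespace.
SPLIT_AND = re.compile(r"\s+(?:and|&)\s+", re.IGNORECASE)


def clean_holder(holder):
    """Strip edge punctuation and normalize a trailing company suffix."""
    # Collapse runs of whitespace, then trim junk from both ends.
    holder = " ".join(holder.split()).strip(EDGE_JUNK)
    if not holder:
        return ""
    words = holder.split(" ")
    last = words[-1].strip(".,").lower()
    if len(words) > 1 and last in SUFFIXES:
        words[-1] = SUFFIXES[last]
        # Drop a comma left dangling before the suffix, as in "Acme, Inc".
        words[-2] = words[-2].rstrip(",")
    return " ".join(w for w in words if w)


def split_holders(holder):
    """Split 'X and Y' into separate, cleaned holder names."""
    cleaned = (clean_holder(part) for part in SPLIT_AND.split(holder))
    return [name for name in cleaned if name]


def normalize_holders(holders):
    """Return sorted unique holders, deduplicated case-insensitively."""
    seen = {}
    for raw in holders:
        for name in split_holders(raw):
            seen.setdefault(name.lower(), name)
    return sorted(seen.values())


if __name__ == "__main__":
    raw = [
        "> Acme Incorporated,",
        "Acme, Inc",
        "Jane Doe and John Roe",
        "John Roe and Acme Inc.",
    ]
    print(normalize_holders(raw))
```

Note that a blind split on "and" would wrongly break names such as "Johnson & Johnson", which is one reason the split is only right "in general" and why a real fix likely needs name tagging rather than plain string rules.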

@pombredanne
Contributor Author

@tsteenbe @jeffmcaffer ping, just so you are kept abreast of the progress here :P

pombredanne added a commit that referenced this issue Mar 2, 2018
 * this allows setting up a good base for improvements

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 2, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 2, 2018
 * ensure that POS tagging creates proper names and use this to improve
   holders reporting

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 2, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 2, 2018
 * add tests for holders summaries
 * improve tagging of names
 * improve collection of holders and tracing

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 12, 2018
 * minor changes mostly from scanning several npms

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 12, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 12, 2018
 * minor changes mostly from scanning several npms

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 12, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 12, 2018
 * minor changes mostly from scanning several npms

Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Mar 12, 2018
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
pombredanne added a commit that referenced this issue Jun 11, 2018
* remove the returning of solo years: this is not used anywhere
* optionally exclude year from returned copyright
* handle some corner cases found in license texts
* add new function for future reorg of returned data structure for #255
  for now, not yet used as functions detect_copyrights2() and
  CopyrightDetector.detect2()


Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
@pombredanne
Contributor Author

This has been done and merged into develop.
