DeltaCode Scoring
A Delta object represents the file-level comparison (i.e., the "delta") of two codebases, typically two versions of the same codebase, using ScanCode-generated JSON
output files as input for the comparison process.
Based on how the user constructs the command-line input, DeltaCode's naming convention treats one codebase as the "new" codebase and the other as the "old" codebase:
deltacode -n [path to the 'new' codebase] -o [path to the 'old' codebase] [...]
A DeltaCode codebase comparison produces a collection of file-level Delta objects. Depending on the nature of the file-level change between the two codebases, each Delta object is characterized as belonging to one of the categories listed below. Each category has an associated score intended to convey its potential importance, from a license/copyright compliance perspective, to a user's analysis of the changes between the `new` and `old` codebases.
In descending order of importance, the categories are:
- `added`: A file has been added to the `new` codebase.
- `modified`: The file is contained in both the `new` and `old` codebases and has been modified (as reflected, among other things, by a change in the file's `sha1` attribute).
- `moved`: The file is contained in both the `new` and `old` codebases and has been moved but not modified.
- `removed`: A file has been removed from the `old` codebase.
- `unmodified`: The file is contained in both the `new` and `old` codebases and has not been modified or moved.
Note: Files are determined to be `moved` by looping through the `added` and `removed` Delta objects and checking their `sha1` values.
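The note above can be sketched as follows. This is a minimal illustration, not DeltaCode's actual implementation: the function name `detect_moves`, the dict layout with `path` and `sha1` keys, and the choice to pair the first available match when several files share a `sha1` are all assumptions made for the example.

```python
from collections import defaultdict


def detect_moves(added, removed):
    """Pair added and removed files that share a sha1 (hypothetical helper).

    `added` and `removed` are lists of dicts with 'path' and 'sha1' keys.
    Returns (moved, still_added, still_removed).
    """
    # Index the removed files by sha1 so each added file is a single lookup.
    removed_by_sha1 = defaultdict(list)
    for f in removed:
        removed_by_sha1[f.get("sha1")].append(f)

    moved, still_added = [], []
    for f in added:
        sha1 = f.get("sha1")
        candidates = removed_by_sha1.get(sha1) if sha1 else None
        if candidates:
            # Assumption: pair with the first unmatched removed file.
            moved.append({"new": f, "old": candidates.pop(0)})
        else:
            still_added.append(f)

    # Whatever was not paired stays in the removed category.
    still_removed = [f for group in removed_by_sha1.values() for f in group]
    return moved, still_added, still_removed
```

A file with no `sha1` value is never paired in this sketch, so it stays in its original category.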
The score of a Delta object characterized as `added` or `modified` may be increased based on the detection of license- and/or copyright-related changes. See License Additions and Changes and Copyright Holder Additions and Changes below.
Each Delta object includes the following fields and values:
- `factors`: One or more strings representing the factors that characterize the file-level comparison and resulting score, e.g., in JSON format:
"factors": [
"added",
"license info added",
"copyright info added"
],
- `score`: A number representing the magnitude/importance of the file-level change -- the higher the `score`, the greater the change.
- `new`: The ScanCode-based file attributes (`path`, `licenses`, `copyrights` etc.) for the file in the codebase designated by the user as `new`.
- `old`: The ScanCode-based file attributes for the file in the codebase designated by the user as `old`.
Note that an `added` Delta object will have a `new` file but no `old` file, while a `removed` Delta object will have an `old` file but not a `new` file. In each case, the `new` and `old` keys will be present but the value for the missing file will be `null`.
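As a rough illustration of the fields described above, the following sketch reads a list of Delta objects and handles the `null` side of `added` and `removed` entries. The sample data is invented for the example: the paths are hypothetical, the non-zero score is a placeholder rather than a value DeltaCode documents, and the helper names `delta_path` and `by_descending_score` are not part of DeltaCode.

```python
import json

# Hypothetical Delta objects. JSON null becomes Python None after parsing.
RAW = """
[
  {"factors": ["removed"], "score": 0,
   "new": null, "old": {"path": "lib/gone.py"}},
  {"factors": ["added", "license info added"], "score": 1,
   "new": {"path": "src/new.py"}, "old": null}
]
"""


def delta_path(delta):
    """Return the file path, falling back to the old file when new is null."""
    side = delta["new"] if delta["new"] is not None else delta["old"]
    return side["path"]


def by_descending_score(deltas):
    """Order Delta objects so the highest-scoring changes come first."""
    return sorted(deltas, key=lambda d: d["score"], reverse=True)


deltas = json.loads(RAW)
for d in by_descending_score(deltas):
    print(d["score"], delta_path(d), d["factors"])
```

Checking for `None` before reading `path` matters because both keys are always present, so a plain key lookup will not reveal which side is missing.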
License Additions and Changes
Certain file-level changes involving the license-related information in a Delta object will increase the object's score.
- An `added` Delta object's score will be increased:
  - If the `new` file contains one or more licenses (`factors` will include `license info added`).
  - If the `new` file contains any of the following Commercial/Copyleft license categories (`factors` will include, e.g., `copyleft added`):
    - 'Commercial'
    - 'Copyleft'
    - 'Copyleft Limited'
    - 'Free Restricted'
    - 'Patent License'
    - 'Proprietary Free'
- A `modified` Delta object's score will be increased:
  - If the `old` file has at least one license and the `new` file has no licenses (`factors` will include `license info removed`).
  - If the `old` file has no licenses and the `new` file has at least one license (`factors` will include `license info added`).
  - If both the `old` file and `new` file have at least one license and the license keys are not identical (e.g., the `old` file includes an `mit` license and an `apache-2.0` license and the `new` file includes only an `mit` license) (`factors` will include `license change`).
  - If any of the Commercial/Copyleft license categories listed above are found in the `new` file but not in the `old` file (`factors` will include, e.g., `proprietary free added`).
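The license rules above can be expressed as a small function that derives the license-related `factors` for a Delta object. This is a sketch under stated assumptions, not DeltaCode's code: the function name `license_factors`, the per-license dict with `key` and `category` fields, and the rule that each flagged category yields a factor named by its lowercased category plus " added" (inferred from the two examples `copyleft added` and `proprietary free added`) are all assumptions. Numeric score increments are omitted because this page does not state them.

```python
FLAGGED_CATEGORIES = {
    "Commercial", "Copyleft", "Copyleft Limited",
    "Free Restricted", "Patent License", "Proprietary Free",
}


def license_factors(category, old_licenses, new_licenses):
    """Return license-related factors for an 'added' or 'modified' Delta.

    Each license is a dict with hypothetical 'key' and 'category' fields.
    """
    old_licenses = old_licenses or []
    new_licenses = new_licenses or []
    old_keys = {lic["key"] for lic in old_licenses}
    new_keys = {lic["key"] for lic in new_licenses}
    old_cats = {lic["category"] for lic in old_licenses}
    new_cats = {lic["category"] for lic in new_licenses}

    factors = []
    if category == "added":
        if new_keys:
            factors.append("license info added")
        flagged = new_cats & FLAGGED_CATEGORIES
    elif category == "modified":
        if old_keys and not new_keys:
            factors.append("license info removed")
        elif new_keys and not old_keys:
            factors.append("license info added")
        elif old_keys != new_keys:
            factors.append("license change")
        # Only categories that are new to the file count here.
        flagged = (new_cats - old_cats) & FLAGGED_CATEGORIES
    else:
        # moved, removed and unmodified Deltas get no license factors.
        return factors

    # Assumed naming scheme: lowercased category followed by " added".
    factors.extend(f"{cat.lower()} added" for cat in sorted(flagged))
    return factors
```

For the example in the list above, an `old` file with `mit` and `apache-2.0` and a `new` file with only `mit` yields `["license change"]` in this sketch.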
Copyright Holder Additions and Changes
- An `added` Delta object's score will be increased if the `new` file contains one or more copyright `holders` (`factors` will include `copyright info added`).
- A `modified` Delta object's score will be increased:
  - If the `old` file has at least one copyright `holder` and the `new` file has no copyright `holders` (`factors` will include `copyright info removed`).
  - If the `old` file has no copyright `holders` and the `new` file has at least one (`factors` will include `copyright info added`).
  - If both the `old` file and `new` file have at least one copyright `holder` and the `holders` are not identical (`factors` will include `copyright change`).
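The copyright holder rules follow the same pattern and can be sketched in a few lines. The function name `copyright_factors` and the representation of holders as plain lists of strings are assumptions for the example, and the comparison treats holders as an unordered set, which may differ from how DeltaCode compares them.

```python
def copyright_factors(category, old_holders, new_holders):
    """Return copyright-related factors for an 'added' or 'modified' Delta.

    Holders are given as lists of strings (hypothetical representation).
    """
    old_set = set(old_holders or [])
    new_set = set(new_holders or [])

    if category == "added":
        return ["copyright info added"] if new_set else []

    if category == "modified":
        if old_set and not new_set:
            return ["copyright info removed"]
        if new_set and not old_set:
            return ["copyright info added"]
        if old_set and new_set and old_set != new_set:
            return ["copyright change"]

    # moved, removed and unmodified Deltas get no copyright factors.
    return []
```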
As noted above in Basic Scoring, from a license/copyright compliance perspective, the three least significant Delta categories are `moved`, `removed` and `unmodified`.
In the current version of DeltaCode, each of these three categories is assigned a score of 0, with no options to increase that score depending on the content of the Delta object.
However, it is possible that both `moved` and `removed` will be assigned some non-zero score in a future version. In particular, `removed` could be significant from a compliance viewpoint where, for example, the removal of a file results in the removal of a Commercial/Copyleft license obligation.