In the main source (is it licenses/licenses?), every licence that has the family field has "family": "" (empty). Perhaps this is because the field descriptor in datapackage.json is also empty, and I do not see any text explaining it anywhere else. What is "family"?
Suggestion 1
The maintainer field is already used in some other places, so if we adopt family as a brand, it will be strongly correlated with maintainer and add no new information.
What end users need is a way to group similar licences, so that is the main suggestion (!).
Well, how to group them?
Suggestion 2
A simple first step is to use existing information as a clue for the "family assignment process". The fields is_by and is_sa are good clues for useful grouping. Generalizing: the "summary of the main clauses of the license" (as on, e.g., tldrlegal.com) offers the best set of attributes for inferring groups.
Example: ODC-BY v1, GFDL v1.3, ... CC-BY v4 can be grouped as similar licenses (as can versions such as CC-BY v1, CC-BY v2, etc.) because they have the same clauses in their summaries.
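To make the grouping idea concrete, here is a minimal Python sketch. It assumes each licence record carries boolean is_by and is_sa fields; the records, ids and the helper name group_by_clauses are illustrative assumptions, not copied from the dataset.

```python
from collections import defaultdict

# Illustrative records only: ids and flag values are assumptions for this
# sketch, not an extract of the real licenses data.
licenses = [
    {"id": "CC-BY-4.0", "is_by": True, "is_sa": False},
    {"id": "ODC-BY-1.0", "is_by": True, "is_sa": False},
    {"id": "CC-BY-SA-4.0", "is_by": True, "is_sa": True},
    {"id": "CC0-1.0", "is_by": False, "is_sa": False},
]


def group_by_clauses(records, fields=("is_by", "is_sa")):
    """Group licence ids by the tuple of their clause flags."""
    groups = defaultdict(list)
    for rec in records:
        # A missing flag is treated as False in this sketch.
        key = tuple(bool(rec.get(f)) for f in fields)
        groups[key].append(rec["id"])
    return dict(groups)


groups = group_by_clauses(licenses)
for key, ids in groups.items():
    print(key, ids)
```

Adding more clause attributes (for example, from a clause summary) would only mean extending the fields tuple; the resulting groups are candidates for curators to review, not final families.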
Suggestion 3
Some groups are obvious, others are less evident, and they can change over time with new analysis and discussion. Changing the family assigned to one item is normal and has no impact, but changing a family's name is problematic. Biologists and linguists handle this problem with canonicalization, and it is a good solution here too: within the set of grouped elements, you (the curators) elect a "typical element" and use its name (or, e.g., a prefix of the name) as the family name.
In the example, the most popular licence of the group is CC-BY, so it is natural to use cc-by as the family name.
Conclusion: the (suggested) family field is the name of the similarity group assigned to the license, and this name is derived from the (name of the) canonical license of the group.
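The canonical-naming step could look roughly like the sketch below. It assumes licence ids end in a version suffix such as "-4.0" and that the curators pass in the elected canonical licence explicitly; family_name and assign_family are hypothetical helper names.

```python
import re


def family_name(canonical_id):
    """Derive a family name from the canonical licence id.

    Assumption: ids end with a version suffix like "-4.0" or "-1.3",
    which is stripped before lowercasing.
    """
    return re.sub(r"-\d+(\.\d+)*$", "", canonical_id).lower()


def assign_family(group_ids, canonical_id):
    """Give every licence in the group the family name of the canonical one."""
    if canonical_id not in group_ids:
        raise ValueError("canonical licence must be a member of the group")
    name = family_name(canonical_id)
    return {license_id: name for license_id in group_ids}


# The curators elect CC-BY-4.0 as the typical element of this group.
assigned = assign_family(["CC-BY-4.0", "ODC-BY-1.0"], canonical_id="CC-BY-4.0")
print(assigned)
```

Moving a licence to another group only changes that licence's value, while the family name itself stays stable as long as the canonical element does.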
@Stephen-Gates I'm not entirely sure about this column. Let's first agree on what this column would look like before we add items 😄 Do you want to set out here what kind of thing you are thinking of adding?