Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Standard methods to Ontology #246

Merged
merged 16 commits into from
Nov 2, 2021

Conversation

francescalb
Copy link
Collaborator

@francescalb francescalb commented Sep 29, 2021

Description:

Closes #228.

hash (based on base_iri) defined for ontology.Ontology (only req to allow for eq implementation)

eq (based on all triples) defined for ontology.Ontology. Note that consistency between blank nodes is not checked.

This is tested in test_load_emmo.py.
The test also checks that two equal emmo-ontologies are no longer equal when a new class or new relation is added to one of them.

Type of change:

  • Bug fix.
  • New feature.
  • Documentation update.

Checklist:

This checklist can be used as a help for the reviewer.

  • Is the code easy to read and understand?
  • Are comments for humans to read, not computers to disregard?
  • Does a new feature has an accompanying new test (in the CI or unit testing schemes)?
  • Has the documentation been updated as necessary?
  • Does this close the issue?
  • Is the change limited to the issue?
  • Are errors handled for all outcomes?
  • Does the new feature provide new restrictions on dependencies, and if so is this documented?

Comments:

@francescalb francescalb changed the title Flb/close 228 ontology stadand magic methods Added Standard methods to Ontology Sep 30, 2021
ontopy/ontology.py Outdated Show resolved Hide resolved
ontopy/patch.py Outdated Show resolved Hide resolved
@CasperWA CasperWA force-pushed the flb/close-228-Ontology_stadand_magic_methods branch from 08cc67d to 3fa0e52 Compare October 25, 2021 11:57
@francescalb francescalb marked this pull request as ready for review October 25, 2021 21:07
Copy link
Collaborator

@CasperWA CasperWA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great. I have only some minor code optimizations and a more significant question concerning the _sorted_entities() method.

ontopy/ontology.py Outdated Show resolved Hide resolved
ontopy/ontology.py Outdated Show resolved Hide resolved
ontopy/ontology.py Outdated Show resolved Hide resolved
ontopy/ontology.py Outdated Show resolved Hide resolved
tests/test_load_emmo.py Outdated Show resolved Hide resolved
tests/test_load_emmo.py Outdated Show resolved Hide resolved
tests/test_load_emmo.py Outdated Show resolved Hide resolved
tests/test_load_emmo.py Outdated Show resolved Hide resolved
tests/test_load_emmo.py Show resolved Hide resolved
francescalb and others added 4 commits November 1, 2021 13:32
Co-authored-by: Casper Welzel Andersen <43357585+CasperWA@users.noreply.github.com>
Co-authored-by: Casper Welzel Andersen <43357585+CasperWA@users.noreply.github.com>
ontopy/ontology.py Outdated Show resolved Hide resolved
@francescalb
Copy link
Collaborator Author

Why? I guess changing s to subject, p to predicate and o to obj can be done, but spo is quite standard when looking at triples I think.

Then, doing the _unabreviate in the yield, do wish for that because it is faster or just because you think it is prettier? If it is faster I agree, but otherwise I am not a super fan of compacting code all the time, as it is more difficult to read for new people coming in I think.

@CasperWA
Copy link
Collaborator

CasperWA commented Nov 1, 2021

Why? I guess changing s to subject, p to predicate and o to obj can be done, but spo is quite standard when looking at triples I think.

Single character variables is very bad practice when collaborating on code.
While it may be "standard" to use s, p, and o as one character variable names for subject, predicate and object in ontologies, it is extremely opaque for a non-initiated. Writing the variables out as subject, predicate, obj (not object, because that is a built-in variable), makes it much more transparent what we are dealing with.
At the very least, to make the variables similar, one could call them sub, pre, obj, however, even that becomes unclear, as "sub" is merely a prefix for anything, as is "pre". It could be anything.

In general, using single character variables is a leftover bad practice from programmers of compiled languages, where one wants to compact as much as possible, while it's uncommon multiple people get to work with the source code who's not deeply initiated in everything about the code base. For open code projects and general readability one should strive to be more explicit and forgiving with the number of characters used.

As a note, this is something I've gone through and changed in the whole of the code base in #245 as well.

Then, doing the _unabreviate in the yield, do wish for that because it is faster or just because you think it is prettier? If it is faster I agree, but otherwise I am not a super fan of compacting code all the time, as it is more difficult to read for new people coming in I think.

I am generally against creating unnecessary variables.
The definitions of s, p, and o originally, overwriting the iterating variables is bad practice - one shouldn't change these in the loop. This both avoids doing that as well as simplifying the code.
One could make it slightly more explicit perhaps here by splitting it on several lines, like such:

return (
    _unabbreviate(s),
    _unabbreviate(p),
    _unabbreviate(o),
)

I think, in general you are correct to be a bit more explicit in the code in terms of middle-steps, however, in this case it's so simplistic there is no need for this, I should think?

Co-authored-by: Casper Welzel Andersen <43357585+CasperWA@users.noreply.github.com>
@francescalb
Copy link
Collaborator Author

OK, fixed.

@jesper-friis
Copy link
Collaborator

Took a look at the conversation above. I am not sure I fully agree that using one-letter is always bad practice. In general I agree that descriptive variable names are good, but sensible use one-letter variables makes the the lines shorter and the code easier to read. If you just need an integer variable in a loop, I think that i or n are completely fine to use. The same is the case with s, p and o in triples. In contrary I am not much fan of _ as a variable you actually use, but if you only want to loop over the objects in a set of triples I think it is fine to write for _, _, o in triples: ....

@CasperWA
Copy link
Collaborator

CasperWA commented Nov 2, 2021

Took a look at the conversation above. I am not sure I fully agree that using one-letter is always bad practice. In general I agree that descriptive variable names are good, but sensible use one-letter variables makes the the lines shorter and the code easier to read. If you just need an integer variable in a loop, I think that i or n are completely fine to use. The same is the case with s, p and o in triples. In contrary I am not much fan of _ as a variable you actually use, but if you only want to loop over the objects in a set of triples I think it is fine to write for _, _, o in triples: ....

The underscore is the accepted Pythonic way of defining un-used variables, specifically in the example you show above, but also in other cases, e.g., when a function returns a Triple type and all you need are one of those.

I think exactly because they are single-character variables they are not easy to read, quite the contrary, it makes it impenetrable to understand, especially when the method or function names do not mention triples at all.
To align the variable naming a bit more, one can write out the "object" variable name, but append an underscore, as this is the way to avoid renaming built-ins according to PEP-8, i.e.: for subject, predicate, object_ in get_me_some_triples(): ....

I agree that single-character variables can be used as iterating variables. This is common practice across several programming languages - but for ease of readability I'd like to strongly support more descriptive variable names throughout the code base.

Copy link
Collaborator

@CasperWA CasperWA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks excellent to me now :) Thank you @francescalb !

@CasperWA
Copy link
Collaborator

CasperWA commented Nov 2, 2021

Everything said - we can always introduce our own standards for some variable naming, but I think it should be documented well then so there is a resource to go to if one is looking at this as a coder, but without an understanding of the content that's being worked with.

@CasperWA
Copy link
Collaborator

CasperWA commented Nov 2, 2021

This is all good for me. Let's discuss naming conventions perhaps in #245? As I'm doing a lot of different updates in the way of naming in that PR.

@francescalb francescalb merged commit ffdecd8 into master Nov 2, 2021
@francescalb francescalb deleted the flb/close-228-Ontology_stadand_magic_methods branch November 2, 2021 12:04
@CasperWA
Copy link
Collaborator

CasperWA commented Nov 2, 2021

Going through the code locally in my editor, I realize now you have overwritten the inherited get_triples() function. Was this intentional? If yes, perhaps it should reuse the same parameters and such? Either that, or the method should be renamed, e.g., to get_unabbreviated_triples() or something?

@CasperWA
Copy link
Collaborator

CasperWA commented Nov 2, 2021

The latest commit from #245 includes my suggested change/fix for this.

@francescalb
Copy link
Collaborator Author

You are right. That is not good at all. Your fix should be included.
Thanx!

@CasperWA
Copy link
Collaborator

CasperWA commented Nov 2, 2021

Great - I'll make an extra issue and PR for it and redact my other PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Standard dunder/magic methods for Ontology
3 participants