Verdi node delete 923 #1083

lekah · 2018-01-26T18:15:56Z

verdi node delete is introduced on the command line, making the secret script no longer necessary.
Added a test for that functionality and updated the documentation.

…the command-line interface. A new module in utils takes care of the abstract part (querying of nodes to delete, with the provenance followed in reverse). Backend-specific utilities take care of the deletion in the DB

…hich means that aiidateam#923 is resolved

…entation for the verdi node subcommands

nmounet · 2018-01-30T14:09:44Z

aiida/cmdline/commands/node.py

+            action='store_true')
+        # Commenting this option for now
+        # parser.add_argument('-f', '--force', help='force deletion, disables final user confirmation', action='store_true')
+        parser.add_argument('-v', '--verbosity', help='verbosity level', action='count', default=1)


for the verbosity level: not sure it's very clear like this...

This has been addressed in 3fcab81
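
One possible reason the verbosity option reads as unclear: with argparse, `action='count'` increments from the default, so with `default=1` a single `-v` already yields 2. The sketch below uses only the standard library; the parser and the function name `build_parser` are stand-ins for illustration, and only the `-v` option is copied from the diff above.

```python
import argparse


def build_parser():
    # Hypothetical stand-in parser; not the actual verdi command parser.
    parser = argparse.ArgumentParser(prog='node-delete-sketch')
    parser.add_argument('-v', '--verbosity', help='verbosity level',
                        action='count', default=1)
    return parser


parser = build_parser()
# Each occurrence of -v adds one to the default value.
levels = {
    'no flag': parser.parse_args([]).verbosity,
    '-v': parser.parse_args(['-v']).verbosity,
    '-vv': parser.parse_args(['-vv']).verbosity,
}
print(levels)
```

With this setup there is no way to request a verbosity below the default from the command line, which may or may not be the intended behaviour.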

nmounet

Seems good, but since this is quite dangerous I think one needs at least one more review (and maybe an approval from GP?)

szoupanos

Overall, and from a quick look, it looks OK (but the details are also important). For example, I am not totally sure about the link status and whether we cover all the cases with the tests. Maybe we can chat directly.

It's not a bad idea to involve @giovannipizzi in the discussion.

szoupanos · 2018-01-30T16:00:43Z

aiida/backends/sqlalchemy/utils.py

+
+    session = sa.get_scoped_session()
+
+    with session.begin(subtransactions=True):


At aiida.orm.implementation.sqlalchemy.code.delete_code I see the following

    from aiida.backends.sqlalchemy import get_scoped_session
    session = get_scoped_session()
    session.begin(subtransactions=True)
    try:
        code.dbnode.delete()
        session.commit()
    except:
        session.rollback()
        raise

This should be the verbose equivalent of what you wrote. I suppose that at the end of the with statement there is an auto-commit (according to http://docs.sqlalchemy.org/en/latest/orm/session_transaction.html#session-autocommit).

It is worth verifying that a rollback happens for any kind of error in the queries. I suppose it does, but...
The same goes for Django.
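
To illustrate the question being raised without a database server, here is a sketch that swaps in the standard-library `sqlite3` module for SQLAlchemy (so it is an analogy, not the PR's code). The `sqlite3` connection context manager commits when the block succeeds and rolls back when it raises, which is the same contract the explicit try/except form spells out. The table name and all function names are hypothetical.

```python
import sqlite3


def make_db():
    # Small in-memory database with three rows to delete from.
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE node (id INTEGER PRIMARY KEY)')
    conn.executemany('INSERT INTO node (id) VALUES (?)', [(1,), (2,), (3,)])
    conn.commit()
    return conn


def delete_with_context_manager(conn, pks, fail=False):
    # `with conn:` commits on success and rolls back if the block raises.
    with conn:
        conn.executemany('DELETE FROM node WHERE id = ?', [(pk,) for pk in pks])
        if fail:
            conn.execute('DELETE FROM no_such_table')  # deliberately wrong query


def delete_with_try_except(conn, pks, fail=False):
    # The verbose form: explicit commit and explicit rollback.
    try:
        conn.executemany('DELETE FROM node WHERE id = ?', [(pk,) for pk in pks])
        if fail:
            conn.execute('DELETE FROM no_such_table')  # deliberately wrong query
        conn.commit()
    except Exception:
        conn.rollback()
        raise


def count(conn):
    return conn.execute('SELECT COUNT(*) FROM node').fetchone()[0]


results = {}
for name, delete in [('context manager', delete_with_context_manager),
                     ('try/except', delete_with_try_except)]:
    conn = make_db()
    try:
        delete(conn, [1, 2], fail=True)
    except sqlite3.OperationalError:
        pass
    rows_after_failure = count(conn)
    delete(conn, [1, 2])
    results[name] = (rows_after_failure, count(conn))
print(results)
```

Both variants leave all three rows in place after the failing run and remove two rows on the successful run. Whether the SQLAlchemy session behaves the same way under every error is exactly what the comment above asks to verify.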

The Django part is taken from the existing script, and the same logic is replicated for SQLAlchemy.
The Django code works on big databases; I have been using it for a while in its current form.
Regarding your comment on verbosity, I disagree: I delete the links explicitly, since that is more robust than relying on deletion cascades (which I'm not sure are implemented) for the links. Maybe the above example for code deletion should be checked?

From the docs that you cite, I understand that using the session in a try-except clause or with a context manager is equivalent. As for the 'it is worth verifying': that's the reviewer's job ;)

Running on big databases is not actually a proof of correctness (it's an indication). I also never suggested doing deletion propagation etc.

An overall comment. The reviewers' job is to make some constructive comments on the code and highlight some potential corner cases (of course they should be somehow answered constructively since the author asked the reviewer's opinion & the reviewer invested some time to give feedback).

Unfortunately, I don't have the time to check your code for all these corner cases.

Me see problem, I make fix... Seriously, let's not have a big discussion here about proofs of correctness and the role of reviewers. I added the explicit try-and-except clause that you wanted and made some more tests with deliberately wrong queries. The rollback works. Actually, I found that, unlike in Django, the cascading is not implemented for group membership, so I also delete on the dbnode_dbgroup table in 83b0478.
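
The explicit-deletion approach described here can be sketched as follows, again with `sqlite3` as a stand-in for the real backends. The schema is a simplified assumption (hypothetical table and column names loosely modelled on the tables mentioned in this thread), not AiiDA's actual schema. Rows that reference a node are removed from the link and group-membership tables first, then the nodes themselves, all inside one transaction.

```python
import sqlite3

# Simplified, hypothetical schema: nodes, links between nodes, group membership.
SCHEMA = """
CREATE TABLE db_dbnode (id INTEGER PRIMARY KEY);
CREATE TABLE db_dblink (input_id INTEGER NOT NULL, output_id INTEGER NOT NULL);
CREATE TABLE db_dbgroup_dbnodes (dbgroup_id INTEGER NOT NULL, dbnode_id INTEGER NOT NULL);
"""


def make_db():
    conn = sqlite3.connect(':memory:')
    conn.executescript(SCHEMA)
    conn.executemany('INSERT INTO db_dbnode VALUES (?)', [(1,), (2,), (3,)])
    conn.execute('INSERT INTO db_dblink VALUES (1, 2)')            # node 1 -> node 2
    conn.execute('INSERT INTO db_dbgroup_dbnodes VALUES (10, 2)')  # node 2 is in group 10
    conn.commit()
    return conn


def delete_nodes(conn, pks):
    params = list(pks)
    if not params:
        return
    marks = ','.join('?' for _ in params)
    with conn:  # one transaction: commit on success, rollback on any error
        # Do not rely on cascades: remove referencing rows explicitly.
        conn.execute(
            f'DELETE FROM db_dblink WHERE input_id IN ({marks}) OR output_id IN ({marks})',
            params + params)
        conn.execute(f'DELETE FROM db_dbgroup_dbnodes WHERE dbnode_id IN ({marks})', params)
        conn.execute(f'DELETE FROM db_dbnode WHERE id IN ({marks})', params)


def row_counts(conn):
    tables = ('db_dbnode', 'db_dblink', 'db_dbgroup_dbnodes')
    return {t: conn.execute(f'SELECT COUNT(*) FROM {t}').fetchone()[0] for t in tables}


conn = make_db()
delete_nodes(conn, [2])
counts = row_counts(conn)
print(counts)
```

Deleting node 2 leaves two nodes and no dangling link or membership rows, regardless of whether the backend defines cascades.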

Thanks for the changes

lekah · 2018-01-30T17:20:23Z

Welcome to the party @giovannipizzi
I tried to make the command both transparent in usage and safe.
By default, it asks for user confirmation before deleting anything. Also, there is a dry-run option.
This particular feature has been requested by a multitude of users. I agree that it's dangerous, which is why there are precautions and a prompt for user confirmation.
I'm very open to adding another confirmation question, and/or printing a warning in all-caps.
(ARE YOU REALLY SURE YOU WANT TO DELETE X NODES?)
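
The safeguards described above (confirmation by default, plus a dry-run mode) could look roughly like the sketch below. The function name `confirm_and_delete`, its parameters, and the exact prompt wording are assumptions for illustration; the real command may differ.

```python
def confirm_and_delete(pks, delete_fn, dry_run=False, ask=input):
    """Delete only after an explicit 'yes'; return the list of deleted pks."""
    pks = sorted(set(pks))
    print('{} node(s) selected for deletion: {}'.format(len(pks), pks))
    if dry_run:
        # Dry run: report what would happen, touch nothing.
        print('Dry run: nothing was deleted.')
        return []
    answer = ask('ARE YOU REALLY SURE YOU WANT TO DELETE {} NODES? [yes/N] '.format(len(pks)))
    if answer.strip().lower() != 'yes':
        print('Aborted.')
        return []
    delete_fn(pks)
    return pks


# Usage with a fake "database" (a list that records deletions) and scripted answers.
deleted = []
outcome_dry = confirm_and_delete([3, 1], deleted.extend, dry_run=True)
outcome_no = confirm_and_delete([3, 1], deleted.extend, ask=lambda prompt: 'n')
outcome_yes = confirm_and_delete([3, 1], deleted.extend, ask=lambda prompt: 'yes')
print(outcome_dry, outcome_no, outcome_yes, deleted)
```

Passing the prompt function as a parameter keeps the sketch testable without a terminal.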

…ement for node deletion for sqlalchemy because it wasn't liked, and actually because the cascading for dbgroup_dbnodes was failing (if node belonged to a group) in sqlalchemy

…omments

giovannipizzi · 2018-01-31T09:57:32Z

I gave my comments to @lekah personally, after which he did the two new commits. I think the way it is now is acceptable. Note that follow-returns has been removed from the command line (I think it's too dangerous, and it's not what people want). It's still in the code, to avoid removing the implementation.

giovannipizzi · 2018-01-31T09:59:10Z

Important question: what to do when deleting outputs of sealed calculations? @sphuber @muhrin
It shouldn't be allowed, what do you think? Otherwise the calculation looks ok but it misses some important output. Is this a constraint we want to have? As a consequence, should there be a flag for deleting 'creating' calculations?

giovannipizzi · 2018-01-31T10:12:31Z

After further discussion with @lekah we realised that there might be a number of reasons why users want to delete outputs. They should at least be aware that they are doing it. So, proposed solution:

There is a check flag, True by default, that checks:
- whether any Data node to be deleted was CREATEd by a calculation that is not going to be deleted
- whether any Calculation node to be deleted was CALLed by a calculation that is not going to be deleted
In this case, it prints a message with the pairs of nodes (input -> output) and an explanation of why this is generally discouraged.
If the user wants to continue anyway, we add an entry to the CalculationLog table (visible with verdi calculation logshow or verdi work report) saying that the user asked to delete an output, similarly to what is done when a calculation is killed. That way it is at least written down, and if something odd (i.e. missing outputs) shows up when checking old calculations, one can find out why.
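
The two checks proposed above can be sketched on a plain list of links. The representation (triples of source pk, target pk, link type as a string) and the names `find_affected` and `log_messages` are assumptions made for this illustration; the actual implementation queries the database.

```python
def find_affected(links, to_delete):
    """Return (lost_created, lost_called) as lists of (surviving source, deleted target)."""
    to_delete = set(to_delete)
    lost_created, lost_called = [], []
    for source, target, link_type in links:
        # Only pairs where the target is deleted but its source survives matter.
        if target in to_delete and source not in to_delete:
            if link_type == 'CREATE':
                lost_created.append((source, target))
            elif link_type == 'CALL':
                lost_called.append((source, target))
    return lost_created, lost_called


def log_messages(lost_created, lost_called):
    # Text that could be shown to the user and written to the calculation's log.
    messages = ['Calculation {} loses created data {} (deletion requested by user)'.format(c, d)
                for c, d in lost_created]
    messages += ['Calculation {} loses called calculation {} (deletion requested by user)'.format(c, d)
                 for c, d in lost_called]
    return messages


# Example graph: 1 -INPUT-> 2 -CREATE-> 3, 2 -CALL-> 4 -CREATE-> 5
links = [(1, 2, 'INPUT'), (2, 3, 'CREATE'), (2, 4, 'CALL'), (4, 5, 'CREATE')]
lost_created, lost_called = find_affected(links, to_delete={3, 4, 5})
messages = log_messages(lost_created, lost_called)
print(lost_created, lost_called)
```

Node 5 does not show up in the result because its creator (node 4) is deleted along with it; only calculation 2, which survives, is reported.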

sphuber · 2018-01-31T10:50:01Z

Do these final checks hold for all calculations or only sealed ones? Do we now have a clear definition and implementation of the sealed concept? I for one am not sure what exactly it entails.

lekah · 2018-01-31T17:03:29Z

For all calculations

…a logging mechanism for calculations that lose created data or called instance. Implemented a printout when this happens and storing to DbLog. To be reviewed for further improvements.

lekah · 2018-01-31T18:34:37Z

Commit c0da5c4 addresses the first set of remarks by @giovannipizzi. There is a query to find all calculations that lose created data, and a query for calculations that lose called instances.
Prompts to stdout inform the user, and in case of deletion, information is written to DbLog.
No check flag yet; this will follow asap.

szoupanos · 2018-02-01T10:41:23Z

I had more time today and I looked at it more carefully.
What I am still missing is which links are available now (this is more or less "easy" to find), and how these are created (apart from the obvious: CREATE, RETURN, INPUT etc.). So I need a bit more input to closely follow the checks in aiida/utils/delete_nodes.py and to judge whether they are sufficient.

Do we have a draft on the existing link types?

lekah · 2018-02-01T11:31:54Z

Current link types are enumerated (literally) here: http://aiida-core.readthedocs.io/en/latest/_modules/aiida/common/links.html
For how these are created, check for example the Node._add_dblink_from method and everything that calls it, if you're interested in the higher-level implementation. @sphuber, is there a draft on currently existing link types?

sphuber · 2018-02-01T12:32:28Z

If with 'draft' you are referring to the document that I started writing and presented at some point, that is all still purely hypothetical and none of it has been implemented. You already pointed to the link types that currently exist in v0.11.0. As to how they are used, the best thing would be to look at the discussion in issue #687, where we defined the migration rules to retroactively write the link types for older databases. These should reflect the rules that the code currently uses to create new links.

…ations that would lose created data or called instance will get this action written into the log. Additional warnings are printed, but it is still allowed to delete data without its creator or called workflows without their callers, since there are usecases

lekah · 2018-02-01T19:41:22Z

71cb8df addresses the comments and suggestions by @giovannipizzi. Calculations will have it written to the log if created or called instances are deleted. There is a flag to disable these checks, but it cannot be set from the command line.

giovannipizzi

I am going to approve this. The documentation on the link types is indeed missing, but that is a different issue. When the documentation is ready, we can check if we need to add a few more checks (the only one coming to my mind is a warning for 'RETURN'ed stuff, similarly to CREATEd). The rest should be ok as we automatically delete all children via INPUT or CREATE, at least in this version.
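
The rule mentioned here, that all children reachable via INPUT or CREATE links are deleted together with the requested nodes, amounts to a graph traversal. Below is a minimal sketch on an in-memory list of links; the triple representation and the name `collect_nodes_to_delete` are assumptions for illustration, and the real code works through database queries instead.

```python
from collections import deque


def collect_nodes_to_delete(links, start_pks, follow=('INPUT', 'CREATE')):
    """Breadth-first search over the followed link types, from parents to children."""
    children = {}
    for source, target, link_type in links:
        if link_type in follow:
            children.setdefault(source, []).append(target)

    to_delete = set(start_pks)
    queue = deque(start_pks)
    while queue:
        pk = queue.popleft()
        for child in children.get(pk, ()):
            if child not in to_delete:
                to_delete.add(child)
                queue.append(child)
    return to_delete


# 1 -INPUT-> 2 -CREATE-> 3 -INPUT-> 4; 2 -RETURN-> 5 is not followed; 6 is a parent of 2.
links = [(1, 2, 'INPUT'), (2, 3, 'CREATE'), (3, 4, 'INPUT'),
         (2, 5, 'RETURN'), (6, 2, 'INPUT')]
selected = collect_nodes_to_delete(links, [1])
print(sorted(selected))
```

RETURN links are left out of `follow` by default here, in line with the decision in this thread to drop follow-returns from the command line.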

giovannipizzi · 2018-02-02T10:50:45Z

I won't merge yet, in case someone wants to give some final comments.

szoupanos · 2018-02-02T11:23:18Z

Yes, I agree. I also had a short chat with Leo, and I also agree with what @giovannipizzi said.
We can add more checks in the future if we discover that something is missing.

lekah added 4 commits January 26, 2018 15:00

Added tests that check the node deletion. They work as anticipated, w…

ffd9032

…hich means that aiidateam#923 is resolved

Updates SQLAlchemy deletion since there was a bug

396d489

Updated documentation, removing the secret script, and updating docum…

c44d3f5

…entation for the verdi node subcommands

lekah requested review from sphuber, nmounet and szoupanos January 26, 2018 18:16

Removed the node-deletion script from the cookbook

417cb9d

nmounet reviewed Jan 30, 2018


szoupanos reviewed Jan 30, 2018


szoupanos requested a review from giovannipizzi January 30, 2018 16:24

lekah added 5 commits January 30, 2018 19:19

Merge branch 'develop' into verdi_node_delete_923

85bf11e

Added more martial warnings for node deletion. Also, rewrote the stat…

83b0478

…ement for node deletion for sqlalchemy because it wasn't liked, and actually because the cascading for dbgroup_dbnodes was failing (if node belonged to a group) in sqlalchemy

Merge branch 'develop' into verdi_node_delete_923

4878fad

Updated command-line options for aiidateam#923. Also fixed types in c…

3fcab81

…omments

Merge branch 'develop' into verdi_node_delete_923

5f54b4d

This addresses some requests in aiidateam#1083, specifically to have …

c0da5c4

…a logging mechanism for calculations that lose created data or called instance. Implemented a printout when this happens and storing to DbLog. To be reviewed for further improvements.

giovannipizzi approved these changes Feb 2, 2018


szoupanos approved these changes Feb 2, 2018


nmounet approved these changes Feb 5, 2018


Merge branch 'develop' into verdi_node_delete_923

c5baf65

nmounet merged commit 9ea1a5e into aiidateam:develop Feb 5, 2018

This was referenced Mar 1, 2018

verdi node delete #923

Closed

Node deletion script is out of date #984

Closed
