-
Notifications
You must be signed in to change notification settings - Fork 687
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use special "deleted" journalist for associations with deleted users #6225
Conversation
8cc2ea4
to
0a18cb0
Compare
This pull request introduces 1 alert when merging 0a18cb0 into f768a1a - view on LGTM.com new alerts:
|
0a18cb0
to
5abb6f4
Compare
This pull request introduces 1 alert when merging 5abb6f4 into f768a1a - view on LGTM.com new alerts:
|
Codecov Report
@@ Coverage Diff @@
## develop #6225 +/- ##
===========================================
- Coverage 85.12% 84.06% -1.07%
===========================================
Files 59 60 +1
Lines 4088 4186 +98
Branches 487 503 +16
===========================================
+ Hits 3480 3519 +39
- Misses 491 548 +57
- Partials 117 119 +2
Continue to review full report at Codecov.
|
5abb6f4
to
312106a
Compare
Is there a way to suppress the lgtm warning? The thing it's complaining about is intentional. And codecov is claiming that the migration script isn't covered, but AFAICT the test I wrote for it is running and works. The other new thing it said wasn't covered I just added a test for. |
This pull request introduces 1 alert when merging 312106a into f768a1a - view on LGTM.com new alerts:
|
@@ -784,13 +790,57 @@ def to_json(self, all_info: bool = True) -> Dict[str, Any]: | |||
|
|||
return json_user | |||
|
|||
def delete(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend explicitly passing the db_session as an argument:
- It will make testing easier.
- Importing the db object/module directly creates the implicit assumption that the caller has setup the special (flask) context where this import will work. This is usually fine in the running app, but can be tricky in utility scripts and in the test suite. Making the db_session an argument makes this requirement explicit/documented, and leaves it to the caller to get the db session.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the other methods in Journalist don't do this, so I'd prefer to keep it consistent in the class for now. What you explained seems reasonable to me, I just think it should be done separately.
312106a
to
7854b88
Compare
This pull request introduces 1 alert when merging 7854b88 into f768a1a - view on LGTM.com new alerts:
|
7854b88
to
3685cb0
Compare
This pull request introduces 1 alert when merging 3685cb0 into b4c2b3a - view on LGTM.com new alerts:
|
3685cb0
to
1176a7d
Compare
In testing the client PR, I realized I overlooked that all the |
This pull request introduces 2 alerts when merging 1176a7d into b4c2b3a - view on LGTM.com new alerts:
|
1176a7d
to
68d1b25
Compare
This pull request introduces 1 alert when merging 68d1b25 into b4c2b3a - view on LGTM.com new alerts:
|
68d1b25
to
de8e49f
Compare
This pull request introduces 1 alert when merging de8e49f into b4c2b3a - view on LGTM.com new alerts:
|
securedrop/models.py
Outdated
def __init__(self) -> None: | ||
# password field is unused (we override set_password below), plus | ||
# it's too short to be valid anyway. | ||
super().__init__(username='deleted', password='unused') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A valid OTP secret is generated for the deleted
user - attempting to log in with said credentials fails with the following raceback:
ValueError
ValueError: Should never happen: pw_salt is none for legacy Journalist 4
Traceback (most recent call last)
File "/opt/venvs/securedrop-app-code/lib/python3.8/site-packages/flask/app.py", line 2309, in __call__
return self.wsgi_app(environ, start_response)
File "/opt/venvs/securedrop-app-code/lib/python3.8/site-packages/flask/app.py", line 2295, in wsgi_app
response = self.handle_exception(e)
File "/opt/venvs/securedrop-app-code/lib/python3.8/site-packages/flask/app.py", line 1741, in handle_exception
reraise(exc_type, exc_value, tb)
File "/opt/venvs/securedrop-app-code/lib/python3.8/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/opt/venvs/securedrop-app-code/lib/python3.8/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/opt/venvs/securedrop-app-code/lib/python3.8/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/opt/venvs/securedrop-app-code/lib/python3.8/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/opt/venvs/securedrop-app-code/lib/python3.8/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/opt/venvs/securedrop-app-code/lib/python3.8/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/venvs/securedrop-app-code/lib/python3.8/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/home/user/securedrop/securedrop/journalist_app/main.py", line 30, in login
user = validate_user(request.form['username'],
File "/home/user/securedrop/securedrop/journalist_app/utils.py", line 97, in validate_user
return Journalist.login(username, password, token)
File "/home/user/securedrop/securedrop/models.py", line 730, in login
if not user.valid_password(password):
File "/home/user/securedrop/securedrop/models.py", line 571, in valid_password
raise ValueError(
ValueError: Should never happen: pw_salt is none for legacy Journalist 4
Login should fail, but not this explosively! There was existing code to bail out of logins for invalid usernames, of which deleted
is currently the only one, but it's not getting hit because it assumes that the invalid username and UUID are the same.
I'm kindof coming around to the idea of a "login_enabled" field or something in the journalist table, which would be checked first when validating user logins. This will probably have to be added anyway in the future to support account locking, but it could simplify this PR now, and allow for the removal of that INVALID_USERNAMES array in favour of a db query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, setting/forgetting a valid diceware password for the deleted user, and dropping the overriden set_password method, would both rule out the exception above and clear the new LGTM alert. I recognise that it's an intentional, not accidental, change, but the side effects of overriding the method seem undefined, which is a worry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Login should fail, but not this explosively! There was existing code to bail out of logins for invalid usernames, of which deleted is currently the only one, but it's not getting hit because it assumes that the invalid username and UUID are the same.
Uh, I missed that UUID check somehow. Fixing that and writing a test for this as well.
It is semi-intentional on my part that it does explode, the idea was that it should be impossible to log into "deleted" on multiple levels, first being the INVALID_USERNAMES check (or login_enabled in the future), and if you somehow bypassed it (as happened in this case), you'd still get stuck because there's no salt/password/TOTP. If we are fine with just having a random diceware+TOTP values to provide that second layer, I can try to switch to that. It is likely that will add additional complexity to the database migration script since we need to go through all that logic without being able to reuse the Journalist
class, but it seems doable. We would still need the override for check_username_acceptable, but the consequences of that seem much more predictable (compared to your concern with the override of set_password).
I think we still want to keep INVALID_USERNAMES even with a potential login_enabled because we want to prevent people from manually creating "deleted" account. So I can work on adding in login_enabled but I don't think it'll help that much here compare to the increased scope of another schema change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just leery of something like that throwing an undhandled exception, but the point about migration woes is well-taken. Maybe we don't even need the login_enabled
field just yet: if you had a Journalist.is_unlocked()
that just returned true you could override that in DeletedJournalist
to return false without falling foul of constructor overrides. If we ever did add account locking (and said field) it would still be useful, the Journalist
version of the method would probably just query the db.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating a PassphraseGenerator in the migration step seems to work with minimal copy and paste, so I pushed changes that now set a real passphrase/totp secret for "deleted", and no longer override "set_password". I also fixed the pre-existing tests for the deleted journalist login to work with the new style of having a real UUID.
if you had a
Journalist.can_log_in()
that just returned true you could override that inDeletedJournalist
to return false without falling foul of constructor overrides
This wouldn't work because on the login form when we query by username, we get an instance of Journalist
back, not DeletedJournalist
. Now DeletedJournalist
is only used to bypass the username validation in check_username_acceptable.
Now that we no longer need it to override password handling, we could just get rid of the class and do something like:
deleted = Journalist(username="placeholder", password=...)
deleted.username = "deleted"
to bypass the constructor validation. And DeletedJournalist.get()
would become Journalist.get_deleted()
. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM - will hold off on further review for now, lemme know when it's updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zenmonkeykstop done!
de8e49f
to
eb9184b
Compare
This pull request introduces 1 alert when merging eb9184b into 5daebb6 - view on LGTM.com new alerts:
|
eb9184b
to
0b648d5
Compare
This pull request introduces 1 alert when merging 0b648d5 into 5daebb6 - view on LGTM.com new alerts:
|
0b648d5
to
4a15c14
Compare
(Rebased to fix the merge conflict after the Flask 2.0 PR landed) |
This pull request introduces 1 alert when merging 4a15c14 into a20c843 - view on LGTM.com new alerts:
|
Currently when a journalist is deleted, most referential tables are updated with `journalist_id=NULL`, forcing all clients to accomodate that case. Instead, we are now going to either re-associate those rows with a special "deleted" journalist account, or delete them outright. All journalist_id columns are now NOT NULL to enforce this. Tables with rows migrated to "deleted": * replies * seen_files * seen_messages * seen_replies Tables with rows that are deleted outright: * journalist_login_attempt * revoked_tokens The "deleted" journalist account is a real account that exists in the database, but cannot be logged into. It is not possible to delete it nor is it shown in the admin listing of journalists. It is lazily created on demand using `Journalist.get_deleted()`. For consistency reasons, a randomly generated diceware passphrase and TOTP secret are set in the database for this account, but never revealed to anyone. We use a placeholder username to bypass Journalist's normal validation check that the username is not "deleted". Because we now have a real user to reference, the api.single_reply endpoint will return a real UUID instead of the "deleted" placeholder. Journalist objects must now be deleted by calling the new delete() function on them. Trying to directly `db.session.delete(journalist)` will most likely fail with an Exception because of rows that weren't migrated first. The migration step looks for any existing rows in those tables with journalist_id=NULL and either migrates them to "deleted" or deletes them. Then all the columns are changed to be NOT NULL. Fixes #6192.
4a15c14
to
6685e28
Compare
This pull request introduces 1 alert when merging 6685e28 into a20c843 - view on LGTM.com new alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ran through test plan, LGTM!
Also:
- spun up a prod VM install in 2.1.0
- added a journalist account, viewed submissions with it and deleted the account
- confirmed nulls in seen_* tables
- upgraded to packages built off this branch
- confirmed the creation of a
deleted
user with a valid UUID, and that seen_* table entries with nulls now reference this user's id - confirmed the API reply behaviour.
Status
Ready for review. Note that this depends on #6223 to work properly.
Description of Changes
Currently when a journalist is deleted, most referential tables are
updated with
journalist_id=NULL
, forcing all clients to accomodatethat case. Instead, we are now going to either re-associate those rows
with a special "deleted" journalist account, or delete them outright.
All journalist_id columns are now NOT NULL to enforce this.
Tables with rows migrated to "deleted":
Tables with rows that are deleted outright:
The "deleted" journalist account is a real account that exists in the
database, but cannot be logged into and has no passphase set. It is not
possible to delete it nor is it shown in the admin listing of
journalists. It is lazily created on demand using a special
DeletedJournalist subclass that bypasses username and passphrase
validation.
Because we now have a real user to reference, the api.single_reply
endpoint will return a real UUID instead of the "deleted" placeholder.
Journalist objects must now be deleted by calling the new delete()
function on them. Trying to directly
db.session.delete(journalist)
will most likely fail with an Exception because of rows that weren't
migrated first.
The migration step looks for any existing rows in those tables with
journalist_id=NULL and either migrates them to "deleted" or deletes
them. Then all the columns are changed to be NOT NULL.
Fixes #6192.
Testing
Deployment
This has a schema change and data migration for existing installs.
Checklist
make lint
) and tests (make test
) pass in the development container