Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add security notification script sdw-notify #445

Merged
merged 14 commits into from
Feb 13, 2020

Conversation

eloquence
Copy link
Member

@eloquence eloquence commented Feb 7, 2020

This script checks whether updates have been applied recently enough using the preflight updater (AKA launcher) that is part of the SecureDrop Workstation. "Recently enough" is defined as follows:

  • If the updater has never successfully run, uptime does not exceed a warning threshold.
  • If the updater has run successfully, it is not longer ago than the warning threshold. However, we allow for a grace period.

In this PR, the warning threshold is set to 5 days, and the grace period to 30 minutes.

The script runs as a cron job and displays a PyQt warning if warranted according to the above criteria. Lock files stored in /run/user are used to avoid showing the warning multiple times, or while the preflight updater is running.

As part of this work, code from sdw-launcher has been reorganized, to allow for shared use in a utility library by other system components like sdw-notify.

NOTE: Due to packaging changes, this PR bumps the version of securedrop-workstation. The packaging changes address parts of #449 (the creation of *.pyc and *.pyo files) by disabling implicit Python bytecode compilation.

Status

Ready for initial review. Draft until re-verification in Qubes and RPM updates.

Description of Changes

Fixes #236

Testing

Installation:

  • Check out this PR in your development VM
  • make clone && make prep-dom0 in dom0

Regression testing:

  • Test that the preflight updater is still working as expected by clicking the "SecureDrop" desktop icon
  • Test that the preflight updater still creates logs in ~/.securedrop_launcher/launcher.log

Testing new preflight updater functionality:

  • Test that you cannot run another preflight updater while it is already running

Testing new notification functionality:

Before running any commands, I recommend running watch tail -n 25 ~/.securedrop_launcher/logs/sdw-notify.log in a separate terminal. All notable events below should produce log entries.

  • Test that running /opt/securedrop/launcher/sdw-notify.py has the expected results for your current workstation state per the behavior design doc
  • Test that setting the timestamp written to ~/.securedrop_launcher/sdw-last-updated to a date well in the past and running /opt/securedrop/launcher/sdw-notify.py causes the notification to appear, if the system has been up for more than 30 minutes (uptime grace period).
    • You can manually edit UPTIME_GRACE_PERIOD in /opt/securedrop/launcher/sdw_notify/Notify.py to force a run.
  • Test that setting the timestamp written to ~/.securedrop_launcher/sdw-last-updated to today and running /opt/securedrop/launcher/sdw-notify.py causes the notification to not appear.
    • Ensure that the uptime grace period has elapsed or is set to 0.
  • Test that removing the timestamp ~/.securedrop_launcher/sdw-last-updated causes the notification to appear. (Updated per discussion below.)
  • Test that creating a timestamp file ~/.securedrop_launcher/sdw-last-updated with a bad value causes the notification to appear. (Updated per discussion below.)
  • Test that all of the above produces log entries.
  • Test that the cron job runs as expected (easiest by modifying the 0 to * in /etc/crontab to change it from "every hour" to "every minute")

Checklist

  • Linting (make flake8) passes
  • Tests (make test) pass in dom0 of a Qubes install
    • I'm seeing unrelated test failures related to kernel versions and the RPM key in sys-firewall.
  • I have added/removed files, and have updated packaging logic in MANIFEST.in and rpm-build/SPECS/securedrop-workstation-dom0-config.spec

Authorship

This PR represents a collaboration between @pierwill and @eloquence, with a lot of helpful input from @emkll.


relpath_notify = "../sdw_notify/Notify.py"
path_to_notify = os.path.join(os.path.dirname(os.path.abspath(__file__)), relpath_notify)
notify = SourceFileLoader("Notify", path_to_notify).load_module()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the pattern for relative imports that was already used in sdw-launcher, updated here to avoid the deprecation warning caused by the use of imp. However, I'm sure a cleaner relative import is possible and would be happy to refactor both files - pointers appreciated!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The path insertions aren't ideal. Since we have an RPM package now, we can install the Python modules at the system level and import them directly. Let's not do that now, especially with #450 still pending review. Not critical pre-pilot, but would make the updater/notifier code much more maintainable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree. Note that this is just copying the approach that was already in place (slightly revised to avoid a deprecation warning for imp), but we should definitely clean the launcher organization up a bit.

p = Pool(processes=1)
lock_result = p.apply(notify.can_obtain_updater_lock)
p.close()
assert lock_result is False
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've struggled to find a clean way to provoke a lock conflict in a test that works well across platforms. Pointers appreciated here as well. This method does appear to work, but as far as I can tell, it causes the code coverage for those lines to be misreported.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here it would be fine to mock open or lockf and have IOException (not tested, but as example, using the following decorators :

@mock.patch("builtins.open", side_effect=IOError())
and
@mock.patch("fcntl.lockf", side_effect=IOError())

depending on which part of the lock logic you want to test

@@ -26,6 +26,7 @@ jobs:
virtualenv .venv
source .venv/bin/activate
pip install --require-hashes -r test-requirements.txt
sudo apt install lsof
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lsof is available on Mac and under Linux, and therefore seemed like the best fit to perform a cross-platform test for exclusive lock access. It is, however, not installed in this particular CircleCI container by default.

- marker_start: "### BEGIN securedrop-workstation ###"
- marker_end: "### END securedrop-workstation ###"
- content: |
0 * * * * {{gui_user}} DISPLAY=:0 /opt/securedrop/launcher/sdw-notify.py
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DISPLAY=:0 environment varialbe is necessary to display a graphical warning from a cron job without a display.

# warning threshold.
dom0-crontab-update-notify:
file.blockreplace:
- name: /etc/crontab
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cron.hourly will run as root, so modifying /etc/crontab appears to be the cleanest way to run as the gui_user, which is necessary in order to run with the same permissions as the preflight updater.

@eloquence eloquence force-pushed the uptime-notification-squashed branch from 3ee6bbd to 59fdef4 Compare February 8, 2020 01:24
@eloquence eloquence marked this pull request as ready for review February 8, 2020 01:46
@emkll emkll self-requested a review February 10, 2020 16:42
Copy link
Contributor

@emkll emkll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pierwill and @eloquence, this is great work. I've gone through the test plan and a visual review. I think that prior to merge, we should address the following points. Let me know what you think!

  1. The only option presented to a user in this notification is to acknowledge via the OK button. In order to better guide the user into the next steps, it may be a good idea to allow the user to open the launcher/updater directly from this notification. If the resulting diff is small, I propose we introduce this change here. If it's a larger change, I think filing a follow-up issue is sufficient for the purposes of this PR.
  2. Test coverage is not complete (100%) for Notify.py and Updater.py. (there was a comment inline re: coverage, perhaps my reply will help resolve). We should ensure 100% coverage in those two files, as they have somewhat complex logic and will help reduce the likelihood of regressions in the future. Let me know if you would like help to write further tests to complete coverage.
  3. The handling of the case where sdw-last-updated flag is not present is insufficiently defensive. Asking the user to update in the absence of sdw-last-updated flag may be more prudent here. We could open the updater at the end of running make all to ensure everything is updated (and the flag is dropped), or populate the flag on install. The approach where the flag is absent and won't prompt will fail open, should the updater never run.
  4. Related to point 3 above, the case where the timestamp is invalid will also fail open:
2020-02-10 15:48:56,459 - sdw_notify.Notify:110(is_update_check_necessary) ERROR: Data in /home/m/.securedrop_launcher/sdw-last-updated not in the expected format. Expecting a timestamp in format '%Y-%m-%d %H:%M:%S'.

In other words, for 3 and 4, I think that the absence or the invalidity of a timestamp should trigger the notification/alert. This will make the logic fail closed, and the updater will repopulate a valid flag after running.

  1. You make a great point re: rpm permissions on updater scripts. While this should be mitigated by Enforce dom0 salt state as part of preflight updates #427, it would be safer to specify these in the rpm spec in case there's an edge-case where the config is not applied, and an update is needed (the updater will no longer work). Let's update this as part of this PR since we are touching/testing these parts of the code, if you don't mind (https://github.com/freedomofpress/securedrop-workstation/pull/445/files#r376673672)

Otherwise, reporting a successful test plan 🎉
sdw-notify

Functional testing

Thank you for the comprehensive test plan. Functional test plan worked as expected.

Installation:

  • Check out this PR in your development VM
  • make clone && make prep-dom0 in dom0

Regression testing:

  • Test that the preflight updater is still working as expected by clicking the "SecureDrop" desktop icon
  • Test that the preflight updater still creates logs in ~/.securedrop_launcher/launcher.log

Testing new preflight updater functionality:

  • Test that you cannot run another preflight updater while it is already running

Testing new notification functionality:

Before running any commands, I recommend running watch tail -n 25 ~/.securedrop_launcher/logs/sdw-notify.log in a separate terminal. All notable events below should produce log entries.

  • Test that running /opt/securedrop/launcher/sdw-notify.py has the expected results for your current workstation state per the behavior design doc
  • Test that setting the timestamp written to ~/.securedrop_launcher/sdw-last-updated to a date well in the past and running /opt/securedrop/launcher/sdw-notify.py causes the notification to appear, if the system has been up for more than 30 minutes (uptime grace period).
    • You can manually edit UPTIME_GRACE_PERIOD in /opt/securedrop/launcher/sdw_notify/Notify.py to force a run.
  • Test that setting the timestamp written to ~/.securedrop_launcher/sdw-last-updated to today and running /opt/securedrop/launcher/sdw-notify.py causes the notification to not appear.
    • Ensure that the uptime grace period has elapsed or is set to 0.
  • Test that removing the timestamp ~/.securedrop_launcher/sdw-last-updated and running /opt/securedrop/launcher/sdw-notify.py with uptime lower than 5 days causes the notification to not appear.
    ❗ The issue i see with this approach is that it fails open: if the updater never runs, these notifications will never display, if the uptime always remains < 5 days.
  • Test that removing the timestamp ~/.securedrop_launcher/sdw-last-updated and running /opt/securedrop/launcher/sdw-notify.py with uptime higher than 5 days causes the notification to appear.
    • You can manually edit WARNING_THRESHOLD in /opt/securedrop/launcher/sdw_notify/Notify.py to force a run.
  • Test that all of the above produces log entries.
  • Test that the cron job runs as expected (easiest by modifying the 0 to * in /etc/crontab to change it from "every hour" to "every minute") - also let it run for a bit and inpected log

@@ -0,0 +1,35 @@
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks for doing this. There uptime, lockfile, but I think this definitely sufficient for merge in this iteration. If we were to move further (non-logging) logic here, we should have test coverage here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take a stab at refactoring the lockfile logic into the util library to further reduce duplication and increase testability.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now done.

show_update_warning()


def show_update_warning():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given how straightforward the QT logic is here, I think it's fine to leave as is, but if ever we introduce further complexity/stying here, I would recommend we decouple the main from the QT code, and simply instanciate the QT app object in the main, and handle all other QT tasks in a separate file/class


# The file and format that contains the timestamp of the last successful update
LAST_UPDATED_FILE = os.path.join(BASE_DIRECTORY, "sdw-last-updated")
LAST_UPDATED_FORMAT = "%Y-%m-%d %H:%M:%S"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: using the variable from Updater.py or storing in Utils.py could help reduce duplication here

p = Pool(processes=1)
lock_result = p.apply(notify.can_obtain_updater_lock)
p.close()
assert lock_result is False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here it would be fine to mock open or lockf and have IOException (not tested, but as example, using the following decorators :

@mock.patch("builtins.open", side_effect=IOError())
and
@mock.patch("fcntl.lockf", side_effect=IOError())

depending on which part of the lock logic you want to test

@@ -12,7 +12,7 @@
path_to_script = os.path.join(
os.path.dirname(os.path.abspath(__file__)), relpath_updater_script
)
updater = imp.load_source("Updater", path_to_script)
updater = SourceFileLoader("Updater", path_to_script).load_module()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for squashing this deprecation warning 👍

@eloquence
Copy link
Member Author

Thanks for the detailed review! A couple of quick responses:

In other words, for 3 and 4, I think that the absence or the invalidity of a timestamp should trigger the notification/alert.

That makes sense, let's just show the warning immediately in those two cases without a grace period, because they should not occur on a normal system, and we don't want to delay letting the user know that something is wrong.

I really like your suggestion of populating sdw-last-updated during the installation, that would avoid showing the warning if the user waits a little bit after the install, and would also ensure the updater doesn't have to run immediately again after the install once #402 is merged. I think that can be done as a separate and independent change.

The only option presented to a user in this notification is to acknowledge via the OK button. In order to better guide the user into the next steps, it may be a good idea to allow the user to open the launcher/updater directly from this notification.

I completely agree; I deliberately kept the notification super simple for this first iteration, but I think we should tweak it a bit:

  1. vary the text of the notification based on the actual system state
  2. give the user an option to launch the preflight updater if the client is not already running

The additional logic needed for 2) makes me think this is a good change for a follow-up PR (plus I'm not versed in PyQt yet, so it may be faster for someone else to take this on).

@eloquence
Copy link
Member Author

I really like your suggestion of populating sdw-last-updated during the installation, that would avoid showing the warning if the user waits a little bit after the install

Thinking this through a bit more, there are a couple of issues with that approach:

  1. Right now we're not enforcing updates to dom0 during the provisioning (it's just recommended in the README).
  2. I think to be able to trust the state of sdw-last-updated it makes sense that it should only ever be written by a single source of truth (preflight updater), so we don't potentially have to deal with inconsistencies.
  3. Even if we write this during the provisioning, the notification might pop up if the cronjob happens to run first.
  4. It's probably a good idea to recommend to anyone installing the workstation that they run the preflight updater immediately after the installation, since it's such a crucial system component. So I'm not sure we want to suppress it by marking the system as up-to-date during provisioning.

This is IMO beyond the scope of this PR, but what do y'all think about storing the installation date in a file like sdw-install-date as a final step in a make all run / securedrop-admin --apply? Then the notifier could stay silent if that file doesn't exist (because it indicates that the installation is still in progress, or broken). Recording this date could also be useful for debugging. Unless there's an easy way to infer that the installation is complete?

@eloquence eloquence force-pushed the uptime-notification-squashed branch 4 times, most recently from 894223b to 2893f56 Compare February 11, 2020 02:01
@eloquence eloquence force-pushed the uptime-notification-squashed branch from 2893f56 to 8c500c9 Compare February 11, 2020 05:20
@eloquence
Copy link
Member Author

eloquence commented Feb 11, 2020

Re: mocking the lock (lock-mocking?):

I think here it would be fine to mock open or lockf and have IOException (not tested, but as example, using the following decorators :

That's a great point, I think it's reasonable to treat fcntl itself as reliable and focus on testing the business logic here. Done in e95e4c4 for the existing test.

# To ensure forward-compatibility of RPMs regardless of updates to the system
# Python, we disable the creation of bytecode at build time via the build
# root policy.
%undefine py_auto_byte_compile
Copy link
Member Author

@eloquence eloquence Feb 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the change discussed in #449 (comment) and subsequent comments.

"""
with TemporaryDirectory() as tmpdir, \
mock.patch("Notify.LOCK_FILE_NOTIFIER",
os.path.join(tmpdir, "sdw-notify.lock")):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: I'm using this pattern to patch the global variable, to avoid unintended side effects either across tests or in a single test; please let me know if there's a preferable pattern for the same result.

@eloquence
Copy link
Member Author

eloquence commented Feb 12, 2020

Recap of recent changes:

  • We're at 100% test coverage for Updater, Util and Notify
  • RPM packaging now handles permissions for executables and disables BRP that implicitly generates *.pyc and *.pyo files
  • Util now includes all lockfile handling
  • We (should) now show a warning when there's no timestamp file
  • We (should) now show a warning when the timestamp file is in the wrong format.

I've not fully re-tested yet, but you're welcome to take a spin through it again @emkll; I'll officially move it to "Ready" again once I've done another test through myself. Thanks for all the comments so far!

Copy link
Contributor

@emkll emkll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @eloquence for the changes, they look good to me. I have reviewed your changes locally, went though the test plan and did a visual review, all works as expected, and addresses comments 2,3,4,5. Happy to open a ticket for 1), as it would be nice to address as part of the beta milestone (pending freedomofpress/securedrop-ux#99).

Prior to merge, could you please run the black formatter on launcher/ subdirectory

@eloquence
Copy link
Member Author

Prior to merge, could you please run the black formatter on launcher/ subdirectory

Done. Will re-test now, as well, and await Conor's visual review.

@eloquence eloquence force-pushed the uptime-notification-squashed branch from 7daf102 to 3a42dd3 Compare February 12, 2020 18:20
@eloquence
Copy link
Member Author

(Rebased to latest upstream master.)

@eloquence
Copy link
Member Author

Changed text of warning slightly, to "This computer has not been checked for security updates recently." This is so it's accurate under all circumstances when the warning is shown.

(As discussed above, additional logic for launching the updater or showing more specific messages can be handled in follow-up changes.)

@eloquence
Copy link
Member Author

eloquence commented Feb 12, 2020

I've retested this in Qubes and it works as expected. @conorsch, now would be a good time for a visual review (I don't think you need to do a full functional test since both @emkll and I have done so extensively, but up to you of course).

Prior to merge, I will squash the changes once more.

pierwill and others added 13 commits February 13, 2020 08:47
This script checks whether updates have been applied recently enough
using the preflight updater (AKA launcher) that is part of the
SecureDrop Workstation. "Recently enough" is defined as follows:

- If the updater has never successfully run, uptime does not exceed a
  warning threshold.
- If the updater has run successfully, it is not longer ago than the
  warning threshold. However, we allow for a grace period.

In this commit, the warning threshold is set to 5 days, and the grace
period to 30 minutes.

The script runs as a cron job and displays a PyQt warning if warranted
according to the above criteria. Lock files stored in /run/user are
used to avoid showing the warning multiple times, or while the
preflight updater is running.

As part of this commit, code from sdw-launcher has been reorganized,
to allow for shared use in a utility library by other system
components like sdw-notify.

This commit represents a collaboration between @pierwill and @eloquence.
- Show warning if we have no timestmap
- Show warning if timestamp is in unexpected format
@eloquence eloquence force-pushed the uptime-notification-squashed branch from 245e083 to 453cbe8 Compare February 13, 2020 16:48
emkll
emkll previously approved these changes Feb 13, 2020
Copy link
Contributor

@emkll emkll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, this looks good to merge from my perspective. Approving but not merging, as @conorsch wanted to review as well.

Copy link
Contributor

@conorsch conorsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performed visual review. Overall code is clean and well commented. Would like to shore up the RPM packaging strategy to treat the Python code as proper system-level modules, but this PR isn't the right time for that.

Noticed some formatting errors in the tests, where regular expressions are only partially validated, due to string concatenation quirks. Once those are cleaned up, fine for merge!


relpath_notify = "../sdw_notify/Notify.py"
path_to_notify = os.path.join(os.path.dirname(os.path.abspath(__file__)), relpath_notify)
notify = SourceFileLoader("Notify", path_to_notify).load_module()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The path insertions aren't ideal. Since we have an RPM package now, we can install the Python modules at the system level and import them directly. Let's not do that now, especially with #450 still pending review. Not critical pre-pilot, but would make the updater/notifier code much more maintainable.

UPDATER_WARNING_REGEX = (
r"^Last successful update \(.* hours ago\) is above warning threshold "
)
r"\(.* hours\). Uptime grace period of .* hours has elapsed (uptime: .* hours)."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like L30 & L31 should be swapped. As written, UPDATER_WARNING_REGEX is resolving to the first half of the pattern. The problem of strings-not-concatenating occurs throughout test_notify.py, will comment in-line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes, great flag. black helpfully inserted the parentheses, which should have made it obvious to me that the concatenation was wrong. 10003bd should fix these concatenations throughout.

Sometimes Python makes me miss semicolons. ;-)

# Regex for info log if we've updated too long ago, but grace period still ticking
GRACE_PERIOD_REGEX = r"Last successful update \(.* hours ago\) is above "
r"warning threshold \(.* hours\). Uptime grace period of .* hours has not elapsed "
r"yet \(uptime: .* hours\)."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String parts aren't concatenating, try:

# Regex for info log if we've updated too long ago, but grace period still ticking
GRACE_PERIOD_REGEX = (
    r"Last successful update \(.* hours ago\) is above "
    r"warning threshold \(.* hours\). Uptime grace period of .* hours has not elapsed "
    r"yet \(uptime: .* hours\)."
)

NO_WARNING_REGEX = (
r"Last successful update \(.* hours ago\) is below the warning threshold "
)
r"\(.* hours\)."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, strings are not concatenating; try:

# Regex for info log if we've updated recently enough
NO_WARNING_REGEX = (
    r"Last successful update \(.* hours ago\) is below the warning threshold "
    r"\(.* hours\)."
)

@conorsch conorsch merged commit 6b41516 into freedomofpress:master Feb 13, 2020
cfm pushed a commit that referenced this pull request Apr 1, 2024
Add security notification script sdw-notify
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Display system alert recommending shutdown after 5 days of activity
4 participants