Issue 639 prints in cdxj #640

anatoly-scherbakov · 2020-04-27T14:01:20Z

No description provided.

Update from main repo

anatoly-scherbakov · 2020-04-27T14:05:25Z

Closes #639

codecov · 2020-04-27T14:32:27Z

Codecov Report

Merging #640 into master will increase coverage by 2.18%.
The diff coverage is 55.84%.

@@            Coverage Diff             @@
##           master     #640      +/-   ##
==========================================
+ Coverage   27.07%   29.26%   +2.18%     
==========================================
  Files           7       10       +3     
  Lines        1241     1244       +3     
  Branches      190      184       -6     
==========================================
+ Hits          336      364      +28     
+ Misses        880      857      -23     
+ Partials       25       23       -2

Impacted Files	Coverage Δ
ipwb/__main__.py	`0.00% <0.00%> (ø)`
ipwb/settings.py	`0.00% <0.00%> (ø)`
ipwb/replay.py	`14.06% <22.22%> (+0.26%)`	⬆️
ipwb/indexer.py	`50.19% <66.66%> (-0.19%)`	⬇️
ipwb/util.py	`43.71% <80.76%> (+8.62%)`	⬆️
ipwb/error_handler.py	`100.00% <100.00%> (ø)`
ipwb/exceptions.py	`100.00% <100.00%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4fca791...16bbb8f.

ipwb/__main__.py

ipwb/check_for_update.py

machawk1 · 2020-04-27T17:45:45Z

ipwb/util.py

-def logError(errIn):
-    print(errIn, file=sys.stderr)
+    except Exception as e:
+        raise Exception('Unknown error in retrieving daemon status.') from e


Will the removal of logError(sys.exc_info()[0]) provide a sufficient traceback for debugging?

Here is an example:

In [1]: try:
   ...:     raise ValueError('Ugly internal error, GRRRRR!!!')
   ...: except ValueError as err:
   ...:     raise Exception('Nice user-friendly error :)') from err
   ...:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-8ed378f4cf49> in <module>
      1 try:
----> 2     raise ValueError('Ugly internal error, GRRRRR!!!')
      3 except ValueError as err:

ValueError: Ugly internal error, GRRRRR!!!

The above exception was the direct cause of the following exception:

Exception                                 Traceback (most recent call last)
<ipython-input-1-8ed378f4cf49> in <module>
      2     raise ValueError('Ugly internal error, GRRRRR!!!')
      3 except ValueError as err:
----> 4     raise Exception('Nice user-friendly error :)') from err
      5

Exception: Nice user-friendly error :)

As you can see, the user will see the error message you supplied in raise, together with its stack trace. If the user scrolls up a bit, they will see the original exception properly preserved.

Ok, I am convinced. PEP 415's raise exc from None will also help suppress context, which is useful for development but noisy for the end user, who should just be told directly what is wrong. I am fine using the pattern provided in this PR.
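The two chaining forms mentioned in this thread can be compared in a minimal sketch. The function names below are hypothetical; only the raise ... from ... syntax and the __cause__, __context__, and __suppress_context__ attributes come from Python itself.

```python
def read_status():
    """Hypothetical low-level call that fails."""
    raise ValueError('Ugly internal error')


def chained():
    # `from err` records the original exception as __cause__,
    # so a printed traceback shows both errors.
    try:
        read_status()
    except ValueError as err:
        raise Exception('Nice user-friendly error') from err


def suppressed():
    # `from None` (PEP 415) hides the original exception when the
    # traceback is printed; it is still kept in __context__.
    try:
        read_status()
    except ValueError:
        raise Exception('Nice user-friendly error') from None


def capture(func):
    """Run func and return the exception it raises (None if it does not raise)."""
    try:
        func()
    except Exception as exc:
        return exc


chained_exc = capture(chained)
suppressed_exc = capture(suppressed)
print(repr(chained_exc.__cause__))
print(suppressed_exc.__cause__, suppressed_exc.__suppress_context__)
```

With from err the original error stays visible for debugging; with from None only the friendly message is printed, although the original remains reachable through __context__.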

machawk1 · 2020-04-27T17:47:00Z

ipwb/util.py


+    except OSError:
+        raise Exception(


Is raising an Exception within the catch of an exception good practice? Prior implementation exited when this occurred but I do not see this same behavior.

The prior implementation exited; an unhandled Exception causes the program to do the same. So, no fundamental difference. But, using raise ... from ... prints the original stacktrace without any extra actions - just using the language's built-in mechanisms. That's why I prefer it to explicit sys.exit().

Why re-raise the exception, though? For the same reason that probably motivated catching these exceptions in the first place, I believe: to provide a friendlier error message to the user.

A more proper way would be to create custom exception classes and catch them at an upper level of the application, converting them to readable messages, but I believe this can be implemented a bit later.

ipwb/util.py

ibnesayeed

Some functions, such as isDaemonAlive, now raise exceptions instead of logging errors inline, which is great. However, the exceptions are not caught where these functions are called, which may result in an ugly stack trace for end users. Should this be rectified or am I missing something here?

ibnesayeed · 2020-04-27T17:53:27Z

Looks like Mat and I were looking at the same code at the same time. 😄

anatoly-scherbakov · 2020-04-28T05:21:24Z

I do not believe unhandled exceptions are inherently a bad thing. What I believe needs to be done is creating custom exceptions for the various situations specific to the system, and handling them all in one place.

Something like this:

@dataclasses.dataclass(frozen=True)
class IndexNotFound(Exception):
    path: str
    backend: str

    def __str__(self):
        return f'The CDXJ index file {self.path} was not found by {self.backend} backend. Please check the correctness of the path.'
...

try:
    return replay(path)

except (IndexNotFound, InvalidIndex, ...) as err:
    if verbose:
        # logger.exception() requires a message argument
        logger.exception(str(err))
    else:
        logger.warning(str(err))
    return

But this is tightly coupled with backends system because most of the operations with the index which can cause exceptions are probably going to be in the backend.
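A runnable version of the idea above might look like the sketch below. It is only an illustration: the replay() stub, the InvalidIndex fields, the run() wrapper, and the logger name are assumptions, not code from this PR.

```python
import dataclasses
import logging

logger = logging.getLogger('ipwb_sketch')  # hypothetical logger name


@dataclasses.dataclass(frozen=True)
class IndexNotFound(Exception):
    path: str
    backend: str

    def __str__(self):
        return (
            f'The CDXJ index file {self.path} was not found by '
            f'{self.backend} backend.'
        )


@dataclasses.dataclass(frozen=True)
class InvalidIndex(Exception):
    path: str  # assumed field, for illustration only

    def __str__(self):
        return f'The CDXJ index file {self.path} is not valid.'


def replay(path):
    """Hypothetical stand-in for the real replay entry point."""
    raise IndexNotFound(path=path, backend='local')


def run(path, verbose=False):
    """Handle all known errors in one place; return None on failure."""
    try:
        return replay(path)
    except (IndexNotFound, InvalidIndex) as err:
        if verbose:
            # Logs the message together with the traceback.
            logger.exception(str(err))
        else:
            logger.warning(str(err))
        return None
```

Calling run('missing.cdxj') logs a one-line warning, while run('missing.cdxj', verbose=True) logs the same message followed by the traceback.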

ibnesayeed · 2020-05-02T17:38:40Z

I am happy with raising custom exceptions if generic exception messages are too vague and a custom one can provide a more context-specific message. However, I am against leaving known exceptions unhandled and letting the program crash. There are situations where the program must exit/terminate on specific exceptions, but those exits should be performed after catching those exceptions. There is a thin line between an application exit and an application crash; unhandled exceptions suggest the latter. While a stack trace is helpful for development, it is not welcome to end users. Besides, if we know the cause of certain exceptions because we raised them, then we do not need to see the stack trace. Unknown issues should be the ones that show a stack trace and get handled after being discovered. I have no issues with having a top-level exception handler, as long as it is specific enough to not mask previously unknown exceptions.
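One way to read this requirement is a top-level handler that catches only a dedicated base class, exits deliberately for those errors, and lets everything else propagate with its traceback. The sketch below assumes hypothetical names (IPWBError, DaemonNotRunning, run_cli); none of them are taken from the PR.

```python
import sys


class IPWBError(Exception):
    """Hypothetical base class for errors the application raises itself."""


class DaemonNotRunning(IPWBError):
    """Hypothetical known error with a clear cause."""


def run_cli(main):
    """Run main(); exit cleanly on known errors, let unknown ones through."""
    try:
        return main()
    except IPWBError as err:
        # Known cause: print the message only and exit with a non-zero status.
        print(f'Error: {err}', file=sys.stderr)
        sys.exit(1)
    # Anything that is not an IPWBError propagates with its full traceback.


def failing_main():
    raise DaemonNotRunning('Daemon is not running')


def buggy_main():
    raise KeyError('unexpected')
```

Here run_cli(failing_main) is an intentional exit with status 1, whereas run_cli(buggy_main) still surfaces the KeyError, so previously unknown problems are not masked.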

anatoly-scherbakov · 2020-05-04T15:09:15Z

I will look into implementing some custom exceptions and their printouts in the next few days.

Rebase

Rebase on oduwsdl master

Issue 639 prints in cdxj

update from oduwsdl/master

anatoly-scherbakov · 2020-07-05T17:34:17Z

I again fell into the same pit of accumulating changes, so let me try to describe them in detail.

Communicating errors using exceptions

I added a wrapper called exception_handler(), which is used as a decorator around def main(). It catches unhandled exceptions. When it catches an exception, it does one of two things, depending on the DEBUG environment variable.

If DEBUG is False (the default), we will see only a logging message, like this:

$ ipwb index ~/projects/covid-in-russia/warc/stopcoronavirus-russia.1588018596.cdxj
2020-07-05 23:51:43,336 [CRITICAL] ipwb: Daemon is not running at: /dns/localhost/tcp/5001/http

But we can also get the full traceback:

$ DEBUG=true ipwb index ~/projects/covid-in-russia/warc/stopcoronavirus-russia.1588018596.cdxj
# -- snip --
...exceeded with url: /api/v0/id?stream-channels=true (Caused by NewConnectionError('<ipfshttpclient.requests_wrapper.HTTPConnection object at 0x7f7bba80f4f0>: Failed to establish a new connection: [Errno 111] Connection refused'))

The above exception was the direct cause of the following exception:

# -- snip --

    raise Exception(f'Daemon is not running at: {daemonMultiaddr}') from err
Exception: Daemon is not running at: /dns/localhost/tcp/5001/http

The DEBUG variable also increases the logging level to ease debugging.
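The behaviour described above could be sketched roughly as follows. This is not the PR's implementation: the way DEBUG is parsed, the logger name, and the decorator body are assumptions made for illustration.

```python
import functools
import logging
import os

logger = logging.getLogger('ipwb')


def debug_enabled():
    """Treat DEBUG=1/true/yes (case-insensitive) as on; this parsing is an assumption."""
    return os.environ.get('DEBUG', '').lower() in ('1', 'true', 'yes')


def exception_logger(func):
    """Log unhandled exceptions from func, or re-raise them when DEBUG is on."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as err:
            if debug_enabled():
                raise  # full traceback, including the chained cause
            logger.critical(str(err))
    return wrapper


@exception_logger
def main():
    raise Exception('Daemon is not running at: /dns/localhost/tcp/5001/http')
```

With DEBUG unset, main() logs a single CRITICAL line and returns None; with DEBUG=true the same call re-raises, so the interpreter prints the traceback.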

IPFS client

After adding commits, codecov was unhappy about their coverage. I decided to try improving the situation by adding some tests, and when that didn't help, I went on to try simplifying a few things. The fewer the lines, the better the coverage.

The createIPFSClient() function is renamed to ipfs_client(). It does what the original function did: it creates the client.

However, it is also wrapped in the @lru_cache decorator, which means that the client instance is cached. This removes the need for the global IPFS_API variables, so I removed them.
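The caching effect can be illustrated with a small sketch. FakeIPFSClient and the default address argument are hypothetical stand-ins; only functools.lru_cache is real.

```python
from functools import lru_cache


class FakeIPFSClient:
    """Hypothetical stand-in for a real IPFS client object."""

    def __init__(self, address):
        self.address = address


@lru_cache(maxsize=None)
def ipfs_client(address='/dns/localhost/tcp/5001/http'):
    # The body runs once per distinct argument; later calls return
    # the cached instance, so no module-level global is needed.
    return FakeIPFSClient(address)


first = ipfs_client()
second = ipfs_client()
print(first is second)
```

In tests, ipfs_client.cache_clear() resets the cached instance, which keeps one test's client from leaking into the next.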

...and only at that point did I realize that my goal of passing coverage barrier should have been achieved by just writing tests for exception_handler() - and I got to write them, which seems to finally make codecov happy.

anatoly-scherbakov · 2020-07-05T17:53:06Z

@machawk1 fyi :)

ibnesayeed

These are commendable contributions, @anatoly-scherbakov; kudos for taking care of this. I have a couple of minor suggestions and a concern which I would like to hear your thoughts on.

ipwb/util.py

ipwb/__main__.py

ibnesayeed · 2020-07-05T21:01:48Z

ipwb/util.py

+    except IOError as err:
+        raise Exception(
+            'IPFS config not found. Have you installed ipfs and run ipfs init?'
+        ) from err


Since the sys.exit() call is removed, and the wrapper exception handler will simply log the exception message when DEBUG is set to false, how will the process terminate? There is no point in keeping the service alive in an undefined state when critical pieces are missing, whether DEBUG is on or off. DEBUG should make logging more verbose with a stack trace, but it should not change the process termination behavior. The current implementation, if I understand it correctly, terminates the process by raising the exception when DEBUG is on, even when the exception is rather innocent and can be handled, and does not terminate the process even on critical exceptions when DEBUG is off. Is this the case or am I missing something here?

@ibnesayeed let me describe the call sequence:

if __name__ == "__main__": main()

seemingly calls main(), but actually calls wrapper(*args, **kwargs) (see the implementation of exception_logger).

Then, wrapper calls the original main() inside a try - except block.

Now what happens if an unhandled exception pops out somewhere in main() or one of the functions it calls? Being unhandled, that exception pops up through the call stack, reaches main(), passes it and arrives at wrapper().

The execution of main() at this point is already interrupted by the propagating exception. wrapper() can now either re-raise the exception or just log it. That does not matter because, once its except clause has done its work, the wrapper() function returns and the process stops.

At least it should if everything works as I believe it does.

@anatoly-scherbakov I understand how wrapper/decorator functions work in Python. I have actually used them heavily in an HTTP testing framework I wrote a while ago. My concern here is a little different. Not every exception needs to be reraised to bubble out of the main and let the process die. For example, if the system is configured to use a local IPFS node (which is the case for now as the code needs some changes to decouple IPFS from IPWB) and the IPFS daemon is not accessible because the binary is not installed on the system then the process should die because there is no hope it will ever be accessible. However, if the daemon is not accessible because it is not running, but is installed, then we may not want to let the process die, because the admin control panel has an option to attempt to start/stop IPFS daemon. In the earlier code, some exceptions were logging a message and returning None while other exceptions were logging a message and exiting. I am not sure if that behavior is still preserved. I think it is great that we have a catch-all wrapper exception handler, but we may want to handle some exceptions inline and take appropriate actions instead of bubbling every exception up with a custom message.
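The distinction drawn here, handling a recoverable error inline while letting a fatal one bubble up, could look like the sketch below. IPFSNotInstalled, check_daemon(), and daemon_status() are hypothetical names; IPFSDaemonNotAvailable is the exception name used in this thread, but its definition here is only a placeholder.

```python
class IPFSNotInstalled(Exception):
    """Hypothetical fatal error: the ipfs binary is missing."""


class IPFSDaemonNotAvailable(Exception):
    """Recoverable error: the daemon is installed but not running."""


def check_daemon(installed, running):
    """Hypothetical check that raises one of the two errors above."""
    if not installed:
        raise IPFSNotInstalled('ipfs binary not found')
    if not running:
        raise IPFSDaemonNotAvailable('daemon is not running')
    return True


def daemon_status(installed, running):
    """Handle the recoverable case inline; let the fatal one propagate."""
    try:
        return check_daemon(installed, running)
    except IPFSDaemonNotAvailable:
        # The admin panel can still offer to start the daemon.
        return False
```

A caller such as a replay view can then keep serving when daemon_status() returns False, while IPFSNotInstalled still reaches the top-level handler and ends the process.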

Never mind, I think the custom exception IPFSDaemonNotAvailable is handled inline in the replay. This looks good to me now. Thanks @anatoly-scherbakov for your valuable contributions. We look forward to seeing more from you.

e → err Co-authored-by: Sawood Alam <ibnesayeed@gmail.com>

f-strings forever! Co-authored-by: Sawood Alam <ibnesayeed@gmail.com>

ibnesayeed

@machawk1 please merge this.

ibnesayeed · 2020-07-06T19:48:33Z

ipwb/util.py

+    except IOError as err:
+        raise Exception(
+            'IPFS config not found. Have you installed ipfs and run ipfs init?'
+        ) from err


Never mind, I think the custom exception IPFSDaemonNotAvailable is handled inline in the replay. This looks good to me now. Thanks @anatoly-scherbakov for your valuable contributions. We look forward to seeing more from you.

machawk1 · 2020-07-06T19:55:05Z

Did some manual verification on my end as well. Moving to merge this into master. Thanks, @anatoly-scherbakov!

anatoly-scherbakov added 3 commits April 27, 2020 18:48

Merge pull request #2 from oduwsdl/master

f478ece

Update from main repo

Replace print() with a logging call

15abc94

Little cleanup, closes oduwsdl#639

2f2be09

check_for_update.py and better test coverage

5f47023

machawk1 mentioned this pull request Apr 27, 2020

ipwb index outputs unrelated information into CDXJ file #639

Closed

machawk1 reviewed Apr 27, 2020

ipwb/__main__.py (outdated)

ibnesayeed requested changes Apr 27, 2020

ipwb/check_for_update.py (outdated)

ipwb/check_for_update.py (outdated)

anatoly-scherbakov added 3 commits April 28, 2020 00:05

Removed the PyPI version check

9c39bf2

Remove logError()

501e381

A few reformatted imports

6ae17f2

anatoly-scherbakov requested review from machawk1 and ibnesayeed April 27, 2020 17:30

machawk1 reviewed Apr 27, 2020

ipwb/util.py

ibnesayeed reviewed Apr 27, 2020

machawk1 mentioned this pull request May 4, 2020

Next Release (planning) #644

Closed

2 tasks

machawk1 added 3 commits June 28, 2020 21:09

Merge branch 'issue-639-prints-in-cdxj' into issue-639-rebase

b38909b

Merge pull request #1 from oduwsdl/issue-639-rebase

422ed3f

Rebase

Merge pull request #3 from oduwsdl/master

b6a6ef4

Rebase on oduwsdl master

machawk1 mentioned this pull request Jul 1, 2020

Issue 639 prints in cdxj anatoly-scherbakov/ipwb#3

Merged

machawk1 and others added 3 commits July 1, 2020 10:39

Add missing json dependency

abfff15

Merge pull request #3 from machawk1/issue-639-prints-in-cdxj

2088f74

Issue 639 prints in cdxj

introduce error_handler() decorator

1ca49d7

anatoly-scherbakov added 11 commits July 5, 2020 22:29

Merge pull request #4 from oduwsdl/master

6184276

update from oduwsdl/master

Fixed conflicts

9780d0c

Fixed issues after merging conflicts

76d6063

Linter error fixed, docstring improved

9bfac3a

tests for check_daemon_is_alive()

a544698

ipfs_client() is now caching the instance of IPFS client

436e257

Fix tests

04aa90c

More tests to improve coverage

f460e8b

try-except clauses

e71715a

Tests for error_handler.exception_logger()

e5ad8ef

Make linter a bit happier

144d897

anatoly-scherbakov requested a review from ibnesayeed July 5, 2020 17:34

ibnesayeed reviewed Jul 5, 2020

anatoly-scherbakov and others added 2 commits July 6, 2020 10:56

Update ipwb/util.py

df09869

e → err Co-authored-by: Sawood Alam <ibnesayeed@gmail.com>

Update ipwb/__main__.py

16bbb8f

f-strings forever! Co-authored-by: Sawood Alam <ibnesayeed@gmail.com>

ibnesayeed approved these changes Jul 6, 2020

machawk1 merged commit 1575b1e into oduwsdl:master Jul 6, 2020

machawk1 mentioned this pull request Jul 7, 2020

Move ipfs daemon start/stop code from replay to util #65

Open
