-
Notifications
You must be signed in to change notification settings - Fork 300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Import with folders as tags #199
Conversation
@@ -20,11 +21,11 @@ class ImportResult: | |||
failed: int = 0 | |||
|
|||
|
|||
def import_netscape_html(html: str, user: User): | |||
def import_netscape_html(html: str, user: User, tags_from_folders: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tags_from_folders
can just be a boolean, also it would make sense to set a default value IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The form data for checkboxes is "on"
(true) / ""
(false). Does Django convert it into True
/False
if it's declared boolean
? I'm not that proficient in Django.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, my concern is more that this part of the application should be free from frontend semantics. It would be enough to convert into a boolean when calling this function: request.POST['tags_from_folders'] == 'on'
@@ -12,18 +14,81 @@ class NetscapeBookmark: | |||
tag_string: str | |||
|
|||
|
|||
class BookmarkParser(HTMLParser): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not OK with adding a second parser implementation to the codebase. This should either extend the current parser, or replace the existing one. Kind of like the approach here with HTMLParser
, the implementation might be easier to understand than the current one, and it handles the missing closing tags pretty well.
Also this needs tests. Unfortunately there are no existing tests for the parser itself, but test_importer.py
could be used to add test cases for converting folders to tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, going to re-implement the other parser with a subclass of HTMLParser
and add a test for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool - just to make sure, the goal is to have a single parser that can handle both cases. I think your existing parser could be parameterized to either create tags from folders, or not.
getattr(self, name)(data) | ||
|
||
def handle_start_dl(self, attrs: Dict[str, str]): | ||
print('<DL>') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove debug code, or use debug logging
@@ -53,7 +118,7 @@ def extract_description(tag): | |||
bookmark_tag.addParseAction(extract_bookmark) | |||
|
|||
|
|||
def parse(html: str) -> [NetscapeBookmark]: | |||
def parse(html: str) -> List[NetscapeBookmark]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiousity, why use List[...]
rather than [...]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To my understanding [...]
is incorrect. The brackets are not type annotations like in C/C++ (int vec[]
being an array of int
s) but sort of a parenthesis for specifying generic types, like in a C++ template specifier (template MyType<class T>
) where angle brackets serve the same purpose.
@@ -20,11 +21,11 @@ class ImportResult: | |||
failed: int = 0 | |||
|
|||
|
|||
def import_netscape_html(html: str, user: User): | |||
def import_netscape_html(html: str, user: User, tags_from_folders: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, my concern is more that this part of the application should be free from frontend semantics. It would be enough to convert into a boolean when calling this function: request.POST['tags_from_folders'] == 'on'
@@ -12,18 +14,81 @@ class NetscapeBookmark: | |||
tag_string: str | |||
|
|||
|
|||
class BookmarkParser(HTMLParser): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool - just to make sure, the goal is to have a single parser that can handle both cases. I think your existing parser could be parameterized to either create tags from folders, or not.
title=data, | ||
description=self.description, | ||
date_added=self.add_date, | ||
tag_string=','.join(self.tag_stack[:-1]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what's missing here is also applying the tags from the anchor tag itself, currently the explicit tags on the anchor tag are ignored and only the tags created from folders are added.
|
||
def handle_start_dl(self, attrs: Dict[str, str]): | ||
print('<DL>') | ||
self.tag_stack.append(None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part, and the related logic in handle_h3_data
and handle_a_data
seems a bit convoluted. I think it might be easier to just add a tag to the stack when encountering an <h3>
, and then to remove a tag from the stack when encountering and </dl>
- unless the stack is already empty.
This has been requested before, and I can see this being useful, so I might still work on this at some point in the future. The parser proposed in this PR has already been adopted in #261. |
Provide an option to have folders in bookmark imports serve as tags.