Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTMLParser: undocumented not implemented method #76025

Closed
srittau mannequin opened this issue Oct 23, 2017 · 10 comments
Closed

HTMLParser: undocumented not implemented method #76025

srittau mannequin opened this issue Oct 23, 2017 · 10 comments
Assignees
Labels
3.10 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@srittau
Copy link
Mannequin

srittau mannequin commented Oct 23, 2017

BPO 31844
Nosy @srittau, @ezio-melotti, @berkerpeksag, @csabella
PRs
  • bpo-31844: Remove _markupbase.ParserBase.error() #8562
  • bpo-31844: Move whatsnew note to 3.10.rst #21504
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ezio-melotti'
    closed_at = <Date 2020-07-16.06:39:25.857>
    created_at = <Date 2017-10-23.08:27:23.318>
    labels = ['type-bug', 'library', '3.10']
    title = 'HTMLParser: undocumented not implemented method'
    updated_at = <Date 2020-07-16.06:39:25.857>
    user = 'https://github.com/srittau'

    bugs.python.org fields:

    activity = <Date 2020-07-16.06:39:25.857>
    actor = 'berker.peksag'
    assignee = 'ezio.melotti'
    closed = True
    closed_date = <Date 2020-07-16.06:39:25.857>
    closer = 'berker.peksag'
    components = ['Library (Lib)']
    creation = <Date 2017-10-23.08:27:23.318>
    creator = 'srittau'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 31844
    keywords = ['patch']
    message_count = 10.0
    messages = ['304782', '304783', '305303', '305306', '322662', '322672', '323968', '371500', '373745', '373746']
    nosy_count = 5.0
    nosy_names = ['srittau', 'ezio.melotti', 'berker.peksag', 'cheryl.sabella', 'William Ayd']
    pr_nums = ['8562', '21504']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue31844'
    versions = ['Python 3.10']

    @srittau
    Copy link
    Mannequin Author

    srittau mannequin commented Oct 23, 2017

    HTMLParser derives from _markupbase.ParserBase, which has the following method:

    class HTMLParser:
    ...
    
        def error(self, message):
            raise NotImplementedError(
                "subclasses of ParserBase must override error()")

    HTMLParser does not implement this method and the documentation for HTMLParser (https://docs.python.org/3.6/library/html.parser.html) does not mention that its sub-classes need to override it.

    I am not sure whether this is a documentation omission, whether HTMLParser should provide an (empty?) implementation, or whether ParserBase should not raise a NotImplementedError (to make linters happy).

    @srittau
    Copy link
    Mannequin Author

    srittau mannequin commented Oct 23, 2017

    The quoted code above should have used ParserBase:

    class ParserBase:
    ...
    
        def error(self, message):
            raise NotImplementedError(
                "subclasses of ParserBase must override error()")

    @WilliamAyd
    Copy link
    Mannequin

    WilliamAyd mannequin commented Oct 31, 2017

    Would we be open to setting the meta class of the ParserBase to ABCMeta and setting error as an abstract method? That at the very least would make the expectation clearer for subclasses.

    I haven’t contributed to Python before but am open to this as a first attempt if the direction makes sense.

    @WilliamAyd
    Copy link
    Mannequin

    WilliamAyd mannequin commented Oct 31, 2017

    And assuming that subclass requirement is intentional we could add an optional keyword argument to the HTMLParser that indicates what to do with errors, much like how encoding issues are handled within codecs. For backwards compatibility it can default to ignore, but fail and warn could be two alternate approaches that the error method could account for

    @serhiy-storchaka serhiy-storchaka added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Oct 31, 2017
    @berkerpeksag
    Copy link
    Member

    HTMLParser.error() method was deprecated in Python 3.4 (88ebfb1#diff-1a7486df8279dbac7f20abd487947845R157) and removed in Python 3.5 (73a4359#diff-1a7486df8279dbac7f20abd487947845L171)

    _markupbase is a private and undocumented module and its only user is HTMLParser (sgmllib was removed from the stdlib in 2008) Since we already have removed HTMLParser.error(), I think we can just remove _markupbase.ParserBase.error() without a deprecation period.

    @berkerpeksag berkerpeksag added the 3.8 (EOL) end of life label Jul 30, 2018
    @srittau
    Copy link
    Mannequin Author

    srittau mannequin commented Jul 30, 2018

    Good call. Maybe it's actually time to retire _markupbase and merge ParserBase into HTMLParser.

    @berkerpeksag
    Copy link
    Member

    After triaging bpo-34480, I realized that we can't simply remove the error() method because the _markupbase.ParserBase() class still uses it. I've just closed PR 8562.

    @ezio-melotti ezio-melotti self-assigned this Aug 25, 2018
    @csabella
    Copy link
    Contributor

    @berker.peksag's last comment was he closed the PR on 23 August 2018. However, he reopened it on 6 January 2020 as @ezio.melotti mentioned that they are both needed.

    The PR for this issue is waiting to be re-reviewed by Ezio.

    @csabella csabella added 3.10 only security fixes and removed 3.8 (EOL) end of life labels Jun 14, 2020
    @berkerpeksag
    Copy link
    Member

    New changeset e34bbfd by Berker Peksag in branch 'master':
    bpo-31844: Remove _markupbase.ParserBase.error() (GH-8562)
    e34bbfd

    @berkerpeksag
    Copy link
    Member

    New changeset d4d127f by Berker Peksag in branch 'master':
    bpo-31844: Move whatsnew note to 3.10.rst (GH-21504)
    d4d127f

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    return42 added a commit to return42/searxng that referenced this issue Oct 18, 2023
    In python versions <py3.10 there is an issue with an undocumented method
    HTMLParser.error() [1][2] that was deprecated in Python 3.4 and removed
    in Python 3.5.
    
    To be compatible to higher versions (>=py3.10) an error method is implemented
    which throws an AssertionError exception like the higher Python versions do [3].
    
    [1] python/cpython#76025
    [2] https://bugs.python.org/issue31844
    [3] python/cpython#8562
    
    Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
    return42 added a commit to return42/searxng that referenced this issue Oct 19, 2023
    In python versions <py3.10 there is an issue with an undocumented method
    HTMLParser.error() [1][2] that was deprecated in Python 3.4 and removed
    in Python 3.5.
    
    To be compatible to higher versions (>=py3.10) an error method is implemented
    which throws an AssertionError exception like the higher Python versions do [3].
    
    [1] python/cpython#76025
    [2] https://bugs.python.org/issue31844
    [3] python/cpython#8562
    
    Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
    dalf pushed a commit to searxng/searxng that referenced this issue Oct 22, 2023
    In python versions <py3.10 there is an issue with an undocumented method
    HTMLParser.error() [1][2] that was deprecated in Python 3.4 and removed
    in Python 3.5.
    
    To be compatible to higher versions (>=py3.10) an error method is implemented
    which throws an AssertionError exception like the higher Python versions do [3].
    
    [1] python/cpython#76025
    [2] https://bugs.python.org/issue31844
    [3] python/cpython#8562
    
    Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants