This is an HTML template parser. It is a modified version of Python's HTMLParser library, expanded to handle template tags.
pip install html-template-parser
# or
poetry add html-template-parser
A basic usage example is remarkably similar to Python's HTMLParser:
from HtmlTemplateParser import Htp
from HtmlTemplateParser import AttributeParser


class MyAttributeParser(AttributeParser):
    def handle_starttag_curly_perc(self, tag, attrs, props):
        print("starttag_curly_perc", tag, attrs, props)

        # get the position of the element relative to the original html
        print(self.getpos())

        # get the original html text
        print(self.get_element_text())

    def handle_endtag_curly_perc(self, tag, attrs, props):
        print("endtag_curly_perc", tag, attrs, props)

    def handle_value(self, value):
        print("value", value)


class MyHTMLParser(Htp):
    def handle_starttag(self, tag, attrs):
        print("Encountered a start tag:", tag)
        print(self.getpos())

        # attrs is the raw attribute text; hand it to the attribute parser
        MyAttributeParser(attrs).parse()

    def handle_endtag(self, tag):
        print("Encountered an end tag :", tag)

    def handle_data(self, data):
        print("Encountered some data :", data)


parser = MyHTMLParser()
parser.feed('<html><head><title>Test</title></head>'
            '<body {% if this %}ok{% endif %}><h1>Parse me!</h1></body></html>')
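For comparison, here is a minimal sketch of the same handler pattern using only the standard library's `html.parser.HTMLParser`. The class name `StdlibParser` and the `events` list are illustrative choices, not part of either library. The visible difference is in `handle_starttag`: the standard library delivers `attrs` as a list of `(name, value)` pairs, whereas the `Htp` example above hands the attribute text to an `AttributeParser` subclass.

```python
from html.parser import HTMLParser


class StdlibParser(HTMLParser):
    """Records events in a list instead of printing them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives already split into (name, value) pairs
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


stdlib_parser = StdlibParser()
stdlib_parser.feed('<body class="main"><h1>Parse me!</h1></body>')
stdlib_parser.close()

for event in stdlib_parser.events:
    print(event)
```

The standard-library parser has no notion of template tags, which is the gap the template-aware handlers listed below are meant to fill.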
- comment
<!-- -->
- comment_curly_hash
{# data #}
- comment_curly_two_exlaim
{{! data }}
- starttag_comment_curly_perc
{% comment "attrs" %}
- endtag_comment_curly_perc
{% endcomment %}
- comment_at_star
@* data *@
- startendtag
< />
- starttag
<
- starttag_curly_perc
{% ... %}
- starttag_curly_two_hash
{{#...}}
- starttag_curly_four
{{{{...}}}}
- endtag
</...>
- endtag_curly_perc
{% end.. %}
- endtag_curly_two_slash
{{/...}}
- endtag_curly_four_slash
{{{{/...}}}}
- unknown_decl
- charref
- entityref
- data
- curly_two
{{ ... }}
- slash_curly_two
\{{ ... }}
- curly_three
{{{ ... }}}
- decl
- pi
Modifiers such as ~, !--, -, +, and > will show up as props on the tags.
Htp passes a tag's attributes to the handler as one complete string, which is then parsed with the attribute parser.
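The names in the list above map to handler methods by prefixing them with `handle_`, as in `handle_starttag_curly_perc` in the usage example. The sketch below illustrates only that naming convention with a toy dispatcher. `TOKEN`, `ToyDispatcher`, and `Recorder` are hypothetical names, the regex covers just three of the listed syntaxes, and the handlers here take a single argument rather than the library's `(tag, attrs, props)`. It is not how the library tokenizes input.

```python
import re

# Toy tokenizer for illustration only; the real parser handles far more cases.
TOKEN = re.compile(
    r"\{#(?P<hash>.*?)#\}"          # {# data #}
    r"|\{%\s*(?P<perc>.*?)\s*%\}"   # {% ... %} and {% end.. %}
    r"|\{\{(?P<two>.*?)\}\}",       # {{ ... }}
    re.S,
)


class ToyDispatcher:
    def feed(self, text):
        pos = 0
        for match in TOKEN.finditer(text):
            if match.start() > pos:
                self._dispatch("data", text[pos:match.start()])
            kind = match.lastgroup
            body = match.group(kind).strip()
            if kind == "hash":
                self._dispatch("comment_curly_hash", body)
            elif kind == "two":
                self._dispatch("curly_two", body)
            elif body.startswith("end"):
                self._dispatch("endtag_curly_perc", body)
            else:
                self._dispatch("starttag_curly_perc", body)
            pos = match.end()
        if pos < len(text):
            self._dispatch("data", text[pos:])

    def _dispatch(self, name, payload):
        # Look up handle_<name>; events without a handler are skipped.
        handler = getattr(self, "handle_" + name, None)
        if handler is not None:
            handler(payload)


class Recorder(ToyDispatcher):
    def __init__(self):
        self.events = []

    def handle_comment_curly_hash(self, body):
        self.events.append(("comment_curly_hash", body))

    def handle_starttag_curly_perc(self, body):
        self.events.append(("starttag_curly_perc", body))

    def handle_endtag_curly_perc(self, body):
        self.events.append(("endtag_curly_perc", body))

    def handle_curly_two(self, body):
        self.events.append(("curly_two", body))


recorder = Recorder()
recorder.feed("Hi {# note #}{% if this %}{{ name }}{% endif %}!")
print(recorder.events)
```

Because `Recorder` defines no `handle_data`, the plain text around the template tags is silently ignored, so a subclass only needs to implement the handlers it cares about.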