Mismatched italics syntax causes template filter to fail sometimes #202

mzmcbride · 2018-08-30T04:16:03Z

When a page contains mismatched italics syntax, the .filter_templates() method sometimes fails to parse the page properly.

Sample script:

#! /usr/bin/env python

import mwparserfromhell

def parse_text(case_text):
    parsed_page_text = mwparserfromhell.parse(case_text)
    print(len(parsed_page_text.filter_templates()))
    for template in parsed_page_text.filter_templates():
        print(template.name.strip())

case_text = """\
{{Infobox SCOTUS case
  |FullName=''[[et vir]]'
}}

'''''Hello there'''''
"""

parse_text(case_text)

case_text = """\
{{Infobox SCOTUS case
  |FullName=''[[et vir]]''
}}

'''''Hello there'''''
"""

parse_text(case_text)

case_text = """\
{{Infobox SCOTUS case
  |FullName=''[[et vir]]'
}}
"""

parse_text(case_text)

Of note: |FullName=''[[et vir]]' is mismatched (two opening apostrophes, one closing) in the first and third cases.

Current buggy output:

0
1
Infobox SCOTUS case
1
Infobox SCOTUS case

Expected output:

1
Infobox SCOTUS case
1
Infobox SCOTUS case
1
Infobox SCOTUS case

lahwaacz · 2018-08-30T07:36:13Z

Duplicate of #40. Use mwparserfromhell.parse() with skip_style_tags=True as a workaround.

earwig added the result: duplicate label Aug 31, 2018

earwig closed this as completed Aug 31, 2018
