I have two infoboxes that look exactly the same to me, but I'm getting different behavior in mwparserfromhell. In the first instance I'm getting what I expect - the entire infobox is captured as a template object. In the second instance parts of the infobox are extracted as separate templates. This is confusing since the infoboxes look very similar to me, and I was expecting that the entire infobox could be extracted in the second case.
txt2 = """{{Infobox building
| name = Central Park Tower
| alternate_names = Nordstrom Tower
| image = Central Park Tower April 2020.jpg
| caption = Central Park Tower on April 25, 2020
| location = 225 [[57th Street (Manhattan)|West 57th Street]]<br/>[[Manhattan]], [[New York City]], [[New York (state)|New York]], [[United States|U.S.]]
| coordinates = {{coord|40.7663|-73.9810|type:landmark_globe:earth_region:US-NY|display=inline,title}}
| status = Topped Out
| start_date = 2014
| est_completion = 2020<ref name=curbed>{{cite news |author=Amy Plitt |url=https://ny.curbed.com/2017/6/1/15714666/central-park-tower-offering-plan-approval-sales-launch |title=Central Park Tower is now one step closer to launching sales |date=June 1, 2017 |access-date=August 30, 2017 |work=Curbed}}</ref>
| building_type = [[Residential]], [[retail]]
| architectural_style = [[Modern architecture|Modern]]
| architectural = {{cvt|1550|ft|0}}
| floor_count = 131<ref>{{cite web |url=https://www.architecturaldigest.com/story/new-york-city-central-park-tower-worlds-tallest-residential-building </ref><ref>{{cite web |url=https://archpaper.com/2019/09/central-park-tower-tops-out/</ref> (98 habitable floors)<ref name="auto">{{Cite web |url=http://www.skyscrapercenter.com/building/central-park-tower/14269 |title=Central Park Tower - The Skyscraper Center |website=www.skyscrapercenter.com |access-date=October 10, 2018}}</ref>
| elevator_count = 11
| cost = $3 billion<ref name="Tase">{{cite news|url=https://commercialobserver.com/2019/04/all-in-good-tase-the-crisis-for-the-american-cohort-in-tel-aviv-is-essentially-over/|title=All in Good TASE: The Crisis for the American Cohort in Tel Aviv Is Essentially Over|date=April 4, 2019|work=Commercial Observer|last=Gourarie|first=Chava}}</ref>
| floor_area = {{convert|1,285,308|sqft|m2}}<ref name="auto" />
| architect = [[Adrian Smith + Gordon Gill Architecture]]
| structural_engineer = [[WSP Global]]
| main_contractor = [[Lendlease]]
| developer = [[Extell Development Company]]
}}"""
Text 2 Output:
['{{coord|40.7663|-73.9810|type:landmark_globe:earth_region:us-ny|display=inline,title}}',
'{{cite news |author=amy plitt |url=https://ny.curbed.com/2017/6/1/15714666/central-park-tower-offering-plan-approval-sales-launch |title=central park tower is now one step closer to launching sales |date=june 1, 2017 |access-date=august 30, 2017 |work=curbed}}',
'{{cvt|1550|ft|0}}',
'{{cite web |url=https://archpaper.com/2019/09/central-park-tower-tops-out/</ref> (98 habitable floors)<ref name="auto">{{cite web |url=http://www.skyscrapercenter.com/building/central-park-tower/14269 |title=central park tower - the skyscraper center |website=www.skyscrapercenter.com |access-date=october 10, 2018}}</ref>\n| elevator_count = 11\n| cost = $3 billion<ref name="tase">{{cite news|url=https://commercialobserver.com/2019/04/all-in-good-tase-the-crisis-for-the-american-cohort-in-tel-aviv-is-essentially-over/|title=all in good tase: the crisis for the american cohort in tel aviv is essentially over|date=april 4, 2019|work=commercial observer|last=gourarie|first=chava}}</ref>\n| floor_area = {{convert|1,285,308|sqft|m2}}<ref name="auto" />\n| architect = [[adrian smith + gordon gill architecture]]\n| structural_engineer = [[wsp global]]\n| main_contractor = [[lendlease]]\n| developer = [[extell development company]]\n}}',
'{{cite web |url=http://www.skyscrapercenter.com/building/central-park-tower/14269 |title=central park tower - the skyscraper center |website=www.skyscrapercenter.com |access-date=october 10, 2018}}',
'{{cite news|url=https://commercialobserver.com/2019/04/all-in-good-tase-the-crisis-for-the-american-cohort-in-tel-aviv-is-essentially-over/|title=all in good tase: the crisis for the american cohort in tel aviv is essentially over|date=april 4, 2019|work=commercial observer|last=gourarie|first=chava}}',
'{{convert|1,285,308|sqft|m2}}']
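The second infobox is a mess - it has multiple <ref> tags that are closed while a template opened inside them is still unterminated (in floor_count, the first two {{cite web ...}} templates are missing their closing }} before </ref>). The parser therefore cannot match the infobox's braces the way you expect. You should probably fix the wikicode itself...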
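To find the offending references before parsing, a rough check like the one below may help. It is only a sketch using the standard library, not part of mwparserfromhell: the helper name `unbalanced_refs` and the sample URLs are made up for illustration, and the regex assumes attribute values in the opening <ref> tag contain no "/" or ">".

```python
import re

# Matches <ref>...</ref> and <ref name="x">...</ref>,
# but skips self-closing tags such as <ref name="x" />.
REF_BODY = re.compile(r"<ref(?:\s[^>/]*)?>(.*?)</ref>", re.DOTALL | re.IGNORECASE)


def unbalanced_refs(wikitext):
    """Return the bodies of <ref> tags whose {{ and }} counts differ."""
    return [
        body
        for body in REF_BODY.findall(wikitext)
        if body.count("{{") != body.count("}}")
    ]


# Hypothetical sample: the first reference never closes its template.
sample = (
    "| floor_count = 131"
    "<ref>{{cite web |url=https://example.org/a </ref>"
    "<ref>{{cite web |url=https://example.org/b}}</ref>"
)

for body in unbalanced_refs(sample):
    print(repr(body))
```

Running the same function over the Text 2 input should flag the two {{cite web ...}} references in floor_count; once their closing }} are added, the infobox has a chance of being parsed as a single template again.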
This is the code I'm using:
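import mwparserfromhell

# text is the raw infobox wikitext; note that .lower() also lower-cases the output
mwparserfromhell.parse(text.strip().lower()).filter_templates()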
Also posted it here.