Bug: XML tags with mixed case, when using lowerCaseTags: true, selectors will be lowercased before comparing with the xml tag name #3495

Closed
corinnaSchultz opened this issue Nov 19, 2023 · 1 comment · Fixed by #3981

@corinnaSchultz

I'm using cheerio version 1.0.0-rc.12.

If you have an XML source document that contains tags with mixed case, such as:

<parentTag class="myClass">
  <firstTag> <child> blah </child> </firstTag>
  <secondTag> <child> blah </child> </secondTag>
</parentTag>

And you load this document with the following cheerio options:

const options = {
  xml: {
    xmlMode: true,
    decodeEntities: false,
    lowerCaseTags: true,
    lowerCaseAttributeNames: false,
    recognizeSelfClosing: true,
  },
};

And you use a selector like this:

// myDoc is the XML string shown above
const $ = cheerio.load(myDoc, options);
const node = $('.myClass');
const result = node.find('firstTag > child');

This does not return the <child> element as expected. It used to work in 1.0.0-rc.6.

From tracing through the code, it appears that the selector is first changed to lowercase ("firsttag" in this example), before comparing it with the tag name ("firstTag" in this example). I've pasted the code below.

I think I saw a similar bug when using the is() function, so it appears that this is the general behavior for selectors.

My expectation is that the tagName would also be converted to lowercase before doing this comparison, so that the selector would match whether or not it is written in mixed case.

Am I misunderstanding how lowerCaseTags is supposed to work?

Code (from css-select, so maybe it's their bug?)
css-select/lib/general.js

        // Tags
        case css_what_1.SelectorType.Tag: {
            if (selector.namespace != null) {
                throw new Error("Namespaced tag names are not yet supported by css-select");
            }
            var name_1 = selector.name;
            if (!options.xmlMode || options.lowerCaseTags) {
                name_1 = name_1.toLowerCase();
            }
            return function tag(elem) {
                return adapter.getName(elem) === name_1 && next(elem);
            };
        }
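
To make the mismatch concrete, here is a minimal standalone sketch modeled on the snippet above. It does not call css-select or cheerio: `makeTagMatcher` and `makeNormalizedTagMatcher` are hypothetical names, and the plain `{ name: "firstTag" }` object stands in for a parsed element, on the assumption that the element keeps its original mixed-case name as described in this report.

```javascript
// Hypothetical stand-in for the css-select tag check quoted above:
// only the selector name is lowercased, the element name is compared as-is.
function makeTagMatcher(selectorName, options) {
  let name = selectorName;
  if (!options.xmlMode || options.lowerCaseTags) {
    name = name.toLowerCase();
  }
  return (elem) => elem.name === name;
}

// Hypothetical variant showing the expected behaviour:
// both sides are normalized the same way before comparing.
function makeNormalizedTagMatcher(selectorName, options) {
  const fold = !options.xmlMode || options.lowerCaseTags;
  const normalize = (s) => (fold ? s.toLowerCase() : s);
  const name = normalize(selectorName);
  return (elem) => normalize(elem.name) === name;
}

const sketchOptions = { xmlMode: true, lowerCaseTags: true };

// Assumption: the element name kept its original case ("firstTag").
const sketchElem = { name: "firstTag" };

const currentResult = makeTagMatcher("firstTag", sketchOptions)(sketchElem);
const expectedResult = makeNormalizedTagMatcher("firstTag", sketchOptions)(sketchElem);

console.log({ currentResult, expectedResult });
```

If the parser already lowercases element names when lowerCaseTags is set, the first matcher would succeed as well, so this sketch only illustrates the situation where the element name is still mixed case at comparison time.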
@fb55
Member

fb55 commented Aug 8, 2024

Hi @corinnaSchultz, I've added a test case for this, which is passing. I'm not sure where the behaviour you're seeing is coming from; your expected behaviour matches what I am seeing.

@fb55 fb55 closed this as completed Aug 8, 2024