Fix a bug that SAX2 parser doesn't expand the predefined entities for "characters" (#168)

## Why?

The SAX2 parser expands user-defined entity references and character references but doesn't expand predefined entity references.

## Change

- text_unnormalized.rb

```
require 'rexml/document'
require 'rexml/parsers/sax2parser'
require 'rexml/parsers/pullparser'
require 'rexml/parsers/streamparser'

xml = <<EOS
<root>
  <A>&lt;P&gt;&#13; &lt;I&gt; &lt;B&gt; Text &lt;/B&gt;  &lt;/I&gt;</A>
</root>
EOS

class Listener
  def method_missing(name, *args)
    p [name, *args]
  end
end

puts "REXML(DOM)"
REXML::Document.new(xml).elements.each("/root/A") {|element| puts element.text}

puts ""
puts "REXML(Pull)"
parser = REXML::Parsers::PullParser.new(xml)
while parser.has_next?
  res = parser.pull
  p res
end

puts ""
puts "REXML(Stream)"
parser = REXML::Parsers::StreamParser.new(xml, Listener.new).parse

puts ""
puts "REXML(SAX)"
parser = REXML::Parsers::SAX2Parser.new(xml)
parser.listen(Listener.new)
parser.parse
```

## Before (master)

```
$ ruby text_unnormalized.rb
REXML(DOM)
 <I> <B> Text </B>  </I>

REXML(Pull)
start_element: ["root", {}]
text: ["\n  ", "\n  "]
start_element: ["A", {}]
text: ["&lt;P&gt;&#13; &lt;I&gt; &lt;B&gt; Text &lt;/B&gt;  &lt;/I&gt;", "<P>\r <I> <B> Text </B>  </I>"]
end_element: ["A"]
text: ["\n", "\n"]
end_element: ["root"]
end_document: []

REXML(Stream)
[:tag_start, "root", {}]
[:text, "\n  "]
[:tag_start, "A", {}]
[:text, "<P>\r <I> <B> Text </B>  </I>"]
[:tag_end, "A"]
[:text, "\n"]
[:tag_end, "root"]

REXML(SAX)
[:start_document]
[:start_element, nil, "root", "root", {}]
[:progress, 6]
[:characters, "\n  "]
[:progress, 9]
[:start_element, nil, "A", "A", {}]
[:progress, 12]
[:characters, "&lt;P&gt;\r &lt;I&gt; &lt;B&gt; Text &lt;/B&gt;  &lt;/I&gt;"] #<= This
[:progress, 74]
[:end_element, nil, "A", "A"]
[:progress, 78]
[:characters, "\n"]
[:progress, 79]
[:end_element, nil, "root", "root"]
[:progress, 86]
[:end_document]
```

## After (This PR)

```
$ ruby text_unnormalized.rb
REXML(SAX)
[:start_document]
[:start_element, nil, "root", "root", {}]
[:progress, 6]
[:characters, "\n  "]
[:progress, 9]
[:start_element, nil, "A", "A", {}]
[:progress, 12]
[:characters, "<P>\r <I> <B> Text </B>  </I>"]
[:progress, 74]
[:end_element, nil, "A", "A"]
[:progress, 78]
[:characters, "\n"]
[:progress, 79]
[:end_element, nil, "root", "root"]
[:progress, 86]
[:end_document]
```
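
To check only the `characters` event without the other parsers' output, a smaller script along these lines may be enough. It is a minimal sketch, not part of this change: the `xml` string and the `texts` array are hypothetical names chosen for illustration, and it assumes the `rexml` gem is installed. What gets printed depends on the installed REXML version: before this fix the predefined entity references are expected to arrive unexpanded, and after it they should arrive as `<` and `>`.

```ruby
require 'rexml/document'
require 'rexml/parsers/sax2parser'

# Hypothetical input: element text written only with predefined
# entity references (&lt; and &gt;).
xml = "<root><A>&lt;P&gt; &lt;B&gt; Text &lt;/B&gt;</A></root>"

# Collect every string delivered to the SAX2 "characters" event.
texts = []
parser = REXML::Parsers::SAX2Parser.new(xml)
parser.listen(:characters) { |text| texts << text }
parser.parse

# Inspect what the listener received; the exact string depends on
# whether the installed REXML includes this fix.
p texts
```

Listening with a block on `:characters` keeps the output to the one event this change touches, which makes a before/after comparison easier to read than the full `method_missing` listener.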