fix(cruby): patch libxml2 to address GNOME/libxml2#200
This patch shrinks the libxml2 input buffer in a few parser functions.

Fixes #2132
flavorjones committed Feb 5, 2021
1 parent bc9edbf commit e0da4d2
Showing 2 changed files with 71 additions and 0 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ Nokogiri follows [Semantic Versioning](https://semver.org/), please see the [REA
### Fixed

* [CRuby] `NodeSet` may now safely contain `Node` objects from multiple documents. Previously the GC lifecycle of the parent `Document` objects could lead to contained nodes being GCed while still in scope. [[#1952](https://github.com/sparklemotion/nokogiri/issues/1952)]
* [CRuby] Patch libxml2 to avoid "huge input lookup" errors on large CDATA elements. (See upstream [GNOME/libxml2#200](https://gitlab.gnome.org/GNOME/libxml2/-/issues/200) and [GNOME/libxml2!100](https://gitlab.gnome.org/GNOME/libxml2/-/merge_requests/100).) [[#2132](https://github.com/sparklemotion/nokogiri/issues/2132)].
* [CRuby] `{XML,HTML}::Document.parse` now invokes `#initialize` exactly once. Previously `#initialize` was invoked twice on each object.
* [JRuby] `{XML,HTML}::Document.parse` now invokes `#initialize` exactly once. Previously `#initialize` was not called, which was a problem for subclassing such as done by `Loofah`.

@@ -0,0 +1,70 @@
From ca565c1edef9a455453fa8564270cc9c5813e1b9 Mon Sep 17 00:00:00 2001
From: Mike Dalessio <mike.dalessio@gmail.com>
Date: Sun, 31 Jan 2021 09:53:56 -0500
Subject: [PATCH] parser.c: shrink the input buffer when appropriate

Fixes GNOME/libxml2#200

Also see discussions at:
- GNOME/libxml2#192
- https://gitlab.gnome.org/nwellnhof/libxml2/-/commit/99bda1e
- https://github.com/sparklemotion/nokogiri/issues/2132
---
parser.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/parser.c b/parser.c
index a7bdc7f..efde672 100644
--- a/parser.c
+++ b/parser.c
@@ -4204,6 +4204,7 @@ xmlParseSystemLiteral(xmlParserCtxtPtr ctxt) {
}
count++;
if (count > 50) {
+ SHRINK;
GROW;
count = 0;
if (ctxt->instate == XML_PARSER_EOF) {
@@ -4291,6 +4292,7 @@ xmlParsePubidLiteral(xmlParserCtxtPtr ctxt) {
buf[len++] = cur;
count++;
if (count > 50) {
+ SHRINK;
GROW;
count = 0;
if (ctxt->instate == XML_PARSER_EOF) {
@@ -4571,6 +4573,7 @@ xmlParseCharDataComplex(xmlParserCtxtPtr ctxt, int cdata) {
}
count++;
if (count > 50) {
+ SHRINK;
GROW;
count = 0;
if (ctxt->instate == XML_PARSER_EOF)
@@ -4776,6 +4779,7 @@ xmlParseCommentComplex(xmlParserCtxtPtr ctxt, xmlChar *buf,

count++;
if (count > 50) {
+ SHRINK;
GROW;
count = 0;
if (ctxt->instate == XML_PARSER_EOF) {
@@ -5186,6 +5190,7 @@ xmlParsePI(xmlParserCtxtPtr ctxt) {
}
count++;
if (count > 50) {
+ SHRINK;
GROW;
if (ctxt->instate == XML_PARSER_EOF) {
xmlFree(buf);
@@ -9783,6 +9788,7 @@ xmlParseCDSect(xmlParserCtxtPtr ctxt) {
sl = l;
count++;
if (count > 50) {
+ SHRINK;
GROW;
if (ctxt->instate == XML_PARSER_EOF) {
xmlFree(buf);
--
2.25.1
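
All six hunks above make the same change: a `SHRINK` is issued just before the periodic `GROW` inside a character-consuming loop. The toy model below is a sketch of why that ordering could matter, on the assumption that the "huge input lookup" error is raised when the parse cursor sits too far past the start of the input buffer. It is not libxml2 code: `ToyInput`, `toy_shrink`, `toy_grow`, `toy_parse_section` and both constants are hypothetical names and values chosen for illustration, and the real macros have extra conditions that this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values for this toy model; they are not libxml2's constants. */
#define TOY_CHUNK 250           /* bytes pulled in by one refill */
#define TOY_LOOKUP_LIMIT 10000  /* largest cursor offset a refill tolerates */

typedef struct {
    size_t cur; /* offset of the parse cursor from the start of the buffer */
    size_t end; /* number of bytes currently held in the buffer */
} ToyInput;

/* Discard already-consumed bytes so the cursor offset returns to zero. */
static void toy_shrink(ToyInput *in) {
    assert(in->cur <= in->end);
    in->end -= in->cur;
    in->cur = 0;
}

/*
 * Refill the buffer when fewer than TOY_CHUNK unread bytes are left.
 * Returns -1 (the toy "huge input lookup") if the cursor has drifted
 * further than TOY_LOOKUP_LIMIT from the buffer start, 0 otherwise.
 */
static int toy_grow(ToyInput *in, size_t *remaining) {
    if (in->cur > TOY_LOOKUP_LIMIT)
        return -1;
    if (in->end - in->cur < TOY_CHUNK) {
        size_t n = *remaining < TOY_CHUNK ? *remaining : TOY_CHUNK;
        in->end += n;
        *remaining -= n;
    }
    return 0;
}

/*
 * Consume `total` bytes one at a time, refilling after every 51st byte,
 * in the same shape as the patched loops (count++; if (count > 50) ...).
 * With shrink_first != 0 the consumed prefix is dropped before each refill.
 * Returns 0 if the whole section is consumed, -1 if a refill fails.
 */
int toy_parse_section(size_t total, int shrink_first) {
    ToyInput in = {0, 0};
    size_t remaining = total;
    int count = 0;

    if (toy_grow(&in, &remaining) != 0)
        return -1;
    while (in.cur < in.end) {
        in.cur++; /* consume one byte */
        count++;
        if (count > 50) {
            if (shrink_first)
                toy_shrink(&in);
            if (toy_grow(&in, &remaining) != 0)
                return -1;
            count = 0;
        }
    }
    return 0;
}
```

In this model, `toy_parse_section(50000, 0)` fails once the cursor offset passes the limit, while `toy_parse_section(50000, 1)` runs to completion because each shrink resets the offset before the refill checks it. A section shorter than the limit, such as `toy_parse_section(5000, 0)`, succeeds either way, which is consistent with the error only showing up on large CDATA elements.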
