Reduce memory allocations and improve performance in JSONObject #68

Merged on Nov 15, 2024 (1 commit)

@basil (Contributor) commented Nov 14, 2024

Parsing https://updates.jenkins.io/update-center.json is extremely slow (hundreds of times slower than jq, for example). It consistently takes about 8 seconds and allocates about 170 GiB of memory in total over the course of parsing. Profiling showed a lot of regular expression compilation, for example:

  java.lang.Thread.State: RUNNABLE
	at java.util.regex.Pattern.compile(java.base@11.0.5/Pattern.java:1757)
	at java.util.regex.Pattern.<init>(java.base@11.0.5/Pattern.java:1428)
	at java.util.regex.Pattern.compile(java.base@11.0.5/Pattern.java:1068)
	at net.sf.json.regexp.JdkRegexpMatcher.<init>(JdkRegexpMatcher.java:38)
	at net.sf.json.regexp.JdkRegexpMatcher.<init>(JdkRegexpMatcher.java:31)
	at net.sf.json.regexp.RegexpUtils.getMatcher(RegexpUtils.java:39)
	at net.sf.json.util.JSONTokener.matches(JSONTokener.java:111)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:912)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:156)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:348)
	at net.sf.json.JSONArray._fromJSONTokener(JSONArray.java:1131)
	at net.sf.json.JSONArray.fromObject(JSONArray.java:125)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:351)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:955)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:156)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:348)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:955)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:156)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:348)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:955)
	at net.sf.json.JSONObject._fromString(JSONObject.java:1145)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:162)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:132)

and a lot of string allocation, for example:

   java.lang.Thread.State: RUNNABLE
	at java.lang.String.<init>(String.java:207)
	at java.lang.String.substring(String.java:1933)
	at net.sf.json.util.JSONTokener.matches(JSONTokener.java:110)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:912)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:156)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:348)
	at net.sf.json.JSONArray._fromJSONTokener(JSONArray.java:1131)
	at net.sf.json.JSONArray.fromObject(JSONArray.java:125)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:351)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:955)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:156)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:348)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:955)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:156)
	at net.sf.json.util.JSONTokener.nextValue(JSONTokener.java:348)
	at net.sf.json.JSONObject._fromJSONTokener(JSONObject.java:955)
	at net.sf.json.JSONObject._fromString(JSONObject.java:1145)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:162)
	at net.sf.json.JSONObject.fromObject(JSONObject.java:132)

There are two issues here: repeatedly compiling a pattern where a simple .startsWith("null") would have sufficed, and repeatedly copying a massive string just to search a few characters in it. See flame graphs before and after.

[Flame graph: before]

[Flame graph: after]
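
To illustrate the two issues described above, here is a minimal sketch. It is not the actual json-lib code: the class and method names (`StartsWithSketch`, `matchesNullSlow`, `matchesNullFast`) are hypothetical, and the slow variant only approximates what the stack traces suggest, namely a `substring` copy of the remaining input followed by a fresh `Pattern.compile` on every call.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; names and structure are assumptions, not json-lib's API.
class StartsWithSketch {

    // Slow variant: copies the rest of the input and compiles a regex on each call.
    static boolean matchesNullSlow(String source, int index) {
        String rest = source.substring(index);      // O(n) copy of the remaining input
        Pattern pattern = Pattern.compile("null");  // recompiled on every call
        return pattern.matcher(rest).lookingAt();   // true if rest begins with "null"
    }

    // Fast variant: compares at most four characters in place, with no copy and no regex.
    static boolean matchesNullFast(String source, int index) {
        return source.startsWith("null", index);
    }

    public static void main(String[] args) {
        String json = "{\"a\":null,\"b\":[1,2,3]}";
        int nullAt = json.indexOf("null");

        System.out.println(matchesNullSlow(json, nullAt)); // prints "true"
        System.out.println(matchesNullFast(json, nullAt)); // prints "true"
        System.out.println(matchesNullFast(json, 0));      // prints "false"
    }
}
```

Both variants return the same answer, but the slow one does work proportional to the remaining input length each time it is called, which adds up quickly when it runs once per value in a large document.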

I added a new unit test. This change has also been shipping to Jenkins users in production since 2.456 (May), via our fork of json-lib, without any reported issues.

@aalmiray aalmiray added this to the 3.2.0 milestone Nov 15, 2024
@aalmiray aalmiray merged commit db76e69 into kordamp:master Nov 15, 2024
1 check passed
@aalmiray (Collaborator)

Thank you 😄

@basil basil deleted the starts-with branch November 15, 2024 20:49