Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 527899 [9] Implement JEP 280: Indify String Concatenation #1139

Merged
merged 5 commits into from
Jun 29, 2023

Conversation

jarthana
Copy link
Member

@jarthana jarthana commented Jun 12, 2023

What it does

Implements https://openjdk.org/jeps/280 for ECJ

How to test

Author checklist

@jarthana jarthana marked this pull request as draft June 12, 2023 05:49
@jarthana
Copy link
Member Author

jarthana commented Jun 12, 2023

There are still known issues that should be addressed before merging this PR.

  • Couple of tests inXLargeTest failing with new/different error "The type generates a string that requires more than 65535 bytes to encode in Utf8 format in the constant pool"
  • The storage mechanism used for the new bootstrap entry (using Expression as a key) and storing the concatenatedLiteral in Expression need to be revisited
  • Need to polish up ClassFile#addBootStrapStringConcatEntry() a bit

@jarthana
Copy link
Member Author

Attention JDT committers, leads and experts (@mpalat @iloveeclipse @stephan-herrmann @srikanth-sankaran ) (others are welcome to chime in too.
This feature meant for Java 9 but we chose not to do this at that time. But with String Templates on the horizon, now is a good time to implement this.

This patch implements the feature for all code >= Java 9. But this approach poses a major problem.
The problem is, all Eclipse platform projects now have already moved to Java 9+ and will see a different binaries. Basically, we are are going to be flooded by comparator errors and will be forced to touch all the bundles.

If we want to avoid this, we can limit this to compliance 20+ or even 21+.

What does everyone think? I wanted to discuss this here before taking to the wider forum, like the jdt mailing list.

@iloveeclipse
Copy link
Member

Basically, we are are going to be flooded by comparator errors and will be forced to touch all the bundles.

OK for me, especially in M1 time frame, I can help touching bundles.

If we want to avoid this, we can limit this to compliance 20+ or even 21+.

I would not do that, simply to get higher test coverage (by "eating our own dog food"). Assuming the generated bytecode is "faster" executed by the modern JVM, why should ve let users wait for another LTS update to get it into their applications built with ECJ?

Where I have more concerns is the resulting binaries performance - considering that this affects almost every compiled class, that should be carefully tested / verified. Have you tried to run some performance tests (there were some hints in https://openjdk.org/jeps/280)? Can you outline chosen strategy / possible differences to the javac implementation?

@jarthana
Copy link
Member Author

Basically, we are are going to be flooded by comparator errors and will be forced to touch all the bundles.

OK for me, especially in M1 time frame, I can help touching bundles.

Thanks!

I would not do that, simply to get higher test coverage (by "eating our own dog food"). Assuming the generated bytecode is "faster" executed by the modern JVM, why should ve let users wait for another LTS update to get it into their applications built with ECJ?

Two additional points. ECJ is not obligated to implement this. This is not particularly relavent to the quoted point, but just wanted to say that.
The second point is, if we make this as a new feature as part of our Java 21 support, then nobody will see their .class files change, except when they move to Java 21. That will give us a nice cover to avoid this bytecode change.

Where I have more concerns is the resulting binaries performance - considering that this affects almost every compiled class, that should be carefully tested / verified. Have you tried to run some performance tests (there were some hints in https://openjdk.org/jeps/280)? Can you outline chosen strategy / possible differences to the javac implementation?

I ran into some failures in XLargeTest and had a chance to study some of the tests involving large string concat and didn't find anything, particularly in terms of performance. Let me run the entire JDT tests with and without the fix and see if that brings out something.

Comparing with Javac:

  • As far as produced output is concerned, except a corner case (where we avoid pushing null into operand stack), we pretty much produce what Javac does.
  • And because we let the JVM do the concatenation for us, now some of the code that ran before won't run because of the breach of the 65535 size for a String constant. Earlier, we avoided this by invoking a series of StringBuilder.append().

@iloveeclipse
Copy link
Member

now some of the code that ran before won't run because of the breach of the 65535 size for a String constant. Earlier, we avoided this by invoking a series of StringBuilder.append().

Just wondering if the new behavior can be made configurable via extra compiler flag / jdt core option?
However, even if that would be possible, I would prefer enable it by default (and the purpose to have it is to allow corner cases to still use old way of string concatenation).

@mpalat
Copy link
Contributor

mpalat commented Jun 13, 2023

Attention JDT committers, leads and experts (@mpalat @iloveeclipse @stephan-herrmann @srikanth-sankaran ) (others are welcome to chime in too. This feature meant for Java 9 but we chose not to do this at that time. But with String Templates on the horizon, now is a good time to implement this.

I strongly agree that now is a good time to implement this since String templates will preview in 21. However, instead of clubbing with Java 21, this should be implemented in master as a separate feature since it is a separate feature for one, and second, this is not a preview feature and should be available for every one. 4.29 master would be the way to go - M1 being ideal.

If we want to avoid this, we can limit this to compliance 20+ or even 21+.

Continuing from above, no - we should not be restricting this since its already available the minimum required jdk level as of now.

@srikanth-sankaran
Copy link
Contributor

I don't have much to add to what @iloveeclipse and @mpalat have said - substantially agree with their observations.

I will add myself as a reviewer.

@srikanth-sankaran srikanth-sankaran self-requested a review June 14, 2023 10:06
@jarthana jarthana marked this pull request as ready for review June 15, 2023 05:36
@srikanth-sankaran
Copy link
Contributor

Looks like the last commit Handle cases of a String literal containing '\u0001' or '\u0002' touching several files can be greatly simplified with with a single line change in org.eclipse.jdt.internal.compiler.ast.Expression.buildStringForConcatation(BlockScope, CodeStream, int, StringBuilder, List)

changing

if ((this.bits & ASTNode.HasStringConcatMarkers) != 0) {

with

if (this.constant.stringValue().indexOf('\u0001') != -1 || this.constant.stringValue().indexOf('\u0002') != -1)

@srikanth-sankaran
Copy link
Contributor

Looks like the last commit Handle cases of a String literal containing '\u0001' or '\u0002' touching several files can be greatly simplified with with a single line change in org.eclipse.jdt.internal.compiler.ast.Expression.buildStringForConcatation(BlockScope, CodeStream, int, StringBuilder, List)

changing

if ((this.bits & ASTNode.HasStringConcatMarkers) != 0) {

with

if (this.constant.stringValue().indexOf('\u0001') != -1 || this.constant.stringValue().indexOf('\u0002') != -1)

If you agree, cleaning that up plus rebasing on top of all the recent changes will help simplify and speed up the review. Thanks!

@jarthana
Copy link
Member Author

OK, I usually don't do a force-push, but since the last one is only a partial reversal of previous change, I rewrote that previous commit.

Anyway, I was going to argue that the String comparison (indexOf) might cause some performance impact, especially with sources like our tests that include lot of string concatenation. I did run some performance tests using our performance suite. Although I see some difference, they are not convincing enough to complicate the code. So, went with the proposed change.

@srikanth-sankaran
Copy link
Contributor

Aside from the standard observation about premature optimization and humans being poor judges of where time is likely spent, string concatenation in the style of our tests would not be a problem anyway since they would be constant folded at compile time.

Anyway it is good that you studied it to make sure there is no discernible difference

@jarthana
Copy link
Member Author

jarthana commented Jun 20, 2023

I resolved most of the points raised. However, I did some more testing with String concatenations with lot of operands and special marker characters. It reveals that the code we produce is lot more naive and very soon breach the 255 method arguments mark. OTOH, Javac is smarter and splits the operation into multiple ones. We should do that too.

Other point to mention is, I added the compiler preference and enable by default. But no JavaCore constant yet; we can do that when/if we get to the UI part.

@srikanth-sankaran
Copy link
Contributor

I resolved most of the points raised. However, I did some more testing with String concatenations with lot of operands and special marker characters. It reveals that the code we produce is lot more naive and very soon breach the 255 method arguments mark. OTOH, Javac is smarter and splits the operation into multiple ones. We should do that too.

I would suggest we keep this PR for the basic implementation and follow up on a separate ticket for improvements

@jarthana
Copy link
Member Author

I resolved most of the points raised. However, I did some more testing with String concatenations with lot of operands and special marker characters. It reveals that the code we produce is lot more naive and very soon breach the 255 method arguments mark. OTOH, Javac is smarter and splits the operation into multiple ones. We should do that too.

I would suggest we keep this PR for the basic implementation and follow up on a separate ticket for improvements

If it makes it easier to get this PR merged, sure. But I will fix this sooner than later. I will start working on it right after this.

@jarthana
Copy link
Member Author

@srikanth-sankaran I have a follow-up PR in #1172. Fairly simple and takes care of the cases when the concat gets too long with too many arguments. It undoes some of the test changes I made here. Just FYI.

@@ -56,7 +56,7 @@ public GenericTypeTest(String name) {
// Static initializer to specify tests subset using TESTS_* static variables
// All specified tests which does not belong to the class are skipped...
static {
// TESTS_NAMES = new String[] { "test0593" };
TESTS_NAMES = new String[] { "test1066" };
// TESTS_NUMBERS = new int[] { 470, 627 };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment is removed ??

@srikanth-sankaran
Copy link
Contributor

OK, I have made one more pass over the changes - The test changes I only glanced through.

I suggest let us set up a "structured walk through" for these files:

BinaryExpression.java
CombinedBinaryExpression.java
Expression.java
CodeStream.java

JCL/.project Outdated Show resolved Hide resolved
@Override
public void buildStringForConcatation(BlockScope blockScope, CodeStream codeStream, int typeID, StringBuilder recipe, List<TypeBinding> argTypes) {
if (this.referencesTable == null) {
super.buildStringForConcatation(blockScope, codeStream, typeID, recipe, argTypes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please check to make sure that we have 100% branch coverage for the new code in Expression, BinaryExpression and CombinedBinaryExpression

Copy link
Contributor

@srikanth-sankaran srikanth-sankaran Jul 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please check to make sure that we have 100% branch coverage for the new code in Expression, BinaryExpression and CombinedBinaryExpression

@jarthana : this review comment perhaps fell through the cracks. Without the new test added by @iloveeclipse viz org.eclipse.jdt.core.tests.compiler.regression.StringConcatTest.test005() in the context of #1202 there is no test in the full run of RunJDTCoreTests that exercises that branch :-(

Here is what I suggest/would do: In Expression, BinaryExpression and CombinedBinaryExpression, in all the new code, I would insert a throw new UnsupportedOperationException(); in every branch of control flow and comment out the existing code code there. Now RunJDTCoreTests to make sure that every one of those exception actually triggers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm running the suite in HEAD/master with coverage / EclEmma enabled. Will report the results here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

results:

  • Expression.buildStringForConcatation() not covered: lines 965-7 (block starting codeStream.ldc)
  • BinaryExpression.buildStringForConcatation() not covered: line 1572 (super call)
    • first compound if condition didn't see all combinations
  • CombinedBinaryExpression.buildStringForConcatation() not covered:
    • lines 261/2, this was #1202 (tested before that one got merged)
    • lines 265-7 (inner else block)
    • line 281 (final super call)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, do you care to fix the spelling of concatenation in the method name?

@jarthana
Copy link
Member Author

@srikanth-sankaran It turns out the makeConcatWithConstants() is useful when we pass static bootstrap arguments. But I think we can continue to use this for what we do now (pass everything as dynamic argument). This gives us the flexibility when/if we decide to pass static arguments. WDYT?

@jarthana jarthana merged commit be45d8f into eclipse-jdt:master Jun 29, 2023
@iloveeclipse
Copy link
Member

@jarthana, @srikanth-sankaran: next week Friday is M1 planned.
Usually we declare M1 and after M1 switch the compiler used to build SDK to M1 version.

Should we keep with that, or do we want to promote & switch the ecj compiler used to build SDK to use this change before M1?

@iloveeclipse
Copy link
Member

@akurtakov : FYI. The updated compiler would generate different code now for lot of bundles, which would mean bumping versions. Better before M1 or after?

@akurtakov
Copy link
Contributor

The earlier the better IMHO.

@srikanth-sankaran
Copy link
Contributor

srikanth-sankaran commented Jun 29, 2023

The earlier the better IMHO.

I agree. The other java compiler bootstraps the entire JDK (compiler, VM and JDK) for every single line of change and this serves as an excellent test bed and safety net. Sooner the better.

@iloveeclipse
Copy link
Member

I've created eclipse-platform/eclipse.platform.releng.aggregator#1184

@akurtakov : do you know how one can "promote" ecj build to make it available for the next SDK build? If yes, please comment in eclipse-platform/eclipse.platform.releng.aggregator#1184.

@iloveeclipse
Copy link
Member

@srikanth-sankaran, @mpalat, @jarthana, @stephan-herrmann, @sravanlakkimsetti : I've created eclipse-platform/eclipse.platform.releng.aggregator#1203 as follow up on our discussion with @srikanth-sankaran about how we could improve ecj validation & get faster feedback cycle.

@sravanlakkimsetti, @laeubi, @akurtakov : I hope you could give some guidance & help regarding required maven SDK / ECJ build changes.

mpalat pushed a commit to mpalat/eclipse.jdt.core that referenced this pull request Jul 6, 2023
…e-jdt#1139)

Alters the code generation for string concatenation expression to use the bootstrap method as specified by the JEP 280 for compliance 9 and above. A new compiler option org.eclipse.jdt.core.compiler.codegen.useStringConcatFactory has been introduced for this, which is enabled by default.
robstryker pushed a commit to robstryker/eclipse.jdt.core that referenced this pull request Jul 18, 2024
…e-jdt#1139)

Alters the code generation for string concatenation expression to use the bootstrap method as specified by the JEP 280 for compliance 9 and above. A new compiler option org.eclipse.jdt.core.compiler.codegen.useStringConcatFactory has been introduced for this, which is enabled by default.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants