Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #5129 - WebAppContext.setExtraClasspath(String) cleanup #5131

Merged
merged 7 commits into from
Aug 12, 2020

Conversation

joakime
Copy link
Contributor

@joakime joakime commented Aug 7, 2020

  • More tests for both relative and absolute path references
  • More testing that will trigger quirks on Windows builds
    so that we can catch regressions faster
  • Reworked WebInfConfiguration to be glob aware in a way
    similar to how WebAppClassLoader behaves.
  • Reworked Resource.newResource(String) to delegate
    canonical path resolution to PathResource
  • Guarded PathResource's usage of Path.toAbsolutePath()
    to ignore valid conditions where the Path cannot be
    resolved to an absolute path (yet)

@joakime joakime added Bug For general bugs on Jetty side Microsoft Windows For issues unique to Microsoft Windows labels Aug 7, 2020
@joakime joakime self-assigned this Aug 7, 2020
import static org.junit.jupiter.api.Assertions.assertTrue;

public class WebAppContextExtraClasspathTest
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all these tests are good, but can they be added to an existing test, like for WebAppClassLoader, or WebAppContext?

Copy link
Contributor Author

@joakime joakime Aug 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to test extraClassPath in isolation.
We have 2 separate WebAppClassLoader test classes. (which has good separation)
And we have a giant multi-purpose WebAppContext test that does far too many things all at once. It should really be broken down into more isolated tests.

+ More tests for both relative and absolute path references
+ More testing that will trigger quirks on Windows builds
  so that we can catch regressions faster
+ Reworked WebInfConfiguration to be glob aware in a way
  similar to how WebAppClassLoader behaves.
+ Reworked Resource.newResource(String) to delegate
  canonical path resolution to PathResource
+ Guarded PathResource's usage of Path.toAbsolutePath()
  to ignore valid conditions where the Path cannot be
  resolved to an absolute path (yet)
+ Normalize resolved paths in PathResource

Signed-off-by: Joakim Erdfelt <joakim.erdfelt@gmail.com>
@joakime joakime force-pushed the jetty-9.4.x-5129-extraclasspath-glob branch from 54f888c to 55f4e4f Compare August 10, 2020 15:43
+ Introduce new Resource.fromReferences to help with
  parsing delimited resource reference lists.

Signed-off-by: Joakim Erdfelt <joakim.erdfelt@gmail.com>
Signed-off-by: Joakim Erdfelt <joakim.erdfelt@gmail.com>
Signed-off-by: Joakim Erdfelt <joakim.erdfelt@gmail.com>
Copy link
Contributor

@gregw gregw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks a bit fundamental for jetty-9.4! Is there a simpler fix we can do for the bug in 9.4 and then consider code cleanup in 10. I've not yet looked at this in detail but the the ResourceFactory vs Resource.Factory stuff seams really strange.

So let's just fix the bug here and do a cleanup/restructure in 10.

Copy link
Contributor

@janbartel janbartel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very big change for 9.4, so I think the bulk of this improvement work should be done in jetty-10. For 9.4, we can just fix the 2 problems:

  1. WebInfConfiguration needs to strip off a trailing * in parsing extraClassPath
  2. Resource needs to handle weird NTFS better

Then, in jetty-10, lets consider how to prevent the triplicate parsing of the extraClassPath. The solution with having the Resource.Factory interface is moving in the right direction, but I really don't like having that AND Resource.Factory : the benefit of doing this in 10 is that we will be able to change the signature of ResourceFactory. Also, this solution doesn't overcome the triplicate parsing, it's still being parsed twice in WebInfConfiguration and once in WebAppClassLoader : lets just parse this once and use the results where needed.

Copy link
Contributor

@gregw gregw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On looking closer, I'm not so sure this needs to be entirely punted to 10. I think it can be greatly simplified without the introduction of new public interfaces. If that can be done, then perhaps this can go in 9.4, but any significant API changes can only be done in 10... and then only really soon.

throw e2;
}
// It's likely a file/path reference.
return new PathResource(Paths.get(resource));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a better assumption, but for 9.4 I think we really don't want to change how we work out what type a resource is as we really don't want to change behaviour... even if it is slightly wrong. We should only fix real bugs.

Copy link
Contributor Author

@joakime joakime Aug 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code that was removed was a left-over from the FileResource days.

The use of getCanonicalFile() is one of the core issues from #5129 and must be addressed.
On windows, it throws for many reasons, during testing I found that it throws if the File (or any sub-path it references) doesn't exist or it contains characters in the list of unacceptable characters (such as the * in path\to\dir\*). Note: these 2 reasons are not the only reasons getCanonicalFile throws. (On a typical Linux environment I have been unable to produce a combination of conditions that would cause it to throw)

The stripping of the "./" prefix is something that the Path.normalize() takes care of.
The modified form of the resource variable that the old lines 180-181 produces never exits this block of code, you'll have a PathResource or an exception.

@joakime
Copy link
Contributor Author

joakime commented Aug 11, 2020

This is a very big change for 9.4, so I think the bulk of this improvement work should be done in jetty-10. For 9.4, we can just fix the 2 problems:

  1. WebInfConfiguration needs to strip off a trailing * in parsing extraClassPath
  2. Resource needs to handle weird NTFS better

This cannot be done entirely within the Resource layer, it has to be done BEFORE the entry becomes a Resource.

Then, in jetty-10, lets consider how to prevent the triplicate parsing of the extraClassPath. The solution with having the Resource.Factory interface is moving in the right direction, but I really don't like having that AND Resource.Factory : the benefit of doing this in 10 is that we will be able to change the signature of ResourceFactory. Also, this solution doesn't overcome the triplicate parsing, it's still being parsed twice in WebInfConfiguration and once in WebAppClassLoader : lets just parse this once and use the results where needed.

I see 2 paths we can take ...

Option 1:

The best we can do is reduce it to 2 parsings.

Once in the optional WebInfConfiguration, and once in the WebAppClassLoader.addClassPath(String) method.

As people use WebAppContext with a dizzying variety of Configurations (and often no Configurations in various embedded scenarios, when they could easily use ServletContextHandler instead).
And people use WebAppClassLoader by directly manipulating it.

Option 2:

We don't store WebAppContext.setExtraClassPath(String) as a String.

The immediate usage of that method results in a List<Resource> and is resolved immediately.

We introduce a new WebAppContext.setExtraClassPath(List<Resource>) method, and update the apidoc for above method to say it's just a convenience method for this one.
The .getExtraClassPath() method return is changed to List<Resource> instead.

This has a side effect that the Context.newResource(String) methods are not used though.

+ Will migrate to jetty-10.0.x

Signed-off-by: Joakim Erdfelt <joakim.erdfelt@gmail.com>
dirResources.add(resource);
String token = tokenizer.nextToken().trim();
if (isGlobReference(token))
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the entry has a trailing *, then it is a directory of jar files and we can ignore it, we should not look deeply into its subdirs.

Copy link
Contributor Author

@joakime joakime Aug 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only look at the level of the /*.
So that means if we have say ...

$ ls -1 -F /opt/ext/
corp.jar
foo.jar
resources/

And a reference of /opt/ext/* then that's 3 URL entries in the WebAppClassLoader.

  1. file:///opt/ext/corp.jar
  2. file:///opt/ext/foo.jar
  3. file:///opt/ext/resources/ (treated as directory entry in the ClassLoader)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the WebAppClassLoader, this is in WebInfConfiguration.findExtraClasspathDirs where we are only looking for entries in the ;, separated list that are themselves directories; any entry that is suffixed by /* is handled by WebInfConfiguration.findExtraClasspathJars. We don't want to do a deep add of a directory inside of WebInfConfiguration.findExtraClasspathDirs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't do deep directory, it's only at the level of the glob.

Glob means anything relevant in the point of the tree of the glob, not deeper.

"Relevant" is anything that has meaning to the Classloader, which is zip, jar, and directory (all other entries that the glob produces are ignored).

Copy link
Contributor Author

@joakime joakime Aug 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we have a misaligned behavior here.

WebInfConfiguration and WebAppClassLoader in existing jetty-9.4.x behaves differently in regards to /path/to/* entries.

We either use the WebAppClassLoader behavior for glob that adds directories.
Or we use the WebInfConfiguration behavior for glob that does not add directories.

Looking at how java itself works I think we should use the existing jetty-9.4.x behavior that WebInfConfiguration uses and not add directories.

I'll explain with a code example ...

package cloader;

import java.io.File;

public class ShowClassLoader
{
    public static void main(String[] args)
    {
        String javaClasspath = System.getProperty("java.class.path");
        System.out.printf("java.class.path=%s%n", javaClasspath);
        String[] parts = javaClasspath.split(File.pathSeparator);
        System.out.printf("  entries.length=%d%n", parts.length);
        for (int i = 0; i < parts.length; i++)
        {
            System.out.printf("  entry[%d]=%s%n", i, parts[i]);
        }
    }
}

Run on the command line ...

$ ls -1 -F target/lib
corp.jar
foo.jar
resources/

$ java -cp 'target/lib/*:target/classes' cloader.ShowClassLoader
java.class.path=target/lib/corp.jar:target/lib/foo.jar:target/classes
  entries.length=3
  entry[0]=target/lib/corp.jar
  entry[1]=target/lib/foo.jar
  entry[2]=target/classes

This behavior on OpenJDK 11.0.7 is what WebInfConfiguration mimics, but not what WebAppClassLoader does.

The OpenJDK 11.0.7 on Windows classpath behaves the same way.

And it's documented like this at https://docs.oracle.com/javase/8/docs/technotes/tools/windows/classpath.html#A1100762

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a followup, the glob behaviors on Linux from the above example seem to be mostly done via the shell.

Windows JVM seems to have specific code to allow globs, so we are essentially attempting to mimic glob shell behavior on linux and glob classpath behavior on windows.

Copy link
Contributor Author

@joakime joakime Aug 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been reverted in WebInfConfiguration and WebAppClassloader has been updated to reflect the truth in both JVM behavior and WebInfConfiguration behavior (along with tests).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the point I was making was that in WebInfConfiguration.findExtraClasspathJars we should not add the directory with the trailing * to the list of resources, only the jars and zips that are contained at the top level inside of that directory. If you've changed the code so that that directory is not added, then we're good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current implementation in this PR does not add directories for glob usage in either WebInfConfiguration (like before) nor in WebAppClassLoader (this is a change in behavior)

Signed-off-by: Joakim Erdfelt <joakim.erdfelt@gmail.com>
Copy link
Contributor

@janbartel janbartel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@joakime joakime merged commit e632d24 into jetty-9.4.x Aug 12, 2020
@joakime joakime deleted the jetty-9.4.x-5129-extraclasspath-glob branch August 12, 2020 15:09
@KevinMcT
Copy link

KevinMcT commented Jan 11, 2021

Hi, I just wanted to note that this PR (Sparked from the issue #5129) ended up as a breaking change in our system which has been running stable without issue for the past 7 years.

(For future reference.)

java.nio.file.InvalidPathException: Illegal char <:> at index 2: /C:/Users/xxx/.gradle/caches/modules-2/files-2.1/org.constretto/constretto-api/2.2.3/7e7a08dcb9abdfc4fb2a610a6edcad1bb40c1444/constretto-api-2.2.3.jar
	at sun.nio.fs.WindowsPathParser.normalize(WindowsPathParser.java:182)
	at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:153)
	at sun.nio.fs.WindowsPathParser.parse(WindowsPathParser.java:77)
	at sun.nio.fs.WindowsPath.parse(WindowsPath.java:94)
	at sun.nio.fs.WindowsFileSystem.getPath(WindowsFileSystem.java:255)
	at java.nio.file.Paths.get(Paths.java:84)
	at org.eclipse.jetty.util.resource.Resource.newResource(Resource.java:179)
	at org.eclipse.jetty.util.resource.Resource.newResource(Resource.java:152)
	at org.eclipse.jetty.server.handler.ContextHandler.newResource(ContextHandler.java:2027)
	at org.eclipse.jetty.webapp.WebAppClassLoader.addClassPath(WebAppClassLoader.java:272)
	at org.eclipse.jetty.webapp.WebAppClassLoader.<init>(WebAppClassLoader.java:195)
	at org.eclipse.jetty.webapp.WebAppClassLoader.<init>(WebAppClassLoader.java:158)
	at org.eclipse.jetty.webapp.WebAppContext.preConfigure(WebAppContext.java:467)
	at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:523)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
	at org.eclipse.jetty.server.handler.StatisticsHandler.doStart(StatisticsHandler.java:253)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
	at org.eclipse.jetty.server.Server.start(Server.java:423)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
	at org.eclipse.jetty.server.Server.doStart(Server.java:387)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)

While testing an upgrade from jetty 9.4.31 -> 9.4.32 it was discovered that this issue propped up for dev's using Windows. Issue is not present for dev's with Linux/MacOS machines.

URI as given in the URI object:

file:/C:/Users/xxx/.gradle/caches/modules-2/files-2.1/org.constretto/constretto-api/2.2.3/7e7a08dcb9abdfc4fb2a610a6edcad1bb40c1444/constretto-api-2.2.3.jar
            // -> Yes: Give the classpath to Jetty
            String classpath = ClasspathHelper.convertSetToClasspathString(_webAppClassPath);
            log.info("Webapp classpath: " + classpath);
            context.setExtraClasspath(classpath);

The ClasspathHelper method call is nothing fancy, does this:

static String convertSetToClasspathString(Set<URI> classpath) {
        return classpath.stream().map(URI::getPath).collect(Collectors.joining(";"));
    }

The URI::getPath strips off the file: prefix, the returned string is then feed into the setExtraClasspath method.

We remedied the issue by rewriting the stream from utilizing getPath to toString

String classpath = _webAppClassPath.stream().map(URI::toString).collect(Collectors.joining(";"));

This wasn't a major issue, however introducing a breaking change in a patch release is neither expected or desired.

@gregw
Copy link
Contributor

gregw commented Jan 11, 2021

@KevinMcT Sorry about that, it was not our intention. I have opened #5871 as that behaviour sounds like a bug to me.

@gregw
Copy link
Contributor

gregw commented Jan 12, 2021

@KevinMcT after discussing this with the team, we are not so sure that this is something we can fix. A string like /C:/some/path is neither a URI nor a file path and thus should not be supported by the setExtraClasspath method. If anything, the fact that it may have previously been supported was a bug that needed to be fixed... which we may have done inadvertently with previous changes.
Thus I'm going to mark #5871 as closed for now. If you can think of a reason we should support /C:/some/path, then please reopen and explain.

@KevinMcT
Copy link

@gregw np! I don't disagree with your conclusion. My main point was to highlight the fact that a change was introduced which might hit other users as well. (Leaving the note for future reference.)

Seeing that even though this was a bug which was fixed inadvertently, while reading through the release notes it wasn't immediately obvious that this was going to end up as a breaking change (for us that is).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug For general bugs on Jetty side Microsoft Windows For issues unique to Microsoft Windows
Projects
None yet
Development

Successfully merging this pull request may close these issues.

No jars added when using a folder in extraClasspath of the webapp context xml file
4 participants