Description
We have several Spring Boot applications running on Windows servers. These applications start as executable JARs. Since the 3.2.0 release we have noticed that the startup times of these applications are much longer. For example, the startup time of one application has gone up from 16 seconds to 155 seconds. We also encountered performance issues at customers after the application had been running for a while. This only occurs when the server's maximum memory is low and classes need to be reloaded, and it can make the application unresponsive. The issues are resolved when we use the classic loader implementation.
I investigated the issue a bit further. It only seems to occur on Windows. If I run the same Spring Boot application in a Docker container, the performance is good. The issue occurs on both OpenJDK 17 and 21. I am running Spring Boot 3.5.0.
When the classes are loaded the threads seem to spend a lot of time in the following function:
"pool-2-thread-2" #61 [21940] prio=5 os_prio=0 cpu=406.25ms elapsed=4.92s tid=0x000002e84e6a2bf0 nid=21940 runnable [0x000000ec0d1f8000]
java.lang.Thread.State: RUNNABLE
at java.panw.PanwHooks.NativeMethodEntry(Native Method)
at java.panw.PanwHooks.MethodEntry1(Unknown Source)
at java.net.URLStreamHandler.getHostAddress(java.base@21.0.6/URLStreamHandler.java)
at java.net.URLStreamHandler.hostsEqual(java.base@21.0.6/URLStreamHandler.java:459)
at java.net.URLStreamHandler.sameFile(java.base@21.0.6/URLStreamHandler.java:431)
at java.net.URLStreamHandler.equals(java.base@21.0.6/URLStreamHandler.java:352)
at java.net.URL.equals(java.base@21.0.6/URL.java:1144)
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(java.base@21.0.6/ConcurrentHashMap.java:1721)
at org.springframework.boot.loader.net.protocol.jar.JarFileUrlKey.get(JarFileUrlKey.java:49)
at org.springframework.boot.loader.net.protocol.jar.UrlJarFiles$Cache.get(UrlJarFiles.java:157)
at org.springframework.boot.loader.net.protocol.jar.UrlJarFiles.getCached(UrlJarFiles.java:81)
at org.springframework.boot.loader.net.protocol.jar.JarUrlConnection.assertCachedJarFileHasEntry(JarUrlConnection.java:306)
at org.springframework.boot.loader.net.protocol.jar.JarUrlConnection.connect(JarUrlConnection.java:287)
at org.springframework.boot.loader.net.protocol.jar.JarUrlConnection.getJarFile(JarUrlConnection.java:99)
at jdk.internal.loader.URLClassPath$Loader.getResource(java.base@21.0.6/URLClassPath.java:657)
at jdk.internal.loader.URLClassPath.getResource(java.base@21.0.6/URLClassPath.java:316)
at java.net.URLClassLoader$1.run(java.base@21.0.6/URLClassLoader.java:424)
at java.net.URLClassLoader$1.run(java.base@21.0.6/URLClassLoader.java:421)
at java.security.AccessController.executePrivileged(java.base@21.0.6/AccessController.java:809)
at java.security.AccessController.doPrivileged(java.base@21.0.6/AccessController.java:714)
at java.net.URLClassLoader.findClass(java.base@21.0.6/URLClassLoader.java:420)
at java.lang.ClassLoader.loadClass(java.base@21.0.6/ClassLoader.java:593)
- locked <0x000000061353b058> (a java.lang.Object)
at org.springframework.boot.loader.net.protocol.jar.JarUrlClassLoader.loadClass(JarUrlClassLoader.java:107)
at org.springframework.boot.loader.launch.LaunchedClassLoader.loadClass(LaunchedClassLoader.java:91)
at java.lang.ClassLoader.loadClass(java.base@21.0.6/ClassLoader.java:526)
The issue can be easily reproduced by running an application with the configuration and code below on Windows 11 with OpenJDK 21. When I start this application, the startup time is about 20 seconds. When I start it with the classic loader implementation, the startup time is 3 seconds. In a Docker container the startup time in both cases was about 2 seconds.
pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.example</groupId>
    <artifactId>spring-boot-demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.5.0</version>
        <relativePath/>
    </parent>
    <properties>
        <java.version>21</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <mainClass>com.example.demo.DemoApplication</mainClass>
                    <executable>true</executable>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
DemoApplication.java
package com.example.demo;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class DemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }

}
The problem seems to be that the URL.equals() function is very slow on Windows in some cases (same URI values for both objects). I ran some performance tests with just that function, and it was almost 15 times slower on Windows than on Linux.
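A timing comparison of this kind could look roughly like the sketch below. It is not the test that was actually run for this report: the class name UrlEqualsTiming, the helper names, the file path and the iteration count are all made up for illustration, and a loop like this is only a crude measurement rather than a proper benchmark.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical sketch: compares the cost of URL.equals() with String.equals().
final class UrlEqualsTiming {

    // Builds a URL without exposing the checked exception to callers.
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        }
        catch (MalformedURLException ex) {
            throw new IllegalArgumentException(spec, ex);
        }
    }

    // Counts the comparisons that reported equality so the loop has a visible result.
    static int countUrlEquals(URL a, URL b, int iterations) {
        int equal = 0;
        for (int i = 0; i < iterations; i++) {
            if (a.equals(b)) {
                equal++;
            }
        }
        return equal;
    }

    static int countStringEquals(String a, String b, int iterations) {
        int equal = 0;
        for (int i = 0; i < iterations; i++) {
            if (a.equals(b)) {
                equal++;
            }
        }
        return equal;
    }

    public static void main(String[] args) {
        // Two distinct URL objects with the same value (the path is an assumption).
        URL first = url("file:/C:/apps/demo.jar");
        URL second = url("file:/C:/apps/demo.jar");
        int iterations = 1_000_000;

        long start = System.nanoTime();
        int urlHits = countUrlEquals(first, second, iterations);
        long urlNanos = System.nanoTime() - start;

        start = System.nanoTime();
        int stringHits = countStringEquals(first.toExternalForm(), second.toExternalForm(), iterations);
        long stringNanos = System.nanoTime() - start;

        System.out.printf("URL.equals:    %d hits in %d ms%n", urlHits, urlNanos / 1_000_000);
        System.out.printf("String.equals: %d hits in %d ms%n", stringHits, stringNanos / 1_000_000);
    }
}
```

Running the same class on Windows and on Linux (or in a container) would give a first impression of whether the gap is in URL.equals() itself or in something hooked into it.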
Activity
wilkinsona commented on Jun 18, 2025
Thanks for the report and analysis. It looks like we need to override equals(URL u1, URL u2) in org.springframework.boot.loader.net.protocol.jar.Handler. We already override hashCode(URL url) and doing something similar for equals should help here.
bgoorden commented on Jun 18, 2025
Thanks for the response. That seems like a good solution.
philwebb commented on Jun 18, 2025
java.net.URLStreamHandler already implements an equals method that delegates to the protected sameFile method that we override. At least it appears to on Linux. From the stack trace, it doesn't look like our overridden version is being called.
wilkinsona commented on Jun 18, 2025
Thanks, Phil. I stand corrected. So either the override isn't working for some reason or the URL doesn't have our custom handler.
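For readers unfamiliar with the mechanism being discussed: URL.equals() and URL.hashCode() delegate to the URL's URLStreamHandler, and the handler's equals(URL, URL) in turn calls sameFile(URL, URL). The sketch below is a hypothetical handler, not Spring Boot's Handler; the class name StringComparingHandler and the "demo" protocol are assumptions. It shows how overriding sameFile and hashCode keeps comparisons away from host address resolution.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.util.Objects;

// Hypothetical handler that compares URL components as plain values.
final class StringComparingHandler extends URLStreamHandler {

    @Override
    protected URLConnection openConnection(URL url) throws IOException {
        throw new IOException("This sketch does not open connections");
    }

    // Called by the inherited equals(URL, URL); no host address is resolved here.
    @Override
    protected boolean sameFile(URL u1, URL u2) {
        return Objects.equals(u1.getProtocol(), u2.getProtocol())
                && Objects.equals(u1.getHost(), u2.getHost())
                && u1.getPort() == u2.getPort()
                && Objects.equals(u1.getFile(), u2.getFile());
    }

    // Kept consistent with equals: same components (including ref) give the same hash.
    @Override
    protected int hashCode(URL u) {
        return Objects.hash(u.getProtocol(), u.getHost(), u.getPort(), u.getFile(), u.getRef());
    }

    // Creates a URL bound to this handler type; the constructor is deprecated on newer JDKs.
    @SuppressWarnings("deprecation")
    static URL url(String spec) {
        try {
            return new URL((URL) null, spec, new StringComparingHandler());
        }
        catch (MalformedURLException ex) {
            throw new IllegalArgumentException(spec, ex);
        }
    }

    public static void main(String[] args) {
        URL first = url("demo://host.invalid/lib/app.jar");
        URL second = url("demo://host.invalid/lib/app.jar");
        System.out.println("equal: " + first.equals(second));
        System.out.println("same hash: " + (first.hashCode() == second.hashCode()));
    }
}
```

If a URL was created without the custom handler, none of these overrides apply and the JDK's default comparison runs instead, which is the second possibility raised above.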
Title changed from "Executable JAR application class loading performance issues on windows" to "Executable JAR application class loading performance issues on Windows"
philwebb commented on Jun 18, 2025
JarFileUrlKey has a cache that attempts to save building strings for URLs. This seems to work for 90% of URLs and for most standard apps. If, however, a URL has a host then the URL.equals() method ends up doing a DNS lookup.
I've not been able to replicate the issue on Windows, so my guess is there's something unique about the URLs on the classpath for these apps. Regardless, I think we should only use the cache if equals() is going to be cheap. If equals() is going to be expensive, we should just rebuild the string.
Title changed from "Executable JAR application class loading performance issues on Windows" to "Executable JAR application class encounters performance issues when classpath URLs reference a host"
Only cache JarFile URL keys that are cheap to lookup
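The idea described in this comment (use the cache only when equals() is cheap, otherwise rebuild the string) can be sketched as follows. This is a minimal illustration and not the code from the actual commit: the class CheapKeyCache, its method names and the key format are assumptions, and "cheap" is taken here to mean "the URL has no host".

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: cache string keys only for URLs that are cheap to compare.
final class CheapKeyCache {

    static final Map<URL, String> CACHE = new ConcurrentHashMap<>();

    static String keyFor(URL url) {
        if (!isCheapToCompare(url)) {
            // A host would make URL.equals()/hashCode() resolve addresses, so skip the map.
            return buildKey(url);
        }
        return CACHE.computeIfAbsent(url, CheapKeyCache::buildKey);
    }

    // Assumption: a URL without a host never needs a DNS lookup when compared.
    static boolean isCheapToCompare(URL url) {
        String host = url.getHost();
        return host == null || host.isEmpty();
    }

    // Rebuilds the key from the URL's parsed components.
    static String buildKey(URL url) {
        StringBuilder key = new StringBuilder();
        key.append(url.getProtocol().toLowerCase(Locale.ROOT)).append(':');
        String host = url.getHost();
        if (host != null && !host.isEmpty()) {
            key.append("//").append(host.toLowerCase(Locale.ROOT));
            if (url.getPort() != -1) {
                key.append(':').append(url.getPort());
            }
        }
        key.append(url.getFile());
        return key.toString();
    }

    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        }
        catch (MalformedURLException ex) {
            throw new IllegalArgumentException(spec, ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(keyFor(url("file:/C:/apps/demo.jar")));
        System.out.println(keyFor(url("http://example.invalid/app.jar")));
        System.out.println("cached entries: " + CACHE.size());
    }
}
```

With this shape, a URL that references a host is never used as a map key, so its equals() and hashCode() are never invoked by the cache.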
sadaaithal commented on Jul 8, 2025
@philwebb, @bgoorden and the rest:
We have the same issue in some of our Windows environments.
We have a Spring application packaged as an executable jar that suffers from startup performance issues.
The startup thread ("main") has the same stack as this issue.
And we think the problem arises from the Palo Alto Networks (Cortex XDR?) agent that gets injected into the Java application.
See the top frames in this stack:
I am running some tests with a simple instrumentation agent where even inducing a 1ms delay in "java.net.URLStreamHandler.getHostAddress()" causes a 35+ minute startup delay.
And a rough count of the number of invocations to the URLStreamHandler class turned out to be ~2.3 million.
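As a rough cross-check of these two numbers, the small sketch below multiplies the invocation count by the induced delay. It assumes, for the sake of the estimate, that every one of the ~2.3 million counted invocations pays the full 1 ms; the class and method names are made up.

```java
import java.time.Duration;

// Hypothetical back-of-the-envelope estimate of the total added startup time.
final class HookDelayEstimate {

    // Total added time if every invocation pays the same fixed delay.
    static Duration totalDelay(long invocations, Duration perCall) {
        return perCall.multipliedBy(invocations);
    }

    public static void main(String[] args) {
        Duration total = totalDelay(2_300_000L, Duration.ofMillis(1));
        // 2,300,000 ms is 2,300 s, i.e. a little over 38 minutes.
        System.out.println(total.toMinutes() + " minutes");
    }
}
```

That is in the same range as the 35+ minute startup delay observed in the test.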
I do not know what the Palo Alto Cortex agent is doing exactly, but given the delay in "getHostAddress()" I have a hunch that it might be checking its "end point protection" rules to see if the application is attempting to access a blocked host/address.
And since these URLs mainly point to bundled JARs (uber Spring jar), and we call it millions of times, these checks from the Cortex agent add up and become "expensive".
Glad we have a fix/workaround to make the API performant, but I am of the opinion that the real fix should come from the Palo Alto Java agent team.
I will share more details as soon as I learn more about this.
-sada
philwebb commentedon Jul 9, 2025
Thanks for the info, @sadaaithal.
sadaaithal commented on Jul 10, 2025
@philwebb my colleague and I reviewed the changes in 206785f and we have some feedback.
The code changes are designed to help JarFileUrlKey.java cache the URL object only when the host is non-null.
In a runnable-jar scenario, we think all URL objects representing a nested file resource (e.g. a jar file) will have a null host.
Here's one example of how a URL object and its field look inside the process heap:
With your patch, we will still try to cache this URL inside JarFileUrlKey.java since host is empty.
This triggers an implicit hashCode and/or equals call on the URL, which delegates to URLStreamHandler, which then invokes DNS APIs. (FYI: in a process heap generated sometime during a manually induced startup delay, only 2 out of 500k URL objects had the host field as null/empty.)
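The claim that using a URL as a map key implicitly calls into its URLStreamHandler can be observed with a small counting handler. The sketch below is hypothetical: CountingHandler and the "demo" protocol are made-up names, and it counts calls on a host-less URL, so no address resolution is involved in the demonstration itself.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical handler that counts how often URL.hashCode()/equals() reach it.
final class CountingHandler extends URLStreamHandler {

    final AtomicInteger hashCodeCalls = new AtomicInteger();
    final AtomicInteger equalsCalls = new AtomicInteger();

    @Override
    protected URLConnection openConnection(URL url) throws IOException {
        throw new IOException("This sketch does not open connections");
    }

    @Override
    protected int hashCode(URL u) {
        hashCodeCalls.incrementAndGet();
        return super.hashCode(u);
    }

    @Override
    protected boolean equals(URL u1, URL u2) {
        equalsCalls.incrementAndGet();
        return super.equals(u1, u2);
    }

    // Creates a URL bound to this handler instance; the constructor is deprecated on newer JDKs.
    @SuppressWarnings("deprecation")
    URL url(String spec) {
        try {
            return new URL((URL) null, spec, this);
        }
        catch (MalformedURLException ex) {
            throw new IllegalArgumentException(spec, ex);
        }
    }

    public static void main(String[] args) {
        CountingHandler handler = new CountingHandler();
        Map<URL, String> cache = new ConcurrentHashMap<>();
        // Two distinct URL objects with the same value, both used as map keys.
        cache.computeIfAbsent(handler.url("demo:/lib/app.jar"), URL::toExternalForm);
        cache.computeIfAbsent(handler.url("demo:/lib/app.jar"), URL::toExternalForm);
        System.out.println("hashCode calls: " + handler.hashCodeCalls.get());
        System.out.println("equals calls: " + handler.equalsCalls.get());
    }
}
```

Each new URL object computes its hash through the handler once, and a lookup that lands on an existing entry also goes through the handler's equals, which is where an agent hooked into getHostAddress() would be hit.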
So the code fix will not prevent the Palo Alto Cortex Java Agent from inspecting the DNS calls.
(I would still argue that this is an issue with the agent at this time, I can't find their repo here and I am trying to see how to get through to them).
Today we have two levels of cache
If we have to patch this on our end, then we need a cache that does not use URL objects as lookup keys.
So we have to control or eliminate the cache in JarFileUrlKey.
Some options
-sada
philwebb commentedon Jul 10, 2025
Thanks @sadaaithal, I've opened #46401 to see what we can do to address this.