java.lang.OutOfMemoryError

1. The problem

During load testing, we found our application runs into java.lang.OutOfMemoryError. Here is the stacktrace.


java.lang.OutOfMemoryError
java.util.zip.ZipFile.open(Native Method)
java.util.zip.ZipFile.(ZipFile.java:112)
java.util.jar.JarFile.(JarFile.java:127)
java.util.jar.JarFile.(JarFile.java:92)
org.apache.catalina.loader.WebappClassLoader.openJARs(WebappClassLoader.java:1544)
org.apache.catalina.loader.WebappClassLoader.findResourceInternal(WebappClassLoader.java:1763)
...

Using pmap to check the running Java process, it takes up 3G of memory, but the Java Heap Space is only 1G. So we guess there’s memory leak with threads. Each Java thread needs stack space, stack space is different from Java heap space. We didn’t touch -Xss Java parameter, so the default stack space per thread is 1M. Now that the memory usage is 3G, which means the system got a couple thousands of threads.


bash-3.00$ ps -eLf | grep java | wc
1837   25605  262751

2. First thing first thing, check environment and settings.

We have enough system memory, why it runs out of system memory so quickly at 3G?


bash-3.00$ uname -a
SunOS sedm3202 5.10 Generic_125100-10 sun4v sparc SUNW,Sun-Fire-T200
bash-3.00$ isainfo -kv
64-bit sparcv9 kernel modules
bash-3.00$ which java
/usr/bin/java
bash-3.00$ file /usr/bin/java
/usr/bin/java:  ELF 32-bit MSB executable SPARC Version 1, dynamically linked, not stripped, no debugging information available

It turns out that we are running 32-bit JVM on 64-bit SunOS! So we swtiched to 64-bit JVM right away. But the problem won’t go away, the system will only last a bit longer during load testing, the leaking is still happening.

3. Thread Dump

Jstack won’t work all the time when system is not responding, so we added the following jsp code.


<%@ page import="web.helper.HttpHelper" %>

<%
out.write("======================Thread dump==================");
out.write(HttpHelper.captureThreadDump());
%>
/**
* A utility method to capture thread dump for current running JVM.
*
* @return thread dump
*/
public static String captureThreadDump() {
Map<Thread, StackTraceElement[]> allThreads = Thread.getAllStackTraces();
Iterator<Thread> iterator = allThreads.keySet().iterator();
StringBuffer stringBuffer = new StringBuffer();
while (iterator.hasNext()) {
Thread key = (Thread) iterator.next();
StackTraceElement[] trace = allThreads.get(key);
stringBuffer.append(key + "rn");
for (int i = 0; i < trace.length; i++) {
stringBuffer.append(" " + trace[i] + "rn");
}
stringBuffer.append("");
}
return stringBuffer.toString();
}

4. The root cause

java.util.Timer TimerThread, they are all blocked by Object.wait().

A facility for threads to schedule tasks for future execution in a background thread. Tasks may be scheduled for one-time execution, or for repeated execution at regular intervals.

Corresponding to each Timer object is a single background thread that is used to execute all of the timer’s tasks, sequentially. Timer tasks should complete quickly. If a timer task takes excessive time to complete, it “hogs” the timer’s task execution thread. This can, in turn, delay the execution of subsequent tasks, which may “bunch up” and execute in rapid succession when (and if) the offending task finally completes.

We then use the BTrace to test how the Timer was used.

import com.sun.btrace.annotations.*;
import static com.sun.btrace.BTraceUtils.*;
import java.io.File;

@BTrace
public class TimerTester {

    @OnMethod(
      clazz="java.util.Timer", method=""
    )
    public static void onNew() {
        print("===onNew===");
        jstack();
    }
}

The run result showed the same error. So we traced further to the source code where the Timer is instantiated and found the root cause.

5. Using BTrace

a. Start target application

It’s a web application

b. Find the process id using “jps”

jps

c. Start btrace client tool

btrace <pid> TimerTester.java
(Visited 21 times, 1 visits today)