about greg
outofmemory: greg's blog
projects


Turning Off the JIT Compiler

A colleague and I were having trouble tracking down the root cause of a production issue recently and we learned an interesting trick involving the JIT compiler. The issue was that after a certain period of time the JVM would go from 60 second GC intervals to less than 2 second intervals, coupled with a spike in CPU usage from 3% to 30% (light load). There was no correlation found in the application logs, but we could show that the reduction in interval was occurring at the same time as the increase in CPU usage.

We used the Linux threading model to determine which process ID was causing the CPU spike and then mapped that PID back to a thread in the javacore dumps. Unfortunately, the thread’s stack was not printed to the java core dump, a common issue with the 1.4.2 JVM from IBM.

Figure 1: Javacore With JIT Compiler On

3XMTHREADINFO      "Thread-4" (TID:8DA4C10, sys_thread_t:917DA770, state:MW, native ID:3800D) prio=5
NULL           
3HPREGISTERS           Register Values
NULL                   ---------------
3HPREGVALUES            EAX : 00000007, EBX : 917DAA68, ECX : 00000000
3HPREGVALUES            EDX : 917DA770, ESI : 917DAB30, EDI : 91961C24
3HPREGVALUES            EBP : 91961C24, ESP : 91961B1C, EIP : 0361947F
3HPREGVALUES            EFLAGS : 00000282
NULL           
3HPNATIVESTACK         Native Stack of "Thread-4" PID 5232
NULL                   -------------------------
3HPSTACKLINE            sysMonitorEnter at 361947F in libhpi.so
3HPSTACKLINE            ?? at 34EC688 in libjvm.so
3HPSTACKLINE            ?? at 34EA244 in libjvm.so
3HPSTACKLINE            cacheAllocWithInitialization at 34EA466 in libjvm.so
3HPSTACKLINE            jitLockedHeapAlloc at 34EC964 in libjvm.so
3HPSTACKLINE            8E626553

Notice the NULL after 3XMTHREADINFO as well as the ?? symbols in the native stack. Clearly the core dump signal handler was unable to determine what was going on with this thread, either due to the JIT compilation or allowed time to process, I’m not sure.

We spoke to an IBM JVM engineer who recommended we try turning off the JIT compiler in order to ensure the java core signal handler catches every stack. The theory is that since the java core signal handler doesn’t have to translate the JIT compiled code back into Java byte code, it will be able to print all the threads stacks completely.

Turning off the JIT compiler really illustrated the impact it has to performance on the system. Without the machine language, the app took about eight times longer to start up! Low and behold, the java cores taken from this round of the production issue showed complete stacks.

Figure 2: Javacore With JIT Compiler Off

3XMTHREADINFO      "Thread-4" (TID:8DA4C10, sys_thread_t:90E97508, state:R, native ID:3800D) prio=5
4XESTACKTRACE          at java.lang.Object.wait(Native Method)
4XESTACKTRACE          at java.util.TimerThread.mainLoop(Timer.java:444)
4XESTACKTRACE          at java.util.TimerThread.run(Timer.java:397)
NULL           
3HPREGISTERS           Register Values
NULL                   ---------------
3HPREGVALUES            EAX : 00000001, EBX : 90E97800, ECX : 00000000
3HPREGVALUES            EDX : 90E97508, ESI : 90E978C8, EDI : 90F81F30
3HPREGVALUES            EBP : 90F81F40, ESP : 90F81E28, EIP : 03619BC4
3HPREGVALUES            EFLAGS : 00000296
NULL           
3HPNATIVESTACK         Native Stack of "Thread-4" PID 10874
NULL                   -------------------------
3HPSTACKLINE            sysMonitorWait at 3619BC4 in libhpi.so
3HPSTACKLINE            lkMonitorWait at 34E4F73 in libjvm.so
3HPSTACKLINE            JVM_MonitorWait at 347D518 in libjvm.so
3HPSTACKLINE            mmipSysInvokeJni at 3566EF8 in libjvm.so
3HPSTACKLINE            mmisInvokeJniMethodHelper at 35669FE in libjvm.so
3HPSTACKLINE            mmipInvokeJniMethod at 3567544 in libjvm.so
3HPSTACKLINE            L0_invokenonvirtual_quick__ at 353FBFA in libjvm.so
3HPSTACKLINE            mmipExecuteJava at 353A123 in libjvm.so
3HPSTACKLINE            90F821A8

Notice how not only is the stack area fleshed out, but there are no longer any ?? symbols in the native stack area. Each thread has complete information on what it was doing at the time of the dump.

With this version of the java core dump, we were able to find the infinite loop causing the issue (list manipulation in a while loop unprotected against negative flows). This tactic will save us some time in the future for sure.

Tags: , ,

Leave a Reply