Java Troubleshooting

JVM memory pool

The diagram below shows the different memory pools.

[Figure: JVM memory pool layout]

Heap Memory explanation

The JVM memory pool is split into different sections; the two main regions are the young generation space and the old generation space (Tenured). New objects from live Java threads are allocated into Eden space, each thread within its own Thread Local Allocation Buffer (avoiding lock contention on Eden, the single shared memory resource). Once Eden space is full, a minor garbage collection occurs, copying the objects which are still live into the "to" survivor space; the Eden space is then cleared. On the next minor GC the same process happens again, but objects are also copied from the currently filled survivor space (now acting as the "from" space) into the other one. This way one survivor space is always kept free so that copies can be made between them. Referenced objects bounce between the two survivor spaces at each minor GC until they either become unreferenced or survive the tenuring threshold. If the tenuring threshold is reached, an object is migrated to the old generation (Tenured); alternatively, objects which cannot fit into the survivor region can be promoted to the old generation early. Once the old generation is full or near full, a Full GC event occurs (subject to the GC algorithm). In a Full GC the JVM first identifies all the referenced objects. When that is done, the JVM sweeps the entire heap to reclaim all free memory (for example, memory held by objects that are now dead). Finally, the JVM moves the referenced objects together to defragment the old heap.

Thread Local Allocation Buffers

In Java each thread has its own TLAB, which lives only within Eden space. TLABs allow multi-threaded applications to allocate from concurrent threads without requiring locking or synchronization between them. Each thread's TLAB can be dynamically re-sized so that not too much space is wasted (different threads have different allocation rates). Three application-specific factors affect each thread's TLAB size: the size of the Eden space, the number of threads running, and the allocation rate of the threads. Therefore, if you need to increase the TLAB size, you can try increasing the Eden space, which gives the dynamic resizing of any given thread's TLAB more room to grow. You would only want to change the settings manually if many threads frequently perform allocations outside their TLABs; if only a few do, it is more effective to determine why those specific threads allocate so often.

TLABs can be disabled entirely (not recommended), or given a specific initial size with resizing disabled, as shown below.

-XX:TLABSize= -XX:-ResizeTLAB

Non-heap Memory explanation

Non-heap memory is used by the Permanent Generation and the code cache. The Permanent Generation contains all the reflective data of the virtual machine itself, such as class and method objects, and is collected as part of a Full GC.

Code Cache

The HotSpot Java VM uses a code cache: memory used for the compilation and storage of native code. Normally this area doesn't need tuning, but if you use the tiered compiler the chance of filling this region increases; if an error is seen, you can adjust the memory allocated using:

-XX:InitialCodeCacheSize= - Minimum size for code cache. 

-XX:ReservedCodeCacheSize= - Maximum reserved code cache size. The default is usually 32m, but is subject to Java version and platform (Solaris 64-bit, amd64 and -server x86: 2048m; in 1.5.0_06 and earlier, Solaris 64-bit and amd64: 1024m).

Off-heap memory

Caching topologies allow a Java process to cache objects off-heap, so that those objects do not affect GC timings, while still providing a cache for Java. The amount of off-heap memory to allocate can be specified with the flag below. The object must be serializable, and accessing it will be less efficient than accessing objects on the heap (as it must be serialized and de-serialized).

-XX:MaxDirectMemorySize
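As an illustration, NIO direct buffers are one common way a Java process allocates memory off-heap, and their total capacity is what -XX:MaxDirectMemorySize limits. A minimal sketch:

```java
import java.nio.ByteBuffer;

public class OffHeapSketch {
    public static void main(String[] args) {
        // 1 MB allocated in native memory, outside the Java heap;
        // counted against the -XX:MaxDirectMemorySize limit.
        ByteBuffer direct = ByteBuffer.allocateDirect(1024 * 1024);

        direct.putInt(0, 42);          // write at byte offset 0
        int value = direct.getInt(0);  // read it back

        System.out.println(direct.isDirect()); // prints "true"
        System.out.println(value);             // prints "42"
    }
}
```

Off-heap caching products layer serialization on top of allocations like this, which is why off-heap access costs more than reading an on-heap object.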

Thread Stack Space

When a new thread is created, the JVM pre-allocates a fixed-size block of memory for that thread's stack. By reducing the size of this memory block you can avoid running out of memory, especially if you have lots of threads: the memory saving is the reduction in stack size multiplied by the number of threads. The downside is that you increase the chance of a StackOverflowError. Thread stacks are created outside of the JVM heap, so even if there's plenty of memory available in the heap, you can still fail to create a thread stack by running out of native memory, which leads to the JVM crashing. The default varies between OS and Java version. (Generally not worth modifying unless on a 32-bit JVM.)
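As a worked example of that saving (the thread count and stack sizes below are hypothetical; real defaults vary by OS and JVM):

```java
public class StackMemoryMath {
    public static void main(String[] args) {
        int threads = 500;           // hypothetical number of threads
        long defaultStackKB = 1024;  // e.g. a 1MB per-thread default
        long reducedStackKB = 256;   // e.g. running with -Xss256k
        // Saving = (reduction in stack size) x (number of threads)
        long savedMB = threads * (defaultStackKB - reducedStackKB) / 1024;
        System.out.println("~" + savedMB + "MB of native memory saved"); // prints "~375MB of native memory saved"
    }
}
```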

To adjust the thread stack size, use the flag:

-Xss(value)k

The lowest value that can be used is 64k and the highest is 2M.

A stack overflow in Java language code will normally result in the offending thread throwing java.lang.StackOverflowError. Native C and C++ code, on the other hand, can write past the end of the stack and provoke a stack overflow the JVM cannot recover from.
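The Java-side behaviour can be observed with a sketch like the one below; the depth reached scales roughly with -Xss, though exact numbers vary by JVM and frame size:

```java
public class StackDepthDemo {
    static int depth = 0;

    static void recurse() {
        depth++;
        recurse(); // unbounded recursion eventually exhausts the thread stack
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The thread throws StackOverflowError rather than crashing the JVM
            System.out.println("StackOverflowError after " + depth + " frames");
        }
    }
}
```

Running it with `java -Xss256k StackDepthDemo` and again with `-Xss2m` shows the reachable depth growing with the stack size.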

Java methods share stack frames with C/C++ native code, namely user native code and the virtual machine itself. Java methods generate code that checks that stack space is available a fixed distance towards the end of the stack so that the native code can be called without exceeding the stack space. This distance towards the end of the stack is called “Shadow Pages.” The size of the shadow pages is between 3 and 20 pages, depending on the platform. This distance is tunable, so that applications with native code needing more than the default distance can increase the shadow page size. The option to adjust shadow pages is 

-XX:StackShadowPages=

Log files

The JVM will not log GC activity unless explicitly told to do so, using the flags below. Use the same log file throughout the process lifetime; rotation is also available to control disk space usage.

-XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=7 -Xloggc:/var/log/gc-debug.log -verbose:gc

Configuration files

Config is usually found in ${JBOSS_HOME}/bin/run.conf. Removing flags from the JVM is low risk: the JVM will then decide which garbage collector algorithm to use and when to run it, and will also set default values for other flags. Ensure a copy is made before changes are made, so that any problems encountered can be fixed quickly.

Profiling Application

Before explicitly setting any values on a server it is best to profile the application to see how it performs. We do this using tools such as JConsole, NetBeans and jvisualvm. Monitor the host remotely with these tools rather than running them locally; they present a GUI that makes it easy to see how heap and non-heap memory is being used, along with how Java objects are being handled. You could give the JVM flexible settings so that it is able to resize the heap as it sees fit; generally, dynamic heap sizing leads to growth and only rarely to shrink operations. See the attached file Flexible-JVM-settings.txt as an example.

Also view the GC debug logs and Zabbix data to get the bigger picture.

Diagnostics

Tools for Diagnosis of Common Problems

Problem: OutOfMemoryError
Symptoms: growing use of memory; frequent garbage collection

Possible causes and diagnostic tools:
  • A class with a high growth rate, or a class with an unexpected number of instances - Memory Map (jmap); see the jmap -histo option
  • An object is being referenced unintentionally - jconsole, or jmap with jhat; see the jmap -dump option
  • Objects are pending finalization - jconsole; jmap -dump with jhat

Problem: Threads block on object monitor or java.util.concurrent locks
Symptoms and diagnostic tools:
  • Thread CPU time is continuously increasing - jconsole with JTop
  • Thread with high contention statistics - jconsole

Diagnosis With JDK tools

This section describes how to diagnose common Java SE problems using JDK tools. The JDK tools enable you to obtain more diagnostic information about an application and help you to determine whether the application is behaving as it should. In some situations, the diagnostic information may be sufficient for you to diagnose a problem and identify its root cause. In other situations, you may need to use a profiling tool or a debugger to debug a problem.

For details about each tool, refer to the Java SE 6 tools documentation.

View all available flags and settings

Run the following command:

java -XX:+UnlockCommercialFeatures -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -XX:+PrintFlagsFinal -version

Note that the available options vary between Java versions and updates, so it is best not to assume non-standard settings will apply elsewhere (check using the command above if uncertain). The output prints the following columns for each flag, shown here with a specific example.

Type  Name  Value  {Category}

bool G1PrintHeapRegions = false {diagnostic}

Here bool is the type (boolean, true or false), G1PrintHeapRegions is the flag name, false is the value it has been set to, and {diagnostic} is the category it falls under, being one of: {diagnostic}, {product}, {pd product}, {C1 pd product}, {C2 product}, {C2 pd product}, {lp64_product}, {ARCH product}, {commercial}, {manageable} & {experimental}. Many flags exist (up to hundreds).
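Individual flags can also be inspected from inside a running process via the HotSpot-specific diagnostic MXBean; a small sketch (HotSpot JVMs only):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class FlagLookup {
    public static void main(String[] args) {
        HotSpotDiagnosticMXBean hotspot =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Read one flag's current value and where it came from
        // (DEFAULT, VM_CREATION i.e. command line, MANAGEMENT, ...)
        VMOption option = hotspot.getVMOption("MaxHeapSize");
        System.out.println(option.getName() + " = " + option.getValue()
                + " (origin: " + option.getOrigin() + ")");
    }
}
```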

Out of Memory errors

These errors can be caused by a few things: a memory leak in the application, memory in use by other processes and the OS, or not enough native memory available on the host. To diagnose, first take note of the specific error reported, then use jvisualvm to take heap dumps and a thread dump, and check back over previous Zabbix data for JVM-related metrics.

JVM Memory usage higher than N% of Availability 

Find out current flags that the JVM is using for the host. Check Zabbix monitoring graphs for: Eden space usage, survivor space usage, Old generation (Tenured) and GC related events. This alert is likely due to a memory leak.

Young Generation Guarantee

In an ideal minor collection the live objects are copied from one part of the young generation (the Eden space plus the first to-survivor space) to another part of the young generation (the second survivor space). However, there is no guarantee that all the live objects will fit into the second survivor space. To ensure that the minor collection can complete even if all the objects are live, enough free memory must be reserved in the tenured generation to accommodate all the live objects. In the worst case, this reserved memory is equal to the size of Eden plus the objects in the non-empty survivor space. This is why we don’t allocate greater than 50% of the total heap for the Young gen space.
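The arithmetic behind that 50% rule of thumb, using hypothetical sizes:

```java
public class YoungGenGuarantee {
    public static void main(String[] args) {
        long heapMB = 4096;                 // total heap (hypothetical)
        long youngMB = 2048;                // young gen at the 50% ceiling
        long tenuredMB = heapMB - youngMB;  // what remains for old gen
        // Worst case: every object in the young generation is live and
        // must be promoted, so tenured must be able to absorb all of it.
        boolean minorGcGuaranteed = tenuredMB >= youngMB;
        System.out.println("Worst-case promotion fits in tenured: " + minorGcGuaranteed);
    }
}
```

Push the young generation above 50% of the heap and the reserve can no longer cover the worst case.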

Types of Garbage Collectors

SerialGC

Flag -XX:+UseSerialGC. Only advisable on hosts with a single CPU and low memory-management requirements; for anything more than that, do not use this collector. It uses a single thread to perform all garbage collection work, which makes it relatively efficient since there is no communication overhead between threads. Introduced in Java 5.0; it does not come with any specific GC tunable flags.

CMS GC

Flag -XX:+UseConcMarkSweepGC. Designed for applications that prefer shorter garbage collection pauses and that can afford to share processor resources with the garbage collector while the application is running. During each major collection cycle, the concurrent collector pauses all the application threads for a brief period at the beginning of the collection and again toward the middle of the collection. The second pause tends to be the longer of the two, and multiple threads are used to do the collection work during that pause. The remainder of the collection, including the bulk of the tracing of live objects and sweeping of unreachable objects, is done with one or more garbage collector threads that run concurrently with the application (the thread count is tunable). So CMS uses stop-the-world pauses in two separate phases.

Flags Specific to CMS

-XX:+CMSIncrementalMode - Enables incremental mode, where concurrent phases of CMS are done incrementally, periodically stopping the concurrent phase to give the processor back to application threads. Recommended for hosts with 1 or 2 CPUs.

-XX:+CMSIncrementalPacing - Enables automatic control of the amount of work the Incremental CMS collector is allowed to do before giving up the processor, based on application behaviour. Enabled by default in Java 6 onwards with I-CMS.

-XX:+CMSScavengeBeforeRemark - Force young collection before remark phase. Can be used to avoid back to back GC cycles in some rarer cases.

-XX:CMSInitiatingOccupancyFraction= - Percentage of old generation occupancy at which the first CMS garbage collection cycle should start.

-XX:+UseCMSInitiatingOccupancyOnly - Instructs CMS to use the occupancy fraction above as the trigger for every CMS cycle, not just the first. Use in conjunction with the above flag.

-XX:+UseParNewGC - Parallel copying collector; runs collection over multiple threads within the young generation space. Auto-enabled when CMS is used.

-XX:+CMSPermGenSweepingEnabled - During a CMS collection in the old generation, also sweep Perm-Gen space and remove classes which are no longer used.

-XX:+CMSClassUnloadingEnabled - Enables class unloading during CMS cycles.

Phases

Once the GC is triggered, the CMS algorithm consists of a series of 6 phases that run in this order:

  1. Initial Mark - Pauses all application threads and marks all objects directly reachable from root objects as live. Stop-the-world event.
  2. Concurrent Mark - Application threads are restarted. All live objects are transitively marked as reachable by following references from the objects marked in the initial mark.
  3. Concurrent Preclean - Looks at objects which have been updated or promoted during the concurrent mark, and at new objects allocated during it, updating the mark bit to denote whether these objects are live or dead. This phase may be run repeatedly until a specified occupancy ratio is reached in Eden.
  4. Remark - Since some objects may have been updated during the preclean phase, it is still necessary to stop the world in order to process the residual objects. This phase does a retrace from the roots and also processes reference objects, such as soft and weak references. Stop-the-world event.
  5. Concurrent Sweep - Looks through the Ordinary Object Pointer (OOP) table, which references all objects in the heap, and finds the dead objects. The memory allocated to those objects is re-added to the freelist: the list of spaces from which an object can be allocated.
  6. Concurrent Reset - Resets all internal data structures so that CMS can run again in future.

Parallel GC

Flag -XX:+UseParallelGC. Multiple threads are used to speed up garbage collection. By default, only minor collections are executed in parallel; major collections are performed with a single thread. However, parallel compaction can be enabled with the option -XX:+UseParallelOldGC so that both minor and major collections are executed in parallel, to further reduce garbage collection overhead. Operates in a similar stop-the-world manner to the Serial GC. Introduced in Java 5.0 update 6.

Flags specific to Parallel collector

-XX:+UseParallelOldGCDensePrefix - Instructs the collector to use a dense prefix and not compact certain heap areas, so that less time is spent running compaction tasks when compaction occurs (see http://www.google.com/patents/US20080235305). Compaction squeezes all live objects towards the beginning of the heap; over time that part becomes very densely populated, with very low levels of fragmentation. A naïve implementation could end up spending too much time compacting dense areas with no return on investment; instead, a dense prefix acts like a watermark denoting the optimal point at which compaction should start. Only applies in Java versions 5 and 6; specifics are TBC.

-XX:InitialTenuringThreshold= - Sets the initial tenuring threshold for use in adaptive GC sizing in the parallel young collector.

-XX:MaxGCPauseMillis= - Sets a target for the maximum GC pause time for both minor and major GC; interpreted as a hint that pause times of this many milliseconds or less are desired. By default there is no maximum pause time goal.

-XX:GCTimeRatio= - Sets the ratio of garbage collection time to application time to 1 / (1 + N). Default value 99, resulting in a goal of 1% of the total time spent in garbage collection.

-XX:InitialSurvivorRatio= - Should be used when adaptive sizing is enabled and a desire to specifically initially size survivor spaces.

-XX:-UseAdaptiveSizePolicy - By default adaptive sizing is auto-enabled and only works with this collector. Automatically re-sizes survivor spaces as the application behaviour warrants. This setting will disable adaptive sizing. If you enable adaptive sizing on any other collector it will have no effect.

-XX:+PrintAdaptiveSizePolicy - Prints adaptive-sizing information during execution; for use with debug GC logging.

-XX:+UseNUMA - Enables a JVM heap space allocation policy that helps overcome the time it takes to fetch data from memory by leveraging processor to memory node relationships by allocating objects in a memory node local to a processor on NUMA (Non Uniform Memory Access)  systems. Available in Java 6 Update 2. Only useful when dealing with servers that have more than one physical CPU.

-XX:ParallelGCThreads= - Sets the number of threads used during garbage collection. The default varies per server platform; it is usually the same as the number of CPU cores, and no more threads than available cores will run even if over-provisioned. The calculation used by default is TBC.

Below is a basic visual comparison of these collectors, with blue indicating application threads and orange indicating GC threads. (Not fully accurate; just a rough idea.)

[Figure: application threads (blue) vs GC threads (orange) for each collector]

G1GC (Garbage-First)

Flag -XX:+UseG1GC. The heap is utilized in a different manner to other collectors: it is partitioned into a set of equal-sized heap regions, aiming for near but not over 2,048 regions. Eden, survivor and old generations are logical sets of these regions. Humongous objects are any objects larger than half a region size; these are allocated directly in humongous regions of the old generation. G1 performs a concurrent global marking phase to determine the liveness of objects throughout the heap. After the mark phase completes, the G1 collector knows which regions are mostly empty and collects from these regions first, as they will likely contain the least live objects. It uses the Snapshot-At-The-Beginning (SATB) algorithm, which takes a snapshot of the set of live objects in the heap at the start of a marking cycle. This collector works within a set of parameters to meet pause time goals, making the time spent performing GC more predictable than with other collectors. Normally used with a large heap (6G+) where GC time has to be kept low and predictable pause times are needed. Note that G1 is planned to replace CMS in the future, and development is continuing so that it can be used effectively with smaller heap sizes. First released in Java 6 Update 14; fully supported from Java SE 7 Update 4.

Flags specific to G1

-XX:G1HeapRegionSize= - Sets the size of a G1 region. The value will be a power of two and can range from 1MB to 32MB. Default will vary on memory allocated to Java process.

-XX:MaxGCPauseMillis= - Sets a target value for desired maximum pause time. The default value is 200 milliseconds.

-XX:G1MixedGCLiveThresholdPercent= - Sets occupancy threshold for an old region to be included in a mixed garbage collection cycle. Default is 65%, may be classed as experimental

-XX:G1ReservePercent= - Sets the percentage of reserve memory to keep free to reduce the risk of to-space overflows. Default is 10%

-XX:G1NewSizePercent= - Sets the percentage of the heap to use as the minimum for the young generation size. Default value 5%, may be classed as experimental

-XX:G1MaxNewSizePercent= - Sets the percentage of the heap size to use as the maximum for young generation size. Default value 60%

-XX:ParallelGCThreads= - Amount of STW GC threads. By default is set to the same as the number of logical processors up to a value of 8.

-XX:ConcGCThreads= - Number of parallel marking threads. By default approximately 1/4 of the number of parallel garbage collection threads.

-XX:+PrintStringDeduplicationStatistics - Prints detailed deduplication statistics. Default is disabled. Available in Java 8 update 20

-XX:+UseStringDeduplication - String deduplication feature to save on memory allocated by String objects: instead of each String object pointing to its own character array, identical String objects can point to and share the same character array. Default is disabled; requires G1. Available in Java 8 update 20.

Phases

  • Initial mark phase: G1 GC marks the roots during this phase. This phase is piggybacked on a normal stop the world young garbage collection.
  • Root region scanning phase: G1 GC scans survivor regions of the initial mark for references to the old generation and marks the referenced objects. This phase runs concurrently with the application and must complete before the next stop the world young garbage collection can start.
  • Concurrent marking phase: G1 GC finds reachable (live) objects across the entire heap. This phase happens concurrently with the application, and can be interrupted by stop the world young garbage collections.
  • Remark phase: This phase is stop the world collection and helps the completion of the marking cycle. G1 GC drains Snapshot At The Beginning (SATB) buffers, traces unvisited live objects, and performs reference processing.
  • Cleanup phase: G1 GC performs the stop-the-world operations of accounting and Remembered Set (RSet) scrubbing. During accounting, the G1 GC identifies completely free regions and mixed garbage collection candidates. The cleanup phase is partly concurrent when it resets and returns the empty regions to the free list.

Below illustrates a G1 heap

[Figure: G1 heap region layout]

Fun with Flags

Quite a few common flags you will come across are listed below.

-Xms & -Xmx(m,g) - These options specify the minimum and maximum total heap space. They can be set to different values, which allows the JVM to dynamically resize the heap as it sees fit. Setting them to the same value improves performance by fixing the heap size (the overhead of computing a new heap size is removed from the equation). This, however, should only be done once you have profiled your application's needs against the JVM with logging enabled; otherwise a fixed value may not be optimal if the application undergoes dramatic changes in object allocation (due to highly variable loads).

-XX:SurvivorRatio= - Sets the ratio between Eden and the survivor spaces. Too small a survivor ratio means copying collections overflow directly into the tenured generation; too large a ratio leaves them uselessly empty. By default this is set to 8, and as long as the survivor regions are close to 50% utilization per minor GC cycle you won't need to change it. If they are almost always at 0% or 100%, it is essential to fix this, otherwise knock-on effects will occur (additional GC activity).
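To see how the ratio carves up the young generation, here is the arithmetic with a hypothetical 600MB young generation (-Xmn600m):

```java
public class SurvivorRatioMath {
    public static void main(String[] args) {
        long youngMB = 600;      // hypothetical -Xmn600m
        int survivorRatio = 8;   // -XX:SurvivorRatio=8 -> Eden is 8x one survivor space
        // Young gen = Eden + 2 survivor spaces = (ratio + 2) survivor-sized chunks
        long survivorMB = youngMB / (survivorRatio + 2);
        long edenMB = youngMB - 2 * survivorMB;
        // prints "Eden=480MB, each survivor space=60MB"
        System.out.println("Eden=" + edenMB + "MB, each survivor space=" + survivorMB + "MB");
    }
}
```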

-XX:+UseLargePages - Large page support is used to optimize the processor's Translation-Lookaside Buffers. Sometimes using large page memory can negatively affect system performance: for example, when a large amount of memory is pinned by an application, it may create a shortage of regular memory and cause excessive paging in other applications, slowing down the entire system.

-XX:+DisableExplicitGC - Instructs the JVM to ignore any calls to System.gc(), which is an expensive operation requesting the JVM to do a Full Major GC. (Java code can call upon System.gc()). Note JVM will still perform GC when required.
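What this flag suppresses are calls like the one below; note that even without -XX:+DisableExplicitGC, System.gc() is only a request:

```java
public class ExplicitGcDemo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long beforeKB = (rt.totalMemory() - rt.freeMemory()) / 1024;

        // A request, not a command: with -XX:+DisableExplicitGC this is a no-op,
        // and even without it the JVM may defer the collection.
        System.gc();

        long afterKB = (rt.totalMemory() - rt.freeMemory()) / 1024;
        System.out.println("Used before: " + beforeKB + "KB, after: " + afterKB + "KB");
    }
}
```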

-Dsun.rmi.dgc.server.gcInterval= - recommended to set anywhere from 20-30 minutes and is only used for distributed garbage collected objects.

-XX:NewSize= & -XX:MaxNewSize= - Sets the young generation initial and maximum size (the Eden and survivor spaces). Recommended to be 25-50% of the total allocated heap size. Do not use with G1.

-XX:NewSizeThreadIncrease= - Sets the increment size when different new and maxnew size are used.

-Xmn - Same as newsize flags combined, is used to bind the Young Generation to a fixed size. Do not use with G1

-XX:PermSize= & -XX:MaxPermSize= - Permanent generation initial and maximum size. Generally has no noticeable impact on garbage collector performance, and is not a concern most of the time unless this memory region is filling up, which will force full collections regardless of the collector chosen. Note PermGen was removed in Java 8.

-XX:MetaspaceSize= & -XX:MaxMetaspaceSize= - Used only in Java 8 onwards as the replacement for PermGen. Sets the initial and maximum sizes. By default the maximum is unlimited; subject to change.

-XX:MaxTenuringThreshold= - Maximum value for tenuring threshold. Default value is 15.

Less common, performance option flags

-XX:+TieredCompilation - Increases startup speed of the server VM. Normally, a server VM uses the interpreter to collect profiling information about methods that is fed into the compiler. In the tiered scheme, in addition to the interpreter, the client compiler is used to generate compiled versions of methods that collect profiling information about themselves. Since the compiled code is substantially faster than the interpreter, the program executes with greater performance during the profiling phase. In some cases, a startup that is even faster than with the client VM can be achieved because the final code produced by the server compiler may be already available during the early stages of application initialization. The tiered scheme can also achieve better peak performance than a regular server VM because the faster profiling phase allows a longer period of profiling, which may give better optimization. Available in Java SE 7 & some late versions of Java 6

-XX:+UseCompressedOops - Only for use on 64-bit JVMs with a total heap below 32GB. Enables the use of compressed pointers (object references represented as 32-bit offsets instead of 64-bit pointers), almost halving the space taken by ordinary object pointers from 64 bits to 32 bits per reference. There are almost no reasons against using this apart from the previously stated heap size requirement and the minimum Java version. Introduced in Java 6 Update 14 and enabled by default from Java 6 Update 18.

-XX:+AlwaysPreTouch - Pre-touch the Java heap during JVM initialization. Every page of the heap is thus demand-zeroed during initialization rather than incrementally during application execution.

-XX:+AggressiveOpts - Enables point performance compiler optimizations, will use the heap aggressively. Changes grouped by this flag are minor changes to JVM runtime compiled code and not distinct performance flags. Note specific optimizations enabled by this option can change from release to release and even build to build. Experimental setting.

-XX:+UseBiasedLocking - Enables biased locking,  a technique for improving the performance of uncontended synchronization. An object is “biased” toward the thread which first acquires its monitor through a monitorenter bytecode or synchronized method invocation; subsequent monitor-related operations performed by that thread are relatively much faster on multiprocessor machines. Some applications with significant amounts of uncontended synchronization may attain significant speedups with this flag enabled; some applications with certain patterns of locking may see slowdowns.
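Biased locking targets exactly the pattern sketched below: a monitor repeatedly acquired by a single thread with no contention. (The bias itself is invisible in code; only the locking cost changes.)

```java
public class UncontendedLockDemo {
    int count = 0;

    // Every call synchronizes on 'this', but only one thread ever does so:
    // the uncontended pattern that biased locking optimizes.
    synchronized void increment() {
        count++;
    }

    public static void main(String[] args) {
        UncontendedLockDemo demo = new UncontendedLockDemo();
        for (int i = 0; i < 1_000_000; i++) {
            demo.increment();
        }
        System.out.println(demo.count); // prints "1000000"
    }
}
```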

-XX:+DoEscapeAnalysis - Enables the escape analysis optimization. An object allocated by some executing thread "escapes" if some other thread can ever see it. Introduced in Java 6 Update 14.

-XX:+OptimizeStringConcat - Optimize String concatenation operations where possible. Default Enabled, introduced in Java 6 Update 20.

-XX:+UseStringCache - Enables caching of commonly allocated strings. Default disabled.

-XX:StringTableSize= - Allow tuning of String cache when using many interned Strings, set to a prime number. Default varies between Java versions, Introduced in Java 7 update 40

-XX:+PrintStringTableStatistics - print hash table statistics, used prior to modifying the StringTableSize. Introduced in Java 7 update 40
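The string table these two flags concern holds interned strings; interning makes equal strings share one canonical instance, as this small sketch shows:

```java
public class InternDemo {
    public static void main(String[] args) {
        String a = new String("hello");      // a distinct heap object
        String b = "hello";                  // the literal, already interned

        System.out.println(a == b);          // prints "false" (different objects)
        System.out.println(a.intern() == b); // prints "true" (same string-table entry)
    }
}
```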

Garbage Collection general rules

  • The smaller the heap, the faster a Full GC is performed; however, because the heap is smaller, collections will take place more often.
  • The best heap size depends on the application type. All tuning and heap parameters are application specific, and what works in one situation may not work as well in another. Generally speaking, applications which have a large amount of live data and are interactive are usually better off with a smaller heap (3-6GB) along with a larger share (up to 50%) of the total heap allocated to the young generation. Applications which process work in batches and aren't interactive tend to be better off with a larger heap size.

Garbage Collection Failures

These happen because not enough space is available in the heap for a Java object to be allocated; there are a few potential causes for this. An example of a collection failure can be seen in a log file and may appear as follows:

CMS (concurrent mode failure)   

The underlying issue could be a memory leak or not enough heap space allocated. For example, if CMS is the selected collector and a concurrent mode failure occurs, a stop-the-world Full GC takes place, since CMS has failed for some reason.

JVM 32-bit vs 64-bit

A 32-bit JVM uses 32-bit object references, which makes memory use more efficient, as each reference takes 50% less space than the 64-bit references used in 64-bit JVMs. However, a 32-bit JVM limits you to an almost 4G maximum heap.

However in more recent java versions 64 bit JVM’s make use of -XX:+UseCompressedOops.

Also, 64-bit implementations can be forced to use the 32-bit architecture by adding -d32 to the Java process options.

 

Known Problems

Typically, problems in a Java SE application are linked to critical resources such as memory, threads, classes, and locks. Resource contention or leakage may lead to performance issues or unexpected errors. Table 1 summarizes some common problems and their symptoms in Java SE applications and lists the tools that developers can use to help diagnose each problem’s source.

Insufficient Memory 

The Java Virtual Machine (JVM) has the following types of memory: heap, non-heap, and native.

Heap memory is the runtime data area from which memory for all class instances and arrays is allocated. Non-heap memory includes the method area and memory required for the internal processing or optimization of the JVM. It stores per-class structures such as a runtime constant pool, field and method data, and the code for methods and constructors. Native memory is the virtual memory managed by the operating system. When the memory is insufficient for an application to allocate, a java.lang.OutOfMemoryError will be thrown.

Following are the possible error messages for  OutOfMemoryErrors in each type of memory:

  • Heap memory error. When an application creates a new object but the heap does not have sufficient space and cannot be expanded further, an  OutOfMemoryError will be thrown with the following error message:  
     

    java.lang.OutOfMemoryError: Java heap space
    

      

  • Non-heap memory error. The  permanent generation is a non-heap memory area in the HotSpot VM implementation that stores per-class structures as well as  interned strings. When the permanent generation is full, the application will fail to load a class or to allocate an interned string, and an  OutOfMemoryError will be thrown with the following error message:  
     

    java.lang.OutOfMemoryError: PermGen space
    

      

  • Native memory error. The Java Native Interface (JNI) code or the native library of an application and the JVM implementation allocate memory from the native heap. An OutOfMemoryError will be thrown when an allocation in the native heap fails. For example, the following error message indicates insufficient swap space, which could be caused by a configuration issue in the operating system or by another process in the system that is consuming much of the memory:  
     

    java.lang.OutOfMemoryError: request <size> bytes for <reason>.
    Out of swap space?
    

      

An insufficient memory problem could be due either to a problem with the configuration (the application really needs that much memory) or to a performance problem in the application that requires you to profile and optimize to reduce the memory use. Configuring memory settings and profiling an application to reduce the memory use are beyond the scope of this article, but you can refer to the HotSpot VM Memory Management white paper (PDF) for relevant information or use a profiling tool such as the NetBeans IDE Profiler.
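To see the "Java heap space" error in a controlled setting, the following sketch (the class name HeapExhaustion is mine) keeps allocations strongly referenced until the heap fills up, then catches the resulting error:

```java
import java.util.ArrayList;
import java.util.List;

public class HeapExhaustion {
    // Keeps allocating 1 MB chunks that stay strongly referenced until the
    // heap fills up; returns how many chunks fit before the error was thrown.
    public static int allocateUntilOom() {
        List<byte[]> retained = new ArrayList<>();
        try {
            while (true) {
                retained.add(new byte[1024 * 1024]); // still referenced, so never collectible
            }
        } catch (OutOfMemoryError expected) {
            int chunks = retained.size();
            retained.clear(); // drop the references so the heap can recover
            return chunks;
        }
    }

    public static void main(String[] args) {
        System.out.println("heap held ~" + allocateUntilOom() + " MB before OutOfMemoryError");
    }
}
```

Catching OutOfMemoryError like this is only reasonable in a demonstration; in production the error usually indicates the JVM is already in a degraded state.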

Memory Leaks 

The JVM is responsible for automatic memory management, which reclaims unused memory for the application. However, if an application keeps a reference to an object that it no longer needs, the object cannot be garbage collected and will occupy heap space until the reference is released. Such unintentional object retention is referred to as a memory leak. If the application leaks large amounts of memory, it will eventually run out of memory, and an OutOfMemoryError will be thrown. In addition, garbage collection may take place more frequently as the application attempts to free up space, thus causing the application to slow down.
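The pattern behind most leaks is a long-lived collection that objects are added to but never removed from. The following sketch (class name LeakDemo is mine) makes the retention observable with a WeakReference, which the collector clears only once the object is otherwise unreachable:

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

public class LeakDemo {
    // A "cache" that is only ever added to: the classic unintentional-retention pattern.
    static final List<Object> CACHE = new ArrayList<>();

    // Adds an object to the cache and returns a weak reference to it so that
    // we can observe whether the collector was able to reclaim it.
    public static WeakReference<Object> leakOne() {
        Object payload = new byte[1024];
        CACHE.add(payload);                  // forgotten reference keeps it alive
        return new WeakReference<>(payload);
    }

    public static void main(String[] args) {
        WeakReference<Object> ref = leakOne();
        System.gc();
        // Still strongly reachable through CACHE, so the collector must keep it.
        System.out.println("collected while cached? " + (ref.get() == null));
        CACHE.clear();
        System.gc();
        // Now unreachable; HotSpot will normally clear the weak reference here,
        // although the specification does not strictly guarantee when.
    }
}
```

As long as the object sits in CACHE, ref.get() is guaranteed to be non-null no matter how many garbage collections run; that is exactly the retention a heap histogram or heap dump will surface.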

Ways to Diagnose a Memory Leak 

A memory leak may take a very long time to reproduce, particularly if it happens only under very rare or obscure conditions. Ideally, the developer would diagnose a memory leak before an OutOfMemoryError occurs.

First, use JConsole to monitor whether the memory usage is growing continuously. This is an indication of a possible memory leak. Figure 1 shows the Memory tab of JConsole connecting to an application named MemLeak that shows an increasing usage of memory. You can also observe the garbage collection (GC) activities in the box inset within the Memory tab.

Figure 1: The Memory tab shows increasing memory usage, which is an indication of a possible memory leak. 

You can also use the  jstat command to monitor the memory usage and garbage collection statistics as follows:

  $ <JDK_HOME>/bin/jstat -gcutil <pid> <interval> <count>

The jstat -gcutil option prints a summary of the heap utilization and garbage collection time of the running application with process ID <pid>, taking a sample every <interval> milliseconds for <count> samples. This produces the following sample output:

  S0     S1     E      O      P     YGC   YGCT    FGC    FGCT     GCT
  0.00   0.00  24.48  46.60  90.24  142   0.530   104   28.739   29.269
  0.00   0.00   2.38  51.08  90.24  144   0.536   106   29.280   29.816
  0.00   0.00  36.52  51.08  90.24  144   0.536   106   29.280   29.816
  0.00  26.62  36.12  51.12  90.24  145   0.538   107   29.552   30.090

For details about the jstat output and other options to obtain various VM statistics, refer to the jstat man page.

Heap Histogram 

When you suspect a memory leak in an application, the  jmap command will help you get a heap histogram that shows the per-class statistics, including the total number of instances and the total number of bytes occupied by the instances of each class. Use the following command line:

$ <JDK_HOME>/bin/jmap -histo:live <pid>

 

The heap histogram output will look similar to this:

num   #instances    #bytes  class name
--------------
  1:    100000    41600000  [LMemLeak$LeakingClass;
  2:    100000     2400000  MemLeak$LeakingClass
  3:     12726     1337184  
  4:     12726     1021872  
  5:       694      915336  [Ljava.lang.Object;
  6:     19443      781536  
  7:      1177      591128  
  8:      1177      456152  
  9:      1117      393744  
 10:      1360      246632  [B
 11:      3799      238040  [C
 12:     10042      160672  MemLeak$FinalizableObject
 13:      1321      126816  java.lang.Class
 14:      1740       98832  [S
 15:      4004       96096  java.lang.String
 < more .....>

 

The jmap -histo option requests a heap histogram of the running application with process ID <pid>. You can specify the live suboption so that jmap counts only live objects in the heap. To count all objects, including the unreachable ones, use the following command line:

$ <JDK_HOME>/bin/jmap -histo <pid>

 

It may sometimes be useful to determine what objects will be garbage collected by comparing two heap histograms: one that counts all objects including the unreachable ones and another that counts only the live objects. From one or more heap histogram snapshots, you can attempt to identify the class that may have a memory leak, which typically has any of the following characteristics:

  • Its instances occupy unexpectedly large amounts of memory.
  • The number of instances of the class is growing over time at a high rate.
  • Class instances that you would expect to be garbage collected are not.

The preceding heap histogram obtained by the  jmap utility indicates that  LeakingClass and its array have the largest instance counts, so they are the leak suspects.

The heap histogram sometimes provides you with the information you need to diagnose a memory leak. For example, if the application uses the leaking class in only a few places, you can easily locate the leak in the source code. On the other hand, if the leaking class is widely used in the application, such as the  java.lang.String class, you will need to trace the references to an object and diagnose further by analyzing a heap dump.

Heap Dump 

You can obtain a heap dump in any of the following ways. First, you can use the  jmap command to get a heap dump with this command line:

$ <JDK_HOME>/bin/jmap -dump:live,file=heap.dump.out,format=b <pid>

 

This produces the following sample output:

Dumping heap to d:\demo\heap.dump.out ...
Heap dump file created

 

The jmap -dump option requests that a heap dump of the running application with process ID <pid> be written to the specified file, heap.dump.out. As with the -histo option, the live suboption is optional and specifies that only live objects should be dumped.

The second method is to get a heap dump from JConsole by invoking the  dumpHeap operation of the HotSpotDiagnostic MBean, as Figure 2 indicates.

Figure 2: Obtain a heap dump by invoking the dumpHeap operation of the HotSpotDiagnostic MBean. 

 

This is particularly useful and convenient when you are using JConsole to monitor the application because you can do monitoring and troubleshooting with a single tool. In addition, JConsole allows you to connect to an application remotely, and thus you can request a heap dump from another machine.
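The dumpHeap operation that JConsole invokes is also callable in-process through the com.sun.management.HotSpotDiagnosticMXBean. This API is HotSpot-specific rather than part of the Java SE specification, and the class name HeapDumper below is mine:

```java
import java.io.File;
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

public class HeapDumper {
    // Writes a heap dump of the current JVM to the given path. If liveOnly is
    // true, a full GC runs first so only live objects are dumped.
    public static File dump(File file, boolean liveOnly) throws Exception {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap refuses to overwrite an existing file, and recent JDKs
        // require the target filename to end in .hprof.
        bean.dumpHeap(file.getPath(), liveOnly);
        return file;
    }

    public static void main(String[] args) throws Exception {
        File out = new File(System.getProperty("java.io.tmpdir"),
                "demo-" + System.nanoTime() + ".hprof");
        dump(out, true);
        System.out.println("Wrote " + out.length() + " bytes to " + out);
        out.delete();
    }
}
```

This is handy for taking a dump automatically at an interesting point in the program, for example when a cache exceeds an expected size.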

You have now read about two ways to obtain a heap dump at runtime. You can also request that a heap dump be created when an  OutOfMemoryError is first thrown by setting the HeapDumpOnOutOfMemoryError HotSpot VM option. You can set this option on the command line when you start the application:

$ <JDK_HOME>/bin/java -XX:+HeapDumpOnOutOfMemoryError ...

 

This option can also be set while the application is running by using the  jinfo command:

$ <JDK_HOME>/bin/jinfo -flag +HeapDumpOnOutOfMemoryError <pid>

 

And lastly, the  HeapDumpOnOutOfMemoryError option can be set with JConsole by invoking the setVMOption operation of the HotSpotDiagnostic MBean, as in Figure 3.

Figure 3: Set a VM option by invoking the  setVMOption operation of the HotSpotDiagnostic MBean. 

 

When an OutOfMemoryError is thrown, a heap dump file named java_pid<pid>.hprof will be created automatically:

java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid1412.hprof ...
Heap dump file created [68354173 bytes in 4.416 secs]
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at MemLeak.consumeMemory(MemLeak.java:25)
        at MemLeak.main(MemLeak.java:6)

Heap Analysis 

Once you have the heap dump, you can use the  jhat command to do the heap analysis and determine which references are keeping the leak suspect alive:

$ <JDK_HOME>/bin/jhat heap.dump.out

 

This produces the following sample output: 

Reading from heap.dump.out...
Dump file created Tue Jul 20 12:05:59 PDT 2006
Snapshot read, resolving...
Resolving 283482 objects...
Chasing references, expect 32 dots..........................................
Eliminating duplicate references............................................
Snapshot resolved.
Started HTTP server on port 7000
Server is ready.

 

The jhat utility, the heap analysis tool formerly known as HAT, reads a heap dump and starts an HTTP server on a specified port (7000 by default). You can then use any browser to connect to the server and execute queries on the heap dump. Figure 4 shows all classes excluding java.* and javax.* in the heap dump that jhat analyzes. The tool supports a number of queries, including the following:

  • Show all reference paths from the root set to a given object. This is particularly useful for finding memory leaks.
  • Show the instance counts for all classes.
  • Show the heap histogram including the instance counts and sizes for all classes.
  • Show the finalizer summary.
Figure 4: The heap dump shows all classes other than  java.* and  javax.*

 

You can also develop your own custom queries with the built-in Object Query Language (OQL) interface to drill down through a specific problem. For example, if you want to find all java.lang.String objects of string length 100 or more, you can enter the following query in the OQL query page:

select s from java.lang.String s where s.count >= 100

 

Finalizers 

Another possible cause of an OutOfMemoryError is the excessive use of finalizers. The java.lang.Object class has a protected method called finalize. A class can override this finalize method to dispose of system resources or to perform cleanup before an object of that class is reclaimed by garbage collection. The finalize method that can be invoked for an object is called the finalizer of that object. There is no guarantee as to when a finalizer will run, or that it will run at all. An object that has a finalizer will not be garbage collected until its finalizer has run. Thus, objects that are pending finalization retain memory even though they are no longer referenced by the application, and this can lead to a problem similar to a memory leak.

Ways to Diagnose Excessive Use of Finalizers 

Excessive use of finalizers retains memory and prevents the application from quickly reclaiming that memory. Such excessive use can cause an  OutOfMemoryError. As Figure 5 shows, you can use JConsole to monitor the number of objects pending for finalization.

Figure 5: The VM tab in JConsole shows the number of objects pending for finalization. 

 

You can also find out what the finalizable objects are in the heap dump using  jhat as described earlier.

In addition, on the Solaris and Linux operating systems, you can use the  jmap utility to find the classes of the finalizable objects:

$ <JDK_HOME>/bin/jmap -finalizerinfo <pid>
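The pending-finalization count that JConsole displays is also available programmatically from the MemoryMXBean. A minimal sketch (the FinalizerWatch and Finalizable class names are mine; note that finalize is deprecated in modern JDKs, which makes this pattern worth diagnosing in the first place):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class FinalizerWatch {
    // A class with a finalizer: instances need at least two GC cycles to be
    // reclaimed, because they must be finalized before their memory is freed.
    static class Finalizable {
        @Override
        protected void finalize() { /* pretend cleanup */ }
    }

    // Approximate number of objects queued for finalization right now.
    public static int pendingFinalization() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getObjectPendingFinalizationCount();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i++) {
            new Finalizable(); // immediately garbage, but must be finalized first
        }
        System.gc();
        System.out.println("objects pending finalization: " + pendingFinalization());
    }
}
```

A count that stays persistently high suggests finalizers are running slower than objects are being created, which is the retention problem described above.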


Deadlocks 

A deadlock occurs when two or more threads are each waiting for another to release a lock. The Java programming language uses monitors to synchronize threads. Each object is associated with a monitor, which can also be referred to as an object monitor. If a thread invokes a synchronized method on an object, that object is locked. Another thread invoking a synchronized method on the same object will block until the lock is released. Besides the built-in synchronization support, the java.util.concurrent.locks package that was introduced in J2SE 5.0 provides a framework for locking and waiting for conditions. Deadlocks can involve object monitors as well as java.util.concurrent locks.

Typically, a deadlock causes the application or part of the application to become unresponsive. For example, if a thread responsible for the graphical user interface (GUI) update is deadlocked, the GUI application freezes and does not respond to any user action.

Ways to Diagnose Deadlocks 

Java SE 6 provides two very convenient ways to find out whether a deadlock has occurred in an application and also enhances the deadlock detection facility to support java.util.concurrent locks. Both JConsole and the jstack command can find deadlocks that involve object monitors -- that is, locks that are obtained using the synchronized keyword -- or java.util.concurrent ownable synchronizers.

Figure 6 shows that there are two deadlocks in the Deadlock application, and the Deadlock 2 tab shows the three deadlocked threads that are blocked on an object monitor. Each deadlock tab shows the list of threads involved in the deadlock, identifies which lock a thread is blocked on, and indicates which thread owns that lock.


Figure 6: JConsole detects two deadlocks and provides details. 

You can also use the  jstack utility to get a thread dump and detect deadlocks:

$ <JDK_HOME>/bin/jstack <pid>

Following is the bottom part of a sample jstack output that detects one deadlock involving a java.util.concurrent ownable synchronizer. 
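The same detection facility that JConsole and jstack use is exposed through ThreadMXBean.findDeadlockedThreads. The sketch below (class name DeadlockDetector is mine) deliberately constructs the classic two-lock deadlock and then asks the VM to find it:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockDetector {
    // Builds a two-lock deadlock and returns the deadlocked thread IDs as
    // reported by the platform ThreadMXBean (null until one is detected).
    public static long[] provokeAndDetect() throws InterruptedException {
        final Object lockA = new Object();
        final Object lockB = new Object();
        // Both threads must hold their first lock before either tries the
        // second, which guarantees the deadlock occurs.
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread t1 = new Thread(() -> {
            synchronized (lockA) {
                bothHoldFirstLock.countDown();
                await(bothHoldFirstLock);
                synchronized (lockB) { } // never reached
            }
        }, "deadlock-1");
        Thread t2 = new Thread(() -> {
            synchronized (lockB) {
                bothHoldFirstLock.countDown();
                await(bothHoldFirstLock);
                synchronized (lockA) { } // never reached
            }
        }, "deadlock-2");
        t1.setDaemon(true); // daemon threads, so the JVM can still exit
        t2.setDaemon(true);
        t1.start();
        t2.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = null;
        for (int i = 0; i < 100 && ids == null; i++) { // poll for up to ~10 s
            Thread.sleep(100);
            ids = threads.findDeadlockedThreads(); // also finds j.u.c. deadlocks
        }
        return ids;
    }

    private static void await(CountDownLatch latch) {
        try { latch.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) throws InterruptedException {
        long[] ids = provokeAndDetect();
        System.out.println(ids == null ? "no deadlock" : ids.length + " deadlocked threads");
    }
}
```

Calling findDeadlockedThreads periodically from a watchdog thread is a lightweight way to get the same alert JConsole raises, without attaching an external tool.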

Looping Threads 

Looping threads can also cause an application to hang. When one or more threads are executing in an infinite loop, that loop may consume all available CPU cycles and cause the rest of the application to be unresponsive.

Ways to Diagnose Looping Threads 

Increasing CPU usage is one indication of a looping thread. JTop is a JDK demo that shows an application's CPU time usage per thread. JTop sorts the threads by CPU usage, allowing you to easily detect a thread that is using an inordinate amount of CPU time. If high CPU consumption by a thread is not expected behavior, the thread may be looping.

You can run JTop as a stand-alone GUI:

$ <JDK_HOME>/bin/java -jar <JDK_HOME>/demo/management/JTop/JTop.jar

 

Alternately, you can run it as a JConsole plug-in:

$ <JDK_HOME>/bin/jconsole -pluginpath <JDK_HOME>/demo/management/JTop/JTop.jar

 

This starts the JConsole tool with an additional JTop tab that shows the CPU time that each thread in the application is using, as shown in Figure 7. The JTop tab shows that the  LoopingThread is using a high amount of CPU time that is continuously increasing, which is suspicious. The developer should examine the source code for this thread to see whether it contains an infinite loop.


Figure 7: The JTop tab shows how much CPU time each thread in the application uses. 
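The per-thread CPU times that JTop displays come from the ThreadMXBean API, so a crude version of the same check can run inside the application itself. A sketch (class name CpuTop is mine; CPU-time measurement is supported and enabled by default on HotSpot, but the code checks to be safe):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class CpuTop {
    // Returns the id of the live thread with the highest accumulated CPU time,
    // a stand-in for the top row of the JTop display. Returns -1 if CPU-time
    // measurement is unavailable.
    public static long busiestThreadId() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (threads.isThreadCpuTimeSupported() && !threads.isThreadCpuTimeEnabled()) {
            threads.setThreadCpuTimeEnabled(true);
        }
        long busiest = -1;
        long maxCpu = -1;
        for (long id : threads.getAllThreadIds()) {
            long cpu = threads.getThreadCpuTime(id); // -1 if unsupported or thread died
            if (cpu > maxCpu) { maxCpu = cpu; busiest = id; }
        }
        return busiest;
    }

    public static void main(String[] args) {
        ThreadInfo info = ManagementFactory.getThreadMXBean().getThreadInfo(busiestThreadId());
        System.out.println("busiest thread: " + (info == null ? "unknown" : info.getThreadName()));
    }
}
```

Sampling this periodically and watching for a thread whose CPU time grows without bound is the programmatic equivalent of watching the JTop tab for a continuously climbing row.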

High Lock Contention 

Synchronization is heavily used in multithreaded applications to ensure mutually exclusive access to a shared resource or to coordinate and complete tasks among multiple threads. For example, an application uses an object monitor to synchronize updates on a data structure. When two threads attempt to update the data structure at the same time, only one thread is able to acquire the object monitor and proceed to update the data structure. Meanwhile, the other thread blocks as it waits to enter the  synchronized block until the first thread finishes its update and releases the object monitor. Contended synchronization impacts application performance and scalability.

Ways to Diagnose High Lock Contention 

Determining which locks are the bottleneck can be quite difficult. The JDK provides per-thread contention statistics such as the number of times a thread has blocked or waited on object monitors, as well as the total accumulated time spent in lock contention. Information about the number of times that a thread has blocked or waited on object monitors is always available in the thread information displayed in the Threads tab of JConsole, as shown in Figure 8.

Figure 8: The Threads tab shows the number of times that a thread has blocked or waited on object monitors. 

 

But the ability to track the total accumulated time spent in contention is disabled by default. You can enable monitoring of the thread contention time by setting the ThreadContentionMonitoringEnabled attribute of the Threading MBean to  true, as shown in Figure 9.

Figure 9: Enable monitoring of the thread contention by setting the ThreadContentionMonitoringEnabled attribute of the Threading MBean. 

 

You can check the thread contention statistics to determine whether a thread has higher lock contention than you expect. You can get the total accumulated time a thread has blocked by invoking the  getThreadInfo operation of the Threading MBean with a thread ID as the input argument, as Figure 10 shows.

Figure 10: Here is the return value of the  getThreadInfo operation of the Threading MBean.
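The same enable-then-query sequence shown in Figures 9 and 10 can be done in code against the platform ThreadMXBean. A sketch (class name ContentionStats is mine; blocked time accumulates only from the moment monitoring is enabled):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ContentionStats {
    // Enables contention timing (off by default, as noted above) and returns
    // the ThreadInfo carrying blocked counts and blocked time for a thread.
    public static ThreadInfo contentionFor(long threadId) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (threads.isThreadContentionMonitoringSupported()
                && !threads.isThreadContentionMonitoringEnabled()) {
            threads.setThreadContentionMonitoringEnabled(true);
        }
        return threads.getThreadInfo(threadId);
    }

    public static void main(String[] args) {
        ThreadInfo info = contentionFor(Thread.currentThread().getId());
        System.out.println(info.getThreadName()
                + " blocked " + info.getBlockedCount() + " times"
                + " for " + info.getBlockedTime() + " ms");
    }
}
```

getBlockedCount is always available; getBlockedTime returns -1 unless contention monitoring has been enabled, which is why the enable step matters.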

Java SE 6 Platform’s Monitoring and Management Capabilities

 

The monitoring and management support in Java SE 6 includes programmatic interfaces as well as several useful diagnostic tools to inspect various virtual machine (VM) resources. For information about the programmatic interfaces, read the  API specifications.

JConsole is a Java monitoring and management console that allows you to monitor the usage of various VM resources at runtime. It enables you to watch for the symptoms described in the previous section during the execution of an application. You can use JConsole to connect to an application running locally in the same machine or running remotely in a different machine to monitor the following information:

  • Memory usage and garbage collection activities
  • Thread state, thread stack trace, and locks
  • Number of objects pending for finalization
  • Runtime information such as uptime and the CPU time that the process consumes
  • VM information such as the input arguments to the JVM and the application class path

In addition, Java SE 6 includes other command-line utilities. The  jstat  command prints various VM statistics including memory usage, garbage collection time, class loading, and the just-in-time compiler statistics. The  jmap  command allows you to obtain a heap histogram and a heap dump at runtime. The  jhat  command allows you to analyze a heap dump. And the  jstack command allows you to obtain a thread stack trace. These diagnostic tools can attach to any application without requiring it to start in a special mode.

 

Further Reading

JConsole - http://docs.oracle.com/javase/7/docs/technotes/guides/management/jconsole.html

String Table modifications - http://xmlandmore.blogspot.co.uk/2013/05/understanding-string-table-size-in.html

String intern - http://java-performance.info/string-intern-in-java-6-7-8/

GC tuning, Oracle documentation - http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html

explains how to read GC logs - http://stackoverflow.com/questions/895444/java-garbage-collection-log-messages

Some general tuning considerations - http://www.mastertheboss.com/jboss-performance/jboss-performance-tuning-part-1

Flags - http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html

Large memory pages - http://www.oracle.com/technetwork/java/javase/tech/largememory-jsp-137182.html

Blog - http://blog.jamesdbloom.com/JVMInternals.html?gclid=CJC-5qKLzbkCFdTItAodmScATw

CMS GC Logs - https://blogs.oracle.com/poonam/entry/understanding_cms_gc_logs

Some FAQ’s - http://www.oracle.com/technetwork/java/faq-140837.html

G1 - http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html

TLABS - https://blogs.oracle.com/daviddetlefs/entry/tlab_sizing_an_annoying_little https://blogs.oracle.com/jonthecollector/entry/the_real_thing