inspiration for this post
Not that long ago, I was diagnosing an issue on Jenkins where I was seeing an
OutOfMemoryError in a native API. "What hijinks be these?" I thought to myself, since, while the memory footprint was high, GC wasn't getting out of control. Of course, like so many things, I had to learn what the cause of these errors was in the context of a service outage.
my first exposure to java's native heap
When we think of the Java heap, we usually think of a chunk of memory that is kept in order for us by the Garbage Collector, and why wouldn't we? Any call to the
new operator allocates memory within the heap for whatever instance we're creating, and the Garbage Collector keeps tabs on that instance so that, when it's no longer in use, the memory can be reclaimed within the heap. That last bit is important. I'm sure most people already know this, but it's still worth calling out that the heap doesn't shrink once it's grown, and it will grow up to its max heap (-Xmx) size.
If you're using a 32-bit JVM, the max you can set your heap to is 4GB (or less, depending on the OS), which is inclusive of the max heap and permgen size. On a 64-bit JVM, by contrast, you're limited only by the machine as to where you set the boundaries of your heap (depending on the JVM implementation and CompressedOops).
What you have left to work with, in both of these limitations, is the free space available to the native heap (or c-heap). I'm calling out that this is the free space available because the Java heap we've all grown to know and love is a section of the native heap; they're not mutually exclusive areas of memory. This space is used for native APIs and data, and it can most definitely run out.
Let's say you're using a 32-bit JVM, your OS can handle a 4GB heap, and you've allocated 3.5GB as the max heap and 384MB to permgen. Should you max those out, you've left your native heap with 128MB to do everything it needs to. In some applications this may not be a problem, but under certain conditions, say if you're heavily using IO, you could end up exhausting this memory, leaving you with an out of memory error in a native method. For example:
java.lang.OutOfMemoryError
        at java.util.zip.ZipFile.open(Native Method)
        ...
There are a few more interesting details about the native heap that are worth pointing out:
- The native heap isn't managed by the garbage collector. The portion that makes up the Java heap is, of course.
- -XX:+HeapDumpOnOutOfMemoryError won't actually work on
OutOfMemoryErrors caused by exhaustion of the native heap. There's a bug ticket logged for this which was, in my opinion, incorrectly closed as not reproducible.
- A heap dump won't actually reveal what's happening in the native heap; you'll need process monitoring to figure that out.
- Loading anything into or out of the native heap from the Java heap that isn't already a byte array requires serialization for insertion and deserialization for retrieval.
can you store stuff in the native heap from within your application?
Honestly, I could write an entire blog post just about off-heap storage (in fact, I started to write one here and stopped). I may very well write that post, but for now I'll leave you with the following advice: yes, you can, and you probably shouldn't do it on your own. There are a couple of ways to do this, one being
ByteBuffer (the "legit" way) and the other being sun.misc.Unsafe, which you have to pry out of the JVM using reflection backdoors.
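To make the reflection backdoor concrete, here's a minimal sketch of grabbing the singleton Unsafe instance and using it to allocate and free raw native memory. The class and method names here are my own; only sun.misc.Unsafe and its methods come from the JDK. Everything in this block bypasses the JVM's safety nets, which is exactly why you probably shouldn't do it on your own:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeExample {
    // Pry Unsafe out of the JVM: its constructor is private, but the
    // singleton instance sits in a static field named "theUnsafe".
    public static Unsafe getUnsafe() throws Exception {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        return (Unsafe) f.get(null);
    }

    // Write a long into raw native memory and read it back.
    public static long roundTrip(long value) throws Exception {
        Unsafe unsafe = getUnsafe();
        long address = unsafe.allocateMemory(8); // 8 raw bytes, off-heap
        try {
            unsafe.putLong(address, value); // no bounds checks, no GC
            return unsafe.getLong(address);
        } finally {
            unsafe.freeMemory(address); // forget this and you leak natively
        }
    }
}
```

Note that a bad address passed to putLong doesn't throw an exception; it can take down the whole process.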
One detail that may not be obvious is that the JVM has separate settings for direct memory. There's a flag that can be passed to the JVM called
-XX:MaxDirectMemorySize, which is distinct from the heap size. Terracotta has an excellent write-up about this which, while it's about their product BigMemory, touches on a lot of interesting data points that have to do with off-heap memory management.
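For example, you might cap direct buffer memory independently of the Java heap like this (my-app.jar is a placeholder, not a real artifact):

```shell
# 2GB Java heap, but direct ByteBuffer allocations are capped at 256MB.
# Exceeding the cap throws OutOfMemoryError: Direct buffer memory.
java -Xmx2g -XX:MaxDirectMemorySize=256m -jar my-app.jar
```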
I'd also like to point out that
ByteBuffer delegates to a class called
Bits, which has some really sketchy implementation details when you allocate memory, so you shouldn't make calls to allocate any more often than necessary.
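To show what I mean by sketchy, here's a simplified, paraphrased sketch of what Bits.reserveMemory did in JDK 6/7-era code (field names and the hardcoded limit are my simplifications; in the real class the limit comes from -XX:MaxDirectMemorySize, and the exact logic varies by JDK version):

```java
public class BitsSketch {
    // Stand-in for the direct memory cap (-XX:MaxDirectMemorySize).
    private static final long MAX_MEMORY = 64L * 1024 * 1024;
    private static long reservedMemory = 0;

    private static synchronized boolean tryReserve(long size) {
        if (size <= MAX_MEMORY - reservedMemory) {
            reservedMemory += size;
            return true;
        }
        return false;
    }

    public static void reserveMemory(long size) {
        if (tryReserve(size)) {
            return;
        }
        // The sketchy part: when the reservation fails, the allocating
        // thread triggers a FULL GC...
        System.gc();
        try {
            // ...then sleeps for a flat 100ms, hoping the GC freed
            // enough direct memory in the meantime.
            Thread.sleep(100);
        } catch (InterruptedException x) {
            Thread.currentThread().interrupt();
        }
        // One retry, then give up.
        if (!tryReserve(size)) {
            throw new OutOfMemoryError("Direct buffer memory");
        }
    }
}
```

A forced full GC plus a hardcoded sleep on the allocation path is why frequent allocate calls can wreck your latency.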
If you're trying to store large blocks of data represented by byte arrays without blowing up your old generation heap space, off-heap storage can be very beneficial. You can put all of the data in a
ByteBuffer and read it back through an
InputStream, though that involves keeping track of the offsets of the data in the buffer and either writing your own
InputStream implementation backed by the buffer or finding one that's already implemented in another project.
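The offset-tracking idea can be sketched like this. The class and method names are my own inventions for illustration, not from any library; only ByteBuffer and its methods are from the JDK:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class OffHeapBlocks {
    private final ByteBuffer buffer;
    private final List<int[]> index = new ArrayList<>(); // {offset, length}

    public OffHeapBlocks(int capacityBytes) {
        // allocateDirect puts the backing storage outside the Java heap
        this.buffer = ByteBuffer.allocateDirect(capacityBytes);
    }

    // Append a block; returns a handle for reading it back later.
    public int put(byte[] block) {
        int offset = buffer.position();
        buffer.put(block);
        index.add(new int[] { offset, block.length });
        return index.size() - 1;
    }

    // Copy a stored block back onto the Java heap.
    public byte[] get(int handle) {
        int[] entry = index.get(handle);
        byte[] out = new byte[entry[1]];
        // duplicate() gives an independent position, so reads
        // don't disturb the write cursor.
        ByteBuffer view = buffer.duplicate();
        view.position(entry[0]);
        view.get(out);
        return out;
    }
}
```

Note that the index itself (the offsets and lengths) still lives on the Java heap; only the block payloads are off-heap.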
If you're trying to use off-heap storage as a cache for Java objects, you should probably look at something preexisting like Hazelcast or Terracotta's BigMemory. The two challenges you'll run into handling this yourself are serialization/deserialization, since all objects must be converted to and from byte arrays, and managing how you access that data directly from memory. The serialization and deserialization can be painful from a performance standpoint, especially using Java's built-in serialization. You can get significantly better performance using something like Kryo or Jackson Smile, which serializes to binary JSON. There's also fast-serialization, which claims to be faster than Kryo and has some off-heap storage implemented, with more in the works. Hazelcast recently did a comparison of Kryo and Smile, and the results clearly show a noticeable improvement in performance. Accessing the data is also non-trivial, since you need to allocate large chunks of memory and manage offsets yourself to fetch the correct data.
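Here's the serialize-to-off-heap round trip in miniature, using Java's built-in serialization (the slow option; Kryo or Smile would replace the ObjectOutputStream/ObjectInputStream steps with their own encoders). The class name is mine, for illustration only:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;

public class OffHeapCacheSketch {
    // object -> byte[] -> native memory
    public static ByteBuffer store(Serializable value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value); // serialization cost paid here
        }
        byte[] bytes = bos.toByteArray();
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes.length);
        direct.put(bytes);
        direct.flip(); // make the buffer readable from the start
        return direct;
    }

    // native memory -> byte[] -> object
    public static Object load(ByteBuffer direct)
            throws IOException, ClassNotFoundException {
        byte[] bytes = new byte[direct.remaining()];
        direct.duplicate().get(bytes); // copy back onto the Java heap
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject(); // deserialization cost paid here
        }
    }
}
```

Every get pays a copy plus a deserialization, which is why the choice of serializer dominates the performance of this approach.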
There's a really nice blog post at http://mishadoff.github.io/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/ that goes through the many things you can and shouldn't do with
Unsafe, if you're interested. There's also a fantastic write-up about using
ByteBuffer and dealing with all of its idiosyncrasies at http://mindprod.com/jgloss/bytebuffer.html.