Saturday, August 10, 2013

understanding java's native heap (or c-heap) and java heap

inspiration for this post

Not that long ago, I was diagnosing an issue on Jenkins where I was seeing an OutOfMemoryError in a native API. "What hijinks be these?" I thought to myself, since the memory footprint was high but GC wasn't getting out of control. Of course, like so many things, I had to learn the cause of these errors in the middle of a service outage.

my first exposure to java's native heap

When we think of the Java heap, we usually think of a chunk of memory that the Garbage Collector keeps in order for us, and why wouldn't we? Any call to the new operator allocates memory within the heap for whatever instance we're creating, and the Garbage Collector keeps tabs on that instance so that, when it's no longer in use, the memory can be reclaimed within the heap. That last bit is important: the memory is reclaimed within the heap, not handed back to the OS. I'm sure most people already know this, but it's still worth calling out that the heap generally doesn't shrink once it's grown (whether it ever does depends on the collector and its settings), and it will grow up to its max heap (-Xmx) size.
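If you want to see those numbers for a running JVM, here's a minimal sketch using the standard Runtime API. The class name HeapLimits and the toMiB helper are my own, purely for illustration. maxMemory() is roughly the -Xmx ceiling, and totalMemory() is what the heap has grown to so far.

```java
final class HeapLimits {
    /** Converts a byte count to whole mebibytes (hypothetical helper). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();         // ceiling the heap may grow to, roughly -Xmx
        long committed = rt.totalMemory(); // what the heap has grown to so far
        long used = committed - rt.freeMemory();

        System.out.println("max heap (MiB):       " + toMiB(max));
        System.out.println("committed heap (MiB): " + toMiB(committed));
        System.out.println("used heap (MiB):      " + toMiB(used));
    }
}
```

Running it with different -Xmx values (say -Xmx256m versus -Xmx1g) should change the first number while the other two depend on what the JVM has actually allocated.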

If you're using a 32-bit JVM, the process can address at most 4GB (or less depending on the OS), and that budget has to cover both the max heap and the permgen size. On a 64-bit JVM, by contrast, you're limited by the machine as to what you set as the boundaries of your heap (depending on JVM implementation and CompressedOops).

What you have left to work with, under either of these limits, is the free space available to the native heap (or c-heap). I'm calling out that this is the free space available because the Java heap we've all grown to know and love is a section of the native heap; they're not mutually exclusive areas of memory. This space is used for native APIs and data, and it can most definitely run out.

Let's say you're using a 32-bit JVM, your OS can handle a 4GB heap, and you've allocated 3.5GB as the max heap and 384MB to permgen. Should you max those out, you've left your native heap with 128MB to do everything it needs to. In some applications this may not be a problem, but under certain conditions, say if you're heavily using IO, you could end up exhausting this memory, leaving you with an out of memory error in a native method. For example:

java.lang.OutOfMemoryError
  at java.util.zip.ZipFile.open(Native Method)
  ...
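The arithmetic behind that 128MB figure can be sketched as follows. This is a simplification under the assumptions of the example above (a full 4GB usable by the process, heap and permgen both maxed out); it ignores thread stacks, the code cache, and the JVM's own footprint, all of which also come out of the same budget. The class and method names are hypothetical.

```java
final class AddressSpaceBudget {
    static final long MB = 1024L * 1024L;

    /** Bytes left for native allocations once the Java heap and permgen are full. */
    static long nativeHeadroom(long addressSpace, long maxHeap, long maxPermGen) {
        return addressSpace - maxHeap - maxPermGen;
    }

    public static void main(String[] args) {
        long addressSpace = 4096 * MB; // 4GB usable by the process (assumed, per the example)
        long maxHeap = 3584 * MB;      // 3.5GB max heap, e.g. -Xmx3584m
        long maxPermGen = 384 * MB;    // 384MB permgen, e.g. -XX:MaxPermSize=384m

        long headroom = nativeHeadroom(addressSpace, maxHeap, maxPermGen);
        System.out.println(headroom / MB + "MB left for native allocations"); // prints "128MB left for native allocations"
    }
}
```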

There are a few more interesting details about the native heap that are worth pointing out:

  • The native heap isn't managed by the garbage collector. The portion that makes up the Java heap is, of course.
  • Using -XX:+HeapDumpOnOutOfMemoryError won't actually work on OutOfMemoryErrors caused by exhaustion of the native heap. There's a bug ticket logged for this which was, in my opinion incorrectly, closed as not reproducible.
  • A heap dump won't actually reveal what's happening in the native heap; you'll need process monitoring to figure that out.
  • Loading anything into or out of the native heap from the Java heap that isn't already a byte array requires serialization for insertion and deserialization for retrieval.
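To make that last point concrete, here's a minimal sketch of the round trip, assuming Java's built-in serialization and a direct ByteBuffer as the off-heap destination. The class and method names are hypothetical, and this is only one way to do it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;

final class OffHeapRoundTrip {
    /** Turns an object into a byte array using built-in serialization. */
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds an object from bytes produced by serialize. */
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Copies the value into native memory and reads it back as a new object. */
    static Object roundTrip(Serializable value) {
        byte[] encoded = serialize(value);
        ByteBuffer offHeap = ByteBuffer.allocateDirect(encoded.length);
        offHeap.put(encoded);   // insertion: bytes leave the Java heap
        offHeap.flip();         // switch the buffer from writing to reading
        byte[] decoded = new byte[offHeap.remaining()];
        offHeap.get(decoded);   // retrieval: bytes come back on-heap
        return deserialize(decoded);
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>(Arrays.asList("jenkins", "zip"));
        Object copy = roundTrip(original);
        System.out.println(copy.equals(original) && copy != original); // prints "true"
    }
}
```

Note that what comes back is an equal copy, not the same instance, and you pay for two array copies plus the serialization on every trip.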

can you store stuff in the native heap from within your application?

Honestly, I could write an entire blog post just about off-heap storage (in fact I started to write one here and stopped). I may very well write that post someday, but for now I'll leave you with the following advice: yes, you can, but you probably shouldn't on your own. There are a couple of ways to do this: ByteBuffer (the "legit" way) and sun.misc.Unsafe, which you have to pry out of the JVM using reflection backdoors.
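For the ByteBuffer route, here's a minimal sketch of the difference between an ordinary heap buffer and a direct (off-heap) one. The class and the describe helper are hypothetical; the ByteBuffer calls are the standard java.nio API.

```java
import java.nio.ByteBuffer;

final class DirectVsHeapBuffer {
    /** Summarizes where a buffer lives and whether a byte[] backs it. */
    static String describe(ByteBuffer buffer) {
        return (buffer.isDirect() ? "direct" : "heap") + ", backing array: " + buffer.hasArray();
    }

    public static void main(String[] args) {
        ByteBuffer onHeap = ByteBuffer.allocate(64);        // backed by a byte[] on the Java heap
        ByteBuffer offHeap = ByteBuffer.allocateDirect(64); // backed by native memory

        offHeap.putLong(0, 42L); // absolute write into native memory

        System.out.println(describe(onHeap));    // prints "heap, backing array: true"
        System.out.println(describe(offHeap));   // prints "direct, backing array: false"
        System.out.println(offHeap.getLong(0));  // prints "42"
    }
}
```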

One detail that may not be obvious is that direct memory has its own setting in the JVM: a flag called -XX:MaxDirectMemorySize, which is separate from the heap size. Terracotta has an excellent write-up about this which, while it's for their product BigMemory, touches on a lot of interesting data points that have to do with off-heap memory management.
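If you want to watch direct memory from inside the JVM, one option is the BufferPoolMXBean that the platform has exposed since Java 7. A minimal sketch, assuming the pool is registered under the name "direct" (which is what HotSpot uses); the class and method names are my own:

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

final class DirectMemoryUsage {
    /** Total capacity in bytes of all live direct buffers, or -1 if the pool isn't exposed. */
    static long directCapacity() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getTotalCapacity();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directCapacity();
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024); // 16MB off-heap
        long after = directCapacity();

        System.out.println("direct capacity before: " + before);
        System.out.println("direct capacity after:  " + after);
        System.out.println("buffer still referenced: " + buffer.capacity());
    }
}
```

Running this with something like -XX:MaxDirectMemorySize=8m should make the allocateDirect call fail with an OutOfMemoryError ("Direct buffer memory") even when the Java heap has plenty of room.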

I'd also like to point out that ByteBuffer delegates to a class called Bits which has some really sketchy implementation details when you allocate memory, so you shouldn't make calls to allocate any more often than necessary. Rather than type out the details, I'll just show you the code in all its glory (I put my name in the comments for the lines I wanted to draw your attention to):

If you're trying to store large blocks of data represented by byte arrays without blowing up your old generation heap space, off-heap storage can be very beneficial. You can put all of the data in a ByteBuffer and read it back through an InputStream, though that involves keeping track of the offsets of the data in the buffer and either writing your own InputStream to support the buffer or finding one that's already implemented in another project.
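Writing your own InputStream over a buffer isn't much code. Here's a minimal sketch, assuming you've recorded the offset and length of each block yourself; ByteBufferInputStream is a hypothetical class, not something the JDK ships.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ByteBufferInputStream extends InputStream {
    private final ByteBuffer view;

    /** Streams length bytes of source starting at offset, without moving source's position. */
    ByteBufferInputStream(ByteBuffer source, int offset, int length) {
        ByteBuffer dup = source.duplicate(); // shares the memory, has its own position and limit
        dup.limit(offset + length);
        dup.position(offset);
        this.view = dup;
    }

    @Override
    public int read() {
        return view.hasRemaining() ? (view.get() & 0xFF) : -1;
    }

    @Override
    public int read(byte[] dst, int off, int len) {
        if (len == 0) return 0;
        if (!view.hasRemaining()) return -1;
        int n = Math.min(len, view.remaining());
        view.get(dst, off, n);
        return n;
    }

    @Override
    public int available() {
        return view.remaining();
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer store = ByteBuffer.allocateDirect(1024);
        byte[] first = "first block".getBytes(StandardCharsets.UTF_8);
        byte[] second = "second block".getBytes(StandardCharsets.UTF_8);

        store.put(first);
        int secondOffset = store.position(); // the offset we have to remember ourselves
        store.put(second);

        try (InputStream in = new ByteBufferInputStream(store, secondOffset, second.length)) {
            byte[] out = new byte[second.length];
            int n = in.read(out);
            System.out.println(new String(out, 0, n, StandardCharsets.UTF_8)); // prints "second block"
        }
    }
}
```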

If you're trying to use off-heap storage as a cache for Java objects, you should probably look at something preexisting like Hazelcast or Terracotta's BigMemory. The two challenges you'll end up with trying to handle this yourself are serialization/deserialization, since all objects must be converted to/from byte arrays, and managing how you're accessing that data directly from memory. The serialization and deserialization aspects can be painful from a performance standpoint, especially using Java's built-in serialization. You can get significantly better performance using something like Kryo, or Jackson Smile, which serializes to binary JSON. There's also fast-serialization, which claims to be faster than Kryo and has some off-heap storage implemented with more in the works. Hazelcast recently did a comparison of Kryo and Smile, and the results clearly show a noticeable improvement in performance. Accessing the data is also non-trivial, since you need to allocate large chunks of memory and manage offsets yourself to fetch the correct data.
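To give a feel for the offset bookkeeping, here's a deliberately naive sketch: one big direct buffer allocated up front, plus an on-heap index from key to offset and length. OffHeapStore is a hypothetical class; it's append-only, never reclaims space (overwriting a key strands the old bytes), isn't thread-safe, and expects values that are already byte arrays. The libraries above exist precisely to handle those parts.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

final class OffHeapStore {
    private final ByteBuffer slab;                              // one large chunk of native memory
    private final Map<String, int[]> index = new HashMap<>();   // key -> {offset, length}

    OffHeapStore(int capacityBytes) {
        this.slab = ByteBuffer.allocateDirect(capacityBytes);   // allocate once, reuse for every entry
    }

    /** Appends the value to the slab; returns false when there's no room left. */
    boolean put(String key, byte[] value) {
        if (value.length > slab.remaining()) {
            return false;
        }
        index.put(key, new int[] {slab.position(), value.length});
        slab.put(value);
        return true;
    }

    /** Copies the value back onto the heap, or returns null for an unknown key. */
    byte[] get(String key) {
        int[] entry = index.get(key);
        if (entry == null) {
            return null;
        }
        byte[] out = new byte[entry[1]];
        ByteBuffer view = slab.duplicate(); // leave the slab's write position alone
        view.position(entry[0]);
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        OffHeapStore store = new OffHeapStore(1024);
        store.put("greeting", "hello".getBytes(StandardCharsets.UTF_8));
        store.put("farewell", "goodbye".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(store.get("greeting"), StandardCharsets.UTF_8)); // prints "hello"
    }
}
```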

If you're trying to use off-heap memory for dealing with IO, you should check out Netty, which not only works very well and intuitively, but also does the job better than ByteBuffer.

There's a really nice blog post at http://mishadoff.github.io/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/ that goes through the many things you can and shouldn't do with Unsafe if you're interested. There's also a fantastic writeup about using ByteBuffer and dealing with all of its idiosyncrasies at http://mindprod.com/jgloss/bytebuffer.html

3 comments:

  1. Replies
    1. Thanks for the highly detailed feedback. Would you care to give me some details as to why you have such a negative opinion of this post?

  2. Excellent article, many thanks. I have only just become aware of the native heap... assuming blissful for many years there was simply a single heap!
    Off-heap storage is fascinating. As I now mostly work with ART (the new Dalvik) I am inspired to look as to how this material relates!
