Saturday, August 17, 2013

jersey server side client and resource with in memory request handling

who's to say who's crazy and who's not?

This is a question from one of my favorite comedians, Otis Lee Crenshaw. He answers his own question with this (paraphrased):

If you can play the guitar and harmonica at the same time like Bob Dylan or Neil Young you're considered a genius. Make that extra effort to strap a pair of cymbals to your knees and people will cross the street just to get the hell away from you.

I guess what I'm saying is, that in this post, I think the cymbals may be firmly affixed.

I've spent a fair amount of time thinking about the problem this blog post is centered on, and I think that the solution I'm suggesting has value, though I could certainly understand people not liking it. At all. So, without further ado...

the problem

Let's say you have a web application, and you have some data (a resource, if you will) that you're pulling in on the server side and spitting out into a view using a typical MVC pattern. For the sake of giving some context to this problem, let's say your data is messages posted by a user. Since there's no massive website that deals with this kind of data at unfathomable scale, I thought this would be a nice unique use case.

Your MVC application has worked well, but it could stand to be improved. At some point, you decide on pursuing either one or both of the following (since the net result is the same):

  • There's not much value in displaying more than the last 20 messages for any given user up front on a page. Fetching additional posts on the client side via an AJAX call makes more sense.
  • This website should have a mobile presence, but data consumption is a concern. Loading smaller groups of messages on a user by user basis on demand is preferable for the user experience.

In either case, a popular solution is to create a RESTful endpoint that exposes those messages. A simple call to /messages/{user}?offset=x&limit=y returning a nice lightweight JSON representation, and we've satisfied both cases. Problem solved!
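To make the offset and limit semantics concrete, here's a minimal sketch of the slicing a handler behind that endpoint might do. It uses only JDK types; the `MessagePaging` class and its `page` method are hypothetical names for illustration, not part of the application described here or of Jersey.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the names are illustrative only.
class MessagePaging {

    // Returns at most `limit` items starting at `offset`.
    // An offset past the end of the list yields an empty page.
    static <T> List<T> page(List<T> all, int offset, int limit) {
        if (offset < 0 || limit < 0) {
            throw new IllegalArgumentException("offset and limit must be non-negative");
        }
        int from = Math.min(offset, all.size());
        // Widen to long so that from + limit cannot overflow.
        int to = (int) Math.min((long) from + limit, all.size());
        return new ArrayList<>(all.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> messages = Arrays.asList("m1", "m2", "m3", "m4", "m5");
        System.out.println(page(messages, 0, 2));  // first page
        System.out.println(page(messages, 4, 2));  // last, partial page
        System.out.println(page(messages, 10, 2)); // past the end
    }
}
```

In a real application the slicing would more likely happen in the database query than in memory, but the contract the client sees is the same.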

Or is it...

You don't want to pull down everything this way, or at least not on the desktop site. There's value in pulling some of this data server side: your clients don't have to make additional connections to get the initial view, you don't have to consume additional resources on the server side for handling network traffic, and you may be able to more easily reuse other resources like database sessions.

At the same time, (I think) there's value in being consistent. Is the controller of your desktop view accessing your messages the same way your resources are? Are you sure you're serializing and deserializing to the same values client side as you are server side? You could programmatically invoke your resources and get typesafe return values, but is the behavior different when it's a request vs a method call? How are the execution paths different from one another?

Do they have to be different at all?

an idea for a solution

There's an obvious problem with making calls to the resource via an actual HTTP request when you're talking about a client and resource that exist in the same JVM: all the overhead of actually making a network connection via localhost. That may be on the order of a few milliseconds, but those milliseconds add up, and studies have shown end users have little patience for latency.

If you make enough calls, this can easily turn into tens or even hundreds of milliseconds of latency.

As it turns out, you can avoid this network hop entirely, but it requires some (mildly kludgy) work.

Jersey provides a type called Connector for its client API. This type is used to handle connections that are created from ClientRequest instances. I mentioned in a previous post how to set up an Apache HTTP Client connector if you'd like to see an example of this.

More interestingly, the type ApplicationHandler can be injected using Jersey's @Context based injection system, and represents a hook into the Jersey server instance that is only a few levels removed from the methods of the servlet container itself. All of Jersey's routing logic is downstream from ApplicationHandler, so sending requests to it means you're largely getting the full HTTP request taste without the calories.

We're going to need to capture the ApplicationHandler instance at startup. Unfortunately, the code to do this is quite ugly in its current state. You could no doubt do this more cleanly using dependency injection, but I think you get the point. First we'll need a provider for Jersey that will allow us to capture the instance, and a way to construct Connector instances with that instance:

Then we'll need to wire it up to the application:

Now, we need the connector itself.

translating different request and response representations

Most of the work done in the connector is ugly get/set logic; nothing sophisticated or glamorous. It still needs some explanation though.

At a high level, we need to implement the apply method, and inside get a ContainerRequest from a ClientRequest, pass it to the ApplicationHandler, and then convert a ContainerResponse to a ClientResponse. Here's the skeleton code:

Let's dig into building the request first:

We end up copying the main three pieces of data we need:

  • The URI data (needed to construct the ContainerRequest)
  • The headers
  • The request body (or entity)

I should call out that two of the arguments to the ContainerRequest constructor are null. The first is a reference to a SecurityContext instance, which is outside the scope of this post. The second is called PropertiesDelegate, which isn't actually Javadoc'd. The example works without it, though I may go back and dig into what it does later.

Now that we have the request, we need to send it to the ApplicationHandler:

As you can see, we get a ContainerResponse instance back, which we'll need to convert to a Response, which is then used to create the ClientResponse instance we have to return:

Full disclosure: I don't know if I'm covering all the bases as far as what needs to be set on Response. I have for my example, but I don't know what I may be missing for other use cases at this point. I will update this post as I discover other data points that may need to be handled.

With this in place, we have apply fully implemented:
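As a rough illustration of the overall shape (translate the request, dispatch it, translate the response back), here's a minimal sketch that uses only JDK types. It is not the ServerSideConnector code and it does not use Jersey's API: `SketchRequest`, `SketchResponse`, and `InMemoryConnectorSketch` are hypothetical stand-ins for the ClientRequest/ContainerRequest and ContainerResponse/ClientResponse types, and the `Function` plays the role of the ApplicationHandler.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Simplified stand-in for a request on either side of the connector.
// NOT a Jersey class; Jersey uses ClientRequest and ContainerRequest here.
class SketchRequest {
    final String method;
    final String uri;
    final Map<String, String> headers;
    final String entity;

    SketchRequest(String method, String uri, Map<String, String> headers, String entity) {
        this.method = method;
        this.uri = uri;
        this.headers = headers;
        this.entity = entity;
    }
}

// Simplified stand-in for ContainerResponse and ClientResponse.
class SketchResponse {
    final int status;
    final String entity;

    SketchResponse(int status, String entity) {
        this.status = status;
        this.entity = entity;
    }
}

class InMemoryConnectorSketch {
    // Stand-in for ApplicationHandler: routes a server-side request to a resource.
    private final Function<SketchRequest, SketchResponse> handler;

    InMemoryConnectorSketch(Function<SketchRequest, SketchResponse> handler) {
        this.handler = handler;
    }

    // Mirrors the three steps of apply: translate, dispatch, translate back.
    SketchResponse apply(SketchRequest clientRequest) {
        SketchRequest containerRequest = toContainerRequest(clientRequest);
        SketchResponse containerResponse = handler.apply(containerRequest);
        return toClientResponse(containerResponse);
    }

    // Copies the URI, the headers, and the entity into a fresh request object.
    private static SketchRequest toContainerRequest(SketchRequest in) {
        return new SketchRequest(in.method, in.uri, new HashMap<>(in.headers), in.entity);
    }

    // Copies the status and the entity into a fresh response object.
    private static SketchResponse toClientResponse(SketchResponse in) {
        return new SketchResponse(in.status, in.entity);
    }

    public static void main(String[] args) {
        // The lambda stands in for a resource method reached through routing.
        InMemoryConnectorSketch connector = new InMemoryConnectorSketch(
                req -> new SketchResponse(200, req.method + " " + req.uri));
        Map<String, String> headers = new HashMap<>();
        headers.put("Accept", "application/json");
        SketchResponse response = connector.apply(
                new SketchRequest("GET", "/messages/bob", headers, null));
        System.out.println(response.status + " " + response.entity);
    }
}
```

The point of the sketch is that nothing in `apply` opens a socket: the request object is handed to the handler as an ordinary method call.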

There's a full implementation of the ServerSideConnector type located here. Another method has to be implemented for asynchronous functionality that closely resembles the apply method shown, along with a close and getName method that can be seen in the linked code.

use within a client and performance

Here's an example client that handles resources that deal with Messages. As you can see, the invocation of the client is pretty cookie cutter, and the Connector implementation can be swapped out per the lines in the constructor:
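To illustrate just the swapping idea, here's a minimal JDK-only sketch. It is not the client from this post: `MessageClientSketch` is a hypothetical name, and the `Function<String, String>` transport is an assumed stand-in for a Jersey Connector, mapping a request path to a response body.

```java
import java.util.function.Function;

// Hypothetical client; the transport stands in for a Jersey Connector.
class MessageClientSketch {
    private final Function<String, String> transport;

    // The transport is the only thing that differs between the
    // in-memory and over-the-network variants of this client.
    MessageClientSketch(Function<String, String> transport) {
        this.transport = transport;
    }

    // Builds the request path and delegates to whichever transport was supplied.
    String messagesFor(String user, int offset, int limit) {
        String path = "/messages/" + user + "?offset=" + offset + "&limit=" + limit;
        return transport.apply(path);
    }

    public static void main(String[] args) {
        // In-memory transport: answers directly, no socket involved.
        Function<String, String> inMemory = path -> "{\"path\":\"" + path + "\"}";
        // Swapping in a network-backed transport would change only the line above.
        MessageClientSketch client = new MessageClientSketch(inMemory);
        System.out.println(client.messagesFor("bob", 0, 20));
    }
}
```

Callers of `messagesFor` never learn which transport is behind it, which is what makes the later move from internal to external calls cheap.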

At the beginning of this post I called out performance as a concern due to the latency of passing data over a network connection when it could be passed directly via memory instead. Let's see if the in-memory solution indeed performs better.

HTTP client requests:

In-memory requests:

Yep, I'd say it's faster.

is this really necessary?

Probably not, but I wanted to figure out how to do it. I do think there's value in hitting a consistent set of execution points for a single type of transaction, and that one way to keep it consistent is to have a single entry point, which this accomplishes (though with some overhead).

One major motivation for this was a continual challenge I've faced in regard to supporting both desktop and mobile versions of a website, and keeping those sites consistent.

Another use case I can think of for this is being able to break apart an application into more of a service oriented architecture as time permits. Up front it may not make sense to split an application up into too many services, but as your organization, web traffic, and resources grow, your need to split your application up to scale either to your customer base or your development staff will grow as well. Using a client right out of the gate, and being able to abstract interactions to the resources backing it from internal to external by changing two lines of code does, in my opinion, have value.

I'm interested to hear feedback about this solution, because I know it strays from the norm considerably. Like I said, the cymbals are firmly attached; are you planning on crossing the street?

5 comments:

  2. Hi, nice post that gave me lot of interesting points to investigate further, but... (there's always a but, right?)

    First, just a quick note: I found your post after searching (for considerable time) for a way to use new types of connectors on the server side (I don't understand why Jersey only has client-side connectors, even if they can be used on the server side as well, though differently from true server-side connectors), because I'm trying to do a POC of a multi-protocol REST middleware by implementing an MQ connector on the server side. We did this with earlier versions of Jersey (that has been in production for several years now), but we are considering going forward with either Jersey 2 or changing to Fuse (for which a similar POC was made quite easily). So, to cut a long story short, I was considering implementing that as a Container, as an Application, or as a Connector.

    Going back to the "but"... I ran your example and got it to run, but as far as I can see what it does it's just replacing a URL for another, so that instead of having calls directed to, say, http://mycompany.com/rest/example/123 they will be directed to http://localhost/rest/example/123. But that is still a HTTP call that will go thru the HTTP server, and not "in memory". Or am I missing something here?

    The results I got were:

    With "normal" connector:
    Time elapsed (get): 7ms
    Time elapsed (get): 7ms
    Time elapsed (get): 6ms
    Time elapsed (get): 5ms
    Time elapsed (get): 5ms
    Time elapsed (get): 6ms
    Time elapsed (get): 8ms
    Time elapsed (get): 13ms
    Time elapsed (get): 6ms
    Time elapsed (get): 5ms

    With "your" connector:
    Time elapsed (get): 5ms
    Time elapsed (get): 6ms
    Time elapsed (get): 4ms
    Time elapsed (get): 10ms
    Time elapsed (get): 5ms
    Time elapsed (get): 4ms
    Time elapsed (get): 4ms
    Time elapsed (get): 4ms
    Time elapsed (get): 3ms
    Time elapsed (get): 4ms

    With ApacheConnector:
    Time elapsed (get): 8ms
    Time elapsed (get): 6ms
    Time elapsed (get): 6ms
    Time elapsed (get): 6ms
    Time elapsed (get): 6ms
    Time elapsed (get): 5ms
    Time elapsed (get): 5ms
    Time elapsed (get): 5ms
    Time elapsed (get): 12ms
    Time elapsed (get): 5ms

    With such values I can't draw any conclusion, especially knowing that if I just let the request go thru directly, it will always be 0 ms response time. So again, I'm probably missing something here.

    BTW, I struggled a bit to put your example to work, I was getting a NPE and finally had to change one line:

    ContainerRequest containerRequest = new ContainerRequest(URI.create("http://localhost:8080/myapp"), uri, method, null, new MapPropertiesDelegate());


    Replies
    1. Ugh, I just wrote a huge reply to this and my browser lost the message :(

      First, thanks for commenting on the post Antonio, sorry it took me a few days to get back to you.

      Regarding the NPE, I'm wondering if there's a difference in the versions of Jersey we were using. I wasn't running into that issue locally, but maybe I should try a newer version to see if it happens to me.

      As far as in memory vs over HTTP are concerned, I'm intercepting the ClientRequest and converting it directly to a ContainerRequest via the Connector, but I never actually open a connection to send data. It's just a matter of passing the appropriate values into the ContainerRequest to simulate an HTTP request, and then injecting it into the ApplicationHandler to route it accordingly. By injecting it into the ApplicationHandler, it's circumventing the Servlet API entirely and just hitting Jersey past the point where it would convert an HttpServletRequest/Response into a ContainerRequest/Response.

      The URI that gets passed into the ContainerRequest is only used for internal routing; it doesn't get turned into a connection, or at least it doesn't to my knowledge.

      That said, I want to understand why it wasn't working correctly for you. It may make more sense for me to attach a zip file of the project I ran locally for this post in a state where I saw it running in memory rather than over the network to see how that behaves on your system. Do you see an HTTP connection being made when you tried to run this yourself?

      As far as the performance differences go, I'm not really sure. Perhaps it's environmental? This may also be brought to light by attaching my local copy.

  3. Nice work! I've been dabbling a little with this same idea as well, but on a lower level (see https://github.com/KlausBrunner/localsock for one partial result). I do believe it makes a lot of sense to break large applications up into REST services of fairly small granularity, even when they're only communicating locally. Provides great run-time wiring flexibility, easier component testing (using anything that speaks HTTP), and simple tracing using tcpdump/Wireshark even in production. I've been using this approach quite successfully with an application that would have traditionally called for a huge EAR with lots of EJBs (or Spring beans), or an even more monolithic approach.

    Of course, in some cases the performance impact isn't negligible any longer, and that's where it makes sense to cut out the TCP stack roundtrip.

    Replies
    1. Thanks for the feedback and compliment. I'll have to check out your work as well!
