Wednesday, September 21, 2011

The Case for Patch Releases

A new patch release of Apache Camel, version 2.8.1, was released last week. I haven't announced releases in a while, so why write about camel-2.8.1?

The reason is that the Apache Camel community didn't produce patch releases in the past (we only did it for the 1.6.x version before discontinuing support for the 1.x versions). We actually started producing patch releases back in April for camel-2.7.x. At the time, I wasn't sure if and how this would continue, because you need enough support within the community and some held the view that at Apache we should focus only on innovation.

The untold truth about that, though, has to do with the business models of open source. ZDNet had an excellent article a while ago, which makes for interesting reading. The author left out another business model, let's name it FUD ware, that relies on bundling known, successful open source projects and claiming that only your distribution is production ready and offers what the market needs. That works better if you have some influence over the community that produces the original open source project. It is not enough, that's for sure, but combined with other business models (such as 1. Support Ware or 4. Project Ware) it may make the needed difference.

A few of us, however, want the original ASF distribution to be stable, secure and production ready. After a bit of a struggle, I am happy we managed to get the Apache Camel community used to the idea of producing patch releases more frequently. Since mid-April, Dan Kulp and I have issued three patch releases on camel-2.7.x plus camel-2.8.1 last week, and camel-2.8.2 is only a few weeks away.

This way you, our users, won't have to wait at least a quarter to have your issues resolved. At the end of the day, you, the users, are the Apache Camel community.

Monday, September 12, 2011

Congrats Master Melegrito!

Personal things don't usually belong here, but this is a bit special. I am also still sore after this weekend.

Once a year, Ama and I have the opportunity to go with our instructor, 4th Dan Tae Kwon Do Master Josh Geeson, to train in self-defense against stick and knife with Master Julius Melegrito, at an event organized in Charlotte, NC by Master Evins.

The seminar was a real treat, as expected. Master Melegrito's speed and technique are amazing. After drills with two sticks, we got to one stick, repeating the same techniques. After the shoulders started burning, we dropped the stick altogether and went mano-mano, repeating the same techniques again, only to realize how similar they are to Hapkido and Tae Kwon Do. Too bad this year we ran out of time and didn't get to practice defense against knife attacks. I am not complaining much though, because last year my partner was Master Robert Shin, who's not very forgiving, and the hardwood floor was, well, hard.


Yes, Ama, you are the best!

What makes this year's event special is that Guro Melegrito is now a Black Belt Magazine 2011 Hall of Famer, the Weapons Instructor of the Year. Congrats Master Melegrito, many thanks for visiting and see you again next year!

Wednesday, September 7, 2011

Annotation for the Claimcheck Pattern

The claimcheck pattern is described in the EIP book Camel is based on. The way it is described, the pattern uses a data store, which means that it applies mostly to local processing. However, in most cases, at least in my experience, the processing is not local, so one needs to retrieve the original message at a different location where access to the data store is not necessarily possible.

As a bit of background, the commonly used analogy for the claimcheck pattern is the process used to check in baggage while traveling and claim it at the destination. The rationale is that cabin space is a scarce resource (as are CPU, memory and bandwidth) and different parts of the in-flight entity have different SLA requirements: humans need oxygen, leg room, pressurized cabin space, food, entertainment, etc., whereas bags can be stacked in the cargo hold. In some cases they may take different routes to the destination as well.

So how should this be implemented? Let's take a look at the elements that define the pattern. First we have a message that, via some logic, will be transformed into two messages (1). Let's call the two messages the "initial message" and the "claim message". That is the Content Filter part described in the pattern, not to be confused with the filter DSL in Camel, which is actually a Message Filter, a very different thing.

Our two messages will travel separately between departure and arrival, to borrow the terminology from the travel analogy, via two different Message Channels (2), which in Camel we know as routes. The first message channel is the initial route we set up, let's call it the "main channel". The second channel is a one-way one, let's call it the "baggage channel". Between departure and arrival, the initial message will go on the baggage channel and the claim message will go on the main channel.

We also need to generate a Correlation Identifier (3) to preserve the association (the equivalent of the bar coded tag on the baggage and the boarding pass). The fact that one could have multiple bags is outside the scope of this pattern; it can be handled by a splitter/aggregator. The correlation id is both attached to the initial message and supplied somewhere in the claim message.
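
As a rough illustration of elements (1) and (3), here is a minimal plain-Java sketch of the check-in step. It deliberately does not use the Camel API: the Message class, the ClaimCheckId header name and the checkIn method are all made up for this example, and using a random UUID as the correlation id is just one possible choice.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Plain-Java sketch, not the Camel API. All names here are hypothetical.
final class CorrelationSketch {
    // Hypothetical header name that carries the correlation id.
    static final String CLAIM_ID = "ClaimCheckId";

    // Minimal stand-in for a message: a body plus string headers.
    static final class Message {
        final Map<String, String> headers = new HashMap<>();
        final String body;

        Message(String body) {
            this.body = body;
        }
    }

    // Tags the initial message with a fresh correlation id (the tag on the bag)
    // and returns a claim message carrying the same id (the boarding pass).
    static Message checkIn(Message initial) {
        String id = UUID.randomUUID().toString();
        initial.headers.put(CLAIM_ID, id);
        // In this sketch the claim body is nothing but the id itself.
        return new Message(id);
    }
}
```

After checkIn returns, the initial message would travel on the baggage channel with the id in its header, and the claim message would continue on the main channel.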

The claim message will replace the initial message on the main channel and may undergo further processing. It may contain partial content from the initial message required for processing, or it may need to be of a specific type; we cannot make many assumptions. Two things are clear though: it is produced from the initial message, so we need a Processor (4) for that, and it contains the correlation id, so we need an Expression (5) to extract it from there at the destination. While we need to attach the correlation id to the initial message too, we have a bit more freedom on the baggage channel, so let's simplify things a bit and use a custom header/property instead of customizing too much and requiring another Expression.
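
To make the roles of (4) and (5) a bit more concrete, here is a small sketch in plain Java, with java.util.function types standing in for Camel's Processor and Expression. The claim body format ("id=...;size=...") is purely an assumption for the example, and it only works if the id itself contains no semicolon.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Plain-Java sketch, not the Camel API. Names and claim format are hypothetical.
final class ClaimFormatSketch {
    // Stand-in for the Processor (4): builds the claim body from the
    // correlation id and the initial body, keeping only the body's size.
    static final BiFunction<String, String, String> PRODUCER =
        (id, initialBody) -> "id=" + id + ";size=" + initialBody.length();

    // Stand-in for the Expression (5): reads the correlation id back out of
    // a claim body, i.e. the text between "id=" and the first ';'.
    static final Function<String, String> ID_EXPRESSION =
        claimBody -> claimBody.substring(3, claimBody.indexOf(';'));
}
```

The point is that the two pieces come as a pair: whoever decides how the claim message is produced also has to supply the way to find the correlation id in it.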

Putting together the five elements above, it starts to look like in the general case we need a separate, configurable one-way (in-only) route with some processing in the departure and arrival endpoints. It already looks like our implementation will actually require a different kind of Endpoint, which also means a Component. It should support multiple queues at departure and arrival (i.e. both for check-in and claim). Since processing takes place on the main channel, and at arrival during baggage claim the initial message is restored, it must be retrieved from the arrival claim queue, which means that the arrival queue must support random access. This is a key aspect, sometimes overlooked, that gives a lot of grief when implementing this pattern.
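
The random access requirement can be sketched with a map keyed by correlation id instead of a FIFO queue. Again, this is plain Java and only an illustration: the ArrivalStoreSketch class and its methods are hypothetical, and what to do when a claim shows up before its bag is left open (the sketch simply returns null).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Plain-Java sketch, not the Camel API. Class and method names are hypothetical.
final class ArrivalStoreSketch {
    // Initial message bodies waiting at arrival, keyed by correlation id.
    private final Map<String, String> bags = new ConcurrentHashMap<>();

    // Called when an initial message arrives on the baggage channel.
    void arrive(String correlationId, String initialBody) {
        bags.put(correlationId, initialBody);
    }

    // Called when a claim message arrives on the main channel. Removes and
    // returns the matching initial body, or null if it is not (yet) there.
    String claim(String correlationId) {
        return bags.remove(correlationId);
    }
}
```

With a plain FIFO queue the claims would have to arrive in exactly the same order as the bags; with a keyed store, a claim for the second bag can be served while the first bag is still waiting.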

As far as implementation goes, I started to play with a bit of code, so you can expect to see a solution in one of my next posts and probably in the Apache Camel trunk soon.