The Claim Check pattern is described in the EIP book that Camel is based on. As described there, the pattern relies on a data store, which means that it applies mostly to local processing. In most cases, at least in my experience, the processing is not local, so the original message has to be retrieved at a different location, where access to the data store is not necessarily possible.
As a bit of background, the commonly used analogy for the Claim Check pattern is checking in baggage before a flight and claiming it at the destination. The rationale is that cabin space is a scarce resource (as are CPU, memory and bandwidth) and different parts of the in-flight entity have different SLA requirements: humans need oxygen, leg room, pressurized cabin space, food, entertainment, etc., whereas bags can be stacked in the cargo hold. In some cases they may even take different routes to the destination.
So how should this be implemented? Let's take a look at the elements that define the pattern. First, we have a message that some logic transforms into two messages (1). Let's call them the "initial message" and the "claim message". This is the Content Filter part described in the pattern, not to be confused with the filter DSL in Camel, which is actually a Message Filter, a very different thing.
Our two messages will travel separately between departure and arrival, to borrow the terminology of the travel analogy, via two different Message Channels (2), which in Camel we know as routes. The first message channel is the initial route we set up; let's call it the "main channel". The second channel is one-way; let's call it the "baggage channel". Between departure and arrival, the initial message travels on the baggage channel and the claim message travels on the main channel.
We also need to generate a Correlation Identifier (3) to preserve the association (the equivalent of the bar-coded tag on the baggage and the boarding pass). The fact that one could have multiple bags is outside the scope of this pattern; it can be handled by a splitter/aggregator. The correlation id is both attached to the initial message and supplied somewhere in the claim message.
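To make the tagging step concrete, here is a minimal sketch in plain Java. It is not Camel code: the header name `ClaimCheckId`, the class name and the use of a random UUID are all assumptions made for illustration, and messages are reduced to header maps.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Illustrative only: none of these names are Camel APIs.
final class CorrelationTagging {
    // Hypothetical header name used to carry the correlation id.
    static final String CLAIM_ID = "ClaimCheckId";

    // Attaches a fresh correlation id to the initial message's headers and
    // returns the headers of a claim message that carries the same id.
    static Map<String, Object> tag(Map<String, Object> initialHeaders) {
        String id = UUID.randomUUID().toString(); // assumption: any unique id would do
        initialHeaders.put(CLAIM_ID, id);         // the tag on the baggage
        Map<String, Object> claimHeaders = new HashMap<>();
        claimHeaders.put(CLAIM_ID, id);           // the stub on the boarding pass
        return claimHeaders;
    }

    public static void main(String[] args) {
        Map<String, Object> baggage = new HashMap<>();
        Map<String, Object> claim = tag(baggage);
        // Both sides now share the same identifier.
        System.out.println(baggage.get(CLAIM_ID).equals(claim.get(CLAIM_ID))); // prints "true"
    }
}
```

For simplicity the sketch puts the id in a header on both sides; in practice the claim message may have to carry it somewhere else, for example inside its body.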
The claim message will replace the initial message on the main channel and may undergo further processing. It may contain partial content from the initial message that is required for processing, or it may need to be of a specific type; we cannot make many assumptions. Two things are clear though: it is produced from the initial message, so we need a Processor (4) for that, and it contains the correlation id, so we need an Expression (5) to extract it at the destination. While we need to attach the correlation id to the initial message too, we have a bit more freedom on the baggage channel, so let's simplify things a bit and use a custom header/property instead of customizing too much and requiring another Expression.
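The two roles described above can be sketched with plain Java functional interfaces standing in for Camel's Processor and Expression. Everything here is hypothetical: the `Order` type, the `claimId` and `customer` keys, and the choice of a map as the claim message are assumptions picked only to show the shape of the idea.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

// Illustrative only: stand-ins for the Processor (4) and Expression (5) roles.
final class ClaimMessageSketch {
    // Hypothetical initial message: a small field needed for processing
    // on the main channel, plus a large payload that is not.
    static final class Order {
        final String customer;
        final byte[] attachment;

        Order(String customer, byte[] attachment) {
            this.customer = customer;
            this.attachment = attachment;
        }
    }

    // "Processor" role: builds the claim message from the initial message
    // and the correlation id, keeping only the partial content needed later.
    static final BiFunction<Order, String, Map<String, String>> TO_CLAIM = (order, id) -> {
        Map<String, String> claim = new HashMap<>();
        claim.put("claimId", id);
        claim.put("customer", order.customer);
        return claim; // the attachment is deliberately left out
    };

    // "Expression" role: knows where the correlation id lives in the claim message.
    static final Function<Map<String, String>, String> CLAIM_ID =
            claim -> claim.get("claimId");

    public static void main(String[] args) {
        Order order = new Order("acme", new byte[1024]);
        Map<String, String> claim = TO_CLAIM.apply(order, "tag-42");
        System.out.println(CLAIM_ID.apply(claim)); // prints "tag-42"
    }
}
```

Keeping the two as separate, pluggable functions reflects the point that we cannot assume much about the claim message: a different claim format only requires swapping this pair.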
Putting together the five elements above, it starts to look like, in the general case, we need a separate, configurable one-way (in-only) route with some processing in the departure and arrival endpoints. It already looks like our implementation will actually require a different kind of Endpoint, which also means a Component. It should support multiple queues at departure and arrival (i.e. both for check-in and claim). Since processing takes place on the main channel and the initial message is restored at arrival during baggage claim, it must be retrieved from the arrival claim queue, which means that the arrival queue must support random access. This is a key aspect, sometimes overlooked, that causes a lot of grief when implementing this pattern.
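The random access requirement is easiest to see in code. The sketch below is a hypothetical in-memory arrival queue keyed by correlation id, built on a `ConcurrentHashMap`; the class and method names are invented for illustration and say nothing about how the eventual Camel component will look.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: an arrival "queue" that is really a keyed store,
// so messages can be claimed in any order.
final class ClaimStore<T> {
    private final Map<String, T> byId = new ConcurrentHashMap<>();

    // Called when an initial message arrives on the baggage channel.
    void checkIn(String id, T message) {
        if (byId.putIfAbsent(id, message) != null) {
            throw new IllegalStateException("duplicate claim id: " + id);
        }
    }

    // Called when the claim message arrives on the main channel.
    // Returns null if the id is unknown or was already claimed.
    T claim(String id) {
        return byId.remove(id);
    }

    public static void main(String[] args) {
        ClaimStore<String> arrivals = new ClaimStore<>();
        arrivals.checkIn("a", "first bag");
        arrivals.checkIn("b", "second bag");
        // Claims do not have to follow check-in order.
        System.out.println(arrivals.claim("b")); // prints "second bag"
        System.out.println(arrivals.claim("a")); // prints "first bag"
        System.out.println(arrivals.claim("a")); // prints "null"
    }
}
```

A plain FIFO queue would force the claim for "b" to wait behind "a", which is exactly the grief mentioned above. The sketch also ignores the case where the claim message arrives before its baggage, which a real implementation would have to handle.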
As far as implementation goes, I have started to play with a bit of code, so you should see a solution in one of my next posts, and probably in the Apache Camel trunk soon.