Thursday, April 14, 2011

Transforming XML in CXF

Working with XML-based web services involves more than having a proxy or HTTP client invoke a remote endpoint while the underlying data binding magically populates a given bean instance with the values contained in the XML tags.

Working with XML also means taking care that all those clients distributed all over the enterprise and the Web at large are able to consume both the existing and the newer endpoints.

If you have a newer endpoint deployed that can handle the payloads from the older clients, then backward compatibility is maintained. If validation is enabled, this is usually achieved by making sure that whatever is added to the updated schema is optional.

What happens if you have so many endpoints that they have to be replaced gradually by newer ones? Should newer clients be blocked from consuming the old endpoints? Is that really needed if all the new payloads add is some ignorable content that makes the information about a given concept more complete for newer endpoints? Is it even realistic, as in the case of the Web? Is telling such clients 'wait, the endpoint is being upgraded' the cheapest and simplest option?

This is where forward compatibility comes in, and it implies that unrecognized content has to be ignored. Disabling validation is not always safe, because newer clients providing new content may expect that this content has been taken into account rather than ignored. Validation prevents important unrecognized tags from being silently dropped, but it also prevents forward compatibility altogether.

The new CXF Transform Feature has been introduced, and hopefully it will help users resolve all sorts of backward and forward compatibility issues. CXF JAX-WS and JAX-RS endpoints and clients can be configured so that namespaces are dropped or changed, and elements are dropped, changed or appended to existing elements. All of this is done at the StAX XMLStreamReader and XMLStreamWriter levels, so it is fast and effective.
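To give a feel for what a transformation at the XMLStreamReader level looks like, here is a minimal sketch that uses only the JDK StAX API. It is not the CXF implementation: the class names, the rename map and the sample payload are all hypothetical, and it only handles element names. The wrapper hides namespaces and renames elements on the fly, so the code reading from it never sees the original names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.util.StreamReaderDelegate;

public class RenamingReaderDemo {

    /** Hypothetical wrapper: reports every element without a namespace and renames local names. */
    static class RenamingReader extends StreamReaderDelegate {
        private final Map<String, String> renames;

        RenamingReader(XMLStreamReader reader, Map<String, String> renames) {
            super(reader);
            this.renames = renames;
        }

        @Override
        public String getLocalName() {
            String name = super.getLocalName();
            return renames.getOrDefault(name, name);
        }

        @Override
        public String getNamespaceURI() {
            return null; // pretend the element is not namespace-qualified
        }

        @Override
        public String getPrefix() {
            return null;
        }

        @Override
        public QName getName() {
            return new QName(getLocalName());
        }
    }

    /** Collects the element names as a consumer (for example a data binding) would see them. */
    static List<String> elementNames(String xml, Map<String, String> renames)
            throws XMLStreamException {
        XMLStreamReader raw =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        XMLStreamReader reader = new RenamingReader(raw, renames);
        List<String> names = new ArrayList<>();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(reader.getName().toString());
            }
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<ns:book xmlns:ns='http://books'><ns:title>CXF</ns:title></ns:book>";
        System.out.println(elementNames(xml, Map.of("book", "thebook")));
    }
}
```

Since the wrapper only overrides a few accessor methods and the document is never buffered, this kind of transformation stays cheap, which is the point of doing it at the StAX level.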

Does your web service need to talk to a legacy server which does not understand what namespaces are? Do you need to consume a response from a newer endpoint which has introduced a new namespace? Do you need certain elements ignored or dropped on the input or output?
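As an illustration of the last question, dropping an element together with everything inside it can be sketched with a plain StAX reader and writer. Again, this is only a JDK-based sketch under simplifying assumptions (attributes and namespaces are ignored, and the class, method and element names are made up for the example), not what CXF does internally.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

public class DropElementsDemo {

    /** Copies the XML to a string, leaving out every element whose local name is in 'dropped'. */
    static String copyWithout(String xml, Set<String> dropped) throws XMLStreamException {
        XMLStreamReader reader =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    if (dropped.contains(reader.getLocalName())) {
                        skipSubtree(reader); // also consumes the matching end tag
                    } else {
                        writer.writeStartElement(reader.getLocalName());
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    writer.writeEndElement();
                    break;
                case XMLStreamConstants.CHARACTERS:
                    writer.writeCharacters(reader.getText());
                    break;
                default:
                    break; // comments, processing instructions etc. are not copied in this sketch
            }
        }
        writer.flush();
        return out.toString();
    }

    /** Advances the reader past the end tag of the element it is currently positioned on. */
    private static void skipSubtree(XMLStreamReader reader) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<book><title>CXF</title><rating><stars>5</stars></rating></book>";
        // prints <book><title>CXF</title></book>
        System.out.println(copyWithout(xml, Set.of("rating")));
    }
}
```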

Use the transformation feature, get inspired and write services which just work and don't force all the clients to be recompiled or updated whenever a minor change has been applied to the new endpoint. Have your web service clients talk to a variety of endpoints without changing the code.

Combining the transform feature with servlet-based redirection makes it possible to replace the old endpoints with new ones and redirect the requests from the old clients to the new endpoints, which is quite cool. See this web.xml (CXFServletV1 redirects to CXFServletV2, effectively changing the final URI path from /v1/rest-transform to /v2/rest-transform), with CXFServletV2 serving a jaxrs:endpoint with the "restTransform" id, the last jaxrs endpoint in this beans.xml. Note that this endpoint relies on the transform feature, which drops the namespaces from the inbound payloads and changes the name of the outbound element, but only for redirected requests, so that V1 clients can talk to V2 endpoints without V2 clients being affected.

Finally, if you use the default CXF JAX-RS JSONProvider, then the transform feature can be applied to input and output JSON sequences too. Surely you don't want the JSON clients failing to parse the JSON data if you happen to add one more property to the JAXB bean which is used to produce a JSON sequence :-).

Most of it can also be done with custom CXF interceptors, which can register custom StAX handlers or use XSLT or XPath. I've updated the Advanced XML section on the CXF JAX-RS wiki, please have a look.
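For comparison, here is roughly what the XSLT route looks like for the "drop all namespaces" case, using only the JDK's built-in XSLT 1.0 processor. The stylesheet and the class name are my own illustration and are not taken from the CXF wiki. How the transformation gets wired into a custom interceptor is left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltNamespaceStripDemo {

    // Rebuilds every element and attribute under its local name only.
    static final String STRIP_NS_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='*'>"
      + "<xsl:element name='{local-name()}'>"
      + "<xsl:apply-templates select='@*|node()'/>"
      + "</xsl:element>"
      + "</xsl:template>"
      + "<xsl:template match='@*'>"
      + "<xsl:attribute name='{local-name()}'><xsl:value-of select='.'/></xsl:attribute>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String stripNamespaces(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STRIP_NS_XSLT)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<ns:book xmlns:ns='http://books'><ns:title>CXF</ns:title></ns:book>";
        System.out.println(stripNamespaces(xml));
    }
}
```

An XSLT transformation typically builds an in-memory representation of the input before producing the output, so for simple renames and drops the streaming approach is usually the lighter one.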
