Monday, April 13, 2009

Analysis of CXF RESTful client APIs

We have recently introduced the CXF RESTful client API so that CXF users can easily consume non-SOAP HTTP-based web services. The goal of this blog entry is to compare the three API flavors CXF offers and discuss the pros and cons of each one, with some occasional off-topic musings :-)

Before I proceed I'd like to acknowledge that the RESTEasy JAX-RS implementation offers proxy-based support in its own client framework. I was not aware of it at the time of introducing the CXF RESTful API, but I'm happily acknowledging it now - well done RESTEasy :-). If at least two implementations offer a proxy-based API then maybe there's a chance we'll get this approach standardized?

Now, here's the actual comparison of the CXF RESTful APIs.

Beautiful Proxies

Proxies have been written off as the main reason behind brittle client applications which have to be recompiled every time a service they consume changes. Some say they epitomize all the problems associated with rpc-encoded, SOAP-based services. Others would avoid them simply because they're not cool.

Things can be quite different in reality though, and here's why.

Typically, the code which relies on proxies has to be recompiled whenever the service interface changes. These proxies have often been generated from a given service document. The key thing here is that it's actually service interface developers who control how robust the client-side proxies will be.

Whenever a new requirement arises, users will often add yet another operation to the interface description instead of extending the data the existing operations operate upon or introducing another interface. Read the last sentence again: you should recognize this is how HTTP-based resources cooperate with each other. When a current (service) resource 'exhausts' itself, a new resource is introduced, with the previous one delegating to the new one.

So what does this have to do with proxies? Proxies can capture this process remarkably well, and with some help from the service interface authors they can weather the changes.


// Create a typed proxy for the BookStore interface
BookStore store = JAXRSClientFactory.create("http://bookstore.com", BookStore.class);

BookDescription desc = new BookDescription();
desc.setId(123);
Book book = store.getBook(desc);
Chapters chapters = book.getChapters();


The rule is simple: design the interface such that a single complex type is accepted and/or returned by a given method. In the above example, when the book description changes it is quite likely the client code won't need to be recompiled.
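To make this concrete, here's a hypothetical sketch (the 'name' field and accessors are my own illustration, not part of any actual CXF demo): when a new way of identifying a book is needed, the complex type grows a field while the method signature stays put.

```java
// Hypothetical evolved description type: a 'name' field is added in a later
// revision, yet the getBook(BookDescription) signature never changes, so
// existing clients keep compiling and running.
public class BookDescription {
    private long id;
    private String name; // added later; old clients simply leave it null

    public long getId() { return id; }
    public void setId(long id) { this.id = id; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}
```

Old clients keep calling setId() and ignore the new field; new clients can start setting the name - neither side is forced to recompile against a changed signature.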

In fact, it's likely the service code won't need to be recompiled either. It's ironic that everyone accepts code like this on the server side, without talking about RPC, for writing, say, RESTful services:


@Path("/store")
class BookStore {

@GET @Path("/{id}")
public Book getBook(BookDescription desc) {
    // look up and return the book matching the description
}

}


Look - this code can be as brittle as the client code shown earlier. Just accept an integer id instead of BookDescription and then see what happens to this code when you need to retrieve by a book name as well: recompilation to accommodate a new 'String name' argument, or a new operation like getBookByName(), is the typical answer in the Java code. Accept a complex type instead and both clients and servers will become much more cost-effective when dealing with minor changes. By the way, pay attention to the getBookByName() alternative - it's not the fault of documents like WSDL that such operations pop up; rather, it's the absence of single complex types in the signatures that causes them to appear in the Java code.

Another thing to recognize in the above server code fragment is that the code which returns a Book serves as a client to the code which expects the Book on the other end. Rather than returning a name, this code returns a Book - it's natural, but imagine what would've happened to the code on both sides if a name had been returned originally and then a new requirement to get the chapters had arisen as well...

In fact, the server code is often less aware that the distributed network is out there. With CXF JAXRS proxies you can get all the details of the underlying exceptions if needed, and switch easily to their HTTP-centric forms.

When you view the outside web services world through a given proxy you often won't notice a difference between the styles. In fact, sometimes you'll actually see how little substance there is in some of those REST vs SOAP discussions. In my own mind, it's not generic interfaces in REST vs proxies in SOAP. Rather, shortcomings of individual (WS-*) interfaces are often confused with those of SOAP or blamed on WSDL. SOAP suffers from the lack of GET, but on the other hand it can get people to agree on how to secure messages across the hops. REST forces people to think in terms of data, as a given resource typically supports a common set of verbs; with SOAP you tunnel everything through POST, but nothing prevents you from thinking in terms of data too. With REST you get the support of generic tools like browsers; with SOAP you miss out on that, but sometimes you probably don't notice it, nor does the Web - you're just happy that your application delivers.

With the advent of JAX-RS MessageBody providers, proxies can cope even better with changes to the data. Everyone uses JAXB now because it magically hides the 'complexities' of XML from developers. Sometimes I find it strange, though I do use JAXB now and then too. You know what - sometimes it is ok just to write a little bit of XML processing code; web services are about interoperability, and XML is the underlying format which can make it possible. But if you do use JAXB then you can always interpose an XML-processing JAX-RS provider which will adapt incoming XML to a form which JAXB will deserialize properly. Or perhaps you can just skip JAXB altogether, as one of the CXF users once suggested.

Proxies based on JAX-RS Path values with custom regular expressions can serve as early validation points. Proxies in CXF can throw the exceptions you need, let you examine the actual response headers, and easily switch to HTTP-centric clients. Using a given proxy is like exercising a micro Domain-Specific-Language instance.

Proxies and existing JAXRS annotations can coexist very well in most cases. One exception is that @Context parameters can not be used as method parameters, which is ok. Another edge case is that root path annotations with template variables, those sitting on top of the resource class, may (though not always) require a client to provide the substitution values at proxy creation time.

Would I use them myself? If it were not an infinitely rich data model that my code were to consume, then why not? If I knew I was about to write code consuming a well-behaved web service which deals with books and clearly documents its extension policies, then why not? Would it turn my client code into an RPC-encoded piece as opposed to a RESTful one? I don't know...

A breath of fresh air: HTTP-centric clients

I have to admit: programming HTTP directly can be very refreshing, liberating if you wish. You can see it actually works, you know you're doing generic interface programming, the code is explicit about gets, updates and deletes. It's new and indeed it's cool.


WebClient wc = WebClient.create("http://bookstore.com");
Response r = wc.get();
Book b = getWithJAXB(r.getEntity());


In CXF we saw no reason to introduce client-side analogs of Response. When server code uses Response, clearly it's intended for a client? It's called 'Response', and one can get the status, headers (metadata) and the entity on either side. The same goes for other types like WebApplicationException.

I like this http-centric code but I'm still trying to figure out why and when I would actually use it.

First, I think it would be naive to assume that using the above code makes it any more reusable than the proxy-based one. It's simply not the case that you can take this code and use it against any other Book service out there, because other services will do something different with a common set of HTTP verbs, or will simply deal with a completely different set - PATCH anyone, or WebDAV?

Second, the combination of the generic HTTP code and that of JAXB (suppose a default JAXB provider is used to deal with XML) makes me a bit dizzy - though I still like the above code, maybe because it is just something new. Why do we say we write remote web services code here and yet completely hide away what makes the fundamental idea behind web services, that of interoperability, work?


WebClient wc = WebClient.create("http://bookstore.com");
Response r = wc.get();
Book b = getWithJAXB(r.getEntity());

r = wc.path("/foo/bar").get();
r = wc.back().get();





I do like the way WebClients can move back and forward when working with a given service, or indeed convert themselves into proxies and back - say when dealing with a large set of services, with proxies being available for a subset of the exposed resources. Switching from proxies to WebClients can be handy when handling remote exceptions.
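As a rough sketch of that switch (the 'bookstore.com' address and BookStore interface are placeholders, and this obviously needs a live service to run against): a proxy created via CXF's JAXRSClientFactory can be turned into an HTTP-centric WebClient via WebClient.client() and WebClient.fromClient().

```java
// A hypothetical BookStore proxy; WebClient.client() exposes the underlying
// client state and WebClient.fromClient() builds an HTTP-centric client
// sharing the same current URI and headers.
BookStore proxy = JAXRSClientFactory.create("http://bookstore.com", BookStore.class);

// Drop down to the HTTP level, for example to inspect an error response
WebClient wc = WebClient.fromClient(WebClient.client(proxy));
Response r = wc.path("/store/123").get();
if (r.getStatus() >= 400) {
    // examine the status and headers directly
}
```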

I like the Builder pattern here. It was obvious we'd need to do the builder pattern given that Jersey does it on the client side - if the client API does get standardized then you know the builder pattern will be there.

There is only one problem with the Builder pattern - you may not actually want to hardcode the wc.path("/foo/bar") into your client code - it's plainly brittle. It's a common problem with all those in-code rules, paths and routes - people sometimes forget why they are writing web service code in the first place: to work in an interoperable way for as long as possible.

When would I use WebClients? They'd excel in testing the resources, for sure. It's new and it's fresh. They'd also work nicely in combination with proxies, or indeed against well-behaved services, where perhaps embedding explicit paths will work.


Want to be cool? Use XMLSource

In CXF we have introduced XMLSource, a lightweight utility class for dealing with XPath expressions.


WebClient wc = WebClient.create("http://bookstore.com");
XMLSource source = wc.get(XMLSource.class);
source.setBuffering(true);
Book b1 = source.getNode("/store/book[1]", Book.class);
Book b2 = source.getNode("/store/book[2]", Book.class);
Book[] books = source.getNodes("/store/book", Book.class);
URI firstBookLink = source.getLink("/store/book/@href");
// xml:base
URI baseURI = source.getBaseURI();


I do like it. Working with XPath is the best way to write robust web services code, on either side (note that XMLSource can be used on the server side too, as a method input parameter). It supports namespaces too - just pass along a map of prefix-to-namespace pairs:


WebClient wc = WebClient.create("http://bookstore.com");
XMLSource source = wc.get(XMLSource.class);
Map<String, String> map = new HashMap<String, String>();
map.put("ns", "http://books");
Book b1 = source.getNode("/ns:store/ns:book[1]", map, Book.class);


As you know, in XPath you don't need to match a prefix like 'ns' against the actual prefix which will be on the wire, so the above code will work even if books are qualified with 'bar:', as long as it is "http://books" that the prefix binds to.
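Here's a self-contained JAXP sketch (plain javax.xml.xpath, not CXF's XMLSource internals; the document and namespace URI are made up) showing that only the URI binding counts, not the prefix itself:

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathNsDemo {
    public static void main(String[] args) throws Exception {
        // The document on the wire happens to use the 'bar' prefix
        String xml = "<bar:store xmlns:bar=\"http://books\">"
                   + "<bar:book>CXF in Action</bar:book></bar:store>";

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind our own 'ns' prefix to the same namespace URI
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "ns".equals(prefix) ? "http://books" : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String namespaceURI) { return null; }
            public Iterator<String> getPrefixes(String namespaceURI) { return null; }
        });

        // Matches despite the prefix mismatch: only the URI binding counts
        String title = xpath.evaluate("/ns:store/ns:book",
                                      new InputSource(new StringReader(xml)));
        System.out.println(title);
    }
}
```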

At the moment XMLSource uses JAXB, but it will eventually support custom XML-aware providers too. One problem with existing JAXP classes like Source is that you can really run an XPath expression against a given Source just once; with XMLSource you can do it multiple times against XML instances of small to medium sizes (note that setBuffering(true) call).

We will help users get back to XML - it's good and very cool to know and understand the technology which underpins modern web services - so be cool!

Look at it the other way. You are about to start writing code which will consume one of those Atom-inspired Google services: http://code.google.com/apis/base/starting-out.html. You know what you need to do - XMLSource will make it trivial for you :-)


So that's it for now, and to cut a long story short: please use the CXF RESTful API, choose the flavor you like, and help us to improve it.