Musings about web services: October 2007

Wednesday, October 31, 2007

One for reads, many for writes

So we have GET and POST. And we also have PUT, UPDATE and PATCH. Please note UPDATE is the verb introduced by Web3S, while PATCH is the verb likely to be used in AtomPub based applications.

POST, PUT, UPDATE and PATCH are all about writes. They have different semantics. But they're about writes. Actually, POST and PUT can both be used to create new resources, depending on who is in charge (thanks to the Book). Also, POST with a Multipart-Related content type seems similar to this possible application of PATCH.

PUT is considered idempotent. Unless it is used to create new resources. Well, it still might stay idempotent in this case, as long as no POST is allowed. Otherwise you create your new resource with PUT and in no time someone will start POSTing to it after discovering it through GET, and then DELETEing it.
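To make the idempotency point a bit more concrete, here is a minimal sketch, assuming a hypothetical in-memory store (the `ResourceStore` class and its method names are made up for illustration and do not come from any framework). Repeating a PUT to the same URI leaves a single resource in the same state, while repeating a POST to a collection creates another resource each time.

```python
import itertools


class ResourceStore:
    """Hypothetical in-memory stand-in for a server's resource handlers."""

    def __init__(self):
        self.resources = {}
        self._ids = itertools.count(1)

    def put(self, uri, body):
        # PUT: the client picks the URI; repeating the call changes nothing further.
        self.resources[uri] = body
        return uri

    def post(self, collection_uri, body):
        # POST: the server picks the URI; every call creates another resource.
        uri = f"{collection_uri}/{next(self._ids)}"
        self.resources[uri] = body
        return uri


store = ResourceStore()
store.put("/notes/hello", "first draft")
store.put("/notes/hello", "first draft")   # the same request sent again
after_put = len(store.resources)           # still a single resource

store.post("/notes", "a comment")
store.post("/notes", "a comment")          # the same request sent again
after_post = len(store.resources)          # two more resources now exist
```

The sketch deliberately ignores what happens once other clients start POSTing to or DELETEing a resource that was created with PUT.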

The point of this post is not to try to claim all these write verbs are confusing. They're not confusing as long as you know what you're doing, when to use those verbs and have built your resource handlers properly, etc. They can help you to add more meaning to the updates.

Knowing when it's better to apply POST versus PUT, for example, is useful. It can help you build a purer RESTful application. Rather than having a bunch of overloaded execute methods, it's appealing to have verbs whose names suggest some semantics.

When I see healthy discussions about which verb should be used, it reminds me of healthy discussions I've seen about how to write a good Java application, for example. It's about skills, about knowing your tools.

There's only one thing I don't understand : what difference does it make for the WEB ?

Not yet, but maybe I'll see it in some time. The main question for me is not whether it's better to have create() and update() methods or just several overloaded execute methods. Rather, will it make any difference to the WEB at large that some applications are using GET and POST while other applications are using GET, POST and PUT ? So far, discussions about when to use POST vs PUT etc. focus more on technical purity than on interoperability.

For example, all the HTML forms out there use POST even when deleting resources, and it has not crippled the WEB, has it ? Future forms will use PUT and DELETE but what difference will it make for the WEB ?

This interesting example demonstrates how PATCH can be used to do what GData does today with batch updates. Yes, it's interesting, I'm sure it will work, but what difference will it make for the WEB ? Doesn't it seem a bit too complicated just for the purpose of avoiding overloaded POSTs ?

SOAP-based services are blamed for the fact that a lot of custom verbs are pushed over POST. In many cases it's a fair point (actually, I'd like to muse about it later). Now, as far as interoperability in a RESTful world is concerned, one needs to know the semantics of a given application in order to decide when POST vs PUT vs PATCH vs UPDATE should be used. It's unlikely that a tool will be written soon which will figure out by itself which verb to use when updating a given resource. If you're using an Atom-powered client then maybe the decision will be made by the tool. But while the Atom adoption rate will no doubt increase, there is also little doubt that other RESTful protocols, both generic and custom ones, will progress too.

The popular argument that a uniform interface lets one switch between communities easily won't work with a lot of write verbs being used out there. This argument in itself is not very practical anyway. A uniform interface does not tell you about the application or when to use which verb. HTTP OPTIONS is there but it may not always help. Say, in Amazon S3 PUT is used to create new resources while in AtomPub it's used to update the state of a given existing resource.
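A small sketch of why OPTIONS may not help, under stated assumptions: the two handlers below are hypothetical and only loosely modelled on a storage-style service and a publishing-style service, and the status codes are my own choice for illustration. Both would advertise exactly the same verbs in an Allow header, yet PUT means something different in each.

```python
def parse_allow(header_value):
    """Split an HTTP Allow header value into a set of method names."""
    return {m.strip().upper() for m in header_value.split(",") if m.strip()}


# Hypothetical Allow values two different services might return for OPTIONS.
storage_style_allow = parse_allow("GET, PUT, DELETE")
publishing_style_allow = parse_allow("GET, PUT, DELETE")


def storage_style_put(resources, uri, body):
    # Storage-style: PUT creates the resource if it does not exist yet.
    created = uri not in resources
    resources[uri] = body
    return 201 if created else 200


def publishing_style_put(resources, uri, body):
    # Publishing-style (assumed behaviour): PUT only updates an existing resource.
    if uri not in resources:
        return 404
    resources[uri] = body
    return 200


same_verbs = storage_style_allow == publishing_style_allow
storage_status = storage_style_put({}, "/items/1", "data")
publishing_status = publishing_style_put({}, "/items/1", "data")
```

The list of allowed verbs is identical, so a generic client still has to learn the application's semantics from somewhere else.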

At least I can imagine how one can write a truly generic tool, possibly working in a semantic web world (I wish someone wrote a book like RESTful Web Services about it), if all the resources out there supported a single update method.

A single generic read method is good enough for all. Why is everyone so fixed on having many write verbs ?

Also, I'm wondering, does true interoperability exist only in the world of fantasy ?

Sunday, October 21, 2007

Doubts about links in banking applications

Stefan Tilkov has posted a "Doubts about links" entry where he talks about why doing

GET http://example.com/192879202039374738

is better than having a method like

Customer getCustomer(ID id) :

If I have an id, I need to know that a) it is, indeed, a customer ID and b)
that I have to call the getCustomer() method to retrieve more information.
There’s no agreement, no uniformity to the interface.

etc... This is all the proper REST talk from Stefan.

The reason this entry caught my attention was that the sample application it referred to was a banking application. Surprisingly, banks and accounts come up very often in discussions about web services; I myself have referred to banking applications a few times before.
Surprise, surprise, I've never written a banking application :-), but those banks are so handy when discussing applications built around factory patterns.

As it happens, I'm thinking a lot, like many other people, about when and why I would use a RESTful style when building a service as opposed to a more coarse-grained approach.

Questions like : what is the audience (who is going to consume the service), what is the ultimate benefit, how practical is it, etc. are bothering me. I see a lot of potential in exposing different types of data resources to the WEB. Their state can be transient, but the top-level resources themselves should be fairly stable as far as their lifetime is concerned.

After all, as far as programming REST is concerned, it's all about making the life of consumers easier, right ? It's nice when they can use their browser and see the application data, or use, say, an Atom-enabled reader and check the events coming out of my application. It's cool when they can build mashups on top of my own data.

So when I see people saying, ok, when you do your banking applications, just use GET when referring to, say, people's accounts, because it's RESTful, I get confused. I'd love to see at least one analysis out there which would explain what it means, practically, to write a banking application using a RESTful approach.

When I'm doing my online banking I go to http://mybank/internet, log in there, and start a transient and secure session. I'm not going to add a link to my account to my Favourites folder, nor am I going to build a cool mashup on top of it. There's unlikely to be some kind of generic intermediary sitting between the client and the server and doing some advanced caching.

I'd like to understand what it means to write a banking application using a RESTful approach. I'd also love to see, at least once, someone who unreservedly advocates REST saying : maybe for some types of applications a resource-oriented approach might not be the best fit...

I know I can write most of my applications using a RESTful approach. But I also know that I can use XSLT to split a 1MB string using recursive functions or to solve a complex chess problem. The question is, what for ? XSLT is really perfect at doing apply-templates and match, but not at solving algorithmic problems.

So as far as I'm concerned, I'd like to see the ultimate goal of going with REST for a given service, rather than doing it just for the sake of it and then failing to figure out who is going to benefit from it, while at the same time telling all my friends that I've written a RESTful service.

I'm looking forward to seeing more pragmatic and practical discussions in this area.

Sunday, October 14, 2007

When to use Atom

Yaron Goland has published a thought-provoking entry about Atom. It's fun to read too.
I've never seen Star Wars, not a single episode; I should have. I didn't immediately recognize
who General Weasdel was, and only after reading an interesting discussion on Sam Ruby's blog did I realize who Darth Sudsy was :-).

In short, one of the questions Yaron raises is : when to use documents in the Atom syndication format (ASF), given that one has to tunnel custom XML inside individual Atom entries as opposed to just passing this XML around as is ?

Whether the example Yaron uses is contrived or not, it's hard not to notice that, yes, one adds an extra layer of complexity when wrapping the content inside Atom entries. Yes, the example shown can be reformatted to make it more readable, but one will still have some markup there which has nothing to do with the original content.
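A rough sketch of that extra layer, using Python's standard `xml.etree.ElementTree`. The `wrap_in_atom_entry` helper and the `<order>` payload are made up for illustration (this is not Yaron's example), and the entry is deliberately minimal: it omits elements such as the author that a complete standalone entry would carry.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("atom", ATOM_NS)


def wrap_in_atom_entry(payload, entry_id, title, updated):
    """Wrap a custom XML element in a minimal (incomplete) Atom entry."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content", type="application/xml")
    content.append(payload)
    return entry


# Hypothetical custom payload; the element names are invented for this sketch.
payload = ET.fromstring("<order><item>book</item><quantity>1</quantity></order>")
entry = wrap_in_atom_entry(
    payload, "urn:example:order:1", "Order 1", "2007-10-14T00:00:00Z"
)

payload_elements = sum(1 for _ in payload.iter())  # elements in the raw payload
entry_elements = sum(1 for _ in entry.iter())      # elements after wrapping
wrapper_overhead = entry_elements - payload_elements

print(ET.tostring(entry, encoding="unicode"))
```

Even in this stripped-down form the wrapper contributes more elements than the payload itself, which is the kind of overhead the question is about.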

So when is it worth it ? I've tried to contemplate a bit about it here and I'm glad to see this discussion happening now.

As far as I'm concerned, figuring out when to use ASF is the least difficult part. Dare Obasanjo points to the fact that Atom is good at representing streams of (timestamped) microcontent, and Sam Ruby and Yaron offer some thoughts on when Atom is better used. This comment also suggests that Atom-wrapping a given piece of content is not a de facto choice; it depends on what people want to achieve by doing so.

I see ASF as being particularly good at representing arbitrary types of events, for example.

The difficult question is : when is it really worth using AtomPub as the application-level protocol of choice ? Just because there are Atom-enabled client tools out there ? So far I feel it matters only when generic tools are targeted, but I may be wrong.

Another reason which is cited often enough is that Google does it, with its GData protocol. Oh, man, Darth Goo-Goo-L and his general G'Day tah, :-), that is. The idea of widespread internet programming with the help of GData-enabled client libraries might not be that far-fetched at all, you never know :-).

And I thought it's all just about sending simple XML around :-). The battle is just beginning, which format to use and what protocol to use, and so on and so forth :-)

Friday, October 12, 2007

My simple WEB

In my simple WEB I primarily care about 3 main things :

* GET
* Addressability and Links
* Ignore Unknown Extensions

Web services need to be addressable, when possible and practical. This will let them live in the WEB. Consumers can GET something out of such services easily. Yeah, they also need to be able to update these services somehow. So let's add POST.

Now, everyone knows about PUT, DELETE and some other verbs but so far I don't quite understand how my simple WEB will benefit from PUT and DELETE, so I'll leave them out for now. Once I understand I'll welcome them in.

I'd naively assume that this is all that is needed to have all the services I've built play nicely with each other, irrespective of the style used to develop these services.

What really surprises me in all those debates about which style of building web services wins is that very rarely, if ever, the consumer's ability to ignore unknown extensions, aka forward compatibility, is mentioned as an absolutely key ingredient.

This is one of those things which truly makes the WEB scale, as far as the economics associated with the cost of the change and usability of client tools are concerned. In a RESTful world, the focus is on the data. This makes it easier to deal with extensions : one just extends the language.
In a not so RESTful world people are often tempted to deal with new extensions by introducing yet another method or yet another interface all the time, even when it's avoidable. The lesson from a RESTful world is to focus on the data extensibility and not on the interface extensibility and interface changes.
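Ignoring unknown extensions can be sketched in a few lines. This is only an illustration under assumptions: the `<customer>` representation, the `read_customer` function and the `KNOWN_FIELDS` tuple are all hypothetical. The consumer picks out the elements it understands and silently skips anything added later, so the data can be extended without a new method or interface.

```python
import xml.etree.ElementTree as ET

# Hypothetical fields this particular consumer understands.
KNOWN_FIELDS = ("name", "email")


def read_customer(xml_text):
    """Extract the known fields and skip any elements the consumer does not know."""
    root = ET.fromstring(xml_text)
    return {field: root.findtext(field) for field in KNOWN_FIELDS}


v1 = "<customer><name>Ann</name><email>ann@example.com</email></customer>"
# A later version of the representation adds an element this consumer has never seen.
v2 = (
    "<customer><name>Ann</name><loyalty>gold</loyalty>"
    "<email>ann@example.com</email></customer>"
)

customer_v1 = read_customer(v1)
customer_v2 = read_customer(v2)
```

The same consumer code handles both versions; a consumer that validated strictly against the first version's schema would reject the second one.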

Versioning and extensibility is a fascinating subject, but it's off-topic here, so I'd rather chat more about it later; I'll refer to what some thought leaders out there say about it.
