Friday, January 21, 2011

Unified Search Experience Made Easy With CXF

Providing a high-quality search experience is an important task for any serious web application project.

Most HTML forms let users run simple queries that result in equality or partial-match checks on the server side. For example, "Find the book with the id starting with 123" or "Find all the books with the author's first name starting with Fred".

Now imagine the following task: provide a web interface that lets users search for all the books with an id greater than 123, or find the list of customers who paid less than 200 dollars.

What I've observed quite a few times is that when multiple services are built by individual teams within the same large organization, or as part of a single large project, every team creates its own query language.

One team will come up with a custom query language of its own, and another team working on a different service will build a slightly different variation of it.

The end result is that users may need to learn two query languages: one to query the first service and another to query the second.

Sometimes different teams will agree on using the same query language.

Using explicit SQL expressions is one option. Most likely a frontend UI tool will collect the user input, convert it into SQL and pass it to the remote service. IMHO the use of SQL as a query language on the Web should be discouraged for an obvious reason: the fact that the end service uses an SQL database for storing the data is the very last thing the consumer should know about. Just too much information is being leaked for this to work.

The use of XQuery or query languages created by Google Data and Microsoft teams is an entirely different approach. It does let developers provide a unified search experience to the users. For example, all the Google Data services have to support the same query language - something that users can appreciate.

As I mentioned in this post, CXF JAX-RS supports converting FIQL expressions into SearchCondition expressions which capture the FIQL queries and let users match them against the application data.

FIQL is indeed a simple language - please read this post from Arul for a nice introduction to FIQL. IMHO it does offer a viable alternative to more complex and advanced query languages, and we'd like to continue enhancing the CXF search extensions so that users can get the best out of FIQL.
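To make the FIQL idea concrete, here is a minimal, self-contained sketch in plain Java of how a query such as id=gt=123;author==Fred could be matched against in-memory data. It does not use the CXF SearchCondition API: the class FiqlMatchSketch, its methods and the sample books are all made up for illustration, and only simple comparisons joined with ';' (the FIQL AND) are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: this is NOT the CXF SearchCondition API.
final class FiqlMatchSketch {

    // Groups: property name, FIQL comparison operator, value.
    private static final Pattern COMPARISON =
        Pattern.compile("([A-Za-z_][A-Za-z0-9_]*)(==|!=|=gt=|=lt=|=ge=|=le=)(.+)");

    /** Parses comparisons joined with ';' (FIQL AND) into a single predicate. */
    static Predicate<Map<String, Object>> parse(String fiql) {
        Predicate<Map<String, Object>> all = record -> true;
        for (String comparison : fiql.split(";")) {
            all = all.and(parseComparison(comparison));
        }
        return all;
    }

    private static Predicate<Map<String, Object>> parseComparison(String comparison) {
        Matcher m = COMPARISON.matcher(comparison);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported expression: " + comparison);
        }
        String name = m.group(1);
        String op = m.group(2);
        String value = m.group(3);
        return record -> {
            Object actual = record.get(name);
            if (actual == null) {
                return false; // a missing property never matches in this sketch
            }
            int cmp = compare(actual, value);
            switch (op) {
                case "==":   return cmp == 0;
                case "!=":   return cmp != 0;
                case "=gt=": return cmp > 0;
                case "=lt=": return cmp < 0;
                case "=ge=": return cmp >= 0;
                default:     return cmp <= 0; // "=le="
            }
        };
    }

    // Numbers are compared numerically, everything else as plain strings.
    private static int compare(Object actual, String value) {
        if (actual instanceof Number) {
            return Double.compare(((Number) actual).doubleValue(), Double.parseDouble(value));
        }
        return actual.toString().compareTo(value);
    }

    /** Returns the records that satisfy the FIQL expression. */
    static List<Map<String, Object>> filter(List<Map<String, Object>> records, String fiql) {
        Predicate<Map<String, Object>> condition = parse(fiql);
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> record : records) {
            if (condition.test(record)) {
                matches.add(record);
            }
        }
        return matches;
    }

    // Hypothetical sample data: a book is just a map of property names to values.
    static Map<String, Object> book(long id, String author) {
        Map<String, Object> book = new HashMap<>();
        book.put("id", id);
        book.put("author", author);
        return book;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> books =
            Arrays.asList(book(100, "Fred"), book(124, "Fred"), book(200, "Barry"));
        // Only the book with id 124 satisfies both comparisons.
        System.out.println(filter(books, "id=gt=123;author==Fred"));
    }
}
```

A real implementation would also need to deal with OR (','), grouping, wildcards and typed values such as dates, which is what the CXF search extensions are meant to take care of.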

The CXF SearchCondition interface offers a utility method for converting FIQL queries to SQL expressions. This method (toSQL()) has been deprecated recently. Users who find that this method works for them may continue using it for a while, but it is now recommended to use the SearchCondition visitors instead (thanks to Brian Topping for providing a patch).

It was possible to convert a SearchCondition into more optimized SQL or non-SQL expressions even before the introduction of visitors, but now the relevant code has become much cleaner. The SQLPrinterVisitor is shipped with CXF and can be used to convert queries to SQL, using the proper SQL aliases if needed. For example, imagine a query such as "a==b". The 'a' may easily be assumed to be the name of a column in some table, but we may not necessarily want end users to 'hard code' column names in their queries. The SQL visitor therefore lets service developers register an alias map so that the resulting query contains, say, "A_Column" instead of 'a'.
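The alias idea can be illustrated with a small standalone sketch. This is not the SQLPrinterVisitor itself: the class FiqlToSqlSketch, the Query holder and the "books" table name are assumptions made up for this example. It maps FIQL operators to their SQL counterparts, replaces property names using an alias map, and emits '?' placeholders so that user-supplied values never end up inside the SQL text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a hand-rolled stand-in, not CXF's SQLPrinterVisitor.
final class FiqlToSqlSketch {

    // Groups: property name, FIQL comparison operator, value.
    private static final Pattern COMPARISON =
        Pattern.compile("([A-Za-z_][A-Za-z0-9_]*)(==|!=|=gt=|=lt=|=ge=|=le=)(.+)");

    private static final Map<String, String> SQL_OPERATORS = new HashMap<>();
    static {
        SQL_OPERATORS.put("==", "=");
        SQL_OPERATORS.put("!=", "<>");
        SQL_OPERATORS.put("=gt=", ">");
        SQL_OPERATORS.put("=lt=", "<");
        SQL_OPERATORS.put("=ge=", ">=");
        SQL_OPERATORS.put("=le=", "<=");
    }

    /** SQL text with '?' placeholders plus the values to bind, in order. */
    static final class Query {
        final String sql;
        final List<String> parameters;

        Query(String sql, List<String> parameters) {
            this.sql = sql;
            this.parameters = parameters;
        }
    }

    /**
     * Converts comparisons joined with ';' (AND) and ',' (OR), without grouping.
     * Property names are replaced by their alias when one is registered.
     */
    static Query toSql(String fiql, String table, Map<String, String> aliases) {
        List<String> parameters = new ArrayList<>();
        List<String> orGroups = new ArrayList<>();
        // ';' binds tighter than ',' in FIQL, just as AND binds tighter than OR in SQL.
        for (String orPart : fiql.split(",")) {
            List<String> andTerms = new ArrayList<>();
            for (String comparison : orPart.split(";")) {
                Matcher m = COMPARISON.matcher(comparison);
                if (!m.matches()) {
                    throw new IllegalArgumentException("Unsupported expression: " + comparison);
                }
                String column = aliases.getOrDefault(m.group(1), m.group(1));
                andTerms.add(column + " " + SQL_OPERATORS.get(m.group(2)) + " ?");
                parameters.add(m.group(3));
            }
            orGroups.add(String.join(" AND ", andTerms));
        }
        String sql = "SELECT * FROM " + table + " WHERE " + String.join(" OR ", orGroups);
        return new Query(sql, parameters);
    }

    public static void main(String[] args) {
        Map<String, String> aliases = Collections.singletonMap("a", "A_Column");
        Query query = toSql("a==b", "books", aliases);
        // prints: SELECT * FROM books WHERE A_Column = ? [b]
        System.out.println(query.sql + " " + query.parameters);
    }
}
```

Falling back to the raw property name when no alias is registered is a design choice of this sketch; a stricter service might prefer to reject unknown properties outright.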

I can imagine XQuery-aware and other visitors being added in time. In fact, the way we are trying to build the search extensions is to make sure other query languages such as XQuery/etc are supported transparently. If users start asking for support of a new query language, we'll just provide the relevant SearchContext parser and SearchCondition visitor.

One immediate enhancement we are thinking of is to add a SearchQueryBuilder which users would use to build FIQL/etc queries using simple Java operations and pass the builder result to WebClients or proxies.
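As a rough idea of what such a builder might look like, here is a purely hypothetical sketch. None of these names exist in CXF; FiqlQueryBuilderSketch and its methods are invented for illustration, and the sketch does not validate the order in which the methods are called.

```java
// Hypothetical sketch of a fluent FIQL builder; not an existing CXF class.
final class FiqlQueryBuilderSketch {

    private final StringBuilder expression = new StringBuilder();

    static FiqlQueryBuilderSketch query() {
        return new FiqlQueryBuilderSketch();
    }

    /** Starts a comparison on the given property. */
    FiqlQueryBuilderSketch is(String property) {
        expression.append(property);
        return this;
    }

    FiqlQueryBuilderSketch equalTo(String value) {
        return append("==", value);
    }

    FiqlQueryBuilderSketch greaterThan(long value) {
        return append("=gt=", Long.toString(value));
    }

    FiqlQueryBuilderSketch lessThan(long value) {
        return append("=lt=", Long.toString(value));
    }

    /** Joins the previous and the next comparison with FIQL AND (';'). */
    FiqlQueryBuilderSketch and() {
        expression.append(';');
        return this;
    }

    /** Joins the previous and the next comparison with FIQL OR (','). */
    FiqlQueryBuilderSketch or() {
        expression.append(',');
        return this;
    }

    String build() {
        return expression.toString();
    }

    private FiqlQueryBuilderSketch append(String operator, String value) {
        expression.append(operator).append(value);
        return this;
    }

    public static void main(String[] args) {
        String fiql = query().is("id").greaterThan(123)
                             .and().is("price").lessThan(200)
                             .build();
        // prints: id=gt=123;price=lt=200
        System.out.println(fiql);
    }
}
```

The resulting string would then be URL-encoded and passed to the client or proxy as the search query parameter.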

So imagine all the different web services within the same organization supporting the same simple, URI-friendly query language which is easy to understand and use. One thing you can be sure of is that the end users will appreciate it, especially when they start building their own client applications which need to query a number of those web services.

By the way, it should work nicely for CXF JAX-WS services too provided they've been JAX-RS-enabled.

5 comments:

RainerW said...

The search support is just great. It helps a lot in unifying searches (and not creating a new query syntax). I love it.

But then, ..
I need more than just the _search= query, and if I do something like
http://...../foo?_s=key==3&level=2 it just breaks (as level is not a valid property name for the search)

Any idea (except including all possible future query parameters in the searchable class, *yuk*) ?

Best regards, RainerW

RainerW said...

Digging a little bit deeper ...
The problem seems to arise in SearchContextImpl.java.

getExpression() picks up the whole query string, and not just the part that starts with _s= or _search=. Reordering the query items does not help, as the query *must* begin with _s (or _search) to select the expression at all.

Currently, I am thinking about chopping off query items "to the right" of the search spec as first aid.

Any comment on this?

Best regards, RainerW

Sergey Beryozkin said...

Hi, this is fixed, please see

https://issues.apache.org/jira/browse/CXF-3298

By the way, do you have any ideas what other search parameters can be generally supported ?

I'm thinking of something like:
_s=a==b&_sortBy=a

so SearchCondition.toSQL can convert _sortBy into ORDER BY

thanks, Sergey

RainerW said...

Hi, thank you for the *quick* fix.
It works perfectly (at least for the test I've run up to now ;-) ).

As for general search parameters: _sortBy would be nice, agreed.

Another one would be a pair of _offset, and _limit to get a pageable, size restricted subset of matches.

For string comparisons, a distinction between case-sensitive and non-case-sensitive matches would sometimes be helpful, but this is probably better encoded as comparison operator. The same goes with a "startsWith" match.

For the general parameters, the existing/usual "other" parameters, like _type, _method, and (in my case) _style, should be avoided. As an alternative, general search parameters could be contained in an additional (end-)section within the search spec (extending the RFC), to avoid naming conflicts.

Best regards, RainerW

Sergey Beryozkin said...

Hi, thanks for the quick confirmation that the fix works :-)

The comments are very helpful.
_sortBy, _limit & _offset are worth supporting indeed.

Users would need to configure SQLPrinterVisitor for _limit & _offset to be mapped to a correct db-local property, but it would be simpler than building the SQL/etc expression manually by iterating through the individual SearchConditions...

I think you can do 'startsWith' by using a star:

_s=name==CXF*

thanks, Sergey