All posts by Alexander Ananiev

Yahoo Pipes – A Great Way to Create Composite Applications

The Yahoo Pipes web site was launched last week and almost immediately drew the attention of a large crowd – I think the site actually went down for a few hours after the launch.

Yahoo Pipes makes it extremely easy to “mash” different Web sources together – without any programming, using a drag-and-drop AJAX UI. The UI is actually very slick: it loads fast and provides a very intuitive environment. Basically, the UI allows users to create a “message flow”, or a pipe, using predefined customizable operations, which is a paradigm familiar to any enterprise developer.

It literally took me twenty minutes to put together a simple “pipe” aggregating different SOA-related feeds and allowing users to filter feeds (title or body) using keywords – you can check it out here (you don’t even need a Yahoo account to run it). Learning time was close to zero (perhaps five minutes).

This experience got me thinking. Why is it so easy to create composite applications on the Web (it was easy enough before, and with Yahoo Pipes it’s just gotten easier) and why is it so hard to do it in an enterprise SOA environment? Why don’t ESB tools have the same level of ease of use and the same degree of “zero administration”? Visual message flow and workflow editors for ESBs are nothing new, but they still come with steep learning curves, tricky configuration requirements and hefty price tags. Of course it’s not fair to compare a simple RSS aggregation/filtering function with typical enterprise tasks (e.g., try to reconcile three different customer XML schemas with different field semantics), but I still think that we have plenty of room for improvement in the enterprise SOA space.

JSON Pros and Cons

JSON is a simple object serialization approach based on
the JavaScript object initializer syntax. The code for the initializer (object
literal) is put into a string and then interpreted using the JavaScript eval()
function or a JSON parser (which is very lightweight):

var serializedObj = '{firstName:"john", lastName:"doe"}';
...
// This is just an example; a JSON parser should be used instead
// to avoid the security vulnerabilities of "eval"
var obj = eval("(" + serializedObj + ")");
document.getElementById("firstName").innerHTML = obj.firstName;

JSON is used extensively in various AJAX frameworks and toolkits to provide easy
object serialization for remote calls. It is supported by both GWT
and DOJO. There is a growing realization that
perhaps JSON should be considered as an option when implementing SOA, for
example Dion Hinchcliffe recently published a blog
entry
suggesting that JSON (and other Web 2.0 technologies) must be
seriously considered by SOA architects.

So what are the benefits of using JSON relative to XML (for this comparison I
want to focus just on XML and stay away from SOAP)? Here is my brief take on
that; there are also numerous other articles and posts on the subject.

JSON strengths (compared to XML):

  • JSON: Fully automated way of de-serializing/serializing JavaScript objects,
    with minimum to no coding. XML: developers have to write JavaScript code to
    serialize/de-serialize to/from XML.

  • JSON: Very good support by all browsers. XML: while XML parsers are built into
    all modern browsers, cross-browser XML parsing can be tricky (e.g., see this article).

  • JSON: Concise format thanks to its name/value pair -based approach. XML: verbose
    format because of tags and namespaces.

  • JSON: Fast object de-serialization in JavaScript (based on anecdotal evidence).
    XML: slower de-serialization in JavaScript (based on anecdotal evidence).

  • JSON: Supported by many AJAX toolkits and JavaScript libraries. XML: not very well
    supported by AJAX toolkits.

  • JSON: Simple API (available for JavaScript and many other languages). XML: more
    complex APIs.

JSON weaknesses (compared to XML):

  • JSON: No support for formal grammar definition, hence interface contracts are hard
    to communicate and enforce. XML: XML Schema or DTD can be used to define grammars.

  • JSON: No namespace support, hence poor extensibility. XML: good namespace support,
    many different extensibility options in Schema.

  • JSON: Limited development tools support. XML: supported by a wide array of
    development and other (e.g., transformation) tools.

  • JSON: Narrow focus, used for RPC only, primarily with JavaScript clients (although
    one can argue that this is one of its strengths). XML: broad focus – can be used for
    RPC, EDI, metadata, you name it.

  • JSON: No support in Web services -related products (application servers, ESBs, etc.),
    at least not yet. XML: supported by all Web services products.

So the bottom line is that JSON and XML are, of course, two very different
technologies; XML is much broader in scope so I’m not even sure if comparing
them side by side is fair.

As an object serialization technology for AJAX (or should it now be called AJAJ
since we’ve replaced XML with JSON?) JSON looks very appealing. Anybody who ever
tried parsing SOAP directly in a browser (while having to support multiple
browsers) can attest that this is not a straightforward task. JSON simplifies
this task dramatically. So I think that ESB vendors should definitely start
thinking about adding JSON to the list of formats they support.

One of the keys to SOA success is that it should be easy to consume a service,
i.e., the entry barrier for service consumers must be low to support “grass root” SOA adoption.
While a top-down SOA effort may succeed, it will certainly take
longer than a bottom-up (“grass-root”) approach in which developers are able to
consume services as they see fit. AJAX/JSON fits this bill perfectly – it is
easily understood by developers and it does not require any Web services -specific
tools or infrastructure.

So overall I’m pretty enthusiastic about JSON.

What is Good SOA?

Good SOA is the kind that helps you solve your business problem. Good SOA is not about ESBs, BPM or registries; you CAN solve your business problem without them. Of course, if an ESB helps you implement the solution to your problem more efficiently, by all means, you should use it. But is an ESB a must-have? Not at all. The same is true for BPM/BPEL products or even registries. These products may or may not help you build your solution faster or make it more flexible. It all depends on your business domain, your specific requirements and your existing IT environment.

Good SOA should help you build solutions faster. There should be a measurable improvement of developers’ productivity. This means that it should be easier to build solutions with SOA than without. If this is not the case, if, for instance, developers complain that they need to jump through all sorts of hoops to invoke a service, then something is wrong with your SOA.

A corollary to that is that good SOA spurs innovation. Just look at successful public Web services, such as Google Maps or Amazon. They triggered the creation of applications that the developers of these services had never envisioned. Note that true innovation is almost always grass-root. Your SOA governance processes must support, not prevent, grass-root initiatives. When you build a good service, you really can’t envision all possible usages of this service. So let others help you.

What about flexibility and extensibility? We’re often told that good architecture should be flexible and extensible, so that future requirements can be implemented without major changes to the architecture. This is one of the most cited SOA benefits. But we should not go overboard trying to implement ultimately flexible solutions. Nobody can predict the future with certainty. No architecture can accommodate ALL future requirements. Besides, oftentimes flexibility increases complexity. The more complexity, the more difficult it is to deal with the architecture, and so developers’ productivity plummets. So focus on what you know and build the best solution for the task at hand. Chances are it will be flexible enough.

Once again, good SOA is not about technologies or products. It’s not about WSDL, SOAP and WS-*. You can build great stuff with just “Plain Old XML” over HTTP (and you can build equally great applications with WS-Security, WS-RM and all the rest of the WS-* soup − but make sure you have proper tools in place to deal with those). Just make sure that your architecture solves a real business problem and that it makes developers more productive.

Who Needs Web Services Repository?

Web services repositories are almost always mentioned in conjunction with SOA governance as its key enabler and there is a widespread notion that a repository is a key component of an SOA.

To me there are two types of SOA governance. There is strategic governance which is part of the overall IT governance that deals primarily with funding and other “big” decisions (“big G”). There is also more “tactical” SOA governance which deals mostly with configuration management (CM) (e.g., release management) of individual Web services (“small g”).

There is also Web service management, including SLA monitoring and enforcement, and, perhaps, security, but to me that’s just an operational aspect of SOA governance and it’s distinctly separate from big G (funding) and small g (configuration and change management).

So, having been in a developer’s shoes for most of my career, I’m primarily interested in “small g”; I’m sure management types can figure out “Big G” much better than I can.

So how is this “small g”, or SOA CM, different from the CM that we’ve been doing routinely for the last, well, at least 20 years? Why, all of a sudden, are we told that we need new tools for that, such as Web services registries and repositories?

I thought that we already have a repository that we know very well, and this is our version control repository – CVS, SVN, ClearCase, Perforce, whatever. So all our code and code-related artifacts, including, of course, WSDL and schema files (and whatever WS-* files, such as WS-Policy, we’ll need in the future), are already checked into our version control repository of choice. We can do a lot of good things with version control repositories, including all kinds of analysis, diffing, etc. We can develop an Ant or Rake script to integrate the build and deploy process with version control. With tools like Maven we can do pretty complicated dependency management. There are also continuous integration tools, build servers, change management tools, reporting tools and all kinds of other software helping us deal with the code we’re developing. In other words, we know how to “govern” our code, and so “governing” a few extra XML files (WSDL, Schema) should not be that difficult or special.

So I just don’t see how a Web services repository is going to make any of these tasks easier.

What we do need is better WSDL and Schema presentation and visualization tools, so that Web service consumers don’t have to always deal with raw XML. But I doubt this task warrants an expensive SOA repository, and there are some tools out there that can do it on the cheap.

SOA repositories also provide some run-time APIs so that service consumers or intermediaries can dynamically discover the most suitable service. Quite frankly, I think this scenario is a little far-fetched, especially given the lack of support for dynamic discovery in existing WS tools and products. Then there is also support for dynamic endpoints a la UDDI (or directly using the UDDI standard), but, again, dynamic endpoints can be supported much more easily using configuration files as opposed to heavyweight run-time APIs (see the sketch below). The extremely low acceptance of UDDI is proof of that.
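
As an illustration, here is a minimal sketch of the configuration-file approach for a JAX-WS client (the property name and file location are made up):

import java.io.FileInputStream;
import java.util.Properties;

import javax.xml.ws.BindingProvider;

public class EndpointConfigurator {

    // Reads the service endpoint URL from a plain properties file and
    // applies it to an already-created JAX-WS port.
    public static void applyEndpoint(Object port, String propertiesFile)
            throws Exception {
        Properties props = new Properties();
        props.load(new FileInputStream(propertiesFile));

        String endpointUrl = props.getProperty("employeeService.endpoint");

        ((BindingProvider) port).getRequestContext().put(
                BindingProvider.ENDPOINT_ADDRESS_PROPERTY, endpointUrl);
    }
}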

So perhaps SOA registries and repositories are useful for “Big G” governance tasks (although I have my doubts – e.g., how relevant are WSDL files for funding decisions?), but the “small g”, that is, CM tasks, can certainly be more efficiently handled by existing CM tools. SOA repository vendors should think about extending existing CM tools instead of trying to create specialized environments just for Web services.

Schema Compliance is the Key to Interoperability

Good Web services interoperability
is an absolute must for a successful SOA implementation, but why has
interoperability been so difficult to achieve?

I think that the inability to comply with a published Web services contract
expressed via its WSDL/Schema could be one of the leading causes of
interoperability problems (I use the term “interoperability” pretty broadly here).

For example:

  • Many “wrapper” service implementations use positional parameter binding, as I
    described in
    my previous post
    . This allows a service provider to accept messages that
    will never validate against the schema. Then certain changes in the
    implementation (a different binding mechanism, switching from “wrapper” to “non-wrapper”)
    could all of a sudden start causing issues for clients that never had any
    problems before.

  • Some Web services clients always generate messages using qualified local
    elements thus ignoring what is in the schema (the default is actually “unqualified”).
    This may work for some binding frameworks but not for others.

  • Different JAX-RPC implementations handle “xsd:anyType” differently. Some generate “Object”
    bindings; some use SOAPElement and some actually allow using DOM’s “Document”
    or “Element” (as an extension to JAX-RPC).
    This would be OK if the serialization/de-serialization
    process did not change depending on the binding,
    but that unfortunately is not always the case.

  • Handling of “nillable” (xsi:nil) and required elements. If an element is
    declared as required but nillable, it must be provided as part of a message.
    Instead, some Web service clients omit an element if it is “null” in Java.

My biggest pet peeve is that most JAX-RPC implementations today don’t even
support schema validation out of the box (hopefully this will change with JAX-WS).
I understand that full validation against a schema could be expensive, but can
we at least handle required parameters/fields as they are defined in the schema?
A type mismatch will certainly cause an error, so why not a missing mandatory field?

Note that I’m not talking about WS-I profiles, .NET/Java interop, and
differences in WS-* implementations. I’ve seen all of the above problems while working
with different Web services implementations in Java (mostly JAX-RPC).

And it is not just about interoperability – reliance on subtle conventions as
opposed to a published contract makes architectures more fragile and more
tightly coupled.

So my advice is simple – write a handler (at least for development/test
environments) and enable schema validation on the server and, under some
circumstances (i.e., you don’t control the service), on the client. This will
save you from many interoperability problems, guaranteed.
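
For illustration, here is a minimal sketch of such a handler for JAX-WS (the schema location and class name are made up; a real handler would read the schema location from configuration and raise a proper SOAP fault instead of a runtime exception):

import java.io.File;
import java.util.Collections;
import java.util.Set;

import javax.xml.XMLConstants;
import javax.xml.namespace.QName;
import javax.xml.transform.dom.DOMSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.ws.handler.MessageContext;
import javax.xml.ws.handler.soap.SOAPHandler;
import javax.xml.ws.handler.soap.SOAPMessageContext;

import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class SchemaValidationHandler implements SOAPHandler<SOAPMessageContext> {

    // compile the schema once; Schema objects are thread-safe
    private static final Schema SCHEMA = loadSchema();

    private static Schema loadSchema() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new File("/etc/schemas/service.xsd"));
        } catch (Exception e) {
            throw new RuntimeException("Unable to compile schema", e);
        }
    }

    public boolean handleMessage(SOAPMessageContext ctx) {
        boolean outbound =
            (Boolean) ctx.get(MessageContext.MESSAGE_OUTBOUND_PROPERTY);
        if (outbound) {
            return true; // this sketch only validates incoming requests
        }
        try {
            // the payload is the first element child of the SOAP body
            Node payload = ctx.getMessage().getSOAPBody().getFirstChild();
            while (payload != null && !(payload instanceof Element)) {
                payload = payload.getNextSibling();
            }
            if (payload != null) {
                SCHEMA.newValidator().validate(new DOMSource(payload));
            }
        } catch (Exception e) {
            throw new RuntimeException("Schema validation failed", e);
        }
        return true;
    }

    public boolean handleFault(SOAPMessageContext ctx) { return true; }

    public void close(MessageContext ctx) { }

    public Set<QName> getHeaders() { return Collections.emptySet(); }
}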

“Wrapper”/”Non-wrapper” Web Service Styles – Things You Need to Know

“Wrapper”/”non-wrapper” Web services styles are mandated by the JAX-RPC and JAX-WS
specifications
and are explained in detail in the documentation and in other sources, for
example in
this article
. However, from my experience, there is still a lot of
confusion among developers about the differences between these two styles and also
about how a particular style affects the design and implementation of a Web service.

First of all, we need to understand that the “wrapper”/”non-wrapper” style is a
characteristic of a Web services implementation framework (such as JAX-WS), as
opposed to a style defined in WSDL. In fact, “wrapper”/”non-wrapper”
can only be used in conjunction with the document/literal style defined in WSDL.
The “wrapper”/”non-wrapper” setting defines how Web service request/reply messages
are interpreted by a Web service provider/consumer. Quite simply, the “wrapper”
style tells the Web service provider that the root element of the message (also
called “wrapper element”) represents the name of the operation and it is not
part of the payload. This also means that children of the root element must map
directly to parameters of the operation’s signature. The “non-wrapper” style (also
sometimes called “bare”), does not make this assumption; in this case the entire
message will be passed to the service operation. The reply message is handled in
a similar way.

The “wrapper” style can be used for the following method:


public String produceFullName( 
        // note we have to provide names in annotations
        // to comply with the schema.
        // otherwise generic "in0", "in1" names are used.
        @WebParam(name = "firstName")String firstName, 
        @WebParam(name = "lastName") String lastName );

The request SOAP message for this method will look like this:


<soap:Envelope 
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" 
    xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:Body>
    <produceFullName xmlns="http://personservice">
      <firstName>John</firstName>
      <lastName>Doe</lastName>
    </produceFullName>
  </soap:Body>
</soap:Envelope>

However, suppose we implemented the same function differently:


public String produceFullName(
        @WebParam(name = "person", 
                targetNamespace="http://person") 
        Person person );

This method can be used with “non-wrapper” style resulting in the following
request message:


<soap:Envelope 
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" 
    xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:Body>
    <ns2:personExt xmlns:ns2="http://person">
      <firstName>John</firstName>
      <lastName>Doe</lastName>
    </ns2:personExt>
  </soap:Body>
</soap:Envelope>

In JAX-WS, the “SOAPBinding” annotation can be used to specify the style, e.g.,
@SOAPBinding(parameterStyle=SOAPBinding.ParameterStyle.BARE).
Note that “wrapper” is the default. In JAX-RPC, the “wrapper” style is
specified in the Web service mapping XML file by adding the “wrapped-element”
element to the service endpoint mapping.
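
For example, here is a minimal sketch of what the annotation could look like on the “non-wrapper” version of the service above (the class name and method body are just placeholders, and the Person class with its getters is assumed to exist):

import javax.jws.WebParam;
import javax.jws.WebService;
import javax.jws.soap.SOAPBinding;

@WebService
// document/literal with the "bare" (non-wrapper) parameter style;
// omitting parameterStyle would give the default "wrapper" (WRAPPED) style
@SOAPBinding(style = SOAPBinding.Style.DOCUMENT,
             use = SOAPBinding.Use.LITERAL,
             parameterStyle = SOAPBinding.ParameterStyle.BARE)
public class PersonService {

    public String produceFullName(
            @WebParam(name = "person",
                      targetNamespace = "http://person")
            Person person) {
        return person.getFirstName() + " " + person.getLastName();
    }
}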

If you generate your Java artifacts from a WSDL file, the “wsdl2java” tool
will automatically assume “wrapper” if the operation name matches the wrapper
element name and if some other requirements are met (some “wsdl2java”
implementations also allow controlling “wrapper”/”non-wrapper” via a command-line option).

So once again, “wrapper”/”non-wrapper” is a data binding style used in the
context of a particular Web service provider/consumer. It is not part of the
contract of a Web service, since it is not mentioned anywhere in the WSDL file.
In other words, if you are a Web service provider, your consumers don’t have to
know or care whether your implementation style is “wrapper” or “non-wrapper”.
They may very well choose the “non-wrapper” style for their client’s
implementation and this client should interoperate with your “wrapper” service
without a hitch (at least in theory).

In fact, the “wrapper” style is supported widely but not universally.
From what I understand, support for the “wrapper”
style first appeared in .NET, and later on Java-based Web services frameworks
began supporting it as well (Axis and others). I think it is also supported
by the Perl SOAP library. However, this style may not be supported by other languages;
for example, I believe that the PHP 5 SOAP engine does not have a notion of the “wrapper”
style.

The “wrapper” style is really an RPC-style binding over the “document/literal” WSDL
style. The basic premise of the “wrapper” style is that our service is a remote
method that takes some parameters (as opposed to the pure message-based paradigm of
a “non-wrapper” Web service, where the method name is not explicitly provided in the message).
So how is “wrapper” with “document/literal” better
than the “rpc/literal” WSDL style? Here are some of its advantages:

  • “Wrapper” service can only have one message part in WSDL, which guarantees that
    the message (consisting of the method element that contains parameters) will be
    represented by one XML document with a single schema.

  • RPC style message definitions in WSDL have to use schema types, whereas with the
    document style we can refer directly to element names. This makes the XML message
    representation more “precise”; it is also easier to validate.

  • RPC/literal, while WS-I compliant, has fallen out of favor and is
    frowned upon, in no small part due to poor Microsoft support.

The “wrapper” style has one interesting drawback, which may not be immediately
obvious. Most Web services engines (at least the ones that I had a chance to
work with) use positional parameter binding with the “wrapper” style. This means
that if a service signature changes (e.g., a new parameter is added), clients
will have to change as well. In other words, with the “wrapper” style, all
parameters have to be present in the XML message (although they can be defined
as null using the “xsd:nil” attribute). This is despite the fact that the element
corresponding to the parameter can be defined as optional in the schema. The “non-wrapper”
style does not have this problem; adding a new optional element does not
affect clients, since binding is always name-based. This creates somewhat
tighter coupling between “wrapper” style consumers and providers. It may also
violate the contract of the Web service defined by its schema. To get around
the last problem, you may want to define all child elements of the wrapper root
element as required (minOccurs=1); optional elements must be made nillable. For
some reason the JAX-WS spec does not explicitly state this.

JAX-WS, however, does impose some additional restrictions on the “wrapper” style,
the most important one being that the wrapper element’s content type must
be “sequence”. This makes sense, since we’re trying to model a method’s
signature, which is always represented as an ordered list of arguments (at least
in Java/C#).

As far as the “non-wrapper” style goes, probably the most obscure thing about it
is how a SOAP engine decides which operation to invoke based on the
message type (assuming a Web service has multiple operations), since the operation
name is not explicitly provided by the message (unless the “SOAPAction” header is
used; however, this header is HTTP-only and also optional in JAX-WS). With “non-wrapper”,
each operation must accept a message that corresponds to a unique XML element
name. Note that it is not enough to define different WSDL messages using the “wsdl:message”
element; each message must be represented by a different element in the schema.
Web service engines use unique element names to determine Java method names (that
correspond to WSDL operations). This means that with “non-wrapper” you can’t
have different operation names processing the same message, e.g., process(Customer)
and processDifferently(Customer) can only be implemented by creating
different customer types.
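
As an illustration, here is a minimal sketch of what this looks like in JAX-WS (the request classes and namespace are hypothetical); each “bare” operation is bound to its own root element:

import javax.jws.WebParam;
import javax.jws.WebService;
import javax.jws.soap.SOAPBinding;

@WebService
@SOAPBinding(parameterStyle = SOAPBinding.ParameterStyle.BARE)
public class CustomerService {

    // dispatched on the <processCustomerRequest> element
    public void process(
            @WebParam(name = "processCustomerRequest",
                      targetNamespace = "http://customer")
            ProcessCustomerRequest request) {
        // ...
    }

    // dispatched on the <processDifferentlyRequest> element
    public void processDifferently(
            @WebParam(name = "processDifferentlyRequest",
                      targetNamespace = "http://customer")
            ProcessDifferentlyRequest request) {
        // ...
    }
}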

So we have an interesting dilemma – “wrapper” does not support overloaded methods (in
fact, I don’t think WS-I allows them either), whereas “non-wrapper” does not support
methods with the same signature (at least under certain conditions).

So how should we decide when to use the “wrapper” style? My approach is always
the following:

  • Design good “tight” schemas for your request and reply messages. Do not rely on Java-to-WSDL/Schema
    generation. Hopefully, you have good requirements as the basis for your design.

  • You can now see if the request schema matches the Web service operation’s
    signature. If that is the case, “wrapper” might be your choice. Consider pros
    and cons of each style described above. Basically, “wrapper” provides more “natural”
    RPC implementation, whereas “non-wrapper” gives more control and somewhat
    looser coupling (from a client’s standpoint). If you decide to go with the “wrapper”,
    do not assume that all of your clients will be using the “wrapper” (unless you
    know for sure that this will be the case). So make sure that your WSDL can be
    easily consumed by both “wrapper” and “non-wrapper” clients. For example, an
    element with an empty sequence maps to a “void” return type if the “wrapper” style is
    used; however, “wsdl2java” (or a similar tool used by other languages) will
    generate an empty class as the return type for “non-wrapper”. This last option
    may not be very intuitive or desirable for your clients.

  • When publishing your service, provide examples of its client implementation. You
    may not be able to address all the different languages and platforms; however,
    covering the most widely used ones (most often, Java and C# in an enterprise environment) is a
    good idea. This will help your clients consume your Web service and also clarify
    the “wrapper” versus “non-wrapper” question.

Reliability of SOAP over HTTP Web Services

HTTP or HTTPS can be viewed as the current de-facto standard transport binding
for Web services. It is frequently said, however, that HTTP is inherently
unreliable and not appropriate in situations where guaranteed delivery and other
Quality of Service (QoS) characteristics are required. To deal with this issue,
many Web services products, commercial and open-source, offer a non-standard (meaning
that it is not part of a WS-I profile) transport binding using JMS or other
messaging APIs. Alternatively,
WS-ReliableMessaging (WS-RM)
specification
defines a reliable messaging protocol for Web
services independently of the transport binding. Currently, WS-RM is supported
by several Web services products.

It is tempting to standardize on one of these alternatives for an enterprise SOA
implementation in order to meet QoS requirements. However, we need to realize that both
of these options add complexity to the Web services implementation and greatly
impair interoperability. The JMS binding requires that all Web service consumers
support a particular messaging provider. Additionally, appropriate
messaging resources (queues, connection factories) must be configured
on the server and, sometimes, on the client’s side (e.g., MQ channels). A Web service
provider has to be able to communicate with a messaging provider, e.g., to
consume a message from a queue (which calls for a Message-Driven Bean in a J2EE environment).
So while these issues are not insurmountable, they certainly make things
more complicated and, among other things, require a full J2EE stack for the Web
services implementation.

WS-RM is currently being standardized by WS-I; there is also project “Tango”,
which specifically addresses interoperability between the Sun and Microsoft
implementations. Unfortunately, WS-RM support is still far from uniform among
Web services vendors. Also, from what I understand, WS-RM by itself is not
sufficient, since it does not define how (or if) a WS-RM conversation
participates in a transaction. So then WS-Transaction or some proprietary vendor
extension is needed, and that in turn calls for the ability to support XA. I
also suspect that the use of persistent storage on the client is the only way
to support all WS-RM requirements (e.g., what if the client goes down in the middle
of a message exchange?), so this makes clients “fatter” and more complex.

The bottom line is that “good old” SOAP over HTTP is the easiest and the most
interoperable way to implement Web services today. So how much can we trust
SOAP/HTTP Web services and should HTTP even be considered in
an enterprise setting where QoS is almost always a requirement?

First, we need to remember that HTTP as well as virtually any other
communication protocol today (including IIOP and protocols used by messaging
providers) is based on TCP. The whole purpose of TCP is to transfer
data reliably, and so it employs acknowledgements and a sliding window mechanism to
guarantee the delivery of data. What does this mean in terms of Web services? Say
we have a one-way service. If we invoked a service and received a successful
reply (no SOAP faults or HTTP errors), we can be confident that our SOAP message
reached its destination. But what if the connection went down in the middle of
the call and we received a timeout error? Our message exchange is now in an
ambiguous state. If the message has not been received, we need to invoke the
service again, but, on the other hand, if it has been received, the duplicate
message may lead to an inconsistent state if the service provider is not able to
handle duplicates correctly.

QoS provided by messaging systems or WS-RM helps us in this situation by ensuring
that the message will only be delivered once; messaging systems can also handle
re-delivery of messages and “store and forward” (there are other QoS policies
besides “exactly once”, but “exactly once” is the most stringent one).
Messaging systems also provide another important benefit by allowing a Web
services call to participate in an atomic transaction. This allows service
consumers to keep multiple resources in synch (including the ones provided by Web
services), thus improving data integrity and reliability.

So where does it leave SOAP/HTTP services? Based on the explanation above, SOAP/HTTP
is not the best fit when:

  • Service providers can’t handle message duplicates (in
    other words, an operation performed by the service is not idempotent).

  • Different data resources owned by service provider and service consumer must
    be in synch at all times.

However, we still might be able to use SOAP/HTTP if we make our service
consumers a little smarter. Specifically, service consumers must be able to meet
the following requirements:

  • Consumers must be able to retry a Web service call in case of a failure due to
    a connectivity error. In the simplest case, a consumer’s application may
    choose to show the error message to an end user and let the user press the “submit”
    button again. However, most consumers will probably choose to automate the retry
    logic (e.g., have X number of attempts) and at least log unsuccessful attempts
    and potentially alert an application administrator (note that some Web services
    clients have built-in retry capabilities). A minimal sketch of such a consumer is
    shown after this list.

  • Consumers must be able to uniquely identify the data that they are sending to
    providers. Suppose we have an “addEmployee” service that takes an employee XML
    message as an input. The employee message must have a unique ID set by the
    consumer of the service, so that when the invocation is retried, the service
    will be able to detect that the employee was already added as part of the
    previous call. This means that popular techniques using database sequences or
    autoincrement fields for generating unique IDs do not work with SOAP/HTTP Web
    services. Service consumers must implement their own way of creating unique IDs
    (perhaps relying on some kind of a natural key or using a UUID).

  • Consumers must be able to handle certain application-specific errors (SOAP
    faults). For example, “addEmployee” service may return “Employee with this ID
    already exists” error after “addEmployee” call was retried in response to a
    connectivity failure. The consumer’s application will have to stop retrying
    after catching this error. This situation may also require end user (or administrator) involvement to
    verify that the employee was indeed added.
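
Here is that sketch – a minimal consumer-side wrapper illustrating the retry and client-generated ID requirements (all class and exception names are hypothetical; a real consumer would use proper logging, alerting and fault mapping):

import java.util.UUID;

public class AddEmployeeClient {

    private static final int MAX_ATTEMPTS = 3;

    private final EmployeeService service; // hypothetical service proxy

    public AddEmployeeClient(EmployeeService service) {
        this.service = service;
    }

    public void addEmployee(Employee employee) {
        // the consumer, not the provider, assigns the unique ID so that
        // a retried call can be recognized by the provider as a duplicate
        employee.setId(UUID.randomUUID().toString());

        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try {
                service.addEmployee(employee);
                return; // success
            } catch (EmployeeAlreadyExistsFault fault) {
                // the previous attempt actually reached the provider;
                // treat as success or flag for manual verification
                return;
            } catch (ConnectivityException e) {
                // log the failed attempt; alert an administrator and give up
                // after the last attempt
                if (attempt == MAX_ATTEMPTS) {
                    throw new RuntimeException(
                        "addEmployee failed after " + MAX_ATTEMPTS + " attempts", e);
                }
            }
        }
    }
}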

As an example, let’s take a look at “Create/Read/Update/Delete” (CRUD)
operations. While real-life services oftentimes do much more than just “CRUD” (e.g.,
calculations), it is a valid simplification for our purposes.

  • Create – not idempotent. The consumer must be able to retry the call and must be
    able to handle a “record already exists” error.

  • Read – idempotent. No special consumer behavior is required.

  • Update – idempotent, unless the update involves calculating new values based on
    the existing values (e.g., incrementing a counter). The consumer must be able to
    retry the call.

  • Delete – not idempotent. The consumer must be able to retry the call and must be
    able to handle a “record does not exist” error.

So what is the takeaway? SOAP/HTTP can be used in many (if not most)
situations; however, the implications of this decision must be fully
understood, and all service consumers must be made fully aware of them.
Most importantly, service consumers must implement proper logic for handling
connectivity failures and application errors. In some cases, users of the service-consuming
application may need to be instructed how to react
to certain application errors.

Using Schema Validation with JAXB and XFire

The schema validation framework that I covered earlier is actually fully supported by
JAXB. If you’re using XFire, you will have to switch to the JAXB binding in order to
utilize it.

As described in the XFire documentation, you can specify a “schema” element as part
of your service definition and point it to your schema. From what I understand,
however, this does not enable validation; it just tells XFire to use
your schema as part of the generated WSDL (in response to “?wsdl” requests).
So I actually had to develop a simple handler to enable it:


public class JaxbValidationEnablingHandler extends AbstractHandler {
   
    public JaxbValidationEnablingHandler() throws SAXException{
        super();
        // run during the PARSE phase, before the message headers are read
        setPhase(Phase.PARSE);
        before(ReadHeadersHandler.class.getName());
    }
   
    public void invoke(MessageContext ctx) {
        // tell the JAXB binding to validate incoming messages against the schema
        ctx.setProperty(JaxbType.ENABLE_VALIDATION, "true");
    }
}

Another problem that I encountered was that validation exceptions don’t
reach the client; instead, the client receives a meaningless “Can’t unmarshal” exception.
A simple change in XFire’s JaxbType class fixes that, though.

So how does this approach compare to the validation handler approach that I presented
in the previous post? On one hand,
relying on JAXB allows us to easily handle both “wrapped” and “bare” Web service styles.
It also requires minimal effort to implement. On the other hand, very often with Web services
there is a need to be able to process XML without reading it into objects, so, chances are,
JAXB won’t be used in all cases. Also, the non-JAXB approach provides for more sophisticated
error handling using custom error handlers that can be used in conjunction with the validation framework.

Another requirement that has to be considered regardless of the approach is
integration with the various XML and Web service registries that are becoming
widespread in large organizations. This means the ability to read a schema from a URL or even
over UDDI or a proprietary API.

Clearly, there is still some work to be done in order to make schema validation the norm
for Web services.

Web Services without Code Generation

JAX-RPC supports two major ways of developing Web services:
“bottom-up”, which generates WSDL from a Java interface, and “top-down”,
which generates Java classes from WSDL. Both of these approaches suffer from
one major drawback.
WSDL/Schema/mapping files or Java classes are fully re-created every time there is a need
to change the Web service interface, and so it becomes a developer’s responsibility to
merge existing WSDL files or Java code with the generated files. The need to manually
change the WSDL/Schema stems from the fact that the generated files are usually very “basic”:
they do not take advantage of advanced schema capabilities, they
don’t use WSDL/Schema modularization and so on. Generated Java classes are even more problematic.
You really don’t want to have simple “value” classes without any behavior and with fully exposed
properties. And adding behavior requires manual intervention.

In reality, there is a way to evolve the two sets of artifacts (WSDL
and Java classes) independently, without any generation, by manually updating the
Java-WSDL
mapping file
. The format of this file, however, is the antithesis of the
“convention over configuration” idea. The format is extremely verbose; each and every
field has to be mapped explicitly, and so manual modification of this file poses a
real challenge for any more or less complex data structure.

The latest JEE Web service specification, JAX-WS, fixes this problem by
heavily leveraging annotations.
For data mapping, annotation support
is provided by the JAXB specification. JAX-WS and JAXB finally free developers from
having to rely on the brittle code generation approach.

In fact, JAXB still supports code generation and so developers can use
it for generating Java or WSDL files. The generated artifacts could provide a good starting point for a
Web service interface. After that, however, it is pretty easy to keep both sets of
artifacts in synch by updating annotations and/or WSDL/Schema files.

How does it work in practice? Let’s say we have this very simple schema:


<element name="person">
  <complexContent>
    <sequence>
      <element name="firstName" type="xsd:string" />
      <element name="lastName" type="xsd:string"/>
    </sequence>
  </complexContent>
</element>

The following class will match the schema:


// By default JAXB uses getters/setters to drive marshaling/unmarshaling
// Using fields is easier, since they can be defined more compactly
// in one place and also moved around to manipulate order.
// Beware though - getters/setters are ignored in this case
@XmlAccessorType(XmlAccessType.FIELD)
// We need to provide the namespace, otherwise it defaults to
// the package name
@XmlRootElement(namespace="http://myarch.com/personservice")
public class Person {
    // note that JAXB marshals instance variables in the order they
    // are declared
    protected String firstName;
    protected String lastName;
   ...

Note that JAXB provides several
additional attributes and annotations
which I’m not using here – I’m trying to rely as much as
I can on “convention over configuration”. Element names, for example, match my variable names,
so there is no need to provide annotations for individual fields (however, in this case,
you can’t use qualified local elements
in your schema, since JAXB won’t prefix nested elements).

Many JAXB annotations, such as “XmlType”, are only used for generating a schema from Java classes.
The same is also true for most attributes of XmlElement.
Oftentimes developers specify the “required” and “nillable” attributes of the XmlElement annotation
thinking that JAXB will automatically enforce these constraints, e.g.:

@XmlElement( required = true, nillable=false)
private String lastName;

However, these parameters are not used at
run-time. JAXB 2.0 actually relies on the
schema validation framework
and so
these parameters can be omitted assuming that the schema contains the constraints.

In other words, we only have to use those annotations that help
us to correctly marshal/unmarshal our objects.
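
As a minimal illustration of how this works (the schema path is just a placeholder, and in real code the compiled Schema and JAXBContext would be cached), validation is turned on by attaching a compiled schema to the unmarshaller:

import java.io.File;

import javax.xml.XMLConstants;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Unmarshaller;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

public class PersonReader {

    public Person read(File xmlFile) throws Exception {
        // compile the schema using the JAXP 1.3 validation framework
        SchemaFactory schemaFactory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(new File("/ws/etc/person.xsd"));

        JAXBContext context = JAXBContext.newInstance(Person.class);
        Unmarshaller unmarshaller = context.createUnmarshaller();
        // this is what makes JAXB 2.0 enforce the constraints from the schema
        unmarshaller.setSchema(schema);

        return (Person) unmarshaller.unmarshal(xmlFile);
    }
}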

The only other step that we need to take is to create an ObjectFactory class in the same
package as the mapped class. In this class we need to define a factory method for the root
element of our content tree (you don’t need to worry about nested data types even
if they map to other classes). From what I understand, JAXB does not actually invoke any
factory methods; however, it does check for their presence in ObjectFactory.
The factory method looks simple enough:


public Person createPerson() {
    return new Person();
}

Now we can make changes to our Java classes and the corresponding WSDL/Schema files without
ever having to resort to code generation. In my mind, this is the preferred way.
We can see from object-relational mapping that pure “top-down” and
“bottom-up” approaches don’t really exist. How often do we generate DDL from Java classes, or vice versa?
Different data representations supported by different technologies have to be able to evolve
independently, without having to play by each other’s rules. JAXB helps us achieve this pretty easily.

Using XML Validation Framework with Web Services

A good Web service has to have a well-defined and enforceable contract. Typically, the
contract is expressed using the XML Schema language. Unfortunately, up until recently,
enforcing the contract was very problematic because of the huge performance overhead
associated with running schema validation. XML parsers used to load and parse the schema
for every validation request, which was clearly expensive.

This has changed with the release of JAXP 1.3, which comes with a schema
validation framework
that allows an application to first load and compile a schema and then reuse
the compiled version for all subsequent validation requests. This may finally
make it feasible to use schema validation for Web services in a production
environment.

I decided to prototype the use of the validation framework for my Web services
implemented in XFire
(you may also want to explore a
schema validation component
implemented in ServiceMix ESB).

All it took was a fairly simple handler:


public class XFireSchemaValidationHandler extends AbstractHandler {
    
    private Schema schema = null;
    
    public XFireSchemaValidationHandler() throws SAXException{
        super();
        setPhase(Phase.PARSE);
        before(ReadHeadersHandler.class.getName());

        // Load the schema - note that handler is only
        // instantiated once, so we can keep the schema in 
        // an instance variable
        SchemaFactory factory = SchemaFactory
            .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // I'm hardcoding the path for now, but we should be able
        // to extract it from the WSDL 
        StreamSource ss = new StreamSource(
                new File("/ws/xfire/etc/person.xsd"));
        schema = factory.newSchema(ss);
    }

    public void invoke(MessageContext ctx) throws Exception {
        InMessage message = ctx.getInMessage();
        XMLStreamReader streamReader = message.getXMLStreamReader();
        // create JDom from the stream - alternatively we can rely on
        // DOM and XFire DOM handler
        StaxBuilder builder = new StaxBuilder();
        Document doc = builder.build(streamReader);
        // get to the body first
        Element body = 
            (Element)doc.getRootElement().getChildren().get(0);
        // we assume that we use "unwrapped" mode and 
        // our payload is the direct child of the body element  
        Element payload = (Element) body.getChildren().get(0);
        
        // dump the message for testing purposes
        XMLOutputter outputter = 
            new XMLOutputter(Format.getPrettyFormat());
        outputter.output(payload, System.out);
        
        // create a validation handler from the pre-compiled schema
        ValidatorHandler vh = schema.newValidatorHandler();
        // the handler only works with SAX events, so we create 
        // SAX from JDom 
        SAXOutputter so = new SAXOutputter(vh);
        // Validator will run as a SAX handler and throw an exception
        // if the validation fails.
        so.output(payload);
        
        System.out.println("\nValidation passed");
        // rewind the stream reader for subsequent processing 
        message.setXMLStreamReader( new JDOMStreamReader( doc) );
    }
}

Unfortunately, the validation framework does not currently support StAX, so
an XML document has to be parsed using SAX or DOM (or JDom with a subsequent SAX
conversion, which is what I chose to do since it was quicker to develop
that way). I’m sure this conversion (and the validation process itself) adds
some performance overhead, but I’m hoping that it
won’t be too bad, especially for simple schemas (and from my experience 90% of
all Web services have very simple messages). JAX-RPC-compliant Web services engines
should see even less of a performance penalty since everything is loaded into
DOM from the get-go. Perhaps I’ll try to run some performance tests and see what
happens.

Using XFire as a Client for WebSphere Application Server

XFire has a great, easy-to-use client API,
which makes it an ideal candidate for developing unit tests for Web services.
The best part is that it can interoperate with Web services implemented in other
containers and application servers. I recently started using it with IBM WebSphere Application Server (WAS) 6.1.

So what are the benefits of using XFire as opposed to a “native” Web services client
created using tools provided with your application server?

  • Many application servers/Web service containers use code generation to
    generate a client’s “proxy”. You have to remember to re-gen the client
    every time you change the interface of your service.
  • Code generation has a number of issues. For example, a client generator
    will generate Java classes for parameters of a Web service. Oftentimes,
    it is desirable to use an existing class instead, so then the generated code must
    be changed.
  • Using a “native” client does not test interoperability. When client and server
    are implemented using the same product, interoperability is almost guaranteed.
    Using a different client library potentially allows you to discover
    interoperability problems early on.
  • Finally, since XFire is open source, it can be packaged and distributed to
    customers and partners without requiring a license.

Invoking a Web service running in WebSphere does not require anything special:


public void testHelloService() throws Exception {
    
    Service serviceModel = new ObjectServiceFactory()
        .create(HelloService_SEI.class);
    XFireProxyFactory serviceFactory = new XFireProxyFactory();
    HelloService_SEI service = (HelloService_SEI) serviceFactory
        .create( serviceModel, 
        "http://localhost:9090/helloWS/services/HelloService");
    
    Person person = new Person();
    person.setFirstName("John");
    person.setLastName("Doe");
    String s = service.helloPerson(person); 
    assertEquals("Hello John Doe", s);
}

It must be noted though that I have not tried using any advanced features. It
would be interesting to see if, for example, XFire WS-Addressing implementation can
interoperate with WAS.

By the way, if you’re only going to use the XFire client, you only need a few jars from
its distribution. Here is the list that I use (it could probably be trimmed down even more):


activation-1.1.jar
commons-codec-1.3.jar
commons-httpclient-3.0.jar
jdom-1.0.jar
jsr173_api-1.0.jar
mail-1.4.jar
saaj-api-1.3.jar
saaj-impl-1.3.jar
stax-api-1.0.1.jar
stax-utils-20040917.jar
wsdl4j-1.5.2.jar
wss4j-1.5.0.jar
wstx-asl-2.9.3.jar
xbean-2.1.0.jar
xfire-all-1.2.jar

Also, from my experience, the SAAJ API only works with the Sun JDK, as opposed to the JDKs provided by IBM as part of the Rational development tools.

Choosing Between Axis2 and XFire

I used Axis 1.X in the past (not without some problems, but I was able to get the work done), and so when I needed an open source Web services stack, downloading Axis2 was my first impulse. Then, after playing with it for a short while, I had an afterthought and decided to look at XFire. I never looked back. Here are some things that tipped my decision away from Axis2 (there is actually a WS stack comparison on the XFire site but I wanted to see for myself):

  • No JAX-WS or annotations support. I’m not even sure if Axis2 fully supports JAX-RPC for that matter, at least it was not obvious from the docs. I’m not a big fan of JAX-RPC (and who is?) but sometimes being able to use standard APIs is a plus.
  • Deployment model. Why invent a new “aar” archive type? So does it mean that now I have to run Ant (or the special AAR Eclipse plugin) to re-package my service every time I change a class? (It might be possible to put the exploded archive on the classpath but I did not try that.) How will this new component type integrate with app servers? How will the classloading work under different J2EE containers?
  • Data binding. It looks like there is some databinding support for the new AXIOM XML object model, but all examples use Axiom directly, so it was not very clear how to use it (at least not without resorting to the WSDL2Java generator). Also, I don’t believe there is an ability to plug in different databinding frameworks such as JAXB or Castor.

I’m sure Axis2 has a lot of good capabilities; its feature list seems impressive enough. But for now I will be sticking with XFire simply because it’s somewhat easier to set up, its API seems more intuitive, it integrates well with different containers/app servers and also because of its JAX-WS support.

SOA Is Not the Only Form of Reuse!

Reuse is the key value proposition of an SOA. Being able to share business logic with ease by virtue of publishing and consuming a Web service can greatly benefit an organization. But one has to realize that it is incorrect to expect SOA to meet all software reuse needs. The performance overhead of a distributed call over the network will always be a detriment to consuming a lot of fine-grained distributed functions, even with the exponential rise of network bandwidth and computing capacity. Many real-life business transactions are complex and thus require complex processing logic. Spreading this business logic over multiple service providers will undoubtedly affect performance. Consumers nowadays require an instantaneous response to their requests, so businesses can’t hide the inefficiency of their architectures behind batch processes running once a day. All this means that Web services can only expose coarse-grained functions so that the number of distributed calls is kept under control. So if an application is only interested in a very discrete piece of reusable business logic, that piece may not be available as part of the SOA.

Reuse of UI components is another example where SOA does not have an answer. This problem can be addressed to some extent by portals. In most instances, however, it is easier to share a library of UI components among multiple applications.

In some sense, SOA was born out of the inability to achieve reuse at the library/binary level (including the binary wire protocol level, such as IIOP), as it became clear that most large organizations will always run on heterogeneous infrastructures with bits and pieces supplied by multiple vendors. But I think that the hype and enthusiasm created by SOA (look, our Microsoft and IBM applications can finally share some services!) somehow made everybody forget about good old library and component reuse. Granted, library reuse can only work for a given technology/platform; however, in most organizations the number of these platforms is limited to a very few (e.g., JEE, .NET, mainframe), so the benefits could still be substantial.

I have a feeling that SOA somehow slowed down the evolution of general-purpose component management and dependency management tools, as many vendors decided to focus almost exclusively on Web services. The idea of a registry/repository is very applicable to any unit of reuse, not just to a Web service. Most SOA implementation efforts put a Web services registry/repository in the center of SOA; however, very few organizations (as far as I know) implement a general-purpose component repository as a centerpiece of their development process. The repository products have been around for some time (Flashline, logidex) but their use does not seem to be very widespread in development organizations. Maven is becoming a de-facto standard in the Java world but its use is still pretty rare on large projects, where “roll your own” is still the norm. I’m also not aware of a similar tool for .NET and other languages, or of an open source or inexpensive tool that could handle component management for multiple platforms.

There is also a lack of standardization. SOA suffers from an overabundance of standards and specifications; the situation with component-level reuse is quite different. RAS is just too complex; MOF is geared toward modeling and MDA. SCA can be applied generically to binary components as well as to Web services, but component management is simply not in its scope.

I just hope that vendors and their customers will sooner or later realize that SOA by itself is not the goal – better business and IT efficiency (and hence better ROI) provided by software reuse is. This goal can’t be accomplished with a “one size fits all” approach to software reuse.

First Impressions from Service Component Architecture

I’m currently playing with Service Component Architecture (SCA), which is supported in IBM products such as WebSphere Process Server and WebSphere Integration Developer (WID).

The IBM implementation “wraps” all SCA standard
classes and namespaces using IBM proprietary classes and namespaces.
As a result, XML tags and class names don’t match
what’s in the spec. I suspect that the
reason for this is that the products were developed when the specification was
still in flux, so hopefully it will change in future versions.

Looking at SCA, one immediately gets a deja vu feeling, since SCA
strongly resembles the approach used in Spring and other IoC containers.
Similar to Spring, an SCA component definition file can contain references to other
components and configuration properties. However, SCA lacks some of the sophistication
of Spring; for example, there is no “auto-wiring” of references, and so each reference has
to be wired explicitly in the component definition file. On the other hand, SCA is supposed to
be a cross-language framework (with even some talk of PHP being supported in the future).

On the bright side, SCA is simple and easy to understand.
Editing component definition files by hand is a snap, and the file formats are pretty
intuitive (which is quite an achievement for an SOA-related spec).
This is certainly an advantage over the similar-in-purpose JBI, which comes across as a
much heavier framework, with
lifecycle methods and prescribed container interactions a la EJB 2.0.
In SCA you only deal
with components and modules, and so the small number of key concepts certainly
makes it easy to grasp.

Even though SCA is related to service integration and SOA, it does not force WSDL
down developers’ throats; regular Java interfaces are also supported.
Unfortunately, Java interfaces look like second-class citizens in WID,
since the IDE generates WSDL by default (although I was able
to create Java interfaces manually). Also, Java interface-based components
can’t be referenced directly from BPEL-based components (I guess
because references have to match partner links in BPEL, and those partner
links are WSDL-based). My personal preference would be to use Java interfaces
whenever possible, since they are easier to deal with and easy to enforce
using regular language constructs.

SCA components can be implemented using POJOs.
There are no magic methods, and a component implementation class does not need to inherit
from anything or implement any interface (except
for its own Java interface, if it has one), as the sketch below illustrates.
As per the spec, annotations are also supported, although
it does not look like this support is in the
IBM products at this point (they are still on JDK 1.4).
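
Here is that sketch – a minimal, hypothetical example (no SCA-specific imports are needed in the implementation; the wiring lives in the component definition file):

// PersonService.java - the component's own Java interface
public interface PersonService {
    String produceFullName(Person person);
}

// PersonServiceImpl.java - a plain implementation class: no base class,
// no SCA APIs; it is wired to other components in the component definition file
public class PersonServiceImpl implements PersonService {

    public String produceFullName(Person person) {
        return person.getFirstName() + " " + person.getLastName();
    }
}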

I was hoping that the reliance on POJOs would provide for a nice test-driven development
experience. However, at this point, I’m still trying to figure out
a way to bind and invoke
a component from a JUnit test outside of a container.
I can test a component’s implementation
directly (I construct SDO objects in my test classes and
pass them to the implementation classes)
but I would like to be able to use proper SCA APIs and look up the
component dynamically so I can test against its interface.
The testing framework that comes with WID (which generates JUnit test classes) was for some reason
giving me problems and, in any case, I prefer to write my test classes by hand.

So to sum up, other than implementation-related glitches,
SCA looks like a nice generic component framework.

Is ESB the Starting Point for SOA?

This blog entry discusses the
Forrester Wave report on the ESB market. The report endorses ESB products and suggests that
an ESB is “the most straightforward way to get started with service-oriented integration today”.

And I’ve always thought that the most straightforward way is to start
implementing services instead of infrastructure products (however useful these products might be).
As I blogged before, an ESB is not
a magic wand that will make SOA happen with the flick of a switch.
SOA is about implementing services that provide some value to their consumers. Whether these services
are mediated by an ESB is completely secondary.

In my opinion, an ESB should come into play later, after a certain critical mass of services is designed (and
potentially implemented) and
the need for the mediations and transformations provided by an ESB becomes more obvious.
In certain environments with
a relatively homogeneous application set (e.g., an organization which is a 100% J2EE shop),
an ESB may not be required at all, assuming that all applications can speak Web services and
a canonical data model is well defined.

Even in heterogeneous environments, there are options. For example, ESBs can be used to provide
mainframe integration capabilities (conversion to copybook format, etc.), but a
different approach could be to use CICS 3.1,
which supports SOAP/XML natively.

The bottom line is that the need for ESBs must be driven by specific business and technical requirements,
not by some kind of perceived goodness of ESB products in general. That’s why I recommend
analysing requirements, designing services (and, perhaps, implementing a certain set of services) before
implementing the ESB infrastructure.