IBM WebSphere 7 (currently in beta) comes with a property-file based configuration tool that provides a "human-consumable" interface to the currently XML-based configuration repository of the application server. This is further proof that XML is simply not the right mechanism for managing the configuration of complex software products.
From the release notes:
Properties (name/value pairs) files are more consumable for human administrators than a mix of XML and other formats spread across multiple configuration directories.
Kudos to IBM for recognizing that.
It is still not clear though how hierarchical relationships between configuration objects will be supported.
Back in WAS 6 world, I've been using a simple jython script that converts python named parameters into wsadmin format. This is an example of a resource described in this format:
WASConfig.DataSource(parent="testJDBCProvider", name="testDS", jndiName="jdbc/testDS",
    description="Test DataSource", propertySet=dict(
        resourceProperties=[
            dict(name="dbName", value="testDB", type="java.lang.String"),
            dict(name="connectionAttribute", value="", type="java.lang.String")
        ]))
I think that a slightly more streamlined python-based format will be superior to properties.
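The conversion script itself is not shown here, so the following is only a minimal sketch of the idea, assuming the nested `[[name, value], ...]` attribute lists that wsadmin's `AdminConfig.create` accepts. The helper names `to_wsadmin` and `attrs` are hypothetical and are not part of the `WASConfig` module above.

```python
def to_wsadmin(value):
    """Recursively convert Python values into wsadmin-style nested lists."""
    if isinstance(value, dict):
        # A dict becomes a list of [name, value] pairs; keys are sorted so the
        # output is deterministic regardless of dict ordering.
        return [[key, to_wsadmin(value[key])] for key in sorted(value)]
    if isinstance(value, (list, tuple)):
        # A list of nested objects becomes a list of converted objects.
        return [to_wsadmin(item) for item in value]
    return value


def attrs(**kwargs):
    """Hypothetical helper: turn named parameters into an attribute list."""
    return to_wsadmin(kwargs)


ds_attrs = attrs(
    name="testDS",
    jndiName="jdbc/testDS",
    propertySet=dict(
        resourceProperties=[
            dict(name="dbName", value="testDB", type="java.lang.String"),
        ]
    ),
)
print(ds_attrs)
```

The resulting list could then be handed to wsadmin together with the object type and parent; that call is left out because it only works inside a wsadmin session.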
It's pretty easy to create an Ant file for a simple project. A simple Ant script typically contains the ubiquitous "init", "compile", "test", "war" (or "jar"), "build" targets all wired together. It's easy to change and easy to understand, and the script's flow has a declarative, rule-based feel to it. The problem is, projects and their build files rarely stay simple for long. Soon we need to add a "validate.xml" target, JUnit reports, deployment to the application server and so on. Then we begin supporting multiple environments; we discover that our local desktop configuration is different from how the integration environment is configured, so we add numerous property files, "ftp" tasks and multiple "copy" targets for various application files. Before we know it, the build script becomes a convoluted mess of XML tags and there is nothing declarative about it anymore; it has morphed into a full-fledged, very procedural program. Perhaps we even had to resort to using the ant-contrib "if" and "for" tasks to implement procedural logic in Ant. And nothing is uglier than an "if" with complex conditions expressed in XML.
A better approach would be to implement the "procedural" portion of the build script in Java or any of the scripting languages that Ant supports. The problem is, configuring and invoking Ant tasks from Java or a scripting language leads to verbose code. For example:
execTask = project.createTask("exec")
execTask.setOutputproperty(outputPropertyName)
execTask.setErrorProperty(errorPropertyName)
execTask.setResultProperty(resultPropertyName)
execTask.setExecutable(execName)
arg=execTask.createArg()
arg.setLine(paramString)
Doing the same thing in XML is shorter and cleaner:
<exec executable="${execName}" outputproperty="p1"
      errorProperty="p2" resultProperty="p3">
    <arg line="${params}" />
</exec>
So what can we do to make task invocation syntax more concise and easier to understand? In fact, the syntax could be drastically simplified with the help of simple Ant "adapters" that can be developed for popular scripting languages since Groovy, Ruby and python all have fairly intuitive syntax for supporting lists, dictionaries and other data structures. I developed such an adapter for jython. It uses python named arguments and dictionary syntax, so executing a "copy" task looks like this:
pant=PAnt(project)
pant.exTask("copy", todir="testdir", fileset={"dir":".","includes":"*.xml"} )
"PAnt" is the name of the "ant adapter" class for Jython. The class configures and executes Ant tasks based on the provided arguments using Ant task configuration API.
"pant" module also comes with a simple helper function called "nested" so that named arguments can be consistently used for nested elements. With syntax highlighting supported by most editors/IDEs (e.g., you can try PyDev for jython/python development), it allows for better visual distinctions between attribute names and values:
pant.exTask("copy", todir="testdir", fileset=nested(dir=".", includes="*.xml") )
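To make the mapping between named arguments and Ant's XML explicit, here is a small sketch that builds the equivalent XML element from the same kind of call. It is not how PAnt works internally (PAnt drives Ant's task configuration API directly); `task_element` is a hypothetical function, and `nested` is re-implemented here as a stand-in for the helper described above.

```python
import xml.etree.ElementTree as ET


def nested(**attrs):
    """Return nested-element attributes as a plain dict (stand-in helper)."""
    return attrs


def task_element(task_name, **kwargs):
    """Build the XML element an Ant build file would use for this task."""
    element = ET.Element(task_name)
    for name, value in kwargs.items():
        if isinstance(value, dict):
            # A dict describes a nested element such as <fileset .../>.
            element.append(task_element(name, **value))
        else:
            # Anything else is treated as a plain attribute value.
            element.set(name, str(value))
    return element


copy = task_element("copy", todir="testdir",
                    fileset=nested(dir=".", includes="*.xml"))
print(ET.tostring(copy, encoding="unicode"))
```

Scalars become attributes and dictionaries become nested elements, which is the same convention the named-argument syntax relies on.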
To use "PAnt" from Ant, you can develop custom tasks using "scriptdef" or simply embed python code directly into a target:
<target name="test.pant" >
<script language="jython">
from pant import *
pant=PAnt(project)
pant.execTask(name="echo", message="foo")
</script>
</target>
The "pant" module itself is just a few lines of code, as you can see from its source. Don't forget to set up your "python.path" properly if you want to make it globally available to all your Ant scripts.
There is also an open-source project Gant that provides similar (in fact, much more extensive) capabilities for Groovy, but I have not had a chance to play with it; I specifically wanted to use python/jython because jython can also be used for WebSphere Application Server administration.
In my mind, a scripting-language-based approach to writing build files makes for much more flexible scripts that are easier to understand and maintain. When you start implementing your Ant logic in python, you'll see that Ant targets become much more coarse-grained, since there is no need to create internal targets (the ones that are never invoked by the users) to simulate subroutines, or conditional targets to simulate "if" statements. It is also nice to be able to use all the capabilities of a full-blown programming language as opposed to the limited subset of procedural tasks that Ant provides (such as "condition"). Being able to use properly scoped variables instead of inherently global Ant properties is another great benefit.
At the same time, it is still possible to use Ant targets for expressing a flow wiring together major functions of the build script. I would prefer something less XML-like for this purpose too, but that's a task for another day.
Please refer to our official PAnt project page for more information and to download PAnt.
XML is everywhere these days. It is used for passing data around, for specifying metadata and even as a programming language for tools such as Ant and Jetty.
When XML is generated by various development and run-time tools (e.g., for serializing Java objects into SOAP), its complexity and readability don't matter much since humans have to deal with raw XML only occasionally (e.g., to troubleshoot a problem).
However, more often than not, XML is written directly by developers, mostly with the help of a validating XML editor/IDE (that is, if developers are lucky and a Schema/DTD is available). WSDL (in the case of the WSDL-to-Java approach), XML Schema and Ant build files are just a few examples of this.
Using XML as a mark-up language for otherwise mostly text documents (e.g., XHTML) is not a totally bad idea. However, XML is ill-suited for specifying complex metadata with dynamic dependencies, for wiring command-based logic (e.g., Ant) or for defining domain-specific languages. That is, ill-suited for humans.
For starters, XML is unlike any other programming language (or a natural language). Consider a basic XML construct: <name>value</name>. In any other language it would've been written as "name=value" (or "name:=value", or something similar). An assignment is a construct familiar to most of us from math, even though we may not understand the intricacies of r-value versus l-value. It is intuitive. XML relegates this basic construct to attributes that can only be used as part of an element. Using attributes, a simple assignment can be expressed as <name value="my value" />, which is a bit easier to understand than a purely element-based construct. However, "value" attribute still seems kind of redundant.
Another annoying feature of XML is closing tags. Closing tags are what make XML verbose. (Interestingly, SGML, which XML is derived from, did not require closing tags, so one could write something like <TAG/this/.) In most programming languages we express grouping and nesting using brackets, parentheses or braces. This is true for function arguments, arrays, lists, maps/dictionaries, tuples, you name it, in any modern programming language. XML creators for some reason decided that repeating the name of a variable (tag) is the way to go. This is a great choice for XML parsers but a poor alternative when XML is written/read by humans.
Closing tags do help when the nesting level runs deep. But they hurt when there is a need to express a simple construct with just a few (or one!) data items. The problem is, our brain can only process a limited number of items at a time, so intermixing data that needs to be processed with tags that serve as delimiters for this data makes comprehension more difficult. For simple lists, a comma-delimited format could be a better choice in many situations.
In general, repeating the same set of tags over and over again to define repetitive groups makes XML difficult to read, especially when each element contains just text:
<welcome-file-list>
    <welcome-file>index.html</welcome-file>
    <welcome-file>index.htm</welcome-file>
    <welcome-file>index.jsp</welcome-file>
</welcome-file-list>
Compare this with a simple property/comma delimited format:
welcome-file-list= index.html, index.htm, index.jsp
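Reading such a format takes very little code. The sketch below is only an illustration of that point; `parse_properties` is a hypothetical helper, and it assumes one `name=value` pair per line with commas separating list items.

```python
def parse_properties(text):
    """Parse 'name=value' lines; comma-separated values become lists."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition("=")
        items = [item.strip() for item in value.split(",") if item.strip()]
        # Keep a single value as a string; several values as a list.
        props[name.strip()] = items[0] if len(items) == 1 else items
    return props


config = parse_properties("welcome-file-list= index.html, index.htm, index.jsp")
print(config["welcome-file-list"])
```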
Finally, what's up with angle brackets? I suppose brackets could be justified when an element has multiple attributes. In many cases, however, elements don't have attributes, and so an angle bracket is simply a way to distinguish a tag name from data. This is, again, counter-intuitive and different from many modern programming languages. Normally, variable names are not bracketed or quoted; values are. Also, if there was a need to use a special symbol for denoting variables, wouldn't using "$" or "${}" be a more intuitive option for most of us?
Of course, XML has many advantages, the key one being that it is very easy to develop grammars for XML documents (using DTD or Schema). Another one is the fact that grammars are extensible via namespaces. Finally, any XML grammar can be parsed by any XML parser; to a parser an XML document is just a hierarchy of elements.
This simplicity, however, comes at a great price. The expressiveness of XML is extremely limited. It only has a limited number of constructs and no operators. While it's adequate for its role as a markup language for text files, it puts a lot of constraints on any more-or-less complex metadata format, let alone something requiring procedural logic, such as Ant or XSLT. As a result, the intuitiveness of XML-based grammars suffers.
I'm not saying that we must stop using XML altogether. It has its place. But we should not be applying it blindly just because it's the only widely available tool for creating domain-specific languages. For starters, BNF/EBNF should be part of any developer's arsenal (along with ANTLR). And good old name/value pair and comma-delimited formats should be seriously considered for simple situations that do not require support for hierarchical structures.
This article on techtarget is a great illustration of my point from the previous post about the importance of the proper design patterns and techniques required to be able to benefit from XML appliance capabilities.
When implementing Web services Java developers tend to think in terms of Java classes that XML documents map to. Using XSLT (or even Schema) for implementing part of their processing logic is not on their list because the common thinking is that it is too expensive to do it in Java.
With XML appliances the situation is exactly the opposite. XSLT all of a sudden becomes one of the best performing parts of an application (although I would imagine that using Java hardware acceleration, such as the one provided by Azul, might once again change that). This could be a serious "paradigm shift" for many developers and architects.
Another obstacle to more effective usage of appliances could simply be the lack of XSLT skills. XSLT is essentially a functional language and so it comes with a learning curve attached, especially for complex transformations. It is important to have a good knowledge of XSLT to understand what kind of work can be "outsourced" to an appliance. Not that many developers have this knowledge today, but perhaps it will change with more widespread use of XML appliances.
JSON is a simple object serialization approach based on the JavaScript object initializer syntax. The code for the initializer (object literal) is put into a string and then interpreted using the JavaScript eval() function or a JSON parser (which is very lightweight):
var serializedObj = '{"firstName":"john", "lastName":"doe"}';
...
// This is just an example; a JSON parser should be used instead
// to avoid the security vulnerabilities of "eval"
var obj = eval("(" + serializedObj + ")");
document.getElementById("firstName").innerHTML = obj.firstName;
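The same object can be handled on the non-JavaScript side with an ordinary JSON parser. As a small sketch, here is the round trip using Python's standard `json` module; strict JSON requires double-quoted property names, so the string is written that way here.

```python
import json

serialized = '{"firstName": "john", "lastName": "doe"}'

# json.loads only accepts JSON data; it never executes code, unlike eval().
person = json.loads(serialized)
print(person["firstName"])

# Serializing back produces a string a browser-side JSON parser can read.
round_trip = json.dumps(person)
```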
JSON is used extensively in various AJAX frameworks and toolkits to provide easy object serialization for remote calls. It is supported by both GWT and Dojo. There is a growing realization that perhaps JSON should be considered as an option when implementing SOA; for example, Dion Hinchcliffe recently published a blog entry suggesting that JSON (and other Web 2.0 technologies) must be seriously considered by SOA architects.
So what are the benefits of using JSON relative to XML (for this comparison I want to focus just on XML and stay away from SOAP)? Here is my brief take on that; there are also numerous other articles and posts on the subject.
| JSON | XML |
|---|---|
| JSON strengths | |
| Fully automated way of de-serializing/serializing JavaScript objects, minimum to no coding | Developers have to write JavaScript code to serialize/de-serialize to/from XML |
| Very good support by all browsers | While XML parsers are built into all modern browsers, cross-browser XML parsing can be tricky, e.g., see this article |
| Concise format thanks to name/value pair -based approach | Verbose format because of tags and namespaces |
| Fast object de-serialization in JavaScript (based on anecdotal evidence) | Slower de-serialization in JavaScript (based on anecdotal evidence) |
| Supported by many AJAX toolkits and JavaScript libraries | Not very well supported by AJAX toolkits |
| Simple API (available for JavaScript and many other languages) | More complex APIs |
| JSON weaknesses | |
| No support for formal grammar definition, hence interface contracts are hard to communicate and enforce | XML Schema or DTD can be used to define grammars |
| No namespace support, hence poor extensibility | Good namespace support, many different extensibility options in Schema |
| Limited development tools support | Supported by wide array of development and other (e.g., transformation) tools |
| Narrow focus, used for RPC only, primarily with JavaScript clients (although one can argue that it's one of the strengths) | Broad focus - can be used for RPC, EDI, metadata, you name it |
| No support in Web services -related products (application servers, ESBs, etc), at least not yet | Supported by all Web services products |
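
The "concise format" and "minimum to no coding" rows are easy to see with a two-field object. The sketch below uses Python's standard library only as a neutral illustration; the `person` element name and the field-to-element mapping are my own assumptions, and other XML mappings are possible.

```python
import json
import xml.etree.ElementTree as ET

person = {"firstName": "john", "lastName": "doe"}

# JSON: one call serializes the whole object.
as_json = json.dumps(person)

# XML: each field is mapped to a child element by hand.
root = ET.Element("person")
for field, value in person.items():
    ET.SubElement(root, field).text = value
as_xml = ET.tostring(root, encoding="unicode")

print(len(as_json), as_json)
print(len(as_xml), as_xml)
```

Even for this tiny object the XML form is noticeably longer, because every field name appears twice.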
So the bottom line is that JSON and XML are, of course, two very different technologies; XML is much broader in scope so I'm not even sure if comparing them side by side is fair.
As an object serialization technology for AJAX (or should it now be called AJAJ since we've replaced XML with JSON?) JSON looks very appealing. Anybody who ever tried parsing SOAP directly in a browser (while having to support multiple browsers) can attest that this is not a straightforward task. JSON simplifies this task dramatically. So I think that ESB vendors should definitely start thinking about adding JSON to the list of formats they support.
One of the keys to SOA success is that it should be easy to consume a service, i.e., the entry barrier for service consumers must be low to support "grass-roots" SOA adoption. While a top-down SOA effort may succeed, it will certainly take longer than a bottom-up ("grass-roots") approach in which developers are able to consume services as they see fit. AJAX/JSON fits this bill perfectly: it is easily understood by developers and it does not require any Web services-specific tools or infrastructure.
So overall I'm pretty enthusiastic about JSON.
Good Web services interoperability is an absolute must for a successful SOA implementation, but why has interoperability been so difficult to achieve?
I think that the inability to comply with a published Web services contract expressed via its WSDL/Schema could be one of the leading causes of interoperability problems (I use the term "interoperability" pretty broadly here).
For example:
My biggest pet peeve is that most JAX-RPC implementations today don't even support schema validation out of the box (hopefully this will change in JAX-WS). I understand that full validation against a schema could be expensive, but can we at least handle required parameters/fields as they are defined in the schema? A type mismatch will certainly cause an error, so why not a missing mandatory field?
Note that I'm not talking about WS-I profiles, .NET/Java interop, and differences in WS-* implementations. I've seen all of the above problems while working with different Web services implementations in Java (mostly JAX-RPC).
And it is not just about interoperability - reliance on subtle conventions as opposed to a published contract makes architectures more fragile and more tightly coupled.
So my advice is simple: write a handler (at least for development/test environments) and enable schema validation on the server and, under some circumstances (e.g., when you don't control the service), on the client. This will save you from many interoperability problems, guaranteed.
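Full validation needs a validating parser and the service's schema. As a much smaller stand-in, the sketch below shows only the required-field check such a handler could perform. It is not schema validation, and the names `REQUIRED_FIELDS`, `missing_required` and the `order` message are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of mandatory child elements; in a real handler these
# would come from the service's schema rather than being hard-coded.
REQUIRED_FIELDS = ("customerId", "amount")


def missing_required(message_xml, required=REQUIRED_FIELDS):
    """Return the names of required child elements absent from the message."""
    root = ET.fromstring(message_xml)
    return [name for name in required if root.find(name) is None]


request = "<order><customerId>42</customerId></order>"
problems = missing_required(request)
if problems:
    print("Rejecting message, missing fields:", ", ".join(problems))
```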
Good Web service design starts with a schema. Binding, port type and all these other parameters of a WSDL file usually are not interesting at all – 99.9% of all services have trivial SOAP bindings, no headers and no declared faults. Also, the majority of Web services today are document-style with one “part” per message. So the schemas of input and output messages are really the key to understanding what a service is about. In other words, the schema truly defines the contract of a service.
There is no substitute for the schema – you can’t generate a good schema from source code or a UML class diagram since they do not convey optionality, restrictions, patterns, element model (sequence or choice) and other aspects. XML Schema is verbose and complex (and that’s a topic for another post), but it is pretty expressive as far as data typing goes.
So I usually start a service design with a well-documented schema. The question is, however, how to then publish it so other people can review it (you know, people who represent service consuming applications – these folks are kind of important). Reviewing a schema in a “raw” format is out of the question, especially if business people are involved. So nicely formatted HTML documentation that presents the schema in a readable format (ideally, translating schema lingo into plain English, so that “minOccurs=0” becomes “optional”) is the way to go.
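To show the kind of translation I have in mind, here is a rough sketch that reads element declarations and describes their occurrence constraints in plain English. The schema fragment and the `describe` function are hypothetical, and the sketch covers only `minOccurs` and `maxOccurs`, relying on the XML Schema default of 1 for both.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment used only for illustration.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="lastName" type="xs:string"/>
        <xs:element name="middleName" type="xs:string" minOccurs="0"/>
        <xs:element name="phone" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def describe(element):
    """Translate minOccurs/maxOccurs into plain English."""
    min_occurs = element.get("minOccurs", "1")  # XML Schema default is 1
    max_occurs = element.get("maxOccurs", "1")  # XML Schema default is 1
    need = "optional" if min_occurs == "0" else "required"
    count = "may repeat" if max_occurs != "1" else "at most once"
    return "%s: %s, %s" % (element.get("name"), need, count)


root = ET.fromstring(SCHEMA)
# Only typed (leaf) declarations are described; the wrapper element is skipped.
lines = [describe(e) for e in root.iter(XS + "element") if e.get("type")]
print("\n".join(lines))
```

A real documentation generator would also have to cover restrictions, patterns and choice groups, which this sketch ignores.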
I suspect that SOA registries, such as Infravio, can take care of this problem. But oftentimes there is a need to prototype a solution quickly before SOA infrastructure is put into place. Also, in this day and age, open source/free/inexpensive tools might be a requirement for many projects.
The best known Schema documentation generator is xnsdoc. It is not free, but 49 EUR seems like a reasonable price. xnsdoc has an open source XSLT-based counterpart with fewer capabilities. xnsdoc does produce nice documentation along the lines of the JavaDoc format (although I’m not convinced that this is the best choice – remember, the documentation has to be suitable for non-developers); however, in my view it does not do much in terms of explaining the schema in plain English. In other words, the assumption is that users are familiar with Schema concepts. I also found that the support for groups and multi-file schemas had some problems (but keep in mind that I did my testing almost a year ago, so please check the latest version).
A better tool that I found is Bluetetra XSDDoc (99 USD). In my view, it provides more user-friendly schema documentation. You can see some examples here. Unfortunately, it is not clear if XSDDoc is still actively supported – its website provides only minimal information and the product has not been updated in a while.
I still think that there is a need for a more interactive Wiki-style schema publishing tool that would allow users to review schemas expressed in layman's terms and comment on elements and types.
The client had already decided to standardize on XML Schema, so using Relax NG or Schematron was not an option.
XML Schema provides a lot of different capabilities but based on my experience I think that it could benefit from some improvements. Here are my random thoughts on this. Now, I don’t claim to be the ultimate XML Schema expert, so take it for what it’s worth.
So my wish list is actually quite simple: