The need for a more human-friendly alternative to XML is apparent to many people, myself included. This is why quite a few “lightweight markup languages”:http://en.wikipedia.org/wiki/List_of_lightweight_markup_languages have been created over the last several years. I guess they are called “lightweight” because they don’t use XML-like tags, which tend to clutter documents. I’ve looked at several of them and found “YAML”:http://yaml.org to be the most mature of the bunch, as well as quite human-readable (as opposed to, say, JSON) and easy to understand. You can find very good side-by-side XML vs. YAML comparisons “here”:http://yaml.kwiki.org/index.cgi?YamlExamples and “here”:http://www.ibm.com/developerworks/xml/library/x-matters23.html. The difference in readability is stunning.

From what I understand, YAML is popular in the Ruby world and is used in various “PHP projects”:http://www.symfony-project.org/book/1_0/08-Inside-the-Model-Layer. However, it is almost unknown in Java/J2EE circles, which is a shame. While annotations have somewhat limited the spread of “XML hell” in Java applications, XML remains the de facto format for configuration files. I would venture to say that, except for a few outliers, YAML would be a better option.

Why is that the case? One reason is that YAML simplifies application support. Developers often say that they don’t care about the readability of XML since they use IDEs or editors that hide its complexity. Indeed, being able to work with XML in a nice tree-view editor is appealing. But this does not help when, in response to some production problem, application configuration needs to be quickly analyzed and potentially updated on a remote machine that most likely has only vi or Notepad. That is usually the case in production environments, which I find very ironic: shouldn’t the production machine have the most advanced editors and analysis tools to make troubleshooting as efficient as possible? For configuration files, readability and ease of understanding are key.

Of course, there is also the trusty old properties (name-value) format. It is, however, very limited, since it does not support any kind of nesting or scoping. All properties essentially become global, and haven’t we learned already that global variables are not a good thing?
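To make the flat-namespace point concrete, here is a minimal sketch using only the standard @java.util.Properties@ class. The key names and host names are made up for illustration. Note that the dots in the keys are just a naming convention: the format itself knows nothing about them, so any grouping has to be emulated by hand with string prefixes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

public class FlatPropertiesDemo {
    // Hypothetical configuration; the keys and hosts are invented for this example.
    static final String CONFIG =
        "db.primary.host=db1.example.com\n" +
        "db.primary.port=5432\n" +
        "db.replica.host=db2.example.com\n" +
        "db.replica.port=5433\n";

    // Parses properties text into a single flat key/value table.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    // Emulates scoping by hand: returns the remainder of every key
    // that starts with the given prefix followed by a dot.
    static Set<String> keysUnder(Properties props, String prefix) {
        Set<String> result = new TreeSet<>();
        String scope = prefix + ".";
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(scope)) {
                result.add(key.substring(scope.length()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties props = load(CONFIG);
        // Every value is addressed by its full, global key.
        System.out.println(props.getProperty("db.primary.host"));
        // "Scopes" exist only as a prefix-matching convention.
        System.out.println(keysUnder(props, "db.replica"));
    }
}
```

Every consumer of such a file has to agree on the prefix convention, and nothing stops two unrelated components from picking colliding keys.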

YAML, on the other hand, allows for expressing arbitrarily complex models. Anything that can be expressed in XML can also be expressed using YAML.
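As a rough illustration of what such a nested model looks like, the sketch below builds by hand the kind of structure a YAML document describes: mappings become maps and sequences become lists. No YAML library is involved, and the document in the comment as well as all names are hypothetical. How a particular Java parser actually represents a loaded document is an assumption here, not something this example demonstrates.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class NestedConfigDemo {
    /*
     * Hypothetical YAML document that this structure mirrors:
     *
     * db:
     *   primary:
     *     host: db1.example.com
     *     port: 5432
     *   replicas:
     *     - db2.example.com
     *     - db3.example.com
     */
    static Map<String, Object> buildConfig() {
        Map<String, Object> primary = new LinkedHashMap<>();
        primary.put("host", "db1.example.com");
        primary.put("port", 5432);

        Map<String, Object> db = new LinkedHashMap<>();
        db.put("primary", primary);
        db.put("replicas", Arrays.asList("db2.example.com", "db3.example.com"));

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("db", db);
        return root;
    }

    // Walks nested maps along the given path; returns null if a segment is missing
    // or if the path tries to descend into something that is not a map.
    static Object lookup(Map<String, Object> root, String... path) {
        Object current = root;
        for (String segment : path) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(segment);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> config = buildConfig();
        System.out.println(lookup(config, "db", "primary", "port"));
        System.out.println(lookup(config, "db", "replicas"));
    }
}
```

Unlike the flat properties format, the scoping here is part of the data itself: @host@ under @primary@ cannot collide with a @host@ defined anywhere else.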

On the downside, YAML does not have a very broad ecosystem. There are not that many “editors that support YAML”:http://www.digitalhobbit.com/archives/2005/09/15/yaml-editor-support/. There is a “YAML Eclipse plugin”:http://code.google.com/p/yamleditor/, but it only provides syntax coloring, not validation (here is “another plugin”:http://noy.cc/symfoclipse/download.html, which I have not tried yet). There is no metadata support, at least for Java, although there is a “schema validator”:http://www.kuwata-lab.com/kwalify/ for Ruby (its Java port seems to be dead). There is also no XSLT equivalent.

There are two YAML parsers for Java: “jvyaml”:https://jvyaml.dev.java.net/ and “JYaml”:http://jyaml.sourceforge.net/index.html. They kinda work, but there is certainly room for improvement in terms of error messages and the basic ability to detect and reject an incorrect document. Since YAML is supposed to be a language with a minimal learning curve, parsing has to be intuitive and bulletproof.

I still think that, despite these shortcomings, YAML is the way to go. Perhaps I will take a closer look at one of these parsers and see if I can tweak it a bit.

3 thoughts on “XML Alternatives and YAML”

  1. NetBeans supports YAML, both validation and highlighting. It also does “wrapping”.

  2. I was looking for XML alternatives. YAML is a viable alternative, but the Java support is still not there.
