Archive for November, 2008

XML Alternatives and YAML

Posted on 11/19/2008 , Alexander Ananiev, 3 Comments ( Add )

The need for a more human-friendly alternative to XML is apparent to many people, myself included. This is the reason why quite a few different light-weight markup languages have been created over the last several years. I guess they are called “lightweight” because they don’t use XML-like tags that tend to clutter documents. I’ve looked at several of them and found YAML to be the most mature out of the bunch as well as quite human-readable (as opposed to, say, JSON) and easy to understand. You can find some very good side-by-side XML vs. YAML comparison here or here, the difference in readability is stunning.

From what I understand, YAML is popular in the Ruby world and it is used for various PHP projects. However, it is almost unknown in Java/J2EE circles. Which is a shame. While annotations somewhat limited the spread of “XML hell” in Java applications, XML still remains a de-facto file configuration format. I would venture to say that except for few outliers, YAML would be a better option as a format for configuration files. Why is it the case? One reason is that YAML format simplifies application support. Developers often say that they don’t care about readability of XML since they use IDE or editors that hide the complexity of XML. Indeed, being able to work with XML in a nice tree view-based editor is appealing. But this does not work when application configuration needs to be quickly analyzed and potentially updated on some remote machine that most likely only has VI or notepad (which is usually the case in production environments, which I find very ironic – shouldn’t the production machine have the most advanced editors and analysis tools to make troubleshooting as efficient as possible?) in response to some production problem. For configuration files, readability and ease of understanding is the key.

Of course, there is also an old trusty property/name-value format. It is, however, very limited, since it does not support any kind of nesting or scoping. So all properties become essentially global and haven’t we learned already that global variables is not a good thing?

YAML, on the other hand, allows for expressing arbitrary complex models. Anything that can be expressed in XML can also be expressed using YAML.

On the downside, YAML does not have a very broad ecosystem. There are not that many editors that support YAML. There is a YAML Eclipse plugin, but in only gives color highlighting, no validation (here is another plugin which I have not tried yet). There is no metadata support, at least for Java, although there is a schema validator for Ruby (its Java port seems to be dead). There is also no XSLT equivalent.

There are two YAML parsers for Java – jvyaml and JYaml. They kinda work, but there is certainly room for improvement in terms of error messages and just the ability to detect and reject an incorrect document. Since YAML is supposed to be a language with minimal learning curve, the parsing has to be intuitive and bulletproof.

I still think that despite the shortcomings YAML is the way to go. Perhaps I will give a closer look to one of these parsers and see if I can tweak it a bit.

Why are Environments So Poorly Supported?

Posted on 11/13/2008 , Alexander Ananiev, No Comments ( Add )

A concept of “environment” permeates software development lifecycle. No application is released into production directly from developers PCs. There has to be a place where an application can go through various stages of testing. We use different environments for that purpose, e.g., “QA environment” or “acceptance environment”.

An “environment” is just a collection of resources which could include middleware and OS/filesystem resources. In the simplest case, an environment for a J2EE web application consists of a single application server. Complex applications consisting of multiple components could utilize many different resources, including several different middleware products (e.g., app server, Web server, messaging infrastructure, ESB).

For any IT organization it is important to know how their resources are used. ITIL has a concept of CMDB that’s supposed to contain all IT resources. However, the granularity of CMDB implementations is usually too high (typically, a server level) which makes it difficult to use for software development. Also, CMDB is not really integrated with development processes and tools, it’s kind of a thing on its own.

Ideally, the environment concept must be supported by development, version control, change management and build/deploy tools. Environment metadata can be used to automatically install an application in a given environment. Testing tools can use this metadata to generate “smog” tests or to adjust existing test cases (e.g., by using different URLs/endpoints). There should be a capability to produce various reports showing what version of what application is installed in what environment.

Sadly, all these wonderful features are mostly missing from modern development and CM tools. Developers rely on scripting and informal use of environment variables. Essentially, today, each application has its own “selfish” view of what an environment is. This makes providing consistent operations and support difficult. This is especially true in virtualized environments where each logical environment may consist of many different VMs.

Case in point are build servers and build tools. I looked at several build servers and found explicit environment support only in AntHill. All the others I looked at (several, I won’t name them) omit the environment concept completely (except for some lame support of environment variables). To me, this is really odd. While build servers have their root in continues integration, their key selling point in an enterprise is actually release management (at lease for commercial products; there are many great open source build servers to choose from if continuous integration is the only goal). So how can a release process be automated if the foundation of this process is missing entirely from the tool that’s supposed to help with the automation?

Build tools are the same way. There is some deployment support in both Maven (Cargo plugin) and Ant, but no way of supporting environment as an entity.

Most Popular Posts

Recent Tweets

Recent Posts

Blog Categories

Blog Archives