In Defense of XML

I read a Reddit post defending the creators of the XML standard. The post focused on XML basics. It excluded fringe technologies such as WS-*, SOAP, XML-RPC, and RDF. The author acknowledged that a big part of XML is related standards like XML Schema and XQuery. However, he declared that the main point of XML was standardizing a file format and creating a meta language. The best part of the post was the many comments it received.

First let’s cover the bad. SOAP requests take forever to parse. XML parsing in general seems to be inefficient. It is slow for large data transfers. Most developers do not like XML. This may be due to APIs like SAX and DOM being horrible to work with.
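
For anyone who has not met the two APIs named above, here is a minimal sketch in Python of what each style looks like. The sample document and its element names (order, item) are made up for illustration; none of this comes from the Reddit thread.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, invented for this sketch.
SAMPLE = "<order><item>pen</item><item>ink</item></order>"


class ItemHandler(xml.sax.ContentHandler):
    """SAX style: the parser calls back as it streams through the input."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._text = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect the pieces.
        if self._in_item:
            self._text.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self._text))
            self._in_item = False


def items_with_sax(xml_text):
    handler = ItemHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.items


def items_with_dom(xml_text):
    """DOM style: load the whole tree into memory, then walk it."""
    doc = minidom.parseString(xml_text)
    return [node.firstChild.data for node in doc.getElementsByTagName("item")]


print(items_with_sax(SAMPLE))
print(items_with_dom(SAMPLE))
```

Both functions return the same list of item names. The SAX version needs a handler class and some bookkeeping, while the DOM version is shorter but holds the whole document in memory, which may be part of why neither API is well loved.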

That is quite a list of bad comments. But for each of them, there was a corresponding good comment. XML has been applied to many domains. It is good for large and complex documentation. It is also good for configuration files. Microsoft Visual Studio has a nice XML editor built in. XML is essentially SGML for the web. In other words, it is meant for data transfer.

Initially I thought I would be involved in a new addition to our system that will parse XML input data. However, I am finding that I am too busy with other new parts of our system this year. So I have not been as active in my XML studies since my training class ended. It is still nice to read what people in the XML industry have to say about the topic.

Parsing the XML

Today our system receives input files from mainframe systems in a custom format. However, starting next year, we will be getting some files in XML format. Today the lead developer on that task shared some of the technology options to deal with reading the input data.

I liked some of the first options that were presented. Essentially they used a high-level language such as C or Java to work with the XML parser provided by Oracle Corporation. The destination of our data is a huge Oracle database.

One of the other options was to use XML DB to pull the XML right into our database as XML. Then we could use some code to further process the XML data from the database. This sounded good as we have a number of developers with Oracle PL/SQL coding talent.

A DBA on our project thought storing huge amounts of XML in our database would be painfully slow. Instead he recommended we transform the data using XSLT. Then we could directly load the transformed data into our database with quick loading programs from Oracle. This was all good food for thought.
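
The DBA's idea (flatten the XML first, then bulk load the result) can be sketched roughly as follows. To be clear, this sketch uses Python's ElementTree and csv modules instead of XSLT, purely to show the flattening step. The record layout (accounts, account, id, name, balance) is hypothetical and is not our real input format.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical input; the real mainframe feed layout is not shown in this post.
SAMPLE = """<accounts>
  <account id="1001"><name>Ann</name><balance>250.00</balance></account>
  <account id="1002"><name>Bob</name><balance>75.50</balance></account>
</accounts>"""


def xml_to_csv(xml_text):
    """Flatten each <account> element into one comma-separated row."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "name", "balance"])  # header row
    for account in root.findall("account"):
        writer.writerow([
            account.get("id"),            # attribute value
            account.findtext("name"),     # text of the <name> child
            account.findtext("balance"),  # text of the <balance> child
        ])
    return out.getvalue()


print(xml_to_csv(SAMPLE))
```

A flat file like this is the kind of thing a bulk loading program can read without knowing anything about XML, which is the appeal of transforming before loading.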

XML Notepad 2007

While reading an article on the best free tools for your PC, I spotted a product called XML Notepad 2007. This is a free download from Microsoft. I decided to give it a try. We are moving to an XML format for input files next year.

The download was not too large. What really impressed me was how fast the tool installed. Sure it is just a little XML tool. But these days even the simple tools seem to take forever to install. In contrast, once installed, the tool seemed to take a while to launch.

The “read me” file that came with the install said XML Notepad 2007 has been downloaded over one million times. I wonder whether that was the number of downloads at the time they released this version (v2.5). Or perhaps the application calls home to Microsoft to determine the current counts.

The application has a right pane which shows information from your XML document. At first it seems sparse. But it fills up as you select items in the tree on the left-hand pane. This application feels like a simple viewer. However, it states that you can use it to create XML documents as well.

I found the tree-like structure on the left-hand pane useful for determining the structure of the XML in some test files. In other words, I could see the hierarchy of parent and child relationships. Make no mistake. This product is no XMLSpy. But it will do until my company can purchase me an XMLSpy product.
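
The same parent and child view can be approximated in a few lines of code. This is only a sketch using Python's ElementTree; the sample document is invented, and it has nothing to do with how XML Notepad works internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical test file contents, invented for this sketch.
SAMPLE = "<library><book><title/><author/></book><magazine/></library>"


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(ET.fromstring(SAMPLE))))
```

Each level of indentation is one step down the hierarchy, so title and author show up under book, and book and magazine show up under library.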

XBRL

Altova has come out with MissionKit. This is a set of tools for XML, UML, and databases. In case you did not know, UML is the Unified Modeling Language.

I know Altova for their superior XMLSpy application for XML development. In fact, XMLSpy is a part of MissionKit. I am excited because my company is purchasing some XMLSpy licenses for my project.

Altova added XBRL support to their tool kit. XBRL stands for the Extensible Business Reporting Language. It has tags for accounting purposes. This helps businesses exchange financial information. If their toolkit and XBRL support are anything like XMLSpy, I predict good things from it.

Cloud Computing Overview

If you read the trade press, you know there has been a lot of hype over Cloud Computing. At first I ignored it. But Cloud Computing does not look like it is going away any time soon. So I figured I had better pay some more attention to it. My main interest is determining how it impacts me as a programmer.

I saw an interview with a manager who discussed her take on Cloud Computing. She defined it as a deployment model. She also called it a shared resource model. I guess the word model comes with the territory here. That is kind of like the word paradigm or capitulation. But I digress. The manager said some of Cloud Computing is like a hosting model. There goes the model word again. Cloud Computing comes with its own security risks due to the sharing of resources. There are scalability issues depending on the number of users and the load.

But I am a Cloud Computing newbie. So let’s go back a bit. Cloud Computing promises virtually unlimited CPU, memory, storage and bandwidth. The only limit seems to be how much money you have to pay for it. That may seem like a bandit’s scheme. However you are supposed to only have to pay for what you consume. This is analogous to electricity usage.

Unfortunately there are no standards for Cloud Computing. Two large and popular implementations are Amazon Web Services (AWS) and Google App Engine (GAE). Those are good buzzwords to know for your next cocktail party. There may not be a service level agreement available for users of these platforms.

Experts think that Cloud Computing is going to be a hit in the enterprise. The idea is still in its infancy. In addition to Amazon and Google, Microsoft has recently come out with its Azure platform for Cloud Computing. That however is a topic for a future blog post.

End of Online Class

I have been engaged in an online class on XML. Some time ago I received the last class via email. It might be time to start putting my XML skills to use. Hands-on learning is the best way to really learn a subject. I thought I would share what I learned in my last online class. This last class was a review of some XML related topics not covered yet.

XHTML 1.0 is essentially HTML 4 rewritten to be valid XML. XHTML has also been modularized (i.e. split into modules). Some of the core modules in XHTML are structure, text, hypertext, and lists. Images are notably not part of the core modules.
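
One practical consequence of XHTML being XML is that an ordinary XML parser can read it. Here is a small sketch, assuming two made-up snippets, that feeds XHTML style and HTML 4 style markup to Python's ElementTree parser. It only checks well-formedness; it does not validate against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

# Both snippets are made up for this sketch.
XHTML_STYLE = "<p>First line<br/>second line</p>"  # every tag is closed
HTML4_STYLE = "<p>First line<br>second line</p>"   # <br> is never closed


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(XHTML_STYLE))
print(is_well_formed(HTML4_STYLE))
```

The first snippet parses and the second is rejected, because the unclosed br tag is legal in HTML 4 but not in XML.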

XPath is a language for addressing specific parts of an XML document. It is used by both XSLT and XPointer. You can use XPath to navigate an XML document. A leading slash character refers to the root of the document. The asterisk character matches every element at that step of the path. The at sign (@) matches XML attributes. All of these parts are easier to understand with examples. Perhaps I will provide some XPath examples in the future.
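
In the meantime, here is a small sketch of the slash, asterisk, and at sign in action. It uses Python's ElementTree, which supports only a limited subset of XPath and starts from the root element, so the paths below are written relative to it with a leading dot. The catalog document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this sketch.
SAMPLE = """<catalog>
  <book id="b1"><title>XML Basics</title></book>
  <book><title>No Id Here</title></book>
  <cd id="c1"><title>Greatest Hits</title></cd>
</catalog>"""

root = ET.fromstring(SAMPLE)  # root is the <catalog> element

# "./*" : the asterisk matches every child element of the current node.
child_tags = [el.tag for el in root.findall("./*")]

# "./book/title" : slashes separate the steps of the path.
book_titles = [el.text for el in root.findall("./book/title")]

# "./*[@id]" : the at sign refers to an attribute; here it keeps only
# the elements that carry an id attribute.
ids = [el.get("id") for el in root.findall("./*[@id]")]

print(child_tags)
print(book_titles)
print(ids)
```

The second book has no id attribute, so it appears in the first two results but drops out of the third.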

Next year my project is going to start receiving all input data in XML file format. I believe they are using XML Schema to be exact. Now I do not directly work on the team that loads data into our system. I work on applications that use that data. However I might be involved with the changes to read the XML source files pretty soon. Practice makes perfect.

XSL and XSLT

Once again I got an email from my online XML course. This week there were two topics covered: XSL and XSLT. Unfortunately the coverage of XSLT was very light. I think I have written about XSLT with a little more depth before. If not, it might be time for a little more study and a future post. This week starts with an understanding that XML itself is not concerned with display. That’s where XSL comes in.

XSL stands for eXtensible Stylesheet Language. This language contains formatting information for XML. The really interesting part is that XSL stylesheets are XML documents themselves. XSL was one of the first applications written in XML.

Cascading Style Sheets (CSS) define how XML looks in a web browser. They can also be applied to HTML source. Finally, the eXtensible Stylesheet Language for Transformations (XSLT) is a language which transforms XML into other formats. An example would be transforming XML to HTML. However, you can choose other output formats, including XML itself.

The heart of XSLT is the use of templates. Now that I look back on my blog posts, I find that I have not covered XSLT in depth anywhere yet. I did see that I wrote myself a note about there being a follow-on class to the hands-on training I took previously. Maybe that will be a good candidate for me to take next year. Then I will be able to write at length about this complex topic. Until then be well.