Don Peterson has written an excellent article on SQL Server Central about the costs and benefits of using XML. Actually, he has updated a previous article, but his points still hold. He originally penned these thoughts when XML was the silver-bullet buzzword of the day.
The self-documenting nature of XML is often cited as facilitating cross-application communication because, as humans, we can look at an XML file and make reasonable guesses as to the data's meaning based on hints provided by the tags. Also, the format of the file can change without affecting that communication because it is all based on tags rather than position. However, if the tags change, or don't match exactly in the first place, the communication will be broken. Remember that, at least for now, computers are very bad at guessing.
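To make that concrete, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The `customer`, `name`, `balance`, and `fullName` tags and the `read_customer` helper are invented for illustration; they are not from Peterson's article. The receiver looks fields up by tag, so reordering elements is harmless, but renaming a tag silently loses the value.

```python
import xml.etree.ElementTree as ET


def read_customer(xml_text):
    """Look up fields by tag name, as a receiving system might (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {
        # findtext returns None when no child with that tag exists
        "name": root.findtext("name"),
        "balance": root.findtext("balance"),
    }


# Same data three ways: as agreed, with elements reordered, and with one tag renamed.
original = "<customer><name>Ada</name><balance>100.00</balance></customer>"
reordered = "<customer><balance>100.00</balance><name>Ada</name></customer>"
renamed = "<customer><fullName>Ada</fullName><balance>100.00</balance></customer>"

print(read_customer(original))   # both fields found
print(read_customer(reordered))  # position does not matter
print(read_customer(renamed))    # "name" comes back as None: the parser cannot guess
```

A human reading `fullName` would guess it means the same thing as `name`. The program does not guess, which is the point being made.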
Regardless of the specific means used, communication across systems requires two things:
- Agreement on what will be sent (what the data means), and
- Agreement on a format.
Using XML does not alter this requirement, and despite claims to the contrary, it doesn't make it any easier. In order to effect communication between systems with a text file, both the sender and receiver must agree in advance on what data elements will be sent (by extension, this mandates that the meaning of each attribute is defined) and on the position of each attribute in the file. When using XML, each element must be defined and the corresponding tags must be agreed upon. Note that tags in and of themselves are NOT sufficient to truly describe the data and its meaning, which of necessity includes the business rules that govern the data's use, unless a universal standard is created to define the appropriate tag for every possible thing that might be described in an XML document and that standard is rigorously adhered to.
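The parallel between the two formats can be sketched in a few lines of Python. Everything named here (the order record, the `orderId` and `amount` tags, the column positions, and both parse functions) is a made-up example, not something taken from the article. In each case the receiver depends on an agreement made outside the file: column positions for the delimited text, tag names for the XML.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical out-of-band agreements between sender and receiver.
POSITIONAL_LAYOUT = {"order_id": 0, "amount": 1}          # field -> column index
XML_TAGS = {"order_id": "orderId", "amount": "amount"}    # field -> tag name


def parse_positional(line):
    """Read a comma-delimited record using the agreed column positions."""
    fields = next(csv.reader(io.StringIO(line)))
    return {name: fields[index] for name, index in POSITIONAL_LAYOUT.items()}


def parse_xml(text):
    """Read an XML record using the agreed tag names."""
    root = ET.fromstring(text)
    return {name: root.findtext(tag) for name, tag in XML_TAGS.items()}


flat = parse_positional("1001,250")
tagged = parse_xml("<order><orderId>1001</orderId><amount>250</amount></order>")

print(flat)
print(tagged)
```

Both calls produce the same dictionary, and neither tells the receiver whether `amount` is in dollars or cents, includes tax, or can be negative. Those business rules still have to be agreed on separately, with or without the tags.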
I can remember sitting in meetings where XML was positioned as:
- The end of programming as we know it
- The end of data modeling as we know it
- The answer to every data management problem ever faced
- The end of RDBMSs
- The answer to every performance problem we'd ever faced (WHAT?)
- The end of data integration problems.
I do use XML, nearly daily, but I never understood all the hoopla about it. First, the concept is really simple, for the most part. Second, I didn't understand (and still don't) how it solves any integration problems at all. The hard part about data integration is agreement on what each piece of data means. Simply throwing a tag (label) around a piece of data doesn't do much. Even giving it some very rudimentary datatypes is less meaningful than what we normally do with data models.
I definitely see the value in using XML for certain technical solutions. I've even specified and architected such solutions. But I haven't seen the light of XML being the answer to almost every IT problem.
Peterson's article addresses these points and more. I'd love to see your take on his positions...and to hear what your responses are when you are told something about the magical properties of tags and XML.