Valid vs. well-formed
I am new to XML and I'm wondering what the difference is between valid and well-formed? I know that valid XML is that which conforms to a DTD like HTML4, but what is well-formed?

Well-formedness is a weaker form of XML compliance in which the principal requirement is that all tags are balanced, i.e. all start-tags have matching end-tags. Apart from that, there are only a small number of rules, such as quoting all attribute values, representing any ampersand characters with "&amp;" and any less-than characters with "&lt;".
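As a rough illustration of these rules, here is a minimal sketch that uses the parser in Python's standard library, which checks well-formedness but does not validate against a DTD. The helper name is_well_formed and the sample documents are made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser rejects anything that breaks a well-formedness rule.
        return False


# Illustrative documents, one per rule mentioned above.
samples = {
    "balanced and escaped": '<note lang="en">Fish &amp; chips</note>',
    "unbalanced tags": "<note><to>Ann</note>",
    "unquoted attribute": "<note lang=en>Hi</note>",
    "bare ampersand": "<note>Fish & chips</note>",
    "bare less-than": "<note>1 < 2</note>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample should pass. Note that nothing here says which elements or attributes are allowed; that is the job of validation.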

Well-formed XML is very useful because there are application areas (such as rendering) where full-blown validation against a DTD is not required and would overly complicate matters.

However, when designing XML systems, it is advisable to have the sword of Damocles hanging over your head to remind you to create a schema for your XML data. These days, you do not necessarily have to use DTD syntax. You now have a choice of DTDs, RELAX NG, Schematron, and W3C XML Schema.

There are pros and cons to using each of these, but any one of them is better than no schema at all. If you rely purely on well-formed XML, you will end up writing humdrum data validation logic into your business logic, where it will create a maintenance nightmare.

Having said that, be warned that there are times when embedded business logic is the most practical route. The more complex the business rules, the less likely it is that you will be able to cleanly express the rules for validating the XML documents using declarative schema languages.
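To make this concrete, here is a sketch of one kind of rule that a DTD cannot express: an arithmetic check across elements. The order document, its element and attribute names, and the function total_matches_items are all hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical order document; the vocabulary is invented for this example.
ORDER = """
<order>
  <item price="4.50" quantity="2"/>
  <item price="1.25" quantity="4"/>
  <total>14.00</total>
</order>
"""


def total_matches_items(order_xml: str) -> bool:
    """Check that <total> equals the sum of price * quantity over all items."""
    root = ET.fromstring(order_xml)
    expected = sum(
        (
            Decimal(item.get("price")) * int(item.get("quantity"))
            for item in root.iter("item")
        ),
        Decimal("0"),
    )
    return Decimal(root.findtext("total")) == expected


print(total_matches_items(ORDER))
```

A schema could still enforce the structure of such a document (an order contains items and a total), leaving only the arithmetic rule in code.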


This was first published in December 2001