XML Developer Tip
Euro symbols and XML documents
Those XML content developers whose work comes at a price -- or at least, whose work includes pricing information -- and whose scope extends into European markets will want to know how to include the euro symbol (€) in their documents. Traditionally, adding symbol references to XML documents means including some kind of numeric or symbolic character entity. One common source for the euro character is the Unicode character set, where it is code point U+20AC; the following references to it may be used with UTF-8 or UTF-16 encodings:
- Numeric (hex): &#x20AC;
- Numeric (decimal): &#8364;
- Symbolic (name): &euro;
That said, there are some important things to remember when using entities in XML documents:
- Named entities must be declared; numeric character references such as &#x20AC; need no declaration. In the DOCTYPE section of an XML document the following would be valid: <!ENTITY euro "&#x20AC;">
- Remember that numeric character references are not recognized inside CDATA sections, where they are treated as literal text, and that they cannot appear within XML names associated with elements, attributes, or IDs, either.
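The points above can be checked with a short script. This is a minimal sketch, assuming Python and its standard xml.etree.ElementTree parser (neither is part of the original tip); the price element and its hex and dec attributes are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: element and attribute names are
# illustrative only. The internal DTD subset declares the "euro" entity.
DOC = """<!DOCTYPE price [
  <!ENTITY euro "&#x20AC;">
]>
<price hex="&#x20AC;" dec="&#8364;">&euro;</price>"""

root = ET.fromstring(DOC)

# The hex reference, the decimal reference, and the declared named
# entity should all resolve to the same character, U+20AC.
forms = [root.get("hex"), root.get("dec"), root.text]

# Without a declaration, the named reference is rejected by the parser.
try:
    ET.fromstring("<price>&euro;</price>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

# Inside a CDATA section the reference is kept as literal text.
cdata_text = ET.fromstring("<price><![CDATA[&#x20AC;]]></price>").text

print(forms, undeclared_ok, repr(cdata_text))
```

Other parsers may report the undeclared entity differently, so treat the error handling here as specific to this parser.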
For those using XHTML or HTML, a declaration for the euro is included in the HTML Special entity set available at www.w3.org/TR/html401/sgml/entities.html#h-24.4 (it's named "euro" and appears as the last entry on the aforementioned Web page). To use these declarations, you must include this markup in your DOCTYPE section for XHTML or HTML documents:
<!ENTITY % HTMLspecial PUBLIC "-//W3C//ENTITIES Special//EN//HTML"> %HTMLspecial;
Otherwise, explicit declaration of the entity itself, as noted above, is required in XML documents.
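As a quick cross-check that the HTML entity named "euro" refers to the same Unicode character as the numeric references, the sketch below consults the entity tables that ship with Python's standard library. Python is an assumption here, not something the original tip uses.

```python
import html
from html.entities import name2codepoint

# name2codepoint maps HTML 4 entity names to Unicode code points.
euro_codepoint = name2codepoint["euro"]

# html.unescape resolves named references in HTML text.
sample = html.unescape("Price: &euro;10")

print(hex(euro_codepoint), sample)
```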
One more thing to note: such contortions are necessary in XHTML and HTML because the default HTML/XHTML character set, known as ISO 8859-1 or ISO-Latin-1, does not include the euro character (the set was defined long before the euro came into being). A newer character set, ISO 8859-15, also known as ISO-Latin-9, adds the euro and swaps out some seldom-seen characters to better support Finnish and French. To use this character set, the following XML declaration works:
<?xml version="1.0" encoding="iso-8859-15"?>
In this new character set, the euro takes over code position 0xA4 (decimal 164), which ISO 8859-1 assigns to the generic currency symbol (¤).
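The effect of that substitution can be seen by decoding the same byte under both character sets. The following is a minimal sketch, assuming Python's built-in codecs and xml.etree.ElementTree (not part of the original tip); the price element is a hypothetical name.

```python
import xml.etree.ElementTree as ET

CURRENCY_BYTE = b"\xa4"

# The same byte means different things in the two character sets.
latin1 = CURRENCY_BYTE.decode("iso-8859-1")    # generic currency sign
latin9 = CURRENCY_BYTE.decode("iso-8859-15")   # euro sign

# Hypothetical document stored as ISO 8859-15 bytes; the XML declaration
# tells the parser how to interpret byte 0xA4.
doc = b'<?xml version="1.0" encoding="iso-8859-15"?><price>\xa4</price>'
parsed = ET.fromstring(doc).text

# Serializing the euro: ISO 8859-15 can store it as a single byte, while
# ISO 8859-1 cannot, so this serializer falls back to a numeric
# character reference.
elem = ET.Element("price")
elem.text = "\u20ac"
as_latin9 = ET.tostring(elem, encoding="iso-8859-15")
as_latin1 = ET.tostring(elem, encoding="iso-8859-1")

print(repr(latin1), repr(latin9), repr(parsed))
print(as_latin9)
print(as_latin1)
```

The fallback to a numeric reference is the behavior of this particular serializer; other tools may raise an error instead when a character is missing from the target encoding.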
Armed with this information, your XML, HTML, and XHTML documents should soon be ready for business in the standard European currency. (For more information on this topic, please see Rick Jelliffe's far more detailed article "Euro-XML".)
About the Author
Ed Tittel is a principal at LANWrights, Inc., a network-oriented writing, training, and consulting firm based in Austin, Texas. He is the creator of the Exam Cram series and has worked on over 30 certification-related books on Microsoft, Novell, and Sun-related topics. Ed teaches in the Certified Webmaster Program at Austin Community College and consults. He is a member of the NetWorld + Interop faculty, where he specializes in Windows 2000-related courses and presentations.
This was first published in December 2002