Two tricky techniques for preserving character entities in XSLT 2.0


Ed Tittel
07.14.2004


Thanks to a recent story by Bob DuCharme for XML.com, entitled "Entity and Character References," whose focus is XSLT 2.0, I found myself pondering a problem typical for those who take XML documents through multiple parsers while working through various transformations or operations. DuCharme succinctly observes that a parser's job is to take entity references (in XML, as in SGML, those symbolic names that start with an ampersand and end with a semicolon, like the predefined entities &amp; for the ampersand and &lt; for the less-than symbol) and replace them with their values. Trouble is, if you're trying to create output that needs and expects character entities in the final document, you're in a bit of a pickle if a parser somewhere early in the chain replaces &amp; with "&" and &lt; with "<".

But there is a two-step maneuver that makes this relatively easy to work around, without having to store those items as unparsed character data in CDATA sections or resort to XSLT's disable-output-escaping attribute. By first using numeric references rather than character entities -- that is, &#38; rather than &amp; and &#60; rather than &lt;, the same code points these characters have in ISO-Latin-1 -- you can use XSLT to transform this stuff exactly as you wish during a final editing pass (or at least, in a pass that follows the last parser that might otherwise make substitutions you don't want). This, of course, is step number one.
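
As a rough illustration of step one, a source fragment written with numeric references might look like the sketch below. The element names and content are hypothetical, invented only for this example. Once parsed, both references denote the single characters & and <, which is what the final XSLT pass gets to work with.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input: element names and text are invented for illustration. -->
<menu>
  <!-- &#38; is the numeric character reference for the ampersand (code point 38). -->
  <item>Fish &#38; chips</item>
  <!-- &#60; is the numeric character reference for the less-than sign (code point 60). -->
  <note>Portions &#60; 500 g are half price.</note>
</menu>
```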

Step number two depends on the character map feature in XSLT 2.0, which lets you tell the serializer to write a string of your choosing whenever a specific character turns up in the output. Numeric character references are not entity references, but a parser still resolves them to the characters they name, so what the stylesheet actually sees is the character itself; the character map turns that character back into a character entity at the very end, so it's ready when you need it. A character map basically defines a substitution table that the XSLT processor consults during serialization: when it finds a listed character in the result tree, it writes the corresponding replacement string instead, with no further escaping. Thus, the following example:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:output use-character-maps="num2ent"/>

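  <!-- Replacement strings are written without further escaping, so the
       stylesheet must say &amp;amp; to put &amp; into the output. -->
  <xsl:character-map name="num2ent">
    <xsl:output-character character="&#38;" string="&amp;amp;"/>
    <xsl:output-character character="&#60;" string="&amp;lt;"/>
  </xsl:character-map>

  <!-- Identity template: copy every attribute and node through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>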

</xsl:stylesheet>

This markup does nothing more than copy the entire result tree verbatim to output, except when it encounters the two characters specified, in which case it writes the desired character entities in their place. Obviously, thanks to Mr. DuCharme, you can grab this code and add whatever <xsl:output-character .../> replacements you want and you've got a handy-dandy tool. This is particularly useful when you have to run content through other applications (like MS Office components) that may not perform entirely sensible replacements for you, or when you want to create markup as final output (something anybody who teaches markup must do all the time). Very handy indeed!
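
To give a feel for how the map might grow, here is a minimal sketch that adds three more replacements. The map names (markup-ents, all-ents) and the particular characters chosen are assumptions made for illustration; they do not come from DuCharme's article. Each replacement string is written with &amp;amp; in the stylesheet source because a character map's replacement string is emitted without further escaping.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Only the map named here is applied directly; it pulls in the other one. -->
  <xsl:output method="xml" use-character-maps="all-ents"/>

  <!-- Markup-significant characters. -->
  <xsl:character-map name="markup-ents">
    <xsl:output-character character="&#38;" string="&amp;amp;"/>
    <xsl:output-character character="&#60;" string="&amp;lt;"/>
  </xsl:character-map>

  <!-- Hypothetical additions; use-character-maps merges in markup-ents. -->
  <xsl:character-map name="all-ents" use-character-maps="markup-ents">
    <!-- Non-breaking space, copyright sign, and e with acute accent. -->
    <xsl:output-character character="&#160;" string="&amp;nbsp;"/>
    <xsl:output-character character="&#169;" string="&amp;copy;"/>
    <xsl:output-character character="&#233;" string="&amp;eacute;"/>
  </xsl:character-map>

  <!-- Identity template: copy every attribute and node through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Keep in mind that names such as &nbsp;, &copy;, and &eacute; are not predefined in XML, so whatever reads this output will need a DTD that declares them (as the XHTML DTDs do) or it will reject the document.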


Ed Tittel is a writer, trainer, and consultant based in Austin, TX, who writes and teaches on XML and related vocabularies and applications. E-mail Ed at etittel@lanw.com.

