XSLT stands for Extensible Stylesheet Language Transformations, an XML-based application that lets programmers and content developers read, parse, and manipulate XML documents to produce all kinds of outputs, including other XML documents, HTML, and plain text. In this first tutorial on the subject, you learn the basics of XSLT structure and syntax (see my
XSLT follows the normal rules of XML syntax: it uses a standard document preamble, requires a matching closing tag for every opening tag that contains content, and requires proper syntax for empty elements (<empty/> and <empty></empty> are equivalent). Everything else is a matter of writing properly nested XML elements. For complete details on XML syntax, please refer to any of the references cited in the kick-off piece mentioned in the previous paragraph. Every XSLT stylesheet uses the xsl:stylesheet element as the document element (that is, as the container within which all other XSLT elements must occur). This means that any XSLT document can take the following basic form:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- comments adhere to SGML/HTML/XML syntax, as shown here -->
  <!-- xmlns:xsl identifies the XSLT namespace; version identifies the XSLT version in use -->
  <!-- all other XSLT markup occurs before the closing tag -->
</xsl:stylesheet>
That said, the xsl:transform element is a synonym for xsl:stylesheet. You could use xsl:transform instead of xsl:stylesheet in the document's opening and closing tags and still produce a valid stylesheet. The xsl:stylesheet element is used much more commonly than the xsl:transform element, though.
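As a minimal sketch of that equivalence, the same skeleton can be written with xsl:transform as the document element. Only the element name changes; the namespace declaration and version attribute stay the same.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Same skeleton as before, with xsl:transform as the document element -->
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- top-level elements such as xsl:output and xsl:template go here -->
</xsl:transform>
```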
Table 1 lists and describes the top-level elements in the XSLT syntax. In other words, any of these elements can occur directly inside the xsl:stylesheet or xsl:transform element. These elements provide the building blocks for creating XSLT documents, but of all of them the xsl:template element is by far the most commonly used. All elements appear in alphabetical order, and the "Empty?" designation identifies elements that are commonly empty (they may sometimes include content and use a pair of tags, but they more usually appear as an empty element ending in />).
|Element||Empty?||Description|
|xsl:attribute-set||No||Defines a named set of attributes for use in the result document|
|xsl:decimal-format||Yes||Defines the characters and symbols to use when converting numbers into strings|
|xsl:import||Yes||Imports an external stylesheet into a stylesheet; definitions in the importing stylesheet take precedence over those from the imported stylesheet|
|xsl:include||Yes||Incorporates an external stylesheet into a stylesheet; definitions from the included document have the same precedence as those from the including document|
|xsl:key||Yes||Defines a named key to be used with the key() function in patterns and expressions|
|xsl:namespace-alias||Yes||Permits a namespace used in the stylesheet to map to a different namespace in the result document|
|xsl:output||Yes||Controls the format of the result document and drives the output phase of XSLT processing (the initial phase involves constructing a result tree; this element governs the second phase, which outputs that tree)|
|xsl:param||No||As a top-level element, defines a parameter that's global to the entire stylesheet; as a child of xsl:template, defines a parameter local to that template|
|xsl:preserve-space||Yes||In tandem with xsl:strip-space, controls how white space from the source document is handled; this element preserves whitespace text nodes|
|xsl:template||No||Defines a template for producing output, applied either when a pattern is matched or when the template is called by name (the workhorse element in XSLT)|
|xsl:variable||Yes||Defines a local or global variable in a stylesheet and assigns it a value|
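To show how a few of these top-level elements fit together, here is a minimal sketch. The input vocabulary (list, item, and ref elements with id, name, and idref attributes) and the names separator, label, and item-by-id are assumptions made up for illustration; they are not part of XSLT itself.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- xsl:output: serialize the result tree as plain text -->
  <xsl:output method="text"/>

  <!-- xsl:strip-space: drop whitespace-only text nodes from every source element -->
  <xsl:strip-space elements="*"/>

  <!-- xsl:param (global): a default that the caller may override (hypothetical name) -->
  <xsl:param name="separator" select="'; '"/>

  <!-- xsl:variable (global): a fixed value (hypothetical name) -->
  <xsl:variable name="label" select="'Item: '"/>

  <!-- xsl:key: index item elements by their id attribute (hypothetical vocabulary) -->
  <xsl:key name="item-by-id" match="item" use="@id"/>

  <!-- xsl:template: for each ref, look up the item it points to and print its name -->
  <xsl:template match="ref">
    <xsl:value-of select="$label"/>
    <xsl:value-of select="key('item-by-id', @idref)/@name"/>
    <xsl:value-of select="$separator"/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a hypothetical input such as <list><item id="a" name="Alpha"/><item id="b" name="Beta"/><ref idref="b"/><ref idref="a"/></list>, this sketch should produce "Item: Beta; Item: Alpha; ". The item elements themselves contribute nothing, because no template matches them and they have no text content for the built-in rules to copy.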
Recall that XSLT processing involves parsing an input document into an internal tree representation (the document tree), applying templates to that tree to build a second tree (the result tree), and then applying output transformations to the result tree to create a result document (also called an output document). Understanding this process illuminates XSLT in general and, not coincidentally, sheds light on its syntax as well.
The flow for most XSLT stylesheets is to scan the input document and apply the relevant templates to it, building the result tree along the way. When processing of the input document is complete, the result tree is output, using whatever output method, encoding, declarations, and so forth the xsl:output element specifies. To make things more interesting, this element may occur more than once in a stylesheet; in that case, the processor merges all of the xsl:output elements into a single effective declaration that governs the output.
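XSLT 1.0 treats multiple xsl:output elements as one merged declaration. The sketch below illustrates this with two declarations that set different attributes; the report element is a made-up name used only for illustration. Setting the same attribute to two different values at the same import precedence is an error in XSLT 1.0, so the sketch keeps the two attribute sets disjoint.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Two declarations that set different attributes... -->
  <xsl:output method="xml" indent="yes"/>
  <xsl:output encoding="UTF-8" omit-xml-declaration="yes"/>
  <!-- ...behave like one declaration with all four attributes:
       method="xml" indent="yes" encoding="UTF-8" omit-xml-declaration="yes" -->

  <!-- Build the result tree: wrap whatever the other rules produce in a report element -->
  <xsl:template match="/">
    <report><xsl:apply-templates/></report>
  </xsl:template>

</xsl:stylesheet>
```

Splitting the declaration this way is mostly useful when one part comes from an included or imported stylesheet and the other from the main one.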
A quick rundown of other key XSLT elements puts this XML application more fully into its intended context, after which we can explore a short example document. Table 2 lists other key XSLT elements and includes information about valid parent/container elements (for a complete alphabetical listing of XSLT elements, consult Michael Kay's XSLT Programmer's Reference; a citation appears in the introduction to this tutorial series).
|Element||Parent||Empty?||Description|
|xsl:apply-templates||template body||No||Defines a set of document nodes to process, where processing occurs by selecting template rules as appropriate|
|xsl:call-template||template body||Varies||Invokes a named template, much like calling a subroutine or procedure in a program|
|xsl:copy||template body||No||Makes a literal copy of the current node to the result tree without making any conversions, and without copying any child nodes or attributes|
|xsl:copy-of||template body||Yes||Copies the nodes selected by its select expression to the result tree, including their attributes and child nodes|
|xsl:number||template body||Yes||Used to number nodes in sequence, or to format a numeric value for output|
|xsl:sort||xsl:apply-templates, xsl:for-each||Yes||Used to define a sort key and a sort order for the nodes selected by the parent element|
|xsl:text||template body||No||Outputs literal text to the result tree|
|xsl:value-of||template body||Yes||Writes the string value of an expression to the result tree|
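Several of these elements can be seen working together in the following sketch, which prints a sorted, numbered list. The input vocabulary (a tasks element containing task elements with a name attribute) and the template name show-name are assumptions made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="tasks">
    <!-- xsl:apply-templates selects the nodes; xsl:sort orders them by name -->
    <xsl:apply-templates select="task">
      <xsl:sort select="@name" order="ascending"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="task">
    <!-- position() reflects the sorted order; xsl:number formats it as 1, 2, 3 -->
    <xsl:number value="position()" format="1"/>
    <xsl:text>. </xsl:text>
    <!-- xsl:call-template invokes the named template below, like a subroutine -->
    <xsl:call-template name="show-name"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Named template: the current node is still the task that called it -->
  <xsl:template name="show-name">
    <xsl:value-of select="@name"/>
  </xsl:template>
</xsl:stylesheet>
```

Given a hypothetical input such as <tasks><task name="wash"/><task name="chop"/><task name="fry"/></tasks>, the output should be three lines: "1. chop", "2. fry", and "3. wash".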
To cement your (hopefully) growing understanding of XSLT syntax, here's a short but powerful XSLT stylesheet to peruse. Given the flat XML file named Order-ingredients-quesadilla.xml, you can create a comma-separated values (CSV) file (perfect for importing the information into a spreadsheet or database) using the XSLT stylesheet that follows it.
Example 1: Input: Order-ingredients-quesadilla.xml
<?xml version="1.0" encoding="UTF-8" ?>
<ingredients>
  <ingredient name="onion" type="any" qty="1" units="c" prep="sm dice"/>
  <ingredient name="cheese" type="cheddar" qty="1.5" units="c" prep="shredded"/>
  <ingredient name="beans" type="refried" qty="1.5" units="c" prep="none"/>
  <ingredient name="tortillas" type="wheat" qty="16" units="count" prep="none"/>
</ingredients>
To this file, we'll apply the following XSLT stylesheet, which outputs a CSV-formatted plain-text file:
Example 1: Stylesheet: CSV-xform-ingredients.xsl
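<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>
  <!-- One output line per ingredient: attribute values separated by ", " -->
  <xsl:template match="ingredient">
    <xsl:value-of select="@name"/><xsl:text>, </xsl:text>
    <xsl:value-of select="@type"/><xsl:text>, </xsl:text>
    <xsl:value-of select="@qty"/><xsl:text>, </xsl:text>
    <xsl:value-of select="@units"/><xsl:text>, </xsl:text>
    <xsl:value-of select="@prep"/>
    <!-- End the line with a line-feed character -->
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>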
The processor parses the input file and, for each ingredient element, picks up the attributes in the order listed and outputs each value as a string, separating the values with commas. After all attributes are handled, the final xsl:text instruction outputs a line-feed character (written as the numeric character reference &#10;). The output created looks like this:
Example 1: CSV Output produced
onion, any, 1, c, sm dice
cheese, cheddar, 1.5, c, shredded
beans, refried, 1.5, c, none
tortillas, wheat, 16, count, none
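Spreadsheets and databases often expect a header row. As a hedged sketch of one way to add it (a variation for illustration, not part of the original example), a second template can match the ingredients element, write the column names once, and then hand the ingredient children to a row template. It assumes the input is well-formed, with each ingredient element properly closed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <!-- Runs once for the document element: header line first, then the rows -->
  <xsl:template match="ingredients">
    <xsl:text>name, type, qty, units, prep&#10;</xsl:text>
    <xsl:apply-templates select="ingredient"/>
  </xsl:template>

  <!-- Runs once per ingredient: one comma-separated line of attribute values -->
  <xsl:template match="ingredient">
    <xsl:value-of select="@name"/><xsl:text>, </xsl:text>
    <xsl:value-of select="@type"/><xsl:text>, </xsl:text>
    <xsl:value-of select="@qty"/><xsl:text>, </xsl:text>
    <xsl:value-of select="@units"/><xsl:text>, </xsl:text>
    <xsl:value-of select="@prep"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With the quesadilla input, this should produce the line "name, type, qty, units, prep" followed by the same four ingredient lines in document order.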
Though it's just the tiniest tip of a huge berg of capability, this simple code fragment illustrates XSLT's real power pretty convincingly. It is as compact as the equivalent code in most programming languages, if not more so, and it does the job nicely.
In the next XSLT tutorial, we'll discuss the XSLT processing model and dig deeper into the underpinnings of terminology such as document tree, result tree, output document and more.
About the author
Ed Tittel is a full-time writer and trainer whose interests include XML and development topics, along with IT Certification and information security topics. E-mail Ed with comments, questions, or suggested topics or tools for review.
This was first published in July 2005