What is XPath Anyway?
The XML Path Language, or XPath, is a language that defines a syntax for locating items in an XML document. It was originally defined for use with XSL transformations, and most readers will have encountered it in that context. Java programmers recognized that XPath expressions could be very useful, and with the release of Java 1.5, XPath arrived in the standard toolkit in the javax.xml.xpath package.
Getting an Instance of XPath
As with many other APIs in JAXP, you get an instance of a working class by starting with a factory. Although it seems cumbersome, this architecture provides flexibility and allows for future expansion. In the following example, the parameter handed to the newInstance method says that we want to build XPath objects that work with the default W3C DOM model, the only one supported in Java 1.5.
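A minimal sketch of this step follows. The class and method names are illustrative only, and XPathFactory.DEFAULT_OBJECT_MODEL_URI is assumed here as the argument that selects the W3C DOM model.

```java
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFactoryConfigurationException;

class XPathSetup {
    // Builds an XPath object bound to the default W3C DOM object model.
    static XPath newXPath() throws XPathFactoryConfigurationException {
        // The URI constant names the W3C DOM model; the no-argument
        // XPathFactory.newInstance() selects the same model.
        XPathFactory factory =
                XPathFactory.newInstance(XPathFactory.DEFAULT_OBJECT_MODEL_URI);
        return factory.newXPath();
    }

    public static void main(String[] args) throws Exception {
        XPath xpath = newXPath();
        System.out.println("Got an XPath instance: " + (xpath != null));
    }
}
```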
Once you have an XPath object, there are two ways to put it to work. You can have it evaluate an expression or you can have it compile the expression to create an instance of XPathExpression that incorporates the expression logic and can be used repeatedly.
A Simple XPath Example
The first XML example I am going to use is the web.xml file for the example servlets in Tomcat 5.5.9. In the following statement, doc is a reference to the JAXP Document for the web.xml file.
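A minimal sketch of such a statement is shown here, assuming the expression /web-app/filter. The SAMPLE string is a tiny hypothetical stand-in, not the real Tomcat web.xml, and the class and method names are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class SimpleEvaluate {
    // Hypothetical stand-in for web.xml; the real file is much larger.
    static final String SAMPLE =
          "<web-app>"
        + "<filter><filter-name>First Filter</filter-name>"
        + "<filter-class>filters.FirstFilter</filter-class></filter>"
        + "<filter><filter-name>Second Filter</filter-name>"
        + "<filter-class>filters.SecondFilter</filter-class></filter>"
        + "</web-app>";

    // Parses an XML string into a JAXP DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text content of the first node matching the expression.
    static String firstFilterText(String xml) throws Exception {
        Document doc = parse(xml);
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The two-argument evaluate returns a String built from the
        // first matching node only.
        return xpath.evaluate("/web-app/filter", doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstFilterText(SAMPLE));
    }
}
```

With this stand-in document, the returned string concatenates the text of every child of the first filter element and ignores the second filter element.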
That is the text content extracted from the following section of the web.xml document. Note that the evaluation preserved all of the text content of all of the nodes contained in the first "filter" element found.
It is important to note that only the first node satisfying the expression contributed to the output. Returning the full text content of the first node is the default for that particular "evaluate" method call. Contrast that simple XPath statement with the number of org.w3c.dom.Node method calls which would be required to extract that text from 6 separate elements and you begin to see the attraction of working with XPath.
Evaluation for Different Content Types
There are four different XPath methods named "evaluate": two are defined as returning a java.lang.String and two as returning a java.lang.Object reference. Therefore, when writing a statement that uses evaluate, you may have to provide a specific type cast. The type of object returned by the Object-returning methods is selected by means of constants defined in the XPathConstants class.
For example, we can get all five of the filter nodes in the example web.xml document using the following statement.
NodeList nl = (NodeList) xpath.evaluate("/web-app/filter", doc, XPathConstants.NODESET);
The returned object implements the org.w3c.dom.NodeList interface. Note that although "NodeList" sounds like it should implement the java.util.List interface, it does not. The XPathConstants and the corresponding Java reference types that will be returned can be summarized as follows:
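The mapping can be sketched in code: NODESET yields a NodeList, NODE a Node, STRING a String, NUMBER a Double, and BOOLEAN a Boolean. The sample document, class name, and helper method below are hypothetical and exist only for illustration.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ReturnTypes {
    // Hypothetical two-filter document used only for this illustration.
    static final String SAMPLE =
          "<web-app>"
        + "<filter><filter-name>A</filter-name></filter>"
        + "<filter><filter-name>B</filter-name></filter>"
        + "</web-app>";

    // Evaluates an expression against the XML and returns the raw Object.
    static Object eval(String xml, String expr, QName type) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return XPathFactory.newInstance().newXPath().evaluate(expr, doc, type);
    }

    public static void main(String[] args) throws Exception {
        NodeList all = (NodeList) eval(SAMPLE, "/web-app/filter", XPathConstants.NODESET);
        Node first = (Node) eval(SAMPLE, "/web-app/filter[1]", XPathConstants.NODE);
        String name = (String) eval(SAMPLE, "/web-app/filter[2]/filter-name", XPathConstants.STRING);
        Double count = (Double) eval(SAMPLE, "count(/web-app/filter)", XPathConstants.NUMBER);
        Boolean many = (Boolean) eval(SAMPLE, "count(/web-app/filter) > 1", XPathConstants.BOOLEAN);

        System.out.println(all.getLength() + " " + first.getNodeName() + " "
                + name + " " + count + " " + many);
    }
}
```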
Now for a more complex example. The XML document source will be the server.xml file that Tomcat uses to define the service to be created and the connectors that will be exposed. Here are the pertinent XML elements. The real file is much larger.
The following code, where doc is an org.w3c.dom.Document containing server.xml, locates the Service element having the "name" attribute equal to "Catalina". Inside that element it finds the first Connector element and locates the attribute named "enableLookups". The text value of that attribute is then used to create a Boolean object which is returned.
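One way to sketch this lookup is shown below. The Server, Service, and Connector element names and the sample document are assumptions based on the usual layout of Tomcat's server.xml. The sketch reads the attribute as a string and converts it with Boolean.valueOf, because XPath 1.0 converts a node-set to a boolean by testing whether it is non-empty, so requesting XPathConstants.BOOLEAN for this path would report whether the attribute exists rather than what it says.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EnableLookups {
    // Hypothetical, heavily trimmed stand-in for Tomcat's server.xml.
    static final String SAMPLE =
          "<Server port='8005' shutdown='SHUTDOWN'>"
        + "<Service name='Catalina'>"
        + "<Connector port='8080' enableLookups='false'/>"
        + "<Connector port='8009' enableLookups='true'/>"
        + "</Service>"
        + "</Server>";

    // Returns the enableLookups setting of the first Connector inside
    // the Service named "Catalina".
    static Boolean firstConnectorLookups(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        String text = xpath.evaluate(
                "/Server/Service[@name='Catalina']/Connector[1]/@enableLookups", doc);
        // Convert the attribute text, not the node-set, to a Boolean.
        return Boolean.valueOf(text);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstConnectorLookups(SAMPLE));
    }
}
```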
Note that although the examples I have been using start with an org.w3c.dom.Document object, the evaluate method can apply an expression to any node in a document.
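To illustrate evaluation against a node other than the document, the sketch below first selects the filter elements and then applies a relative expression to each one. The sample document and the names used are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ContextNodeEvaluate {
    // Hypothetical stand-in for web.xml.
    static final String SAMPLE =
          "<web-app>"
        + "<filter><filter-name>First Filter</filter-name></filter>"
        + "<filter><filter-name>Second Filter</filter-name></filter>"
        + "</web-app>";

    // Collects the filter-name text of every filter element.
    static List<String> filterNames(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList filters = (NodeList) xpath.evaluate(
                "/web-app/filter", doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < filters.getLength(); i++) {
            Node filter = filters.item(i);
            // The relative path is resolved against the current filter
            // element rather than the document root.
            names.add(xpath.evaluate("filter-name", filter));
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(filterNames(SAMPLE));
    }
}
```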
Using XPathExpression Instances
Instead of calling the XPath evaluate method as in the first examples, you can build an XPathExpression instance that contains the expression and use it repeatedly. For example, we could reproduce the output from the first example with the following:
The intent of the XPathExpression class is to let the programmer define a suite of search expressions which can be reused, thus saving a bit of programming complexity.
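A short sketch of that reuse is given below, assuming the same /web-app/filter expression as before. The class name and the two small sample documents are hypothetical. The expression is compiled once and then applied to each document in turn.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ReusedExpression {
    // Two hypothetical documents that share the same structure.
    static final String DOC_A =
        "<web-app><filter><filter-name>Alpha</filter-name></filter></web-app>";
    static final String DOC_B =
        "<web-app><filter><filter-name>Beta</filter-name></filter></web-app>";

    // Compiles the expression once and evaluates it against each input.
    static String[] firstFilterTexts(String... xmlDocs) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        XPathExpression expr = xpath.compile("/web-app/filter");
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        String[] results = new String[xmlDocs.length];
        for (int i = 0; i < xmlDocs.length; i++) {
            Document doc = builder.parse(new InputSource(new StringReader(xmlDocs[i])));
            // The one-argument evaluate returns the String value of the
            // first matching node in this document.
            results[i] = expr.evaluate(doc);
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        for (String text : firstFilterTexts(DOC_A, DOC_B)) {
            System.out.println(text);
        }
    }
}
```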
Performance of the XPath Toolkit
Surely nobody would expect XPath, which is built on top of standard JAXP classes, to be faster than those classes. To get at the performance penalty for using XPath I timed the creation of XPathExpression instances and subsequent evaluation with an expression to get a NodeList of the nodes in a web.xml file. The Java statements required to do this (given an existing instance of XPath) are:
Using the methods in the org.w3c.dom package, this would be accomplished by code like the following to first get a NodeList containing the elements:
Followed by looping through the elements to get each element as the contents of a second NodeList:
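The two approaches being compared can be sketched side by side. This is an illustration only, assuming filter and filter-name as the elements of interest and a small hypothetical document; it does not reproduce the timed code and makes no claim about speed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomVersusXPath {
    // Hypothetical stand-in for web.xml.
    static final String SAMPLE =
          "<web-app>"
        + "<filter><filter-name>First Filter</filter-name></filter>"
        + "<filter><filter-name>Second Filter</filter-name></filter>"
        + "</web-app>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // XPath version: compile one expression, then evaluate it to a NodeList.
    static List<String> namesViaXPath(Document doc) throws XPathExpressionException {
        XPathExpression expr = XPathFactory.newInstance().newXPath()
                .compile("/web-app/filter/filter-name");
        NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return names;
    }

    // DOM version: get the outer elements, then loop to get the inner ones.
    static List<String> namesViaDom(Document doc) {
        List<String> names = new ArrayList<>();
        NodeList filters = doc.getElementsByTagName("filter");
        for (int i = 0; i < filters.getLength(); i++) {
            Element filter = (Element) filters.item(i);
            NodeList inner = filter.getElementsByTagName("filter-name");
            for (int j = 0; j < inner.getLength(); j++) {
                names.add(inner.item(j).getTextContent());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println(namesViaXPath(doc));
        System.out.println(namesViaDom(doc));
    }
}
```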
The timing results on my 1.4 GHz AMD Athlon CPU can be summarized as follows:
The other performance indicator of interest is the amount of memory used, so I measured the memory consumed by creating 1,000 instances of XPathExpression. This turned out to be very small, approximately 500 bytes per instance.
Apparently the convenience and flexibility of using XPath come with a considerable execution speed penalty. However, for many applications, programmers will be glad to accept a speed penalty in exchange for simplicity and flexibility. I think we can all be glad that XPath is now a part of the Java standard library.
References
The W3C's XPath Recommendation 1.0 is at:
http://www.w3.org/TR/xpath
The chapter on XPath in Elliotte Rusty Harold's book, "Processing XML with Java", is available online at:
http://www.cafeconleche.org/books/xmljava/chapters/ch16.html
The JavaDocs for the javax.xml.xpath package are available online at:
http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/xpath/package-summary.html