Will binary XML solve XML performance woes?
Guest Commentary
XML's blatant inefficiency is one oft-cited downside of anything XML-based, including Web services. Text-based, metadata-laden XML is intended both for machine processing and human readability, resulting in message sizes that can easily be 10 to 50 times larger than equivalent messages sent via binary encodings. To make matters worse, even a simple point-to-point exchange between XML-conversant endpoints might require each of the following operations: decryption, validation, parsing, marshalling, serialization, canonicalization, document signing and encryption. Each of these steps must be executed on a per-message basis, and as such can impose a significant load on processing machines.
Compounding the problem, XML traffic is content-oriented rather than protocol-oriented. As a result, devices responsible for performing any operation on XML traffic must make decisions based upon the content of the messages, rather than the protocols that underlie those messages. All of these operations impact XML processing performance, threatening to grind it to a halt.
While XML's verbosity and inefficiency may be acceptable for situations with moderate transaction volumes, XML's processing overhead, storage requirements, and bandwidth consumption become quite problematic when transaction volumes are high. As a result, companies increasingly require techniques for improving the performance of such critical tasks as content-based security, policy enforcement, malformed message protection, authorization and authentication, encryption and decryption, and schema validation of XML messages.
One emerging approach for improving the performance of XML processing treats XML as a binary format, rather than the text-based format so often maligned as the root of XML's inefficiency. Such an iconoclastic approach to XML flies in the face of the conventional wisdom about the benefits of text-based XML.
Nevertheless, binary XML is gaining some traction in the marketplace, and may help solve many of the performance problems that promise to swamp tomorrow's XML-laden networks.
Why aren't compression or encoding good enough?
Compression may solve the bandwidth issue, but it worsens the processing problem. In addition, GZIP is not type-aware and does not compress large sets of floating-point numbers well. Furthermore, compression and encoding formats like zip and base64 offer an "all or nothing" approach: once a message is encoded, the recipient must decode the entire message in order to work with any part of it. Much like compression, base64 and its equivalents require a first-pass encoding step as well as a decoding step once the file is received by the endpoint, so any marginal gains in network bandwidth are lost in processing time. Finally, it's quite likely that one might not even realize network performance gains using encoding, since encoded documents can often be much larger than the originals.
Moving and processing XML in binary
In this natively binary format, it's possible to transmit whole XML documents or stream them incrementally, without sacrificing performance. At the same time, the encoding software validates the documents as a side effect of the binary encoding mechanism. In effect, the binary XML format is a "pre-parsed" version of an XML document that an endpoint can readily consume, without any additional decompression or validation. The endpoint can also consume just the part of the message it needs, without needing to consume or parse the entire message. As a result, endpoints can process binary-encoded documents many times faster than the equivalent text-encoded XML files, and considerably faster than any other XML compression scheme. Another advantage of binary XML is the ability to handle data types in their native format.
For example, the binary encoding represents floating-point numbers natively, so the endpoint need not translate between strings and numbers, a conversion that eats into processing time. As a result of the pre-parsing and binary representation, binary XML promises performance improvements of several orders of magnitude over its text-based brethren.
The downside to binary XML
It is also not clear whether solving the parsing and transmission problems of XML will truly result in a significant overall performance increase. In many situations, XML processing represents only a small part of the overall processing load for a given XML message. Binary XML does not address the processing costs that result from security look-ups, semantic mapping, transformation, and other complex processing tasks, suggesting that binary XML might not be worth the trouble, since the processing bottleneck may be elsewhere.
Alternatives to binary XML
The ZapThink take
On the other hand, XML by itself is a great technology, but you need more than just XML to do anything important. Security, reliability, process, management, and loose coupling require more than just a document format language, leading to bloat, complexity, and vendor influence on the XML format. After all, business users simply want products that provide the benefits of business agility and IT asset reuse in the face of IT heterogeneity. Binary XML addresses the bloat of XML, but represents a movement away from the simpler roots of the language. So, who's right? And more importantly, will binary XML gain adoption as a solution to XML's performance challenges? At the end of the day, it is the technology consumers, and not the technology producers, that determine the viability of a technology.
Binary XML offers significant benefits in particular situations, including high transaction volume environments, the exchange of large documents, and interactions burdened with limited bandwidth and limited processing capability, such as on mobile phones and PDAs. However, the downsides of limited capabilities for Service intermediaries and a somewhat vendor-dependent implementation will limit binary XML's applicability in more fixed environments where interoperability trumps performance.
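The size and type-handling arguments made earlier in this commentary can be made concrete with a rough sketch. The Python below is only an illustration under stated assumptions: the payload of 1,000 readings is hypothetical, and the packed layout (a 4-byte count followed by 8-byte doubles) is an ad hoc stand-in, not any actual binary XML format. It compares the same numbers as text XML, as base64-encoded XML, and as natively typed binary, and shows a consumer reading one value from the binary form without touching the rest.

```python
import base64
import struct
import xml.etree.ElementTree as ET

# Hypothetical payload: 1,000 floating-point readings (illustrative values only).
readings = [i * 0.001 + 0.123456789 for i in range(1000)]

# Text-based XML: every value becomes a string wrapped in markup.
root = ET.Element("readings")
for value in readings:
    ET.SubElement(root, "reading").text = repr(value)
xml_bytes = ET.tostring(root, encoding="utf-8")

# Base64 of the XML text: 4 output bytes for every 3 input bytes,
# so the encoded message is about one third larger than the original.
b64_bytes = base64.b64encode(xml_bytes)

# Ad hoc binary layout (an assumption, NOT a real binary XML format):
# a little-endian 4-byte count followed by 8-byte IEEE 754 doubles.
packed = struct.pack(f"<I{len(readings)}d", len(readings), *readings)

# A text consumer must parse all the markup and convert each string to a float.
parsed = [float(element.text) for element in ET.fromstring(xml_bytes)]

# A binary consumer reads the count, then unpacks the doubles directly.
(count,) = struct.unpack_from("<I", packed, 0)
unpacked = list(struct.unpack_from(f"<{count}d", packed, 4))

# Partial consumption: fetch only the 500th reading by computing its offset.
(one_reading,) = struct.unpack_from("<d", packed, 4 + 8 * 499)

print("text XML bytes:  ", len(xml_bytes))
print("base64 XML bytes:", len(b64_bytes))
print("packed bytes:    ", len(packed))
```

The sketch says nothing about validation, namespaces, or interoperability, which are exactly the areas where the trade-offs discussed above come into play.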
Copyright 2004. Originally published by ZapThink LLC, reprinted with permission. ZapThink LLC provides quality, high-value, focused research, analysis, and insight on emerging technologies that will have a high impact on the way business will be run in the future.