With the success of multi-core architectures, enterprise applications can benefit from executing code in parallel. By definition, a multi-core microprocessor is one that combines two or more independent processors into a single package.
Setting hardware implications aside, we will look at the software implications of multi-core computing. The emergence of multi-core processors calls for revised thinking from a software perspective too. Applications won't be able to leverage processor enhancements unless the software is highly concurrent. Code components that can run concurrently benefit the most from multi-core architectures, and even multiple independent applications can benefit: with virtualization, for example, you can run each virtual machine independently of the others.
The basic steps in designing parallel applications are:
- Partitioning: Decomposing the design into smaller chunks.
- Communication: One chunk may require data from another chunk for its smooth execution. This information flow is specified in the communication phase.
- Agglomeration: Arriving at an algorithm that executes efficiently on a parallel computer. In this phase we combine, or agglomerate, the tasks identified in the partitioning phase into a smaller number of useful tasks.
- Mapping: Here, we specify where each task is to be executed.
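The four steps above can be sketched on a toy problem, summing a list of numbers. This is only an illustration under stated assumptions: the function names (`partition`, `agglomerate`, `parallel_sum`) and the task and worker counts are hypothetical, and a thread pool stands in for the parallel machine. In CPython, threads do not speed up CPU-bound work of this kind, so a process pool would be the more realistic mapping target.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, n_tasks):
    """Partitioning: split the input into up to n_tasks small pieces."""
    size = max(1, -(-len(data) // n_tasks))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def agglomerate(pieces, n_workers):
    """Agglomeration: merge the small pieces into one coarser task per worker."""
    groups = [[] for _ in range(n_workers)]
    for index, piece in enumerate(pieces):
        groups[index % n_workers].extend(piece)
    return groups


def partial_sum(chunk):
    """The work done independently on each agglomerated task."""
    return sum(chunk)


def parallel_sum(data, n_tasks=8, n_workers=2):
    pieces = partition(data, n_tasks)
    tasks = agglomerate(pieces, n_workers)
    # Mapping: the executor decides which worker thread runs each task.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, tasks))
    # Communication: the partial results flow back here to be combined.
    return sum(partials)
```

Calling `parallel_sum(list(range(101)))` adds the numbers 0 through 100 using two worker threads.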
The requirements of a business application do not lend themselves readily to either a grid or a parallel environment (Cory Isaacson, "Using intelligent parallel processing in a service oriented architecture"). The three main reasons are:
- Order of processing: Business logic must be performed in a particular sequence to ensure the integrity of a business process. Each transaction waits for the previous transaction to be completed before it gets processed. This order of processing is difficult to maintain in a parallel grid environment.
- Centrally shared resources: Applications typically depend on a central resource that is shared across the whole application. Contention for that resource creates a bottleneck.
- Unpredictable behavior and resource needs: The size and processing requirements of business processes vary throughout the day, or even within a given hour. This makes it difficult both to divide an application into equal-sized blocks and to allocate resources.
Multi-core computing in SOA
Service-oriented architectures are gaining popularity due to their inherent flexibility. SOA is an enterprise-driven, complex, managed, standards-based and highly customizable architecture designed to enable maximum flexibility and control. SOA is compositional, meaning new applications are built by plugging services together. However, SOA solutions need more computational resources because of their dependence on XML-like languages. SOA applications are distributed by nature and can therefore take advantage of multi-CPU and multi-server architectures. To truly benefit from SOA, a parallel processing approach to software design and implementation is required.
In the context of SOA, multi-core computing can be exploited in the following areas:
- Scalability via distributed instances of the same logical service
- Parallelizing service execution by detecting scope for parallelism
- Multi-service processes/service orchestration scalability by distributing different services over different threads
- In-situ separation of the different processing steps in SOA computing (marshalling, schema processing, validation and demarshalling) into separate threads to maximize service throughput
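As a minimal sketch of the third item above, the snippet below distributes two independent services over separate threads and combines their results. The services `check_inventory` and `check_credit`, their thresholds and the shape of the `order` dictionary are all hypothetical stand-ins for real service calls.

```python
from concurrent.futures import ThreadPoolExecutor


def check_inventory(order):
    # Hypothetical service: pretend at most 10 units are in stock.
    return order["quantity"] <= 10


def check_credit(order):
    # Hypothetical service: pretend the credit limit is 1000.
    return order["amount"] <= 1000


def orchestrate(order):
    """Run two independent services concurrently and merge their answers."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        inventory = pool.submit(check_inventory, order)
        credit = pool.submit(check_credit, order)
        # result() blocks until the corresponding service call has finished.
        return {"in_stock": inventory.result(), "credit_ok": credit.result()}
```

The two calls can overlap only because neither depends on the other's output; services that must run in a fixed order would still have to be chained sequentially.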
Parallel processing in XML
XML, SOAP and WSDL are major industry standards used to build SOA applications, and XML has emerged as the lingua franca of SOA. However, processing XML is computationally demanding. A number of optimization techniques have been developed to address the performance problem, but none of them is quite satisfactory, and XML has turned out to be the major bottleneck for SOA applications. With the emergence of multi-core architectures, concurrent processing could be one way to improve XML processing performance.
The XML data model has gained huge popularity because of its ability to represent a wide variety of structured (tabular) and unstructured (textual) data, and to integrate heterogeneous data sources such as email messages, Web pages and traditional relational databases for display on a variety of devices, namely computers, mobile phones and PDAs. Because of this popularity, parallel processing of XML is an active research area in both academia and industry. One solution is to use software pipelining. In this approach, XML parsing is divided into a number of stages, and each stage is executed by a different thread. This approach provides greater speed, but it is hard to implement because of synchronization, load balancing and memory access costs. A related approach is the data-parallel approach, in which an XML document is divided into a number of chunks and each thread works on its own chunk separately.
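The software pipelining idea can be illustrated with a small sketch in which each stage runs in its own thread and passes its output to the next stage through a queue. This is a simplification made for illustration: the stages here operate on whole XML messages (parse, then extract a field) rather than on phases inside a single parser, the input is assumed to be well-formed, and the names `stage`, `run_pipeline` and the `<id>` element are hypothetical.

```python
import queue
import threading
import xml.etree.ElementTree as ET

_DONE = object()  # sentinel that tells a stage to shut down


def stage(func, inbox, outbox):
    """Apply func to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))


def extract_id(root):
    return root.findtext("id")


def run_pipeline(messages):
    raw, parsed, out = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=stage, args=(ET.fromstring, raw, parsed)),
        threading.Thread(target=stage, args=(extract_id, parsed, out)),
    ]
    for worker in workers:
        worker.start()
    for message in messages:
        raw.put(message)
    raw.put(_DONE)

    results = []
    while True:
        item = out.get()
        if item is _DONE:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results
```

While the second stage extracts the field from one message, the first stage can already be parsing the next one. The queues are the synchronization points, which is where the costs mentioned above show up.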
The main technical challenge here is how to decompose the XML data and how to compute a good, if not optimal, allocation of the pieces. An XML document can be divided into equal-sized chunks by simply treating it as a sequence of characters. However, this means that a chunk may begin in the middle of some string (a tag name, an attribute name, element content, etc.) whose grammatical role is unknown. This can be resolved by backtracking and communication between threads, but doing so increases the computational overhead. A further problem is that, because XML is a tree-structured data model, such a division cuts the tree into arbitrary branches, and merging the results back together is difficult. Decomposing the XML document along its logical structure would be more efficient than equal-sized decomposition (Wei Lu, Kenneth Chiu, Yinfei Pan, "A parallel approach to XML parsing", Grid computing, 7th IEEE/ACM international conference, September 2006).
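The difference between the two decompositions can be seen on a tiny document. The sketch below is not the method from the cited paper; it assumes a flat document whose root holds repeated, non-nested `<order>` elements without attributes, so that a regular expression is enough to find logical boundaries. The document `DOC` and all function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample document: four flat <order> elements under one root.
DOC = "<orders>" + "".join(
    f"<order><id>{i}</id></order>" for i in range(4)
) + "</orders>"


def equal_chunks(text, n):
    """Equal-sized decomposition: cut the character sequence into n slices."""
    size = max(1, -(-len(text) // n))  # ceiling division
    return [text[i:i + size] for i in range(0, len(text), size)]


def logical_chunks(text):
    """Logical decomposition, assuming flat, attribute-free <order> elements."""
    return re.findall(r"<order>.*?</order>", text, flags=re.S)


def is_well_formed(fragment):
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


def parse_ids(text):
    """Parse each logical chunk on a worker thread, then merge in order."""
    with ThreadPoolExecutor() as pool:
        roots = list(pool.map(ET.fromstring, logical_chunks(text)))
    return [root.findtext("id") for root in roots]
```

With this sample, none of the three slices from `equal_chunks(DOC, 3)` parses on its own, because every cut lands inside an element, whereas each logical chunk is a complete subtree whose result can be merged by simple concatenation. For real documents, finding the logical boundaries requires a preliminary scan of the structure, which a regular expression cannot do in general.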
Tools and support needed for the vision
- XML parallelism detection tools
- Code analysis tools for XML programs/parsers
- Parallel Parsers for XML
- XML/XQuery decomposition tools for exploiting parallelism
- Tools for automatic parallel XML processing code generators
SOA can benefit from multi-core architectures. However, a set of associated tools and programmer-usable artifacts must be created and popularized so that the burden of writing parallel XML processing code does not fall on the programmer.
About the author
Deepti Parachuri is currently working with the SOA/WS Centre of Excellence at SETLABS, the research wing of Infosys Technologies Ltd. She holds an MS degree from IIT-Madras, India. Her major research area has been the Semantic Web, spanning RDF, OWL, OWL-S and agent technologies. She has also analyzed XML schema compression methodologies for use in SOA-based applications.
This was first published in April 2007