Early in the development of Web services and SOA, application developers began to focus on how
to integrate various applications and their elements – databases are perhaps the best example.
Nowadays, there is a wealth of data services available, including products from the likes of Thomson
Reuters, Dun & Bradstreet, and many others.
"It is an evolving and expanding world that includes giants and many niche services, not only for
financial data but things like weather, resources, and technology," says Noel Yuhanna, an analyst
at Forrester Research. These "data services" are typically wrapped in XML, combined, and then
imported into various applications. To further facilitate growing traffic, a range of niche players
have emerged, complemented by IT giants such as Microsoft and IBM.
According to Fawaad Khan, an integration architect in
Accenture's global SOA practice, there are multiple challenges when a company wants to harness an
outside commercial Web data service (or even a publicly available data source).
"The foremost issue in using commercial Web data services is the effort required to validate the
quality and reliability of the data being received upfront," says Khan.
It can be very tempting to use third-party Web services to get, say, a list of items and their
prices for an employee purchasing catalog, only to discover data quality issues midway through
testing.
Another major risk involved in using commercial Web data services, according to Khan, is the
provider's financial and organizational stability as a business entity. "Performing an appropriate
amount of due diligence in the beginning on a commercial Web services data provider is a prudent
strategy to avoid potential rework later," says Khan.
Internally, Khan says, you should design your enterprise application in such a way that it doesn't
lose all its functionality if the commercial Web data service is unavailable for some reason.
Consider using caching technologies, where feasible, to minimize external dependencies and
potentially increase application performance and responsiveness. Also be sure your integration
architecture provides isolation via well-defined interfaces. "You shouldn't have to change your
enterprise application if the transport or the messaging protocol changes—for example from SOAP to
RESTful—in your commercial Web services," he says.
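
As a rough illustration of the caching and isolation advice, the Python sketch below hides the
provider call behind a single get_price method and falls back to the last cached value when the
provider cannot be reached. The class name, the fetch callable, and the five-minute default
lifetime are assumptions made for illustration; none of them comes from Khan or from any
particular product.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CachedPriceSource:
    """One stable interface over any price lookup, with a stale-cache fallback."""

    def __init__(
        self,
        fetch: Callable[[str], float],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch      # transport-specific call (SOAP, REST, ...), hypothetical
        self._ttl = ttl_seconds  # how long a cached value counts as fresh
        self._clock = clock      # injectable so the behavior can be tested
        self._cache: Dict[str, Tuple[float, float]] = {}  # item -> (price, stored_at)

    def get_price(self, item_id: str) -> Optional[float]:
        cached = self._cache.get(item_id)
        now = self._clock()
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]  # fresh enough: skip the external call entirely
        try:
            price = self._fetch(item_id)
        except Exception:
            # Provider unavailable: degrade to the last known value, if there is one.
            return cached[0] if cached is not None else None
        self._cache[item_id] = (price, now)
        return price
```

Under these assumptions, moving from a SOAP client to a REST client would mean passing a different
fetch callable; code that calls get_price would not need to change.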
Looking externally, the first step is usually to analyze the APIs for the Web data services of
interest to understand not only their inputs and outputs but also what type of data validations, if
any, are required to successfully invoke the Web services; and how to process exceptions in the
event that the invocation is not successful.
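
The same analysis can be carried into code. The sketch below, with a hypothetical fetch_price
function and a made-up "price" field, checks the input before invoking the service, turns any
invocation failure into one application-level exception, and checks the output before it reaches
internal systems. A real provider's API documentation would dictate the actual rules.

```python
from typing import Any, Callable, Mapping


class DataServiceError(Exception):
    """The external call failed or returned data we cannot use."""


def fetch_price(item_id: str, call_service: Callable[[str], Any]) -> float:
    # Validate the input locally so malformed requests never reach the provider.
    if not item_id or not item_id.isalnum():
        raise ValueError(f"invalid item id: {item_id!r}")
    try:
        payload = call_service(item_id)
    except Exception as exc:
        # Translate transport-level failures into one application-level error.
        raise DataServiceError(f"lookup failed for {item_id}") from exc
    # Validate the output before it enters internal systems.
    price = payload.get("price") if isinstance(payload, Mapping) else None
    if isinstance(price, bool) or not isinstance(price, (int, float)) or price < 0:
        raise DataServiceError(f"unusable price for {item_id}: {price!r}")
    return float(price)
```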
Second, says Khan, who has helped many clients design and implement enterprise solutions based on
SOA and Web technologies, you have to consider the available messaging formats (like REST, SOAP,
and Ajax) and the transport protocols, and design both your Web services and technology platform
for invoking the external Web data services.
Then, he notes, "You may also have to design and implement the integration layer between your
internal Web services and the enterprise application of interest if it doesn't natively support Web
services integration."
Khan says invoking commercial Web data services with open, standards-based or de-facto standard
technologies like HTTP, XML, JSON, Ajax, and so on, is preferable to using proprietary tools and
specialized technology or application connectors. This process can be simplified, he says, by
employing enterprise service bus (ESB) applications.
Khan views an ESB as an implementation of an architectural pattern to support integration of data
between Web service providers and consumers. An ESB can enable cross-enterprise Web data services
by supporting various Quality of Service (QoS) requirements, including guaranteed delivery,
security, audit/logging, mediation, and transformation, such as going from HTTP externally to JMS
internally. In other words, an ESB serves to decouple the consumer from the provider of Web
services, allowing both to evolve as requirements change.
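
To make the transformation idea concrete, here is a minimal Python sketch of one mediation step:
converting a provider's XML message into a JSON shape used internally. It is not an ESB, and the
catalog, item, sku, and price names are invented for the example; a production bus would add
routing, delivery guarantees, and security around a step like this.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_internal(xml_text: str) -> str:
    """Map a provider's XML catalog (hypothetical schema) to internal JSON."""
    # Note: for untrusted XML, Python's documentation recommends a hardened
    # parser; the standard-library parser is used here only for brevity.
    root = ET.fromstring(xml_text)
    items = []
    for item in root.findall("item"):
        price_text = item.findtext("price")
        if price_text is None:
            raise ValueError("item is missing a <price> element")
        items.append({"sku": item.get("sku"), "price": float(price_text)})
    return json.dumps({"items": items}, sort_keys=True)
```

Because consumers see only the internal JSON shape, a provider-side format change would be
absorbed in this one function.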
Thus, an ESB can provide a platform for more easily implementing various integration design
patterns for Web services. For instance, any of the leading ESB products can perform security, data
validation, dynamic routing, and transformation—both at the transport and message level—out of the
box. These capabilities, if needed, should be acquired from either an open source community or a
commercial vendor but not custom built, he advises.
Khan says it is key to thoroughly performance-test your Web services vendor against the response
times promised in its service level agreement (SLA). Testing for scalability based on the size of
your user base is also important, especially when integrating with mission-critical enterprise
systems. From an operations perspective, capturing audit information such as incoming data
elements, their values, and response times for the commercial Web services makes troubleshooting
and issue resolution more efficient.
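
One way to capture that audit information is a thin wrapper around each external call. The Python
sketch below records the request parameters, the outcome, and the elapsed time using the standard
logging module. The logger name and the audited_call helper are assumptions for illustration, and
a real deployment would need to decide which values are safe to log.

```python
import logging
import time
from typing import Any, Callable, Dict

# Hypothetical logger name; the application must still configure handlers and levels.
audit_log = logging.getLogger("webdata.audit")


def audited_call(service_name: str, call: Callable[..., Any], params: Dict[str, Any]) -> Any:
    """Invoke an external service and log its inputs, outcome, and response time."""
    start = time.perf_counter()
    outcome = "error"
    try:
        result = call(**params)
        outcome = "ok"
        return result
    finally:
        # Runs on success and on failure, so every invocation leaves a record.
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        audit_log.info(
            "service=%s params=%s outcome=%s elapsed_ms=%.1f",
            service_name, params, outcome, elapsed_ms,
        )
```

The recorded response times can then be compared against the figures in the vendor's SLA.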
Finally, he adds, "Don't underestimate the complexity of the effort required for implementing
security functions such as authentication and authorization, which are particularly challenging
with external Web services," especially if you need to provide granular access to some functions
while restricting others.
Enterprise mashups make sense of Web data services
With another view of the connection challenges, Dana Gardner, principal analyst at Interarbor
Solutions, says issues mostly boil down to the types of data and the choice of Web standards.
However, those issues are further compounded by the emerging challenge of the volume, size, and
complexity of the data sets.
"For instance, the flood of XML data coming in is often much greater than would have been handled
by internal applications -- so it requires a different approach technically to mine, search, and
manage that data," he said. Fundamentally, it can be an issue of scale and scalability.
Gardner says BI and analytic tools are coming into the market that can handle these ever-larger
data sets. Also emerging are new tools to simplify and automate more of the importation process. As
an example, Gardner cites Kapow Technologies, which provides a means for easily acquiring data from
the Web interface layer. "This is a technology that allows you to overcome issues such as format
and bring data directly into a mining activity or other application," he says.
"Kapow is ETL [extract, transform, and load] for Web data," says Ron Yu, vice president of marketing
at Kapow Technologies. "We have our own proprietary browser and JavaScript engine so we can access
data in the same way you would see it. We eliminate the misunderstandings the knowledge worker
would have with IT," he adds. Yu says Kapow also provides a Windows-based client development
environment that supports live loading of HTML and XML data through a viewer – data can be
extracted with a point-and-click system. "You can apply business rules to a transformation
capability that can further massage the data prior to actually loading it."
For its part, Composite Software, Inc., provides data virtualization. The company recently
announced a joint solution with Kapow to accelerate the integration of Web data in large-scale data
virtualization environments. The joint solution is called Composite Application Data Services for
Web Content, and is powered by the Kapow Web Data Server. Robert Eve, executive vice president at
Composite, says that as a data virtualization middleware company, Composite offers an alternative
or complement to an ETL-to-data warehouse style of integration. "We do those steps in one shot with a
view and a pull method built on our high performance query capability."
Among other players is Denodo, which provides a data integration and data virtualization software
platform called Denodo Platform, as well as support, training, and consulting services. In the
Forrester Wave Information-As-A-Service, Q1 2010 study, Yuhanna and his co-author described Denodo
as delivering "…simplified, low-cost, rapid deployment of data services with options to scale to
enterprise-class performance, reliability and scalability."
Outside data meets inside apps
In the report, Yuhanna says large, established players such as Microsoft (BizTalk Server, Microsoft
SQL Server), IBM (InfoSphere Information Server), Informatica (Informatica Platform for data
services), and Red Hat (MetaMatrix Enterprise Data Services Platform, JBoss SOA Platform) offer the
broadest products and still dominate the market in terms of revenue. However, he notes, the niche
players like Kapow, Composite, and Denodo are growing rapidly by offering easier and more automated
ways to connect outside data with internal applications.
Fundamentally, he adds, the whole information-as-a-service phenomenon is part of the larger inter-
and intra-enterprise trend to mash up applications and data more freely and more frequently. As
such, it will continue to grow in importance.
The trend line is clear – Web data services are growing in importance, and even if you aren't using
them now, you probably will. "I'm convinced we will see more services coming in from the Web
specifically because companies that specialize in providing data or services have a distinct
competitive advantage," says Mike Karp, a vice president at Ptak, Noel and Associates, a consulting
and analyst firm. Indeed, Karp says, in many cases the Web data could potentially be routed to
another external provider for analysis prior to delivery to your own internal applications.
"With that kind of arrangement many of the potential technical challenges of working with Web data
services can be reduced to a service level agreement – in other words, they become someone else's
problem," he says.
Although Web data services can potentially come from anywhere and be delivered to anywhere, Karp
adds one final caution: geography still matters. If you need a service or data supplied at high
speed, it helps if the provider is close, in order to avoid latency, he says.
