The tantalizing prospect of cloud computing is changing the way people in IT think. Instead of
massive and ever-growing data centers, it may now be possible to simply tap into potentially
unlimited resources residing externally, in the "cloud." Of course, these visions have already
begun to take tangible form in cloud services such as Amazon EC2 and Microsoft Azure. However,
according to analysts and others, the potential of the cloud -- at least for data-intensive
applications -- will be limited without a crucial enabling technology: distributed data grids.
A distributed data grid, also called a distributed data cache, operates between the database and
the application's in-memory data, providing a temporary repository for data that enhances
performance by improving access speed and eliminating bottlenecks.
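The mechanics can be sketched in a few lines. Below is a minimal, illustrative read-through cache in Python (the class and names are hypothetical, not any vendor's API): on a miss the cache falls through to the backing database and keeps the result in memory for subsequent reads.

```python
import time

class ReadThroughCache:
    """Minimal sketch of a cache sitting between an application and its
    database. All names here are illustrative, not a real product's API."""

    def __init__(self, backing_store, ttl_seconds=60):
        self.store = backing_store   # a dict stands in for the database
        self.ttl = ttl_seconds
        self._cache = {}             # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._cache.get(key)
        if entry and entry[1] > time.time():
            return entry[0]          # cache hit: no database round-trip
        value = self.store[key]      # cache miss: fall through to the database
        self._cache[key] = (value, time.time() + self.ttl)
        return value

db = {"user:42": {"name": "Ada"}}
cache = ReadThroughCache(db)
print(cache.get("user:42"))          # first read misses and loads from db
print(cache.get("user:42"))          # second read is served from memory
```

A real data grid distributes `_cache` across many servers; the point here is only the placement of the cache between the application and the database.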
|"... people hear about the opportunity to have more instances in
the cloud, but more instances doesn't necessarily translate into more performance ..."
Mike Gualtieri, Analyst, Forrester Research
Analyst Mike Gualtieri and his colleague at Forrester Research, John R. Rymer, have proposed an
additional term -- elastic caching -- which captures a particularly useful characteristic of some
data grids. Their recent report, The Forrester Wave: Elastic Caching Platforms, Q2 2010,
describes the technology and some of the key vendors in the space.
Gualtieri says it is important to recognize that the concept of data caching covers a range of
solutions. A distributed cache, in the simplest terms, is one that operates across multiple nodes.
The reason Gualtieri terms some types of distributed cache elastic is that they can add and remove
nodes while running. "And we think that is important and more descriptive of the defining
characteristics of a data grid," he says.
By contrast, there are a number of potent but non-elastic distributed caching schemes, one of which
is Memcached, an open source caching product widely used at Facebook and other Web properties.
"Memcached is distributed but not elastic. You can decide that you have enough data to require
eight servers or 80 servers, but if it turns out you need more or fewer, you have to shut down in
order to add or remove them," he explains.
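The shutdown requirement follows from how a static distributed cache typically partitions its keys. The quick calculation below is illustrative (it is not Memcached's actual client code, though naive modulo hashing was historically common in clients): changing the server count remaps almost every key, so the cache must effectively be rebuilt.

```python
import hashlib

def server_for(key, n_servers):
    # Static partitioning: each key maps to hash(key) mod n_servers.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % n_servers

keys = [f"user:{i}" for i in range(10_000)]
# Count how many keys land on a different server when an 8-node
# cluster grows to 9 nodes. With modulo placement, nearly all do.
moved = sum(server_for(k, 8) != server_for(k, 9) for k in keys)
print(f"{moved / len(keys):.0%} of keys change servers going from 8 to 9 nodes")
```

With uniform hashing, roughly (n)/(n+1) of the keys move, which is why a resize amounts to a cold restart of the cache.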
Many people associate cloud computing with scale, notes Gualtieri. Certainly the cloud allows you
to scale instances of machines -- but you can't easily scale applications and data in the cloud,
because applications and data haven't been architected to take advantage of the "extra capacity."
Likewise, if you think of a typical relational database packed with customer or order information,
when it comes to the cloud, that database becomes your bottleneck. "If you are getting more and
more transactions against that database you can try to speed it up by adding five more servers, but
how do you split the data? You can't," says Gualtieri. "So elastic caching is really interesting
because it has a huge impact for the cloud -- it is a solution for scaling data," he adds.
Because of this elasticity, nodes can be added in real time. If you start with four servers and add
four more, these platforms will rebalance the data fairly evenly across all eight nodes; and
because they replicate the data, the loss of any single node does not take you down. "So elastic
caching also provides fault tolerance and high availability at a fraction of the cost of what it
would take just to re-architect a database," he adds.
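One common way elastic platforms achieve this rebalancing is consistent hashing, in which each node owns many small slices of a hash ring. The toy Python sketch below (illustrative, not any vendor's actual algorithm) shows the key property: adding a fifth node relocates only roughly a fifth of the keys, rather than nearly all of them.

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self.ring = []                     # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def _hash(self, s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def add(self, node):
        # Each node owns `vnodes` points on the ring, so a new node
        # takes a small, even slice from every existing node.
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def node_for(self, key):
        # A key belongs to the first ring point clockwise from its hash.
        i = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[i % len(self.ring)][1]

keys = [f"order:{i}" for i in range(10_000)]
ring = HashRing(["n1", "n2", "n3", "n4"])
before = {k: ring.node_for(k) for k in keys}
ring.add("n5")                             # node added while "running"
moved = sum(before[k] != ring.node_for(k) for k in keys)
print(f"only {moved / len(keys):.0%} of keys relocate after adding a fifth node")
```

Replication (keeping each key on the next one or two nodes clockwise as well) is what turns this placement scheme into the fault tolerance Gualtieri describes.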
According to Gualtieri, the quest to deliver cloud scalability has also spawned a few other
variations, notably the NoSQL movement. "At first glance it sounds like an attempt to get rid of
SQL, but the term actually stands for Not Only SQL," he says. Of course, he notes, traditional
relational databases are great at transactional integrity; they always provide a consistent view
of the data.
By contrast, notes Gualtieri, the NoSQL crowd talks about a concept called eventual
consistency. For example, when someone does an update on Twitter or Facebook, it isn't absolutely
necessary that every user on the internet sees it that second -- as long as it arrives eventually.
"It isn't like decrementing $100, you need a relational database for that," says Gualtieri.
For all the data that doesn't need absolute timeliness or consistency, NoSQL can provide that
eventual consistency. "You give up some of the transactional integrity but what you get is an
inexpensive way to scale a large amount of non-transactional data," he says.
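Eventual consistency can be illustrated with a last-write-wins scheme, one simple approach some NoSQL stores use (real systems add vector clocks, read quorums, and more). In this hypothetical Python sketch, two replicas briefly disagree about a status update and then converge after a sync.

```python
class Replica:
    """Toy replica using last-write-wins timestamps. Illustrative only;
    production systems use richer conflict-resolution schemes."""

    def __init__(self):
        self.data = {}                     # key -> (value, logical timestamp)

    def write(self, key, value, ts):
        current = self.data.get(key)
        if current is None or ts > current[1]:
            self.data[key] = (value, ts)   # the newer write wins

    def merge(self, other):
        # Anti-entropy sync: pull the other replica's entries through
        # the same last-write-wins rule, so both sides converge.
        for key, (value, ts) in other.data.items():
            self.write(key, value, ts)

a, b = Replica(), Replica()
a.write("status", "posting...", ts=1)      # an update lands on replica A
b.write("status", "posted!", ts=2)         # a later update lands on replica B
print(a.data["status"][0])                 # a reader of A briefly sees stale data
a.merge(b)
print(a.data["status"][0])                 # after the sync, A has the latest value
```

This is exactly the Twitter/Facebook trade-off Gualtieri describes: readers may briefly see an older value, but every replica ends up with the final one.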
Historically, notes Gualtieri, NoSQL grew out of attempts by Amazon and eBay to master issues of
scale. "What has happened over the years is that these technologies and similar ones have made
their way into open source projects," he says. One of these, Cassandra, is an open source NoSQL
database that "is very much like elastic caching in that the data is distributed, spread across
multiple nodes, and it is fault tolerant," he explains. However, he adds, NoSQL is in general not
as well defined or developed as elastic caching -- and most NoSQL products are open source.
Elastic caching products
Coming back to the elastic caching vendors, Gualtieri's report pegs IBM (WebSphere eXtreme Scale),
Terracotta (Ehcache FX edition), GigaSpaces (XAP), and Oracle (Coherence) as leaders in the market.
Cameron Purdy, vice president of development at Oracle, says "The goal with Coherence is to
dramatically simplify the usage of the software and the learning curve for building, deploying and
operating data grids, regardless of their size."
|"Any time an application is running on more than one server, it
has state that it has to manage across those servers, and that is what a data grid enables."
Cameron Purdy, vice president of development at Oracle
Purdy reiterates the point that if a cloud is to achieve true capacity-on-demand, it needs to be
able to shift its server footprint by tens, hundreds or even thousands of servers. "Any time an
application is running on more than one server, it has state that it has to manage across those
servers, and that is what a data grid enables, and specifically that is what Coherence enables,"
he says.
Purdy says Oracle Coherence is not a relational database management system (RDBMS). Instead,
Coherence manages application data in the form that the application works with that data, such as
"objects" in languages such as Java, C# and C++. Oracle Coherence manages live application data
(application state, sessions, caches, etc.) in memory, using multiple servers to provide both
scalability and availability in managing that application data. Furthermore, Coherence does so
automatically as the server footprint grows and shrinks, all without loss of data or interruption
of service.
The Forrester report also named a second tier of "strong performers," namely GemStone Systems
(GemFire), Alachisoft (NCache), and ScaleOut Software (StateServer).
William Bain, founder and CEO of ScaleOut Software, and a veteran of the parallel computing
industry, says "We combine distributed caching with parallel data analysis." Bain says elasticity
is central to the idea of distributed caching. "What distributed caching offers is the ability to
have lots of threads on lots of servers accessing a common pool of data -- and as applications grow
in scope they have the ability to have scalable storage," he says.
Like Gualtieri, Bain says Memcached "runs out of steam" when data is being rapidly updated as well
as read, as occurs with e-commerce shopping carts. "In a distributed cache it can be read and
updated by many web servers and web farms -- and it can scale," he says.
Advice on data caching
Bain says CTOs and their architects should make an effort to come up to speed on distributed
caching and data grids and understand the power they provide in scaling application state. "By
becoming savvy on this technology they will be able to move applications to the cloud and scale
seamlessly across a large pool of virtual servers," he says. On the other hand, Bain predicts that
those who don't take advantage of distributed caching will find it is hard to eliminate all the
bottlenecks to scalability and storage in multiserver environments.
Expanding on that concept, Purdy says in order for an application to take full advantage of a data
grid, the application should have a strong domain model. In other words, the information that the
application consumes, creates and uses internally to run should be expressed as -- and managed as
-- a set of defined entities with well-understood relationships among those entities.
"In modern object oriented languages such as Java, these entities are typically implemented as Java
classes, and are often referred to simply as domain objects," he notes. According to Purdy,
applications that have a strong domain model tend to be very easy to move to a data grid, while
applications that do not encapsulate data access and representation -- such as those that sprinkle
direct SQL database access throughout the application -- are difficult to adapt to data grids.
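The difference can be sketched in Python (an illustrative repository pattern, not Coherence's actual API, and all names here are hypothetical): the domain object and the code that stores it are encapsulated in one place, so swapping an in-process dict for a data grid client touches a single class rather than SQL calls scattered through the application.

```python
from dataclasses import dataclass

@dataclass
class Order:
    """A domain object: the unit a data grid stores and replicates."""
    order_id: str
    customer: str
    total: float

class OrderRepository:
    """Encapsulated data access. Replacing `grid` with a real data grid
    client changes only this class, not the rest of the application."""

    def __init__(self, grid):
        self.grid = grid                   # any key -> object store

    def save(self, order):
        self.grid[f"order:{order.order_id}"] = order

    def find(self, order_id):
        return self.grid.get(f"order:{order_id}")

repo = OrderRepository(grid={})            # an in-process dict stands in for the grid
repo.save(Order("1001", "Acme", 250.0))
print(repo.find("1001").customer)          # prints "Acme"
```

An application that instead embedded SQL strings at every call site would have to rewrite each of them to adopt a grid, which is Purdy's point about strong domain models.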
Summarizing, Gualtieri says his general advice for anyone considering cloud computing is to ask
whether the application architecture is elastic, and then ask three follow-up questions. First,
how can you scale your database? Second, how can your data -- and your application code -- shrink
and grow to take advantage of the cloud? Third, what is your performance strategy in the cloud?
"The reason I mention performance is because people hear about the opportunity to have more
instances in the cloud, but more instances doesn't necessarily translate into more performance
because you won't necessarily know what platforms you are running on," he says. In fact, he advises
planning to do load and performance testing in the cloud. "You can't assume that just because it is
in the cloud it will perform better -- start with the data first, because that is going to be a
bottleneck," he adds.