
MapReduce moves from secret Google goo to enterprise architecture - Part 1

By Rich Seeley
14 Sep 2008 | SearchSOA.com


Read Part 2

MapReduce, a parallel programming and data execution architecture that Google and others have used to churn massive amounts of Web data, is now ready to move into the enterprise world. GreenPlum and Aster Data are early to the fray.

MapReduce is loosely related to 'map' and 'reduce' functions associated with so-called functional programming methods. Google's MapReduce has a counterpart in the Hadoop open-source Java MapReduce implementation. Now, two commercial MapReduce implementations have come forward from GreenPlum and Aster Data.
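The link to functional 'map' and 'reduce' can be made concrete with a small, single-machine sketch. This is not Google's, Hadoop's, GreenPlum's, or Aster Data's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. A real framework would run the map and reduce steps in parallel across many machines, which this sketch does not attempt.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, the job a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: fold all values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values, 0)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the cat", "The dog"]))
```

Because each call to `map_phase` looks at one document and each call to `reduce_phase` looks at one key, the calls are independent of one another, and that independence is what lets a framework spread them over many processors.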

"MapReduce sits on top of the global distributed file system Google has built, and it allows Google developers to write parallel applications that make use of the data," explained Scott Yara, president and co-founder of GreenPlum.

Curt Monash, analyst and strategic advisor to the software industry, was early to write on GreenPlum and Aster Data. He said that MapReduce could be useful in three categories of applications, which he dubs: text tokenization, indexing, and search; creation of other kinds of data structures (e.g., graphs); and data mining and machine learning.
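As an illustration of the first category (text tokenization, indexing, and search), here is a minimal sketch of building an inverted index in the map/reduce style. It is an assumption-laden toy, not anything Monash or the vendors describe: the names `map_tokens`, `reduce_postings`, `build_index`, and `search` are hypothetical, the tokenizer only splits on whitespace and trims common punctuation, and everything runs in a single process.

```python
from collections import defaultdict


def map_tokens(doc_id, text):
    """Map step: emit a (token, doc_id) pair for each token in one document."""
    for raw in text.lower().split():
        token = raw.strip(".,;:!?\"'()")  # naive punctuation trimming
        if token:
            yield token, doc_id


def reduce_postings(token, doc_ids):
    """Reduce step: collapse all doc ids for one token into a sorted posting list."""
    return token, sorted(set(doc_ids))


def build_index(corpus):
    """Group mapped pairs by token, then reduce each group to a posting list."""
    grouped = defaultdict(list)
    for doc_id, text in corpus.items():
        for token, emitted_id in map_tokens(doc_id, text):
            grouped[token].append(emitted_id)
    return dict(reduce_postings(t, ids) for t, ids in grouped.items())


def search(index, term):
    """Look up one term; returns the ids of documents that contain it."""
    return index.get(term.lower(), [])


if __name__ == "__main__":
    corpus = {1: "Parallel data processing.", 2: "Data warehouses, data marts"}
    print(search(build_index(corpus), "data"))
```

Tokenization happens in the map step and index construction in the reduce step, so search is just a dictionary lookup against the finished index.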

"MapReduce is most useful when you have lots of data to analyze," Monash told SearchSOA. "A lot of the use cases are at the biggest Web companies. But anybody with large analytic data processing tasks, anybody with data warehouses in the hundreds of terabytes should take a look at it."

Although it is useful in other data-intensive applications, data processing tasks that require massively parallel processing are ideally suited to the capabilities of MapReduce, he said.

"The absolute ideal use case is one where the job simply cannot get done unless a lot of processors are used in parallel," Monash said.

Yara at GreenPlum said the key thing MapReduce adds to his database product is the ability to analyze structured and unstructured data both inside and outside the database using parallel processing on commodity hardware.

"Customers have data that lives everywhere," Yara said. "It could live in Web services, it could live in files, it could live in databases. People want to be able to write programs that make use of that data whether it's unstructured or structured. With MapReduce, you can use the GreenPlum database as a parallel data processing platform without having to use the declarative model of relational SQL."

While Monash prefers to speak of MapReduce as a technology for Grid computing, Yara looks to a time when it will be the "programming language of choice for the Cloud."

"The big thing about Cloud computing is the ability to operate in parallel," Yara said. "By integrating MapReduce into the database you can do that well beyond the means of traditional SQL."

Yara sees MapReduce entering the corporate computer world as part of a concept he calls "Enterprise Data Cloud." Enterprises themselves can start to build their own Cloud-based infrastructure, and offer that up as a utility to their business units, he explained.







