An architect's guide: How to use big data
The proliferation of mobile, social and cloud technologies requires applications to process and provide access to increasingly large amounts of Web-driven information in real time. This movement has helped create an environment where organizations need big data management tools to gain value from the vast quantities of information at their disposal.
There are numerous instances where access to data in real time is essential for an organization's financial gain. For example, retailers and mobile providers can work together to monetize the enormous amount of information retailers have on customer buying patterns and demographics. With information on a customer's purchasing habits, a retailer may opt to send a coupon to that customer's mobile device when he or she is near the store.
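The coupon scenario can be sketched as a simple proximity check. This is a minimal illustration, not any retailer's actual system: the record fields (`lat`, `lon`, `category`, `purchased_categories`), the 0.5 km radius and the rule itself are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def should_send_coupon(customer, store, radius_km=0.5):
    """Hypothetical rule: the customer is near the store and has
    previously bought something in the store's category."""
    distance = haversine_km(customer["lat"], customer["lon"],
                            store["lat"], store["lon"])
    near = distance <= radius_km
    interested = store["category"] in customer["purchased_categories"]
    return near and interested
```

In practice the location would come from the mobile provider and the purchase history from the retailer, which is why the two parties would need to share data for a check like this to run in real time.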
Using big data management tools
While there are several routes an organization may decide to take, deriving the greatest value from big data may require some changes in processes and technology. Users expect responses in real time even when there is a spike in traffic, for example. "Companies today can't afford to run applications like they did in the past and expect that their customers are going to stick around," said Christina Wong, Red Hat senior product marketing manager.
Marcia Kaufman, COO and founding partner of Hurwitz & Associates, concurs, noting the importance of being able to improve scalability through distributed computing. "When you are not able to handle the speed, no matter how advanced your application is, it slows the whole process down," said Kaufman.
Some organizations are turning to big data management tools, like in-memory data grids (IMDGs), to get a grasp on massive data volumes and keep applications competitive. Distributed across multiple servers, IMDGs help optimize data while increasing access and speed. "Data grids are useful because they can be combined with both the organization's traditional relational database and new forms, such as a NoSQL database or streaming of data," said Kaufman.
Using a traditional approach to non-traditional technologies may not be the best way to meet modern application requirements. Instead of relying on IMDGs only to speed up access to data, Kaufman said, organizations are starting to use the grid as their primary data store as well as a layer that supports their traditional environment.
While implementing new big data management tools may seem like a huge undertaking, those unfamiliar with IMDGs may be pleasantly surprised by the ease of use. Getting involved with IMDGs isn't as overwhelming as some may presume, according to Wong. "The point of entry for data grids is actually pretty low," she said. "You don't have to revamp your entire IT infrastructure; it's a complementary technology."
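The two roles Kaufman describes, a grid in front of a traditional database and a grid as the primary store, can be illustrated with a toy model. This is a sketch under stated assumptions, not how any particular IMDG product works: the class names are hypothetical, each "server" is just a Python dictionary, a plain dictionary stands in for the relational database, and keys are placed by simple hash-modulo. Real grids typically add replication and more careful partitioning.

```python
import hashlib


class GridNode:
    """One server's share of the grid, held in memory (here, a dict)."""

    def __init__(self, name):
        self.name = name
        self.entries = {}


class DataGrid:
    """Toy in-memory data grid spread across several nodes.

    If backing_store is given (a dict standing in for a database),
    the grid acts as a layer in front of it. If it is None, the grid
    itself is the primary store.
    """

    def __init__(self, node_names, backing_store=None):
        self.nodes = [GridNode(name) for name in node_names]
        self.backing_store = backing_store

    def _node_for(self, key):
        # Hash the key so entries are distributed across the nodes.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._node_for(key).entries[key] = value
        if self.backing_store is not None:
            self.backing_store[key] = value  # write through to the database

    def get(self, key):
        node = self._node_for(key)
        if key in node.entries:
            return node.entries[key]  # served from memory
        if self.backing_store is not None and key in self.backing_store:
            value = self.backing_store[key]
            node.entries[key] = value  # keep a copy for the next read
            return value
        return None
```

Constructing `DataGrid(["node-a", "node-b"], backing_store=database)` models the complementary setup, where the existing database stays in place; omitting `backing_store` models the grid as the primary store.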
Understanding big data
While harnessing big data can create many opportunities for an organization, it's important not to overestimate what it can help achieve. "Big data is inherently dirty; it's not quality data," said Tony Fisher, vice president of data collaboration and integration at Progress DataDirect. "All the issues we had with managing data before are true with big data." Big data still needs to be combined with information from other parts of an organization.
Some IT professionals have asserted that big data isn't much different from the data that has existed all along. Fisher doesn't completely agree with that assertion. If an organization didn't have a good foundation for managing data before the big data influx, then managing big data is going to present an even bigger challenge, he said.
Before an organization undertakes any sort of big data initiative, Fisher said, it should take a hard look at what it needs to accomplish. A lot of time and energy can be spent collecting data and becoming big data enabled, but that effort doesn't translate into anything worthwhile if it fails to address the problems the organization wasn't already solving.
Maxine Giza is the associate site editor for SearchSOA and can be reached at firstname.lastname@example.org.