Given the promise of new analytics technologies, becoming more data-driven is on the minds of most IT decision makers these days. In a recent report on the impact of big data on analytics, "More than half of the organizations polled identified analytics as among their top five IT priorities," says Julie Lockner, senior analyst and VP of data at the Enterprise Strategy Group (ESG), an IT strategic advisory firm based in Milford, Mass.
"With the promise big data is poised to bring," says Lockner, "organizations are exploring their options for solving business challenges with emerging [data] technologies. It's just not practical or cost-effective to use traditional [database] platforms and technologies that were designed before the big-data era."
Enter Apache's Hadoop, the open-source software framework named by its creator after his son's toy elephant. According to Lockner, the highly scalable Hadoop permits running analytics on massive data sets effectively and efficiently, whether that data is structured or unstructured.
"Where traditional databases hit their limits, Hadoop starts to emerge as a much better fit for solving unique analytics challenges," Lockner says. "Because data can be incorporated from multiple sources with varying types of data structures, Hadoop enables more analysis across multiple data feeds in a single platform -- solving some of the toughest data integration challenges commonly associated with relational data warehouse architecture."
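To make the multi-feed analysis Lockner describes concrete, a Hadoop job conceptually maps records from each feed onto a shared key and then reduces everything sharing that key together, regardless of where the records came from. The following minimal Python sketch mimics that map/reduce flow locally; the feed names (SIEM alerts, VPN logs), field layout, and sample records are purely hypothetical illustrations, not Zions' actual data.

```python
# A local sketch of the MapReduce pattern behind multi-feed analysis.
# Two hypothetical feeds with different content are keyed by user,
# then reduced together so one pass can correlate across both.
from itertools import groupby
from operator import itemgetter

# Illustrative records: "user<TAB>event" lines, one per feed.
siem_alerts = ["alice\tfailed_login", "bob\tmalware_detected"]
vpn_logs = ["alice\tlogin_from_new_location"]

def map_phase(records, source):
    """Emit (key, value) pairs: user is the shared key across feeds."""
    for line in records:
        user, event = line.split("\t")
        yield user, (source, event)

def reduce_phase(pairs):
    """Group all values sharing a key, as a Hadoop reducer would see them."""
    combined = {}
    for user, group in groupby(sorted(pairs, key=itemgetter(0)),
                               key=itemgetter(0)):
        combined[user] = [value for _, value in group]
    return combined

# Shuffle both feeds into one keyed stream, then reduce.
pairs = list(map_phase(siem_alerts, "siem")) + list(map_phase(vpn_logs, "vpn"))
combined = reduce_phase(pairs)

# Cross-feed correlation: users with activity in *both* feeds.
flagged = [user for user, events in combined.items()
           if {source for source, _ in events} == {"siem", "vpn"}]
```

On a real cluster the map and reduce functions run in parallel across nodes and Hadoop performs the sort-and-group step, but the correlation logic is the same single pass over heterogeneous inputs.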
The Outer Limits of Data Warehouse Technology
For the security team at Salt Lake City-based Zions Bancorporation ($51.5 billion in total assets), it was just such challenges that kicked off a pioneering journey into the world of Hadoop. "By mid-2008 we were reaching the limits of our traditional data warehouse technology," recalls Preston Wood, chief security officer for Zions' converged security organization, which is responsible for mitigating threats across eight branded banking operations and 500 physical locations throughout the Western United States.
The quest for a solution began in 2009 with an investigation of Zions' existing Microsoft (Redmond, Wash.) and Oracle (Redwood Shores, Calif.) technologies, as well as other technologies within the firm and new solutions on the market, Wood relates. After developing a list of six potential vendors, he says, he and his team quickly focused on two Hadoop-based solutions. The team, Wood explains, recognized the potential in Hadoop for "making security decisions proactively rather than reactively, based on mining business intelligence and combining it with event data from security devices."
Of the Hadoop offerings, Mountain View, Calif.-based Zettaset's Security Data Warehouse (SDW) provided the most mature enterprise solution, Wood says. At the time, "The Hadoop space was even smaller and more immature than it is now," he comments. "Since we didn't want to become Hadoop experts ourselves, it was important that our partner offer an enterprise-class solution along with a road map to continue evolving its platform."
Adopting SDW, however, wasn't about abandoning all traditional technologies, such as security information and event management (SIEM) tools, Wood adds. "It doesn't replace a traditional SIEM; it augments it," he says. "SIEM becomes one main feed into the Security Data Warehouse."
An integrated approach is key, notes ESG's Lockner. "Hadoop isn't intended to replace current transaction processing systems or mature data warehouses," she says. "Rather, it's a platform that enables the data processing tasks to be moved to a more cost-effective alternative."
Zions finalized a deal with Zettaset in late 2010, with deployment occurring at the start of 2011. Although Zettaset now provides its SDW as either a turn-key or software-only solution, at the time Zions had to make an infrastructure investment of its own. With Zettaset's assistance, Wood reports, the bank selected a heterogeneous mixture of high-performance servers and switches to form a 30-node, Linux-based Hadoop ecosystem capable of storing hundreds of terabytes and scaling to multiple petabytes.
After SDW was installed on the new hardware, migration to the new system began. "It only took about 30 days to install the hardware and software infrastructure," notes Wood. "Then we migrated over a period of months, starting with our lowest-risk decisioning and reporting processes and working our way up the list."
Be Prepared for a Hadoop Deployment
Wood credits his team's advance preparations for the swift rollout. "They did their homework and educated themselves," he says. "So we had no organizational change issues. And for the minimal ingestion and operationalization challenges we experienced, Zettaset ensured processes were modified appropriately."
Indeed, taking the initiative to learn about Hadoop in advance can make or break a deployment, experts emphasize. "It's important to dedicate a small, technically skilled team to start learning about Hadoop," says ESG's Lockner. "All too often, Hadoop projects fail because a team deployed it in advance of truly knowing its capabilities ... and lacked the business case that justified not using solutions already licensed and installed in the data center."
While Lockner notes that there are few documented Hadoop use cases in banking, Zions views its exploration of the unknown as a success. "We're adding data sources to our analytics that we wouldn't even have considered before," Zions' Wood reports. "In some cases, it takes just days to implement where it otherwise would have taken months."
He adds, "Adopting big data is really going to be a game changer. Our Hadoop deployment is allowing us to explore opportunities around data and even around non-obvious data. It's also giving us the capability to scale quickly and cost-effectively, whether we're doubling, tripling or even quadrupling in size."