As another Strata + Hadoop World comes to a close, I’m left in my annual dazed, contemplative state — and not just from fatigue or a hangover. There’s a lot to soak in. The conference has grown by leaps and bounds over the past several years, which seems to have a direct correlation with continued expansions of the Hadoop platform and the use cases it supports.
My job is all about sharing and promoting customer stories, so the plethora of new and interesting nominations that were submitted for this year’s Data Impact Awards got me pretty excited. On Tuesday, we held our annual Data Impact Awards event, bringing together about 250 Cloudera customers, prospects and other guests to celebrate the achievements of the CDH community and to announce this year’s Data Impact Award winners.
In 2015, we recognized CDH users for their work across nine categories:
Business Impact
Data Discovery & Analytics
Path to Production
Data Driven Transformation
Government, Nonprofit & NGO
Operational Analytics
Operational Efficiency ROI
Security & Compliance
Social Impact
Here’s a quick recap of the results from the 2015 awards, which drew 55 percent more nominations than last year’s.
Business Impact – Coupons.com, Inc.
Coupons.com, Inc. is a leading digital promotion and media platform that connects brands, retailers, and consumers. The company is pioneering behavioral digital coupon targeting based on shoppers’ purchase behavior through the deployment of its Hadoop platform. Brands can now quickly access information regarding user demographics, shoppers’ behavior patterns, product information, and individual store information across multiple national retailers.
With improved query speeds, brands can respond faster and offer more targeted campaigns. Using CDH to power digital promotions, Coupons.com, Inc. saw approximately 230% more shoppers purchasing products compared to a control group. The targeted campaigns have also contributed to an overall increase in the average basket value – 61% larger for comparable purchasers.
Data Discovery & Analytics | Path to Production – Markerstudy (nominated by Zoomdata)
Markerstudy Limited is responsible for the marketing and distribution of insurance products in the UK, including private and public hire, fleet, motorcycle, private car, and commercial vehicle. To keep pace with growing demand for services, Markerstudy initiated The Big Data Insight Project, built on a Cloudera enterprise data hub.
The project allows Markerstudy to analyze hundreds of millions of insurance quotes in seconds and to look across 100 percent of all quotes rather than a five percent sample. This lets Markerstudy build a more accurate picture and spot emerging trends and patterns as they happen. Business users are also able to conduct their own data exploration using analytics tools, visualizations, and transaction-level search.
Data Driven Transformation – Lineage Logistics
Lineage Logistics is the second-largest cold storage warehousing network in the world, with over 111 facilities in 21 states. The company stores temperature-sensitive materials, blast freezes fresh products, brokers temperature-controlled transportation, and provides a variety of value-added logistics services. Lineage owns over 20 percent of the third-party cold storage capacity in the United States, with annual throughput in excess of 15 billion pounds.
Prior to using Hadoop, Lineage Logistics operated with disparate warehouse management systems. To get a collective view of its data, users had to contact 15 system administrators and received data in 15 different formats. The company needed to consolidate its systems and implement a new framework for standardizing company-wide data collection and analysis.
The consolidation of Lineage’s data empowered the company to take data-driven action on matters core to its business. Among other use cases, Hadoop allowed Lineage to redesign the layout of its warehouses, using graph theory and combinatorial optimization (bin-packing) to increase storage capacity by more than 30 percent. Such density gains result in a proportional reduction in power cost per unit of production and avoid the time, expense, and environmental consequences of constructing new facilities to create additional capacity.
Government, Nonprofit, and NGO – Department of Homeland Security, Science and Technology Directorate, Homeland Security Advanced Research Projects Agency (HSARPA)
The Homeland Security Advanced Research Projects Agency (HSARPA) was established to use innovation and modernization to further scientific advances that support the Department of Homeland Security. HSARPA’s first project leveraging Hadoop was in partnership with the National Fire Incident Reporting System (NFIRS).
By reviewing trends and patterns, graph analytics, and geospatial views for incident types, equipment failures, and casualties, HSARPA and NFIRS have been able to better inform firefighters both in the field and in proactive efforts to combat fires.
The resulting joint database constitutes the world’s largest national, annual collection of incident information, and the solution allows that data to be analyzed and accessed by fire departments at state and local levels for a fraction of the cost of traditional systems. This unique partnership of systems has proven to be one of the most successful, productive, and cost-beneficial programs ever attempted on a national level.
Operational Analytics – Odyssey
Odyssey is a regional leader in the cyber security solutions and services sector, and a major managed security protection and outsourcing services provider. One of the company’s core business pillars – the ClearSkies Security-as-a-Service (SECaaS) platform – addresses the challenge of bringing together the openness and flexibility of the cloud with the need for strict control of information dictated by security principles.
Odyssey saw that the task of processing, analyzing, and correlating increasingly vast amounts of security-related log data while supporting all functions of the platform was too demanding to manage with its existing tools and conventional analysis. Odyssey implemented a Cloudera-powered enterprise data hub (EDH) to address this challenge and has seen a massive enhancement in the platform’s pivotal aspects, including its processing, statistical, and user behavioral analytics capabilities. Before Odyssey set up its EDH, it would have been impossible to aggregate, in just one day, the billions or even trillions of log entries generated over the course of a year. Thanks to the high-performance processing of Cloudera’s EDH, that aggregation now completes not just within a day, but in mere hours, minutes, and sometimes seconds.
Operational Efficiency ROI – Costco
Costco Wholesale Corporation operates an international chain of membership warehouses, primarily under the “Costco Wholesale” name, that carry quality, brand name merchandise at substantially lower prices than are typically found at conventional wholesale or retail locations. Costco’s warehouses present one of the largest and most exclusive product category selections found under a single roof.
Costco has been using Hadoop to improve operational efficiency, resulting in quantifiable bottom-line savings. In one recent successful use case, Costco implemented Cloudera Search to query multi-spelled, misspelled, brand-varied, and text-based data – a job SQL-based tools were not capable of executing. With Cloudera Search providing “Google”-style free-text search over Costco’s inventory, floor staff can now use a simple iPad app to search inventory and avoid bringing in a floor manager. This has shifted a four-to-five-minute process involving two people to just one person providing answers in a matter of seconds.
Security and Compliance – Visa, Inc.
Visa, Inc. is a premier payments technology company with a global network that connects thousands of financial institutions with millions of merchants and cardholders every day. With unknown malware, advanced threats, and insider threats growing daily and causing significant damage to organizations, businesses, and individuals, Visa created the Visa Security Analytics (VSA) product to combat attacks.
VSA leverages CDH for longer-term security data set retention. The main goal of the project was to develop user behavioral analytics to better detect advanced and insider threats before they strike and to reduce the time spent on security investigations. With the CDH-powered VSA platform, Visa now has a thorough understanding of the different stages an attacker may go through during a targeted attack and the different techniques an attacker may use to carry out each stage.
With this knowledge, Visa can detect threats produced by a user, a device, or an application by using machine learning, behavior modeling, peer group analysis, real-time statistical analysis, anomaly detection, and predictive modeling. CDH powers the model to detect cyber attacks and insider threats all in real time.
Social Impact – Thorn (in partnership with Digital Reasoning)
Thorn: Digital Defenders of Children (www.wearethorn.org) is a non-profit dedicated to driving technology innovation to fight child sexual exploitation. Thorn partners across the tech industry, government, and non-governmental organizations and works to deter predatory behavior, disrupt platforms that enable abuse, and accelerate victim identification.
In the United States, child sex trafficking often takes the form of children being bought and sold online through classified sites or escort pages. Thorn set out to leverage the online information about these crimes in order to more rapidly find these children and connect them with victim services. Thorn and Digital Reasoning created Spotlight, a cloud-based collection and analysis tool used to provide intelligence and leads on suspected human trafficking networks and individuals in order to identify and assist victims. The underlying architecture leveraged by Spotlight is CDH, which provides distributed processing to run state-of-the-art natural language processing and analytic algorithms on data that is harvested and organized in HDFS.
Spotlight has become the leading investigative tool for child sex trafficking investigations in the United States. Currently, Spotlight has over 1,300 law enforcement users across 46 states. Since its launch in October 2014, Spotlight has been used in over 860 trafficking cases and has helped identify over 300 victims, including 50 children.
Thanks again to our 2015 judges, who determined the winners for this year’s program:
- Matt Aslett, 451 Research
- Tom Bain, CounterTack
- Martha Bennett, Forrester
- Andreas Bitterer, BARC
- Drew Conway, Project Florida
- Raj Dalal, BigInsights
- Wayne Eckerson, Eckerson Group
- Mike Ferguson, Intelligent Business Strategies Limited
- Bob Gourley, CTOvision.com and Crucial Point LLC
- Philip Howard, Bloor Research
- Jeffrey T. Hunter, Capgemini
- Claudia Imhoff, Intelligent Solutions, Inc.
- Ping Li, Accel Partners
- Ben Lorica, O’Reilly
- Curt Monash, Monash Research
- Narendra Mulani, Accenture
- Carl Olofson, IDC
- Jake Porway, DataKind
- Tom Pringle, Ovum
- Neil Raden, Hired Brains Research
- Nik Rouda, ESG
- Svetlana Sicular, Gartner
- Kim Stevenson, Intel
- Rick van der Lans, R20/Consultancy
- Ashish Verma, Deloitte Consulting
- Dan Vesset, IDC
- William McKnight, McKnight Consulting Group