Wednesday, February 27, 2013

Sears Going All-In with Hadoop



Sears’ revenue declined from $50 billion in 2008 to $42 billion in 2011, while its competitors, namely Wal-Mart and Target, grew steadily. Most notable is Amazon’s growth from $19 billion in 2008 to $48 billion in 2011.

Phil Shelley, Sears’ executive VP and CTO, plans to tackle this by getting the company closer to its customers.


"We wanted to personalize marketing campaigns, coupons, and offers down to the individual customer, but our legacy systems were incapable of supporting that." 

Sears hopped on the Apache Hadoop train in 2010.  Some differences noted in the article are listed below:

Pre-Hadoop (mainframe, Teradata, and SAS servers):

  • Data storage: 90 days to two years
  • Usage of available data: 10%
  • Process analysis of marketing campaigns: 6 weeks
  • Cost: $3,000 - $7,000 per year

Post-Hadoop:


  • Data storage: everything
  • Usage of available data: 100%
  • Process analysis of marketing campaigns: 1 week
  • Cost: “one-third of relational platforms”


According to the article, Sears is still the largest appliance retailer and service provider in the US. Given that position and its enormous interest in big data, the company has a great opportunity to recapture lost market share and possibly pursue new ventures.

Already saving $500,000 from mainframe reductions, Shelley claims that eliminating all mainframes could save “tens of millions”, while still providing 20-100% performance. Shelley also wants to remove ETL (extract, transform, and load processing) completely. ETL creates multiple copies of data, while Hadoop keeps everything in one place. An ETL process that took 10 hours to run on IBM software takes only 17 minutes with Hadoop. Sears is slowly removing ETL in a way that is very nondestructive for the business.
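To put the ETL figures in perspective, here is a minimal sketch of the arithmetic in Python. The only inputs are the two run times quoted above (10 hours and 17 minutes); the function and variable names are my own and purely illustrative.

```python
def speedup(before_minutes: float, after_minutes: float) -> float:
    """Return how many times faster the new run is than the old one."""
    if before_minutes <= 0 or after_minutes <= 0:
        raise ValueError("durations must be positive")
    return before_minutes / after_minutes


# Run times quoted in the article, converted to minutes.
etl_before = 10 * 60  # 10 hours on the IBM software
etl_after = 17        # 17 minutes with Hadoop

ratio = speedup(etl_before, etl_after)
reduction = 1 - etl_after / etl_before  # fraction of run time eliminated

print(f"{ratio:.1f}x faster, {reduction:.1%} less run time")
# prints "35.3x faster, 97.2% less run time"
```

That works out to roughly a 35-fold speedup for that one job, which says nothing about how other workloads would behave.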

With a main Hadoop cluster of 2 PB of data on 300 nodes, Sears is beating out some of the competition, such as Wal-Mart, in big data development. Sears’ results for the quarter ended July 28, 2012 show that earnings were up 163%, while same-store sales were down. Sears claims that, thanks to its new knowledge, it is able to sell fewer items more profitably.

Shelley is also the CEO of MetaScale, a Sears subsidiary that draws on this specialized expertise to provide big data solutions for other companies. MetaScale offers a subscription cloud service for using its clusters. It also helps other companies move from mainframe data to Hadoop, which is a great service.

"You have to go fast and be bold without taking stupid risks."

Article: http://www.informationweek.com/global-cio/interviews/why-sears-is-going-all-in-on-hadoop/240009717
