SOURCE: Cloudera


April 19, 2011 06:00 ET

Groupon Gets a Great Deal for Its Data in Cloudera's Distribution for Apache Hadoop

Daily Deal Giant Chooses Cloudera to Get Maximum Insight From Its Data Deluge

PALO ALTO, CA--(Marketwire - Apr 19, 2011) - Cloudera, the leading provider of Hadoop-based data management software and services, today announced that Groupon, the pioneering daily deal site, has implemented Cloudera's Distribution for Apache Hadoop (CDH) to get more value from the massive amounts of data and information it collects and generates.

With more than 70 million registered users in more than 500 global markets, Groupon has been dubbed the "fastest growing company ever" by Forbes magazine. Data is one of Groupon's most strategic assets; Groupon relies on information from both vendors and customers to make daily deal transactions run smoothly. Prior to deploying CDH, Groupon realized that it needed better long-term ways to organize and make sense of the data generated by its massive user base.

Groupon first approached the Hadoop experts at Cloudera to assist in laying the foundation for a large-scale data system. The goal was to build an IT infrastructure that could keep up with the speedy rate at which Groupon amasses data without impeding the expansion of the business. Groupon worked closely with the Cloudera team to capture its ever-swelling collection of data in Hadoop, take advantage of the ease of scale of the system, and ultimately be prepared for future growth while consistently gaining new insights into its customers and business.

"We were eager to try Hadoop based on the technology's promise to make sense of massive amounts of data, and it hasn't disappointed," said Mark Johnson, chief data officer, Groupon. "Cloudera's distribution and support have been instrumental in helping Groupon deliver on our goal to be a technology leader."

Groupon will use Hadoop as a staging area for all of its extreme data. Savvy analysts will be able to go directly to the finest level of detail in the data before it has been through the cleansing process. Data that has been refined and processed in Hadoop will go into an analytic DBMS for additional analysis. The company plans to leverage CDH beyond core Hadoop to include other projects such as Flume, Pig, Hive, Oozie and HBase.

"Cloudera is committed to companies with large amounts of complex data like Groupon by providing the Hadoop-based platform industry-standard in CDH along with unsurpassed Hadoop-related support and services," said Amr Awadallah, CTO, Cloudera. "Groupon is a perfect example of how enterprises can best make use of Hadoop to get the most insight out of their data and we're proud to be working with such a trail-blazing company."

Groupon's goal of building a world-class infrastructure has encouraged many talented engineers to join its teams in Palo Alto and Chicago. The data team at Groupon is growing rapidly, which is indicative of the heightened interest in data management that both Cloudera and Groupon are seeing in the IT industry.

About Groupon
Groupon, launched in November 2008 in Chicago, features a daily deal on the best stuff to do, eat, see and buy in more than 500 markets around the world. Groupon uses collective buying power to offer unbeatable prices and provide a win-win for businesses and consumers, delivering more than 900 daily deals globally.

About Cloudera
Cloudera is the leading provider of Apache Hadoop-based software and services and works with customers in financial services, web, telecommunications, government and other industries. The company's products, Cloudera Enterprise and Cloudera's Distribution including Apache Hadoop, help organizations profit from all of their information. Cloudera's Distribution including Apache Hadoop is the most comprehensive Apache Hadoop-based platform in the industry. Cloudera Enterprise is the most cost-effective way to perform large-scale data storage and analysis and includes the tools, platform and support necessary to use Hadoop in a production environment.

Contact Information

  • Media Contact:
    Ray George
    LEWIS Pulse for Cloudera
    Phone: 650-922-3825