SOURCE: LexisNexis HPCC Systems

December 12, 2011 11:20 ET

HPCC Systems From LexisNexis Breaks World Record on Terasort Benchmark

HPCC Systems 4-Node Cluster Sorts 100 Gigabytes in 98 Seconds and Is 25% Faster Than a 20-Node Hadoop Cluster

ATLANTA, GA--(Marketwire - Dec 12, 2011) - HPCC Systems™ from LexisNexis® Risk Solutions announced today that it has established a new world record for completing the Terasort benchmark. Results achieved in December 2011 show that a four-node HPCC Systems Thor cluster took only 98 seconds to complete a Terasort with a job size of 100 gigabytes (GB), using one-fifth as many nodes as the Hadoop cluster that held the previous record. The HPCC Systems four-node cluster consisted of one (1) Dell PowerEdge C6100 2U server with Intel® Xeon® processors E5675 series, 48GB of memory, and 6 x 146GB SAS HDDs. The Dell C6100 houses four nodes inside the 2U enclosure. The previous leader ran the same Terasort benchmark in 130 seconds on a 20-node Hadoop cluster using equivalent node hardware. HPCC Systems is an Open Source, enterprise-proven Big Data analytics-processing platform.
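
For readers who want to check the headline figure, the arithmetic can be worked from the two reported run times. This is an illustrative calculation only. It assumes "25% faster" refers to the reduction in elapsed time and that both runs sorted the same 100 GB job. The throughput figures are derived here and are not stated in the release.

```latex
\[
\frac{130\ \text{s} - 98\ \text{s}}{130\ \text{s}} = \frac{32}{130} \approx 0.246
\quad \text{(about 25\% less elapsed time)}
\]
\[
\frac{100\ \text{GB}}{98\ \text{s}} \approx 1.02\ \text{GB/s}
\qquad \text{versus} \qquad
\frac{100\ \text{GB}}{130\ \text{s}} \approx 0.77\ \text{GB/s}
\]
```

Measured by throughput rather than elapsed time, the same two run times give a ratio of 130/98, or roughly 1.33.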

"These results demonstrate that HPCC Systems is a leader in Big Data processing," said Armando Escalante, SVP and CTO of LexisNexis Risk Solutions and head of HPCC Systems. "We can process Big Data 25% faster on a cluster five times smaller than Hadoop. We can help organizations save money. Not only are we Open Source, but we can also do the job with fewer nodes and less programming, which translates into cost savings for companies that need to crunch Big Data and are concerned about big expenses," said Mr. Escalante.

The HPCC Systems platform ran the test with three lines of ECL code, whereas the Hadoop equivalent used in previous tests took more than 700 lines of Java MapReduce code. Writing in ECL on the HPCC platform substantially increases programmer productivity because there is less code to write.

ECL Program Listing:

// Perform global terasort
rec := record
    string10 key;
    string10 seq;
    string80 fill;
end;
in := DATASET('nhtest::terasort1',rec,FLAT);
// End
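
The listing above stops after the dataset declaration, so the sort step itself is not shown. As a minimal sketch, assuming the standard ECL SORT function and OUTPUT action, the remaining step might look like the following. The attribute names unsortedRecs and sortedRecs and the output file name are hypothetical placeholders chosen for illustration. They are not taken from the release, and the original program may have used different names or options.

```ecl
// Sketch only: repeats the record layout from the listing and adds a sort step.
// Anything marked "hypothetical" is an assumption, not part of the release.
rec := RECORD
    STRING10 key;   // field used as the sort key in this sketch
    STRING10 seq;
    STRING80 fill;  // 10 + 10 + 80 bytes gives 100-byte records
END;

// Same logical file name and FLAT format as in the listing
unsortedRecs := DATASET('nhtest::terasort1', rec, FLAT);  // hypothetical attribute name

// SORT without the LOCAL option performs a global sort across the Thor nodes
sortedRecs := SORT(unsortedRecs, key);                    // hypothetical attribute name

// Write the sorted records to a new logical file (hypothetical file name)
OUTPUT(sortedRecs, , 'example::terasort1_sorted', OVERWRITE);
```

Because ECL is declarative, the program states only that the records should be sorted by key and written out. How the work is distributed across the nodes is left to the platform.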

HPCC Systems grew out of the need for LexisNexis Risk Solutions to manage, sort, link, join and analyze billions of records within seconds. Designed by data scientists, HPCC Systems is a data-intensive supercomputer that has evolved for more than a decade, serving enterprise customers who need to process large volumes of data in critical 24/7 environments.

HPCC Systems comprises a single architecture, a consistent data-centric programming language, and two processing platforms: the Thor Data Refinery Cluster and the Roxie Rapid Data Delivery Cluster. The core of the technology platform is the Enterprise Control Language (ECL), a declarative, data-centric programming language optimized for large-scale data management and query processing.

The Thor Data Refinery Cluster, now also available on Amazon Web Services (AWS), is responsible for ingesting vast amounts of data and then transforming, linking and indexing that data, with parallel processing power spread across the nodes. The Roxie Rapid Data Delivery Cluster provides highly scalable, high-performance online query processing and data warehouse capabilities. For more information, visit the HPCC Systems website.

About HPCC Systems™
HPCC Systems™ from LexisNexis® Risk Solutions offers a proven, data-intensive supercomputing platform designed for the enterprise to process and solve Big Data analytical problems. As a superior alternative to Hadoop and legacy technology, HPCC Systems offers a consistent data-centric programming language, two processing platforms and a single, complete end-to-end architecture for efficient processing. Customers such as financial institutions, insurance carriers, insurance companies, law enforcement agencies, federal government and other enterprise-class organizations leverage the HPCC Systems technology through LexisNexis® products and services. For more information, visit the HPCC Systems website.

Connect with us on Twitter, Facebook and LinkedIn.

About LexisNexis Risk Solutions
LexisNexis® Risk Solutions is a leader in providing essential information that helps customers across all industries and government predict, assess and manage risk. Combining cutting-edge technology, unique data and advanced scoring analytics, we provide products and services that address evolving client needs in the risk sector while upholding the highest standards of security and privacy. LexisNexis Risk Solutions is part of Reed Elsevier, a leading publisher and information provider that serves customers in more than 100 countries with more than 30,000 employees worldwide.

Contact Information

  • Media Contact
    Kristina Grammatico