Pepperdata Diagnoses Troublesome Hadoop Clusters With Complimentary Health Check

Company Offers Comprehensive Analysis of Cluster Performance on Any of the Three Major Hadoop Distributions -- Cloudera, Hortonworks and MapR


CUPERTINO, CA--(Marketwired - Jun 27, 2016) - Many organizations use Hadoop to gain competitive advantage, but because of the very nature of distributed systems, there are inherent performance limitations that even the most advanced Hadoop users will inevitably encounter. Pepperdata, the world experts in the performance of distributed systems at scale, today announced the availability of Hadoop Health Check, a complimentary expert assessment that evaluates and diagnoses Hadoop clusters of 100 nodes or more and provides full visibility into current cluster conditions. With the Hadoop Health Check program, organizations can easily determine how to improve their current Hadoop operations by solving the innate limitations of distributed systems with automated, policy-based prioritization and active management in their production Hadoop environments.

How Healthy is Your Hadoop Infrastructure?
When a company signs up for the Health Check program, the Pepperdata software is installed on a production cluster for up to 72 hours. During this time, the software collects all Hadoop performance data and provides a high-level diagnostic report with granular insight into the cause of common issues such as:

  • Problem users or jobs: Pinpoint which users or jobs consume the most resources, including CPU, RAM, disk, and network usage that could be causing performance complications. The tool can also determine individual costs of shared cluster resources based on actual usage.
  • Wasted cluster capacity: Identify where capacity isn't fully utilized so that throughput can be increased, and even benchmark utilization and performance against industry leaders.
  • Bottlenecks and root cause: Quickly identify where bottlenecks that hamper job completion occur. For enterprise users, Pepperdata's Adaptive Performance Core™ uses real-time processing to observe and reshape application usage to prevent bottlenecks.

"Pepperdata is rolling out a Health Check program that within days, or even hours, can uncover and diagnose even the most insidious performance problems that creep in and hide inside large, complex big data operations," said Mike Matchett, analyst, Taneja Group. "We expect that enterprises at the point of really scaling and growing a big data production stack will experience significant ROI with their first Health Check. It's a big step towards companies understanding that with tools like Pepperdata they have the capabilities they need to accelerate their big data deployment plans."

Solving Problems for Real Customers with Complete Cluster Diagnostics
Hadoop Health Check helps companies identify performance issues that have been plaguing their clusters but have long gone unidentified or undiagnosed. Below are a few real-world use cases that show the types of performance challenges Pepperdata has solved for companies:

  • A global storage company couldn't figure out why a certain job that took eight hours one week only took one hour the week prior. With Pepperdata, the company was able to identify where a spike in user demand was causing jobs to queue for seven hours before they actually executed. This granular visibility also made it possible to correlate the spike in demand to a specific user and set of jobs.
  • A mobile analytics provider was having trouble determining why its cluster was running slowly. Pepperdata was able to pinpoint a widespread overload of disk input/output capacity as the cause of the problem. Because of this, the company was able to exclude the problem-causing files so that jobs could complete on schedule.
  • An online real estate site was experiencing a very slow-running EMR workflow. The workflow, which had a 17-hour runtime, was essentially a black box because all metrics were lost once the cluster terminated. Pepperdata allowed for real-time analysis of the workflow, in addition to comparative analysis between runs that, with task-level visibility, identified inefficiencies in the code. By resolving this issue, the company reduced its job runtime from 17 hours to three.

"Pepperdata has helped numerous organizations obtain for the first time a full and clear picture of how their Hadoop clusters are performing in any distribution environment. With the new Hadoop Health Check program we have made it much easier and accessible for more companies to see the value and improvement that our product brings," said Sean Suchter, CEO and cofounder, Pepperdata. "Not only does Pepperdata dynamically solve the most common performance issues associated with distributed computing, but our support of all the major distributions means we can confidently offer a prescriptive data-backed assessment to guarantee optimal performance."

To sign up for a free Hadoop health assessment from Pepperdata, visit: pepperdata.com/healthcheck

About Pepperdata
Pepperdata develops software that governs and guarantees consistent, peak performance of Hadoop clusters from hundreds to thousands of nodes. Enterprises, from Fortune 500 companies to SMBs, trust Pepperdata to deliver transparency and control over distributed systems, and eliminate blind spots in Hadoop environments. Pepperdata provides the only solution that can anticipate and avert cluster performance issues at both the user and job level to create order out of the chaos inherent in distributed computing. Its Adaptive Performance Core™ has predictive learning capabilities that can predict a cluster's performance by looking 30 seconds into the future to anticipate changing conditions. Pepperdata then uses this information to reshape application usage of CPU, RAM, network and disk without user intervention, so that jobs can complete on time. Pepperdata software dynamically prevents bottlenecks in multi-tenant, multi-workload clusters so that numerous users and jobs can run reliably on a single cluster at maximum utilization, increasing throughput by 30 to 50 percent. Job performance is enforced on the fly based on priority and current cluster conditions, eliminating fatal contention for hardware resources and the need for workload isolation. The software also precisely pinpoints where problems are occurring so that IT teams can quickly identify and fix troublesome jobs. By capturing global knowledge of each cluster and controlling processes second by second to deliver Quality of Service, the software reclaims control over unpredictable cluster environments so that enterprises can realize untapped value from existing distributed infrastructures. The distributed systems supervisor installs in under an hour, runs on existing clusters, and is compatible with all major Hadoop distributions. With Pepperdata, organizations can put their big data to use in production to meet business objectives today and satisfy future use cases.

Contact Information:

Media Contact:
Bhava Communications
Brianna Galloway
925.922.0708