SOURCE: Qubole

November 22, 2017 11:22 ET

Qubole Announces Spark on Lambda

Ability to run Apache Spark applications on AWS Lambda allows for more elastic resource usage

SANTA CLARA, CA--(Marketwired - Nov 22, 2017) - Qubole, the big data-as-a-service company, today announced a technology preview of Spark-on-Lambda, enabling Apache Spark applications to run on AWS Lambda for highly elastic workloads. While AWS Lambda is typically used for short-duration, stateless functions, Spark-on-Lambda shows the potential of complex big data systems running on truly serverless compute infrastructure.

Spark-on-Lambda provides simple, serverless and elastic Apache Spark on AWS Lambda. The prototype uses the massive compute burst capabilities of AWS Lambda, successfully scanning 1 TB of data across thousands of concurrent Lambda functions. To validate the scale, we also successfully sorted 100 GB of data. Technical information on this implementation can be found at http://www.qubole.com/blog/spark-on-aws-lambda/. The code is available on GitHub at https://github.com/qubole/spark-on-lambda.

"Qubole customers run some of the largest Spark clusters in the world. We wanted to show that a complex technology like Spark can be implemented on a serverless compute infrastructure like Lambda and scale efficiently," said Ashish Thusoo, CEO, Qubole. "Spark on Lambda can eliminate most of the operational complexities of running Spark clusters, handle bursty workloads more effectively and be more cost efficient."

Spark on Lambda's elasticity works perfectly for a number of use cases, including:

  • Interactive and ad-hoc data analysis where compute on demand is critical.
  • ETL transformation of click stream, access logs or even data science workloads. The necessary data pre-processing and preparation can fit perfectly into AWS Lambda runtimes.
  • Streaming applications with a discrete flow of events and varying queue length are perfect candidates for Spark on Lambda's elasticity.

For detailed technical information, please visit the Qubole blog. Qubole will be demonstrating Spark-on-Lambda at AWS re:Invent 2017 in Las Vegas at Sands Expo booth 834 and Aria booth 201.

About Qubole
Qubole, the leading cloud-agnostic, big-data-as-a-service provider, is passionate about making data-driven insights easily accessible to anyone. Qubole is building the industry's first autonomous data platform. The cloud-based data platform, Qubole Data Service (QDS), removes the burden of maintaining infrastructure and enables customers to focus on their data. QDS is context-aware, self-managing, and self-learning to deliver unbeatable agility, flexibility and total cost of ownership. Qubole customers process nearly an exabyte of data every month. Qubole investors include CRV, Harmony Partners, Innov8, Lightspeed Venture Partners, Norwest Venture Partners, Singtel and IVP. For more information visit www.qubole.com.