February 17, 2015 09:00 ET

Dato Updates Machine Learning Platform, Puts Spotlight on Data Engineering Automation, Spark and Hadoop Integrations

New Updates to "GraphLab Create" Platform Let Data Science Teams Wrangle Big Data at Interactive Speeds and Build Intelligent Applications Faster

SEATTLE, WA--(Marketwired - Feb 17, 2015) - Today at Strata + Hadoop World San Jose, Dato (formerly known as GraphLab) announced new updates to its machine learning platform, GraphLab Create, that allow data science teams to wrangle terabytes of data on their laptops at interactive speeds so that they can build intelligent applications faster. With Dato, users leverage machine learning to build prototypes, tune them, deploy in production and even offer them as a predictive service, all in minutes. These are the intelligent applications that provide predictions for a myriad of use cases including recommenders, sentiment analysis, fraud detection, churn prediction and ad targeting.

Continuing its commitment to the open source community, Dato is also announcing the open source release of its core engine, including the out-of-core, machine learning (ML)-optimized SFrame and SGraph data structures, which make ML tasks blazing fast. Commercial and non-commercial versions of the full GraphLab Create platform are available for download at

Enterprises large and small are eager to use machine learning technologies to help them extract new value from their data, yet building the applications to do so is a complex and time-intensive process. Dato has now built in new capabilities for automated feature engineering as well as for automatically tagging and de-duplicating data, greatly reducing the effort needed to accomplish these tasks. Data scientists, developers and engineers working on prototyping models of intelligent applications can now reach deployment-ready versions quickly and efficiently. Dato's GraphLab Create can now handle all types of data from many different sources including Impala, Cloudera, Hortonworks as well as Spark, and allows for a robust and scalable feature engineering and model building pipeline. This means significant time saved in preparing data for analysis as well as in building the machine learning models that will generate predictions from this data.

"Our company goal has been to translate the vast experience of our data scientists into a platform that businesses can use to build intelligent apps fast," said Carlos Guestrin, CEO of Dato. "With the features and key integration points of the latest release, we reaffirm our strategic direction as well as our deep commitment to the data science community."

Businesses including LivingSocial and Zillow use Dato's machine learning platform to quickly and easily build intelligent applications based on any type of data including graphs, tables, text and images.

"GraphLab Create gives us easy access to some of the most advanced machine learning and this lets us iterate on our ideas faster." - Grecia Lapizco-Encinas, Data Scientist at LivingSocial

News Highlights:
New features available in the GraphLab Create platform include:

  • Predictive Service Deployment Enhancements: enables easy integration of Dato predictive services with applications regardless of development environment, and allows administrators to view information about deployed models as well as statistics on requests and latency on a per-predictive-object basis.
  • Data Science Task Automation: a new Data Matching Toolkit allows for automatic tagging of data from a reference dataset and automatic deduplication of lists. In addition, the new Feature Engineering pipeline makes it easy to chain together multiple feature transformations--a vast simplification for the data engineering stage.
  • Open Source Version of GraphLab Create: Dato is offering an open-source release of GraphLab Create's core code. Included in this version is the source for the SFrame and SGraph, along with many machine learning models, such as triangle counting, PageRank and more. Using this code, it is easy to build a new machine learning toolkit or a connector from the Dato SFrame to a data store. The source code can be found on Dato's GitHub page.
  • New Pricing and Packaging Options: updated pricing and packaging include a non-commercial, free offering with the same features as the GraphLab Create commercial version. The free version allows data science enthusiasts to interact with and prototype on a leading machine learning platform. Also available is a new 30-day, no obligation evaluation license of the full-feature, commercial version of Dato's product line.

Dato Presence at Strata + Hadoop World San Jose

  • Large-scale Machine Learning Day: Wednesday, February 18 from 9am - 5pm in LL20 D. An all-day, hands-on training program led by Carlos Guestrin will provide a quick start to building and deploying predictive applications at scale.
  • Wanted: Women in Data, Tech, and STEM: Friday, February 20 from 10:40am - 11:20am in LL20B. As part of a panel discussion, Alice Zheng discusses her work and achievements and the attitudes that enabled her career successes, and reinforces the value of gender diversity for inventiveness, ingenuity and business success.
  • Cloudera Theater: Friday, February 20 from 11:15-11:25: The Use of GraphLab Create In the Deduplication of Zillow Data.
  • Dato Booths: Demos and data scientists will be on hand at Dato's booth in the Innovator's Pavilion (P3) and the Cloudera Partner Pavilion (K1).

About Dato (formerly GraphLab)
Dato is the company behind the fastest and most complete platform for building predictive and intelligent applications. Started at Carnegie Mellon in 2009 as an open source project under the guidance of Carlos Guestrin, Ph.D., the software was initially intended for applying large-scale machine learning to graph analysis. The functionality has since been much augmented to include tables, text and images, and is now in broad use to make recommendations, detect fraud, score marketing content and generally deliver predictive capabilities at many notable e-tailers, service providers and Fortune 500 firms. Dato and its deeply experienced team of data scientists and technology veterans are based in Seattle. For more information, visit
