July 03, 2014 09:00 ET

ICSI Works With Yahoo Labs and Lawrence Livermore Lab to Offer Analytics Tools for Over 100 Million Flickr Images and Videos

50TB Computing Program Runs Analysis on the Entire Flickr Creative Commons Dataset, One of the Largest Multimedia Datasets Ever Released to the Public

BERKELEY, CA--(Marketwired - Jul 3, 2014) - The International Computer Science Institute (ICSI), a leading center for computer science research, today announced a collaboration with Yahoo Labs and Lawrence Livermore National Laboratory to process and analyze the recently released Yahoo Flickr Creative Commons 100 Million (YFCC100M) dataset, a publicly available corpus of user-generated content comprising more than 100 million images and videos.

ICSI has developed a number of research tools to extract meaning from the vast amounts of multimedia data freely available online, giving researchers the ability to draw powerful conclusions from the data. Such work includes:

  • Audio and visual recognition techniques that can reliably identify the geographic location of a video or photo's origin point.
  • Video concept detection, which uses acoustic analysis and segmentation of similar sounds to treat sounds like keywords, making it possible to reliably search for abstract concepts like "baby catching a ball" or "animal dancing to music."

ICSI is collaborating directly with Lawrence Livermore Lab to process the massive dataset using the lab's supercomputer, the Cray Catalyst.

"The media that people choose to upload with a Creative Commons License are full of information: they tell us about the people in them, where they are and what is happening, even if none of that is explicitly laid out," said Gerald Friedland, research director of Audio and Multimedia at ICSI. "ICSI's sophisticated computing tools help us make sense of that data at scale, and there is so much we can learn by fully leveraging the rich Creative Commons dataset that Flickr has amassed over the past decade."

The dataset can be requested through Yahoo's Webscope program, and ICSI's research analytics tools will be hosted on an Amazon instance via ICSI's web site in August of this year.

The development of ICSI's research tools is supported by grants from the NSF, NGA, and IARPA's ALADDIN program.

About ICSI:
The International Computer Science Institute (ICSI) is a leading center for research in computer science and one of the few independent, nonprofit research institutes in the United States. With its unique focus on international collaboration and its affiliation with the University of California at Berkeley, ICSI brings together the most influential U.S. scientists and experts from around the world in areas such as computer networking and security, speech and language processing, algorithms, bioinformatics, computer architecture, computer vision, multimedia analysis, and artificial intelligence. For more information, visit ICSI at, follow us at, or read our blog at

Contact Information

  • Media Contact:
    Marie Williams