SOURCE: BloomReach


October 02, 2012 08:00 ET

BloomReach Reduces Duplicate Web Pages With Release of Industry's First Algorithmic De-Duplication Technology

New Dynamic Duplication Reduction Technology Improves Page Relevance to Drive Discoverability, Increase Revenues and Improve the User Experience

MOUNTAIN VIEW, CA--(Marketwire - Oct 2, 2012) - BloomReach, the Big Data Marketing Applications company, today announced the industry's first algorithmic de-duplication technology, called BloomReach Dynamic Duplication Reduction (DDR). BloomReach DDR detects and massively reduces duplicate pages on client websites without manual intervention -- freeing up business resources to focus on content development and other strategic activities. The new technology is part of the BloomReach Web Relevance Engine through the BloomSearch Big Data Marketing Application. BloomReach DDR automatically prunes out 95 percent or more of pages with duplicate content but different URLs, increasing the signal-to-noise ratio and ensuring that the most relevant pages get found.

Duplicated content on large and medium websites is a common problem that contributes to lost revenue. Many versions of the same page can end up in a search-engine index for numerous reasons, such as extra URL parameters added for analytics and multiple navigation paths created by filtering -- useful for marketing analytics but not for discovery. This poses a major challenge for natural search indexing and retrieval because it splits traffic across several pages, causing lost traffic and sales. SEO professionals use rel-canonical tags to direct crawlers to the primary version of a page, but for some web businesses, implementing and maintaining rel-canonicals is challenging. BloomReach DDR finds and addresses duplicate content without rel-canonicals and can provide SEO professionals with the data to implement those tags if desired.
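To illustrate how analytics parameters and filter ordering produce several URLs for one page, here is a minimal Python sketch that collapses such variants to a single form. It is not BloomReach's method: the function name `normalize_url` and the `TRACKING_PARAMS` list are hypothetical, and a real site would need its own list of parameters that do not change page content.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical set of analytics-only parameters (an assumption for this sketch).
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "ref"}


def normalize_url(url: str) -> str:
    """Return a normalized form of `url` so that trivially different
    variants of the same page compare equal."""
    parts = urlsplit(url)
    # Drop parameters that exist only for analytics.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    # Sort the remaining parameters so filter order does not matter.
    query.sort()
    # Lowercase scheme and host, and discard any fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, urlencode(query), "")
    )


variants = [
    "http://Example.com/shoes?color=red&utm_source=mail&size=9",
    "http://example.com/shoes?size=9&color=red",
]
unique = {normalize_url(u) for u in variants}
```

In this sketch both variants reduce to the same URL, which is also the value a site could place in a rel-canonical tag. URL normalization alone only catches duplicates whose addresses differ in predictable ways.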

"Duplicated content on websites is a significant contributor to lost revenue because duplicate pages are not meant to be indexed and are often blocked by indexing technology -- so the primary version is effectively invisible. BloomReach addresses duplication algorithmically, which is scalable to the largest websites. BloomReach DDR identifies and addresses duplication without taking up valuable staff resources with tedious manual work," said Dr. Ashutosh Garg, CTO and co-founder of BloomReach. "With DDR, BloomSearch further ensures that the most relevant and highest quality web content gets found -- only unique pages are indexed."

BloomReach uses deep crawl and semantic interpretation technologies to continuously review all content on a site and automatically discover and act on duplicate pages. Upon discovery, BloomReach ensures that all BloomSearch-generated links within widgets and thematic pages across the site only point to the primary version. Finally, the primary page is the one crawled, indexed and discovered while the duplicates are essentially invisible to crawlers, but still useful for web analytics.
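The discover-then-redirect flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the DDR algorithm: it substitutes an exact hash of whitespace-normalized page text for the semantic interpretation the release mentions, so it would miss near-duplicates, and the names `fingerprint` and `primary_map` as well as the "shortest URL wins" rule are hypothetical choices.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash page text after collapsing whitespace and lowercasing it."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def primary_map(pages: dict) -> dict:
    """Map every URL to the primary URL of its duplicate group.

    `pages` maps URL -> page text. Pages with identical fingerprints
    form one group.
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)

    mapping = {}
    for urls in groups.values():
        # Assumption: the shortest URL (ties broken alphabetically) is primary.
        primary = min(urls, key=lambda u: (len(u), u))
        for url in urls:
            mapping[url] = primary
    return mapping


pages = {
    "http://example.com/shoes": "Red  Shoes\n",
    "http://example.com/shoes?ref=home": "red shoes",
    "http://example.com/hats": "Blue hats",
}
links = primary_map(pages)
```

Generated links would then be looked up in `links` before being written to a page, so that they point only to the primary version of each group.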

BloomReach DDR reduces index spam, concentrates natural search traffic on the most relevant pages and maximizes coverage from the crawl quota. It also lets client SEO resources spend more time creating compelling pages, writing new content, ensuring that tags are descriptive and accurate -- and, when desired, adding rel-canonical tags.

This new DDR technology is now available to existing BloomSearch customers of the Web Relevance Engine at no additional cost. Incremental traffic to the primary page is paid for on the same cost-per-click model as with all BloomSearch traffic.

About BloomReach
BloomReach's Big Data Marketing Applications maximize our customers' revenues -- attracting unmet demand and creating better user experiences by making the most relevant products and services easier to find.

BloomReach created the Web Relevance Engine (WRE), which collects and semantically interprets billions of consumer interactions and pages daily. The cloud applications powered by the WRE dynamically adapt websites to capture existing consumer demand across search, social and advertising channels, driving an average 94 percent increase in non-branded natural search traffic and significant incremental revenues across its large customer base from the retail, travel and listings industries.

The BloomReach team, composed of accomplished leaders in machine learning, large-scale systems science, big data and search from companies like Google, Cisco and Facebook, is dedicated to delivering relevant results to customers across channels in a myriad of industries. BloomReach is headquartered in Mountain View, California and is backed by investment firms Bain Capital Ventures and Lightspeed Ventures. For more information, please visit

Contact Information

  • For media inquiries, contact:
    Samuel Moore
    Atomic PR in San Francisco
    (415) 593-1400