A Data Science Central Community
PRETEND for a moment that you are Google’s search engine.
Someone types the word “dresses” and hits enter. What will be the very first result?
There are, of course, a lot of possibilities. Macy’s comes to mind. Maybe a specialty chain, like J. Crew or the Gap. Perhaps a Wikipedia entry on the history of hemlines.
O.K., how about the word “bedding”? Bed Bath & Beyond seems a candidate. Or Wal-Mart, or perhaps the bedding section of Amazon.com.
“Area rugs”? Crate & Barrel is a possibility. Home Depot, too, and Sears, Pier 1 or any of those Web sites with “area rug” in the name, like arearugs.com.
You could imagine a dozen contenders for each of these searches. But in the last several months, one name turned up, with uncanny regularity, in the No. 1 spot for each and every term: J. C. Penney.
The company bested millions of sites — and not just in searches for dresses, bedding and area rugs. For months, it was consistently at or near the top in searches for “skinny jeans,” “home decor,” “comforter sets,” “furniture” and dozens of other words and phrases, from the blandly generic (“tablecloths”) to the strangely specific (“grommet top curtains”).
This striking performance lasted for months, most crucially through the holiday season, when there is a huge spike in online shopping. J. C. Penney even beat out the sites of manufacturers in searches for the products of those manufacturers. Type in “Samsonite carry on luggage,” for instance, and Penney for months was first on the list, ahead of Samsonite.com.
With more than 1,100 stores and $17.8 billion in total revenue in 2010, Penney is certainly a major player in American retailing. But Google’s stated goal is to sift through every corner of the Internet and find the most important, relevant Web sites.
Does the collective wisdom of the Web really say that Penney has the most essential site when it comes to dresses? And bedding? And area rugs? And dozens of other words and phrases?
The New York Times asked an expert in online search, Doug Pierce of Blue Fountain Media in New York, to study this question, as well as Penney’s astoundingly strong search-term performance in recent months. What he found suggests that the digital age’s most mundane act, the Google search, often represents layer upon layer of intrigue. And the intrigue starts in the sprawling, subterranean world of “black hat” optimization, the dark art of raising the profile of a Web site with methods that Google considers tantamount to cheating.
Despite the cowboy outlaw connotations, black-hat services are not illegal, but trafficking in them risks the wrath of Google. The company draws a pretty thick line between techniques it considers deceptive and “white hat” approaches, which are offered by hundreds of consulting firms and are legitimate ways to increase a site’s visibility. Penney’s results were derived from methods on the wrong side of that line, says Mr. Pierce. He described the optimization as the most ambitious attempt to game Google’s search results that he has ever seen.
“Actually, it’s the most ambitious attempt I’ve ever heard of,” he said. “This whole thing just blew me away. Especially for such a major brand. You’d think they would have people around them that would know better.”
TO understand the strategy that kept J. C. Penney in the pole position for so many searches, you need to know how Web sites rise to the top of Google’s results. We’re talking, to be clear, about the “organic” results — in other words, the ones that are not paid advertisements. In deriving organic results, Google’s algorithm takes into account dozens of criteria, many of which the company will not discuss.
But it has described one crucial factor in detail: links from one site to another.
If you own a Web site, for instance, about Chinese cooking, your site’s Google ranking will improve as other sites link to it. The more links to your site, especially those from other Chinese cooking-related sites, the higher your ranking. In a way, what Google is measuring is your site’s popularity by polling the best-informed online fans of Chinese cooking and counting their links to your site as votes of approval.
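The "links as votes" idea can be illustrated with a small PageRank-style calculation. This is only a minimal sketch of the published random-surfer formulation, not Google's production algorithm (which, as noted above, uses many undisclosed criteria). The site names, the damping value of 0.85 and the 20 extra linking pages are all hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical mini-web of cooking sites.
web = {
    "wok-blog.example": ["chinese-cooking.example"],
    "dumpling-fans.example": ["chinese-cooking.example", "wok-blog.example"],
    "chinese-cooking.example": ["wok-blog.example"],
    "rival-recipes.example": ["dumpling-fans.example"],
}
before = pagerank(web)

# Add 20 unrelated pages whose only purpose is to link to one site.
boosted = dict(web)
for i in range(20):
    boosted[f"unrelated-{i}.example"] = ["rival-recipes.example"]
after = pagerank(boosted)

for site in sorted(web):
    print(f"{site:28s} before={before[site]:.4f} after={after[site]:.4f}")
```

In this toy graph the site with no inbound links starts at the bottom, and the added links are enough to lift it past one of its better-connected neighbors, even though none of the linking pages has anything to do with cooking.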
But even links that have nothing to do with Chinese cooking can bolster your profile if your site is barnacled with enough of them. And here’s where the strategy that aided Penney comes in. Someone paid to have thousands of links placed on hundreds of sites scattered around the Web, all of which lead directly to JCPenney.com.
Who is that someone? A spokeswoman for J. C. Penney, Darcie Brossart, says it was not Penney.
“J. C. Penney did not authorize, and we were not involved with or aware of, the posting of the links that you sent to us, as it is against our natural search policies,” Ms. Brossart wrote in an e-mail. She added, “We are working to have the links taken down.”
The links do not bear any fingerprints, but nothing else about them was particularly subtle. Using an online tool called Open Site Explorer, Mr. Pierce found 2,015 pages with phrases like “casual dresses,” “evening dresses,” “little black dress” or “cocktail dress.” Click on any of these phrases on any of these 2,015 pages, and you are bounced directly to the main page for dresses on JCPenney.com.
Some of the 2,015 pages are on sites related, at least nominally, to clothing. But most are not. The phrase “black dresses” and a Penney link were tacked to the bottom of a site called nuclear.engineeringaddict.com. “Evening dresses” appeared on a site called casino-focus.com. “Cocktail dresses” showed up on bulgariapropertyportal.com. “Casual dresses” was on a site called elistofbanks.com. “Semi-formal dresses” was pasted, rather incongruously, on usclettermen.org.
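A backlink audit of this kind can be approximated in a few lines. The sketch below does not use Open Site Explorer or any real API; it simply takes the five host and anchor-text pairs quoted above and applies a crude, hypothetical rule (does the host name contain a clothing-related word?) to flag links that look off-topic. A real audit would inspect the content of each linking page rather than its host name.

```python
from collections import Counter

# (linking host, anchor text) pairs quoted in the article.
backlinks = [
    ("nuclear.engineeringaddict.com", "black dresses"),
    ("casino-focus.com", "evening dresses"),
    ("bulgariapropertyportal.com", "cocktail dresses"),
    ("elistofbanks.com", "casual dresses"),
    ("usclettermen.org", "semi-formal dresses"),
]

# Hypothetical hint words, chosen only for this illustration.
CLOTHING_HINTS = ("dress", "fashion", "apparel", "cloth", "style")


def is_off_topic(host, hints=CLOTHING_HINTS):
    """Return True when no hint word appears anywhere in the host name."""
    host = host.lower()
    return not any(hint in host for hint in hints)


off_topic = [(host, anchor) for host, anchor in backlinks if is_off_topic(host)]
off_topic_share = len(off_topic) / len(backlinks)

# Count the final word of each anchor to see how uniform the anchor text is.
anchor_heads = Counter(anchor.split()[-1] for _, anchor in backlinks)

print(f"off-topic share: {off_topic_share:.0%}")
print(anchor_heads.most_common(1))
```

Two signals stand out even in this tiny sample: every linking host is unrelated to clothing, and every anchor phrase ends in the same keyword.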
Read full version at http://www.nytimes.com/2011/02/13/business/13search.html?src=me&...
A very interesting post!
I am curious about one question, which might not relate directly to the content of this post: how does Google actually make a profit? Most of its services are free to the public, and it seems that Google cannot (or should not) have direct control over the ranking of websites.
Seems to me that as long as result positioning is based on a popularity measure (how many sites link to your site), there is no real way to combat this behavior. Based on the article, the gain just might be worth the punishment. And odds are that Google will be less likely to completely ban a company that is high on its list of paid advertisers, because that would hurt its own wallet.
So, if you can take the punishment and are a big enough customer on the Google paid search side, this might be a viable strategy (at least short-term). If it is true that this move kept J. C. Penney afloat by significantly increasing its online holiday sales, and the "punishment" left it no worse off than it was before the campaign, then from its perspective it just might have been worth it.
No matter what Google does, it seems to me there will always be a way to game the system. And those who have enough money, and enough incentive or desperation, will do it.
It seems that more weight should be put on the intentional surfer model (the links that a user actually clicks when browsing) than on the random surfer model, even though the intentional surfer model can also be gamed. The fact that JCP succeeded in gaming the system by adding links to unrelated pages suggests that, at the time, Google relied heavily on link-based probabilities.
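To make the contrast concrete, here is a minimal sketch in which the surfer follows each link in proportion to how often real visitors click it, instead of choosing uniformly among all links on a page. The page names and click counts are hypothetical, and this is only one simple way to model an intentional surfer, not a description of what any search engine actually does.

```python
def weighted_pagerank(clicks, damping=0.85, iterations=50):
    """PageRank where clicks = {page: {target: click count}} sets link weights."""
    pages = set(clicks)
    for targets in clicks.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = clicks.get(page, {})
            total = sum(targets.values())
            if total > 0:
                # Rank flows along each link in proportion to its clicks.
                for target, count in targets.items():
                    new_rank[target] += damping * rank[page] * count / total
            else:
                # No usable outgoing links: spread rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Random surfer: both links on hub.example are equally likely to be followed.
uniform = weighted_pagerank({
    "hub.example": {"store.example": 1, "guide.example": 1},
    "store.example": {"hub.example": 1},
    "guide.example": {"hub.example": 1},
})

# Intentional surfer: visitors almost never click the pasted-in store link.
observed = weighted_pagerank({
    "hub.example": {"store.example": 1, "guide.example": 99},
    "store.example": {"hub.example": 1},
    "guide.example": {"hub.example": 1},
})

print(f"store.example uniform={uniform['store.example']:.4f} "
      f"click-weighted={observed['store.example']:.4f}")
```

Under click weighting, a link that nobody follows passes along very little rank. The weakness mentioned above remains, though: click counts can be inflated artificially just as links can be bought.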
How can a search engine ensure that the first result is actually the most relevant, highest-quality link for a given keyword at the time of the search, in a way that can't be easily gamed?