
How could Amazon increase sales by redefining relevancy?

By improving its search and relevancy engines, to include item price as a main factor. The type of optimization and ROI boosting described below applies to all digital catalogs. Here we focus on books.

Search engine:

When you perform a keyword search on Amazon in the book section, Amazon will return a search result page with (say) 10 suggested books matching your search keyword. This task is performed by the search engine. The search engine will display the books in some order. The order is based either on price or keyword proximity.

Relevancy engine:

If you search for a specific book title, Amazon will also display other books that you might be interested in based on historical sales from other users. This task is performed by the relevancy engine, and it works as follows:

Suppose m(A,B) users purchased both book A (the book you want to purchase) and another book B over the last 30 days, k(A) users purchased A, and k(B) users purchased B. The association between A and B (that is, how closely these books are related from a cross-selling point of view) is then defined as R(A,B) = m(A,B) / SQRT{k(A) * k(B)}.
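As a rough illustration, the association score can be computed from a purchase log as in the sketch below. The `purchases` dictionary, the user IDs, and the function names are hypothetical and only serve to show the arithmetic; the log is assumed to already be restricted to the last 30 days.

```python
from math import sqrt

def association(m_ab: int, k_a: int, k_b: int) -> float:
    """R(A,B) = m(A,B) / sqrt(k(A) * k(B)); returns 0 if either book has no buyers."""
    if k_a == 0 or k_b == 0:
        return 0.0
    return m_ab / sqrt(k_a * k_b)

# Hypothetical purchase log: user -> set of books bought in the last 30 days.
purchases = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"A"},
    "u4": {"B"},
    "u5": {"A", "C"},
}

def rank_related(book: str, purchases: dict) -> list:
    """Sort all other books by decreasing R(book, *)."""
    # Invert the log: book -> set of users who bought it.
    buyers = {}
    for user, books in purchases.items():
        for b in books:
            buyers.setdefault(b, set()).add(user)

    target_buyers = buyers.get(book, set())
    scores = {}
    for other, users in buyers.items():
        if other == book:
            continue
        m = len(target_buyers & users)  # users who bought both books
        scores[other] = association(m, len(target_buyers), len(users))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = rank_related("A", purchases)
```

With this toy data, C ranks above B for book A: both were co-purchased with A twice, but C has fewer buyers overall, so the normalization by SQRT{k(A) * k(B)} rewards it.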

The order in which suggested books are displayed is entirely determined by the function R(A,*).

A better sorting criterion:

Very expensive books generate very few sales, but each sale generates a large profit. Cheap books generate little money per sale, but the sales volume more than compensates for the small profit per book. In short, if all the books you could show the user have exactly the same relevancy score, the book that should appear in the #1 position is the one whose price is optimal with regard to total expected revenue. In the above chart, the optimum is attained by a book selling for $21.

This chart is based on simulated numbers, assuming that the chance for a sale is an exponentially decreasing function of the book price. That is,

P(sale | price) = a * exp(-b*price)
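To see where such an optimum comes from, assume that expected revenue per impression is price * P(sale | price). Under that assumption, revenue is maximized at price = 1/b. The sketch below checks this numerically. The values of a and b are not given in this post: a = 0.5 is arbitrary, and b = 1/21 is chosen only so that the optimum lands at $21.

```python
from math import exp

# Assumed parameters (not from the post): b = 1/21 puts the optimum at $21.
A = 0.5
B = 1 / 21

def p_sale(price: float, a: float = A, b: float = B) -> float:
    """P(sale | price) = a * exp(-b * price)."""
    return a * exp(-b * price)

def expected_revenue(price: float, a: float = A, b: float = B) -> float:
    """Expected revenue per impression, assumed to be price * P(sale | price)."""
    return price * p_sale(price, a, b)

# Scan whole-dollar prices from $1 to $100 and keep the best one.
prices = range(1, 101)
best_price = max(prices, key=expected_revenue)
```

Note that a only scales the curve and does not move the optimum; the optimal price depends on b alone.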

A more general model would be:

P(sale | price, relevancy score) = a * exp(-b*price) * f(relevancy score)
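A minimal sketch of how this more general model could drive the display order is shown below. Everything specific in it is an assumption: f is taken to be the identity function, a and b are the same illustrative values as before, and the candidate titles, prices, and relevancy scores are made up.

```python
from math import exp

def identity(r: float) -> float:
    """Placeholder for f(relevancy score); the post does not specify f."""
    return r

def expected_revenue(price: float, relevancy: float,
                     a: float = 0.5, b: float = 1 / 21, f=identity) -> float:
    """price * P(sale | price, relevancy) with P = a * exp(-b * price) * f(relevancy)."""
    return price * a * exp(-b * price) * f(relevancy)

# Hypothetical candidates: (title, price in dollars, relevancy score in [0, 1]).
candidates = [
    ("cloud computing handbook", 60.0, 0.40),
    ("big data primer", 15.0, 0.80),
    ("data mining basics", 25.0, 0.60),
]

# Display order: highest expected revenue first.
ranked = sorted(candidates,
                key=lambda c: expected_revenue(c[1], c[2]),
                reverse=True)
ranked_titles = [title for title, _, _ in ranked]
```

In this toy example the expensive, weakly related book ends up last, because both its price penalty and its low relevancy reduce the estimated chance of a sale.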

Another way to further increase revenue is to include user data in the formula. A wealthy user has no problem purchasing an expensive book. Users who traditionally buy more expensive books should be shown more expensive books, on average.


When a sale takes place, how do you know whether it is because appropriately priced books were shown at the top, or because of perfect relevancy? For instance, relevancy between 'data science' and 'big data' is very good, but relevancy between 'data science' and 'cloud computing' is not as good. Does it make sense to suggest an expensive 'cloud computing' book to a wealthy user interested in a 'data science' book, or is it better to suggest a less expensive book related to 'big data', if your goal is to maximize profit?

Separating the influence of relevancy from the price factor is not easy.

Note: the price factor is particularly useful when keyword or category relevancy is based on "small data".


Comment by Vincent Granville on December 26, 2011 at 4:00pm

Interestingly, my e-book entitled Data Science by Analyticbridge is now on Amazon, but when you search for "data science" on Amazon, it does not show up. Instead, other books not related to "data science" show up. Is it because I just uploaded the book a few days ago? If you search for "Analyticbridge" or "Vincent Granville" though, then my e-book does show up.
