
Addendum on search and support vector machines

by Michael Nielsen on February 6, 2012

In the last post I described how to use support vector machines (SVMs) to combine multiple notions of search relevance. After posting I realized I could greatly simplify my discussion of an important subject in the final section of that post. What I can simplify is this: once you’ve found the SVM parameters, how should you use them to rank a long list of documents for a particular query, in practice?

Here’s the setup. We suppose we’ve already taken a bunch of training data, as in the last post, and used that training data to find an SVM with soft margin. That SVM can be used to rank any query and document. In particular, the SVM is specified by a vector $w$, and we rank a document $d_1$ above a document $d_2$ for query $q$ if:

$$ w \cdot \Phi(q, d_1, d_2) > 0, \qquad [*] $$

where $\Phi(q, d_1, d_2)$ is the feature difference vector. (In the last post a parameter $b$ also appeared in this condition, but as I noted in one of the exercises in that post, for the particular problem of search $b = 0$, which is why it doesn’t appear here.)
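
For concreteness, here is a minimal Python sketch of that pairwise decision rule. The names are mine, not from the post: rank_above is hypothetical, and the feature difference vector is assumed to arrive as a plain list of numbers.

    def rank_above(w, phi_diff):
        """Return True if the SVM ranks d1 above d2 for this query.

        w        -- the trained SVM weight vector (list of floats)
        phi_diff -- the feature difference vector Phi(q, d1, d2)
        """
        # Condition [*]: w . Phi(q, d1, d2) > 0 (with the bias b = 0 for search).
        return sum(wj * fj for wj, fj in zip(w, phi_diff)) > 0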

In the final section of the last post I did lots of perambulations with the condition [*] to figure out how to construct a linear order ranking documents, given a particular query $q$. We can simplify life quite a bit by recalling that $\Phi(q, d_1, d_2) = \Phi(q, d_1) - \Phi(q, d_2)$, where $\Phi(q, d)$ is the feature vector for query $q$ and document $d$. The condition above can then be rewritten as:

$$ w \cdot \Phi(q, d_1) > w \cdot \Phi(q, d_2). \qquad [**] $$

This condition suggests a much easier way of ranking documents: simply treat $w \cdot \Phi(q, d)$ as a score of how good document $d$ is when the query is $q$, so that we rank $d_1$ above $d_2$ whenever the score for $d_1$ is better than the score for $d_2$. We then just construct a linear ordering of the documents, based on the score. Or, if we prefer, we can find just the top 10 (or 20, or whatever) documents by iterating linearly over the documents, and keeping a running tally of the best 10 (or 20) found to date.
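
Here’s a minimal Python sketch of that linear pass, keeping only the top k documents. Again the names are assumptions rather than anything from the post: feature_vector stands in for whatever code computes the feature vector for a query and document.

    import heapq

    def top_k_documents(w, query, documents, feature_vector, k=10):
        """Return the k highest-scoring documents for the query.

        w              -- the trained SVM weight vector (list of floats)
        documents      -- an iterable of candidate documents
        feature_vector -- function mapping (query, document) to the list of
                          feature values Phi(q, d), one per entry of w
        """
        def score(doc):
            # The score is just the dot product w . Phi(q, d).
            return sum(wj * fj for wj, fj in zip(w, feature_vector(query, doc)))

        # heapq.nlargest makes a single pass over the documents, keeping a
        # running tally of the best k seen so far, rather than sorting them all.
        return heapq.nlargest(k, documents, key=score)

Sorting the full list by score works too, of course; the heap just avoids holding more than k candidates at a time.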

This is quite a bit simpler than the discussion at the end of my last post. It only works, though, because our classifier (the SVM) is a linear function of the feature difference vectors. It’s not difficult to construct other types of classifier for which it really wouldn’t be obvious how to construct this kind of score. For those classifiers you could still fall back on the analysis I did in the last post, so it wasn’t all wasted effort.

Incidentally, one thing I like about the equation [**] is that it makes it a lot clearer why $w$ is called the weight vector. It means that the score for page $d$ on query $q$ is just a linear combination of the different feature values, with weights given by the weight vector. That makes the notation $w$ and nomenclature “weight vector” much clearer, as well as making it a lot clearer what the vector $w$ means!
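
Spelling that out in the notation above: the score for document $d$ on query $q$ unpacks as

$$ w \cdot \Phi(q, d) = \sum_j w_j \, \Phi_j(q, d), $$

so the $j$-th weight $w_j$ simply says how heavily the $j$-th relevance feature counts toward the overall score.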


2 Comments

  1. Kaveh

     Hi Mike

     Thanks for these two posts. I think you need another w dot on the rhs of [**].

     • Michael Nielsen

       Thanks, corrected!

