Friday, February 21, 2014

Chapter 9: Relevance feedback and query expansion

Global Techniques: Global methods are techniques for expanding or reformulating query terms independent of the query and results returned from it, so that changes in the query wording will cause the new query to match other semantically similar terms.
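As a rough illustration of a global method, here is a minimal sketch of thesaurus-based query expansion. The `THESAURUS` table and the `expand_query` helper are hypothetical names made up for this note, and the entries are hand-written; a real system might build its thesaurus from co-occurrence statistics or query logs instead.

```python
# Hypothetical hand-built thesaurus, for illustration only.
THESAURUS = {
    "laptop": ["notebook computer"],
    "car": ["automobile"],
}


def expand_query(terms, thesaurus=THESAURUS):
    """Return the original terms followed by any thesaurus synonyms.

    The expansion depends only on the query wording, never on the
    results the query returned, which is what makes it "global".
    """
    expanded = list(terms)
    for term in terms:
        for synonym in thesaurus.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


print(expand_query(["cheap", "laptop"]))
# prints ['cheap', 'laptop', 'notebook computer']
```

Terms without a thesaurus entry pass through unchanged, so the expanded query always contains the original one.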

Local Techniques: Local methods adjust a query relative to the documents that initially appear to match the query.

The Rocchio algorithm for relevance feedback
The underlying theory. 
An application of Rocchio’s algorithm. Some documents have been labeled as relevant or nonrelevant, and the initial query vector is moved in response to this feedback.
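To make the "query vector is moved" idea concrete, here is a minimal sketch of the usual Rocchio update: the new query is alpha times the old query, plus beta times the centroid of the relevant documents, minus gamma times the centroid of the nonrelevant ones. The function names, the plain-list vector representation, and the clipping of negative weights to zero are choices made for this sketch. The default weights (1.0, 0.75, 0.15) are commonly quoted starting values, not tuned ones.

```python
def centroid(vectors, dim):
    """Mean of equal-length vectors; the zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward relevant documents and away from nonrelevant ones."""
    dim = len(query)
    rel_c = centroid(relevant, dim)
    non_c = centroid(nonrelevant, dim)
    moved = [
        alpha * q + beta * r - gamma * n
        for q, r, n in zip(query, rel_c, non_c)
    ]
    # Negative term weights are clipped to zero (an assumption of this sketch).
    return [max(0.0, w) for w in moved]


# Toy 3-term vocabulary; the query initially uses only the first term.
query = [1.0, 0.0, 0.0]
relevant = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
nonrelevant = [[0.0, 0.0, 1.0]]
new_query = rocchio(query, relevant, nonrelevant)
```

In this toy case the second term, which appears in both relevant documents, gains weight in the new query, while the third term, seen only in the nonrelevant document, stays at zero.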

Probabilistic relevance feedback
Rather than reweighting the query in a vector space, if a user has told us some relevant and nonrelevant documents, then we can proceed to build a classifier. One way of doing this is with a Naive Bayes probabilistic model.
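One possible way to sketch this classifier idea is a Bernoulli-style model: for each term, estimate how often it occurs in judged relevant versus judged nonrelevant documents, and turn the two estimates into a log-odds weight. The function names, the set-of-terms document representation, and the add-0.5 smoothing are all assumptions of this sketch, not something the chapter notes prescribe.

```python
import math


def term_log_odds(relevant_docs, nonrelevant_docs, vocabulary):
    """Per-term log-odds weights estimated from judged documents.

    Each document is a set of terms. Add-0.5 smoothing (an assumption
    of this sketch) keeps the estimates strictly between 0 and 1.
    """
    n_rel, n_non = len(relevant_docs), len(nonrelevant_docs)
    weights = {}
    for term in vocabulary:
        rel_with = sum(term in doc for doc in relevant_docs)
        non_with = sum(term in doc for doc in nonrelevant_docs)
        p = (rel_with + 0.5) / (n_rel + 1.0)  # P(term present | relevant)
        u = (non_with + 0.5) / (n_non + 1.0)  # P(term present | nonrelevant)
        weights[term] = math.log(p / (1 - p)) - math.log(u / (1 - u))
    return weights


def score(doc, weights):
    """Sum the weights of the known terms that occur in the document."""
    return sum(w for term, w in weights.items() if term in doc)


relevant = [{"rocchio", "feedback"}, {"feedback", "query"}]
nonrelevant = [{"football", "query"}]
vocabulary = set().union(*relevant, *nonrelevant)
weights = term_log_odds(relevant, nonrelevant, vocabulary)
```

Terms seen mostly in relevant documents get positive weights and terms seen mostly in nonrelevant ones get negative weights, so unjudged documents can be re-ranked by `score`.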

Misspellings. If the user spells a term differently from how it is spelled in any document in the collection, then relevance feedback is unlikely to be effective.

Cross-language information retrieval. Documents in another language are not nearby in a vector space based on term distribution. Rather, documents in the same language cluster more closely together.

Mismatch of searcher’s vocabulary versus collection vocabulary. If the user searches for laptop but all the documents use the term notebook computer, then the query will fail, and relevance feedback is again most likely ineffective.

Relevance feedback on the web
Relevance feedback has been shown to be very effective at improving relevance of results. Its successful use requires queries for which the set of relevant documents is medium to large. Full relevance feedback is often onerous for the user, and its implementation is not very efficient in most IR systems. In many cases, other types of interactive retrieval may improve relevance by about as much with less work.
