Friday, February 21, 2014

Relevance Feedback Revisited

Cranfield Method
1400 documents
The vector space model uses all terms in retrieved relevant documents for query expansion, but Harman (1988) showed that adding only selected terms from retrieved relevant documents was more effective than adding all the terms, at least for the one test collection used (Cranfield 1400).
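To make the contrast between adding all terms and adding only selected terms concrete, here is a minimal sketch. The function name `expand_query`, the toy documents, and the selection criterion (how many relevant documents contain a term) are assumptions for illustration only, not the procedure used in the paper.

```python
from collections import Counter


def expand_query(query_terms, relevant_docs, top_k=None):
    """Add terms from retrieved relevant documents to the query.

    top_k=None adds every new term; an integer adds only the top_k
    terms, ranked here by how many relevant documents contain them
    (an illustrative criterion, assumed for this sketch).
    """
    doc_counts = Counter()
    for doc in relevant_docs:
        doc_counts.update(set(doc))  # count each term once per document
    candidates = [t for t in doc_counts if t not in query_terms]
    # Descending document count, then alphabetical for a stable order.
    candidates.sort(key=lambda t: (-doc_counts[t], t))
    if top_k is not None:
        candidates = candidates[:top_k]
    return list(query_terms) + candidates


# Hypothetical relevant documents, already tokenized.
relevant = [
    ["wing", "flow", "boundary", "layer"],
    ["flow", "boundary", "pressure"],
    ["flow", "shock", "wave"],
]
all_terms = expand_query(["wing"], relevant)           # every term added
selected = expand_query(["wing"], relevant, top_k=2)   # only two terms added
```

With `top_k=None` the query grows by every term seen in the relevant documents, while a small `top_k` keeps only the terms that rank highest under the chosen criterion.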

Past studies of relevance feedback:
The Vector Space Model

The Probabilistic Model

Croft and Harper (1979) extended this weighting scheme by suggesting effective initial search methods, based on showing that the weighting scheme reduces to IDF weighting when no documents have been seen (the initial search). In 1983, Croft further extended the probabilistic weighting by adapting the probabilistic model to handle within-document frequency weights.
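The reduction to IDF can be checked numerically. The sketch below assumes that the weighting scheme in question is the Robertson-Sparck Jones relevance weight with the usual 0.5 correction; these notes do not give the formula, so treat that choice, along with the function names, as an assumption.

```python
import math


def relevance_weight(N, n, R=0, r=0):
    """Term weight from relevance information (assumed form, 0.5 correction).

    N: documents in the collection
    n: documents containing the term
    R: documents judged relevant so far
    r: relevant documents containing the term
    """
    return math.log(
        ((r + 0.5) * (N - n - R + r + 0.5))
        / ((n - r + 0.5) * (R - r + 0.5))
    )


def idf_like(N, n):
    """The IDF-style weight that remains when R = r = 0."""
    return math.log((N - n + 0.5) / (n + 0.5))


# Initial search: no documents seen, so the two weights coincide.
initial = relevance_weight(1400, 20)
# After feedback: 8 of 10 relevant documents contain the term.
after_feedback = relevance_weight(1400, 20, R=10, r=8)
```

Setting `R = r = 0` cancels the two 0.5 factors and leaves `log((N - n + 0.5) / (n + 0.5))`, which is why the initial search behaves like IDF ranking under this assumed form.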

Query Expansion: 
Since the sorting techniques used have a major effect on performance using query expansion, four new sorting techniques were investigated. Unlike the earlier techniques, which were based on products of factors known to be important in selecting terms (such as the total frequency of a term in the relevant subset, the IDF of a term, etc.), these new sorts were all related to ratios or probabilities of terms occurring in relevant documents versus occurring in non-relevant documents.
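As a rough illustration of a ratio-based sort, the sketch below ranks candidate expansion terms by the smoothed probability of appearing in a relevant document divided by the smoothed probability of appearing in a non-relevant one. This is not one of the four sorts from the paper, whose exact formulas are not given in these notes; the function names, the smoothing constants, and the toy data are all assumptions.

```python
def relevance_ratio(term, relevant_docs, nonrelevant_docs):
    """Smoothed P(term | relevant) / P(term | non-relevant)."""
    in_rel = sum(term in doc for doc in relevant_docs)
    in_non = sum(term in doc for doc in nonrelevant_docs)
    p_rel = (in_rel + 0.5) / (len(relevant_docs) + 1.0)
    p_non = (in_non + 0.5) / (len(nonrelevant_docs) + 1.0)
    return p_rel / p_non


def sort_terms(relevant_docs, nonrelevant_docs):
    """Order terms from relevant documents by descending ratio."""
    candidates = set().union(*relevant_docs)
    return sorted(
        candidates,
        key=lambda t: (-relevance_ratio(t, relevant_docs, nonrelevant_docs), t),
    )


# Hypothetical judged documents, each a set of terms.
relevant = [{"flow", "boundary"}, {"flow", "shock"}]
nonrelevant = [{"flow", "heat"}, {"heat", "transfer"}, {"shock", "heat"}]
ranked = sort_terms(relevant, nonrelevant)
```

In this toy data "flow" is the most frequent term in the relevant documents, yet "boundary" ranks first because it never appears in a non-relevant document. That difference is the kind of behaviour that separates a ratio sort from a frequency-based one.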

Relevance Feedback in Multiple Iterations:
Three major recommendations can be made from summarizing the results from term weighting and query expansion using a probabilistic model.

