1400 documents

The vector space model uses all terms in retrieved relevant documents for query expansion, but Harman (1988) showed that adding only selected terms from retrieved relevant documents was more effective than adding all the terms, at least for the one test collection used (Cranfield 1400).
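The difference between adding all terms and adding only selected terms can be sketched in a few lines. This is a minimal illustration, not the procedure from the study: the function name `expand_query`, the parameter `k`, and the use of raw frequency in the relevant subset as the selection score are all assumptions made for the example.

```python
from collections import Counter

def expand_query(query_terms, relevant_docs, k=None):
    """Return the query plus terms drawn from the retrieved relevant documents.

    relevant_docs is a list of token lists. With k=None every new term is
    added; with an integer k only the k highest-scoring new terms are added.
    The score is the term's total frequency in the relevant subset
    (an assumed placeholder, not the sort used in the study).
    """
    counts = Counter()
    for doc in relevant_docs:
        counts.update(doc)
    # Candidate terms are those not already in the query.
    candidates = [t for t in counts if t not in query_terms]
    # Highest frequency first; ties are broken alphabetically for determinism.
    candidates.sort(key=lambda t: (-counts[t], t))
    if k is not None:
        candidates = candidates[:k]
    return list(query_terms) + candidates

relevant = [["wing", "lift", "drag"], ["lift", "drag", "lift"], ["lift", "flutter"]]
all_terms = expand_query(["wing"], relevant)        # expansion with every term
selected = expand_query(["wing"], relevant, k=2)    # expansion with the top 2 only
```

How the candidate terms are ordered decides which ones survive the cutoff, so the choice of sort matters as much as the cutoff itself.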

Past studies of relevance feedback:

The Vector Space Model

The Probabilistic Model

probabilistic model to handle within document frequency weights.

**Query Expansion:**

Since the sorting techniques used have a major effect on performance when using query expansion, four new sorting techniques were investigated. Unlike the earlier techniques, which were based on products of factors known to be important in selecting terms (such as the total frequency of a term in the relevant subset, the IDF of a term, etc.), **these new sorts were all related to ratios or probabilities of terms occurring in relevant documents vs. occurring in non-relevant documents.**
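To make the contrast between the two families of sorts concrete, the sketch below ranks the same candidate terms with a product-style key and with a ratio-style key. Both formulas are assumptions chosen for illustration (frequency in the relevant subset times IDF, and a smoothed ratio of the probability of occurring in relevant versus non-relevant documents); they are not the four sorts from the study, and the term statistics are made up.

```python
import math

def product_key(rel_freq, n, N):
    """Product-style key: frequency in the relevant subset times IDF."""
    return rel_freq * math.log(N / n)

def ratio_key(r, n, R, N, smooth=0.5):
    """Ratio-style key: P(term | relevant) / P(term | non-relevant).

    r = relevant documents containing the term, n = all documents containing it,
    R = number of relevant documents, N = collection size. The additive
    smoothing constant is an assumption that avoids division by zero.
    """
    p_rel = (r + smooth) / (R + 2 * smooth)
    p_nonrel = (n - r + smooth) / (N - R + 2 * smooth)
    return p_rel / p_nonrel

def rank_by_product(stats, N):
    return sorted(stats, key=lambda t: -product_key(stats[t]["rel_freq"], stats[t]["n"], N))

def rank_by_ratio(stats, R, N):
    return sorted(stats, key=lambda t: -ratio_key(stats[t]["r"], stats[t]["n"], R, N))

# Hypothetical statistics for three candidate terms in a 1400-document
# collection with 10 known relevant documents.
N, R = 1400, 10
stats = {
    "flow":     {"r": 9, "n": 700, "rel_freq": 30},
    "boundary": {"r": 8, "n": 40,  "rel_freq": 12},
    "shock":    {"r": 3, "n": 5,   "rel_freq": 4},
}
by_product = rank_by_product(stats, N)
by_ratio = rank_by_ratio(stats, R, N)
```

With these numbers the two keys disagree on the top term: the ratio key favors a rare term that occurs almost exclusively in relevant documents, while the product key rewards a term that is simply frequent in the relevant subset.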

**Relevance Feedback in Multiple Iterations:**

Three major recommendations can be made from summarizing the results from term weighting and query expansion using a probabilistic model.

