2014年2月28日星期五

Muddiest Point:

1. In designing the search interface, how many factors should be taken into consideration?
2. Could you show me some classic user interface designs and explain why they are so popular?
3. What is the best source for user interface design?
4. How are human factors involved in interface design, and which factors matter most?

Chapter 10: User Interface Visualization

Main: this chapter mainly outlines the process by which users access information through the interface;

Human Computer Interaction: 
Design Principles:
1. Provide User Feedback
2. Reduce Working Memory Load
3. Provide Interfaces for Alternative Users

The role of visualization:
Difficulty: making inherently abstract ideas visible and obvious

Evaluating Interactive Systems:
Techniques for visualization:
1. Brushing and linking
Meaning: connecting two or more views of the same data (e.g. titles and histograms) so that a selection made in one view is highlighted in the others (see the sketch after this list)
2. Panning and zooming
Meaning: moving across and magnifying or shrinking the same view
3. Focus plus context
4. Magic lenses
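A minimal sketch of the brushing-and-linking idea, assuming nothing beyond plain Python: a shared selection object notifies every registered view, so records brushed in one view are highlighted in all linked views. The class and method names (SharedSelection, View, brush, highlight) are invented for illustration.

```python
# Minimal brushing-and-linking sketch: a shared selection model notifies
# every registered view, so "brushing" records in one view highlights the
# same records everywhere else. Names are illustrative only.

class SharedSelection:
    def __init__(self):
        self.views = []
        self.selected_ids = set()

    def register(self, view):
        self.views.append(view)

    def brush(self, ids):
        """Called by the view in which the user brushed some records."""
        self.selected_ids = set(ids)
        for view in self.views:
            view.highlight(self.selected_ids)


class View:
    def __init__(self, name):
        self.name = name

    def highlight(self, ids):
        print(f"{self.name}: highlighting records {sorted(ids)}")


if __name__ == "__main__":
    selection = SharedSelection()
    scatter, histogram = View("scatterplot"), View("histogram")
    selection.register(scatter)
    selection.register(histogram)
    # Brushing three points in the scatterplot also highlights them
    # in the linked histogram.
    selection.brush([3, 7, 12])
```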

The process of visualization:
Form the idea, generate the query, review the returned results, and decide whether to stop or continue

Non-search parts of the information process:
The four main starting points for the user interfaces:


Lists of Collections: 
Display Hierarchies
Example: Yahoo and the HB System

Automatically Derived Collection Overviews:
Evaluation of Graphical Overviews:
Co-citation Clustering for Overviews:
Examples: Dialog and Wizards:

Automated Source Selection:


Generating the Queries: 
1. Boolean Queries (see the sketch after this list)
2. Faceted Queries
3. Graphical Approach for Queries
4. Phrase and Proximity
5. Natural Language and Language Processing
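To make the first query type concrete, here is a minimal sketch of Boolean query evaluation over a toy in-memory inverted index; the index contents and function names are invented for illustration, and a real system would operate on sorted postings lists.

```python
# Toy Boolean query evaluation over an in-memory inverted index.
# The index and document ids are made up for illustration.

from typing import Dict, Set

index: Dict[str, Set[int]] = {
    "information": {1, 2, 4},
    "retrieval":   {1, 4},
    "interface":   {2, 3},
}

def boolean_and(terms):
    """Documents containing every term (intersection of postings)."""
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()

def boolean_or(terms):
    """Documents containing at least one term (union of postings)."""
    return set.union(*(index.get(t, set()) for t in terms)) if terms else set()

if __name__ == "__main__":
    print(boolean_and(["information", "retrieval"]))  # {1, 4}
    print(boolean_or(["retrieval", "interface"]))     # {1, 2, 3, 4}
```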

Visualization for Text Analysis

This chapter describes ideas that have been put forward for understanding the contents of text collections from a more analytical point of view. 

Visualization for text mining: 
1.TAKMI Text Mining System
2.Jigsaw system (Gorg et al., 2007) was designed to allow intelligence analysts to examine relationships among entities mentioned in a document collection and phone logs. 
3.The BETA system, part of the IBM Web Fountain project (Meredith and Pieper, 2006), also had the goal of facilitating exploration of data within dimensions automatically extracted from text. 
4. The TRIST information “triage” system (Jonker et al., 2005; Proulx et al., 2006) attempted to address many of the deficiencies of standard search for information analysts' tasks.

1.2 Word Frequency Visualizations
1. The SeeSoft visualization (Eick, 1994) represented text in a manner resembling columns of newspaper text, with one “line” of text on each horizontal line of the strip.
2. The TextArc visualization (Paley, 2002) was similar to SeeSoft, but arranged the lines of text in a spiral and placed frequently occurring words within the center of the spiral. 

1.3 Visualizing Relationships
Some more recent approaches have moved away from nodes-and-links in order to break the interesting relationships into pieces and show their connections via brushing-and-linking interactions.



THE DESIGN OF SEARCH USER INTERFACES

1.1 Keeping the User Interface Simple 
1. Users do not want to be interrupted while searching
2. Search is a focused, cognitively intensive task
3. The interface should fit users of all ages and backgrounds

1.2 The Historical Shift in Interface Design
1.3 The Process of Interface Design
 
  • Learnability: How easy is it for users to accomplish basic tasks the first time they encounter the interface?
  • Efficiency: How quickly can users accomplish their tasks after they learn how to use the interface?
  • Memorability: After a period of non-use, how long does it take users to reestablish proficiency?
  • Errors: How many errors do users make, how severe are these errors, and how easy is it for users to recover from these errors?
  • Satisfaction: How pleasant or satisfying is it to use the interface?
After a design tests well in discount or informal usability studies, formal experiments comparing different designs and measuring statistically significant differences can be conducted.

1.4 The guidelines for the Design 
  • Offer informative feedback.
  • Support user control.
  • Reduce short-term memory load.
  • Provide shortcuts for skilled users.
  • Reduce errors; offer simple error handling.
  • Strive for consistency.
  • Permit easy reversal of actions.
  • Design for closure.
1.5 Offer Efficient and Informative Feedback 
Show Search Results Immediately
Show Informative Document Surrogates Immediately
Allow Sorting of Results by Various Criteria
Show Query Term Suggestions (see the sketch after this list)
Use Relevance Indicators Sparingly
Support Rapid Response
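As a small illustration of query term suggestions, here is a sketch of prefix-based suggestion over a made-up query log; a real system would also rank by recency and the user's own search history.

```python
# Minimal prefix-based query suggestion sketch. The "query log" below is
# invented for illustration.

from collections import Counter

query_log = Counter({
    "information retrieval": 120,
    "information visualization": 45,
    "interface design": 30,
    "user interface design": 80,
})

def suggest(prefix, k=3):
    """Return up to k logged queries starting with the typed prefix,
    most frequent first."""
    matches = [(q, n) for q, n in query_log.items() if q.startswith(prefix)]
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return [q for q, _ in matches[:k]]

if __name__ == "__main__":
    print(suggest("info"))  # ['information retrieval', 'information visualization']
```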

 
  • Offer efficient and informative feedback,
  • Balance user control with automated actions,
  • Reduce short-term memory load,
  • Provide shortcuts,
  • Reduce errors,
  • Recognize the importance of small details, and
  • Recognize the importance of aesthetics.
 


2014年2月21日星期五

Muddiest Point

I am still confused about the Cranfield method; could you please provide more examples of it?

What are the main strategies for the global and local methods of relevance feedback?

Chapter 9: Relevance Feedback and Query Expansion

Global Techniques: Global methods are techniques for expanding or reformulating query terms independent of the query and results returned from it, so that changes in the query wording will cause the new query to match other semantically similar terms.

Local Techniques: Local methods adjust a query relative to the documents that initially appear to match the query.

The Rocchio algorithm for relevance feedback
The underlying theory. 
An application of Rocchio’s algorithm: some documents have been labeled as relevant and nonrelevant, and the initial query vector is moved in response to this feedback (see the sketch below).
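A minimal sketch of the standard Rocchio update, q_new = α·q + β·centroid(relevant) − γ·centroid(nonrelevant), with document vectors represented as plain dicts of term weights; the α, β, γ values are common illustrative defaults, not prescribed ones.

```python
# Rocchio relevance feedback: move the query vector toward the centroid of
# documents judged relevant and away from the centroid of documents judged
# nonrelevant. Vectors are dicts mapping terms to weights.

from collections import defaultdict

def centroid(doc_vectors):
    acc = defaultdict(float)
    for vec in doc_vectors:
        for term, w in vec.items():
            acc[term] += w / len(doc_vectors)
    return acc

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    rel_c = centroid(relevant) if relevant else {}
    nonrel_c = centroid(nonrelevant) if nonrelevant else {}
    terms = set(query) | set(rel_c) | set(nonrel_c)
    new_query = {}
    for t in terms:
        w = (alpha * query.get(t, 0.0)
             + beta * rel_c.get(t, 0.0)
             - gamma * nonrel_c.get(t, 0.0))
        if w > 0:                      # negative weights are usually dropped
            new_query[t] = w
    return new_query

if __name__ == "__main__":
    q = {"laptop": 1.0}
    rel = [{"laptop": 0.5, "notebook": 0.8}]
    nonrel = [{"laptop": 0.2, "bag": 0.9}]
    print(rocchio(q, rel, nonrel))   # "notebook" gets pulled into the query
```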

Probabilistic relevance feedback
Rather than reweighting the query in a vector space, if a user has told us some relevant and nonrelevant documents, then we can proceed to build a classifier. One way of doing this is with a Naive Bayes probabilistic model.
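A rough sketch of that idea: estimate P(term | relevant) and P(term | nonrelevant) from the judged documents with add-one smoothing, then score unseen documents by their log odds of relevance. Binary term features and all names here are illustrative, not the book's exact formulation.

```python
# Probabilistic relevance feedback as a Naive Bayes classifier: estimate
# term probabilities from judged documents (add-one smoothing) and score
# a document by the log-odds of relevance under independence assumptions.

import math

def train(relevant_docs, nonrelevant_docs, vocab):
    p_rel, p_nonrel = {}, {}
    for t in vocab:
        df_rel = sum(1 for d in relevant_docs if t in d)
        df_non = sum(1 for d in nonrelevant_docs if t in d)
        p_rel[t] = (df_rel + 1) / (len(relevant_docs) + 2)
        p_nonrel[t] = (df_non + 1) / (len(nonrelevant_docs) + 2)
    return p_rel, p_nonrel

def log_odds(doc, p_rel, p_nonrel):
    score = 0.0
    for t in p_rel:
        if t in doc:
            score += math.log(p_rel[t] / p_nonrel[t])
        else:
            score += math.log((1 - p_rel[t]) / (1 - p_nonrel[t]))
    return score

if __name__ == "__main__":
    rel = [{"laptop", "battery"}, {"laptop", "screen"}]
    non = [{"laptop", "bag"}]
    vocab = {"laptop", "battery", "screen", "bag"}
    p_r, p_n = train(rel, non, vocab)
    print(round(log_odds({"laptop", "battery"}, p_r, p_n), 3))
```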

Misspellings. If the user spells a term in a different way to the way it is spelled in any document in the collection, then relevance feedback is unlikely to be effective.

Cross-language information retrieval. Documents in another language are not nearby in a vector space based on term distribution. Rather, documents in the same language cluster more closely together.

Mismatch of searcher’s vocabulary versus collection vocabulary. If the user searches for laptop but all the documents use the term notebook computer, then the query will fail, and relevance feedback is again most likely ineffective.

Relevance feedback on the web
Relevance feedback has been shown to be very effective at improving relevance of results. Its successful use requires queries for which the set of relevant documents is medium to large. Full relevance feedback is often onerous for the user, and its implementation is not very efficient in most IR systems. In many cases, other types of interactive retrieval may improve relevance by about as much with less work.

Relevance Feedback Revisited

Cranfield Method
1400 documents
The vector space model uses all terms in retrieved relevant documents for query expansion, but Harman (1988) showed that adding only selected terms from retrieved relevant documents was more effective than adding all the terms, at least for the one test collection used (Cranfield 1400).

Past studies of relevance feedback:
The Vector Space Model

The Probabilistic Model

Croft and Harper (1979) extended this weighting scheme by suggesting effective initial search methods based on showing that the weighting scheme reduced to IDF weighting when no documents have been seen (the initial search). In 1983 Croft further extended the probabilistic weighting by adapting the
probabilistic model to handle within document frequency weights.

Query Expansion: 
Since the sorting techniques used have a major effect on performance using query expansion, four new sorting techniques were investigated. Unlike the earlier techniques which were based on products of factors known to be important in selecting terms (such as the total frequency of a term in the
relevant subset, the IDF of a term, etc.), these new sorts were all related to ratios or probabilities of terms occurring in relevant documents vs occurring in non-relevant documents.
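A simplified sketch of ratio-based term sorting: candidate expansion terms are ordered by how much more often they occur in retrieved relevant documents than in retrieved non-relevant ones. The smoothing constant and scoring function here are illustrative stand-ins for the sorts actually studied in the paper.

```python
# Sort candidate expansion terms by the ratio of their occurrence rate in
# retrieved relevant documents to their rate in retrieved non-relevant
# documents (simplified, illustrative version).

def rank_expansion_terms(relevant_docs, nonrelevant_docs, epsilon=0.5):
    candidates = set().union(*relevant_docs) if relevant_docs else set()
    scored = []
    for term in candidates:
        rel_rate = sum(1 for d in relevant_docs if term in d) / len(relevant_docs)
        non_rate = sum(1 for d in nonrelevant_docs if term in d) / max(len(nonrelevant_docs), 1)
        scored.append((term, (rel_rate + epsilon) / (non_rate + epsilon)))
    return sorted(scored, key=lambda x: x[1], reverse=True)

if __name__ == "__main__":
    rel = [{"query", "expansion", "feedback"}, {"expansion", "terms"}]
    non = [{"query", "terms"}]
    # Terms frequent in relevant but rare in non-relevant documents rank first.
    print(rank_expansion_terms(rel, non))
```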

Relevance Feedback in Multiple Iterations:
Three major recommendations can be made from summarizing the results from term weighting and query expansion using a probabilistic model.


A Study of Methods for Negative Relevance Feedback

Topic: exploring techniques for improving negative relevance feedback

Statement of the problem:
Given a query Q and a document collection C, a retrieval system returns a ranked list of documents L. L_i denotes the i-th ranked document in the list. We assume that Q is so difficult that all of the top f ranked documents (seen so far by a user) are non-relevant. The goal is to study how to use these negative examples, i.e., N = {L_1, ..., L_f}, to rerank the next r unseen documents in the original ranked list: U = {L_{f+1}, ..., L_{f+r}}. We set f = 10 to simulate that the first page of search results is irrelevant, and set r = 1000.

Score combination:

Q is the query and D a document. When β = 0, no negative feedback is performed; a nonzero β penalizes documents that score highly against the negative query (see the sketch below, which assumes the subtractive form S_new(Q, D) = S(Q, D) − β·S(Q_neg, D)).

Two methods:
Local method: rank all the documents in U by the negative query and penalize the top p documents.
Global method: rank all the documents in C by the negative query; select, from the top p documents of this ranked list, those documents in U to penalize.
VECTOR SPACE MODEL
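A minimal sketch of the score-combination idea in a vector-space-like setting, assuming the subtractive form S_new(Q, D) = S(Q, D) − β·S(Q_neg, D), together with the local variant that penalizes only the top p unseen documents ranked by the negative query; the score dictionaries and parameter values are invented for illustration.

```python
# Negative-feedback score combination: build a "negative query" from the
# non-relevant top-f documents, then penalize documents that score highly
# against it. The local variant penalizes only the top-p documents of U
# when ranked by the negative query. All values are illustrative.

def combine_scores(orig_scores, neg_scores, beta=0.5):
    """orig_scores, neg_scores: dicts mapping doc id -> score."""
    return {d: orig_scores[d] - beta * neg_scores.get(d, 0.0) for d in orig_scores}

def local_negative_feedback(unseen_ranked, orig_scores, neg_scores, beta=0.5, p=10):
    """Penalize only the p unseen documents that best match the negative query."""
    penalized = set(sorted(unseen_ranked, key=lambda d: neg_scores.get(d, 0.0),
                           reverse=True)[:p])
    return sorted(unseen_ranked,
                  key=lambda d: orig_scores[d] - (beta * neg_scores.get(d, 0.0)
                                                  if d in penalized else 0.0),
                  reverse=True)

if __name__ == "__main__":
    unseen = ["d11", "d12", "d13"]
    orig = {"d11": 0.9, "d12": 0.8, "d13": 0.7}
    neg = {"d11": 0.95, "d12": 0.1, "d13": 0.2}
    print(combine_scores(orig, neg, beta=0.5))
    # d11 matches the negative query strongly and drops to the bottom.
    print(local_negative_feedback(unseen, orig, neg, beta=0.5, p=1))
```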

conclusions: 
Negative feedback is very important because it can help a user when search results are very poor.
This work inspires several future directions. First, we can study a more principled way to model multiple negative models and use them to conduct constrained query expansion, for example, avoiding terms that appear in the negative models. Second, we are interested in a learning framework that can utilize both a little positive information (the original queries) and a certain amount of negative information to learn a ranking function to help difficult queries. Third, queries are difficult for different reasons; identifying these reasons and customizing negative feedback strategies accordingly would be well worth studying.






2014年2月18日星期二

Notes: Improving the effectiveness of information retrieval with local context analysis

Problem: for automatic query expansion, which is better, the global or the local method?

Techniques for automatic expansion:
1. Term clustering (a global technique)
2. Local feedback: expansion based on the top-ranked documents retrieved for the query
Effect: local feedback can improve retrieval results
3. New method: local context analysis, a co-occurrence analysis technique that uses the top-ranked documents (passages) for expansion

Global Techniques:
1. Term clustering: group terms based on their co-occurrence
Problem: it cannot handle ambiguous terms
2. Dimensionality reduction
3. PhraseFinder: the most successful global technique so far

Local Techniques:
1. local feedback
2. Local context analysis:
The most critical function of a local feedback algorithm is to separate terms in the top-ranked relevant documents from those in top-ranked nonrelevant documents.

The most frequent terms (except stopwords) in the top-ranked documents are used for query expansion.

HYPOTHESIS. A common term from the top-ranked relevant documents will tend to cooccur with all query terms within the top-ranked documents.

Method:
1. Build co-occurrence metrics
2. Combine the degrees of co-occurrence with all query terms
3. Differentiate rare and common query terms


 problems [Ponte 1998].
In summary, local context analysis takes these steps to expand a query Q on a collection C:
(1) Perform an initial retrieval on C to get the top-ranked set S for Q.
(2) Rank the concepts in the top-ranked set using the formula f(c, Q).
(3) Add the best k concepts to Q.
Figure 1 shows an example query expanded by local context analysis (a simplified sketch follows below).
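A simplified sketch of that loop: retrieve top-ranked passages, score candidate concepts by how consistently they co-occur with every query term, and add the best k concepts. The scoring function below is an illustrative stand-in for the paper's f(c, Q), which also uses idf-style weighting.

```python
# Simplified local context analysis loop: score candidate expansion
# concepts by co-occurrence with each query term in the top-ranked
# passages, then add the best k concepts to the query.

import math

def lca_expand(query_terms, top_passages, k=5):
    candidates = set().union(*top_passages) - set(query_terms)
    scores = {}
    for c in candidates:
        score = 1.0
        for q in query_terms:
            co = sum(1 for p in top_passages if c in p and q in p)
            score *= math.log(1 + co) + 0.1   # small constant avoids zeroing out
        scores[c] = score
    best = sorted(scores, key=scores.get, reverse=True)[:k]
    return list(query_terms) + best

if __name__ == "__main__":
    passages = [{"laptop", "battery", "life"},
                {"laptop", "battery", "charger"},
                {"laptop", "screen"}]
    # Concepts that co-occur with BOTH query terms rank highest.
    print(lca_expand(["laptop", "battery"], passages, k=2))
```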

Applications of local context analysis:
1. Cross-language retrieval
2. Topic segmentation
3. Distributed IR: choosing the right collections to search

The experimental conclusion:

2014年2月14日星期五

Evaluation

IIR chapter 8. OR MIR 3 
2. Karen Sparck Jones, (2006). What's the value of TREC: is there a gap to jump or a chasm to bridge? ACM SIGIR Forum, Volume 40 Issue 1 June 2006 http://doi.acm.org/10.1145/1147197.1147198 
3. Kalervo Järvelin, Jaana Kekäläinen. (2002) Cumulated gain-based evaluation of IR techniques ACM Transactions on Information Systems (TOIS) Volume 20 , Issue 4 (October 2002) Pages: 422 – 446  http://doi.acm.org/10.1145/582415.582418  


The TREC Programme has been very successful at generalising. It has shown that essentially simple methods of retrieving documents (standard statistically-based VSM, BM25, InQuery, etc) can give decent ‘basic benchmark’ performance across a range of datasets, including quite large ones. But retrieval contexts vary widely, much more widely than the contexts the TREC datasets embody. The TREC Programme has sought to address variation, but it has done this in a largely ad hoc and unsystematic way. Thus even while allowing for the Programme’s concentration on core retrieval system functionality, namely the ability to retrieve relevant documents, and excluding for now the tasks it has addressed that do not fall under the document retrieval heading, the generalisation the Programme seeks, or is assumed to be seeking, is incomplete. However, rather than simply continuing with the generalisation mission, intended to gain more support for the current findings, it may be time now to address particularisation.


Environment and contexts:
The simple historic model for environment variation within this laboratory paradigm was of micro variation, i.e. of change to the request set - say plain or fancy - or to the relevance criteria and hence relevance set - say high only or high and partial - for the same set of documents; less commonly there has been change to the document set while holding request or relevance criteria/practice constant.

Changing all of D, Q and R might seem to imply more than micro variation, or at least could be deemed
to do so if the type of request and/or style of relevance assessment changed, not merely the actual document sets. Such variation, embodying a new form of need as well as new documents to draw on in trying to meet it, might be deemed to constitute macro variation rather than micro variation, and therefore as implicitly enlarging the TREC Programme's reference to contexts.

TREC Strategies: 
This TREC failure is hardly surprising. In TREC, as in many other retrieval experiment situations, there is normally no material access to the encompassing wider context and especially to the actual users, whether because such real working contexts are too remote or because, fundamentally, they do not exist as prior, autonomous realities.

The factors framework refers to Input Factors (IF), Purpose Factors (PF), and Output Factors (OF).
Input Factors and Purpose Factors constrain the set of choices for Output Factors, but for any complex task cannot simply determine them. Under IF for summarising we have properties of the source texts. This includes their Form, Subject type, and Units, which subsume a series of subfactors, as illustrated in Figure 2. It is not difficult to see that such factors apply, in the retrieval case, to documents. They will also apply, in retrieval, to requests, past and current, and to any available past relevant document sets. The particular characterisations of documents and requests may of course be different, e.g. documents but not requests might have a complex structure.

retrieval case - the setup with both system and context - offers.
The non-TREC literature refers to many studies of individual retrieval setups: what they are about, what they are for, how they seem to be working, how they might be specifically improved to serve their purposes better. The generalisation goal that the automated system research community has sought to achieve has worked against getting too involved in the particularities of any individual setups. My argument here is that, in the light of the generalisation we have achieved, we now need to revisit particularity. That is, to try to work with test data that is tied to an accessible and rich setup, that can be analysed for what it suggests to guide system development as well as for what it offers for fuller performance assessment. We need to start from the whole setup, not just from the system along with whatever we happen to be able to pull pretty straightforwardly from the setup into our conventional D * Q * R environment model.

Cumulated Gain-Based Evaluation of IR Techniques
Modern large retrieval environments tend to overwhelm their users by their large output. Since all documents are not of equal relevance to their users, highly relevant documents, or document components, should be identified and ranked first for presentation. This is often desirable from the user point of view. In order to develop IR techniques in this direction, it is necessary to develop evaluation scenarios and measures that credit IR methods for their ability to retrieve highly relevant documents.


The second point above stated that the greater the ranked position of a relevant document, the less valuable it is for the user, because the less likely it is that the user will ever examine the document due to time, effort, and cumulated information from documents already seen. This leads to comparison of IR techniques through test queries by their cumulated gain based on document rank with a rank-based discount factor. The greater the rank, the smaller the share of the document score that is added to the cumulated gain.
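A minimal sketch of discounted cumulated gain and its normalized form; it uses the common log2(rank + 1) discount, which is a slight simplification of the paper's log_b discount applied only beyond rank b, and the gain values below are made up.

```python
# Cumulated gain with a rank-based discount (DCG) and its normalized form
# (nDCG). Gains are graded relevance judgments of the ranked results.

import math

def dcg(gains):
    # Ranks start at 1, so rank i gets discount log2(i + 1).
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(gains):
    ideal_dcg = dcg(sorted(gains, reverse=True))   # ideal (re)ordering
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

if __name__ == "__main__":
    # Graded relevance (0-3) of the ten top-ranked documents for one query.
    ranking = [3, 2, 3, 0, 0, 1, 2, 2, 3, 0]
    print(round(dcg(ranking), 3), round(ndcg(ranking), 3))
```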

The normalized recall measure (NR, for short; Rocchio [1966] and Salton and McGill [1983]), the sliding ratio measure (SR, for short; Pollack [1968] and Korfhage [1997]), and the satisfaction—frustration—total measure (SFT, for short; Myaeng and Korfhage [1990] and Korfhage [1997]) all seek to take into account the order in which documents are presented to the user. The NR measure compares the actual performance of an IR technique to the ideal one (when all relevant documents are retrieved first). Basically it measures the area between the ideal and the actual curves. NR does not take the degree of document relevance into account and is highly sensitive to the last relevant document found late in the ranked order.

They first propose the use of each relevance level separately in recall and precision calculation. Thus different P–R curves are drawn for each level. Performance differences at different relevance levels between IR techniques may thus be analyzed. Furthermore, they generalize recall and precision calculation to directly utilize graded document relevance scores.

The nonbinary relevance judgments were obtained by rejudging documents judged relevant by NIST assessors and about 5% of irrelevant documents for each topic. The new judgments were made by six Master’s students of information studies, all of them fluent in English although not native speakers. The relevant and irrelevant documents were pooled, and the judges did not know the number of documents previously judged relevant or irrelevant in the pool.

The proposed measures are based on several parameters: the last rank considered, the gain values to employ, and discounting factors to apply. An experimenter needs to know which parameter values and combinations to use. In practice, the evaluation context and scenario should suggest these values.

2014年2月7日星期五

Probabilistic Information Retrieval and other models

Muddiest Point:
1. What are the differences between the Bayes model and Bayesian network models?
2. Which IR models are closely related to other similar methods?
3. What are the disadvantages of choosing probabilistic models?

The probability models:
The conditional event: P(A, B) = P(A ∩ B) = P(A|B)P(B) = P(B|A)P(A)
From these we can derive Bayes’ Rule for inverting conditional probabilities:
Bayes' Rule:
P(A|B) = P(B|A)P(A) / P(B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|Ā)P(Ā)], where Ā denotes the complement of A.
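A tiny numeric check of Bayes' rule with made-up numbers, reading A as "the document is relevant" and B as "the document contains a given query term".

```python
# Tiny numeric check of Bayes' rule with invented probabilities.

p_rel = 0.2                      # P(R = 1), prior probability of relevance
p_term_given_rel = 0.7           # P(term | R = 1)
p_term_given_nonrel = 0.1        # P(term | R = 0)

# Total probability of observing the term.
p_term = p_term_given_rel * p_rel + p_term_given_nonrel * (1 - p_rel)

# Bayes' rule: posterior probability of relevance given the term.
p_rel_given_term = p_term_given_rel * p_rel / p_term
print(round(p_rel_given_term, 3))   # 0.636
```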

The 1/0 loss case
In the simplest case of the PRP, there are no retrieval costs or other utility concerns that would differentially weight actions or errors. You lose a point for either returning a nonrelevant document or failing to return a relevant document (such a binary situation where you are evaluated on your accuracy is called 1/0 loss). The goal is to return the best possible results as the top k documents, for any value of k the user chooses to examine. The PRP then says to simply rank all documents in decreasing order of P(R = 1|d, q). If a set of retrieval results is to be returned, rather than an ordering, the Bayes Optimal Decision Rule is to return the documents that are more likely relevant than nonrelevant, i.e., those with P(R = 1|d, q) > P(R = 0|d, q).

C0 · P(R = 0|d) − C1 · P(R = 1|d) ≤ C0 · P(R = 0|d′) − C1 · P(R = 1|d′)

The Binary Independence Model: we first present a model that assumes the user has a single-step information need. As discussed in Chapter 9, seeing a range of results might let the user refine their information need. Fortunately, as mentioned there, it is straightforward to extend the Binary Independence Model so as to provide a framework for relevance feedback, and that extension is presented later in the chapter.

Since each x_t is either 0 or 1, we can separate the terms to give:

O(R|x, q) = O(R|q) · ∏_{t: x_t = 1} [ P(x_t = 1 | R = 1, q) / P(x_t = 1 | R = 0, q) ] · ∏_{t: x_t = 0} [ P(x_t = 0 | R = 1, q) / P(x_t = 0 | R = 0, q) ]

Adding ½ in this way is a simple form of smoothing. For trials with categorical outcomes (such as noting the presence or absence of a term), one way to estimate the probability of an event from data is simply to count the number of times the event occurred divided by the total number of trials. This is referred to as the relative frequency of the event. Estimating the probability as the relative frequency is the maximum likelihood estimate (MLE), because this value makes the observed data maximally likely.
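Putting the odds decomposition and the add-½ smoothing together, here is a sketch of Binary Independence Model scoring: estimate p_t and u_t from judged documents and sum the log odds ratios (c_t) over query terms present in the document. The toy judged sets are invented for illustration.

```python
# Binary Independence Model sketch: estimate p_t = P(x_t = 1 | R = 1) and
# u_t = P(x_t = 1 | R = 0) with add-1/2 smoothing from judged documents,
# then score a document by the retrieval status value (sum of the log odds
# ratios c_t for the query terms it contains).

import math

def term_weight(term, relevant_docs, nonrelevant_docs):
    vr = sum(1 for d in relevant_docs if term in d)      # relevant docs with term
    vn = sum(1 for d in nonrelevant_docs if term in d)   # nonrelevant docs with term
    p_t = (vr + 0.5) / (len(relevant_docs) + 1)
    u_t = (vn + 0.5) / (len(nonrelevant_docs) + 1)
    return math.log(p_t / (1 - p_t)) + math.log((1 - u_t) / u_t)   # c_t

def rsv(query_terms, doc, relevant_docs, nonrelevant_docs):
    return sum(term_weight(t, relevant_docs, nonrelevant_docs)
               for t in query_terms if t in doc)

if __name__ == "__main__":
    rel = [{"laptop", "battery"}, {"laptop", "screen"}]
    non = [{"laptop", "bag"}, {"bag", "strap"}]
    print(round(rsv(["laptop", "battery"], {"laptop", "battery", "life"}, rel, non), 3))
```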

Finite automata and language models
If instead each node has a probability distribution over generating different terms, we have a language model. The notion of a language model is inherently probabilistic. A language model is a function that puts a probability measure over strings drawn from some vocabulary. That is, for a language model M over an alphabet Σ:

Under the unigram language model the order of words is irrelevant, and so such models are often called “bag of words” models, as discussed in Chapter 6 (page 117). Even though there is no conditioning on preceding context, this model nevertheless still gives the probability of a particular ordering of terms. However, any other ordering of this bag of terms will have the same probability. So, really, we have a multinomial distribution over words. So long as we stick to unigram models, the language model name and motivation could be viewed as historical rather than necessary.
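A minimal sketch of a unigram ("bag of words") language model for ranking, with Jelinek-Mercer smoothing against the collection model; the λ value and toy documents are illustrative.

```python
# Unigram language model sketch with Jelinek-Mercer smoothing:
#     P(t | d) = lam * tf_d(t)/|d| + (1 - lam) * P(t | collection)
# Documents are ranked by the (log) probability their model assigns to the
# query; word order is ignored, as the notes describe.

import math
from collections import Counter

def query_likelihood(query_terms, doc_terms, collection_terms, lam=0.5):
    doc_counts, coll_counts = Counter(doc_terms), Counter(collection_terms)
    doc_len, coll_len = len(doc_terms), len(collection_terms)
    log_p = 0.0
    for t in query_terms:
        p_doc = doc_counts[t] / doc_len if doc_len else 0.0
        p_coll = coll_counts[t] / coll_len if coll_len else 0.0
        p = lam * p_doc + (1 - lam) * p_coll
        log_p += math.log(p) if p > 0 else float("-inf")
    return log_p

if __name__ == "__main__":
    docs = {"d1": "laptop battery life battery".split(),
            "d2": "laptop bag and strap".split()}
    collection = [t for terms in docs.values() for t in terms]
    query = "laptop battery".split()
    ranked = sorted(docs, key=lambda d: query_likelihood(query, docs[d], collection),
                    reverse=True)
    print(ranked)   # d1 scores higher than d2 for this query
```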