Personal Web Revisitation by Context and Content Keywords with Relevance Feedback



Getting back to previously viewed web pages is a common yet difficult task because of the large volume of information that users access on the web. This paper draws on the natural human recall process, which relies on episodic and semantic memory cues, and presents a personal web revisitation technique called WebPagePrev that works through context and content keywords. The underlying techniques for acquiring, storing, decaying, and using context and content memories for page re-finding are discussed. A relevance feedback mechanism tailors the technique to each individual’s memory strength and revisitation habits. Our 6-month user study shows that: (1) Compared with the existing web revisitation tool Memento, the History List Searching method, and the Search Engine method, the proposed WebPagePrev delivers the best re-finding quality in finding rate (92.10%), average F1-measure (0.4318), and average rank error (0.3145). (2) Our dynamic management of context and content memories, including the decay and reinforcement strategy, can mimic users’ retrieval and recall mechanism. With relevance feedback, the finding rate of WebPagePrev increases by 9.82%, the average F1-measure increases by 47.09%, and the average rank error decreases by 19.44% compared with the stable memory management strategy. Among the time, location, and activity context factors in WebPagePrev, activity is the best recall cue, and re-finding based on context + content delivers the best performance compared with re-finding based on context alone or content alone.






A number of techniques and tools, such as bookmarks, history tools, search engines, metadata annotation and exploitation, and contextual recall systems, have been developed to support personal web revisitation. The work most closely related to this study is the Memento system, which unifies context and content to aid web revisitation. Memento defines the context of a web page as the other pages in the browsing session that immediately precede or follow it, and extracts topic phrases from these browsed pages based on the Wikipedia topic list. In comparison, the context information considered in this work includes access time, location, and concurrent activities automatically inferred from the user’s computer programs. Instead of extracting content items from the full web page as Memento does, we extract them from the page segments displayed on the screen in the user’s view. We assign a probabilistic value to each extracted term based on the user’s page browsing behaviors (i.e., dwell time and highlighting), as well as the page’s subject headings and term frequency-inverse document frequency (tf-idf). This value reflects the user’s impression of the term and the likelihood of using it as a content recall cue.
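To make the term-weighting idea concrete, the following is a minimal Python sketch of how a probabilistic value might be assigned to each term of a displayed page segment. The source does not give the actual formula, so everything here is an assumption for illustration: the function name `score_terms`, the smoothed idf, the boost weights for dwell time, highlighting, and headings, and the normalization by the top score are all hypothetical.

```python
import math
from collections import Counter


def score_terms(segment_tokens, corpus, dwell_seconds, highlighted=(), headings=()):
    """Return {term: value in (0, 1]} for the terms of one displayed segment.

    Assumed scheme (not the paper's formula): tf-idf times a behavior boost,
    normalized so the strongest term gets 1.0.
    """
    counts = Counter(segment_tokens)
    total = len(segment_tokens)
    dwell = min(dwell_seconds / 60.0, 1.0)  # assumed: dwell saturates at 60 s
    raw = {}
    for term, count in counts.items():
        tf = count / total
        df = sum(1 for doc in corpus if term in doc)           # document frequency
        idf = math.log((1 + len(corpus)) / (1 + df)) + 1.0     # smoothed idf
        boost = 1.0 + dwell
        if term in highlighted:
            boost += 1.0                                       # assumed weight
        if term in headings:
            boost += 0.5                                       # assumed weight
        raw[term] = tf * idf * boost
    if not raw:
        return {}
    top = max(raw.values())
    return {term: score / top for term, score in raw.items()}


# Usage: one segment viewed for 30 s, with the word "cue" highlighted.
corpus = [{"page", "web"}, {"page", "memory"}, {"page"}]
probs = score_terms(["memory", "cue", "memory", "page"], corpus,
                    dwell_seconds=30, highlighted={"cue"})
```

In this toy run the highlighted, rare term "cue" ends up ranked above the more frequent "memory", and the ubiquitous "page" ranks last, which is the qualitative behavior the paragraph describes.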




  • Enable users to search for contextually related activities (e.g., time, location, concurrent activities, meetings, music playing, an interrupting phone call, or even other files or web sites that were open at the same time)
  • Find a target piece of information (often not semantically related) from the time when that context was active
  • To tailor to individual’s web revisitation characteristics, as well as human user’s context 





Our personal web revisitation framework with relevance feedback consists of two main phases. (1) Preparation for web revisitation. When a user accesses a web page that is likely to be revisited later (i.e., the page access time is over a threshold), the context acquisition and management module captures the current access context (i.e., time, location, and activities inferred from the currently running computer programs) into a probabilistic context tree. Meanwhile, the content extraction and management module performs unigram-based extraction from the displayed page segments and obtains a list of probabilistic content terms. The probabilities of the acquired context instances and extracted content terms reflect how likely the user is to refer to them as memory cues to get back to the previously focused page. (2) Web revisitation. Later, when the user requests to get back to a previously focused page through context and/or content keywords, the re-access by context keywords module and the re-access by content keywords module search the probabilistic context tree repository and the probabilistic term list repository, respectively.
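The two phases can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the class names `ContextNode` and `PageMemory`, the functions `remember` and `candidates`, the 30-second threshold, and the exact-match keyword lookup are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ContextNode:
    title: str                 # e.g. "location" or "office"
    prob: float = 1.0          # how likely the user recalls this node as a cue
    children: list = field(default_factory=list)

    def titles(self):
        """Yield the titles of this node and all of its descendants."""
        yield self.title
        for child in self.children:
            yield from child.titles()


@dataclass
class PageMemory:
    url: str
    tree: ContextNode          # probabilistic context tree
    terms: dict                # content term -> probability

MIN_DWELL_SECONDS = 30.0       # assumed value for the access-time threshold


def remember(store, url, dwell_seconds, context, terms):
    """Phase 1: capture context and content for a page viewed long enough.

    `context` maps a factor ("time", "location", "activity") to
    {instance: probability}.
    """
    if dwell_seconds < MIN_DWELL_SECONDS:
        return False
    root = ContextNode("context")
    for factor, instances in context.items():
        node = ContextNode(factor)
        node.children = [ContextNode(name, p) for name, p in instances.items()]
        root.children.append(node)
    store[url] = PageMemory(url, root, dict(terms))
    return True


def candidates(store, context_keywords=(), content_keywords=()):
    """Phase 2: pages whose tree and term list contain all given keywords."""
    hits = []
    for page in store.values():
        titles = set(page.tree.titles())
        if (all(k in titles for k in context_keywords)
                and all(k in page.terms for k in content_keywords)):
            hits.append(page.url)
    return hits


store = {}
remember(store, "a", 45, {"location": {"office": 0.9}}, {"memory": 0.7})
remember(store, "b", 5, {"location": {"office": 0.9}}, {"memory": 0.7})  # too brief
```

Here page "b" is never stored because it was viewed for less than the assumed threshold, so a later query for the keywords "office" and "memory" can only return page "a".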




  • We present a personal web revisitation technique, called WebPagePrev, that allows users to get back to their previously focused pages through access context and page content keywords. The underlying techniques for acquiring, storing, and using context and content memories for web page recall are discussed.
  • We develop dynamic tuning strategies based on relevance feedback (e.g., weight preference calculation and decay rate adjustment) that tailor the technique to each individual’s memorization strength and recall habits and improve performance.
  • We evaluate the effectiveness of the proposed technique WebPagePrev and report the findings (e.g., the importance of context and content factors) in web revisitation through a 6-month user study with 21 participants.
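As a rough illustration of memory decay with feedback-driven rate adjustment, consider the Python sketch below. The source does not state the decay function or the update rule, so the exponential form, the multiplicative step, and the names `decayed` and `adjust_decay_rate` are assumptions made purely for illustration.

```python
import math


def decayed(prob, days_elapsed, decay_rate):
    """Assumed exponential decay of a cue's probability over time."""
    return prob * math.exp(-decay_rate * days_elapsed)


def adjust_decay_rate(decay_rate, found, step=0.1, floor=0.001):
    """Assumed feedback rule: a successful re-find with a cue suggests the
    user's memory lasts longer, so the rate is lowered; a miss raises it."""
    factor = 1.0 - step if found else 1.0 + step
    return max(floor, decay_rate * factor)


# Usage: a cue stored with probability 0.8, queried 10 days later.
rate = 0.05
before = decayed(0.8, 10, rate)
rate = adjust_decay_rate(rate, found=True)   # user confirmed the result
after = decayed(0.8, 10, rate)               # decays more slowly now
```

A per-user, per-cue rate of this kind is one simple way to let relevance feedback shape how quickly stored context and content memories fade.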




  • Preparation for web revisitation 
  • Web revisitation 
  • Web Revisitation By Context And Content Keywords




Algorithm 1: Web Page Revisitation Algorithm


Input: a revisit query Q(W, Qc, Qd, t)
Output: Wm

Trees = getMatchContextTrees(W, Qc, t);
Lists = getMatchTermLists(W, Qd, t);
determine candidate matched page set Wc based on Trees and Lists;
foreach w ∈ Wc do
    split w#tree into n smallest subtrees w#tree_sub_i (i = 1, …, n);
    for i = 1; i ≤ n; i++ do
        determine matched nodes V_sub_i of w#tree_sub_i;
        foreach v ∈ V_sub_i do
            if v has a matched child node in V_sub_i then
                delete v from V_sub_i;
            mAs(Qc, v, t) = (|Qc ∩ v.title| / |v.title|) · cAs(w, v, t);
        cRank(w#tree_sub_i | Qc, t) = Π_{v ∈ V_sub_i} mAs(Qc, v, t);
    cRank(w#tree | Qc, t) = Σ_{i=1}^{n} cRank(w#tree_sub_i | Qc, t);
    dRank(w#list | Qd, t) = Π_{qd ∈ Qd} dIs(w, qd, t);
    Rank(w | Q, t) = cRank(w#tree | Qc, t) · dRank(w#list | Qd, t);
determine the matched page w* with the highest ranking score;
foreach w ∈ Wc do
    if Rank(w | Q, t) < τ · Rank(w* | Q, t) then
        determine W′c by deleting w from Wc;
Wm = Quicksort(W′c, Rank(W′c | Q, t));
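The ranking steps of Algorithm 1 can be sketched in Python as follows. This is a simplified reading under several assumptions that the listing does not confirm: the "smallest subtrees" are taken to be the children of the tree root, stored node and term scores stand in for cAs and dIs already evaluated at query time t, a subtree with no matched node contributes 0, an empty context query gives a neutral cRank of 1, and the candidate set is approximated by pages with a nonzero rank. The names `Node`, `c_rank`, `d_rank`, `revisit`, and the threshold parameter `tau` are hypothetical.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Node:
    title: set                 # words of the node title
    score: float               # stands in for cAs(w, v, t)
    children: list = field(default_factory=list)


def walk(node):
    yield node
    for child in node.children:
        yield from walk(child)


def c_rank(tree, qc):
    """Sum over subtrees of the product of mAs over the kept matched nodes."""
    if not qc:
        return 1.0             # assumption: no context keywords is neutral
    total = 0.0
    for sub in tree.children:  # assumption: subtrees = children of the root
        matched = [v for v in walk(sub) if qc & v.title]
        # Drop a matched node when one of its children is matched too.
        kept = [v for v in matched if not any(qc & c.title for c in v.children)]
        if kept:
            total += prod(len(qc & v.title) / len(v.title) * v.score for v in kept)
    return total


def d_rank(terms, qd):
    """Product of the stored term scores, standing in for dIs(w, qd, t)."""
    return prod(terms.get(q, 0.0) for q in qd)


def revisit(pages, qc, qd, tau=0.5):
    """Return page ids ordered by Rank, pruned below tau * best rank."""
    qc = set(qc)
    ranks = {w: c_rank(tree, qc) * d_rank(terms, qd)
             for w, (tree, terms) in pages.items()}
    ranks = {w: r for w, r in ranks.items() if r > 0}
    if not ranks:
        return []
    best = max(ranks.values())
    kept = [w for w, r in ranks.items() if r >= tau * best]
    return sorted(kept, key=ranks.get, reverse=True)


def tree(location, loc_score, activity, act_score):
    return Node({"context"}, 1.0, [
        Node({"location"}, 1.0, [Node({location}, loc_score)]),
        Node({"activity"}, 1.0, [Node(set(activity.split()), act_score)]),
    ])

pages = {
    "a": (tree("office", 0.9, "writing paper", 0.8), {"memory": 0.7}),
    "b": (tree("office", 0.4, "reading paper", 0.5), {"memory": 0.6}),
    "c": (tree("home", 0.9, "watching video", 0.9), {"memory": 0.9}),
}
result = revisit(pages, {"office", "paper"}, ["memory"], tau=0.5)
```

In this toy data, page "a" scores (0.9 + 0.5 · 0.8) · 0.7 = 0.91 and page "b" scores (0.4 + 0.5 · 0.5) · 0.6 = 0.39, so with `tau=0.5` only "a" survives the pruning step, while page "c" matches no context keyword and is never a candidate.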




