Masters Theses
Abstract
"The web is an index of real-world events and lot of knowledge can be mined from the web resources and their derivatives. Event detection is one recent research topic triggered from the domain of web data mining with the increasing popularity of search engines. In the visitor-centric approach, the click-through data generated by the web search engines is the start up resource with the intuition: often such data is event-driven. In this thesis, a retrospective algorithm is proposed to detect such real-world events from the click-through data. This approach differs from the existing work as it: (i) considers the click-through data as collaborative query sessions instead of mere web logs and try to understand user behavior (ii) tries to integrate the semantics, structure, and content of queries and pages (iii) aims to achieve the overall objective via Query Clustering. The problem of event detection is transformed into query clustering by generating clusters - hybrid cover graphs; each hybrid cover graph corresponds to a real-world event. The evolutionary pattern for the co-occurrences of query-page pairs in a hybrid cover graph is imposed for the quality purpose over a moving window period. Also, the approach is experimentally evaluated on a commercial search engine's data collected over 3 months with about 20 million web queries and page clicks from 650000 users. The results outperform the most recent work in this domain in terms of number of events detected, F-measures, entropy, recall etc."--Abstract, page iv.
Advisor(s)
Madria, Sanjay Kumar
Committee Member(s)
Leopold, Jennifer
Erçal, Fikret
Department(s)
Computer Science
Degree Name
M.S. in Computer Science
Sponsor(s)
Air Force Research Laboratory (Wright-Patterson Air Force Base, Ohio)
Publisher
Missouri University of Science and Technology
Publication Date
Fall 2010
Pagination
ix, 42 pages
Rights
© 2010 Prabhu Kumar Angajala, All rights reserved.
Document Type
Thesis - Open Access
File Type
text
Language
English
Subject Headings
Association rule mining
Data mining
Querying (Computer science)
Semantics -- Data processing
Thesis Number
T 9715
Print OCLC #
724116197
Electronic OCLC #
745911546
Recommended Citation
Angajala, Prabhu Kumar, "Event detection from click-through data via query clustering" (2010). Masters Theses. 4984.
https://scholarsmine.mst.edu/masters_theses/4984