Published: November 2012
Type: Research Technical Report
Institution: Institut AIFB, KIT
Place of publication: Karlsruhe
Large amounts of data are being produced daily as detailed records of Web usage behavior, but the task of deriving actionable knowledge from them remains a challenge. Investigations of user browsing behavior at multiple websites, while more beneficial than studies restricted to a single site, still need to tackle the problems of information heterogeneity and mapping usage logs to meaningful events from the application domain.
Focusing on the problem of modeling cross-site browsing behavior, we present a formalization approach based on a Web browsing Activity Model (WAM). We introduce a novel two-stage approach for the semantic enrichment of usage logs with domain knowledge, bringing together Semantic Web technologies and Machine Learning techniques. To learn the semantic types of log entries, we present a supervised multi-class classification formulation that deploys structural Support Vector Machines with new sequential input features.
We provide an implementation of these approaches and show the results of an evaluation with real-world data.
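The abstract does not spell out the sequential input features, so the following is only a minimal sketch of the general idea, not the report's implementation. It assumes that a log entry is a requested URL and that "sequential" means each entry is described by its own tokens plus those of its neighbours in the same browsing session. The function names, the `window` parameter, and the example URLs are hypothetical. The resulting feature sets could then be passed to any multi-class learner, such as a structural SVM.

```python
from urllib.parse import urlparse


def tokens(url):
    """Split a URL into its lowercase host and path tokens."""
    parsed = urlparse(url)
    parts = [parsed.netloc.lower()]
    parts += [p.lower() for p in parsed.path.split("/") if p]
    return parts


def sequential_features(session, window=1):
    """Build one feature set per log entry in a session.

    Each entry gets its own tokens (prefix "cur:") plus position-tagged
    tokens from up to `window` neighbours before ("prevK:") and after
    ("nextK:") it. This is an assumed feature design for illustration.
    """
    features = []
    for i, url in enumerate(session):
        feats = {f"cur:{t}" for t in tokens(url)}
        for offset in range(1, window + 1):
            if i - offset >= 0:
                feats |= {f"prev{offset}:{t}" for t in tokens(session[i - offset])}
            if i + offset < len(session):
                feats |= {f"next{offset}:{t}" for t in tokens(session[i + offset])}
        features.append(feats)
    return features


# Hypothetical session: three consecutive requests by one user.
session = [
    "http://shop.example.com/search",
    "http://shop.example.com/product/42",
    "http://shop.example.com/cart/add",
]
feats = sequential_features(session)
```

In this sketch the middle entry carries `cur:product`, `prev1:search`, and `next1:cart`, so a classifier can use the surrounding requests as evidence when assigning a semantic type to the product page view.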
Download: Media:JH paper 2013.pdf