Detecting Impolite Crawler by Using Time Series Analysis


Numerous web crawlers, especially impolite ones, visit websites every day to retrieve content, generating a higher access frequency than the sites can sustain. The heavy traffic from impolite crawlers seriously distorts the analysis of normal users and harms advertising income. In this paper, we present a method to detect impolite crawlers by using time series analysis. The method is applied to real web server log data. Compared with earlier methods that use only common log attributes as features, the method using time series features improves detection accuracy by at least 20%.
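The abstract does not list the specific time series features used, so the following is only an illustrative sketch of the general idea: turning each client's request timestamps into a per-window count series and summarizing it. The function name `time_series_features`, the 60-second window, and the chosen statistics are all assumptions made for illustration, not the paper's method.

```python
from collections import defaultdict
from statistics import mean, pstdev


def time_series_features(requests, window=60):
    """Summarize per-client request timing.

    requests: iterable of (client_id, unix_timestamp) pairs, e.g. parsed
    from a web server access log (the parsing step is not shown here).
    window: bucket size in seconds (an assumed value).
    """
    by_client = defaultdict(list)
    for client, ts in requests:
        by_client[client].append(ts)

    features = {}
    for client, stamps in by_client.items():
        stamps.sort()
        # Bucket requests into fixed windows; the counts form the time series.
        start = stamps[0]
        n_windows = int((stamps[-1] - start) // window) + 1
        counts = [0] * n_windows
        for ts in stamps:
            counts[int((ts - start) // window)] += 1
        # Gaps between consecutive requests from the same client.
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        features[client] = {
            "total_requests": len(stamps),
            "peak_per_window": max(counts),
            "mean_per_window": mean(counts),
            "std_per_window": pstdev(counts),
            "mean_gap": mean(gaps) if gaps else None,
        }
    return features
```

Features like these could be fed to any classifier alongside the common log attributes; a client with a high peak count and very small gaps is a plausible candidate for an impolite crawler, though the actual decision rule would have to be learned from labeled data.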

IEEE 25th International Conference on Tools with Artificial Intelligence
Zhiqian Chen
Assistant Professor in Computer Science and Engineering

Zhiqian Chen will join the Department of Computer Science at Mississippi State University as an Assistant Professor, focusing on AI research.