Web Usage Mining: A Survey on Pattern Extraction from Web Logs

S. K. Pani (1), L. Panigrahy (2), V. H. Sankar (2), Bikram Keshari Ratha (3), A. K. Mandal (2), S. K. Padhi (2)
(1) P.G. Department of Computer Science, RCMA, Bhubaneswar, Orissa, India
(2) Department of Computer Science and Engineering, Konark Institute of Science and Technology, Bhubaneswar, Orissa, India
(3) P.G. Department of Computer Science, Utkal University, Bhubaneswar, Orissa, India
E-mail: Subhendu_pani@rediffmail.com; mynamelingaraj@gmail.com; Himashankar.V@gmail.com; vkramus@gmail.com; Sanjaya2004@yahoo.com

Abstract— As the web grows in size and in number of users, website owners need to understand their customers better so that they can provide better service and improve the quality of their websites. To do this they rely on web access log files, which can be mined to extract interesting patterns that reveal user behaviour. This paper presents an overview of web usage mining and surveys the pattern extraction algorithms used for it.

Keywords— web mining, pattern extraction, usage mining, preprocessing

I. INTRODUCTION

In this world of Information Technology, accessing information …

To mine the interesting data from this huge pool, data mining techniques can be applied. However, web data is unstructured or semi-structured, so data mining techniques cannot be applied to it directly. Instead, a separate discipline called web mining has evolved to handle web data. Web mining is used to discover interesting patterns that can be applied to many real-world problems, such as improving web sites, better understanding visitors' behavior, and product recommendation. Web mining is the use of data mining techniques to automatically discover and extract information from Web documents and services (Etzioni, 1996). Web mining is …
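As a rough illustration of what mining a web access log can involve, the sketch below parses a few log entries and counts successful requests per page. It assumes the logs follow the Apache-style Common Log Format; the names `CLF_PATTERN`, `parse_line`, `page_counts` and the sample entries in `SAMPLE_LOG` are hypothetical and invented for this example. It is a minimal sketch of a first counting step, not one of the pattern extraction algorithms surveyed in this paper.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident authuser [date] "method path protocol" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None


def page_counts(lines):
    """Count successful (2xx) requests per requested path."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["status"].startswith("2"):
            counts[entry["path"]] += 1
    return counts


# Hypothetical sample entries, invented for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2011:13:55:36 +0530] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/2011:13:56:02 +0530] "GET /products.html HTTP/1.0" 200 1800',
    '192.0.2.7 - - [10/Oct/2011:13:57:10 +0530] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.7 - - [10/Oct/2011:13:57:45 +0530] "GET /missing.html HTTP/1.0" 404 209',
]

if __name__ == "__main__":
    # List pages from most to least requested.
    for path, hits in page_counts(SAMPLE_LOG).most_common():
        print(path, hits)
```

Failed requests (here the 404 entry) are skipped so that the counts reflect pages users actually reached; a real preprocessing stage would typically also need to filter robots and group requests into user sessions.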



References:
[1] Chen Hu, Xuli Zong, Chung-wei Lee and Jyh-haw Yeh, “World Wide Web Usage Mining Systems and Technologies”, Journal of Systemics, Cybernetics and Informatics, Vol. 1, No. 4, pp. 53-59, 2003.
[2] Florent Masseglia, Pascal Poncelet, Rosine Cicchetti, “An efficient algorithm for Web usage mining”, Networking and Information Systems Journal, Volume X, 2000.
[3] R. Pamnani, P. Chawan, “Web Usage Mining: A Research Area in Web Mining”.
[4] Qiankun Zhao, Sourav S. Bhowmick, “Sequential Pattern Mining: A Survey”, Technical Report, CAIS, Nanyang Technological University, Singapore, No. 2003118, 2003.
[5] S. Rawat, L. Rajamani, “Discovering Potential User Browsing Behaviors Using Custom-Built APRIORI Algorithm”, International Journal of Computer Science & Information Technology (IJCSIT), Vol. 2, No. 4, August 2010.
[6] Ming-Syan Chen, Jong Soo Park, Philip S. Yu, “Efficient Data Mining for Path Traversal Patterns”, IEEE Transactions on Knowledge and Data Engineering, Vol. 10, No. 2, March/April 1998.
[7] Jianhan Zhu, Jun Hong, John G. Hughes, “Using Markov Chains for Link Prediction in Adaptive Web Sites”, Soft-Ware 2002, LNCS 2311, pp. 60-73, 2002.
[8] Wang Tong, He Pi-lian, “Web Log Mining by an Improved AprioriAll Algorithm”, World Academy of Science, Engineering and Technology 4, 2005.
[9] Hengshan Wang, Cheng Yang, Hua Zeng, “Design and Implementation of a Web Usage Mining Model Based On Fpgrowth and Prefixspan”, Communications of the IIMA, Volume 6, Issue 2, 2006.
[10] Paola Britos, Damián Martinelli, Hernán Merlino, Ramón García-Martínez, “Web Usage Mining Using Self Organized Maps”, International Journal of Computer Science and Network Security, Vol. 7, No. 6, June 2007.
[11] Mehrdad Jalali, Norwati Mustapha, Ali Mamat, Md. Nasir B Sulaiman, “Web User Navigation Pattern Mining Approach Based on Graph Partitioning Algorithm”, Journal of Theoretical and Applied Information Technology.
[12] Kobra Etminani, Mohammad-R. Akbarzadeh-T., Noorali Raeeji Yanehsari, “Web Usage Mining: users' navigational patterns extraction from web logs using Ant-based Clustering Method”, IFSA-EUSFLAT 2009.
[13] Sandeep Singh Rawat, Lakshmi Rajamani, “Discovering Potential User Browsing Behaviors Using Custom-Built APRIORI Algorithm”, International Journal of Computer Science & Information Technology (IJCSIT), Vol. 2, No. 4, August 2010.
[14] Mahdi Khosravi, Mohammad J. Tarokh, “Dynamic Mining of Users Interest Navigation Patterns Using Naive Bayesian Method”, 978-1-4244-8230-6/10/$26.00 ©2010 IEEE.
[15] N. Sujatha, K. Iyakutty, “Refinement of Web usage Data Clustering from K-means with Genetic Algorithm”, European Journal of Scientific Research, ISSN 1450-216X, Vol. 42, No. 3 (2010), pp. 464-476.
[16] http://httpd.apache.org/docs/1.3/logs.html
[17] http://www.w3.org/TR/WD-logfile.html
[18] http://www.internetworldstats.com
[19] http://www.domaintools.com/internet-statistics/

International Journal of Instrumentation, Control & Automation (IJICA), Volume 1, Issue 1, 2011
