A Weighted Graph Web Usage Mining method to evaluate usage of websites

Mehdi Heydari, Raed Alsaqour, Khairil Imran, Kamelia Vaziry

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Web Usage Mining (WUM) is a method to evaluate usage of websites. Traditionally, WUM uses a Server Log File (SLF) as a web usage data source. SLF is confronted with three problems: (i) In some cases, such as page caching or pressing the browser's back button, no data is recorded in the SLF. This causes a considerable amount of web usage data to be lost and consequently decreases the accuracy of WUM. (ii) In the linear web browsing model, the sequence of pages visited corresponds to the sequence of SLF records, whereas in the parallel web browsing model the sequence of pages visited does not correspond to the sequence of SLF records and thus does not match actual web browsing. WUM methods therefore face difficulty in reconstructing the user web browsing model. (iii) Sometimes, discovered web usage patterns are not necessarily interesting because they arise from the site structure. A method is therefore needed to distinguish web usage patterns during pattern discovery. To cope with these problems, this paper proposes a Weighted Graph WUM (WGWUM) method, which consists of an AJAX interface and a Custom Log File (CLF). The AJAX interface monitors the events of all data source levels and records them in the CLF. To cope with the parallel web browsing model, a graph mining algorithm is applied, which helps users define a threshold value to determine which patterns are valuable. To evaluate the proposed WGWUM method, robot software was designed to simulate user web browsing behavior. The robot is able to navigate through websites and record all of its activities. In addition, the robot randomly determines the page that should be visited and the duration of each page visit. When the robot finishes navigating the website, the graph mining algorithm is applied to the SLF and CLF files. The WGWUM method shows 100% accuracy in discovering the traversed paths, whereas existing methods achieve 73%. This accuracy helps web administrators improve their websites, especially when time is a significant factor, such as in web-based e-learning systems.
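
The abstract does not spell out the graph mining algorithm, so the following Python fragment is only a minimal sketch of the general idea: transitions recorded in a custom log are aggregated into a weighted directed graph, and a user-chosen threshold selects which edges count as valuable. The record layout, the function names `build_weighted_graph` and `frequent_edges`, the sample pages, and the choice of visit count and browsing time as edge weights are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical custom-log records (not the paper's CLF format):
# (session_id, from_page, to_page, seconds_spent_on_from_page)
events = [
    ("s1", "/home", "/courses", 12.0),
    ("s1", "/courses", "/lesson1", 40.0),
    ("s2", "/home", "/courses", 8.0),
    ("s2", "/home", "/about", 3.0),  # e.g. opened in a second tab (parallel browsing)
]


def build_weighted_graph(records):
    """Aggregate page transitions into edges weighted by count and total time."""
    graph = defaultdict(lambda: {"count": 0, "seconds": 0.0})
    for _session, src, dst, seconds in records:
        edge = graph[(src, dst)]
        edge["count"] += 1
        edge["seconds"] += seconds
    return dict(graph)


def frequent_edges(graph, min_count):
    """Keep only edges traversed at least min_count times (the assumed threshold)."""
    return {edge: w for edge, w in graph.items() if w["count"] >= min_count}


graph = build_weighted_graph(events)
valuable = frequent_edges(graph, min_count=2)
print(valuable)
```

Because each transition is stored as an edge rather than as a position in a log sequence, two branches leaving the same page (as in parallel browsing) are kept side by side instead of being forced into one linear order.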

Original language: English
Pages (from-to): 1606-1616
Number of pages: 11
Journal: Australian Journal of Basic and Applied Sciences
Volume: 5
Issue number: 9
Publication status: Published - Sep 2011

Fingerprint

World Wide Web
Websites
Servers
Navigation
Robots

Keywords

  • Graph mining algorithm
  • Page browsing time
  • Web usage mining

ASJC Scopus subject areas

  • General

Cite this

Heydari, M., Alsaqour, R., Imran, K., & Vaziry, K. (2011). A Weighted Graph Web Usage Mining method to evaluate usage of websites. Australian Journal of Basic and Applied Sciences, 5(9), 1606-1616.

@article{4ed9149766344e4abc301c97b93aec01,
title = "A Weighted Graph Web Usage Mining method to evaluate usage of websites",
abstract = "Web Usage Mining (WUM) is a method to evaluate usage of websites. Traditionally, WUM uses a Server Log File (SLF) as a web usage data source. SLF is confronted with three problems: (i) In some cases, such as page caching or pressing the browser's back button, no data is recorded in the SLF. This causes a considerable amount of web usage data to be lost and consequently decreases the accuracy of WUM. (ii) In the linear web browsing model, the sequence of pages visited corresponds to the sequence of SLF records, whereas in the parallel web browsing model the sequence of pages visited does not correspond to the sequence of SLF records and thus does not match actual web browsing. WUM methods therefore face difficulty in reconstructing the user web browsing model. (iii) Sometimes, discovered web usage patterns are not necessarily interesting because they arise from the site structure. A method is therefore needed to distinguish web usage patterns during pattern discovery. To cope with these problems, this paper proposes a Weighted Graph WUM (WGWUM) method, which consists of an AJAX interface and a Custom Log File (CLF). The AJAX interface monitors the events of all data source levels and records them in the CLF. To cope with the parallel web browsing model, a graph mining algorithm is applied, which helps users define a threshold value to determine which patterns are valuable. To evaluate the proposed WGWUM method, robot software was designed to simulate user web browsing behavior. The robot is able to navigate through websites and record all of its activities. In addition, the robot randomly determines the page that should be visited and the duration of each page visit. When the robot finishes navigating the website, the graph mining algorithm is applied to the SLF and CLF files. The WGWUM method shows 100{\%} accuracy in discovering the traversed paths, whereas existing methods achieve 73{\%}. This accuracy helps web administrators improve their websites, especially when time is a significant factor, such as in web-based e-learning systems.",
keywords = "Graph mining algorithm, Page browsing time, Web usage mining",
author = "Mehdi Heydari and Raed Alsaqour and Khairil Imran and Kamelia Vaziry",
year = "2011",
month = "9",
language = "English",
volume = "5",
pages = "1606--1616",
journal = "Australian Journal of Basic and Applied Sciences",
issn = "1991-8178",
publisher = "INSInet Publications",
number = "9",

}

TY - JOUR

T1 - A Weighted Graph Web Usage Mining method to evaluate usage of websites

AU - Heydari, Mehdi

AU - Alsaqour, Raed

AU - Imran, Khairil

AU - Vaziry, Kamelia

PY - 2011/9

Y1 - 2011/9

N2 - Web Usage Mining (WUM) is a method to evaluate usage of websites. Traditionally, WUM uses a Server Log File (SLF) as a web usage data source. SLF is confronted with three problems: (i) In some cases, such as page caching or pressing the browser's back button, no data is recorded in the SLF. This causes a considerable amount of web usage data to be lost and consequently decreases the accuracy of WUM. (ii) In the linear web browsing model, the sequence of pages visited corresponds to the sequence of SLF records, whereas in the parallel web browsing model the sequence of pages visited does not correspond to the sequence of SLF records and thus does not match actual web browsing. WUM methods therefore face difficulty in reconstructing the user web browsing model. (iii) Sometimes, discovered web usage patterns are not necessarily interesting because they arise from the site structure. A method is therefore needed to distinguish web usage patterns during pattern discovery. To cope with these problems, this paper proposes a Weighted Graph WUM (WGWUM) method, which consists of an AJAX interface and a Custom Log File (CLF). The AJAX interface monitors the events of all data source levels and records them in the CLF. To cope with the parallel web browsing model, a graph mining algorithm is applied, which helps users define a threshold value to determine which patterns are valuable. To evaluate the proposed WGWUM method, robot software was designed to simulate user web browsing behavior. The robot is able to navigate through websites and record all of its activities. In addition, the robot randomly determines the page that should be visited and the duration of each page visit. When the robot finishes navigating the website, the graph mining algorithm is applied to the SLF and CLF files. The WGWUM method shows 100% accuracy in discovering the traversed paths, whereas existing methods achieve 73%. This accuracy helps web administrators improve their websites, especially when time is a significant factor, such as in web-based e-learning systems.

KW - Graph mining algorithm

KW - Page browsing time

KW - Web usage mining

UR - http://www.scopus.com/inward/record.url?scp=81755182880&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=81755182880&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:81755182880

VL - 5

SP - 1606

EP - 1616

JO - Australian Journal of Basic and Applied Sciences

JF - Australian Journal of Basic and Applied Sciences

SN - 1991-8178

IS - 9

ER -