Optimal initial centroid in k-means for crime topic

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

A wide number of different clustering method applications and their effectiveness in crime topics have been examined in this paper. Several works have investigated the optimal initial centroid of clustering crime topics. In this paper, we have compared the effectiveness of single pass clustering and k-means in detecting crime topics and aiding in the identification of events or crimes. We have also experimented on enhanced k-means clustering, in order to select the optimal initial centroid to be automatically compared with regular k-means, to choose the initial centroid randomly. Based on the main findings of this study, it was revealed that the experimental method, which was based on k-means, was proved to be better and more effective than single pass clustering in detecting and identifying events or crimes. For the initial number of centroids, it was found that the proposed method was more effective when used in selecting terms that were more than the number of topics, than when they were less. However, the best result was obtained when choosing a number of topics equal to the number of original topics. This implies that the optimal accuracy of clustering is achieved when selecting a large number of documents that have terms better than randomly chosen documents as a centroid.
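
The comparison described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the toy headlines, the cosine similarity over raw term counts, the thresholds, and the `term_rich_seeds` heuristic (prefer documents with many distinct terms, skipping near-duplicates of seeds already chosen) are all assumptions made for illustration. The abstract does not specify how the enhanced k-means selects its initial centroids.

```python
import math
import random
from collections import Counter

def vectorize(doc):
    """Bag-of-words term counts for a whitespace-tokenized document."""
    return Counter(doc.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def mean_vector(vectors):
    """Centroid of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})

def kmeans(vectors, seeds, iterations=10):
    """k-means with cosine similarity; `seeds` are indices of the initial centroids."""
    centroids = [Counter(vectors[i]) for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every document to its most similar centroid.
        labels = [max(range(len(centroids)), key=lambda j: cosine(v, centroids[j]))
                  for v in vectors]
        # Recompute each centroid from its members (empty clusters keep the old one).
        for j in range(len(centroids)):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:
                centroids[j] = mean_vector(members)
    return labels

def random_seeds(vectors, k, rng):
    """Regular k-means seeding: k distinct documents chosen at random."""
    return rng.sample(range(len(vectors)), k)

def term_rich_seeds(vectors, k, max_sim=0.3):
    """Hypothetical seeding heuristic (an assumption, not the paper's method)."""
    ranked = sorted(range(len(vectors)), key=lambda i: -len(vectors[i]))
    seeds = []
    for i in ranked:  # prefer term-rich documents that differ from earlier seeds
        if len(seeds) < k and all(cosine(vectors[i], vectors[s]) < max_sim for s in seeds):
            seeds.append(i)
    for i in ranked:  # fall back to the ranking if too few distinct seeds were found
        if len(seeds) < k and i not in seeds:
            seeds.append(i)
    return seeds

def single_pass(vectors, threshold=0.3):
    """Single pass clustering: join the most similar cluster or open a new one."""
    clusters, centroids, labels = [], [], []
    for i, v in enumerate(vectors):
        sims = [cosine(v, c) for c in centroids]
        if sims and max(sims) >= threshold:
            j = sims.index(max(sims))
            clusters[j].append(i)
            centroids[j] = mean_vector([vectors[m] for m in clusters[j]])
        else:
            clusters.append([i])
            centroids.append(Counter(v))
            j = len(clusters) - 1
        labels.append(j)
    return labels

# Toy corpus (invented for illustration): three robbery and three arson headlines.
docs = [
    "police arrest suspect after bank robbery downtown",
    "bank robbery suspect fled with cash",
    "robbery at bank police search for suspect",
    "fire destroys warehouse arson suspected",
    "arson investigators examine warehouse fire",
    "warehouse fire blamed on arson",
]
vectors = [vectorize(d) for d in docs]
seeded_labels = kmeans(vectors, term_rich_seeds(vectors, k=2))
random_labels = kmeans(vectors, random_seeds(vectors, 2, random.Random(0)))
single_pass_labels = single_pass(vectors, threshold=0.3)
print("seeded k-means:", seeded_labels)
print("random k-means:", random_labels)
print("single pass:   ", single_pass_labels)
```

With random seeding, both initial centroids can fall inside the same topic, which is the failure mode a deliberate seed selection tries to avoid. Single pass clustering needs no preset number of clusters, but its result depends on the similarity threshold and on the order in which documents arrive.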

Original language: English
Pages (from-to): 19-26
Number of pages: 8
Journal: Journal of Theoretical and Applied Information Technology
Volume: 45
Issue number: 1
Publication status: Published - 2012

Keywords

  • Crime clustering
  • Crime topic
  • K-means
  • Single pass

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

@article{434918fe25b949dfb4f89cf7024a65fc,
title = "Optimal initial centroid in k-means for crime topic",
abstract = "A wide number of different clustering method applications and their effectiveness in crime topics have been examined in this paper. Several works have investigated the optimal initial centroid of clustering crime topics. In this paper, we have compared the effectiveness of single pass clustering and k-means in detecting crime topics and aiding in the identification of events or crimes. We have also experimented on enhanced k-means clustering, in order to select the optimal initial centroid to be automatically compared with regular k-means, to choose the initial centroid randomly. Based on the main findings of this study, it was revealed that the experimental method, which was based on k-means, was proved to be better and more effective than single pass clustering in detecting and identifying events or crimes. For the initial number of centroids, it was found that the proposed method was more effective when used in selecting terms that were more than the number of topics, than when they were less. However, the best result was obtained when choosing a number of topics equal to the number of original topics. This implies that the optimal accuracy of clustering is achieved when selecting a large number of documents that have terms better than randomly chosen documents as a centroid.",
keywords = "Crime clustering, Crime topic, K-means, Single pass",
author = "Masnizah Mohd and Bsoul, {Qusay Walid} and {Mohamad Ali}, Nazlena and {Mohd Noah}, {Shahrul Azman} and Saidah Saad and Nazlia Omar and {Ab Aziz}, {Mohd Juzaiddin}",
year = "2012",
language = "English",
volume = "45",
pages = "19--26",
journal = "Journal of Theoretical and Applied Information Technology",
issn = "1992-8645",
publisher = "Asian Research Publishing Network (ARPN)",
number = "1",

}

TY  - JOUR
T1  - Optimal initial centroid in k-means for crime topic
AU  - Mohd, Masnizah
AU  - Bsoul, Qusay Walid
AU  - Mohamad Ali, Nazlena
AU  - Mohd Noah, Shahrul Azman
AU  - Saad, Saidah
AU  - Omar, Nazlia
AU  - Ab Aziz, Mohd Juzaiddin
PY  - 2012
Y1  - 2012
N2  - A wide number of different clustering method applications and their effectiveness in crime topics have been examined in this paper. Several works have investigated the optimal initial centroid of clustering crime topics. In this paper, we have compared the effectiveness of single pass clustering and k-means in detecting crime topics and aiding in the identification of events or crimes. We have also experimented on enhanced k-means clustering, in order to select the optimal initial centroid to be automatically compared with regular k-means, to choose the initial centroid randomly. Based on the main findings of this study, it was revealed that the experimental method, which was based on k-means, was proved to be better and more effective than single pass clustering in detecting and identifying events or crimes. For the initial number of centroids, it was found that the proposed method was more effective when used in selecting terms that were more than the number of topics, than when they were less. However, the best result was obtained when choosing a number of topics equal to the number of original topics. This implies that the optimal accuracy of clustering is achieved when selecting a large number of documents that have terms better than randomly chosen documents as a centroid.
AB  - A wide number of different clustering method applications and their effectiveness in crime topics have been examined in this paper. Several works have investigated the optimal initial centroid of clustering crime topics. In this paper, we have compared the effectiveness of single pass clustering and k-means in detecting crime topics and aiding in the identification of events or crimes. We have also experimented on enhanced k-means clustering, in order to select the optimal initial centroid to be automatically compared with regular k-means, to choose the initial centroid randomly. Based on the main findings of this study, it was revealed that the experimental method, which was based on k-means, was proved to be better and more effective than single pass clustering in detecting and identifying events or crimes. For the initial number of centroids, it was found that the proposed method was more effective when used in selecting terms that were more than the number of topics, than when they were less. However, the best result was obtained when choosing a number of topics equal to the number of original topics. This implies that the optimal accuracy of clustering is achieved when selecting a large number of documents that have terms better than randomly chosen documents as a centroid.
KW  - Crime clustering
KW  - Crime topic
KW  - K-means
KW  - Single pass
UR  - http://www.scopus.com/inward/record.url?scp=84874548995&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84874548995&partnerID=8YFLogxK
M3  - Article
VL  - 45
SP  - 19
EP  - 26
JO  - Journal of Theoretical and Applied Information Technology
JF  - Journal of Theoretical and Applied Information Technology
SN  - 1992-8645
IS  - 1
ER  - 