Data stream clustering algorithms: A review

Maryam Mousavi, Azuraliza Abu Bakar, Mohammadmahdi Vakilian

Research output: Contribution to journal › Article

13 Citations (Scopus)

Abstract

Data stream mining has become a research area of some interest in recent years. The key challenge in data stream mining is extracting valuable knowledge in real time from a massive, continuous, dynamic data stream in a single scan. Clustering is an efficient tool for addressing this problem. Data stream clustering can be applied in fields such as financial transactions, telephone records, sensor network monitoring, telecommunications, website analysis, weather monitoring, and e-business. Data stream clustering also presents challenges: it must be done within a short time frame, with limited memory, using a single-scan process. Moreover, because outliers in a data stream are hidden, clustering algorithms must be able to detect outliers and noise. In addition, the algorithms have to handle concept drift and detect arbitrarily shaped clusters. Several algorithms have been proposed to overcome these challenges. This paper reviews five types of data stream clustering approaches: partitioning, hierarchical, density-based, grid-based, and model-based. The data stream clustering algorithms in the literature are discussed in terms of their respective advantages and disadvantages.

Original language: English
Pages (from-to): 1-15
Number of pages: 15
Journal: International Journal of Advances in Soft Computing and its Applications
Volume: 7
Issue number: Specialissue3
Publication status: Published - 2015

Keywords

  • Data stream clustering
  • Density-based methods
  • Grid-based methods
  • Hierarchical methods
  • Model-based methods
  • Partitioning methods

ASJC Scopus subject areas

  • Computer Science Applications

Cite this

Data stream clustering algorithms: A review / Mousavi, Maryam; Abu Bakar, Azuraliza; Vakilian, Mohammadmahdi.

In: International Journal of Advances in Soft Computing and its Applications, Vol. 7, No. Specialissue3, 2015, p. 1-15.

@article{cbb3994ada454d698f5bcee1cfcb1d11,
title = "Data stream clustering algorithms: A review",
keywords = "Data stream clustering, Density-based methods, Grid-based methods, Hierarchical methods, Model-based methods, Partitioning methods",
author = "Maryam Mousavi and {Abu Bakar}, Azuraliza and Mohammadmahdi Vakilian",
year = "2015",
language = "English",
volume = "7",
pages = "1--15",
journal = "International Journal of Advances in Soft Computing and its Applications",
issn = "2074-8523",
publisher = "International Center for Scientific Research and Studies (ICSRS)",
number = "Specialissue3",
}

TY  - JOUR
T1  - Data stream clustering algorithms
T2  - A review
AU  - Mousavi, Maryam
AU  - Abu Bakar, Azuraliza
AU  - Vakilian, Mohammadmahdi
PY  - 2015
Y1  - 2015
KW  - Data stream clustering
KW  - Density-based methods
KW  - Grid-based methods
KW  - Hierarchical methods
KW  - Model-based methods
KW  - Partitioning methods
UR  - http://www.scopus.com/inward/record.url?scp=84949799953&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84949799953&partnerID=8YFLogxK
M3  - Article
AN  - SCOPUS:84949799953
VL  - 7
SP  - 1
EP  - 15
JO  - International Journal of Advances in Soft Computing and its Applications
JF  - International Journal of Advances in Soft Computing and its Applications
SN  - 2074-8523
IS  - Specialissue3
ER  -