Optimization overlap clustering based on the hybrid rough discernibility concept and rough K-Means

Djoko Budiyanto Setyohadi, Azuraliza Abu Bakar, Zulaiha Ali Othman

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

Technically, the problem of overlap in a dataset is viewed as an uncertainty problem and is solved using a fuzzy set theoretical approach, specifically, fuzzy clustering. This approach is powerful but has some problems associated with it, of which the design of the membership function is the most serious. There are many different techniques for optimizing fuzzy clustering, including those based on similarity decomposition and centroids of clusters. Furthermore, the problem of overlap clustering is still being studied to improve its performance, especially with respect to the membership optimization. Rough set theory (RST) is the complement of fuzzy set theory and evidence theory, which use different techniques to address the uncertainty problem in overlap clustering. Considering the simplicity of the membership computation in RST, we propose an overlap clustering algorithm, which involves the use of the discernibility concept of RST to improve the overlap clusters as an existing variant of the overlap clustering algorithm. The experiment described here demonstrates that this new method improves the performance and increases the accuracy of clustering while avoiding the time complexity problem. The experiment uses five UCI machine learning datasets. The complexity of the data is measured using the volume of the overlap region and feature efficiency. The experimental results show that the proposed method significantly outperforms the other two methods in terms of the Dunn index, the sum of the squared errors and the silhouette index.

Original language: English
Pages (from-to): 795-823
Number of pages: 29
Journal: Intelligent Data Analysis
Volume: 19
Issue number: 4
DOI: 10.3233/IDA-150746
Publication status: Published - 1 Jul 2015

Fingerprint

Rough set theory
K-means
Rough
Overlap
Fuzzy clustering
Clustering
Clustering algorithms
Optimization
Rough Set Theory
Fuzzy set theory
Membership functions
Fuzzy sets
Fuzzy Clustering
Learning systems
Clustering Algorithm
Experiments
Decomposition
Evidence Theory
Uncertainty
Silhouette

Keywords

  • discernibility
  • Overlap clustering
  • RK-means
  • rough membership
  • uncertain

ASJC Scopus subject areas

  • Artificial Intelligence
  • Theoretical Computer Science
  • Computer Vision and Pattern Recognition

Cite this

Optimization overlap clustering based on the hybrid rough discernibility concept and rough K-Means. / Setyohadi, Djoko Budiyanto; Abu Bakar, Azuraliza; Ali Othman, Zulaiha.

In: Intelligent Data Analysis, Vol. 19, No. 4, 01.07.2015, p. 795-823.

@article{18e364466d114f93bee3356357a79d6e,
title = "Optimization overlap clustering based on the hybrid rough discernibility concept and rough K-Means",
abstract = "Technically, the problem of overlap in a dataset is viewed as an uncertainty problem and is solved using a fuzzy set theoretical approach, specifically, fuzzy clustering. This approach is powerful but has some problems associated with it, of which the design of the membership function is the most serious. There are many different techniques for optimizing fuzzy clustering, including those based on similarity decomposition and centroids of clusters. Furthermore, the problem of overlap clustering is still being studied to improve its performance, especially with respect to the membership optimization. Rough set theory (RST) is the complement of fuzzy set theory and evidence theory, which use different techniques to address the uncertainty problem in overlap clustering. Considering the simplicity of the membership computation in RST, we propose an overlap clustering algorithm, which involves the use of the discernibility concept of RST to improve the overlap clusters as an existing variant of the overlap clustering algorithm. The experiment described here demonstrates that this new method improves the performance and increases the accuracy of clustering while avoiding the time complexity problem. The experiment uses five UCI machine learning datasets. The complexity of the data is measured using the volume of the overlap region and feature efficiency. The experimental results show that the proposed method significantly outperforms the other two methods in terms of the Dunn index, the sum of the squared errors and the silhouette index.",
keywords = "discernibility, Overlap clustering, RK-means, rough membership, uncertain",
author = "Setyohadi, {Djoko Budiyanto} and {Abu Bakar}, Azuraliza and {Ali Othman}, Zulaiha",
year = "2015",
month = "7",
day = "1",
doi = "10.3233/IDA-150746",
language = "English",
volume = "19",
pages = "795--823",
journal = "Intelligent Data Analysis",
issn = "1088-467X",
publisher = "IOS Press",
number = "4",

}

TY - JOUR

T1 - Optimization overlap clustering based on the hybrid rough discernibility concept and rough K-Means

AU - Setyohadi, Djoko Budiyanto

AU - Abu Bakar, Azuraliza

AU - Ali Othman, Zulaiha

PY - 2015/7/1

Y1 - 2015/7/1

N2 - Technically, the problem of overlap in a dataset is viewed as an uncertainty problem and is solved using a fuzzy set theoretical approach, specifically, fuzzy clustering. This approach is powerful but has some problems associated with it, of which the design of the membership function is the most serious. There are many different techniques for optimizing fuzzy clustering, including those based on similarity decomposition and centroids of clusters. Furthermore, the problem of overlap clustering is still being studied to improve its performance, especially with respect to the membership optimization. Rough set theory (RST) is the complement of fuzzy set theory and evidence theory, which use different techniques to address the uncertainty problem in overlap clustering. Considering the simplicity of the membership computation in RST, we propose an overlap clustering algorithm, which involves the use of the discernibility concept of RST to improve the overlap clusters as an existing variant of the overlap clustering algorithm. The experiment described here demonstrates that this new method improves the performance and increases the accuracy of clustering while avoiding the time complexity problem. The experiment uses five UCI machine learning datasets. The complexity of the data is measured using the volume of the overlap region and feature efficiency. The experimental results show that the proposed method significantly outperforms the other two methods in terms of the Dunn index, the sum of the squared errors and the silhouette index.

KW - discernibility

KW - Overlap clustering

KW - RK-means

KW - rough membership

KW - uncertain

UR - http://www.scopus.com/inward/record.url?scp=84936930571&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84936930571&partnerID=8YFLogxK

U2 - 10.3233/IDA-150746

DO - 10.3233/IDA-150746

M3 - Article

AN - SCOPUS:84936930571

VL - 19

SP - 795

EP - 823

JO - Intelligent Data Analysis

JF - Intelligent Data Analysis

SN - 1088-467X

IS - 4

ER -