Improved SAX time series data representation based on Relative Frequency and K-Nearest Neighbor algorithm

Almahdi Mohammed Ahmed, Azuraliza Abu Bakar, Abdul Razak Hamdan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

In this paper we propose a new approach based on Symbolic Aggregate Approximation (SAX), called improved SAX (iSAX), for efficient and accurate discovery of the important patterns in time series data. The original SAX approach provides high-quality dimensionality reduction and allows distance measures to be defined on the symbolic representation; it is built on the Piecewise Aggregate Approximation (PAA) representation for dimensionality reduction. The proposed iSAX incorporates the Relative Frequency and K-Nearest Neighbor (RFknn) algorithm. The main task of the algorithm is to determine a sufficient number of intervals, each represented by a symbol (the alphabet size), so that the mining process is efficient and a good knowledge model is obtained without major loss of knowledge. We show that iSAX can improve the precision of the representation without losing the symbolic nature of the original SAX representation. iSAX is compared with the original SAX and PAA representations to demonstrate its improvement in quality. Ten rainfall time series data sets were used. The experimental results show that iSAX gives a better representation and the minimum Euclidean distance.
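The pipeline the abstract builds on (normalize, reduce with PAA, then map each segment mean to a symbol) can be illustrated with a minimal sketch of standard SAX. This is not the authors' RFknn procedure, which the abstract does not specify: the alphabet size is simply passed in as a parameter here. The function names, the equiprobable Gaussian breakpoints, and the requirement that the series length be divisible by the number of segments are assumptions made for illustration.

```python
import math
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = fmean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width frame.

    Assumption: len(series) is divisible by n_segments.
    """
    n = len(series)
    if n_segments <= 0 or n % n_segments != 0:
        raise ValueError("series length must be a positive multiple of n_segments")
    width = n // n_segments
    return [fmean(series[i * width:(i + 1) * width]) for i in range(n_segments)]


def sax_breakpoints(alphabet_size):
    """Cut points that split a standard normal into equiprobable intervals."""
    normal = NormalDist()
    return [normal.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def sax(series, n_segments, alphabet_size):
    """Standard SAX word: z-normalize, apply PAA, then map each mean to a letter."""
    cuts = sax_breakpoints(alphabet_size)
    means = paa(znormalize(series), n_segments)
    return "".join(chr(ord("a") + bisect_right(cuts, m)) for m in means)


def paa_reconstruction_error(series, n_segments):
    """Euclidean distance between the z-normalized series and its PAA step function."""
    z = znormalize(series)
    means = paa(z, n_segments)
    width = len(z) // n_segments
    approx = [means[i // width] for i in range(len(z))]
    return math.dist(z, approx)
```

For the hypothetical series `[0, 0, 0, 0, 10, 10, 10, 10]`, `sax(series, 2, 4)` yields the word `"ad"`, and `paa_reconstruction_error(series, 2)` is zero because two segments capture the step exactly. A method for choosing the alphabet size, such as the one the paper proposes, would replace the fixed `alphabet_size` argument.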

Original language: English
Title of host publication: Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10
Pages: 1320-1325
Number of pages: 6
DOI: https://doi.org/10.1109/ISDA.2010.5687092
Publication status: Published - 2010
Event: 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10 - Cairo
Duration: 29 Nov 2010 → 1 Dec 2010

Other

Other: 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10
City: Cairo
Period: 29/11/10 → 1/12/10

Fingerprint

Time series
Rain

Keywords

  • Data mining
  • Pre-processing and reduction
  • Time series

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Hardware and Architecture

Cite this

Ahmed, A. M., Abu Bakar, A., & Hamdan, A. R. (2010). Improved SAX time series data representation based on Relative Frequency and K-Nearest Neighbor algorithm. In Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10 (pp. 1320-1325). [5687092] https://doi.org/10.1109/ISDA.2010.5687092

Improved SAX time series data representation based on Relative Frequency and K-Nearest Neighbor algorithm. / Ahmed, Almahdi Mohammed; Abu Bakar, Azuraliza; Hamdan, Abdul Razak.

Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10. 2010. p. 1320-1325 5687092.

Ahmed, AM, Abu Bakar, A & Hamdan, AR 2010, Improved SAX time series data representation based on Relative Frequency and K-Nearest Neighbor algorithm. in Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10., 5687092, pp. 1320-1325, 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10, Cairo, 29/11/10. https://doi.org/10.1109/ISDA.2010.5687092
Ahmed AM, Abu Bakar A, Hamdan AR. Improved SAX time series data representation based on Relative Frequency and K-Nearest Neighbor algorithm. In Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10. 2010. p. 1320-1325. 5687092 https://doi.org/10.1109/ISDA.2010.5687092
Ahmed, Almahdi Mohammed ; Abu Bakar, Azuraliza ; Hamdan, Abdul Razak. / Improved SAX time series data representation based on Relative Frequency and K-Nearest Neighbor algorithm. Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10. 2010. pp. 1320-1325
@inproceedings{2970e7257b5d44a9bf340ea07e001a2f,
title = "Improved SAX time series data representation based on Relative Frequency and K-Nearest Neighbor algorithm",
keywords = "Data mining, Pre-processing and reduction, Time series",
author = "Ahmed, {Almahdi Mohammed} and {Abu Bakar}, Azuraliza and Hamdan, {Abdul Razak}",
year = "2010",
doi = "10.1109/ISDA.2010.5687092",
language = "English",
isbn = "9781424481354",
pages = "1320--1325",
booktitle = "Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10",

}

TY - GEN

T1 - Improved SAX time series data representation based on Relative Frequency and K-Nearest Neighbor algorithm

AU - Ahmed, Almahdi Mohammed

AU - Abu Bakar, Azuraliza

AU - Hamdan, Abdul Razak

PY - 2010

Y1 - 2010

KW - Data mining

KW - Pre-processing and reduction

KW - Time series

UR - http://www.scopus.com/inward/record.url?scp=79851506825&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79851506825&partnerID=8YFLogxK

U2 - 10.1109/ISDA.2010.5687092

DO - 10.1109/ISDA.2010.5687092

M3 - Conference contribution

SN - 9781424481354

SP - 1320

EP - 1325

BT - Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10

ER -