Discretization of time series dataset using relative frequency and k-nearest neighbor approach

Azuraliza Abu Bakar, Almahdi Mohammed Ahmed, Abdul Razak Hamdan

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

6 Citations (Scopus)

Abstract

In this work, we propose an improved approach to time series data discretization, called the RFknn method, that uses Relative Frequency and K-Nearest Neighbor functions. The main idea of the method is to improve the process of determining a sufficient number of intervals for the discretization of time series data. The proposed approach improves the time series data representation by integrating the method with the Piecewise Aggregate Approximation (PAA) and Symbolic Aggregate Approximation (SAX) representations. Each interval is represented as a symbol, which supports an efficient mining process in which a better knowledge model can be obtained without major loss of knowledge. The basic idea is neither to minimize nor to maximize the number of intervals of the temporal patterns over their class labels. The performance of RFknn is evaluated on 22 temporal datasets and compared with the original SAX time series discretization method using a similar representation. We show that RFknn can improve the precision of the representation without losing the symbolic nature of the original SAX representation. The experimental results show that RFknn gives a better representation, with lower or comparable error rates.
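
For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of standard SAX (z-normalization, PAA, then equiprobable Gaussian breakpoints), not the RFknn method itself, whose interval-selection step is not described in this record. The function names (`znormalize`, `paa`, `sax_breakpoints`, `sax`) and the requirement that the series length be divisible by the number of segments are assumptions made for the example.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series carries no shape information; map it to zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width segment.

    Assumption for this sketch: len(series) is divisible by n_segments.
    """
    if n_segments <= 0 or len(series) % n_segments != 0:
        raise ValueError("series length must be a positive multiple of n_segments")
    width = len(series) // n_segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(n_segments)]


def sax_breakpoints(alphabet_size):
    """Cut points that split the standard normal into equiprobable regions."""
    if not 2 <= alphabet_size <= 26:
        raise ValueError("alphabet_size must be between 2 and 26")
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def sax(series, n_segments, alphabet_size):
    """Standard SAX word: one letter per PAA segment."""
    cuts = sax_breakpoints(alphabet_size)
    segments = paa(znormalize(series), n_segments)
    # bisect_right returns the index of the interval each segment mean falls in.
    return "".join(chr(ord("a") + bisect_right(cuts, v)) for v in segments)


if __name__ == "__main__":
    print(sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4, alphabet_size=4))  # abcd
```

In this baseline the number of intervals (`alphabet_size`) is fixed by the user in advance; according to the abstract, RFknn targets exactly that choice by determining a sufficient number of intervals from the data.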

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 193-201
Number of pages: 9
Volume: 6440 LNAI
Edition: PART 1
DOI: https://doi.org/10.1007/978-3-642-17316-5_18
Publication status: Published - 2010
Event: 6th International Conference on Advanced Data Mining and Applications, ADMA 2010 - Chongqing
Duration: 19 Nov 2010 to 21 Nov 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6440 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 6th International Conference on Advanced Data Mining and Applications, ADMA 2010
City: Chongqing
Period: 19/11/10 to 21/11/10

Keywords

  • Data mining
  • discretization
  • dynamic intervals
  • pre-processing and time series representation
  • reduction

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Abu Bakar, A., Mohammed Ahmed, A., & Hamdan, A. R. (2010). Discretization of time series dataset using relative frequency and k-nearest neighbor approach. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (PART 1 ed., Vol. 6440 LNAI, pp. 193-201). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6440 LNAI, No. PART 1). https://doi.org/10.1007/978-3-642-17316-5_18

@inproceedings{9401e0ee32274ef6b422348591511a02,
title = "Discretization of time series dataset using relative frequency and k-nearest neighbor approach",
abstract = "In this work, we propose an improved approach to time series data discretization, called the RFknn method, that uses Relative Frequency and K-Nearest Neighbor functions. The main idea of the method is to improve the process of determining a sufficient number of intervals for the discretization of time series data. The proposed approach improves the time series data representation by integrating the method with the Piecewise Aggregate Approximation (PAA) and Symbolic Aggregate Approximation (SAX) representations. Each interval is represented as a symbol, which supports an efficient mining process in which a better knowledge model can be obtained without major loss of knowledge. The basic idea is neither to minimize nor to maximize the number of intervals of the temporal patterns over their class labels. The performance of RFknn is evaluated on 22 temporal datasets and compared with the original SAX time series discretization method using a similar representation. We show that RFknn can improve the precision of the representation without losing the symbolic nature of the original SAX representation. The experimental results show that RFknn gives a better representation, with lower or comparable error rates.",
keywords = "Data mining, discretization, dynamic intervals, pre-processing and time series representation, reduction",
author = "{Abu Bakar}, Azuraliza and {Mohammed Ahmed}, Almahdi and Hamdan, {Abdul Razak}",
year = "2010",
doi = "10.1007/978-3-642-17316-5_18",
language = "English",
isbn = "3642173152",
volume = "6440 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 1",
pages = "193--201",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
edition = "PART 1",

}

TY - GEN

T1 - Discretization of time series dataset using relative frequency and k-nearest neighbor approach

AU - Abu Bakar, Azuraliza

AU - Mohammed Ahmed, Almahdi

AU - Hamdan, Abdul Razak

PY - 2010

Y1 - 2010

AB - In this work, we propose an improved approach to time series data discretization, called the RFknn method, that uses Relative Frequency and K-Nearest Neighbor functions. The main idea of the method is to improve the process of determining a sufficient number of intervals for the discretization of time series data. The proposed approach improves the time series data representation by integrating the method with the Piecewise Aggregate Approximation (PAA) and Symbolic Aggregate Approximation (SAX) representations. Each interval is represented as a symbol, which supports an efficient mining process in which a better knowledge model can be obtained without major loss of knowledge. The basic idea is neither to minimize nor to maximize the number of intervals of the temporal patterns over their class labels. The performance of RFknn is evaluated on 22 temporal datasets and compared with the original SAX time series discretization method using a similar representation. We show that RFknn can improve the precision of the representation without losing the symbolic nature of the original SAX representation. The experimental results show that RFknn gives a better representation, with lower or comparable error rates.

KW - Data mining

KW - discretization

KW - dynamic intervals

KW - pre-processing and time series representation

KW - reduction

UR - http://www.scopus.com/inward/record.url?scp=78650194218&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78650194218&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-17316-5_18

DO - 10.1007/978-3-642-17316-5_18

M3 - Conference contribution

SN - 3642173152

SN - 9783642173158

VL - 6440 LNAI

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 193

EP - 201

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -