Klang vally rainfall forecasting model using time series data mining technique

Zulaiha Ali Othman, Noraini Ismail, Abdul Razak Hamdan, Mahmoud Ahmed Sammour

Research output: Contribution to journal (Article)

5 Citations (Scopus)

Abstract

Rainfall influences social and economic activities in a given area, including agriculture, industry and domestic needs. Accurate rainfall forecasting is therefore in demand. Various statistical and data mining techniques have been used to predict rainfall accurately, and time series data mining is a well-known approach for forecasting time series data. The objective of this study is to develop a model for forecasting the distribution of rainfall patterns, based on a symbolic data representation that uses Piecewise Aggregate Approximation (PAA) and Symbolic Aggregate approXimation (SAX). The rainfall dataset was collected from three rain gauge stations in the Langat area over 31 years. Development of the model consists of three phases: data collection, data pre-processing, and model development. During the pre-processing phase, the data were transformed into an appropriate representation using PAA, a dimensionality reduction technique. The transformed data were then discretized using SAX. Next, clustering was used to assign class labels to the patterns when preparing the unlabelled training data. Three types of pattern (dry, normal and wet) were identified using three clustering techniques: Agglomerative Hierarchical Clustering, K-Means Partitional Clustering and Self-Organising Map. As a result, the best model was able to forecast better for the next 3 and 5 years using rule induction classification techniques.
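The PAA and SAX steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the segment count, the three-letter alphabet and the rainfall values below are all assumptions made for illustration, and the dry/normal/wet labels in the study came from clustering, not directly from SAX symbols. The sketch follows the usual SAX recipe: z-normalize the series, average it over equal-width segments (PAA), then map each segment mean to a symbol using breakpoints that split the standard normal distribution into equiprobable regions.

```python
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: mean of each equal-width segment."""
    n = len(series)
    if n % n_segments != 0:
        # Simplifying assumption: the length must divide evenly into segments.
        raise ValueError("series length must be a multiple of n_segments")
    width = n // n_segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(n_segments)]


def sax(series, n_segments, alphabet="abc"):
    """Symbolic Aggregate approXimation of a series as a word over `alphabet`."""
    k = len(alphabet)
    # Breakpoints cut the standard normal curve into k equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    reduced = paa(znormalize(series), n_segments)
    word = []
    for value in reduced:
        # The symbol index is the number of breakpoints lying below the value.
        index = sum(1 for b in breakpoints if value > b)
        word.append(alphabet[index])
    return "".join(word)


# Hypothetical monthly rainfall totals in millimetres (illustrative only).
monthly_rain_mm = [50, 60, 55, 200, 220, 210, 120, 130, 125, 300, 310, 290]
print(sax(monthly_rain_mm, n_segments=4))
```

Here twelve monthly values are reduced to four segment means and then to a four-letter word, where earlier letters of the alphabet stand for lower rainfall. Words of this kind could then be clustered and labelled, as the abstract describes.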

Original language: English
Pages (from-to): 372-379
Number of pages: 8
Journal: Journal of Theoretical and Applied Information Technology
Volume: 92
Issue number: 2
Publication status: Published - 1 Oct 2016

Keywords

  • Classification
  • Clustering
  • Rainfall forecasting
  • Time series data mining
  • Time series symbolic representation

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Klang vally rainfall forecasting model using time series data mining technique. / Ali Othman, Zulaiha; Ismail, Noraini; Hamdan, Abdul Razak; Sammour, Mahmoud Ahmed.

In: Journal of Theoretical and Applied Information Technology, Vol. 92, No. 2, 01.10.2016, p. 372-379.

@article{84b53aaed34c4cb2983f58b06144404e,
title = "Klang vally rainfall forecasting model using time series data mining technique",
keywords = "Classification, Clustering, Rainfall forecasting, Time series data mining, Time series symbolic representation",
author = "{Ali Othman}, Zulaiha and Noraini Ismail and Hamdan, {Abdul Razak} and Sammour, {Mahmoud Ahmed}",
year = "2016",
month = "10",
day = "1",
language = "English",
volume = "92",
pages = "372--379",
journal = "Journal of Theoretical and Applied Information Technology",
issn = "1992-8645",
publisher = "Asian Research Publishing Network (ARPN)",
number = "2",

}

TY - JOUR

T1 - Klang vally rainfall forecasting model using time series data mining technique

AU - Ali Othman, Zulaiha

AU - Ismail, Noraini

AU - Hamdan, Abdul Razak

AU - Sammour, Mahmoud Ahmed

PY - 2016/10/1

Y1 - 2016/10/1

KW - Classification

KW - Clustering

KW - Rainfall forecasting

KW - Time series data mining

KW - Time series symbolic representation

UR - http://www.scopus.com/inward/record.url?scp=84994108284&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84994108284&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84994108284

VL - 92

SP - 372

EP - 379

JO - Journal of Theoretical and Applied Information Technology

JF - Journal of Theoretical and Applied Information Technology

SN - 1992-8645

IS - 2

ER -