Application of functional data analysis for the treatment of missing air quality data

Norshahida Shaadan, Sayang Mohd Deni, Abdul Aziz Jemain

Research output: Contribution to journal (article)

Abstract

Missing records are a common data quality problem in most research, including environmental research. This study introduces several imputation methods based on functional data analysis techniques and investigates their ability to estimate missing values. Single and iterative imputation methods are carried out by means of curve estimation, using regression and roughness penalty smoothing approaches. The methods are compared using a reference data set of real PM10 data from the Petaling Jaya air quality monitoring station, located in the western part of Peninsular Malaysia. One hundred data sets with missing values, generated from the reference data set with six different patterns of missing values, are used to evaluate the methods. The patterns combine three percentages of missing values (5, 10 and 15%) with two maximum gap lengths (3 and 7 consecutive missing points). With the mean absolute error, the index of agreement and the coefficient of determination as performance indicators, the results show that the iterative imputation method using the roughness penalty approach is more flexible than, and superior to, the other methods.
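The evaluation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gap-placement scheme, the function names and the synthetic series are assumptions made for illustration, and plain linear interpolation stands in for the functional (regression and roughness penalty) smoothers, which the abstract does not specify in detail. The index of agreement is assumed to be Willmott's d, and the coefficient of determination is assumed to be the squared Pearson correlation.

```python
import math
import random


def simulate_gaps(n, pct_missing, max_gap, rng):
    """Pick indices to blank out: pct_missing% of n points, placed in runs of
    at most max_gap consecutive points (hypothetical scheme, not the paper's)."""
    target = round(n * pct_missing / 100)
    missing = set()
    while len(missing) < target:
        gap = min(rng.randint(1, max_gap), target - len(missing))
        # Keep the first and last points observed so interpolation has anchors.
        start = rng.randint(1, n - 1 - gap)
        # Reject runs that touch an existing gap, so no run exceeds max_gap.
        if not (set(range(start - 1, start + gap + 1)) & missing):
            missing |= set(range(start, start + gap))
    return sorted(missing)


def impute_linear(values):
    """Fill None entries by linear interpolation (stand-in for curve smoothing)."""
    filled = list(values)
    observed = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(observed, observed[1:]):
        for i in range(left + 1, right):
            w = (i - left) / (right - left)
            filled[i] = (1 - w) * values[left] + w * values[right]
    return filled


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def index_of_agreement(obs, pred):
    """Willmott's index of agreement d (assumed definition)."""
    mean_o = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    return 1 - num / den


def r_squared(obs, pred):
    """Coefficient of determination as squared Pearson correlation (assumed)."""
    mean_o, mean_p = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic hourly series with a daily cycle; NOT the Petaling Jaya data.
    truth = [50 + 20 * math.sin(2 * math.pi * t / 24) for t in range(240)]
    for pct in (5, 10, 15):
        for max_gap in (3, 7):
            idx = simulate_gaps(len(truth), pct, max_gap, rng)
            holed = [None if i in set(idx) else v for i, v in enumerate(truth)]
            filled = impute_linear(holed)
            obs = [truth[i] for i in idx]
            pred = [filled[i] for i in idx]
            print(pct, max_gap, mae(obs, pred),
                  index_of_agreement(obs, pred), r_squared(obs, pred))
```

The indicators are computed only at the artificially removed points, since those are the only positions where the imputed and reference values can differ. Swapping `impute_linear` for a smoothing-based estimator would leave the rest of the loop unchanged.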

Original language: English
Pages (from-to): 1531-1540
Number of pages: 10
Journal: Sains Malaysiana
Volume: 44
Issue number: 10
Publication status: Published - 1 Oct 2015

Keywords

  • Air quality
  • Functional data
  • Imputation
  • Missing value
  • PM

ASJC Scopus subject areas

  • General

Cite this

Application of functional data analysis for the treatment of missing air quality data. / Shaadan, Norshahida; Deni, Sayang Mohd; Jemain, Abdul Aziz.

In: Sains Malaysiana, Vol. 44, No. 10, 01.10.2015, p. 1531-1540.

@article{fb19906ee0bd48e6ad360a31e7241439,
title = "Application of functional data analysis for the treatment of missing air quality data",
keywords = "Air quality, Functional data, Imputation, Missing value, PM",
author = "Shaadan, Norshahida and Deni, {Sayang Mohd} and Jemain, {Abdul Aziz}",
year = "2015",
month = "10",
day = "1",
language = "English",
volume = "44",
pages = "1531--1540",
journal = "Sains Malaysiana",
issn = "0126-6039",
publisher = "Penerbit Universiti Kebangsaan Malaysia",
number = "10",
}

TY - JOUR

T1 - Application of functional data analysis for the treatment of missing air quality data

AU - Shaadan, Norshahida

AU - Deni, Sayang Mohd

AU - Jemain, Abdul Aziz

PY - 2015/10/1

Y1 - 2015/10/1

KW - Air quality

KW - Functional data

KW - Imputation

KW - Missing value

KW - PM

UR - http://www.scopus.com/inward/record.url?scp=84952043609&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84952043609&partnerID=8YFLogxK

M3 - Article

VL - 44

SP - 1531

EP - 1540

JO - Sains Malaysiana

JF - Sains Malaysiana

SN - 0126-6039

IS - 10

ER -