Methods on handling missing rainfall data with Neyman-Scott rectangular pulse modeling

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

1 Citation (Scopus)

Abstract

Rainfall data from rain-gauge stations are prone to gaps caused by factors such as human negligence, faulty equipment and disasters. In this paper, complete monthly rainfall data from 1985 to 1992 at the Payakangsar station are used as the base data to determine an appropriate method for handling missing data. A portion of this complete data set is then omitted at random, amounting to 5%, 10% and 15% of the total number of observations. Three methods of missing-data replacement are considered: replacement of the missing data with zero (NR), single imputation (SI) and multiple imputation (MI). The Neyman-Scott Rectangular Pulse (NSRP) stochastic rainfall model is then fitted to the data resulting from each of these three methods. Data from October and November are selected for further analysis, as these are the two months with the highest rainfall received. To assess the performance of the three methods, a goodness-of-fit test based on the mean absolute error is applied. The results indicate that the NR method is best for every missing-data case in October, and also for the 5% case in November. On the other hand, the method of imputation with 4 stages (MI) is superior for the 10% and 15% cases in November.
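
The omit-then-replace comparison described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the rainfall series is synthetic, mean imputation stands in for the single imputation (SI) step, the mean absolute error is computed directly between the complete and the filled series rather than from fitted NSRP model statistics, and the function names are hypothetical. Multiple imputation and the NSRP fit itself are left out.

```python
import numpy as np


def omit_at_random(values, fraction, rng):
    """Return a copy of `values` with `fraction` of its entries set to NaN."""
    out = np.asarray(values, dtype=float).copy()
    n_missing = int(round(fraction * out.size))
    idx = rng.choice(out.size, size=n_missing, replace=False)
    out[idx] = np.nan
    return out


def replace_with_zero(values):
    """NR-style replacement: every missing entry becomes zero."""
    return np.where(np.isnan(values), 0.0, values)


def impute_with_mean(values):
    """Assumed stand-in for SI: fill gaps with the mean of the observed entries."""
    return np.where(np.isnan(values), np.nanmean(values), values)


def mean_absolute_error(reference, estimate):
    """Mean absolute difference between two equally long series."""
    diff = np.asarray(reference, dtype=float) - np.asarray(estimate, dtype=float)
    return float(np.mean(np.abs(diff)))


rng = np.random.default_rng(0)
# Synthetic stand-in for a complete record (96 values); NOT the station data.
complete = rng.gamma(shape=2.0, scale=100.0, size=96)

results = {}
for fraction in (0.05, 0.10, 0.15):
    gappy = omit_at_random(complete, fraction, rng)
    results[fraction] = {
        "NR": mean_absolute_error(complete, replace_with_zero(gappy)),
        "SI": mean_absolute_error(complete, impute_with_mean(gappy)),
    }
    print(f"{fraction:.0%} missing: {results[fraction]}")
```

Because the series is synthetic, the printed errors say nothing about which method performs better on real rainfall; the sketch only shows the mechanics of masking, filling and scoring.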

Original language: English
Title of host publication: AIP Conference Proceedings
Pages: 1213-1220
Number of pages: 8
Volume: 1522
DOIs: https://doi.org/10.1063/1.4801269
Publication status: Published - 2013
Event: 20th National Symposium on Mathematical Sciences - Research in Mathematical Sciences: A Catalyst for Creativity and Innovation, SKSM 2012 - Putrajaya
Duration: 18 Dec 2012 to 20 Dec 2012

Fingerprint

goodness of fit
stations
rain gages
disasters
pulses

Keywords

  • Missing data
  • Neyman-Scott Rectangular Pulse
  • Rainfall modeling

ASJC Scopus subject areas

  • Physics and Astronomy(all)

Cite this

Methods on handling missing rainfall data with Neyman-Scott rectangular pulse modeling. / Yendra, Rado; Jemain, Abdul Aziz; Zahari, Marina; Wan Zin @ Wan Ibrahim, Wan Zawiah.

AIP Conference Proceedings. Vol. 1522 2013. p. 1213-1220.

Yendra, R, Jemain, AA, Zahari, M & Wan Zin @ Wan Ibrahim, WZ 2013, Methods on handling missing rainfall data with Neyman-Scott rectangular pulse modeling. in AIP Conference Proceedings. vol. 1522, pp. 1213-1220, 20th National Symposium on Mathematical Sciences - Research in Mathematical Sciences: A Catalyst for Creativity and Innovation, SKSM 2012, Putrajaya, 18/12/12. https://doi.org/10.1063/1.4801269
@inproceedings{d52d698dafe34ac088e00fb65a51aa8c,
title = "Methods on handling missing rainfall data with Neyman-Scott rectangular pulse modeling",
keywords = "Missing data, Neyman-Scott Rectangular Pulse, Rainfall modeling",
author = "Rado Yendra and Jemain, {Abdul Aziz} and Marina Zahari and {Wan Zin @ Wan Ibrahim}, {Wan Zawiah}",
year = "2013",
doi = "10.1063/1.4801269",
language = "English",
isbn = "9780735411500",
volume = "1522",
pages = "1213--1220",
booktitle = "AIP Conference Proceedings",

}

TY - GEN

T1 - Methods on handling missing rainfall data with Neyman-Scott rectangular pulse modeling

AU - Yendra, Rado

AU - Jemain, Abdul Aziz

AU - Zahari, Marina

AU - Wan Zin @ Wan Ibrahim, Wan Zawiah

PY - 2013

Y1 - 2013

KW - Missing data

KW - Neyman-Scott Rectangular Pulse

KW - Rainfall modeling

UR - http://www.scopus.com/inward/record.url?scp=84876923791&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84876923791&partnerID=8YFLogxK

U2 - 10.1063/1.4801269

DO - 10.1063/1.4801269

M3 - Conference contribution

SN - 9780735411500

VL - 1522

SP - 1213

EP - 1220

BT - AIP Conference Proceedings

ER -