### Abstract

The use of curves, or functional data, is gaining momentum in many fields of research. The statistical methodology for analyzing such data is known as functional data analysis (FDA). The first step in FDA is to convert data points observed repeatedly over time or space into either a rough (raw) or a smooth curve. For a smooth curve, basis function expansion is one of the methods used for the conversion, with the fit obtained by either regression smoothing or roughness penalty smoothing. With regression smoothing, the smoothness of the curve depends strongly on the number of basis functions, k; with the roughness penalty approach, it depends on a roughness coefficient given by the parameter λ. In previous studies, researchers often used rather time-consuming trial and error or cross-validation to estimate the appropriate number of basis functions. This paper therefore proposes a statistical procedure for constructing functional data (curves) from hourly and daily recorded data. The Bayesian Information Criterion is used to determine the number of basis functions, while the Generalized Cross-Validation criterion is used to identify the parameter λ. The proposed procedure is then applied to ten years (2001-2010) of PM10 data from 30 air quality monitoring stations in Peninsular Malaysia. The number of basis functions required to construct the daily PM10 curve in Peninsular Malaysia was found to lie between 14 and 20, with an average of 17; the first percentile is 15 and the third percentile is 19. The initial value of the roughness coefficient lay between 10^{-5} and 10^{-7}, with a mode of 10^{-6}. An example of functional descriptive analysis is also shown.
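The two selection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which basis system, BIC form, or penalty was used, so the Fourier basis, the Gaussian-likelihood form of BIC, the second-derivative penalty, the reuse of the BIC-selected k in the penalized fit, and all function names and the synthetic data below are assumptions made for illustration only.

```python
import numpy as np

def fourier_basis(t, k, period):
    """(n, k) design matrix: a constant, then sin/cos pairs (assumed basis)."""
    cols = [np.ones_like(t)]
    j = 1
    while len(cols) < k:
        w = 2 * np.pi * j / period
        cols.append(np.sin(w * t))
        if len(cols) < k:
            cols.append(np.cos(w * t))
        j += 1
    return np.column_stack(cols)

def roughness_penalty(k, period):
    """Diagonal matrix of integrated squared second derivatives over one period."""
    diag = [0.0]  # the constant term has zero curvature
    j = 1
    while len(diag) < k:
        w = 2 * np.pi * j / period
        diag.append(w ** 4 * period / 2)
        if len(diag) < k:
            diag.append(w ** 4 * period / 2)
        j += 1
    return np.diag(diag)

def bic_for_k(t, y, k, period):
    """BIC of an unpenalized least-squares fit: n*log(SSE/n) + k*log(n)."""
    phi = fourier_basis(t, k, period)
    coef, *_ = np.linalg.lstsq(phi, y, rcond=None)
    sse = float(np.sum((y - phi @ coef) ** 2))
    n = len(y)
    return n * np.log(sse / n) + k * np.log(n)

def gcv_for_lambda(t, y, k, lam, period):
    """GCV of a roughness-penalized fit: n*SSE / (n - trace(S))^2."""
    phi = fourier_basis(t, k, period)
    pen = roughness_penalty(k, period)
    hat = phi @ np.linalg.solve(phi.T @ phi + lam * pen, phi.T)  # smoother matrix S
    sse = float(np.sum((y - hat @ y) ** 2))
    n = len(y)
    return n * sse / (n - np.trace(hat)) ** 2

def choose_k_and_lambda(t, y, period, k_grid, lam_grid):
    """Pick k by minimum BIC, then lambda by minimum GCV at that k."""
    best_k = min(k_grid, key=lambda k: bic_for_k(t, y, k, period))
    best_lam = min(lam_grid,
                   key=lambda lam: gcv_for_lambda(t, y, best_k, lam, period))
    return best_k, best_lam

# Synthetic hourly readings for one day (made-up values, not real PM10 data).
rng = np.random.default_rng(0)
hours = np.arange(24.0)
pm10 = (50 + 10 * np.sin(2 * np.pi * hours / 24)
        + 5 * np.cos(4 * np.pi * hours / 24) + rng.normal(0, 2, 24))

k_grid = range(3, 22, 2)                      # odd k, kept below n = 24
lam_grid = [10.0 ** e for e in range(-8, 3)]  # log-spaced candidate lambdas
best_k, best_lam = choose_k_and_lambda(hours, pm10, 24.0, k_grid, lam_grid)
print(best_k, best_lam)
```

Both criteria are evaluated on a grid, which replaces trial and error with a single pass over candidate values. The selected values depend on the time units and on the data, so the numbers from this synthetic example are not comparable to the ranges reported in the abstract.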

Original language | English |
---|---|
Title of host publication | AIP Conference Proceedings |
Publisher | American Institute of Physics Inc. |
Pages | 850-855 |
Number of pages | 6 |
Volume | 1605 |
ISBN (Print) | 9780735412415 |
DOIs | https://doi.org/10.1063/1.4887701 |
Publication status | Published - 2014 |
Event | 21st National Symposium on Mathematical Sciences: Germination of Mathematical Sciences Education and Research Towards Global Sustainability, SKSM 21, Penang (6 Nov 2013 → 8 Nov 2013) |

### Other

Other | 21st National Symposium on Mathematical Sciences: Germination of Mathematical Sciences Education and Research Towards Global Sustainability, SKSM 21 |
---|---|
City | Penang |
Period | 6/11/13 → 8/11/13 |

### Keywords

- basis function
- curve smoothing
- Functional data analysis
- PM10 curves

### ASJC Scopus subject areas

- Physics and Astronomy(all)

### Cite this

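Shaadan, N., Jemain, A. A., & Deni, S. M. (2014). Data preparation for functional data analysis of PM10 in Peninsular Malaysia. In *AIP Conference Proceedings* (Vol. 1605, pp. 850-855). American Institute of Physics Inc. https://doi.org/10.1063/1.4887701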

**Data preparation for functional data analysis of PM10 in Peninsular Malaysia.** / Shaadan, Norshahida; Jemain, Abdul Aziz; Deni, Sayang Mohd.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

*AIP Conference Proceedings.* Vol. 1605, American Institute of Physics Inc., pp. 850-855, 21st National Symposium on Mathematical Sciences: Germination of Mathematical Sciences Education and Research Towards Global Sustainability, SKSM 21, Penang, 6/11/13. https://doi.org/10.1063/1.4887701

TY - GEN

T1 - Data preparation for functional data analysis of PM10 in Peninsular Malaysia

AU - Shaadan, Norshahida

AU - Jemain, Abdul Aziz

AU - Deni, Sayang Mohd

PY - 2014

Y1 - 2014

KW - basis function

KW - curve smoothing

KW - Functional data analysis

KW - PM10 curves

UR - http://www.scopus.com/inward/record.url?scp=84904607385&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84904607385&partnerID=8YFLogxK

U2 - 10.1063/1.4887701

DO - 10.1063/1.4887701

M3 - Conference contribution

AN - SCOPUS:84904607385

SN - 9780735412415

VL - 1605

SP - 850

EP - 855

BT - AIP Conference Proceedings

PB - American Institute of Physics Inc.

ER -