Classification modeling on distributed environment

Ting Ah Choo, Azuraliza Abu Bakar, Amin Benjavad Talebi, Elankovan A Sundararajan, Mahathir Rahmany

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

High Performance Computing (HPC) is typically used in science and technology to solve problems that cannot be solved on a single machine because of limited computing resources such as memory and the number of processors. HPC can improve processing speed. However, using a high-powered supercomputer for this type of problem is very costly. In some circumstances, High-Throughput Computing (HTC) on distributed environments performs parallel processing at speeds comparable to those of a supercomputer. In this work, we reduce the time needed to mine a large data file and build a classification model on a distributed environment, using a web-based portal that offers a range of classification methods. The web-based application was built in PHP and uses the classification techniques of the data mining software WEKA, version 3.6.0, with a percentage split of the data into training and testing sets. HTCondor middleware controls and runs all jobs on the distributed environment. The results show a significant improvement in processing time.
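
As a rough illustration of the workflow the abstract describes, the sketch below generates an HTCondor submit description that queues one WEKA job per classifier with a percentage split. It is not the authors' implementation: the file names, the Java path, the choice of classifiers and the 70% split are assumptions made for illustration, the paper's portal is written in PHP rather than Python, and the `-split-percentage` option should be checked against the installed WEKA version.

```python
from typing import List

# Hypothetical file names; the abstract does not name the data set or any paths.
WEKA_JAR = "weka.jar"
DATASET = "data.arff"


def build_submit_description(classifiers: List[str], split_percentage: int = 66) -> str:
    """Return HTCondor submit-description text with one job per WEKA classifier."""
    if not 0 < split_percentage < 100:
        raise ValueError("split_percentage must be strictly between 0 and 100")

    # Settings shared by every job in the submit description.
    lines = [
        "universe = vanilla",
        "executable = /usr/bin/java",   # assumed Java location on the execute nodes
        "transfer_executable = False",  # use the node's own Java, do not ship the binary
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"transfer_input_files = {WEKA_JAR}, {DATASET}",
        "log = weka.log",
    ]

    # One "queue" statement per classifier, each with its own arguments and output files.
    for name in classifiers:
        tag = name.rsplit(".", 1)[-1]  # e.g. "J48" from "weka.classifiers.trees.J48"
        lines += [
            "",
            f"arguments = -cp {WEKA_JAR} {name} -t {DATASET} "
            f"-split-percentage {split_percentage}",
            f"output = {tag}.out",
            f"error = {tag}.err",
            "queue",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    chosen = ["weka.classifiers.trees.J48", "weka.classifiers.bayes.NaiveBayes"]
    print(build_submit_description(chosen, split_percentage=70))
```

Saving the printed text to a file and passing it to `condor_submit` would queue the jobs. Because each classifier runs as an independent job, the pool can evaluate several methods at the same time, which is the high-throughput pattern the abstract relies on.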

Original language: English
Title of host publication: 2013 IEEE Conference on Open Systems, ICOS 2013
Publisher: IEEE Computer Society
Pages: 209-214
Number of pages: 6
ISBN (Print): 9781479902859
DOIs: https://doi.org/10.1109/ICOS.2013.6735076
Publication status: Published - 2013
Event: 2013 IEEE Conference on Open Systems, ICOS 2013 - Kuching, Sarawak
Duration: 2 Dec 2013 to 4 Dec 2013

Other

Other: 2013 IEEE Conference on Open Systems, ICOS 2013
City: Kuching, Sarawak
Period: 2/12/13 to 4/12/13

Fingerprint

Supercomputers
World Wide Web
Processing
Middleware
Data mining
Throughput
Data storage equipment
Testing
Costs

Keywords

  • Classification
  • High performance computing (HPC)
  • High-throughput computing Condor (HTCondor)
  • PHP
  • WEKA

ASJC Scopus subject areas

  • Software

Cite this

Choo, T. A., Abu Bakar, A., Talebi, A. B., A Sundararajan, E., & Rahmany, M. (2013). Classification modeling on distributed environment. In 2013 IEEE Conference on Open Systems, ICOS 2013 (pp. 209-214). [6735076] IEEE Computer Society. https://doi.org/10.1109/ICOS.2013.6735076

Classification modeling on distributed environment. / Choo, Ting Ah; Abu Bakar, Azuraliza; Talebi, Amin Benjavad; A Sundararajan, Elankovan; Rahmany, Mahathir.

2013 IEEE Conference on Open Systems, ICOS 2013. IEEE Computer Society, 2013. p. 209-214 6735076.

Choo, TA, Abu Bakar, A, Talebi, AB, A Sundararajan, E & Rahmany, M 2013, Classification modeling on distributed environment. in 2013 IEEE Conference on Open Systems, ICOS 2013., 6735076, IEEE Computer Society, pp. 209-214, 2013 IEEE Conference on Open Systems, ICOS 2013, Kuching, Sarawak, 2/12/13. https://doi.org/10.1109/ICOS.2013.6735076

Choo TA, Abu Bakar A, Talebi AB, A Sundararajan E, Rahmany M. Classification modeling on distributed environment. In 2013 IEEE Conference on Open Systems, ICOS 2013. IEEE Computer Society. 2013. p. 209-214. 6735076 https://doi.org/10.1109/ICOS.2013.6735076

Choo, Ting Ah ; Abu Bakar, Azuraliza ; Talebi, Amin Benjavad ; A Sundararajan, Elankovan ; Rahmany, Mahathir. / Classification modeling on distributed environment. 2013 IEEE Conference on Open Systems, ICOS 2013. IEEE Computer Society, 2013. pp. 209-214

@inproceedings{cf4e6119c3554a8688ccd964e14e1e11,
title = "Classification modeling on distributed environment",
abstract = "High Performance Computing (HPC) is typically used in science and technology to solve problems that cannot be solved on a single machine because of limited computing resources such as memory and the number of processors. HPC can improve processing speed. However, using a high-powered supercomputer for this type of problem is very costly. In some circumstances, High-Throughput Computing (HTC) on distributed environments performs parallel processing at speeds comparable to those of a supercomputer. In this work, we reduce the time needed to mine a large data file and build a classification model on a distributed environment, using a web-based portal that offers a range of classification methods. The web-based application was built in PHP and uses the classification techniques of the data mining software WEKA, version 3.6.0, with a percentage split of the data into training and testing sets. HTCondor middleware controls and runs all jobs on the distributed environment. The results show a significant improvement in processing time.",
keywords = "Classification, High performance computing (HPC), High-throughput computing Condor (HTCondor), PHP, WEKA",
author = "Choo, {Ting Ah} and {Abu Bakar}, Azuraliza and Talebi, {Amin Benjavad} and {A Sundararajan}, Elankovan and Rahmany, Mahathir",
year = "2013",
doi = "10.1109/ICOS.2013.6735076",
language = "English",
isbn = "9781479902859",
pages = "209--214",
booktitle = "2013 IEEE Conference on Open Systems, ICOS 2013",
publisher = "IEEE Computer Society",

}

TY - GEN

T1 - Classification modeling on distributed environment

AU - Choo, Ting Ah

AU - Abu Bakar, Azuraliza

AU - Talebi, Amin Benjavad

AU - A Sundararajan, Elankovan

AU - Rahmany, Mahathir

PY - 2013

Y1 - 2013

AB - High Performance Computing (HPC) is typically used in science and technology to solve problems that cannot be solved on a single machine because of limited computing resources such as memory and the number of processors. HPC can improve processing speed. However, using a high-powered supercomputer for this type of problem is very costly. In some circumstances, High-Throughput Computing (HTC) on distributed environments performs parallel processing at speeds comparable to those of a supercomputer. In this work, we reduce the time needed to mine a large data file and build a classification model on a distributed environment, using a web-based portal that offers a range of classification methods. The web-based application was built in PHP and uses the classification techniques of the data mining software WEKA, version 3.6.0, with a percentage split of the data into training and testing sets. HTCondor middleware controls and runs all jobs on the distributed environment. The results show a significant improvement in processing time.

KW - Classification

KW - High performance computing (HPC)

KW - High-throughput computing Condor (HTCondor)

KW - PHP

KW - WEKA

UR - http://www.scopus.com/inward/record.url?scp=84897678119&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84897678119&partnerID=8YFLogxK

U2 - 10.1109/ICOS.2013.6735076

DO - 10.1109/ICOS.2013.6735076

M3 - Conference contribution

AN - SCOPUS:84897678119

SN - 9781479902859

SP - 209

EP - 214

BT - 2013 IEEE Conference on Open Systems, ICOS 2013

PB - IEEE Computer Society

ER -