Feature selection for chemical compound extraction using wrapper approach with Naive Bayes classifier

Basel Alshaikhdeeb, Kamsuriah Ahmad

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Biomedical entity extraction is the process of identifying biomedical instances such as disorders, viruses, proteins, and genes, among others. One such instance is the chemical compound, which has attracted many researchers' attention because it is challenging to extract. Most studies proposed for chemical compound extraction have relied on supervised machine learning techniques, because these learn a statistical model rather than depending on handcrafted rules. However, the performance of supervised machine learning depends chiefly on the features used. Previous studies have used a wide range of features for extracting chemical compounds, so a feature selection step that determines the best combination of features has become imperative. This paper therefore applies the Naïve Bayes classification method together with the Wrapper Subset Selection approach to identify the best features. The results show that the proposed combination identifies the best combination of features, consisting of Capitalization, Punctuation, Prefix, and Part-Of-Speech Tagging, achieving an F-measure of 0.72. This result was compared with the state of the art and demonstrated competitive performance.
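The wrapper idea in the abstract can be illustrated with a minimal sketch. It assumes a greedy forward search and a small add-one-smoothed categorical Naive Bayes written from scratch; the abstract does not state the paper's actual search strategy, corpus, or feature definitions. The toy tokens, their hand-assigned POS tags, the `token_features` helper, and the `length_parity` distractor feature are all hypothetical and serve only to show how a wrapper scores candidate feature subsets by the F-measure of the classifier itself.

```python
import math
import string
from collections import Counter, defaultdict

# Hypothetical toy data: (token, hand-assigned POS tag, 1 if chemical else 0).
TOKENS = [
    ("Ethanol", "NN", 1), ("The", "DT", 0), ("2-propanol", "NN", 1),
    ("reacts", "VBZ", 0), ("Methane", "NN", 1), ("with", "IN", 0),
    ("methanol", "NN", 1), ("patient", "NN", 0), ("ethanol", "NN", 1),
    ("quickly", "RB", 0), ("N,N-dimethylformamide", "NN", 1),
    ("Results", "NNS", 0), ("and", "CC", 0), ("protein", "NN", 0),
]
# Four feature families named in the abstract plus one assumed distractor.
CANDIDATES = ["capitalization", "punctuation", "prefix", "pos", "length_parity"]


def token_features(token, pos_tag, prefix_len=3):
    """Map one token to categorical feature values (illustrative definitions)."""
    return {
        "capitalization": token[:1].isupper(),
        "punctuation": any(ch in string.punctuation for ch in token),
        "prefix": token[:prefix_len].lower(),
        "pos": pos_tag,
        "length_parity": len(token) % 2,
    }


def train_nb(rows, labels, names):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)
    vocab = defaultdict(set)
    for row, label in zip(rows, labels):
        for name in names:
            value_counts[(name, label)][row[name]] += 1
            vocab[name].add(row[name])
    return class_counts, value_counts, vocab


def predict_nb(model, row, names):
    """Return the class with the highest smoothed log-posterior."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        for name in names:
            seen = value_counts[(name, label)][row[name]]
            # Add-one smoothing; the extra +1 reserves mass for unseen values.
            score += math.log((seen + 1) / (class_counts[label] + len(vocab[name]) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def f_measure(gold, pred):
    """Balanced F-measure for the positive (chemical) class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cv_f_measure(rows, labels, names, folds=3):
    """Pool predictions over simple index-modulo folds and score them."""
    gold, pred = [], []
    for k in range(folds):
        train = [i for i in range(len(rows)) if i % folds != k]
        test = [i for i in range(len(rows)) if i % folds == k]
        model = train_nb([rows[i] for i in train], [labels[i] for i in train], names)
        for i in test:
            gold.append(labels[i])
            pred.append(predict_nb(model, rows[i], names))
    return f_measure(gold, pred)


def wrapper_forward_selection(rows, labels, candidates, folds=3):
    """Greedily add the feature that most improves cross-validated F-measure."""
    selected, best = [], 0.0
    while True:
        scored = [(cv_f_measure(rows, labels, selected + [c], folds), c)
                  for c in candidates if c not in selected]
        if not scored:
            break
        score, name = max(scored)
        if score <= best:  # stop when no remaining feature helps
            break
        selected.append(name)
        best = score
    return selected, best


rows = [token_features(tok, pos) for tok, pos, _ in TOKENS]
labels = [label for _, _, label in TOKENS]
selected, best_score = wrapper_forward_selection(rows, labels, CANDIDATES)
print("selected features:", selected)
print("cross-validated F-measure:", round(best_score, 2))
```

The defining property of a wrapper is that the classifier sits inside the search loop: each candidate subset is judged by retraining Naive Bayes and measuring its F-measure, rather than by a classifier-independent statistic as in filter methods. The scores printed for this toy data say nothing about the 0.72 reported in the paper.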

Original language: English
Title of host publication: Proceedings of the 2017 6th International Conference on Electrical Engineering and Informatics
Subtitle of host publication: Sustainable Society Through Digital Innovation, ICEEI 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-6
Number of pages: 6
Volume: 2017-November
ISBN (Electronic): 9781538604755
DOI: 10.1109/ICEEI.2017.8312421
Publication status: Published - 9 Mar 2018
Event: 6th International Conference on Electrical Engineering and Informatics, ICEEI 2017 - Langkawi, Malaysia
Duration: 25 Nov 2017 to 27 Nov 2017

Other

Other: 6th International Conference on Electrical Engineering and Informatics, ICEEI 2017
Country: Malaysia
City: Langkawi
Period: 25/11/17 to 27/11/17

Fingerprint

Naive Bayes Classifier
Chemical compounds
Aptitude
Wrapper
Feature Selection
Feature extraction
Classifiers
Chemical Phenomena
Learning systems
Statistical Models
Supervised Learning
Machine Learning
Research Personnel
Viruses
Subset Selection
Tagging
Prefix
Bayes
Genes
Statistical Model

Keywords

  • Biomedical Named Entity Recognition
  • Chemical Compounds Extraction
  • Feature Selection
  • Naïve Bayes

ASJC Scopus subject areas

  • Artificial Intelligence
  • Control and Optimization
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Information Systems
  • Software
  • Electrical and Electronic Engineering
  • Health Informatics

Cite this

Alshaikhdeeb, B., & Ahmad, K. (2018). Feature selection for chemical compound extraction using wrapper approach with Naive Bayes classifier. In Proceedings of the 2017 6th International Conference on Electrical Engineering and Informatics: Sustainable Society Through Digital Innovation, ICEEI 2017 (Vol. 2017-November, pp. 1-6). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICEEI.2017.8312421

@inproceedings{05798952f1e74ea5b0fd5a182fbb3057,
title = "Feature selection for chemical compound extraction using wrapper approach with Naive Bayes classifier",
abstract = "Biomedical Entity extraction is the process of identifying biomedical instances such as disorders, viruses, proteins, genes and others. One of these instances is the chemical compound which caught many researchers' attentions regarding the challenging task of extracting them. In fact, most of the studies that have been proposed for chemical compounds extraction have relied on supervised machine learning techniques regarding its ability to adopt a statistical model rather than handcrafted rules. However, the key characteristic of the use of supervised machine learning techniques lies on the utilized features. There is a wide range of features that have been used in the previous studies for the process of extracting chemical compounds. Hence, the need of accommodating a feature selection task in order to determine the best combination of features is becoming imperative. Therefore, this paper aims to apply a combination of Na{\"i}ve Bayes classification method with the Wrapper Subset Selection approach to identify the best features. Results showed that the proposed combination has the ability to identify the best combination of features which consists of Capitalization, Punctuation, Prefix and Part-Of-Speech Tagging by achieving 0.72 of f-measure. Such result has been compared to the state of the art and it demonstrated competitive performance.",
keywords = "Biomedical Named Entity Recognition, Chemical Compounds Extraction, Feature Selection, Na{\"i}ve Bayes",
author = "Basel Alshaikhdeeb and Kamsuriah Ahmad",
year = "2018",
month = "3",
day = "9",
doi = "10.1109/ICEEI.2017.8312421",
language = "English",
volume = "2017-November",
pages = "1--6",
booktitle = "Proceedings of the 2017 6th International Conference on Electrical Engineering and Informatics",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - Feature selection for chemical compound extraction using wrapper approach with Naive Bayes classifier

AU - Alshaikhdeeb, Basel

AU - Ahmad, Kamsuriah

PY - 2018/3/9

Y1 - 2018/3/9

AB - Biomedical Entity extraction is the process of identifying biomedical instances such as disorders, viruses, proteins, genes and others. One of these instances is the chemical compound which caught many researchers' attentions regarding the challenging task of extracting them. In fact, most of the studies that have been proposed for chemical compounds extraction have relied on supervised machine learning techniques regarding its ability to adopt a statistical model rather than handcrafted rules. However, the key characteristic of the use of supervised machine learning techniques lies on the utilized features. There is a wide range of features that have been used in the previous studies for the process of extracting chemical compounds. Hence, the need of accommodating a feature selection task in order to determine the best combination of features is becoming imperative. Therefore, this paper aims to apply a combination of Naïve Bayes classification method with the Wrapper Subset Selection approach to identify the best features. Results showed that the proposed combination has the ability to identify the best combination of features which consists of Capitalization, Punctuation, Prefix and Part-Of-Speech Tagging by achieving 0.72 of f-measure. Such result has been compared to the state of the art and it demonstrated competitive performance.

KW - Biomedical Named Entity Recognition

KW - Chemical Compounds Extraction

KW - Feature Selection

KW - Naïve Bayes

UR - http://www.scopus.com/inward/record.url?scp=85050808469&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85050808469&partnerID=8YFLogxK

U2 - 10.1109/ICEEI.2017.8312421

DO - 10.1109/ICEEI.2017.8312421

M3 - Conference contribution

VL - 2017-November

SP - 1

EP - 6

BT - Proceedings of the 2017 6th International Conference on Electrical Engineering and Informatics

PB - Institute of Electrical and Electronics Engineers Inc.

ER -