Adaptive learning for lemmatization in morphology analysis

Mary Ting, Abdul Kadir Rabiah, Tengku Mohd Tengku Sembok, Fatimah Ahmad, Azreen Azman

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Many linguistic morphology tools are available on the market for commercial and research purposes. Morphological techniques are incorporated into these tools so that they can analyze the internal structure of natural language words. Such techniques play an important role in reducing the size of the vocabulary while retaining the semantic meaning of the knowledge in an NLP system. Most of the algorithms implemented can only carry out stemming rather than lemmatization. Even with advances in technology, none of the available lemmatization algorithms produces 100% accurate results. Inappropriate words produced by current algorithms may alter the overall meaning they are meant to represent, which directly affects the outcome of the NLP system. This paper proposes a new method for handling lemmatization during morphological analysis. The method consists of three layers of lemmatization, which incorporate the well-known Stanford parser API, the WordNet database, and an adaptive learning technique. The Stanford parser API is used in the first layer, the WordNet database and the adaptive learning technique in the second layer, and another lemmatization algorithm in the final layer. The lemmatized words yielded by the proposed method are more appropriate than those of previous algorithms because of user participation in the adaptive learning technique, which ultimately improves the semantic knowledge represented and stored in the knowledge base.
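The layered idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how the layers hand off to one another, so the control flow below is an assumption. The small lookup table stands in for the Stanford parser API, the small word set stands in for the WordNet database, the suffix-stripping function stands in for the unnamed final lemmatization algorithm, and `AdaptiveLemmatizer`, `ask_user`, and `scripted_user` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Dict, List, Optional, Set

# Toy stand-ins (assumptions): a real system would call the Stanford parser
# API and query the WordNet database instead of using these small tables.
FIRST_LAYER_LEMMAS: Dict[str, str] = {"running": "run", "studies": "study"}
LEXICON: Set[str] = {"run", "study", "good", "mouse", "walk"}


def first_layer(word: str) -> str:
    """Layer 1 stand-in: return a known lemma, else the word unchanged."""
    return FIRST_LAYER_LEMMAS.get(word, word)


def fallback_layer(word: str) -> str:
    """Layer 3 stand-in: naive suffix stripping, used only as a last resort."""
    for suffix, replacement in (("ies", "y"), ("ed", ""), ("s", "")):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)] + replacement
    return word


class AdaptiveLemmatizer:
    """Hypothetical three-layer pipeline with a user-in-the-loop second layer."""

    def __init__(self, ask_user: Callable[[str, str], Optional[str]]):
        self.ask_user = ask_user           # returns a corrected lemma or None
        self.learned: Dict[str, str] = {}  # corrections remembered from users

    def lemmatize(self, word: str) -> str:
        word = word.lower()
        # Adaptive memory: reuse a correction the user already supplied.
        if word in self.learned:
            return self.learned[word]
        # Layer 1: candidate lemma from the parser stand-in.
        candidate = first_layer(word)
        # Layer 2: accept the candidate if the lexicon stand-in knows it.
        if candidate in LEXICON:
            return candidate
        # Layer 2 (adaptive part): ask the user and remember the answer.
        correction = self.ask_user(word, candidate)
        if correction:
            self.learned[word] = correction
            return correction
        # Layer 3: fall back to the rule-based stand-in.
        return fallback_layer(candidate)


# Scripted "user" so the sketch runs without interaction (an assumption).
calls: List[str] = []
ANSWERS: Dict[str, str] = {"better": "good", "mice": "mouse"}


def scripted_user(word: str, candidate: str) -> Optional[str]:
    calls.append(word)
    return ANSWERS.get(word)


if __name__ == "__main__":
    lemmatizer = AdaptiveLemmatizer(scripted_user)
    for token in ["running", "studies", "better", "better", "walked"]:
        print(token, "->", lemmatizer.lemmatize(token))
```

In this sketch the user is consulted only when the first-layer candidate is missing from the lexicon, and each answer is stored so that the same word never triggers a second question. That is one plausible reading of "user participation in the adaptive learning technique"; the paper itself may organize the layers differently.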

Original language: English
Title of host publication: Frontiers in Artificial Intelligence and Applications
Publisher: IOS Press
Pages: 991-1005
Number of pages: 15
Volume: 265
ISBN (Print): 9781614994336
DOIs: https://doi.org/10.3233/978-1-61499-434-3-991
Publication status: Published - 2014
Event: 13th International Conference on New Trends in Intelligent Software Methodology Tools, and Techniques, SoMeT 2014 - Langkawi, Malaysia
Duration: 22 Sep 2014 to 24 Sep 2014

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 265
ISSN (Print): 0922-6389

Other

Other: 13th International Conference on New Trends in Intelligent Software Methodology Tools, and Techniques, SoMeT 2014
Country: Malaysia
City: Langkawi
Period: 22/9/14 to 24/9/14

Fingerprint

Application programming interfaces (API)
Semantics
Linguistics

Keywords

  • Adaptive learning
  • Lemmatization
  • Morphology analysis
  • Natural language processing
  • Semi-supervised learning

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Ting, M., Rabiah, A. K., Tengku Sembok, T. M., Ahmad, F., & Azman, A. (2014). Adaptive learning for lemmatization in morphology analysis. In Frontiers in Artificial Intelligence and Applications (Vol. 265, pp. 991-1005). (Frontiers in Artificial Intelligence and Applications; Vol. 265). IOS Press. https://doi.org/10.3233/978-1-61499-434-3-991

Adaptive learning for lemmatization in morphology analysis. / Ting, Mary; Rabiah, Abdul Kadir; Tengku Sembok, Tengku Mohd; Ahmad, Fatimah; Azman, Azreen.

Frontiers in Artificial Intelligence and Applications. Vol. 265 IOS Press, 2014. p. 991-1005 (Frontiers in Artificial Intelligence and Applications; Vol. 265).

Ting, M, Rabiah, AK, Tengku Sembok, TM, Ahmad, F & Azman, A 2014, Adaptive learning for lemmatization in morphology analysis. in Frontiers in Artificial Intelligence and Applications. vol. 265, Frontiers in Artificial Intelligence and Applications, vol. 265, IOS Press, pp. 991-1005, 13th International Conference on New Trends in Intelligent Software Methodology Tools, and Techniques, SoMeT 2014, Langkawi, Malaysia, 22/9/14. https://doi.org/10.3233/978-1-61499-434-3-991
Ting M, Rabiah AK, Tengku Sembok TM, Ahmad F, Azman A. Adaptive learning for lemmatization in morphology analysis. In Frontiers in Artificial Intelligence and Applications. Vol. 265. IOS Press. 2014. p. 991-1005. (Frontiers in Artificial Intelligence and Applications). https://doi.org/10.3233/978-1-61499-434-3-991
Ting, Mary ; Rabiah, Abdul Kadir ; Tengku Sembok, Tengku Mohd ; Ahmad, Fatimah ; Azman, Azreen. / Adaptive learning for lemmatization in morphology analysis. Frontiers in Artificial Intelligence and Applications. Vol. 265 IOS Press, 2014. pp. 991-1005 (Frontiers in Artificial Intelligence and Applications).
@inproceedings{5830276da75d4a8699c99b89384736be,
title = "Adaptive learning for lemmatization in morphology analysis",
keywords = "Adaptive learning, Lemmatization, Morphology analysis, Natural language processing, Semi-supervised learning",
author = "Mary Ting and Rabiah, {Abdul Kadir} and {Tengku Sembok}, {Tengku Mohd} and Fatimah Ahmad and Azreen Azman",
year = "2014",
doi = "10.3233/978-1-61499-434-3-991",
language = "English",
isbn = "9781614994336",
volume = "265",
series = "Frontiers in Artificial Intelligence and Applications",
publisher = "IOS Press",
pages = "991--1005",
booktitle = "Frontiers in Artificial Intelligence and Applications",

}

TY - GEN

T1 - Adaptive learning for lemmatization in morphology analysis

AU - Ting, Mary

AU - Rabiah, Abdul Kadir

AU - Tengku Sembok, Tengku Mohd

AU - Ahmad, Fatimah

AU - Azman, Azreen

PY - 2014

Y1 - 2014

KW - Adaptive learning

KW - Lemmatization

KW - Morphology analysis

KW - Natural language processing

KW - Semi-supervised learning

UR - http://www.scopus.com/inward/record.url?scp=84948740482&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84948740482&partnerID=8YFLogxK

U2 - 10.3233/978-1-61499-434-3-991

DO - 10.3233/978-1-61499-434-3-991

M3 - Conference contribution

AN - SCOPUS:84948740482

SN - 9781614994336

VL - 265

T3 - Frontiers in Artificial Intelligence and Applications

SP - 991

EP - 1005

BT - Frontiers in Artificial Intelligence and Applications

PB - IOS Press

ER -