Using linguistic patterns in FCA-based approach for automatic acquisition of taxonomies from Malay text

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

Previous work has shown that Formal Concept Analysis (FCA) can be used to automatically acquire taxonomies from Indo-European text. The taxonomies are built via FCA using syntactic dependencies, such as verb/head-object, verb/head-subject and verb/prepositional phrase-complement, as attributes. This paper discusses the overall process of learning a taxonomy using FCA with the same syntactic dependencies used for English, applied to Malay texts. Malay, an Austronesian language, follows the same Subject-Verb-Object sentence structure as English but is syntactically different. The results show lower recall and precision than related work in other languages. The poor results are caused by several factors, such as the choice of smoothing technique. The experimental results indicate that the current smoothing technique does not produce good results with FCA. Therefore, in addition to the syntactic dependencies, we used linguistic patterns such as Hearst's patterns to find similarities between terms. We compare the results of our technique against the cosine measure used in the FCA-based taxonomy learning approach. The proposed technique attains both higher precision and recall than the previous technique.
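The three ingredients named in the abstract (FCA over syntactic-dependency attributes, a cosine measure between terms, and Hearst-style patterns) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the toy context, the attribute names such as `book_obj`, the helper names, and the English "such as" pattern (the abstract does not list the Malay patterns the paper uses).

```python
import math
import re
from itertools import combinations

# Hypothetical toy formal context: term -> verb/object attributes it occurs with.
CONTEXT = {
    "hotel":     {"book_obj", "rent_obj"},
    "apartment": {"book_obj", "rent_obj"},
    "car":       {"rent_obj", "drive_obj"},
    "bike":      {"rent_obj", "drive_obj", "ride_obj"},
}

def extent(context, attrs):
    """Terms that carry every attribute in attrs."""
    return frozenset(t for t, a in context.items() if attrs <= a)

def intent(context, terms):
    """Attributes shared by all given terms (all attributes if terms is empty)."""
    shared = set().union(*context.values())
    for t in terms:
        shared &= context[t]
    return frozenset(shared)

def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing every subset of terms.

    Exponential in the number of terms, so only suitable for toy contexts.
    """
    concepts = set()
    terms = list(context)
    for r in range(len(terms) + 1):
        for subset in combinations(terms, r):
            i = intent(context, subset)
            concepts.add((extent(context, i), i))
    return concepts

def cosine(a, b):
    """Cosine similarity between two terms seen as binary attribute vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

# One English Hearst-style pattern: "<hypernym> such as <x>, <y> and <z>".
HEARST = re.compile(
    r"(\w+) such as (\w+(?:, (?!and |or )\w+)*(?:,? (?:and|or) \w+)?)"
)

def hearst_pairs(text):
    """Return (hyponym, hypernym) pairs suggested by the pattern."""
    pairs = []
    for m in HEARST.finditer(text):
        hypernym = m.group(1)
        hyponyms = re.split(r",?\s+(?:and|or)\s+|,\s*", m.group(2))
        pairs.extend((h, hypernym) for h in hyponyms if h)
    return pairs
```

On the toy context, `formal_concepts(CONTEXT)` groups `hotel` with `apartment` and `car` with `bike`, while `hearst_pairs("vehicles such as cars, bikes and buses")` pairs each listed noun with `vehicles`. Terms that a pattern lists together could then be treated as similar even when their attribute vectors are too sparse for the cosine measure to say so, which is one plausible reading of how the two signals complement each other.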

Original language: English
Title of host publication: Proceedings - International Symposium on Information Technology 2008, ITSim
Volume: 2
DOIs: https://doi.org/10.1109/ITSIM.2008.4631709
Publication status: Published - 2008
Event: International Symposium on Information Technology 2008, ITSim - Kuala Lumpur
Duration: 26 Aug 2008 to 29 Aug 2008

Other

Other: International Symposium on Information Technology 2008, ITSim
City: Kuala Lumpur
Period: 26/8/08 to 29/8/08

Fingerprint

Formal concept analysis
Taxonomies
Linguistics
Syntactics

ASJC Scopus subject areas

  • Artificial Intelligence
  • Information Systems
  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Cite this

Using linguistic patterns in FCA-based approach for automatic acquisition of taxonomies from Malay text. / Ahmad Nazri, Mohd Zakree; Abu Bakar, Azuraliza; Shamsudin, Siti Mariyam; Abdul Ghani, Ahmad Tarmizi.

Proceedings - International Symposium on Information Technology 2008, ITSim. Vol. 2 2008. 4631709.

Ahmad Nazri, MZ, Abu Bakar, A, Shamsudin, SM & Abdul Ghani, AT 2008, Using linguistic patterns in FCA-based approach for automatic acquisition of taxonomies from Malay text. in Proceedings - International Symposium on Information Technology 2008, ITSim. vol. 2, 4631709, International Symposium on Information Technology 2008, ITSim, Kuala Lumpur, 26/8/08. https://doi.org/10.1109/ITSIM.2008.4631709
@inproceedings{be262292d5404a8aa389d68d8432f46e,
title = "Using linguistic patterns in FCA-based approach for automatic acquisition of taxonomies from Malay text",
author = "{Ahmad Nazri}, {Mohd Zakree} and {Abu Bakar}, Azuraliza and Shamsudin, {Siti Mariyam} and {Abdul Ghani}, {Ahmad Tarmizi}",
year = "2008",
doi = "10.1109/ITSIM.2008.4631709",
language = "English",
isbn = "9781424423286",
volume = "2",
booktitle = "Proceedings - International Symposium on Information Technology 2008, ITSim",

}

TY - GEN

T1 - Using linguistic patterns in FCA-based approach for automatic acquisition of taxonomies from Malay text

AU - Ahmad Nazri, Mohd Zakree

AU - Abu Bakar, Azuraliza

AU - Shamsudin, Siti Mariyam

AU - Abdul Ghani, Ahmad Tarmizi

PY - 2008

Y1 - 2008

UR - http://www.scopus.com/inward/record.url?scp=57349164375&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=57349164375&partnerID=8YFLogxK

U2 - 10.1109/ITSIM.2008.4631709

DO - 10.1109/ITSIM.2008.4631709

M3 - Conference contribution

SN - 9781424423286

VL - 2

BT - Proceedings - International Symposium on Information Technology 2008, ITSim

ER -