Measuring the Compositionality of Arabic Multiword Expressions

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

9 Citations (Scopus)

Abstract

This paper presents a method for measuring the compositionality of multiword expressions (MWEs). Using Wikipedia (WP) as a lexical resource, the method identifies MWEs as the titles of Wikipedia articles that consist of more than one word, without further processing. For semantic representation, it exploits Wikipedia's hierarchical taxonomy to represent each concept (a single word or a multiword expression) as a feature vector of the WP articles that belong to the concept's categories and sub-categories. Two scores based on semantic similarity, the literality score and the multiplicative composition function score, are used to measure the compositionality of an MWE. The method is evaluated by comparing its compositionality scores against a dataset of human judgments for 100 Arabic noun-noun compounds.
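
The abstract does not state the scoring formulas, so the following is only a minimal sketch of how two such scores could be computed, not the authors' implementation. It assumes each concept is a sparse vector keyed by Wikipedia article title, that semantic similarity is cosine similarity, that the literality score averages the similarity of the MWE to each constituent, and that the multiplicative composition is the element-wise product of the constituent vectors. All function names, article keys, and weights are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def literality_score(mwe, w1, w2):
    """Assumed form: mean similarity of the MWE to each constituent word."""
    return (cosine(mwe, w1) + cosine(mwe, w2)) / 2


def multiplicative_score(mwe, w1, w2):
    """Assumed form: similarity of the MWE to the element-wise product
    of the two constituent vectors."""
    composed = {k: w * w2[k] for k, w in w1.items() if k in w2}
    return cosine(mwe, composed)


# Toy vectors: keys stand in for Wikipedia articles found under each
# concept's categories and sub-categories (placeholder data only).
head = {"A1": 1.0, "A2": 1.0, "A3": 1.0}
modifier = {"A2": 1.0, "A3": 1.0, "A4": 1.0}
compound = {"A2": 1.0, "A3": 1.0}   # shares articles with both constituents
idiom = {"A9": 1.0}                 # shares nothing with its constituents

print(literality_score(compound, head, modifier))
print(multiplicative_score(compound, head, modifier))
print(literality_score(idiom, head, modifier))
print(multiplicative_score(idiom, head, modifier))
```

Under these assumptions, an expression whose vector overlaps with its constituents scores high on both measures, while one with no overlap scores zero, which is the behaviour a compositionality measure is meant to capture.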

Original language: English
Title of host publication: Communications in Computer and Information Science
Publisher: Springer Verlag
Pages: 245-256
Number of pages: 12
Volume: 378 CCIS
ISBN (Print): 9783642405662
DOI: https://doi.org/10.1007/978-3-642-40567-9_21
Publication status: Published - 2013
Event: 2nd International Multi-Conference on Artificial Intelligence Technology, M-CAIT 2013 - Shah Alam
Duration: 28 Aug 2013 to 29 Aug 2013

Publication series

Name: Communications in Computer and Information Science
Volume: 378 CCIS
ISSN (Print): 1865-0929

Other

Other: 2nd International Multi-Conference on Artificial Intelligence Technology, M-CAIT 2013
City: Shah Alam
Period: 28/8/13 to 29/8/13

Keywords

  • multiword expression
  • semantic compositionality
  • semantic similarity
  • Wikipedia

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Saif, A., Ab Aziz, M. J., & Omar, N. (2013). Measuring the Compositionality of Arabic Multiword Expressions. In Communications in Computer and Information Science (Vol. 378 CCIS, pp. 245-256). (Communications in Computer and Information Science; Vol. 378 CCIS). Springer Verlag. https://doi.org/10.1007/978-3-642-40567-9_21

@inproceedings{fe3f1c55ce7d428a978e414f39f3dcc5,
title = "Measuring the Compositionality of Arabic Multiword Expressions",
keywords = "multiword expression, semantic compositionality, semantic similarity, wikipedia",
author = "Abdulgabbar Saif and {Ab Aziz}, {Mohd Juzaiddin} and Nazlia Omar",
year = "2013",
doi = "10.1007/978-3-642-40567-9_21",
language = "English",
isbn = "9783642405662",
volume = "378 CCIS",
series = "Communications in Computer and Information Science",
publisher = "Springer Verlag",
pages = "245--256",
booktitle = "Communications in Computer and Information Science",

}
