Linguistic rule-based translation of natural language question into SPARQL query for effective semantic question answering

Nurfadhlina Mohd Sharef, Shahrul Azman Mohd Noah, Masrah Azrifah Azmi Murad

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Semantic question answering (SQA) requires different processing from conventional information retrieval because the semantic knowledge base (KB) is stored as triples. Manipulating this knowledge, however, requires an understanding of its structure and proficiency in a semantic query language such as SPARQL. A natural language interface (NLI) alleviates this by allowing users to pose questions in their own language; it then translates the input into an equivalent SPARQL query and executes it to retrieve the answer. However, no existing research has presented a holistic computational model for translating a natural language (NL) question into an equivalent SPARQL query for querying a semantic KB. Existing studies have focused mainly on semantic disambiguation through consolidation, in which user interaction is initiated so that the user can choose the relevant concept to insert into the SPARQL query. In addition, the linguistic understanding of the input has limited coverage: only one triple is constructed, which loses much of the original expression. Linguistic understanding needs to be extended to involve multiple variables and multiple triples in the translated SPARQL query so that an accurate answer is returned. This paper therefore addresses the linguistic challenge in NLI, specifically the depth of question complexity, the processes that must be performed to answer a question, and the gaps in existing studies. It introduces a linguistic rule-based translation model for natural language questions that uses a set of observational variables to extract the information in a question: (i) whether the focus equals the subject, (ii) the number of subjects, (iii) the number of properties, (iv) the number of objects, (v) whether the object is an instance, (vi) whether the question contains a superlative expression, (vii) the superlative orientation, and (viii) whether the question contains an aggregate expression. The model also aims to reduce dependence on clarification dialogues. The results show that the approach eliminates clarification dialogues and increases the linguistic coverage of NLI.
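
The abstract names the eight observational variables but not the translation rules themselves, so the following is only a minimal sketch of how such variables might steer the construction of a multi-triple SPARQL query. Everything in it is an assumption made for illustration: the `QuestionFeatures` class, the `build_sparql` function, the `ex:` prefix and its URI, and the example triples are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A triple pattern as (subject, predicate, object) strings.
Triple = Tuple[str, str, str]


@dataclass
class QuestionFeatures:
    """Hypothetical container for the eight observational variables."""
    focus_is_subject: bool                   # (i)
    num_subjects: int                        # (ii)
    num_properties: int                      # (iii)
    num_objects: int                         # (iv)
    object_is_instance: bool                 # (v)
    has_superlative: bool                    # (vi)
    superlative_orientation: Optional[str]   # (vii) "max" or "min"
    has_aggregate: bool                      # (viii)


def build_sparql(features: QuestionFeatures,
                 triples: List[Triple],
                 focus_var: str,
                 order_var: Optional[str] = None,
                 prefix: str = "PREFIX ex: <http://example.org/>") -> str:
    """Assemble a SPARQL query string from triple patterns and features.

    Illustrative only: an aggregate is rendered as COUNT, and a
    superlative as ORDER BY ... LIMIT 1. The paper's real rules may differ.
    """
    patterns = " .\n  ".join(" ".join(t) for t in triples)
    if features.has_aggregate:
        head = f"SELECT (COUNT(DISTINCT ?{focus_var}) AS ?count)"
    else:
        head = f"SELECT DISTINCT ?{focus_var}"
    query = f"{prefix}\n{head}\nWHERE {{\n  {patterns} .\n}}"
    if features.has_superlative:
        if order_var is None:
            raise ValueError("a superlative question needs a variable to order by")
        direction = "DESC" if features.superlative_orientation == "max" else "ASC"
        query += f"\nORDER BY {direction}(?{order_var})\nLIMIT 1"
    return query


# Assumed example question: "Which is the longest river in Malaysia?"
features = QuestionFeatures(
    focus_is_subject=True, num_subjects=1, num_properties=2, num_objects=2,
    object_is_instance=True, has_superlative=True,
    superlative_orientation="max", has_aggregate=False,
)
triples = [
    ("?river", "a", "ex:River"),
    ("?river", "ex:locatedIn", "ex:Malaysia"),
    ("?river", "ex:length", "?length"),
]
query = build_sparql(features, triples, focus_var="river", order_var="length")
print(query)
```

The example question yields three triple patterns and two variables, which is the kind of multi-variable, multi-triple query the abstract argues a single-triple translation cannot express.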

Original language: English
Pages (from-to): 557-575
Number of pages: 19
Journal: Journal of Theoretical and Applied Information Technology
Volume: 80
Issue number: 3
Publication status: Published - 31 Oct 2015
Externally published: Yes

Fingerprint

SPARQL
Question Answering
Linguistics
Natural Language
Semantics
Query
Coverage
Query languages
Dependability
Consolidation
Query Language
User Interaction
Information retrieval
Knowledge Base
Computational Model
Eliminate
Processing
Model

Keywords

  • Natural language
  • Natural language interface
  • Semantic question answering
  • Semantic search
  • Semantic web
  • SPARQL

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Linguistic rule-based translation of natural language question into SPARQL query for effective semantic question answering. / Sharef, Nurfadhlina Mohd; Mohd Noah, Shahrul Azman; Azmi Murad, Masrah Azrifah.

In: Journal of Theoretical and Applied Information Technology, Vol. 80, No. 3, 31.10.2015, p. 557-575.

@article{c61bbc3a1bf249c1bf1a7856c0f6d6f0,
title = "Linguistic rule-based translation of natural language question into SPARQL query for effective semantic question answering",
keywords = "Natural language, Natural language interface, Semantic question answering, Semantic search, Semantic web, SPARQL",
author = "Sharef, {Nurfadhlina Mohd} and {Mohd Noah}, {Shahrul Azman} and {Azmi Murad}, {Masrah Azrifah}",
year = "2015",
month = "10",
day = "31",
language = "English",
volume = "80",
pages = "557--575",
journal = "Journal of Theoretical and Applied Information Technology",
issn = "1992-8645",
publisher = "Asian Research Publishing Network (ARPN)",
number = "3",

}

TY - JOUR

T1 - Linguistic rule-based translation of natural language question into SPARQL query for effective semantic question answering

AU - Sharef, Nurfadhlina Mohd

AU - Mohd Noah, Shahrul Azman

AU - Azmi Murad, Masrah Azrifah

PY - 2015/10/31

Y1 - 2015/10/31

KW - Natural language

KW - Natural language interface

KW - Semantic question answering

KW - Semantic search

KW - Semantic web

KW - SPARQL

UR - http://www.scopus.com/inward/record.url?scp=84945924482&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84945924482&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84945924482

VL - 80

SP - 557

EP - 575

JO - Journal of Theoretical and Applied Information Technology

JF - Journal of Theoretical and Applied Information Technology

SN - 1992-8645

IS - 3

ER -