Intelligent Schema Integrator (ISI)

A tool to solve the problem of naming conflict for schema integration

Kamsuriah Ahmad, Hea Khim Chiew, Reduan Samad

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

The data stored in a data warehouse mostly come from different sources, which may have been developed using different models or schema structures. To improve the usability of these data, they must be combined or integrated so that users are given a unified, global view. The most important issue in data integration is schema integration, that is, solving the problem of "how can equivalent real-world entities from multiple data sources be matched up?" This is referred to as the entity identification process. Terms may be interpreted differently at different sources by different people. For example, how can a data analyst be sure that customer-id in one database and cust-number in another refer to the same entity? In this paper, a tool called the Intelligent Schema Integrator (ISI) is built to increase the use of data from the data warehouse and to make the integration process simpler, more systematic and more impressive. ISI is an intelligent tool that integrates two schemas from different sources into a unified (global) schema. ISI is developed to solve two kinds of naming conflict: homonym conflicts and synonym conflicts. A homonym conflict means that the same element name is used to represent different concepts. A synonym conflict means that different element names are used to represent the same concept. A thesaurus is used to obtain the meaning of each element's concept and compare it with the other concepts. An interface allows the user to choose which elements are to be renamed or removed when homonym or synonym conflicts occur in the schemas. These are the intelligent features built into ISI. The methodology used in this study consists of four phases: Design of the Input and Output, Extraction, Comparison, and Integration. The development of this tool is an important step towards more efficient and effective implementation of data integration in data warehousing.
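
The comparison of element names and concepts described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy thesaurus, the two example schemas, and the names THESAURUS, SCHEMA_A, SCHEMA_B, canonical and find_naming_conflicts are all assumptions made for illustration. Each schema is modelled as a mapping from an element name to a short term describing its meaning, and the thesaurus maps such terms to a canonical concept.

```python
# Hypothetical toy thesaurus: maps a descriptive term to a canonical concept.
# A real tool would consult a full thesaurus; these entries are illustrative.
THESAURUS = {
    "customer": "customer",
    "client": "customer",
    "job position": "occupation",
    "book name": "publication title",
}

# Hypothetical input schemas: element name -> term describing its meaning.
SCHEMA_A = {"customer-id": "customer", "title": "job position"}
SCHEMA_B = {"cust-number": "client", "title": "book name"}


def canonical(term):
    """Return the canonical concept for a term, or the term itself if unknown."""
    return THESAURUS.get(term, term)


def find_naming_conflicts(schema_a, schema_b):
    """Compare every element pair and report synonym and homonym conflicts.

    Synonym conflict: different names, same concept.
    Homonym conflict: same name, different concepts.
    """
    synonyms, homonyms = [], []
    for name_a, meaning_a in schema_a.items():
        for name_b, meaning_b in schema_b.items():
            same_name = name_a == name_b
            same_concept = canonical(meaning_a) == canonical(meaning_b)
            if same_concept and not same_name:
                synonyms.append((name_a, name_b))
            elif same_name and not same_concept:
                homonyms.append(name_a)
    return synonyms, homonyms


synonym_conflicts, homonym_conflicts = find_naming_conflicts(SCHEMA_A, SCHEMA_B)
```

In this sketch, customer-id and cust-number are reported as a synonym conflict and title as a homonym conflict. Choosing which elements to rename or remove is left to the user, as in the interface the abstract describes.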

Original language: English
Title of host publication: Proceedings of the 2011 International Conference on Electrical Engineering and Informatics, ICEEI 2011
DOI: https://doi.org/10.1109/ICEEI.2011.6021539
Publication status: Published - 2011
Event: 2011 International Conference on Electrical Engineering and Informatics, ICEEI 2011 - Bandung
Duration: 17 Jul 2011 to 19 Jul 2011

Fingerprint

Data warehouses
Data integration
Thesauri

Keywords

  • homonym conflict
  • naming conflict
  • schema integration
  • synonym conflict

ASJC Scopus subject areas

  • Information Systems
  • Electrical and Electronic Engineering

Cite this

Ahmad, K., Chiew, H. K., & Samad, R. (2011). Intelligent Schema Integrator (ISI): A tool to solve the problem of naming conflict for schema integration. In Proceedings of the 2011 International Conference on Electrical Engineering and Informatics, ICEEI 2011 [6021539] https://doi.org/10.1109/ICEEI.2011.6021539

@inproceedings{dd1df4e59a9842dfa06c7b3e91f49860,
title = "Intelligent Schema Integrator (ISI): A tool to solve the problem of naming conflict for schema integration",
keywords = "homonym conflict, naming conflict, schema integration, synonym conflict",
author = "Kamsuriah Ahmad and Chiew, {Hea Khim} and Reduan Samad",
year = "2011",
doi = "10.1109/ICEEI.2011.6021539",
language = "English",
isbn = "9781457707520",
booktitle = "Proceedings of the 2011 International Conference on Electrical Engineering and Informatics, ICEEI 2011",

}

TY - GEN

T1 - Intelligent Schema Integrator (ISI)

T2 - A tool to solve the problem of naming conflict for schema integration

AU - Ahmad, Kamsuriah

AU - Chiew, Hea Khim

AU - Samad, Reduan

PY - 2011

Y1 - 2011

KW - homonym conflict

KW - naming conflict

KW - schema integration

KW - synonym conflict

UR - http://www.scopus.com/inward/record.url?scp=80054015030&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80054015030&partnerID=8YFLogxK

U2 - 10.1109/ICEEI.2011.6021539

DO - 10.1109/ICEEI.2011.6021539

M3 - Conference contribution

SN - 9781457707520

BT - Proceedings of the 2011 International Conference on Electrical Engineering and Informatics, ICEEI 2011

ER -