A classification method for identifying confidential data to enhance efficiency of query processing over cloud

Hussein Albadri, Rossilawati Sulaiman

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

With the increased use of Database-as-a-Service (DAAS), several issues have arisen, especially in translating and executing queries to and from the database securely and efficiently. These issues stem from potential attacks, such as attempts to copy or eavesdrop on the database via queries. Existing security mechanisms secure the queries by using encryption. However, encrypting the queries significantly reduces the efficiency of query processing because of the security overhead of the encryption and decryption processes. This study aims to address this problem by proposing a divide-and-conquer strategy in which partial encryption is applied to the queries. This is done by classifying the data into sensitive and non-sensitive categories, so that only the sensitive data is encrypted. The classification used in this study is based on the data classification policy of Columbia University. First, a manual annotation is conducted to label the data fields as sensitive or non-sensitive. Next, rules are generated to classify the queried data. If a query contains sensitive data, encryption is applied specifically to the sensitive data, whereas the non-sensitive data remains unencrypted. Experiments were conducted using real-time data from Baghdad University related to students’ information, consisting of 35 tables and 362 fields. The evaluation compares the security overhead of full encryption (without classification) and partial encryption (with classification) using the Advanced Encryption Standard (AES). Results show that the classification method significantly reduces the time needed to process a query. This implies that partial encryption based on classifying the data into sensitive and non-sensitive categories improves the efficiency of query processing.
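As a rough illustration of the partial-encryption idea described in the abstract, the following Python sketch encrypts only the values of fields labelled sensitive and leaves the rest untouched. It is not the authors' implementation: the field names, the `protect_row` helper, and the `placeholder_cipher` function are all hypothetical, and the placeholder is deliberately not real encryption. A working system would pass an AES-based function in its place.

```python
from typing import Callable, Dict, Set

# Hypothetical result of a manual annotation pass; these field names are
# invented for illustration and do not come from the study's dataset.
SENSITIVE_FIELDS: Set[str] = {"national_id", "date_of_birth", "grade"}


def protect_row(row: Dict[str, str],
                sensitive: Set[str],
                encrypt: Callable[[str], str]) -> Dict[str, str]:
    """Apply `encrypt` only to values whose field is labelled sensitive."""
    return {
        field: encrypt(value) if field in sensitive else value
        for field, value in row.items()
    }


def placeholder_cipher(value: str) -> str:
    # Stand-in only: this is NOT encryption. It marks where an AES call
    # would go so the example runs with the standard library alone.
    return "ENC(" + value[::-1] + ")"


row = {"department": "Physics", "enrolment_year": "2015", "national_id": "12345"}
protected = protect_row(row, SENSITIVE_FIELDS, placeholder_cipher)
print(protected)
```

Passing the cipher as a parameter keeps the classification step separate from the encryption step, so the same rule set could be timed against full encryption by supplying a `sensitive` set that contains every field.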

Original language: English
Pages (from-to): 412-420
Number of pages: 9
Journal: Journal of Theoretical and Applied Information Technology
Volume: 93
Issue number: 2
Publication status: Published - 30 Nov 2016

Fingerprint

Query processing
Cryptography
Encryption
Query
Partial
Data Classification
Labels
Students
Divides
Annotation
Tables
Classify
Attack
Real-time
Imply
Evaluation

Keywords

  • Cloud computing
  • Cloud database
  • Cloud query processing
  • Secure query processing

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

@article{0f75b56054a341afbcfaf239d0a44e8c,
title = "A classification method for identifying confidential data to enhance efficiency of query processing over cloud",
keywords = "Cloud computing, Cloud database, Cloud query processing, Secure query processing",
author = "Hussein Albadri and Rossilawati Sulaiman",
year = "2016",
month = "11",
day = "30",
language = "English",
volume = "93",
pages = "412--420",
journal = "Journal of Theoretical and Applied Information Technology",
issn = "1992-8645",
publisher = "Asian Research Publishing Network (ARPN)",
number = "2",

}

TY - JOUR

T1 - A classification method for identifying confidential data to enhance efficiency of query processing over cloud

AU - Albadri, Hussein

AU - Sulaiman, Rossilawati

PY - 2016/11/30

Y1 - 2016/11/30

KW - Cloud computing

KW - Cloud database

KW - Cloud query processing

KW - Secure query processing

UR - http://www.scopus.com/inward/record.url?scp=85002131223&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85002131223&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:85002131223

VL - 93

SP - 412

EP - 420

JO - Journal of Theoretical and Applied Information Technology

JF - Journal of Theoretical and Applied Information Technology

SN - 1992-8645

IS - 2

ER -