Efficient identity matching using static pruning q-gram indexing approach

Nizam B. Khairul, Shahrul Azman Mohd Noah

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Information overload is a growing problem for information management and analytics in many organizations. Identity matching techniques are used to manage and resolve millions of identity records in diverse domains such as health care information, telecom subscribers, insurance holders, law offenders, and the census. In this paper, we propose an identity matching technique that is efficient for large datasets without compromising matching effectiveness. Our experimental results provide strong evidence that our proposed identity matching technique outperforms the adaptive detection identity matching technique in terms of efficiency and effectiveness, reducing the number of required comparisons by almost 98% and the completion time by 97%, with promising scalability results. Furthermore, our proposed technique achieves better matching results than the most trusted pairwise identity matching approach.

Original language: English
Pages (from-to): 97-108
Number of pages: 12
Journal: Decision Support Systems
Volume: 73
DOI: 10.1016/j.dss.2015.02.015
Publication status: Published - 1 May 2015

Fingerprint

Information Management
Insurance
Censuses
Health care
Scalability
Organizations
Delivery of Health Care
Datasets
Pruning
Indexing

Keywords

  • Identity management
  • Identity matching
  • Q-gram indexing
  • Static index pruning

ASJC Scopus subject areas

  • Management Information Systems
  • Information Systems
  • Developmental and Educational Psychology
  • Arts and Humanities (miscellaneous)
  • Information Systems and Management

Cite this

Efficient identity matching using static pruning q-gram indexing approach. / Khairul, Nizam B.; Mohd Noah, Shahrul Azman.

In: Decision Support Systems, Vol. 73, 01.05.2015, p. 97-108.

@article{d3c4568ca0684740a1535ac6e444fe1f,
title = "Efficient identity matching using static pruning q-gram indexing approach",
abstract = "Information overload is a growing problem for information management and analytics in many organizations. Identity matching techniques are used to manage and resolve millions of identity records in diverse domains such as health care information, telecom subscribers, insurance holders, law offenders, and the census. In this paper, we propose an identity matching technique that is efficient for large datasets without compromising matching effectiveness. Our experimental results provide strong evidence that our proposed identity matching technique outperforms the adaptive detection identity matching technique in terms of efficiency and effectiveness, reducing the number of required comparisons by almost 98{\%} and the completion time by 97{\%}, with promising scalability results. Furthermore, our proposed technique achieves better matching results than the most trusted pairwise identity matching approach.",
keywords = "Identity management, Identity matching, Q-gram indexing, Static index pruning",
author = "Khairul, {Nizam B.} and {Mohd Noah}, {Shahrul Azman}",
year = "2015",
month = "5",
day = "1",
doi = "10.1016/j.dss.2015.02.015",
language = "English",
volume = "73",
pages = "97--108",
journal = "Decision Support Systems",
issn = "0167-9236",
publisher = "Elsevier"
}

TY - JOUR

T1 - Efficient identity matching using static pruning q-gram indexing approach

AU - Khairul, Nizam B.

AU - Mohd Noah, Shahrul Azman

PY - 2015/5/1

Y1 - 2015/5/1

N2 - Information overload is a growing problem for information management and analytics in many organizations. Identity matching techniques are used to manage and resolve millions of identity records in diverse domains such as health care information, telecom subscribers, insurance holders, law offenders, and the census. In this paper, we propose an identity matching technique that is efficient for large datasets without compromising matching effectiveness. Our experimental results provide strong evidence that our proposed identity matching technique outperforms the adaptive detection identity matching technique in terms of efficiency and effectiveness, reducing the number of required comparisons by almost 98% and the completion time by 97%, with promising scalability results. Furthermore, our proposed technique achieves better matching results than the most trusted pairwise identity matching approach.

KW - Identity management

KW - Identity matching

KW - Q-gram indexing

KW - Static index pruning

UR - http://www.scopus.com/inward/record.url?scp=84961291287&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84961291287&partnerID=8YFLogxK

U2 - 10.1016/j.dss.2015.02.015

DO - 10.1016/j.dss.2015.02.015

M3 - Article

VL - 73

SP - 97

EP - 108

JO - Decision Support Systems

JF - Decision Support Systems

SN - 0167-9236

ER -