A text mining system for deviation detection in financial documents

Siti Sakira Kamaruddin, Azuraliza Abu Bakar, Abdul Razak Hamdan, Fauzias Mat Nor, Mohd Zakree Ahmad Nazri, Zulaiha Ali Othman, Ghassan Saleh Hussein

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Attempts to mine text documents for deviations or anomalies have increased in recent years because of the growing volume of textual data in today's data repositories. Text mining helps uncover hidden information across multiple documents. Although various text mining tools are available, they focus mainly on data summarization or document classification. These tasks have proved helpful; however, they do not provide the semantic analysis and rigorous textual comparison needed to detect abnormal sentences in documents. In this paper, we describe a text mining system that detects sentence deviations in a collection of financial documents. The system implements a dissimilarity function to compare sentences represented as graphs. Our evaluation of the proposed system is based on experiments using the financial statements of a bank. The findings provide evidence that the proposed system can identify deviating sentences occurring in the documents. The detected deviations can help the authorities improve their business decisions.
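
The abstract does not specify how sentences are turned into graphs or how the dissimilarity function is defined. Purely as an illustration of the general idea, the sketch below assumes a toy representation (nodes are lowercase tokens, edges link adjacent tokens) and uses Jaccard distance over edge sets as a stand-in dissimilarity. The function names `sentence_to_graph`, `dissimilarity`, and `most_deviating` are hypothetical and are not taken from the paper.

```python
from __future__ import annotations


def sentence_to_graph(sentence: str) -> set[tuple[str, str]]:
    """Toy graph (an assumption, not the paper's representation):
    each edge links two adjacent lowercase tokens."""
    tokens = [t.strip(".,;:!?").lower() for t in sentence.split()]
    tokens = [t for t in tokens if t]
    return set(zip(tokens, tokens[1:]))


def dissimilarity(g1: set, g2: set) -> float:
    """Jaccard distance over edge sets: 0.0 for identical graphs,
    1.0 for graphs that share no edge."""
    union = g1 | g2
    if not union:
        return 0.0
    return 1.0 - len(g1 & g2) / len(union)


def most_deviating(sentences: list[str]) -> tuple[int, list[float]]:
    """Score each sentence by its mean dissimilarity to all the others
    and return the index of the highest-scoring one plus all scores."""
    graphs = [sentence_to_graph(s) for s in sentences]
    scores = []
    for i, g in enumerate(graphs):
        others = [dissimilarity(g, h) for j, h in enumerate(graphs) if j != i]
        scores.append(sum(others) / len(others) if others else 0.0)
    best = max(range(len(sentences)), key=scores.__getitem__)
    return best, scores


# Made-up example sentences, not data from the paper.
sentences = [
    "The bank reported a net profit.",
    "The bank reported a net loss.",
    "Directors approved unusual offshore transfers.",
]
index, scores = most_deviating(sentences)
print(sentences[index], scores)
```

In this toy setup the third sentence shares no edges with the other two, so it receives the highest mean dissimilarity and is flagged as the deviation. A system working with semantic graphs would need a richer representation than word adjacency.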

Original language: English
Pages (from-to): S19-S44
Journal: Intelligent Data Analysis
Volume: 19
Issue number: S1
DOI: 10.3233/IDA-150768
Publication status: Published - 1 Sep 2015

Fingerprint

Text Mining
Deviation
Semantics
Industry
Experiments
Document Classification
Semantic Analysis
Summarization
Information Content
Dissimilarity
Repository
Anomaly
Valid
Evaluation
Graph in graph theory
Experiment

Keywords

  • abnormal sentences
  • Deviation detection
  • financial statement analysis
  • graph-based representation
  • text mining

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

Cite this

A text mining system for deviation detection in financial documents. / Kamaruddin, Siti Sakira; Abu Bakar, Azuraliza; Hamdan, Abdul Razak; Nor, Fauzias Mat; Ahmad Nazri, Mohd Zakree; Ali Othman, Zulaiha; Hussein, Ghassan Saleh.

In: Intelligent Data Analysis, Vol. 19, No. S1, 01.09.2015, p. S19-S44.

Kamaruddin, Siti Sakira ; Abu Bakar, Azuraliza ; Hamdan, Abdul Razak ; Nor, Fauzias Mat ; Ahmad Nazri, Mohd Zakree ; Ali Othman, Zulaiha ; Hussein, Ghassan Saleh. / A text mining system for deviation detection in financial documents. In: Intelligent Data Analysis. 2015 ; Vol. 19, No. S1. pp. S19-S44.
@article{3ecf207b365145b496e0e47c89c72069,
title = "A text mining system for deviation detection in financial documents",
abstract = "Attempts to mine text documents for deviations or anomalies have increased in recent years because of the growing volume of textual data in today's data repositories. Text mining helps uncover hidden information across multiple documents. Although various text mining tools are available, they focus mainly on data summarization or document classification. These tasks have proved helpful; however, they do not provide the semantic analysis and rigorous textual comparison needed to detect abnormal sentences in documents. In this paper, we describe a text mining system that detects sentence deviations in a collection of financial documents. The system implements a dissimilarity function to compare sentences represented as graphs. Our evaluation of the proposed system is based on experiments using the financial statements of a bank. The findings provide evidence that the proposed system can identify deviating sentences occurring in the documents. The detected deviations can help the authorities improve their business decisions.",
keywords = "abnormal sentences, Deviation detection, financial statement analysis, graph-based representation, text mining",
author = "Kamaruddin, {Siti Sakira} and {Abu Bakar}, Azuraliza and Hamdan, {Abdul Razak} and Nor, {Fauzias Mat} and {Ahmad Nazri}, {Mohd Zakree} and {Ali Othman}, Zulaiha and Hussein, {Ghassan Saleh}",
year = "2015",
month = "9",
day = "1",
doi = "10.3233/IDA-150768",
language = "English",
volume = "19",
pages = "S19--S44",
journal = "Intelligent Data Analysis",
issn = "1088-467X",
publisher = "IOS Press",
number = "S1",

}

TY - JOUR

T1 - A text mining system for deviation detection in financial documents

AU - Kamaruddin, Siti Sakira

AU - Abu Bakar, Azuraliza

AU - Hamdan, Abdul Razak

AU - Nor, Fauzias Mat

AU - Ahmad Nazri, Mohd Zakree

AU - Ali Othman, Zulaiha

AU - Hussein, Ghassan Saleh

PY - 2015/9/1

Y1 - 2015/9/1

N2 - Attempts to mine text documents for deviations or anomalies have increased in recent years because of the growing volume of textual data in today's data repositories. Text mining helps uncover hidden information across multiple documents. Although various text mining tools are available, they focus mainly on data summarization or document classification. These tasks have proved helpful; however, they do not provide the semantic analysis and rigorous textual comparison needed to detect abnormal sentences in documents. In this paper, we describe a text mining system that detects sentence deviations in a collection of financial documents. The system implements a dissimilarity function to compare sentences represented as graphs. Our evaluation of the proposed system is based on experiments using the financial statements of a bank. The findings provide evidence that the proposed system can identify deviating sentences occurring in the documents. The detected deviations can help the authorities improve their business decisions.

AB - Attempts to mine text documents for deviations or anomalies have increased in recent years because of the growing volume of textual data in today's data repositories. Text mining helps uncover hidden information across multiple documents. Although various text mining tools are available, they focus mainly on data summarization or document classification. These tasks have proved helpful; however, they do not provide the semantic analysis and rigorous textual comparison needed to detect abnormal sentences in documents. In this paper, we describe a text mining system that detects sentence deviations in a collection of financial documents. The system implements a dissimilarity function to compare sentences represented as graphs. Our evaluation of the proposed system is based on experiments using the financial statements of a bank. The findings provide evidence that the proposed system can identify deviating sentences occurring in the documents. The detected deviations can help the authorities improve their business decisions.

KW - abnormal sentences

KW - Deviation detection

KW - financial statement analysis

KW - graph-based representation

KW - text mining

UR - http://www.scopus.com/inward/record.url?scp=84996843629&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84996843629&partnerID=8YFLogxK

U2 - 10.3233/IDA-150768

DO - 10.3233/IDA-150768

M3 - Article

AN - SCOPUS:84996843629

VL - 19

SP - S19

EP - S44

JO - Intelligent Data Analysis

JF - Intelligent Data Analysis

SN - 1088-467X

IS - S1

ER -