Extracting features from online software reviews to aid requirements reuse

Noor Hasrina Bakar, Zarinah M. Kasirun, Norsaremah Salleh, Hamid A. Jalab

Research output: Contribution to journal › Article

13 Citations (Scopus)

Abstract

Sets of common features are essential assets to be reused in fulfilling specific needs in software product line methodology. In Requirements Reuse (RR), the extraction of software features from Software Requirement Specifications (SRS) is viable only for practitioners who have access to these software artefacts. Due to organisational privacy, SRS are always kept confidential and are not easily available to the public. As an alternative, researchers have opted to use publicly available software descriptions, such as product brochures and online software descriptions, to identify potential software features to initiate the RR process. The aim of this paper is to propose a semi-automated approach, known as Feature Extraction for Reuse of Natural Language requirements (FENL), to extract phrases that can represent software features from software reviews in the absence of SRS as a way to initiate the RR process. FENL is composed of four stages, which depend on keyword occurrences from several combinations of nouns, verbs, and/or adjectives. In the experiment conducted, phrases that could reflect software features residing within online software reviews were extracted using techniques from the information retrieval (IR) area. To demonstrate the feature grouping phase, the extracted features were then grouped semi-automatically with the assistance of a modified word overlap algorithm. The proposed extraction approach was evaluated through experiments against a manually created truth data set. The performance results obtained from the feature extraction phase indicate that the proposed approach performed comparably with related work in terms of recall, precision, and F-Measure.
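As a rough illustration of two ideas named in the abstract, the sketch below scores a set of extracted phrases against a manually created truth set using precision, recall, and F-Measure, and computes a plain word overlap between two phrases. It is not the FENL implementation: the function names, the sample phrases, and the use of Jaccard overlap are assumptions made for this example, and the paper's modified word overlap algorithm may differ.

```python
from typing import Iterable, Tuple


def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap of the word sets of two phrases.

    Assumption: a plain Jaccard ratio stands in for the paper's
    modified word overlap algorithm, which is not described here.
    """
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def precision_recall_f(extracted: Iterable[str],
                       truth: Iterable[str]) -> Tuple[float, float, float]:
    """Precision, recall, and F-Measure of extracted phrases vs. a truth set."""
    extracted_set, truth_set = set(extracted), set(truth)
    true_positives = len(extracted_set & truth_set)
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(truth_set) if truth_set else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical phrases, invented for illustration only.
extracted_phrases = ["edit photo", "share album", "fast loading", "nice colour"]
truth_phrases = ["edit photo", "share album", "print photo"]

precision, recall, f_measure = precision_recall_f(extracted_phrases, truth_phrases)
overlap = word_overlap("edit photo", "print photo")
```

Here two of the four extracted phrases appear in the truth set, so precision is 0.5 and recall is 2/3. Exact string matching is the simplest possible matching rule; a real evaluation would likely need normalisation or partial matching.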

Original language: English
Pages (from-to): 1297-1315
Number of pages: 19
Journal: Applied Soft Computing Journal
Volume: 49
DOIs: 10.1016/j.asoc.2016.07.048
Publication status: Published - 1 Dec 2016
Externally published: Yes

Fingerprint

Specifications
Feature extraction
Information retrieval
Experiments

Keywords

  • Latent semantic analysis
  • Natural language processing
  • Requirements reuse
  • Software engineering
  • Unsupervised learning

ASJC Scopus subject areas

  • Software

Cite this

Extracting features from online software reviews to aid requirements reuse. / Bakar, Noor Hasrina; Kasirun, Zarinah M.; Salleh, Norsaremah; Jalab, Hamid A.

In: Applied Soft Computing Journal, Vol. 49, 01.12.2016, p. 1297-1315.

Bakar, Noor Hasrina ; Kasirun, Zarinah M. ; Salleh, Norsaremah ; Jalab, Hamid A. / Extracting features from online software reviews to aid requirements reuse. In: Applied Soft Computing Journal. 2016 ; Vol. 49. pp. 1297-1315.
@article{ac44beeb445942b08f9f556708d81e61,
title = "Extracting features from online software reviews to aid requirements reuse",
keywords = "Latent semantic analysis, Natural language processing, Requirements reuse, Software engineering, Unsupervised learning",
author = "Bakar, {Noor Hasrina} and Kasirun, {Zarinah M.} and Salleh, Norsaremah and Jalab, {Hamid A.}",
year = "2016",
month = "12",
day = "1",
doi = "10.1016/j.asoc.2016.07.048",
language = "English",
volume = "49",
pages = "1297--1315",
journal = "Applied Soft Computing",
issn = "1568-4946",
publisher = "Elsevier BV",

}

TY - JOUR

T1 - Extracting features from online software reviews to aid requirements reuse

AU - Bakar, Noor Hasrina

AU - Kasirun, Zarinah M.

AU - Salleh, Norsaremah

AU - Jalab, Hamid A.

PY - 2016/12/1

Y1 - 2016/12/1

KW - Latent semantic analysis

KW - Natural language processing

KW - Requirements reuse

KW - Software engineering

KW - Unsupervised learning

UR - http://www.scopus.com/inward/record.url?scp=84997124422&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84997124422&partnerID=8YFLogxK

U2 - 10.1016/j.asoc.2016.07.048

DO - 10.1016/j.asoc.2016.07.048

M3 - Article

VL - 49

SP - 1297

EP - 1315

JO - Applied Soft Computing

JF - Applied Soft Computing

SN - 1568-4946

ER -