Multiple linear regression for reconstruction of gene regulatory networks in solving cascade error problems

Faridah Hani Mohamed Salleh, Suhaila Zainudin, Shereena M. Arif

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods has been the inaccurate prediction of cascade motifs. A cascade error is defined as the wrong prediction of a cascade motif, where an indirect interaction is misinterpreted as a direct interaction. Despite active research on various GRN prediction methods, discussion of specific methods for solving problems related to cascade errors is still lacking. In fact, the experiments conducted in past studies were not specifically geared towards proving the ability of GRN prediction methods to avoid cascade errors. Hence, this research proposes Multiple Linear Regression (MLR) to infer GRNs from gene expression data and to avoid wrongly inferring an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations in the real experimental datasets was far smaller than the number of predictors, some predictors were eliminated by extracting random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade errors by using a novel experimental procedure proposed in this work. The experiment revealed that the number of cascade errors was minimal. Apart from that, the Belsley collinearity test showed that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.
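The cascade-error idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: the cascade A → B → C is simulated, and the edge weights (0.9, 0.8), noise levels, sample size and helper names (`solve_linear_system`, `ols`, `simulate_cascade`) are all assumptions made for illustration. It compares regressing C on A alone with regressing C on A and B jointly, which is the basic reason a multiple regression may avoid reporting A → C as a direct edge.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(predictors, target):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(p) for p in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def simulate_cascade(n_samples, seed=0):
    """Simulated expression levels for a cascade A -> B -> C (assumed weights)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n_samples)]
    b = [0.9 * x + rng.gauss(0, 0.5) for x in a]  # B depends on A
    c = [0.8 * x + rng.gauss(0, 0.5) for x in b]  # C depends only on B
    return a, b, c


a, b, c = simulate_cascade(2000)

# Regressing C on A alone: A looks like a regulator of C (a cascade error).
_, simple_a = ols([a], c)

# Regressing C on A and B jointly: the weight on A shrinks towards zero,
# while B keeps a weight close to its simulated effect.
_, multi_a, multi_b = ols([a, b], c)

print(f"C ~ A      : coefficient of A = {simple_a:.3f}")
print(f"C ~ A + B  : coefficient of A = {multi_a:.3f}, coefficient of B = {multi_b:.3f}")
```

In this toy setting the number of samples far exceeds the number of predictors, so the normal equations are well conditioned. With real expression data, where observations are far fewer than genes, the predictor set would first have to be reduced, as the abstract describes.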

Original language: English
Article number: 4827171
Journal: Advances in Bioinformatics
Volume: 2017
DOI: 10.1155/2017/4827171
Publication status: Published - 2017

Fingerprint

Gene Regulatory Networks
Linear regression
Linear Models
Genes
Regulator Genes
Experiments
Research
Gene expression

ASJC Scopus subject areas

  • Biomedical Engineering
  • Biochemistry, Genetics and Molecular Biology (miscellaneous)
  • Computer Science Applications

Cite this

Multiple linear regression for reconstruction of gene regulatory networks in solving cascade error problems. / Salleh, Faridah Hani Mohamed; Zainudin, Suhaila; Arif, Shereena M.

In: Advances in Bioinformatics, Vol. 2017, 4827171, 2017.

@article{7f5be6da0c2740938711db9344a32e06,
title = "Multiple linear regression for reconstruction of gene regulatory networks in solving cascade error problems",
abstract = "Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods has been the inaccurate prediction of cascade motifs. A cascade error is defined as the wrong prediction of a cascade motif, where an indirect interaction is misinterpreted as a direct interaction. Despite active research on various GRN prediction methods, discussion of specific methods for solving problems related to cascade errors is still lacking. In fact, the experiments conducted in past studies were not specifically geared towards proving the ability of GRN prediction methods to avoid cascade errors. Hence, this research proposes Multiple Linear Regression (MLR) to infer GRNs from gene expression data and to avoid wrongly inferring an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations in the real experimental datasets was far smaller than the number of predictors, some predictors were eliminated by extracting random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade errors by using a novel experimental procedure proposed in this work. The experiment revealed that the number of cascade errors was minimal. Apart from that, the Belsley collinearity test showed that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.",
author = "Salleh, {Faridah Hani Mohamed} and Zainudin, Suhaila and Arif, {Shereena M.}",
year = "2017",
doi = "10.1155/2017/4827171",
language = "English",
volume = "2017",
journal = "Advances in Bioinformatics",
issn = "1687-8027",
publisher = "Hindawi Publishing Corporation",

}

TY - JOUR

T1 - Multiple linear regression for reconstruction of gene regulatory networks in solving cascade error problems

AU - Salleh, Faridah Hani Mohamed

AU - Zainudin, Suhaila

AU - Arif, Shereena M.

PY - 2017

Y1 - 2017

N2 - Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods has been the inaccurate prediction of cascade motifs. A cascade error is defined as the wrong prediction of a cascade motif, where an indirect interaction is misinterpreted as a direct interaction. Despite active research on various GRN prediction methods, discussion of specific methods for solving problems related to cascade errors is still lacking. In fact, the experiments conducted in past studies were not specifically geared towards proving the ability of GRN prediction methods to avoid cascade errors. Hence, this research proposes Multiple Linear Regression (MLR) to infer GRNs from gene expression data and to avoid wrongly inferring an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations in the real experimental datasets was far smaller than the number of predictors, some predictors were eliminated by extracting random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade errors by using a novel experimental procedure proposed in this work. The experiment revealed that the number of cascade errors was minimal. Apart from that, the Belsley collinearity test showed that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.

UR - http://www.scopus.com/inward/record.url?scp=85013276084&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85013276084&partnerID=8YFLogxK

U2 - 10.1155/2017/4827171

DO - 10.1155/2017/4827171

M3 - Article

VL - 2017

JO - Advances in Bioinformatics

JF - Advances in Bioinformatics

SN - 1687-8027

M1 - 4827171

ER -