A dynamic multiarmed bandit-gene expression programming hyper-heuristic for combinatorial optimization problems

Nasser R. Sabar, Masri Ayob, Graham Kendall, Rong Qu

Research output: Contribution to journal, Article

50 Citations (Scopus)

Abstract

Hyper-heuristics are search methodologies that aim to provide high-quality solutions across a wide variety of problem domains, rather than developing tailor-made methodologies for each problem instance/domain. A traditional hyper-heuristic framework has two levels, namely, the high-level strategy (heuristic selection mechanism and the acceptance criterion) and low-level heuristics (a set of problem-specific heuristics). Due to the different landscape structures of different problem instances, the high-level strategy plays an important role in the design of a hyper-heuristic framework. In this paper, we propose a new high-level strategy for a hyper-heuristic framework. The proposed high-level strategy utilizes a dynamic multiarmed bandit-extreme value-based reward as an online heuristic selection mechanism to select the appropriate heuristic to be applied at each iteration. In addition, we propose a gene expression programming framework to automatically generate the acceptance criterion for each problem instance, instead of using human-designed criteria. Two well-known, and very different, combinatorial optimization problems, one static (exam timetabling) and one dynamic (dynamic vehicle routing), are used to demonstrate the generality of the proposed framework. Compared with state-of-the-art hyper-heuristics and other bespoke methods, empirical results demonstrate that the proposed framework is able to generalize well across both domains. We obtain competitive, if not better, results when compared to the best known results obtained from other methods that have been presented in the scientific literature. We also compare our approach against the recently released hyper-heuristic competition test suite. We again demonstrate the generality of our approach when we compare against other methods that have utilized the same six benchmark datasets from this test suite.
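
As a rough illustration of the kind of online heuristic selection the abstract describes, the sketch below pairs a standard UCB1 bandit with an extreme-value reward (the best improvement seen in a recent window per heuristic). It is not the authors' algorithm: the class name ExtremeValueBandit, the window size, and the exploration weight c are all assumptions made for this example, and the change-detection step that would make the bandit "dynamic" is left out.

```python
import math
from collections import deque


class ExtremeValueBandit:
    """UCB1-style selector whose reward is the best recent improvement per arm.

    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(self, n_arms, window=10, c=1.0):
        self.counts = [0] * n_arms    # times each low-level heuristic was applied
        self.means = [0.0] * n_arms   # running mean of extreme-value rewards
        self.recent = [deque(maxlen=window) for _ in range(n_arms)]
        self.c = c                    # exploration weight (assumed value)

    def select(self):
        # Apply every heuristic once before trusting the confidence bound.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        total = sum(self.counts)
        scores = [
            self.means[a] + self.c * math.sqrt(2.0 * math.log(total) / self.counts[a])
            for a in range(len(self.counts))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, improvement):
        # Extreme-value reward: largest improvement in this arm's recent window.
        self.recent[arm].append(max(0.0, improvement))
        reward = max(self.recent[arm])
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

In a hyper-heuristic loop, select() would pick the next low-level heuristic and update() would be called with the observed change in solution quality; an acceptance criterion would then decide whether to keep the new solution.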

Original language: English
Article number: 6824192
Pages (from-to): 217-228
Number of pages: 12
Journal: IEEE Transactions on Cybernetics
Volume: 45
Issue number: 2
DOIs: 10.1109/TCYB.2014.2323936
Publication status: Published - 1 Feb 2015

Fingerprint

Combinatorial optimization
Gene expression
Vehicle routing

Keywords

  • CHeSC
  • dynamic optimization
  • gene expression programming
  • hyper-heuristic
  • timetabling
  • vehicle routing

ASJC Scopus subject areas

  • Computer Science Applications
  • Human-Computer Interaction
  • Information Systems
  • Software
  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Cite this

A dynamic multiarmed bandit-gene expression programming hyper-heuristic for combinatorial optimization problems. / Sabar, Nasser R.; Ayob, Masri; Kendall, Graham; Qu, Rong.

In: IEEE Transactions on Cybernetics, Vol. 45, No. 2, 6824192, 01.02.2015, p. 217-228.

@article{5e94e360fb55402c96abb361a96bea69,
title = "A dynamic multiarmed bandit-gene expression programming hyper-heuristic for combinatorial optimization problems",
abstract = "Hyper-heuristics are search methodologies that aim to provide high-quality solutions across a wide variety of problem domains, rather than developing tailor-made methodologies for each problem instance/domain. A traditional hyper-heuristic framework has two levels, namely, the high-level strategy (heuristic selection mechanism and the acceptance criterion) and low-level heuristics (a set of problem-specific heuristics). Due to the different landscape structures of different problem instances, the high-level strategy plays an important role in the design of a hyper-heuristic framework. In this paper, we propose a new high-level strategy for a hyper-heuristic framework. The proposed high-level strategy utilizes a dynamic multiarmed bandit-extreme value-based reward as an online heuristic selection mechanism to select the appropriate heuristic to be applied at each iteration. In addition, we propose a gene expression programming framework to automatically generate the acceptance criterion for each problem instance, instead of using human-designed criteria. Two well-known, and very different, combinatorial optimization problems, one static (exam timetabling) and one dynamic (dynamic vehicle routing), are used to demonstrate the generality of the proposed framework. Compared with state-of-the-art hyper-heuristics and other bespoke methods, empirical results demonstrate that the proposed framework is able to generalize well across both domains. We obtain competitive, if not better, results when compared to the best known results obtained from other methods that have been presented in the scientific literature. We also compare our approach against the recently released hyper-heuristic competition test suite. We again demonstrate the generality of our approach when we compare against other methods that have utilized the same six benchmark datasets from this test suite.",
keywords = "CHeSC, dynamic optimization, gene expression programming, hyper-heuristic, timetabling, vehicle routing",
author = "Sabar, {Nasser R.} and Masri Ayob and Graham Kendall and Rong Qu",
year = "2015",
month = "2",
day = "1",
doi = "10.1109/TCYB.2014.2323936",
language = "English",
volume = "45",
pages = "217--228",
journal = "IEEE Transactions on Cybernetics",
issn = "2168-2267",
publisher = "IEEE",
number = "2",

}

TY - JOUR

T1 - A dynamic multiarmed bandit-gene expression programming hyper-heuristic for combinatorial optimization problems

AU - Sabar, Nasser R.

AU - Ayob, Masri

AU - Kendall, Graham

AU - Qu, Rong

PY - 2015/2/1

Y1 - 2015/2/1

N2 - Hyper-heuristics are search methodologies that aim to provide high-quality solutions across a wide variety of problem domains, rather than developing tailor-made methodologies for each problem instance/domain. A traditional hyper-heuristic framework has two levels, namely, the high-level strategy (heuristic selection mechanism and the acceptance criterion) and low-level heuristics (a set of problem-specific heuristics). Due to the different landscape structures of different problem instances, the high-level strategy plays an important role in the design of a hyper-heuristic framework. In this paper, we propose a new high-level strategy for a hyper-heuristic framework. The proposed high-level strategy utilizes a dynamic multiarmed bandit-extreme value-based reward as an online heuristic selection mechanism to select the appropriate heuristic to be applied at each iteration. In addition, we propose a gene expression programming framework to automatically generate the acceptance criterion for each problem instance, instead of using human-designed criteria. Two well-known, and very different, combinatorial optimization problems, one static (exam timetabling) and one dynamic (dynamic vehicle routing), are used to demonstrate the generality of the proposed framework. Compared with state-of-the-art hyper-heuristics and other bespoke methods, empirical results demonstrate that the proposed framework is able to generalize well across both domains. We obtain competitive, if not better, results when compared to the best known results obtained from other methods that have been presented in the scientific literature. We also compare our approach against the recently released hyper-heuristic competition test suite. We again demonstrate the generality of our approach when we compare against other methods that have utilized the same six benchmark datasets from this test suite.

AB - Hyper-heuristics are search methodologies that aim to provide high-quality solutions across a wide variety of problem domains, rather than developing tailor-made methodologies for each problem instance/domain. A traditional hyper-heuristic framework has two levels, namely, the high-level strategy (heuristic selection mechanism and the acceptance criterion) and low-level heuristics (a set of problem-specific heuristics). Due to the different landscape structures of different problem instances, the high-level strategy plays an important role in the design of a hyper-heuristic framework. In this paper, we propose a new high-level strategy for a hyper-heuristic framework. The proposed high-level strategy utilizes a dynamic multiarmed bandit-extreme value-based reward as an online heuristic selection mechanism to select the appropriate heuristic to be applied at each iteration. In addition, we propose a gene expression programming framework to automatically generate the acceptance criterion for each problem instance, instead of using human-designed criteria. Two well-known, and very different, combinatorial optimization problems, one static (exam timetabling) and one dynamic (dynamic vehicle routing), are used to demonstrate the generality of the proposed framework. Compared with state-of-the-art hyper-heuristics and other bespoke methods, empirical results demonstrate that the proposed framework is able to generalize well across both domains. We obtain competitive, if not better, results when compared to the best known results obtained from other methods that have been presented in the scientific literature. We also compare our approach against the recently released hyper-heuristic competition test suite. We again demonstrate the generality of our approach when we compare against other methods that have utilized the same six benchmark datasets from this test suite.

KW - CHeSC

KW - dynamic optimization

KW - gene expression programming

KW - hyper-heuristic

KW - timetabling

KW - vehicle routing

UR - http://www.scopus.com/inward/record.url?scp=84921411064&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84921411064&partnerID=8YFLogxK

U2 - 10.1109/TCYB.2014.2323936

DO - 10.1109/TCYB.2014.2323936

M3 - Article

VL - 45

SP - 217

EP - 228

JO - IEEE Transactions on Cybernetics

JF - IEEE Transactions on Cybernetics

SN - 2168-2267

IS - 2

M1 - 6824192

ER -