Clemson University
TigerPrints
All Dissertations
August 2021
Large-Scale Optimization Models with Applications in Biological
and Emergency Response Networks
Mustafa Can Camur, Clemson University, [email protected]
Follow this and additional works at: https://tigerprints.clemson.edu/all_dissertations
Recommended Citation
Camur, Mustafa Can, "Large-Scale Optimization Models with Applications in Biological and Emergency Response Networks" (2021). All Dissertations. 2844. https://tigerprints.clemson.edu/all_dissertations/2844
This Dissertation is brought to you for free and open access by the Dissertations at TigerPrints. It has been accepted for inclusion in All Dissertations by an authorized administrator of TigerPrints. For more information, please contact [email protected].
Large-Scale Optimization Models with Applications in Biological and Emergency Response Networks
A Dissertation
Presented to
the Graduate School of
Clemson University
In Partial Fulfillment
of the Requirements for the Degree
Doctor of Philosophy
Industrial Engineering
by
Mustafa Can Camur
August 2021
Accepted by:
Dr. Thomas C. Sharkey, Committee Chair
Dr. Chrysafis Vogiatzis
Dr. Yongjia Song
Dr. Emily Tucker
Abstract
In this dissertation, we present new classes of network optimization models and algorithms,
including heuristics and decomposition-based methods, to solve them. Overall, our applications
highlight the breadth of problems to which optimization models can be applied, including problems
in protein-protein interaction networks and emergency response networks. To the best of our
knowledge, this is the first study to propose an exact solution approach for the star degree
centrality (SDC) problem. In addition, we are the first to introduce the stochastic pseudo-star
degree centrality problem, and we design a decomposition approach for it. For both problems, we
present new complexity discussions that classify the practical difficulty of the problems on
different graph types. Moreover, we analyze an Arctic mass rescue event from an optimization
perspective and create a novel network optimization model that examines the impact of the event
on the evacuees and the time to evacuate them.
We first consider the problem of identifying the induced star with the largest cardinality
open neighborhood in a graph. This problem, also known as the SDC problem, has been shown to be
NP-complete. In this dissertation, we propose a new integer programming (IP) formulation, which
has fewer constraints and non-zero coefficients than the existing formulation in the literature.
We present classes of networks where the problem is solvable in polynomial time,
and offer a new proof of NP-completeness that shows the problem remains NP-complete for both
bipartite and split graphs. In addition, we propose a decomposition framework which is suitable for
both the existing and the new formulations. We implement several acceleration techniques in this
framework, motivated by those techniques used in Benders decomposition. We test our approaches
on networks generated based on the Barabási–Albert, Erdős–Rényi, and Watts–Strogatz models.
Our decomposition approach outperforms solving the IP formulations in most of the instances in
terms of both solution time and solution quality; this is especially true when the graph gets larger
and denser. We then test the decomposition algorithm on large-scale protein-protein interaction
networks, for which SDC was shown to be an important centrality metric.
We then introduce the stochastic pseudo-star degree centrality problem and propose methods
to solve it exactly. The goal is to identify an induced pseudo-star, defined as a collection of
nodes that forms a star network with a certain probability, which maximizes the sum of the
probability values in the unique assignments between the star and its open neighborhood. In this
problem, we are specifically interested in a feasible pseudo-star, where the feasibility is measured as
the product of the existence probabilities of edges between the center node and leaf nodes and the
product of one minus the existence probabilities of edges among the leaf nodes. We then show that
the problem is NP-complete on general graphs, trees, and windmill graphs. We initially propose a
non-linear binary optimization model to solve this problem. Subsequently, we linearize our model via
McCormick inequalities and develop a branch-and-Benders-cut framework to solve it. We generate
logic-based Benders cuts as alternative feasibility cuts and examine several acceleration techniques.
The performance of our implementation is tested on randomly generated networks based on small-
world (SW) graphs. The SW networks resemble large-scale protein-protein interaction networks
for which the deterministic star degree centrality has been shown to be an efficient group-based
centrality metric for detecting essential proteins. Our computational results indicate that the
Benders implementation outperforms solving the model directly via a commercial solver in terms of
both the solution time and the solution quality in the majority of the test instances.
Lastly, we turn our attention to a network optimization problem with an application in
Arctic emergency response. We study a model that optimizes the response to a mass rescue event in
Arctic Alaska. The model contains dynamic logistics decisions for a large-scale maritime evacuation
with the objectives of minimizing the impact of the event on the evacuees and the average evacuation
time. Our proposed optimization model considers two interacting networks: the network that moves
evacuees from the location of the event to destinations outside the Arctic (e.g., a large city in
Alaska such as Anchorage) and the logistics network that moves relief materials to evacuees during the operations.
We model the concept of deprivation costs by incorporating priority levels, which capture the
severity of evacuees' current medical situations, and periods, which indicate the amount of time an
evacuee has gone without key relief resources. Our model is capable of determining the best possible response given
the current locations of response resources and is used to assess the effectiveness of an intuitive
heuristic that mimics emergency response decision-making.
Dedication
to Belma and Cansu.
Acknowledgements
First and foremost, I would like to express my gratitude to my Ph.D. advisor, Dr. Thomas
C. Sharkey, who has been a great supervisor and mentor throughout my four-year Ph.D. journey.
I absolutely feel lucky and blessed to have worked under his supervision. Second, I would like to
thank Dr. Chrysafis Vogiatzis, who was a great influence on my decision to pursue my Ph.D. I also
appreciate the support and guidance of the rest of my committee members, Drs. Yongjia Song
and Emily Tucker. Lastly, I would like to mention that I spent the first three years of my Ph.D.
program at Rensselaer Polytechnic Institute, and I thank all the great faculty members with whom I
took courses there. I specifically want to mention Dr. John Mitchell, whose dedication to teaching
is truly admirable.
During the years I spent thousands of miles away from home, my family and my friends have
been there to help and support me. I would like to especially mention my mom, Belma, to whom
I owe everything I have accomplished so far. I thank both my father, Ali, and my sister, Cansu,
who has been like a second mother to me besides being a great sibling. All of my other family
members and friends, whose names could not appear here, should know that I am forever grateful
for everything they have done for me.
Table of Contents
Title Page . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Star Degree Centrality: Definitions and Problem Statements . . . . . . . . . . . 6
   2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
   2.2 Problem Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3 The Star Degree Centrality . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
   3.1 Mathematical Formulations . . . . . . . . . . . . . . . . . . . . . . . . . . 16
   3.2 Complexity Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
   3.3 Solution Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
   3.4 Algorithmic Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
   3.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
   3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4 The Stochastic Pseudo-Star Degree Centrality . . . . . . . . . . . . . . . . . . 47
   4.1 Complexity Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
   4.2 Mathematical Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
   4.3 Solution Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
   4.4 Algorithmic Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
   4.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
   4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5 Optimizing the Response for Arctic Mass Rescue Events . . . . . . . . . . . . . 75
   5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
   5.2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
   5.3 Problem Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
   5.4 An Optimization Model for Arctic MREs . . . . . . . . . . . . . . . . . . . . 95
   5.5 Overview of Solution Methodologies . . . . . . . . . . . . . . . . . . . . . . 105
   5.6 Computational Study: Data Set Description and Baseline Analysis . . . . . . . 108
   5.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Appendices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
   A Appendix A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
List of Tables
3.1 Parameter settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.2 Summary of results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 The computational results for the BA Model . . . . . . . . . . . . . . . . . . . 39
3.4 The computational results for the ER Model . . . . . . . . . . . . . . . . . . . 40
3.5 The computational results for the WS Model . . . . . . . . . . . . . . . . . . . 41
3.6 The computational results for Helicobacter Pylori (n = 1,570) . . . . . . . . . 44
3.7 The computational results for Staphylococcus Aureus (n = 2,852) . . . . . . . . 45
4.1 Summary of results (27 instances) . . . . . . . . . . . . . . . . . . . . . . . 68
4.2 The computational results with θ = 0.99 . . . . . . . . . . . . . . . . . . . . 69
4.3 The computational results with different θ values via BD-LB and BD-LB-WS . . . 72
5.1 Data on communities in Arctic Alaska . . . . . . . . . . . . . . . . . . . . . . 78
5.2 Decisions conducted at the end of time 5 and their consequences . . . . . . . . 95
5.3 Set definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.4 Variable definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.5 Parameter definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.6 New variables defined . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.7 Populations and capacities in locations . . . . . . . . . . . . . . . . . . . . 110
5.8 List of assets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.9 Initial inventory in each location . . . . . . . . . . . . . . . . . . . . . . . 113
5.10 Resource and equipment list . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.11 Initial deployment locations for ships . . . . . . . . . . . . . . . . . . . . 113
5.12 Changes in sr when resource demand is met . . . . . . . . . . . . . . . . . . . 113
5.13 Jumps in ABNN (p, sr, se) . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.14 Jumps in AESN (p, sr, se) . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
A1 Comparison of the initial optimality gaps in the baseline experiment . . . . . . 133
A2 Comparison of the initial optimality gaps in the experiment with the upgraded runways . . . 133
A3 Comparison of the solution methods in the baseline experiment [Time (in mins), Gap (%)] . . . 134
A4 Comparison of the solution methods in the experiment with the upgraded runways . . . 134
List of Figures
2.1 Examples of star graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Determining the star degree centrality of a given node where the center and leaf nodes are shown in red and blue, respectively . . . 11
2.3 A subgraph of the PPIN of Saccharomyces Cerevisiae . . . . . . . . . . . . . . . 11
2.4 Determining the stochastic pseudo-star degree centrality of a given node where the center and leaf nodes are shown in red and blue, respectively . . . 13
2.5 Calculation of the SPSDC of proteins in a real-world PPIN . . . . . . . . . . . 14
2.6 1 - The SPSDC of a non-essential protein . . . . . . . . . . . . . . . . . . . . 14
2.7 2 - The SPSDC of a non-essential protein (zoomed in) . . . . . . . . . . . . . . 14
2.8 3 - The SPSDC of a non-essential protein (more zoomed in) . . . . . . . . . . . 14
2.9 4 - The SPSDC of an essential protein . . . . . . . . . . . . . . . . . . . . . 14
2.10 5 - The SPSDC of an essential protein (zoomed in) . . . . . . . . . . . . . . . 14
2.11 6 - The SPSDC of an essential protein (more zoomed in) . . . . . . . . . . . . 14
3.1 A counterexample where the optimal solution obtained in LP[NIP] cannot be converted to a feasible solution in LP[VCIP] . . . 20
3.2 The transformation of Set Cover < U, S, k > to an instance < G(V,E), l > of Star Degree Centrality . . . 23
3.3 The impact of warm-start on the solution times in [NIP] in the BA model . . . . 35
3.4 The impact of warm-start on the optimality gaps in [NIP] in the BA model . . . 35
3.5 The impact of warm-start on the solution times in [VCIP] in the BA model . . . 35
3.6 The impact of warm-start on the optimality gaps in [VCIP] in the BA model . . . 35
3.7 The impact of warm-start on the optimality gaps in [NIP] in the ER model . . . 36
3.8 The impact of warm-start on the optimality gaps in [VCIP] in the ER model . . . 36
3.9 The impact of warm-start on the optimality gaps in [NIP] in the WS model . . . 37
3.10 The impact of warm-start on the optimality gaps in [VCIP] in the WS model . . 37
3.11 The impact of warm-start on the solution times in [DNIP] in the WS model . . . 37
3.12 The impact of warm-start on the solution times in [DVCIP] in the WS model . . 37
3.13 Solution time comparison between [DNIP] and [DVCIP] in the BA model . . . . . 42
3.14 Solution time comparison between [DNIP] and [DVCIP] in the ER model . . . . . 42
3.15 Solution time comparison between [DNIP] and [DVCIP] in the WS model . . . . . 42
3.16 The optimality gap comparisons in [NIP], [VCIP], [DNIP], and [DVCIP] in the BA model . . . 43
3.17 The optimality gap comparisons in [NIP], [VCIP], [DNIP], and [DVCIP] in the ER model . . . 43
3.18 The optimality gap comparisons in [NIP], [VCIP], [DNIP], and [DVCIP] in the WS model . . . 43
4.1 The transformation of Knapsack < ~s, ~v, C, V > to an instance < G(V,E), ℓ, ~p, θ > of Stochastic Pseudo-Star Degree Centrality on a tree . . . 49
4.2 The transformation of Knapsack < ~s, ~v, C, V > to an instance < G(V,E), ℓ, ~p, θ > of Stochastic Pseudo-Star Degree Centrality on a windmill graph . . . 52
4.3 The illustration of the Benders Decomposition algorithm including logic-based Benders cuts . . . 62
4.4 Distribution of Interaction Scores in HP . . . . . . . . . . . . . . . . . . . . 67
4.5 Distribution of Interaction Scores in SA . . . . . . . . . . . . . . . . . . . . 67
4.6 Solution time comparison between BD-LB and BD-LB-WS . . . . . . . . . . . . . . 70
4.7 Optimality gap comparison between BD-LB and BD-LB-WS . . . . . . . . . . . . . 70
4.8 Solution time comparison between BD-TB and BD-TB-WS . . . . . . . . . . . . . . 70
4.9 Optimality gap comparison between BD-TB and BD-TB-WS . . . . . . . . . . . . . 70
5.1 Visualization of transportation network in North Slope . . . . . . . . . . . . . 85
5.2 Illustration of deprivation cost function . . . . . . . . . . . . . . . . . . . 87
5.3 Evacuees in community 1 at time 4 . . . . . . . . . . . . . . . . . . . . . . . 93
5.4 Evacuees in community 2 at time 4 . . . . . . . . . . . . . . . . . . . . . . . 93
5.5 Movements when resource demand is not satisfied . . . . . . . . . . . . . . . . 94
5.6 Movements when resource demand is satisfied . . . . . . . . . . . . . . . . . . 94
5.7 Evacuees in community 1 at time 5 . . . . . . . . . . . . . . . . . . . . . . . 94
5.8 Evacuees in community 3 at time 5 . . . . . . . . . . . . . . . . . . . . . . . 94
5.9 Incident locations selected on the Crystal Serenity's planned routes . . . . . . 109
5.10 Objective values in the baseline experiment . . . . . . . . . . . . . . . . . . 115
5.11 The villages used in the baseline experiment . . . . . . . . . . . . . . . . . 115
5.12 The total objective values in the baseline experiment and Experiment 1 . . . . 117
5.13 The deprivation costs incurred during travel in the baseline experiment and Experiment 1 . . . 117
5.14 The total objective values in the baseline experiment and Experiment 2 . . . . 119
5.15 The percentage increase in the objective in Experiment 3 compared to the baseline experiment . . . 119
5.16 The number of evacuees who stayed in the C. ship at |T| in the baseline experiment and Exp. 3 . . . 120
5.17 The total number of tours completed by the ships in the baseline experiment and Exp. 3 . . . 120
5.18 The total objective values in Experiment 3 and Experiment 4 . . . . . . . . . . 122
Chapter 1
Introduction
Operations research (OR) has been playing a significant role in our lives since its first use in
World War II, with applications ranging from the military, to the economy, to infrastructure
analysis, to biological networks. The report Operations Research: A Catalyst for Engineering Grand
Challenges (Sen et al., 2014) discusses OR as the catalyst for four major challenges: i) sustainability
(e.g., providing low-cost solar energy and higher water quality), ii) security (e.g., ensuring cyber-
security and nuclear safety), iii) healthcare (e.g., offering improved health service and engineering
higher quality medicines), and iv) joy of living (e.g., building smart houses and creating better online
recommendation systems). While it integrates computational and mathematical tools to overcome
challenges faced in real-world applications, its progress can be further enhanced through interdisci-
plinary studies. Well-known problems addressed by OR include the facility location problem, the
sports scheduling problem, the blending problem, the cutting stock problem, the diet problem and
the vehicle routing problem.
Network design models constitute an important class of network optimization problems. In most
cases, network design problems aim to identify optimal location selection (e.g., warehouse, shelter,
or distribution center) and allocation decisions (e.g., commodities, evacuees, or electricity) depending
on the application area. Dynamic decision-making processes over a time horizon can also be modeled
as a network design problem and have been studied in the literature (Nurre et al., 2012; Garrett
et al., 2017; Nguyen et al., 2020). Important applications include, but are not limited to, evacua-
tion networks (Uster et al., 2018), transportation networks (Behbahani et al., 2019), supply chain
networks (Saif and Elhedhli, 2016), multi-commodity flow networks (Paraskevopoulos et al., 2016),
sensor networks (Keskin, 2017), and distribution networks (De Corte and Sorensen, 2016). In this
dissertation, we direct our attention to biological networks, more specifically protein-protein
interaction networks (PPINs), and to emergency response networks uniquely designed for a remote
region, Arctic Alaska. We propose novel optimization models for problems in these application
areas, which broadly fit into the class of network design problems.
Researchers have also been working on solution methodologies to tackle network design problems
due to their consistent popularity over the last century. We can group the solution methodologies
into three different categories: i) heuristic approaches, ii) approximation algorithms, and
iii) exact solution methods. Since optimization models often carry high inherent computational
complexity, heuristics are highly utilized to obtain good or near-optimal solutions. Some popular
approaches include neighborhood search heuristics (Eskandarpour et al., 2017; Canca et al., 2017),
Lagrangian-based heuristics (Fortz et al., 2017; Alkaabneh et al., 2019), column-generation-based
heuristics (Crainic et al., 2016; Keskin, 2017), and meta-heuristics (SteadieSeifi et al., 2017; Govin-
dan et al., 2019). In addition, there exists a wide range of network design studies where authors
design approximation algorithms, which aim to approximate the optimal solution under the
conjecture that P ≠ NP (Goemans et al., 1994; Ravi et al., 2001; Bley and Rezapour, 2016; Grimmer, 2018;
Friggstad et al., 2019; Govindan et al., 2019). Lastly, exact solution methods are designed to
tackle network design problems when reaching the optimal solution is preferred over obtaining a solution within
a short amount of time. Decomposition algorithms including Benders decomposition (BD) (Gabrel
et al., 1999; Uster et al., 2007; Zetina et al., 2019), Lagrangian relaxation (Aykin, 1994; Gendron,
2019), and branch-and-cut algorithms (Alibeyg et al., 2018; Leitner et al., 2020) are highly utilized
exact solution methods in network design problems.
In this dissertation, we mainly reference BD to solve the models proposed for biological
networks (i.e., PPINs) at scale. On the other hand, we utilize heuristic solution methods to solve
an Arctic emergency response model due to the high complexity and non-decomposable structure of
its formulation. In other words, we approach the models related to PPINs from a computational
optimization perspective that contains modeling and design of exact solution methods. For the
emergency response application, we focus on modeling and policy analysis perspectives and design
heuristic solution methods that could offer solutions in real-time.
BD is a remarkably popular solution method often used in large-scale, mixed-integer linear
programming models which possess a block structure (Benders, 1962). This solution method has
been proven quite effective on problems where some variables are considered "complicating," i.e.,
once these variables are fixed, the remaining optimization problem can be solved efficiently. In
this case, the complicating variables become part of the "master problem" (MP), and the remaining
optimization problem is referred to as the subproblem (SP). There is a set of decision
variables in the MP which may appear in the SP as well. It should be noted that it is quite
likely to have multiple SPs, especially in situations where the initial SP is separable; however, for
this discussion and without loss of generality, we assume that there exists one SP. We also assume
that we are concerned with a maximization problem. The algorithm first solves the MP and then
proceeds to solve the SP with the complicating variables fixed at the last MP solution. If these
fixed values do not yield a feasible solution in the SP, a cutting plane called a feasibility
cut is generated to eliminate the infeasible solution. If the SP turns out to be feasible, then
i) an upper bound (UB) on the objective function is obtained by solving the MP, and ii) a lower
bound (LB) on the objective is produced by computing the actual cost of the decisions in the SP in
that iteration. An optimality cut is then added to the MP to ensure that the last solution's
objective in the MP reflects its true objective rather than just a UB on that solution's objective.
These steps are iteratively repeated until a user-defined convergence between the LB and UB is obtained. We
refer the reader to Benders (1962) and Geoffrion (1972) for further details.
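The loop just described (solve the MP, fix the complicating variables, solve the SP, add a cut, repeat until the LB and UB meet) can be sketched on a toy maximization problem. The tiny instance below, its numbers, and the analytic dual solution are our own illustrative assumptions, not anything taken from this dissertation; in practice, both the MP and the SP would be handed to an LP/MIP solver.

```python
# A minimal sketch of the Benders loop on the toy problem
#   max  -3*y + 2*x   s.t.  x <= 5*y,  x <= 4,  y in {0,1},  x >= 0.
# The binary y is the "complicating" variable kept in the master; x stays
# in the subproblem, whose dual is small enough to solve by inspection.

def solve_subproblem_dual(y):
    """Dual of max {2x : x <= 5y, x <= 4, x >= 0}:
       min {5y*u1 + 4*u2 : u1 + u2 >= 2, u >= 0}.
       An optimal extreme point puts all weight on the binding constraint."""
    u1, u2 = (2.0, 0.0) if 5 * y <= 4 else (0.0, 2.0)
    value = 5 * y * u1 + 4 * u2
    cut = (5 * u1, 4 * u2)          # optimality cut: eta <= (5*u1)*y + (4*u2)
    return value, cut

def benders(tol=1e-6, max_iters=20):
    cuts = []                        # each cut restricts eta to a*y + b
    lb = float("-inf")
    for _ in range(max_iters):
        # Master: max -3*y + eta over y in {0,1}, with eta capped by the cuts
        # (plus an initial artificial bound); one binary variable, so enumerate.
        def eta_bound(y):
            return min([a * y + b for a, b in cuts], default=100.0)
        y = max((0, 1), key=lambda v: -3 * v + eta_bound(v))
        ub = -3 * y + eta_bound(y)            # master value is an upper bound
        sp_value, cut = solve_subproblem_dual(y)
        lb = max(lb, -3 * y + sp_value)       # realized value is a lower bound
        if ub - lb <= tol:                    # user-defined convergence
            return lb, y
        cuts.append(cut)                      # tighten the master and repeat
    raise RuntimeError("no convergence")
```

On this instance the loop converges in three iterations to the optimal value 5 with y = 1, each iteration adding one optimality cut exactly as described above.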
Although BD is widely used to solve large-scale problems, it often requires extra efforts to
obtain a fast convergence. In the traditional Benders approach, since the MP is solved from scratch
every time a new cut is incorporated, the solver likely visits the same nodes over and over again
during the branch and bound processes. To overcome this challenge, Modern Benders Decomposition
(Fischetti et al., 2016, 2017), where Benders cuts are added on-the-fly (if violated) when the solver
identifies both incumbent and fractional solutions, has been commonly utilized. This is also called
the branch-and-Benders-cut approach, implying that there exists only a single enumeration tree, in
which the solver never visits the same candidate nodes again. Whenever the solver identifies an
incumbent solution, a callback function (the generic callback function in CPLEX) is triggered and
the branch-and-bound process is halted. If the incumbent solution overestimates the objective
(i.e., underestimates it for a minimization problem), meaning that there is a cut violated by the
integer solution, then Benders cuts are generated through the dual solutions. In addition, at a
non-integer solution before branching, the same function is used to generate a Benders cut
separating the fractional solution. If no violated cut exists, then branching takes place as usual.
However, this cut generation might not be as straightforward as the cut generation taking place at
an incumbent solution, meaning that extra effort, such as employing heuristic approaches, might be
necessary.
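The on-the-fly flavor of this scheme can be mimicked without any solver. The sketch below is a hypothetical stand-in, entirely our own construction and not CPLEX's callback API: a brute-force "master" plays the role of the enumeration tree, and a constraint enters the model only once an incumbent violates it, mirroring how cuts are separated at candidate solutions.

```python
from itertools import product

def lazy_cut_loop(weights, edges):
    """Maximize sum(weights[i] * y[i]) over y in {0,1}^n subject to the edge
    constraints y[u] + y[v] <= 1, but keep those constraints OUT of the master
    and separate them lazily: a cut enters only once an incumbent violates it,
    mirroring how branch-and-Benders-cut adds cuts on-the-fly."""
    n = len(weights)
    cuts = []                                    # violated constraints found so far
    while True:
        # "Master": best solution subject only to the current cut pool
        # (brute-force enumeration stands in for the solver's single tree).
        y = max(
            (p for p in product((0, 1), repeat=n)
             if all(p[u] + p[v] <= 1 for u, v in cuts)),
            key=lambda p: sum(w * v for w, v in zip(weights, p)),
        )
        # Separation at the incumbent: is any true constraint violated?
        violated = [(u, v) for u, v in edges if y[u] + y[v] > 1]
        if not violated:
            return y, sum(w * v for w, v in zip(weights, y))
        cuts.append(violated[0])                 # add one violated cut, re-solve
```

On a 3-node path with weights (2, 3, 2), two lazily separated cuts suffice before the incumbent (1, 0, 1) with value 4 is accepted as optimal.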
In the literature, there are several acceleration techniques for BD, one of which is utilizing
valid inequalities based on constraint tightening, with which MPs are solved more efficiently
(Sherali et al., 2010; Taşkın et al., 2012; Frank and Rebennack, 2015). Providing initial bounds on
the objective value in the MP also plays an important role in reaching quicker convergence of the
selected solution methods. Helpful methods in this sense include introducing valid inequalities
(Ahat et al., 2017), solving the relaxed version of the model (Chen and Miller-Hooks, 2012), using
Lagrangian relaxation (Holmberg, 1994), and employing heuristic approaches (Contreras et al.,
2011). It has also been shown that tuning certain solver parameters when solving the MP might yield
faster convergence (Bai and Rubin, 2009; Botton et al., 2013; Dalal and Uster, 2017). It should be
noted that in most cases, changing default settings does not provide significant improvement in
solution times when solving the original model via the branch-and-bound process. Lastly, warm
starting the solution methods could also be helpful as it provides incumbent solutions, especially
if the method is struggling to identify such solutions. Several warm-starting methods have been
shown to be effective strategies: extreme points or valid cuts might be generated via solving the
relaxed primal SP (Adulyasak et al., 2015), deflecting the current master solution (Rahmaniani
et al., 2018), or designing meta-heuristic algorithms (Emde et al., 2020).
Moreover, BD might be utilized even if the SP is not a linear programming (LP) model. In this
regard, logic-based Benders decomposition (LBBD) was formally introduced by Hooker and Ottosson
(2003). It distinguishes itself from traditional BD by using the inference dual rather than LP
duality to generate the Benders cuts so as to eliminate infeasible MP solutions. Although it has
been predominantly used in scheduling problems (Hooker, 2007; Roshanaei et al., 2017; Emde et al.,
2020; Guo et al., 2021), it has recently been adapted to plant location (Fazel-Zarandi and Beck,
2012), route planning (Kloimullner and Raidl, 2017), and network interdiction (Enayaty-Ahangar
et al., 2019) problems as well.
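A minimal sketch of the inference-dual idea, under our own assumptions (a made-up job-to-machine assignment instance, not from any of the works cited): the subproblem is a pure capacity feasibility check with no LP dual available, so each infeasibility is converted into a combinatorial "no-good" cut forbidding the offending job set on the overloaded machine.

```python
from itertools import product

def lbbd(durations, costs, capacity, n_machines=2):
    """Logic-based Benders sketch: the master assigns jobs to machines by cost
    alone; a combinatorial subproblem checks machine capacities and, when one
    is exceeded, returns a 'no-good' cut (an inference-dual-style cut)
    forbidding that exact job set from co-occurring on that machine."""
    n = len(durations)
    cuts = []   # each cut: (machine, frozenset of jobs) that must not co-occur
    while True:
        # Master: cheapest assignment respecting the cuts (brute force here).
        feasible = [
            a for a in product(range(n_machines), repeat=n)
            if all(not S <= {j for j in range(n) if a[j] == m}
                   for m, S in cuts)
        ]
        a = min(feasible,
                key=lambda a: sum(costs[j][a[j]] for j in range(n)))
        # Subproblem: pure feasibility check, no LP dual to read cuts from.
        overloaded = [m for m in range(n_machines)
                      if sum(durations[j] for j in range(n) if a[j] == m) > capacity]
        if not overloaded:
            return a, sum(costs[j][a[j]] for j in range(n))
        m = overloaded[0]
        cuts.append((m, frozenset(j for j in range(n) if a[j] == m)))
```

With durations (3, 4, 2), a per-job cost that favors machine 0, and capacity 5, the master's cheapest assignments are repeatedly cut off until a capacity-feasible assignment of cost 4 remains.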
The remainder of this dissertation is outlined as follows. In Chapter 2, we introduce the star
degree centrality (SDC) problem, which tasks itself with identifying an induced star with the
largest open neighborhood, together with its stochastic variant, the stochastic pseudo-star degree
centrality (SPSDC) problem. In Chapter 3, we focus on the deterministic version and introduce a new integer
programming (IP) formulation, which is stronger than the existing IP formulation in the literature
in terms of the number of constraints and non-zero coefficients. Next, we present a complexity
discussion where the SDC problem is examined on certain network types. We then propose a BD
framework and conduct extensive experimental studies on both randomly generated networks and
real-world PPINs. In the next chapter, we propose a non-linear binary optimization model and
provide complexity discussions for the SPSDC problem (see Chapter 4). We first convert the model
into a linear form and design a BD framework that contains optimality cuts and both traditional
and logic-based Benders feasibility cuts. In addition, a wide range of computational experiments
are presented based on randomly generated small-world networks. Chapter 5
focuses on Arctic mass rescue events motivated by the entrance of large-sized cruise ships into the
region in the last decade. We first provide an initial comprehensive overview on the background of
Arctic Alaska. We discuss the changes and challenges that have occurred in the past and are to
occur in the future due to environmental, geographical, political, as well as economic reasons in the
region. We then provide a literature review related to our work and then provide a formal problem
definition containing our modelling assumptions and objective components. We next introduce an
IP formulation, which includes both transportation and logistics decisions, and discuss the solution
methodology created to solve this problem. We conclude this work with conducting an extensive
âwhat-ifâ analysis and answering some policy questions. Lastly, we present the conclusion and a
summary of this dissertation in Chapter 6.
Chapter 2
Star Degree Centrality: Definitions
and Problem Statements
In this chapter, we first review the star graph terminology and then detail the central-
ity concept, a well-recognized metric in graph theory and network analysis. After introducing the
protein-protein interaction networks, we discuss both the star degree centrality and the stochastic
pseudo-star degree centrality problems (see Section 2.1). Lastly, we provide formal problem defini-
tions together with illustrative examples detailing how the star degree centrality mechanism with
its two variants works in Section 2.2.
2.1 Introduction
A star graph can be defined as a tree graph with a maximum diameter of two, where
the diameter is defined as the maximum distance between any two nodes (see Fig. 2.1). Different
variations of star graphs have been drawing researchers' attention since the late 1980s. Akers and
Krishnamurthy (1989) were the first to introduce the notion of a star graph as a new class of
networks. Day and Tripathi (1992) expanded this idea to generalized (n, k)-star graphs, where n and
k are user-defined values tuning the number of nodes and the degree/diameter trade-off. The idea
was then taken up by Akers et al. (1994), who proposed star graphs as an alternative to hypercube
structures. Afterwards, Chou et al. (1996) proposed bubble-sort star graphs as a new interconnection
network structure. Past and recent studies heavily focus on the topological and functional analysis
of star graphs (Chiang and Chen, 1998; Lin et al., 2020; Li et al., 2020a).
Figure 2.1: Examples of star graphs S_{1,2}, S_{1,3}, S_{1,4}, S_{1,5}, and S_{1,6}
Centrality, on the other hand, is one of the best-studied concepts in network analysis. It has
been used in a variety of applications to quantify the importance of nodes or entities in a network.
The main idea is that the more central a node is, the more importance it has. Expectedly, not
every measure of importance is equally valid in every application. Hence, a series of simpler or
more complex notions of centrality have been proposed over the years. They range from the early
work by Bavelas (1948, 1950) and Leavitt (1951) on task-oriented group creation, as well as the
introduction of eigenvector and bargaining centrality by Bonacich (1972, 1987), to more recent ideas
about subgraph (Estrada and Rodríguez-Velázquez, 2005), residual (Dangalchev, 2006) or diffusion
(Banerjee et al., 2013) centrality. In this dissertation, we turn our focus to a concept referred to as
group centrality (Everett and Borgatti, 1999).
In a fundamental contribution, Freeman (1978) examined three distinct and recurring con-
cepts in centrality studies, namely degree, betweenness, and closeness. The basic definitions involved
with each of the concepts are as follows. Degree is related to the number of connections that a node
has (i.e., number of nodes adjacent to a given node i, often normalized by the number of nodes in
the network minus 1); betweenness can be quantified as the fraction of shortest (geodesic) paths
that use a specific node i; finally, closeness is a function of the shortest (geodesic) paths that a node
i has to every other node in the network. A common theme behind the above definitions is their
focus on a specific node.
Group extensions to centrality have been proposed to help address questions of importance
for a group as a whole, as well as for introducing importance that can be attributed to the node
versus to the group it belongs. This idea was presented by Everett and Borgatti (1999, 2005)
and was immediately picked up and expanded upon by a series of researchers. Prominent extensions
include the definition of clique (cohesive subgroup) centrality (Vogiatzis et al., 2015; Rysz et al., 2018;
Nasirian et al., 2020). Identifying a general group of nodes with the highest betweenness centrality has also
been studied by Veremyev et al. (2017), who also mention the possibility of introducing additional
"cohesiveness" constraints.
More specifically, we study the recently introduced measure of star degree centrality (SDC)
by Vogiatzis and Camur (2019) where SDC has been shown to be a highly efficient centrality metric to
identify the essential proteins in protein-protein interaction networks (PPINs). The results indicate
that it performs better than other well-known metrics (i.e., degree, closeness, betweenness, and
eigenvector) in the determination of the essential proteins. The contributions of Vogiatzis and Camur
(2019) are in approximation algorithms for finding nodes with high SDC whereas we contribute to
the literature by providing exact solution approaches that are able to solve problems of significant
size.
The SDC tasks itself with identifying the induced star centered at a given node i that
possesses the maximum cardinality open neighborhood. An induced star centered at i will include i
and a subset of its neighbors as part of the star under the condition that no two neighbors in the star
have an edge between them. The open neighborhood is the set of all nodes not in the induced star
that are adjacent to a node in the induced star. Vogiatzis and Camur (2019) study the problem in
the context of a PPIN. The authors derive the computational complexity of the problem and show it
is NP-hard; additionally, they provide an integer programming (IP) formulation and approximation
algorithms to solve it efficiently. More importantly, they show that this is indeed a viable proxy
for predicting essentiality in PPINs. Essential genes (and their essential proteins) are ones whose
absence leads to lethality or the inability of an organism to properly reproduce themselves (Kamath
et al., 2003). Thus, identifying the node with the highest star degree centrality finds an important
application in PPINs.
PPINs are networks where nodes represent proteins and arcs represent protein-protein in-
teractions. Each arc is associated with an interaction score indicating the strength of the interaction
where a higher score implies a stronger interaction. These networks have been heavily studied over
the last two decades: for a series of surveys on computational methods for complex detection, clus-
tering, detecting essentiality, among others, in PPINs, we refer the interested reader to the recent
reviews by Wang et al. (2013); Bhowmick and Seah (2015), and Rasti and Vogiatzis (2019). Cen-
trality has been a staple in the study of biological networks, and specifically PPINs: CentiServer
(Jalili et al., 2015) is a database that has collected a large number of centrality-based approaches
for biological networks at https://www.centiserver.org.
Jeong et al. (2001) proposed the "lethality-centrality" rule, in which the more central a
protein is, the higher the probability it is essential. This work led to significant research interest in
centrality metrics in PPINs (see the works by Joy et al. (2005) on betweenness, Estrada (2006) on
subgraph centrality, Wuchty and Stadler (2003) on closeness centrality). An updated survey and
comparison of 27 commonly used centrality metrics (including degree, betweenness, and closeness)
is presented in the work by Ashtiani et al. (2018).
At this point, we should mention that the high computational complexity in PPINs did not
allow Vogiatzis and Camur (2019) to conduct a full analysis across the entire network. That is why
they used two different approaches to simplify the problem: i) setting extremely high thresholds to
prune the edges in the networks and ii) utilizing a probabilistic approach to create the interactions
between the proteins. In addition, the essential protein analysis is performed by selecting k (i.e., a
user-defined value) top proteins for each of which an individual IP is solved assuming each as the
center. On the other hand, our decomposition implementation opens the door to a full analysis
of large-scale networks by being able to identify the node with the highest SDC across the entire
network. Our computational results indicate that we can avoid using high thresholds to perform
analysis in real-world PPINs.
Furthermore, we introduce the stochastic pseudo-star degree centrality (SPSDC) problem,
where the goal is to detect an induced pseudo-star that is truly a star with high probability, where
"high probability" accounts for a) the probability that the center has an edge to each leaf node, and
b) the probability that there are no edges between leaf nodes. The objective is to maximize the connection probability
of each neighbor node to the pseudo-star. From an application perspective, the SPSDC metric may
help to identify new proteins that should be investigated to determine their essentiality (see Section
2.2). It may also help to confirm that essential proteins identified through the SDC metric are
important.
In PPINs, there exist interaction scores that represent the strength of the interactions be-
tween two proteins. In fact, one can normalize the interaction scores and treat them as probability
values indicating the likelihood of two proteins interacting. Our first goal is to ensure
that i) the probability values between the center node and each leaf node are high, and ii) the proba-
bility values between connected leaf nodes are low, in order to ensure feasibility (i.e., the existence
of a star). It is crucial to point out that we now allow leaf nodes to be connected as long as the induced
pseudo-star satisfies the "feasibility condition", which will be introduced shortly. Therefore, we use
the term pseudo-star rather than star.
In the SPSDC problem, the main objective is to assign each neighbor node to a single
pseudo-star element (i.e., either the center or a leaf) which yields the largest probability value. In
other words, our goal is to maximize the maximum probability value of the connection between a
neighborhood node and the pseudo-star. This offers one potential way to evaluate the centrality of
the pseudo-star; different metrics could be applied in the future.
2.2 Problem Definitions
Let G = (V, E) be an undirected graph consisting of a vertex set V and an edge set E,
where |V| = n and |E| = m*. We define the open neighborhood of a node i ∈ V as the set of nodes
adjacent to i; in other words, N(i) = {j ∈ V : (i, j) ∈ E}. Similarly, the closed neighborhood of a
node i ∈ V is defined as N[i] = N(i) ∪ {i}. For a set of nodes S, we define the open neighborhood
as N(S) = {j ∈ V : ∃i ∈ S, j ∉ S, (i, j) ∈ E}. Additionally, we define the k-neighborhood of a node
i ∈ V as the set of nodes whose shortest path from i uses exactly k edges and denote it as N_k(i). In
other words, N_k(i) contains the nodes that cannot be reached from i in fewer than k edge hops.
Note that N_k(i) ∩ N_{k+1}(i) = ∅, ∀k ≤ K, where k ∈ Z_+ and K is the length of the longest shortest
path from node i to all other nodes in the network. Finally, we let p_{ij} represent the probability of
existence of each edge (i, j) ∈ E.
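Since the k-neighborhoods partition the nodes reachable from i by shortest-path distance, they can all be computed with a single breadth-first search. A minimal sketch (the function name and adjacency-dictionary encoding are our own illustrative choices, not notation from this dissertation):

```python
from collections import deque

def k_neighborhoods(adj, i):
    """Return {k: N_k(i)}: the nodes whose shortest path from i uses
    exactly k edges, computed by one breadth-first search."""
    dist = {i: 0}
    layers = {0: {i}}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:  # first visit gives the shortest distance
                dist[v] = dist[u] + 1
                layers.setdefault(dist[v], set()).add(v)
                queue.append(v)
    return layers

# Path graph a-b-c-d: N_1(a) = {b}, N_2(a) = {c}, N_3(a) = {d},
# and consecutive layers are disjoint, as noted above.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
layers = k_neighborhoods(adj, "a")
```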
Definition 1. The star degree centrality of node i, represented by D_i, is a centrality metric which
aims to form an induced star S_i centered at i with the largest size open neighborhood, where
D_i = max{|N(S_i)| : S_i is an induced star}.
In the deterministic setting, we are not concerned with the probability values; in other
words, we assume that all the edges in the network exist with probability one.
Example 1. Below we present a small example showing how to identify the SDC of a given node
(see Fig. 2.2). We select node c as the candidate center. First, note that N(c) = {l1, l2} represents
the set of candidate leaf nodes. In a deterministic induced star, no two leaf nodes can be connected;
therefore, l1 and l2 cannot be elements of the same star. Since the objective is to maximize the
open neighborhood of the induced star, node l2 is preferable over node l1 as it gives access to
more nodes. Thus, we obtain S_c = {c, l2} and N(S_c) = {l1, n3, n4, n5}.

*We will redefine notation as needed in each chapter. We will be consistent in terms of not changing the notation used for a specific definition.
Figure 2.2: Determining the star degree centrality of a given node, where the center and leaf nodes are shown in red and blue, respectively.
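The reasoning in Example 1 can be checked by brute force: enumerate every subset of N(c) with no edge between two chosen leaves, and keep the one whose induced star has the largest open neighborhood. A small sketch, exponential in |N(c)| and therefore only for toy instances; the graph encoding is our own assumption:

```python
from itertools import combinations

def star_degree_centrality(adj, c):
    """Enumerate every induced star centered at c (no edge between two
    chosen leaves); return the best open-neighborhood size and leaf set."""
    best, best_leaves = -1, None
    cand = sorted(adj[c])
    for r in range(len(cand) + 1):
        for leaves in combinations(cand, r):
            if any(v in adj[u] for u, v in combinations(leaves, 2)):
                continue  # two adjacent nodes cannot both be leaves
            star = {c, *leaves}
            nbhd = {v for u in star for v in adj[u]} - star
            if len(nbhd) > best:
                best, best_leaves = len(nbhd), set(leaves)
    return best, best_leaves

# The instance of Fig. 2.2: c is adjacent to l1 and l2, which share an
# edge; l1 also touches n1, n2 and l2 also touches n3, n4, n5.
adj = {"c": {"l1", "l2"},
       "l1": {"c", "l2", "n1", "n2"},
       "l2": {"c", "l1", "n3", "n4", "n5"},
       "n1": {"l1"}, "n2": {"l1"},
       "n3": {"l2"}, "n4": {"l2"}, "n5": {"l2"}}
```

Running this with center c recovers the solution of Example 1: the single leaf l2 and an open neighborhood of size four.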
Figure 2.3: An example of why a star structure helps identify essential proteins. In this figure, we present a subgraph of the PPIN of Saccharomyces Cerevisiae (yeast) using a threshold of 92%. The node in red corresponds to non-essential protein YMR300C and is the node of highest degree; the node in green corresponds to essential protein YHL011C and is the node of highest star degree centrality.
Example 2. In Fig. 2.3, we present some of the notions in this work using a real-life example from
the yeast proteome (Saccharomyces Cerevisiae) keeping only interactions above a threshold of 92%
(so that the induced subgraph is sparse enough for visualization purposes).
The highest degree centrality protein is also known as YMR300C (marked in red) and
despite its central location and its many documented interactions, it is not essential. We observe
that YMR300C is adjacent to two main protein complexes (dense subgraphs). This means that many
of the connections that YMR300C has to other nodes are also shared among the nodes themselves.
Hence, if we were to discard connections between neighbors (that is, we enforced a âstarâ constraint),
its importance would be sure to decrease.
On the other hand, the highest star degree centrality protein is known as YHL011C (marked
in green), an essential protein for many cell activities as it is used to synthesize phosphoribosyl
pyrophosphate. We observe that while its degree centrality is small (the number of neighbors it has
is only 7, compared to a degree centrality of 23 for YMR300C), it is adjacent to nodes that connect
different protein complexes and communities.
We now move to the SPSDC problem and first formally define the feasibility condition. For
a given pseudo-star S_k centered at node k, let L be the set of leaf nodes. Also, let θ ∈ [0, 1] be a
user-defined value.

Definition 2. Given a pseudo-star S_k, the feasibility condition is defined as

    ∏_{j∈L} p_{kj} · ∏_{i,j∈L:(i,j)∈E} (1 − p_{ij}) ≥ 1 − θ    (2.1)
where the first product term focuses on the probability that edges exist between the center and the
leaf nodes, and the second product term focuses on the probability that no edges exist between two
leaf nodes. We can use the log transformation (i.e., a data transformation where each data point is
inserted into the logarithm function) to get rid of the multiplication operations in Ineq. (2.1) and
obtain an equivalent expression:

    ∑_{j∈L} log(p_{kj}) + ∑_{i,j∈L:(i,j)∈E} log(1 − p_{ij}) ≥ log(1 − θ)    (2.2)
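The log form (2.2) is straightforward to evaluate in code: sum the logs of the center-leaf probabilities and of one minus the leaf-leaf probabilities, then compare against log(1 − θ). A sketch under our own encoding assumptions (edges keyed by frozensets; a missing key means the edge is absent):

```python
import math

def satisfies_feasibility(p, center, leaves, theta):
    """Evaluate the log form (2.2) of the feasibility condition: the
    pseudo-star is a true star with probability at least 1 - theta."""
    total = 0.0
    for j in leaves:
        # each center-leaf edge should exist
        total += math.log(p[frozenset((center, j))])
    for idx, i in enumerate(leaves):
        for j in leaves[idx + 1:]:
            e = frozenset((i, j))
            if e in p:
                # each leaf-leaf edge should be absent
                total += math.log(1.0 - p[e])
    return total >= math.log(1.0 - theta)

# Edge probabilities of Example 3 below: with theta = 0.2, the
# single-leaf pseudo-star {c, l1} is feasible since 0.99 >= 0.8.
p = {frozenset(("c", "l1")): 0.99, frozenset(("c", "l2")): 0.8,
     frozenset(("l1", "l2")): 0.01}
```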
Definition 3. The stochastic pseudo-star degree centrality of node i, represented by D_i, is a centrality
metric which aims to form an induced pseudo-star S_i centered at i maximizing the maximum
probability value of each neighbor's connection to the pseudo-star, where
D_i = max{∑_{j∈N(S_i)} max_{k∈S_i} p_{kj} : S_i is an induced pseudo-star satisfying the feasibility condition (2.1)}.
Example 3. In Fig. 2.4, we provide an example of how the SPSDC mechanism works, where the
probability values are shown on the edges and θ is given as 0.2. Considering node c as the center,
we can first create a candidate pseudo-star where node l1 is the only leaf node (see the figure on
the left). In this scenario, the feasibility condition is satisfied since 0.99 ≥ 1 − θ, and we obtain an
objective of 0.5 + 0.5 + 0.8 = 1.8. Note that node l2 is assigned to node c since c provides a stronger
connection than node l1 does (i.e., 0.8 vs. 0.01).
However, probability values associated with the edges between the center node and nodes
l1 and l2 are relatively large. Also, even though nodes l1 and l2 share an edge, the corresponding
Figure 2.4: Determining the stochastic pseudo-star degree centrality of a given node, where the center and leaf nodes are shown in red and blue, respectively.
probability value between those two nodes shows that they are highly unlikely to interact. There-
fore, we can create an alternative induced pseudo-star centered at c where the leaf nodes are
l1 and l2 (see the figure on the right). Such a pseudo-star still satisfies the feasibility condition
(i.e., 0.99 × 0.8 × (1 − 0.01) = 0.8821 > 1 − θ). In addition, it yields a better objective, which is
calculated as 0.99 + 0.5 + 0.5 = 1.99.
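The objective of Definition 3 for a fixed pseudo-star can be sketched directly: every node in the star's open neighborhood contributes its strongest connection probability to any star element. Below we encode the Fig. 2.4 instance with our own adjacency-set and edge-probability representation (an illustrative assumption, not the dissertation's notation):

```python
def pseudo_star_objective(adj, p, center, leaves):
    """Sum, over the open neighborhood of the pseudo-star, of each
    neighbor's strongest connection probability to a star element."""
    star = {center, *leaves}
    nbhd = {v for u in star for v in adj[u]} - star
    return sum(max(p[frozenset((u, j))] for u in star if j in adj[u])
               for j in nbhd)

# Fig. 2.4: edges c-l1 (0.99), c-l2 (0.8), l1-l2 (0.01),
# c-n1 (0.5), c-n2 (0.5), l2-n3 (0.99).
adj = {"c": {"l1", "l2", "n1", "n2"}, "l1": {"c", "l2"},
       "l2": {"c", "l1", "n3"}, "n1": {"c"}, "n2": {"c"}, "n3": {"l2"}}
p = {frozenset(("c", "l1")): 0.99, frozenset(("c", "l2")): 0.8,
     frozenset(("l1", "l2")): 0.01, frozenset(("c", "n1")): 0.5,
     frozenset(("c", "n2")): 0.5, frozenset(("l2", "n3")): 0.99}
# Leaf set {l1}: neighbors l2, n1, n2 give 0.8 + 0.5 + 0.5 = 1.8.
# Leaf set {l1, l2}: neighbors n1, n2, n3 give 0.5 + 0.5 + 0.99 = 1.99.
```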
It is important to mention that there cannot be any guarantee that a pseudo-star gives a
better deterministic objective (i.e., the largest size of open neighborhood) than the deterministic
induced star since the threshold used in the feasibility condition impacts the size of the pseudo-
star. Therefore, a fair comparison cannot be made between the SDC and the SPSDC even if they
are associated with the same objective function. However, the goal of each of these problems in
our motivating application is to identify essential proteins and, therefore, it may be that each of
their solutions helps to diversify the set of proteins that should be investigated to determine their
essentiality or confirm the likeliness of certain proteins being essential (i.e., if they appear in both
the SDC and SPSDC problems).
The advantages of the SPSDC compared to the deterministic counterpart that has been
studied before are twofold. First and foremost, it allows us to solve the problem in a PPIN without
the need to trim edges based on their probability of existence. In the deterministic version, a thresh-
old is employed to remove edges below a certain probability of existence. This can be problematic,
as certain edges may interact with high enough probabilities just below the threshold and hence
get removed; on the other hand, other edges that are just above the threshold are considered as
present. As an example to showcase the success of the SPSDC in PPINs, we point the attention to
Figure 2.5. There we show two pseudo-stars obtained for a non-essential protein (i.e., YBL072C or
RPS8A) and for an essential protein (i.e., YAL001C or TFC3) in Saccharomyces cerevisiae, which
is a species of yeast, with different zooming perspectives in the first three and last three images,
respectively. The pseudo-star obtained with the essential protein as the center leads to higher overall
objective function than the pseudo-star obtained for the non-essential protein. On the other hand,
had we employed a threshold of 60% (i.e., removing all edges with likelihood smaller than 60%), the
objective functions of the two stars would be reversed, leading to the non-essential one possessing a
higher value.
Figure 2.5: In this example, we show the pseudo-stars with maximum objective function value obtained for a non-essential and an essential protein in a real-world PPIN, Saccharomyces cerevisiae. The network consists of a giant connected component that includes 6,416 nodes out of the 6,418 proteins documented in STRING-DB (Szklarczyk et al., 2015), and 939,997 edges of varying reliability (probability of existence). The networks presented are the full connected component that contains the two proteins (left), a zoomed-in perspective (middle), and an even more zoomed-in perspective showing the pseudo-star centered at each protein (right). The pseudo-stars are obtained with α = 0.99; in other words, they form induced stars with probability 1%. To show the two pseudo-stars, we show in red the center and in yellow the leaves; edges from the center to the leaves are solid, whereas edges connecting two leaves are dashed. The pseudo-star obtained for the essential protein (see 4-6) leads to a higher objective function value (equal to 1293.72) than the value obtained for the pseudo-star centered at the non-essential protein (see 1-3), which is equal to 1002.57.
1 - The SPSDC of a non-essential protein
2 - The SPSDC of a non-essential protein (zoomed in)
3 - The SPSDC of a non-essential protein (more zoomed in)
4 - The SPSDC of an essential protein
5 - The SPSDC of an essential protein (zoomed in)
6 - The SPSDC of an essential protein (more zoomed in)
We conclude this chapter by providing the definitions of the DSDC and SPSDC problems
(see Definitions 4 and 5, respectively). We examine each problem in detail in Chapters 3
and 4, in that order.
Definition 4. The deterministic star degree centrality problem aims to identify the node which has
the largest star degree centrality in a given network.
Definition 5. The stochastic pseudo-star degree centrality problem aims to identify the node which
has the largest stochastic pseudo-star degree centrality in a given network.
Chapter 3
The Star Degree Centrality*
In this chapter, we provide IP formulations for the SDC problem. We begin
the discussion in Section 3.1 from the previously introduced formulation by Vogiatzis and Camur
(2019) and then proceed to propose a new, compact formulation. Section 3.2 presents classes of net-
works where the problem is solvable in polynomial time and offers a new proof of NP-completeness
that shows the problem remains NP-complete for bipartite and split graphs (thus tightening the
complexity analysis of Vogiatzis and Camur (2019)). In Section 3.3, we provide a branch-and-cut
implementation motivated by Benders decomposition for solving the problem on real-life, large-
scale networks, such as the ones typically encountered in computational biology and specifically in
PPINs. Section 3.4 discusses acceleration techniques utilized to speed up our implementation. All
our algorithmic efforts are put to the test in Section 3.5 which is divided into two subsections for
randomly generated instances and protein-protein interaction networks instances. We conclude with
a summary of our findings and recommendations for future work in Section 3.6.
3.1 Mathematical Formulations
First, we present the formulation that appears in the literature (the Vogiatzis and Camur
(2019) integer programming (VCIP) formulation). Then, we introduce a new formulation, which
is more compact in theory with respect to the number of constraints. In the original formulation,
there are three sets of binary variables: (i) x_i is equal to 1 if and only if i ∈ V is the center of the
star, (ii) y_i is equal to 1 if node i is in the star, and (iii) z_i is equal to 1 if node i is in the open
neighborhood of the star. The IP model is now provided in (3.1).

*This paper has been accepted at the INFORMS Journal on Computing.
[VCIP]:

    max  ∑_{i∈V} z_i                                      (3.1a)
    s.t. y_i + z_i ≤ 1,                      ∀i ∈ V       (3.1b)
         z_i ≤ ∑_{j∈N(i)} y_j,               ∀i ∈ V       (3.1c)
         y_i ≤ ∑_{j∈N[i]} x_j,               ∀i ∈ V       (3.1d)
         x_i ≤ y_i,                          ∀i ∈ V       (3.1e)
         y_i + y_j ≤ 1 + x_i + x_j,          ∀(i, j) ∈ E  (3.1f)
         ∑_{i∈V} x_i = 1,                                 (3.1g)
         x_i, y_i, z_i ∈ {0, 1},             ∀i ∈ V       (3.1h)
The objective function (3.1a) maximizes the number of nodes adjacent to the star. Constraints
(3.1b) indicate that no node can be both in the star and in its neighborhood. Constraints (3.1c) ensure
that for a node to be a neighbor of the star, it must be adjacent to at least one node in the star. In
addition, every node in the star must be in the closed neighborhood (i.e., a neighborhood containing
the node itself) of the center node by constraints (3.1d). We should point out that constraints (3.1e),
ensuring that the center node is part of the star, were absent in the printed version of Vogiatzis and
Camur (2019). Constraints (3.1f) prevent two adjacent nodes from being in the star if neither is
the center. These are computationally the most expensive constraints, since one must appear for
every edge. Constraint (3.1g) makes sure that the model identifies a single star by
selecting one center node. Last, constraints (3.1h) dictate the binary requirements for each variable.
Note that there is a total of 4n + m + 1 constraints in [VCIP]. Further, we can examine the number
of total non-zero coefficients across each type of constraint: (3.1b) has 2n; (3.1c) has n + 2m; (3.1d)
has 2n + 2m (since i ∈ N[i]); (3.1e) has 2n; (3.1f) has 4m; and (3.1g) has n. These sum to a total
of 8n + 8m non-zero coefficients.
In the former formulation [VCIP], though there is a specific variable used for the center
node (i.e., xi), variable yi corresponds to any node in the star without making any distinction. An
important observation is that leaf nodes in a star carry a unique characteristic which differentiates
them from the center node. That is, while a leaf node has solely one edge connecting it to the star
via the center node, the center node shares an edge with every leaf node. Hence, we remove variable
yi and introduce a new variable to represent the leaf nodes.
    l_i = { 1, if node i ∈ V is a leaf of the star
            0, otherwise.
After this conversion, we can remodel the problem with a new IP (NIP) formulation.
[NIP]:

    max  ∑_{i∈V} z_i                                      (3.2a)
    s.t. x_i + l_i + z_i ≤ 1,                ∀i ∈ V       (3.2b)
         z_i ≤ ∑_{j∈N(i)} (l_j + x_j),       ∀i ∈ V       (3.2c)
         l_i ≤ ∑_{j∈N(i)} x_j,               ∀i ∈ V       (3.2d)
         ∑_{j∈N(i)} l_j ≤ |N(i)|(1 − l_i),   ∀i ∈ V       (3.2e)
         ∑_{i∈V} x_i = 1,                                 (3.2f)
         x_i, l_i, z_i ∈ {0, 1},             ∀i ∈ V       (3.2g)
First of all, constraints (3.2a), (3.2f), and (3.2g) correspond to constraints (3.1a), (3.1g), and (3.1h),
respectively. Constraints (3.2b) guarantee that a node cannot be the center, a leaf, and a neighbor of
the star at the same time, similar to the original constraints (3.1b). Constraints (3.2c) replace
(3.1c) and indicate that a node adjacent to the star must be adjacent to either the center node or at
least one of the leaf nodes. Each leaf node must be connected to the center node to form a
feasible star, which is enforced by constraints (3.2d). With the new variable definition (i.e., l_i), we
eliminate two sets of constraints (that is, (3.1e) and (3.1f)) and no longer need to account for all edges
in the graph. Constraints (3.2e) state that if a node is selected as a leaf, none of the nodes adjacent
to it can also be a leaf node. Note that there is a total of 4n + 1 constraints in [NIP].
Further, we can examine the number of total non-zero coefficients across each type of constraint:
(3.2b) has 3n; (3.2c) has n + 4m; (3.2d) has n + 2m; (3.2e) has n + 2m; and (3.2f) has n. These sum
to a total of 7n + 8m non-zero coefficients.
We now examine the tightness of the linear programming (LP) relaxations of these two
formulations.
Theorem 1. The LP relaxation of [VCIP] is stronger than the LP relaxation of [NIP].
Proof. Given two LP formulations LP_i and LP_j, let P_i and P_j be the polyhedra defined by LP_i
and LP_j, respectively. LP_j is said to be stronger than LP_i if i) there exists at least one instance
and one point contained in P_i but not in P_j, and ii) all points contained in P_j are also contained
in P_i.

First of all, note that constraints (3.1g) and (3.2f) are equivalent and do not need an explicit
comparison. Now, let l_i = y_i − x_i, ∀i ∈ V, be the mapping between the variables of LP_[VCIP]
and LP_[NIP]. When replacing each l_i by y_i − x_i in LP_[NIP], it is straightforward to see that
constraints (3.1b) and (3.1c) imply constraints (3.2b) and (3.2c), respectively. When we replace y_i
by l_i + x_i in constraints (3.1d), they imply constraints (3.2d), since

    y_i = l_i + x_i ≤ ∑_{j∈N[i]} x_j  ⟹  l_i ≤ −x_i + ∑_{j∈N[i]} x_j = ∑_{j∈N(i)} x_j.

In addition, constraints (3.1e) imply the non-negativity of the variables l_i, due to the fact that
x_i ≤ y_i ⟹ 0 ≤ y_i − x_i ⟹ 0 ≤ l_i. If we rewrite constraints (3.1f) using the mapping, we obtain
l_i + l_j ≤ 1, ∀(i, j) ∈ E. For a given node i, we then write out these constraints explicitly and
aggregate them:

    (l_i + l_{j_1}) + ··· + (l_i + l_{j_|N(i)|}) ≤ |N(i)|  ⟹  ∑_{j∈N(i)} l_j ≤ |N(i)|(1 − l_i).

Hence, constraints (3.1f) imply constraints (3.2e) after a slight modification. There-
fore, we can conclude that all points contained in the polyhedron generated by LP_[VCIP] are also
contained in the polyhedron generated by LP_[NIP]; in other words, OBJ_{LP[VCIP]} ≤ OBJ_{LP[NIP]}.
Below we present a counterexample in which an optimal solution produced by LP_[NIP] cannot be
converted into a feasible solution of LP_[VCIP] (see Fig. 3.1).

For this example, LP_[NIP] sets x_3, x_4, and x_5 to 0.2, 0.2, and 0.6, respectively, while the leaf
variables of the same nodes (i.e., l_i) are set to 1 − x_i for i = 3, 4, 5 in an optimal solution.
As a result, the objective value becomes nine. On the other hand, since nodes 3 and 4 share an
edge, the same solution becomes infeasible in LP_[VCIP] due to constraints (3.1f) (i.e., 1.6 > 1.4). The
Figure 3.1: A counterexample where the optimal solution obtained in LP_[NIP] cannot be converted into a feasible solution in LP_[VCIP].
solver returns 8.5 as the optimal solution in LP_[VCIP]. Hence, we can conclude that [VCIP] is a tighter
formulation than [NIP] with respect to LP relaxations.
Even though [VCIP] is a stronger formulation than [NIP] in terms of the LP relaxation, we
observe that while the constraint set of [VCIP] is bounded by O(n + m), the new formulation
[NIP] has a constraint set bounded by O(n). Furthermore, the number of non-zero
coefficients is slightly higher in [VCIP] (i.e., 8n + 8m) than in [NIP] (i.e., 7n + 8m). It is worth
mentioning that the number of non-zero coefficients can be reduced further with a constraint tightening in
[NIP], which is discussed in Section 3.4.1. All of these factors may impact the computational per-
formance of solving these problems. This is further examined in Section 3.5, where we demonstrate
that [NIP] is the foundation for more efficient methods to solve the problem.
3.2 Complexity Discussion
The SDC problem over general graphs was shown to be NP-complete by Vogiatzis and Camur
(2019). In this section, we provide graph classes on which the SDC problem can be solved in
polynomial time and prove that it remains NP-complete on certain other networks.
3.2.1 Polynomial-Time Cases
Theorem 2. The SDC problem is solvable in polynomial time on trees.
Proof. We propose Algorithm 1, which identifies the optimal induced star with the maximum size
neighborhood in O(m) time for a tree. For the sake of simplicity, we assume that the given graph is
connected and n ≥ 3. The algorithm goes through each edge (i, j) ∈ E and determines whether an
adjacent node is considered a leaf node or a neighbor node. For a given edge (i, j), there exist three
cases, considering each node as a center of a star.

1. If |N(i)| > 1 and |N(j)| = 1, then i would be a leaf for a star centered at j, and all nodes
in N(i) \ {j} would serve as the neighbors of that star. In this case, j would be selected as being in
the neighborhood of the star centered at i, since having it as a leaf would result in no additional
neighbors.

2. If |N(i)| = 1 and |N(j)| > 1, then j would be a leaf for a star centered at i, and i would be in
the neighborhood for a star centered at j.

3. If both |N(i)| and |N(j)| are greater than one, then they would each be a leaf for a star centered
at the other. Note that after identifying a node i ∈ V as a leaf, we can directly compute its
contribution to the objective as |N(i)| − 1, due to the fact that the graph is acyclic.

Thus, we can conclude that the problem can be solved efficiently if the given graph is a tree.
Algorithm 1: An algorithm to solve the SDC problem on a tree
  Input: G = (V, E), L, S
   1  L[i] ← ∅, ∀i ∈ V        // L[i]: list of leaf nodes connected to center i
   2  S(i) = 0, ∀i ∈ V        // S(i): number of nodes adjacent to the star whose center is i
   3  for (i, j) ∈ E do
   4      if |N(i)| > 1 and |N(j)| = 1 then
   5          S(i)++;
   6          L[j] ← L[j] ∪ {i};
   7          S(j) = S(j) + |N(i)| − 1;
   8      else if |N(i)| = 1 and |N(j)| > 1 then
   9          L[i] ← L[i] ∪ {j};
  10          S(i) = S(i) + |N(j)| − 1;
  11          S(j)++;
  12      else
  13          L[i] ← L[i] ∪ {j};
  14          S(i) = S(i) + |N(j)| − 1;
  15          L[j] ← L[j] ∪ {i};
  16          S(j) = S(j) + |N(i)| − 1;
  17  i* = arg max_{i∈V} S(i);
  18  return i*, L[i*]
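To make the single pass over the edges concrete, the following is a minimal Python sketch of Algorithm 1; the function name and the edge-list input format are our own illustrative choices, not from the original listing.

```python
from collections import defaultdict

def sdc_on_tree(edges):
    """Solve the SDC problem on a tree (sketch of Algorithm 1).

    edges: list of (i, j) pairs describing a connected tree with n >= 3.
    Returns (best_center, leaves_of_best_star, neighborhood_size).
    """
    adj = defaultdict(set)
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)

    leaves = defaultdict(list)   # L[i]: leaf nodes of the star centered at i
    score = defaultdict(int)     # S(i): size of the open neighborhood of that star
    for i, j in edges:
        if len(adj[i]) > 1 and len(adj[j]) == 1:
            score[i] += 1                      # pendant j joins i's neighborhood
            leaves[j].append(i)                # internal i is a leaf for the star at j
            score[j] += len(adj[i]) - 1
        elif len(adj[i]) == 1 and len(adj[j]) > 1:
            leaves[i].append(j)
            score[i] += len(adj[j]) - 1
            score[j] += 1
        else:                                  # both endpoints are internal nodes
            leaves[i].append(j)
            score[i] += len(adj[j]) - 1
            leaves[j].append(i)
            score[j] += len(adj[i]) - 1

    best = max(score, key=score.get)
    return best, leaves[best], score[best]
```

On a path 1-2-3-4-5, for instance, every internal center yields a neighborhood of size two, matching the case analysis in the proof.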
Definition 6. A graph Wd(k, n), where k ≥ 2 and n ≥ 2, is called a windmill graph; it consists of n
copies of the complete graph Kk sharing a universal vertex.
Proposition 1. Given a windmill graph Wd(k, n), there exists a unique optimal solution solely
containing the universal vertex for the SDC problem.
Proof. By the definition of the windmill graph, there exist n identical complete graphs with k vertices,
each of which is connected to the universal vertex u. A star whose center is u with no selected leaves
has a neighborhood of size |V| − 1 = (k − 1)n. Note that any node selected as a leaf node decreases
the objective by one since all its neighbors are already in the star's neighborhood. For any node
j ∈ V \ {u} as a center, we must have the universal node u as a leaf node in order to gain access
to the nodes j does not have an edge to. If u is not a leaf node, then the maximum neighborhood
would be k − 1 (all nodes incident to j are in the neighborhood). If u is a leaf node, then the
maximum neighborhood is for all nodes besides j and u to be in it, which implies the maximum size
is |V| − 2 < |V| − 1. Hence, the optimal solution is unique and provided by the universal vertex u
with no leaf nodes.
3.2.2 NP-Complete Classes
Vogiatzis and Camur (2019) show that the SDC problem is NP-complete via a reduction
from a well-recognized combinatorial problem, the Maximum Independent Set (MIS) problem. It is widely
known that, by Kőnig's theorem, the MIS can be efficiently determined if the graph is
bipartite. Yet, we show that the SDC problem preserves its complexity even on a bipartite graph.
We first provide the decision versions of the SDC problem and the Set Cover Problem (SCP), via
which we will perform a reduction.

Definition 7. (Star Degree Centrality) Given an undirected graph G = (V, E) and an integer
ℓ, does there exist a node i and an induced star C centered at i such that |N(C)| ≥ ℓ?

Definition 8. (Set Cover) Given a set of elements U = {u1, u2, . . . , un} (i.e., the universe), a
collection of subsets S = {S1, S2, . . . , Sm} where ∪_{i=1}^{m} Si = U, and an integer k, does there exist a
set I ⊆ S such that |I| ≤ k and ∪_{i∈I} Si = U?
Theorem 3. The SDC problem is NP-complete on bipartite graphs.
Proof. Given a potential induced star centered at node i, we must verify whether any two leaf nodes share
an edge in order to confirm that it is truly an induced star. One can then verify whether |N(C)| ≥ ℓ easily. This shows
that the SDC problem is in NP if the graph is bipartite.

Now, let ⟨U, S, k⟩ be an instance of the SCP where k represents the number of sets to
cover all the elements in U. We can then construct an instance ⟨G, ℓ⟩ of the SDC problem on a
bipartite graph as follows:

V[G] = V1 ∪ V2 where V1 = {S1, S2, . . . , Sm, d1} and V2 = {u1, u2, . . . , un, d2, d3, d4, . . . , d_{|S|+3}}

E[G] = ∪_{i=1}^{m} ∪_{j∈Si} (Si, uj) ∪ ∪_{i=1}^{m} (d2, Si) ∪ (d1, d2) ∪ ∪_{i=3}^{|S|+3} (d1, di).

The proposed construction (see Fig. 3.2) can be explained as follows. Each set Si ∈ S and each
element ui ∈ U is considered a node in V1 and V2, respectively. Then, we add edges between each
set and all elements contained in the set. A dummy node d2 is placed in V2 and is connected with
each Si ∈ V1. Another dummy node d1 is added into V1 and is connected to d2. Finally, we add
|S| + 1 dummy nodes into V2, each of which shares an edge with d1. After this configuration, we
obtain a bipartite graph. Lastly, we set ℓ = 2|S| + |U| − k + 1. We examine the potential size of the
induced stars centered at five different types of nodes: a set node, an element node, di with i ≥ 3,
d1, and d2, which helps us to show that a particular choice of the star centered at d2 corresponds to
a set cover (if one exists).
Figure 3.2: The transformation of Set Cover ⟨U, S, k⟩ to an instance ⟨G(V, E), ℓ⟩ of Star Degree Centrality.
1. If Si ∈ V1 is the center, then the upper bound (UB) on the size of the potential neighborhood
is (|U| − 1) + (|S| − 1) + 1 = |U| + |S| − 1, since either d1 or d2 can be in the neighborhood and
then all other Sj and uk nodes may be in it.

2. If ui ∈ V2 is the center, then the UB on the size of the potential neighborhood is (|S| − 1) +
(|U| − 1) + 1 = |U| + |S| − 1, since d2 can be in the neighborhood and then all other Sj
and uk nodes may be in it.

3. If a dummy node di where i ≥ 3 is the center, the size of the neighborhood is |S| + 1. Every
dj such that (j ≥ 3 and j ≠ i) and d2 are neighbor nodes, while d1 is a leaf.

4. If dummy node d1 is the center, then the size of the neighborhood is 2|S| + 1 by picking d2 as
a leaf node.

5. If dummy node d2 is the center, then d1 is considered a leaf and |S| + 1 nodes become the
neighbors (i.e., every dj, j ≥ 3). Every Si node can appear as either a leaf or in the star's neighborhood.
Consider a partition of the set nodes into leaves and those in the star's neighborhood.
If there is a leaf set node such that all elements uj in it are covered by other leaf
set nodes, then we can move that set node to the neighborhood of the star and increase its size. If
there is a set node in the neighborhood which contains one or more uj that are not in the star's
neighborhood, then we can move that node to be a leaf and either keep the size the same (if
exactly one uj is uncovered) or increase the size of the neighborhood. This latter point shows
that we can create another star whose neighborhood size is greater than or equal to the size
of our current star. This means that all uj nodes should be in the neighborhood of the star.
Note that if |U| ≤ k in the SCP, then the problem is solvable in polynomial time by verifying
that each element appears in one set. We focus our analysis on situations where |U| − k > 0. Suppose
there is a set cover I such that |I| ≤ k. Consider the star centered at d2 with the set of leaf nodes
being {d1} ∪ {Si : i ∈ I}. From Point 5, we know that all dj, j ≥ 3 are in the neighborhood, all Si′ for
i′ ∉ I are in the neighborhood, and all uj are in the neighborhood since I is a cover. This means
that this star has a size of |S| + 1 + |U| + |S| − |I| ≥ 2|S| + 1 + |U| − k = ℓ. Alternatively, suppose
we have a star whose neighborhood is greater than or equal to ℓ. This star has to be centered at d2
by Points 1-4 above. By Point 5, we know that we can convert this star (if necessary) to one of the
same or greater size in which all uj are in the neighborhood. By accounting for the dummy nodes
dj, j ≥ 3 and the uk nodes, we have that |S| − k or more set nodes must be in the neighborhood.
Note that since all uj are in the neighborhood, this means that the set nodes that are leaves (there
are at most k of these) must cover all the elements. Therefore, there exists a set cover of less than
or equal to k sets.
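As a sanity check on the construction above, the gadget can be generated programmatically. The sketch below uses illustrative node names of our own choosing and the target ℓ = 2|S| + |U| − k + 1 from the proof; it is not part of the original text.

```python
def set_cover_to_sdc(subsets, universe, k):
    """Build the bipartite SDC instance <G, l> of Theorem 3 from a
    set-cover instance <U, S, k>. Node names are illustrative strings."""
    m = len(subsets)
    V1 = [f"S{i}" for i in range(1, m + 1)] + ["d1"]
    V2 = [f"u{j}" for j in sorted(universe)] + [f"d{i}" for i in range(2, m + 4)]
    E = []
    for i, Si in enumerate(subsets, start=1):
        for j in Si:
            E.append((f"S{i}", f"u{j}"))      # set-element edges
        E.append(("d2", f"S{i}"))             # d2 is adjacent to every set node
    E.append(("d1", "d2"))
    for i in range(3, m + 4):                  # |S| + 1 pendant dummies on d1
        E.append(("d1", f"d{i}"))
    ell = 2 * m + len(universe) - k + 1        # target neighborhood size
    return V1, V2, E, ell
```

Every edge runs between V1 and V2, so the resulting graph is bipartite by construction, as the proof requires.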
Definition 9. A graph is called a split graph if its vertices can be partitioned into two sets, where
one induces a clique and the other is an independent set.
Theorem 4. The SDC problem is NP-complete on split graphs.
Proof. We can create a reduction from a set cover instance in the following way:

V[G] = V1 ∪ V2 where V1 = {S1, S2, . . . , Sm, d1} and V2 = {u1, u2, . . . , un, d2, d3, d4, . . . , d_{|S|+3}}

E[G] = ∪_{i=1}^{m} ∪_{j∈Si} (Si, uj) ∪ ∪_{j=1}^{n} ∪_{p=j+1}^{n} (uj, up) ∪ ∪_{i=1}^{m} (d2, Si) ∪ (d1, d2) ∪ ∪_{i=3}^{|S|+3} (d1, di)

Note that we connect all the elements in the universe set with one another so that they induce a clique.
With this construction, following steps similar to those in the proof of Theorem 3, if we solve the SDC
problem, the dummy node d2 would be the center of the star with the largest objective value, implying
that we obtain the solution for the set cover instance. Hence, we conclude that the SDC problem is
NP-complete when a split graph is concerned.
3.3 Solution Methodology
While both proposed models contain 3n binary variables, the numbers of constraints are
O(n + m) and O(n) in [VCIP] and [NIP], respectively. Solving the IP models via a commercial
solver is computationally challenging (see Section 3.5), especially as the graph gets larger and/or
denser. Therefore, we first examine Benders decomposition (Benders (1962)) for both formulations.
We find that the most computationally effective implementation of this decomposition approach is
a branch-and-cut framework that adds violated constraints from the original problem back into the
master problem. We propose to find a feasible induced star in the master problem (MP) and then
check the size of the neighborhood in the subproblem (SP); i.e., the z variables move to the SP in
both formulations. Hence, we are only concerned with optimality cuts.

We split the variables into (x, y) and (x, l) in the first stage for [VCIP] and [NIP], respectively.
This means that we have 5n + 6m non-zero coefficients in the MP for the method using [VCIP] and
3n + 4m for the method based on [NIP]. Given a fixed y or (l, x), we obtain the following SPs by
isolating z in the second stage:
σ^VCIP(y) := max_z Σ_{i∈V} z_i
    s.t.  z_i ≤ 1 − y_i,  ∀i ∈ V
          z_i ≤ Σ_{j∈N(i)} y_j,  ∀i ∈ V
          z ∈ {0, 1}^n

σ^NIP(l, x) := max_z Σ_{i∈V} z_i
    s.t.  z_i ≤ 1 − l_i − x_i,  ∀i ∈ V
          z_i ≤ Σ_{j∈N(i)} (l_j + x_j),  ∀i ∈ V
          z ∈ {0, 1}^n
We first note that the primal SPs represented above are separable over each node, as shown
below. As a result, multiple Benders cuts can be generated at the same time.

σ^VCIP(y) = Σ_{i∈V} σ_i^VCIP(y) := Σ_{i∈V} max_{z_i∈{0,1}} { z_i : z_i ≤ 1 − y_i, z_i ≤ Σ_{j∈N(i)} y_j }

σ^NIP(l, x) = Σ_{i∈V} σ_i^NIP(l, x) := Σ_{i∈V} max_{z_i∈{0,1}} { z_i : z_i ≤ 1 − l_i − x_i, z_i ≤ Σ_{j∈N(i)} (l_j + x_j) }
We refer the reader to Cordeau et al. (2019) for similar Benders frameworks developed
for both large-scale partial set covering and maximal covering problems, where the authors discuss
different ways of generating feasibility cuts (e.g., normalized and facet-defining feasibility cuts). Note
that for our methods, the procedures to generate cuts based on fractional and integer solutions
are the same.

In examining both SPs for integer incumbent solutions y or (l, x), the binary decision
variables z_i are bounded by integer values. Therefore, we can solve these SPs by relaxing the z_i
variables, which will be helpful in deriving Benders cuts for both integer and fractional values of y
and (l, x). Moreover, whenever an incumbent solution is passed to the relaxed SPs, the optimal
solution to these problems is indeed binary, which shows the correctness of the traditional Benders
decomposition method to solve the problem. In particular, we can use LP duality to generate the
Benders cuts.
i. For [VCIP], since 0 ≤ y_i ≤ 1, (1 − y_i) also lies in [0, 1], implying z_i ≤ 1. Further, Σ_{j∈N(i)} y_j is
a non-negative integer. Taking this into consideration together with the fact that we maximize over z_i, we
do not need to explicitly enforce z_i ≥ 0. Hence, we can relax the integrality and non-negativity
requirements on z_i. We obtain:

σ_i^VCIP(y) = max_{z_i} { z_i : z_i ≤ 1 − y_i, z_i ≤ Σ_{j∈N(i)} y_j }

ii. For [NIP], using the same reasoning, (1 − l_i − x_i) also lies in [0, 1], because a node cannot be a
leaf and a center at the same time, implying z_i ≤ 1. The right-hand side (RHS) Σ_{j∈N(i)} (l_j + x_j)
is also a non-negative integer. Hence, we obtain:

σ_i^NIP(l, x) = max_{z_i} { z_i : z_i ≤ 1 − l_i − x_i, z_i ≤ Σ_{j∈N(i)} (l_j + x_j) }
Both MPs guarantee that the corresponding SP is always feasible and bounded. Therefore,
the dual SP (DSP) is also feasible and bounded by strong duality. We create the following DSPs for
each SP introduced above:

Φ_i^VCIP(y) = min_{α_i, β_i ≥ 0} { α_i (1 − y_i) + β_i Σ_{j∈N(i)} y_j : α_i + β_i = 1 }

Φ_i^NIP(l, x) = min_{λ_i, π_i ≥ 0} { λ_i (1 − l_i − x_i) + π_i Σ_{j∈N(i)} (l_j + x_j) : λ_i + π_i = 1 }

As a result, we obtain the following Benders optimality cuts from solution y for [VCIP] and from
solution (x, l) for [NIP]:

μ_i ≤ α_i (1 − y_i) + β_i Σ_{j∈N(i)} y_j,  ∀i ∈ V

μ_i ≤ λ_i (1 − l_i − x_i) + π_i Σ_{j∈N(i)} (l_j + x_j),  ∀i ∈ V
Observe that the feasible regions of the DSPs are independent of the fixed master
variables. In fact, we can solve these problems analytically rather than solving their linear
programs. Let (1 − y_i) and Σ_{j∈N(i)} y_j be represented by Φ_{i,1}^VCIP and Φ_{i,2}^VCIP, respectively. Further,
let (1 − l_i − x_i) and Σ_{j∈N(i)} (l_j + x_j) be represented by Φ_{i,1}^NIP and Φ_{i,2}^NIP, respectively. Without loss
of generality, we only present Algorithm 2, which solves the primal and dual formulations presented
above for [NIP] (i.e., σ_i^NIP and Φ_i^NIP, respectively). Note that models σ_i^VCIP and Φ_i^VCIP can be
solved in the same way. We then show that the algorithm satisfies the LP optimality conditions.
Algorithm 2: Solution of σ_i^NIP and Φ_i^NIP
  Input: i ∈ V, 0 ≤ θ ≤ 1, l, x
   1  if Φ_{i,1}^NIP > 0 then
   2      if Φ_{i,1}^NIP > Φ_{i,2}^NIP then
   3          z_i = Φ_{i,2}^NIP, λ_i = 0, π_i = 1;
   4      else if Φ_{i,1}^NIP < Φ_{i,2}^NIP then
   5          z_i = Φ_{i,1}^NIP, λ_i = 1, π_i = 0;
   6      else
   7          z_i = Φ_{i,1}^NIP, λ_i = θ, π_i = 1 − θ;
   8  else
   9      if Φ_{i,2}^NIP = 0 then
  10          z_i = 0, λ_i = θ, π_i = 1 − θ;
  11      else
  12          z_i = 0, λ_i = 1, π_i = 0;

Proposition 2. The primal and dual variables calculated through Algorithm 2 are optimal solutions.

Proof. First of all, since constraint λ_i + π_i = 1 is satisfied (i.e., tight) for every (λ, π, θ) in all
the assignment cases, the algorithm produces a dual feasible solution for a given solution vector
(l, x). As for the primal problem, we set z_i = 0 for a node i if the RHS of either constraint
in σ_i^NIP(l, x) is zero. On the other hand, if the RHSs of both constraints are positive, then we set
z_i = min{1 − l_i − x_i, Σ_{j∈N(i)} (l_j + x_j)}. Therefore, we also obtain a primal feasible solution.
In addition, the objective values of σ_i^NIP(l, x) and Φ_i^NIP(l, x) are the same (i.e., strong
duality holds). In the case of primal variable z_i = (1 − l_i − x_i), we set the dual variables λ_i and
π_i accordingly to keep the contribution to the dual objective the same. When z_i = Σ_{j∈N(i)} (l_j + x_j),
we set λ_i = 0, π_i = 1, which yields the same objective in Φ_i^NIP(l, x). When z_i = 0, based on the
value of Σ_{j∈N(i)} (l_j + x_j), we keep the contribution of node i to the dual objective at zero by tuning
the dual variables λ_i and π_i accordingly. Therefore, the algorithm produces primal/dual solutions
that satisfy complementary slackness. As a result, the primal and dual variables calculated are
indeed optimal solutions.
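Under our notation above (Φ_{i,1} and Φ_{i,2} for the two RHS values), the case analysis of Algorithm 2 compresses to a few lines. The following Python sketch, with function and variable names of our own choosing, solves the node-i primal and dual subproblems analytically.

```python
def solve_nip_subproblem(i, l, x, adj, theta=1.0):
    """Analytic solution of the node-i Benders subproblem (sketch of Algorithm 2).
    l, x: dicts of (possibly fractional) leaf/center values from the MP.
    adj: adjacency lists. Returns (z_i, lambda_i, pi_i)."""
    phi1 = 1.0 - l[i] - x[i]                    # RHS of z_i <= 1 - l_i - x_i
    phi2 = sum(l[j] + x[j] for j in adj[i])     # RHS of z_i <= neighbor sum
    if phi1 > 0:
        if phi1 > phi2:
            return phi2, 0.0, 1.0               # second constraint is binding
        if phi1 < phi2:
            return phi1, 1.0, 0.0               # first constraint is binding
        return phi1, theta, 1.0 - theta         # tie: any convex combination
    if phi2 == 0:
        return 0.0, theta, 1.0 - theta
    return 0.0, 1.0, 0.0
```

In every branch, λ_i Φ_{i,1} + π_i Φ_{i,2} equals the returned z_i, so strong duality holds by inspection, matching Proposition 2.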
We note that the Benders cut generated through this algorithm carries the same violation
characteristic independent of the value of θ. Ahat et al. (2017) provide a detailed discussion,
including a proof, for an algorithm that solves a Benders SP in a similar fashion. However,
in our problem, setting θ to one of the integral bounds (i.e., 0 or 1) is preferred over fractional values
so as to avoid cuts with fractional coefficients.

Remark 1. In Algorithm 2, setting θ = 1 produces sparser Benders cuts.

In fact, our preliminary results indicated that generating Benders cuts with θ = 1 produces
slightly better results in terms of solution time compared to setting θ to either a fractional value
(e.g., 0.5) or 0.
It is worth observing that setting θ strictly between 0 and 1 yields Benders cuts that are
convex combinations of the original constraints (i.e., Constraints (3.1b)-(3.1c) and (3.2b)-(3.2c) in
[VCIP] and [NIP], respectively) removed to obtain the MPs. This is due to the fact that there exists
a one-to-one correspondence between variables μ_i and z_i. By setting θ to either 0 or 1, the cuts
are the original constraints from the IP models. Therefore, we refer to our decomposition approach
as a general branch-and-cut method and examine common acceleration techniques used in Benders
decomposition.
3.4 Algorithmic Enhancements
In this section, we discuss the acceleration techniques that we utilize to speed up both the
decomposition methods and the direct solution of the IP formulations.
3.4.1 Constraint Tightening
Recall that constraints (3.2e) ensure that no leaf node shares an edge with another leaf.
The constraints also indicate that if a node i is not selected as a leaf, then any node j within its
neighborhood (i.e., j ∈ N(i)) can be a potential leaf. However, it is highly likely that some nodes
within N(i) are connected, which implies that we might determine a better bound on the RHS of
the constraint.

Definition 10. Given a graph G = (V, E), the independence number of G is defined as the cardinality
of the maximum independent set. Formally, it can be stated as α(G) = max{|U| : U ⊆ V, (i, j) ∉ E ∀i, j ∈ U}.

Definition 11. Given a graph G = (V, E) and a set of nodes S ⊆ V, the induced subgraph G[S] is the
graph which contains the nodes in S and all the edges that connect any two nodes contained in S.

Proposition 3. Given a graph G = (V, E), the number of leaves of any star centered at some node
i ∈ V is upper bounded by α(G[N(i)]).

Proof. Considering the constraint that no two leaf nodes are connected in a star, let us answer the following
question: "What is the largest number of nodes that can be selected as leaf nodes within N(i)?"
In fact, this question is equivalent to the MIS problem, which seeks the maximum number of nodes such that
none of them is connected to another in a given graph. Hence, a feasible star centered at node
i cannot have more leaves than the cardinality of the MIS of the induced graph formed by the nodes
within N(i).
Remark 2. For a given graph G = (V, E), the total number of feasible stars can be computed by
enumerating the independent sets in G[N(i)], ∀i ∈ V (see Kleitman and Winston (1982) and Samotij
(2015) for discussions on how to count the number of independent sets).
We can interpret Proposition 3 in another way: in an induced subgraph G, we
cannot select more leaves than α(G). That is why, if one solves the MIS problem for the induced
graph generated by the neighborhood of each node, a good bound for the RHS of Constraints (3.2e)
is obtained. However, the MIS problem cannot be solved efficiently due to its complexity. Yet, for each induced
graph, we can place a bound on the cardinality of the MIS.

For a given network G = (V, E), let I and α(G) be the MIS and the independence number,
respectively. Since I is an independent set, every edge has at least one endpoint in V \ I. The number
of edges between the nodes in I and the nodes in V \ I is bounded above by α(G)(n − α(G)), and the
number of edges among the nodes in V \ I is bounded above by C(n − α(G), 2), the number of such
pairs. Therefore, it can be stated that m ≤ α(G)(n − α(G)) + C(n − α(G), 2). Rearranging
this inequality, one can obtain the following standard UB for α(G), denoted γ(G)
(Schiermeyer, 2019):

α(G) ≤ γ(G) = (1/2) (1 + √((2n − 1)² − 8m))    (3.5)

For every node i, we first form the induced graph G[N(i)]. Then, we calculate the bound
(i.e., γ(G[N(i)])) presented in Inequality (3.5) and rephrase constraints (3.2e) as:

Σ_{j∈N(i)} l_j ≤ γ(G[N(i)]) (1 − l_i),  ∀i ∈ V    (3.6)
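The bound in Inequality (3.5) and the per-node coefficient in (3.6) are cheap to compute. A Python sketch (function names are our own) is:

```python
import math

def gamma_bound(n, m):
    """Upper bound on the independence number from Inequality (3.5):
    gamma(G) = (1 + sqrt((2n - 1)^2 - 8m)) / 2, for a graph with
    n nodes and m edges."""
    return 0.5 * (1 + math.sqrt((2 * n - 1) ** 2 - 8 * m))

def tightened_rhs(i, adj):
    """gamma(G[N(i)]): the coefficient used to tighten (3.2e) into (3.6)."""
    nbrs = set(adj[i])
    # count the edges of the subgraph induced by N(i); each edge seen twice
    m_i = sum(1 for u in nbrs for v in adj[u] if v in nbrs) // 2
    return gamma_bound(len(nbrs), m_i)
```

For instance, on an edgeless induced subgraph with three nodes the bound is 3 (all three can be leaves), while for two adjacent neighbors it is 1, matching the true independence numbers in both cases.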
3.4.2 Upper Bounds
It is important to initially bound the objective function Σ_{i∈V} μ_i in order to obtain high-quality
initial solutions and thereby faster convergence. We first state a few natural UBs on the objective value
through valid inequalities, and then propose a heuristic approach that approximates the
objective value. The very first natural UB on the objective value is n − 1: a star can
have at most n − 1 adjacent nodes, where such a star consists of a single center node. Then, the UB
can be stated as:

Σ_{i∈V} μ_i ≤ n − 1    (3.7)
Another important point is that the objective function (i.e., the size of the neighborhood
of a star) is only affected by the first- and second-degree nodes of the center node. Hence, we can
introduce another UB which changes according to the node selected as center and is calculated as
the total number of first- and second-degree nodes of the center:

Σ_{i∈V} μ_i ≤ Σ_{i∈V} (|N(i)| + |N²(i)|) x_i    (3.8)
Note that once a first-degree node j ∈ N(i) is accepted as a leaf node, the RHS presented
in Inequality (3.8) decreases by one. The key observation is that if node j provides a unique path
to some second-degree node, then it can be considered a leaf node. In this case, we can decrease
|N(i)| + |N²(i)| by one, thereby tightening the RHS. If node j is not a leaf node in a feasible solution,
then its contribution to the objective value would be one, which is bounded above by the contribution
of the second-degree nodes uniquely reached via node j. Hence, it remains a valid bound. Based
on this argument, we propose Algorithm 3, which approximates a bound on the objective value for
every candidate node as the center.

After running Algorithm 3, a new bound δ_i, ∀i ∈ V, which is in practice tighter than the
former ones, is obtained. Then, the following is a valid inequality for the IPs and the MPs of the Benders
decomposition algorithms:

Σ_{i∈V} μ_i ≤ Σ_{i∈V} δ_i x_i    (3.9)
Notice that μ_i replaces z_i in the original formulations, where z_i is a binary variable. Therefore,
the next natural UB is to bound each single μ_i based on the binary restriction. We note that
this one-to-one correspondence between μ_i and z_i also indicates that the Benders cuts generated are
convex combinations of the original constraints removed from the model to obtain a restricted
MP. In other words, our Benders framework can be seen as a cutting-plane algorithm. The upper
bound constraints are:

μ_i ≤ 1,  ∀i ∈ V    (3.10)
Algorithm 3: Bound strengthening at a given star-center i ∈ V
  Input: i ∈ V
   1  δ_i = ρ = 0;
   2  for k ∈ N²(i) do
   3      pred[k] = −1;
   4      visited[k] = 0;
   5  for j ∈ N(i) do
   6      unique[j] = |{(j, k) ∈ E : k ∈ N²(i)}|;
   7      for k ∈ N²(i) do
   8          if (j, k) ∈ E then
   9              if visited[k] = 0 then
  10                  pred[k] = j;
  11              else if visited[k] = 1 then
  12                  unique[j]−−;
  13                  unique[pred[k]]−−;
  14              else
  15                  unique[j]−−;
  16              visited[k]++;
  17  for j ∈ N(i) do
  18      if unique[j] > 0 then
  19          ρ++;
  20  if ρ > 0 then
  21      δ_i = |N(i)| + |N²(i)| − ρ;
  22  else
  23      if |N²(i)| = 0 then
  24          δ_i = |N(i)|;
  25      else
  26          δ_i = |N(i)| + |N²(i)| − 1;
  27  return δ_i
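A condensed Python sketch of the idea behind Algorithm 3 follows. It applies the "unique gateway" test directly for each first-ring node instead of maintaining the pred/visited counters, so it is less efficient than the listing above but easier to read; all names are our own.

```python
def delta_bound(i, adj):
    """Approximate objective bound delta_i for center i (sketch of Algorithm 3).
    A first-ring node j that is the only gateway to some second-ring node can
    safely be made a leaf, lowering the bound |N(i)| + |N2(i)| by one."""
    first = set(adj[i])
    second = {k for j in first for k in adj[j]} - first - {i}
    rho = 0
    for j in first:
        gated = set(adj[j]) & second
        others = first - {j}
        # does j reach a second-ring node that no other first-ring node reaches?
        if any(all(k not in adj[o] for o in others) for k in gated):
            rho += 1
    base = len(first) + len(second)
    if rho > 0:
        return base - rho
    return len(first) if not second else base - 1
```

On the path 1-2-3-4-5 with center 3, both first-ring nodes are unique gateways, so δ_3 = 2 + 2 − 2 = 2, which is exactly the best achievable neighborhood size there.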
Although constraints (3.10) are the tightest UB one can obtain for each individual μ_i, we
emphasize that incorporating this UB increases the solution time and decreases the solution quality
in every single instance of the decomposition implementation. We believe that this is attributed to
the fact that its addition changes the pre-solve and heuristic routines of the solver, and that this tight
UB is simple enough for the solver to identify on its own. Therefore, the benefits of its potential
addition are outweighed by its drawbacks. Note that we could take a similar approach and remove
the binary restriction on z_i in the IP models; however, we observed that the average optimality gap
across instances increases in this situation. Therefore, our discussion remains valid only for the
restricted MPs.
3.4.3 Parameter Tuning
For our decomposition implementation, we switch the MIP emphasis to optimality. Since
finding a feasible star is a relatively easy task, we prefer CPLEX to focus on optimality over feasibility.
Second, the variable selection strategy is changed to strong branching, with which CPLEX puts
more effort into identifying the most favorable branch. Note that strong branching evaluates
each candidate branch to identify the best one in terms of its contribution to the objective value; in certain
scenarios, this operation might be computationally expensive. Last, we set the relaxation induced
neighborhood search (RINS) parameter to 1,000, so that CPLEX applies the RINS heuristic at every 1,000 nodes.
When solving the IPs directly, we prefer the default CPLEX settings, since no consistent improvement
in terms of solution time and/or quality is observed.
3.4.4 Warm-Start
In our experiments, we use the ratio-based greedy approach proposed by Vogiatzis and
Camur (2019) to generate a set of high-quality initial solutions. The heuristic is shown to have an
approximation guarantee of O(Δ_i) for node i, where Δ_i is the degree of node i ∈ V, the
center of a candidate induced star.

The algorithm has two phases and continuously checks the ratio between the possible gain
and loss, in terms of the cardinality of the open neighborhood, of adding a node into a star. In the
first phase, we pick a node with the highest contribution to the objective such that placing the node
into the star does not decrease the contribution of the other candidate leaves. In the second phase,
we look for the node which yields the highest ratio, whose denominator keeps track of the potential loss
that could occur due to the adjacent nodes. For more details about the heuristic and its pseudocode,
we refer the reader to Vogiatzis and Camur (2019).

While the UBs introduced in Section 3.4.2 help the solver to tighten the dual bounds, our
intention with warm-starting is to help with the primal bounds. It is crucial to point out that
we use the valid inequalities (see Sections 3.4.1 and 3.4.2) where applicable for both IP models for a fair
comparison. For the warm-start strategy, we have a set of experiments in Section 3.5.1.1 to see its
impact on each model.
3.5 Experimental Results
All the experiments are conducted using Java and CPLEX 12.8.1 on a laptop with an Intel Core
i7-6500 CPU at 3.10 GHz and 16 GB of RAM. During the implementation of the decomposition
algorithm, we utilize the callback feature to add the Benders cuts as lazy cuts and user-defined
cuts. While Algorithm 3 and the ratio-based heuristic are implemented in Java, the UB (3.5)
introduced in Section 3.4.1 is calculated in R using the igraph library. All data sets and source code
used in our study are available online at https://github.com/mcamur/SDC.
3.5.1 Randomly Generated Instances
We first randomly generate test cases according to three well-known models through igraph
(Igraph, 2020): i) Barabási–Albert (BA) (i.e., scale-free networks), ii) Erdős–Rényi (ER) (i.e.,
random networks), and iii) Watts–Strogatz (WS) (i.e., small-world networks). We consider instances
with n ∈ {500, 600, 700, 800, 900, 1000} regardless of the model type, and each model has its own
parametric settings, which are summarized in Table 3.1.

Table 3.1: Parameter settings

Model | Parameter | Definition
BA    | g         | the number of edges generated at each step
ER    | pr        | probability of adding an edge between two randomly selected nodes
WS    | r         | the rewiring probability
WS    | nei       | the average degree of each node

In the BA model, we consider g in the set {10, 12, 14, 16}. For the ER model, we set pr = i/n,
where i ∈ {10, 20, 30, 40, 50} for 500, 600, and 700 nodes and i ∈ {20, 30, 40, 50, 60} for 800, 900, and 1000
nodes. Finally, in the WS model, r is pulled from the set {0.3, 0.5, 0.7} in every instance,
and nei is in the set {12, 14, 16} for 500, 600, and 700 nodes and {14, 16, 18} for 800, 900, and 1000 nodes.
Overall, the total numbers of instances generated in the BA, ER, and WS models are
24, 30, and 54, respectively.
During our computational studies, we set a time limit of 3,600 seconds, which includes
the time required by Algorithm 3. We first test the impact of warm-start
on each solution technique and then proceed to the full set of analyses conducted on the randomly
generated networks. We present the comparisons between [NIP], [VCIP], [DNIP], and [DVCIP] for
each model, where [DNIP] and [DVCIP] represent the decomposition implementations for the IP
models [NIP] and [VCIP], respectively.
3.5.1.1 Warm-Start Analysis
We examine the impact of warm-start on the randomly generated networks where n ∈
{500, 700, 900}. The main goal is to decide whether the full analysis should be performed with or without
warm-start for each solution technique (i.e., [NIP], [VCIP], [DNIP], and [DVCIP]).
Note that we include the time taken to run the ratio-based greedy approach in each
instance.
Figure 3.3: The impact of warm-start on the solution times in [NIP] in the BA model

Figure 3.4: The impact of warm-start on the optimality gaps in [NIP] in the BA model

Figure 3.5: The impact of warm-start on the solution times in [VCIP] in the BA model

Figure 3.6: The impact of warm-start on the optimality gaps in [VCIP] in the BA model
We compare the solutions obtained with and without warm-start from two different perspectives:
(i) the difference between the solution times when either produces a feasible solution, and (ii)
the difference between the optimality gaps when either produces an optimal solution. We set thresholds
of 30 seconds and 0.5% for (i) and (ii), respectively. If the absolute value of a difference is less
than the corresponding threshold, we do not report that result. Note that a negative difference
in both solution time and optimality gap indicates that warm-start improves the performance
of the solution technique utilized.
In the BA model, we observe that while warm-start improves the solution time of [NIP]
by a considerable amount in three instances out of 12, an inconsistent pattern emerges in terms of
optimality gaps (see Figs. 3.3 and 3.4). Furthermore, [VCIP] does not show a clear trend in either
solution times or optimality gaps, as depicted in Figs. 3.5 and 3.6.

In the ER model, warm-start increases the solution time of [NIP] in only one instance,
by roughly 2,300 seconds (i.e., n = 900, pr = 0.033), and we do not observe any instance where it helps
with the solution time. As for [VCIP], there is no instance with respect to the solution time that
meets our threshold definition of improvement (30 seconds). Furthermore, similar to the BA model,
no consistent pattern appears in terms of optimality gaps in either IP model, as depicted in Figs. 3.7
and 3.8.
Figure 3.7: The impact of warm-start on the optimality gaps in [NIP] in the ER model

Figure 3.8: The impact of warm-start on the optimality gaps in [VCIP] in the ER model
Lastly, in the WS model, we observe that warm-start helps [NIP] with the solution time in
two instances (i.e., n = 500, nei = 12, r = 0.3 and n = 700, nei = 12, r = 0.5) to a great extent, with
a decrease of nearly 3,500 seconds. On the other hand, while [VCIP] shows worse performance in
one instance (n = 500, nei = 12, r = 0.5), with an increase of around 1,200 seconds under warm-start, no
apparent improvement is seen in any of the instances. Similar to the other network models, we cannot
see a distinguishable difference with respect to the optimality gaps in either IP formulation when
warm-starting (see Figs. 3.9 and 3.10). Therefore, it is hard to reach a solid conclusion.

As for the decomposition implementations, we do not observe big changes with respect to
solution times and optimality gaps either, especially in both the BA and ER models in the majority of
the instances. The changes that do occur follow more erratic patterns compared to the
IP models. As an example, Figs. 3.11 and 3.12 illustrate the solution time changes with warm-start
in the WS model in [DNIP] and [DVCIP], respectively.
Figure 3.9: The impact of warm-start on the optimality gaps in [NIP] in the WS model (gap difference plotted against r - nei - n).

Figure 3.10: The impact of warm-start on the optimality gaps in [VCIP] in the WS model (gap difference plotted against r - nei - n).
Figure 3.11: The impact of warm-start on the solution times in [DNIP] in the WS model (time difference in seconds plotted against r - nei - n).

Figure 3.12: The impact of warm-start on the solution times in [DVCIP] in the WS model (time difference in seconds plotted against r - nei - n).
Our results have three main findings: i) the solver has no difficulty improving the primal bounds, which can also be observed in practice when the engine logs are analyzed; ii) warm-start does not improve the solution quality in terms of optimality gaps in many instances; and iii) one cannot reach a sharp conclusion on whether warm-starting the IP models and MPs with an effective heuristic solution works well. As a result, we proceed to the full analysis without using warm-start as an acceleration technique.
3.5.1.2 Full Analysis
In this section, we compare the performance of the solution techniques on all randomly generated networks. If the optimal solution is not obtained within the time limit (TL), we report the optimality gap provided by CPLEX. For each instance, we report: i) the time taken to reach the solution in seconds, ii) the optimality gap returned in %, and iii) the number of branch-and-bound nodes explored by the solver. In addition, we show n, m, the density of the graph, represented by D (i.e., 2m/[n(n − 1)]), and the corresponding parameters (see Table 3.1). Tables 3.3, 3.4, and 3.5 show the results for the BA, ER, and WS models, respectively.
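The density figure D used throughout Tables 3.3-3.5 follows directly from this definition. A minimal sketch (plain Python; the helper name is ours), checked against the first BA instance of Table 3.3 (n = 500, m = 4945):

```python
# Density of a simple undirected graph: D = 2m / (n(n - 1)).
def density(n: int, m: int) -> float:
    """n: number of nodes, m: number of edges."""
    return 2 * m / (n * (n - 1))

# First BA instance in Table 3.3 reports D = 0.04:
print(round(density(500, 4945), 3))  # 0.04
```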
Table 3.2: Summary of results

            BA Model - 24 instances          ER Model - 30 instances          WS Model - 54 instances
            [NIP]  [VCIP] [DNIP] [DVCIP]     [NIP]  [VCIP] [DNIP] [DVCIP]     [NIP]  [VCIP] [DNIP] [DVCIP]
Optimal       10     14     20     19          11     12     14     14          13     21     35     34
Pct (%)       42     58     83     79          37     40     47     47          24     39     65     63
Avg Gap (%) 8.82   7.16   0.44   1.66       12.06  10.97   4.02   4.29       24.71  20.95   2.22   2.61
Best           3      6     12      3           3      4     14      9          12      6     30      6
We start our analysis with a summary of the computational results in Table 3.2. For
each network model, we compare all four methods in terms of: i) the number of instances solved
to optimality, ii) the percentage of instances where optimal solutions were found, iii) the average
optimality gap over all instances, and iv) the number of instances where a method shows the best
performance. Note that the best performance is first identified based on the optimality gaps. If
more than one method reaches the optimal solution for the same instance, then we compare the
solution times.
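This best-performance rule (gaps first, solution times as tie-breaker) can be sketched as follows. The method names come from the text, but the helper and the encoding of TL as 3600 seconds are our illustrative assumptions; the sample values echo one Table 3.3 instance (n = 500, g = 14):

```python
# Rank methods lexicographically: smaller gap wins; among equal gaps
# (e.g., all optimal with gap 0), the smaller solution time wins.
def best_method(results):
    """results: {method: (gap_percent, time_seconds)} for one instance."""
    return min(results, key=lambda m: (results[m][0], results[m][1]))

instance = {"NIP": (11.08, 3600.0),   # TL hit, encoded as 3600 s here
            "VCIP": (0.0, 309.08),
            "DNIP": (0.0, 593.65),
            "DVCIP": (0.0, 569.76)}
print(best_method(instance))  # VCIP
```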
We observe that the decomposition implementations significantly outperform [NIP] and [VCIP]. We note that [VCIP] turns out to be the slightly better IP formulation; however, our analysis indicates that [DNIP] outperforms [DVCIP].
To start, both decomposition algorithms perform considerably well on the BA model, where [DNIP] solves twice as many instances to optimality as [NIP], and [DVCIP] solves roughly a third more than [VCIP]. However, on the ER model, the performance of the two algorithms worsens, yet remains better than that of the IPs: they solve only 14 of the instances, roughly half of the total number of ER instances. It is important to mention that the instances that cannot be solved to optimality are the same for both algorithms, with two exceptions (n = 800, pr = 0.038 and n = 1000, pr = 0.03). Furthermore, it is worth mentioning that there is no instance in either the BA or ER model where an IP model reaches the optimal solution while the decomposition methods do not.
Table 3.3: The computational results for the BA model

n m D g | [NIP]: Time(sec) Gap(%) BBNodes | [VCIP]: Time(sec) Gap(%) BBNodes | [DNIP]: Time(sec) Gap(%) BBNodes | [DVCIP]: Time(sec) Gap(%) BBNodes
500 4945 0.04 10 | 58.16 0 9639 | 19.95 0 3144 | 78.97 0 1678 | 140.22 0 1517
500 5922 0.048 12 | 88.61 0 17839 | 228.01 0 21646 | 133.51 0 2992 | 233.48 0 2482
500 6895 0.055 14 | TL 11.08 238608 | 309.08 0 22597 | 593.65 0 9663 | 569.76 0 4364
500 7864 0.063 16 | TL 9.48 363414 | 3292.34 0 162751 | 1328.52 0 21795 | 1668.44 0 11623
600 5945 0.033 10 | 18.43 0 3910 | 260.31 0 10526 | 89.04 0 2024 | 106.83 0 1053
600 7122 0.034 12 | 1824.26 0 139597 | 310.81 0 16615 | 203.28 0 3361 | 324.7 0 2580
600 8295 0.046 14 | 171.11 0 22949 | 459.01 0 22185 | 641.98 0 8188 | 624.58 0 3950
600 9464 0.053 16 | TL 10.81 169044 | TL 13.21 58488 | 1605.87 0 24178 | 2777.47 0 16317
700 6945 0.028 10 | 141.95 0 13754 | 363.34 0 18020 | 316.04 0 4811 | 169.49 0 1785
700 8322 0.034 12 | 3519.86 0 183841 | 485.29 0 29795 | 700.25 0 8734 | 442.82 0 3670
700 9695 0.04 14 | TL 13.95 131019 | TL 14.50 65264 | 1883.1 0 23709 | 2021.94 0 14561
700 11064 0.045 16 | TL 13.47 148775 | TL 15.82 49246 | TL 2.15 37234 | TL 3.09 18598
800 7945 0.025 10 | 201.33 0 9630 | 51.12 0 4839 | 154.34 0 2390 | 100.9 0 816
800 9522 0.03 12 | 3059.28 0 125518 | TL 17.33 51590 | 818.19 0 7947 | 596.68 0 3782
800 11095 0.035 14 | TL 19.08 102405 | TL 20.75 54750 | 1528.24 0 15288 | 2311.62 0 10275
800 12664 0.04 16 | TL 17.09 111822 | TL 20.13 57051 | TL 1.65 34356 | TL 11.77 8614
900 8945 0.022 10 | 1018.83 0 58500 | 122.62 0 4480 | 275.33 0 3626 | 339.92 0 2156
900 10722 0.027 12 | TL 3.00 135767 | TL 10.80 49816 | 961.72 0 8640 | 1393.53 0 7405
900 12495 0.031 14 | TL 16.41 90017 | 946.41 0 36842 | 1964.96 0 19329 | 2432.43 0 11498
900 14264 0.035 16 | TL 19.04 130576 | TL 17.77 41565 | TL 1.15 29232 | TL 10.71 8375
1000 9945 0.02 10 | TL 20.45 82900 | 589.65 0 21920 | 631.54 0 5953 | 503.15 0 2979
1000 11922 0.024 12 | TL 15.68 80103 | 2993.83 0 62964 | 1596.08 0 16203 | 1925.37 0 10223
1000 13895 0.028 14 | TL 22.97 94927 | TL 23.15 33999 | 2416.05 0 20431 | TL 2.63 11192
1000 15864 0.032 16 | TL 19.12 90223 | TL 18.40 38594 | TL 5.66 20919 | TL 11.65 6166

The lower performance of the decomposition implementations on the ER model relative to the BA model can be explained from two perspectives. First, the average number of edges and the average graph density are 9,656/0.036 in the BA model and 13,016/0.046 in the ER model; in other words, the problem gets harder to solve with more edges and/or a denser graph. Also, for our selected parameters, the density of the ER graphs increases at a faster rate than in the other models. Second, we examine the number of clique inequalities added by the solver. For instance, while the solver generates 184 clique inequalities on average in the BA model in [DNIP], this average drops to 10 in the ER model. For [DVCIP], the solver produces, on average, 2 clique inequalities in the BA model and only 0.8 in the ER model. As a potential future research direction, one might incorporate clique inequalities for each triangle in a cutting-plane manner to test whether doing so strengthens the decomposition implementations.
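As a rough illustration of that future direction, one would first need to enumerate the triangles of the graph before separating the corresponding clique inequalities. A minimal sketch (our own helper, not the dissertation's code; the solver/callback interface is omitted):

```python
# Enumerate triangles of a simple undirected graph given as an edge list.
# For each edge (u, v), every common neighbor w of u and v closes a triangle.
def triangles(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return {tuple(sorted((u, v, w)))
            for u, v in edges
            for w in adj[u] & adj[v]}

print(triangles([(1, 2), (2, 3), (1, 3), (3, 4)]))  # {(1, 2, 3)}
```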
In the WS model, [DNIP] solves nearly three times as many instances as [NIP], while [DVCIP] solves roughly one and a half times as many as [VCIP]. For the instances not solved to optimality, [DNIP] and [DVCIP] give average optimality gaps of 6.30% and 7.05%, respectively. While both decomposition implementations far outperform the corresponding IPs in the majority of the instances with respect to solution status, we observe only two instances where they fail to reach the optimal solution while [VCIP] does (see the instances (n = 1000, nei = 16, r = 0.5) and (n = 1000, nei = 16, r = 0.7) in Table 3.5).
Table 3.4: The computational results for the ER model

n m D pr | [NIP]: Time(sec) Gap(%) BBNodes | [VCIP]: Time(sec) Gap(%) BBNodes | [DNIP]: Time(sec) Gap(%) BBNodes | [DVCIP]: Time(sec) Gap(%) BBNodes
500 2469 0.02 0.02 | 7.98 0 159 | 5.13 0 139 | 0.52 0 0 | 0.77 0 0
500 4999 0.041 0.04 | 52.88 0 1047 | 31.19 0 857 | 21.61 0 231 | 39.86 0 147
500 7537 0.061 0.06 | TL 19.21 82190 | TL 18.81 79125 | 1611.1 0 12939 | 2406.97 0 11377
500 9870 0.08 0.08 | TL 11.13 99443 | TL 11.10 72571 | TL 3.90 22869 | TL 6.15 5690
500 12466 0.1 0.1 | TL 7.71 109527 | TL 7.69 75616 | TL 7.36 10978 | TL 7.30 3426
600 2948 0.017 0.017 | 5.05 0 0 | 7.34 0 0 | 1.01 0 0 | 1.09 0 0
600 6009 0.034 0.033 | 22.81 0 1215 | 34.24 0 971 | 101.22 0 721 | 121.21 0 437
600 8993 0.051 0.05 | TL 26.15 50422 | TL 24.34 89329 | 2056.21 0 11162 | 1578.78 0 5446
600 11967 0.067 0.067 | TL 13.97 113537 | TL 15.06 55589 | TL 7.33 10120 | TL 8.55 4122
600 14993 0.084 0.083 | TL 7.26 115613 | TL 11.78 39343 | TL 7.64 10802 | TL 10.11 3057
700 3483 0.015 0.014 | 11.7 0 57 | 7.12 0 0 | 0.78 0 0 | 1.28 0 0
700 6895 0.029 0.029 | 35.81 0 1064 | 31.77 0 973 | 30.94 0 186 | 54.6 0 265
700 10526 0.044 0.043 | TL 30.30 22403 | TL 33.38 98316 | 3182.75 0 15534 | 2024.99 0 5393
700 13943 0.057 0.057 | TL 18.36 48110 | TL 17.07 33905 | TL 5.80 5886 | TL 6.47 6557
700 17713 0.073 0.071 | TL 11.60 48468 | TL 11.29 26307 | TL 7.36 7383 | TL 9.12 3477
800 7890 0.025 0.025 | 28.81 0 903 | 7.34 0 0 | 7.89 0 50 | 12.11 0 25
800 11969 0.038 0.038 | TL 33.70 102977 | 34.24 0 971 | TL 1.45 10888 | 3440.81 0 6737
800 15859 0.05 0.05 | TL 24.06 62404 | TL 24.34 89329 | TL 9.63 0 | TL 10.44 0
800 20003 0.063 0.063 | TL 15.14 52575 | TL 15.06 55589 | TL 9.12 3269 | TL 8.48 2920
800 19910 0.063 0.075 | TL 14.47 36246 | TL 11.78 39343 | TL 8.21 3842 | TL 7.55 2873
900 9064 0.023 0.022 | 50.42 0 1241 | 39.3 0 1025 | 50.43 0 343 | 60.67 0 202
900 13418 0.034 0.033 | 1285.1 0 14926 | 265.01 0 1737 | 2182.82 0 0 | 1957.7 0 2250
900 17979 0.045 0.044 | TL 29.64 28104 | TL 23.34 21991 | TL 8.47 2585 | TL 8.90 4589
900 22397 0.056 0.056 | TL 19.28 17336 | TL 16.33 9846 | TL 8.84 3976 | TL 8.32 1829
900 22349 0.056 0.067 | TL 15.60 36047 | TL 16.58 4784 | TL 8.60 3742 | TL 8.46 2179
1000 10003 0.021 0.02 | 35.65 0 1408 | 32.41 0 838 | 8.28 0 34 | 8.82 0 28
1000 14926 0.03 0.03 | 164.97 0 5218 | 333.14 0 1918 | 3540.07 0 5114 | TL 3.26 9655
1000 20008 0.041 0.04 | TL 25.81 50235 | TL 33.36 11367 | TL 8.66 2083 | TL 9.69 4013
1000 24896 0.05 0.05 | TL 18.53 30891 | TL 20.61 3325 | TL 9.84 1850 | TL 8.04 2432
1000 25015 0.051 0.06 | TL 19.97 31503 | TL 17.24 3470 | TL 8.43 1846 | TL 7.90 1784

Note that both IP formulations show poorer performance on the WS model than on the other network models. First, we believe that the number of clique inequalities is again a driving factor in reaching the optimal solution, especially in [VCIP]. For example, for the instances solved to optimality by [VCIP], the solver produces 364 clique inequalities on average; this number drops to 30 for the instances that fail to solve to optimality. Further, we expect more feasible stars in the WS model than in the BA and ER models. We believe this is because the small-world nature of the WS model implies that there are many stars with open neighborhoods of similar size, since nodes tend to share common neighbors. This symmetry may cause issues in solving the IP models; one might examine symmetry-breaking techniques during the search process for WS networks in the future. Lastly, since [NIP] is not as tight as [VCIP] (see the proof of Theorem 1 in the online supplement), we believe that the graphs generated by the WS model may be more challenging for [NIP].
Table 3.5: The computational results for the WS Model

n m D nei r | [NIP]: Time(sec) Gap(%) BBNodes | [VCIP]: Time(sec) Gap(%) BBNodes | [DNIP]: Time(sec) Gap(%) BBNodes | [DVCIP]: Time(sec) Gap(%) BBNodes
500 6000 0.049 12 0.3 | TL 9.02 25308 | TL 21.28 91678 | 51.7 0 227 | 24.28 0 105
500 6000 0.049 12 0.5 | TL 23.43 77774 | 2417.97 0 128770 | 574.56 0 4370 | 663.63 0 3293
500 6000 0.049 12 0.7 | TL 20.69 84833 | 2151.62 0 73650 | 232.84 0 1520 | 358.92 0 1173
500 7000 0.057 14 0.3 | TL 28.14 76538 | TL 28.50 72283 | 482.08 0 3171 | 612.67 0 2075
500 7000 0.057 14 0.5 | TL 18.96 106315 | TL 19.36 127357 | 221.26 0 0 | 278.13 0 1355
500 7000 0.057 14 0.7 | TL 20.04 86474 | TL 17.37 86253 | 752.48 0 4378 | 873.46 0 3004
500 8000 0.065 16 0.3 | TL 24.59 195299 | TL 25.43 77650 | 1831.41 0 14462 | 2941.25 0 12349
500 8000 0.065 16 0.5 | TL 18.76 177452 | TL 20.46 78010 | 2915.64 0 21736 | TL 5.16 10613
500 8000 0.065 16 0.7 | TL 20.17 199549 | TL 20.96 89842 | 3462.82 0 24098 | TL 4.97 9936
600 7200 0.041 12 0.3 | 62.47 0 3084 | 63.42 0 1209 | 240.25 0 1472 | 343.41 0 1123
600 7200 0.041 12 0.5 | TL 20.20 62763 | 63.52 0 1149 | 145.35 0 681 | 114.59 0 353
600 7200 0.041 12 0.7 | 37.89 0 2018 | 60.7 0 1153 | 357.76 0 1785 | 355.04 0 1176
600 8400 0.047 14 0.3 | TL 24.59 36714 | TL 34.15 80105 | 114.52 0 390 | 118.75 0 201
600 8400 0.047 14 0.5 | TL 32.60 58742 | TL 30.85 110016 | 2154.81 0 12498 | 3193.15 0 10954
600 8400 0.047 14 0.7 | TL 30.73 31864 | TL 32.09 119002 | 2175.71 0 11386 | 1162.07 0 5026
600 9600 0.054 16 0.3 | TL 36.59 69550 | TL 31.87 67384 | 2617.79 0 13338 | 3161.54 0 11501
600 9600 0.054 16 0.5 | TL 20.88 128234 | TL 19.55 67171 | 1370.85 0 0 | 2021.53 0 0
600 9600 0.054 16 0.7 | TL 21.17 108631 | TL 24.01 61322 | 2842.7 0 14824 | TL 2.39 8895
700 8400 0.035 12 0.3 | 48.16 0 1368 | 79.6 0 1339 | 10.64 0 71 | 41.4 0 173
700 8400 0.035 12 0.5 | TL 20.46 73254 | 93.55 0 1319 | 310.13 0 1105 | 385.78 0 1608
700 8400 0.035 12 0.7 | 43.91 0 2246 | 93.88 0 1314 | 241.96 0 1209 | 327.95 0 795
700 9800 0.041 14 0.3 | TL 36.23 101988 | TL 50.48 86444 | 468.17 0 1091 | 300.63 0 479
700 9800 0.041 14 0.5 | TL 31.33 63199 | 183.8 0 1379 | 795.9 0 2575 | 835.1 0 1843
700 9800 0.041 14 0.7 | TL 25.24 55767 | 125.84 0 1363 | 195.25 0 889 | 513.84 0 1325
700 11200 0.046 16 0.3 | TL 26.22 37692 | TL 27.42 56269 | 105.26 0 202 | 131.25 0 170
700 11200 0.046 16 0.5 | TL 33.45 40710 | TL 30.36 61073 | TL 4.99 10109 | TL 5.57 6050
700 11200 0.046 16 0.7 | TL 29.74 45132 | TL 23.14 57484 | 1399.7 0 0 | 3391.95 0 2902
800 11200 0.036 14 0.3 | 98.88 0 3825 | 286.84 0 1602 | 1306.57 0 3667 | 1412.73 0 3405
800 11200 0.036 14 0.5 | 105.97 0 6467 | 172.33 0 1576 | TL 4.00 7536 | 2785.38 0 8209
800 11200 0.036 14 0.7 | 106.27 0 4124 | 169.39 0 1559 | 1188.81 0 3737 | 1340.02 0 2292
800 12800 0.041 16 0.3 | TL 51.67 72137 | TL 51.78 52088 | TL 2.45 9949 | 2029.77 0 6406
800 12800 0.041 16 0.5 | TL 39.25 36231 | TL 33.31 60910 | TL 3.48 5719 | TL 3.66 7985
800 12800 0.041 16 0.7 | TL 32.89 58367 | TL 34.33 52240 | TL 6.27 9798 | TL 8.36 6806
800 14400 0.046 18 0.3 | TL 41.58 49452 | TL 45.56 42441 | TL 6.88 5977 | TL 7.96 7436
800 14400 0.046 18 0.5 | TL 26.99 80005 | TL 26.96 41281 | TL 4.82 5382 | TL 5.50 7904
800 14400 0.046 18 0.7 | TL 30.80 50056 | TL 25.96 45543 | TL 7.06 4540 | TL 7.90 6258
900 12600 0.032 14 0.3 | 108.51 0 3025 | 280.17 0 1793 | 1591.75 0 3298 | 1441.56 0 1868
900 12600 0.032 14 0.5 | 107.42 0 4075 | 258.75 0 1749 | 1136.86 0 2300 | 999.72 0 3231
900 12600 0.032 14 0.7 | 104.11 0 5085 | 252.56 0 1733 | 1641.69 0 4338 | 1950.74 0 5006
900 14400 0.036 16 0.3 | TL 56.89 90708 | TL 57.94 50475 | TL 3.86 6021 | TL 2.94 9397
900 14400 0.036 16 0.5 | TL 33.72 69199 | TL 35.05 44805 | 1333.74 0 2868 | 1093.58 0 3233
900 14400 0.036 16 0.7 | TL 39.68 66614 | TL 42.36 47839 | TL 5.48 3897 | TL 6.89 7582
900 16200 0.041 18 0.3 | TL 46.37 56945 | TL 47.88 30751 | TL 8.94 4912 | TL 9.60 5537
900 16200 0.041 18 0.5 | TL 34.79 74381 | TL 34.74 29603 | TL 10.20 3297 | TL 10.96 3975
900 16200 0.041 18 0.7 | TL 32.97 51064 | TL 35.54 27628 | TL 5.68 2982 | TL 6.55 5652
1000 14000 0.029 14 0.3 | 127.2 0 2867 | 241.73 0 1978 | 2027.07 0 3646 | 1490.76 0 4950
1000 14000 0.029 14 0.5 | 100.75 0 3070 | 217.52 0 1922 | 1328.25 0 2301 | 894.33 0 2560
1000 14000 0.029 14 0.7 | 84.9 0 1996 | 184.2 0 1781 | 202.7 0 375 | 281.6 0 719
1000 16000 0.033 16 0.3 | TL 77.90 82180 | TL 75.20 35581 | TL 9.31 4730 | TL 10.46 7621
1000 16000 0.033 16 0.5 | TL 50.80 113819 | 505.56 0 1992 | TL 8.35 3539 | TL 8.35 6967
1000 16000 0.033 16 0.7 | TL 39.52 121941 | 317.29 0 1961 | TL 4.19 4891 | TL 5.47 7186
1000 18000 0.037 18 0.3 | TL 48.99 54205 | TL 47.41 23574 | TL 7.15 4069 | TL 9.52 7427
1000 18000 0.037 18 0.5 | TL 43.92 88314 | TL 42.44 18365 | TL 10.69 3309 | TL 11.64 5319
1000 18000 0.037 18 0.7 | TL 32.27 61227 | TL 37.70 23845 | TL 5.87 2756 | TL 7.08 6828

We now look at the cases where both decomposition algorithms reach the optimal solution and compare them in terms of solution time. As shown in Fig. 3.13, [DNIP] outperforms [DVCIP] with respect to solution time, reaching the optimal solution more quickly in 12 instances. As for the ER model, we observe a slightly different trend. For the instances where both methods take more than 1,000 seconds to solve (i.e., four instances), [DVCIP] performs better, outperforming [DNIP] in three of them (see Fig. 3.14). Even though [DNIP] produces a better solution time in more instances overall (i.e., 10 out of 13), [DVCIP] is 75 seconds faster than [DNIP] on average. Lastly, in the WS model, [DNIP] notably outperforms [DVCIP], as depicted in Fig. 3.15, reaching the optimal solution faster in 22 instances out of 32. On average, [DNIP] is 139 seconds faster than [DVCIP].
Figure 3.13: Solution time comparison between [DNIP] and [DVCIP] in the BA model (solution time in seconds plotted against g / n).

Figure 3.14: Solution time comparison between [DNIP] and [DVCIP] in the ER model (solution time in seconds plotted against pr / n).
Figure 3.15: Solution time comparison between [DNIP] and [DVCIP] in the WS model (solution time in seconds plotted against r - nei - n).
Although the new IP formulation [NIP] cannot compete with the formulation [VCIP], the decomposition implementation [DNIP] shows a better performance than [DVCIP] in terms of both solution time and solution quality in more instances. First, as mentioned earlier, the number of constraints is bounded by O(n) in [NIP], and its number of non-zero coefficients is lower than in [VCIP]. Second, the number of non-zero coefficients in [NIP] is further decreased by constraint tightening (Section 3.4.1). Third, when decomposing [NIP], the two constraints causing the increase in the number of non-zero coefficients, constraints (3.2b) and (3.2c), are placed in the SP. In fact, as discussed previously, the MPs of [VCIP] and [NIP] have i) 5n + 6m and 3n + 4m non-zero coefficients, and ii) 2n + m + 1 and 2n + 1 constraints, respectively. All these facts imply that the restricted MP generated via [NIP] is more efficient than the MP generated via [VCIP]. Note that even though Theorem 1 states that [VCIP] is stronger than [NIP] with respect to LP relaxations, we observe that the root-node relaxations turn out to be the same in all randomly generated instances, implying that the size of the formulations likely plays an important role in how quickly they are solved. Lastly, the number of clique inequalities created by the solver in [DNIP] is significantly higher than in [DVCIP] on average in all three network models. Taking all of this into consideration, it makes sense that [DNIP] produces better results than [DVCIP].
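A quick arithmetic check of the quoted MP sizes, with the counts hard-coded from the text (the helper name and the sample n, m, taken at the scale of the largest BA instance, are ours):

```python
# Master-problem sizes quoted in the text:
# [VCIP]: 5n + 6m non-zeros, 2n + m + 1 constraints.
# [NIP]:  3n + 4m non-zeros, 2n + 1 constraints.
def mp_sizes(n, m):
    return {"VCIP": {"nonzeros": 5 * n + 6 * m, "constraints": 2 * n + m + 1},
            "NIP": {"nonzeros": 3 * n + 4 * m, "constraints": 2 * n + 1}}

sizes = mp_sizes(1000, 15864)
# [VCIP] carries 2n + 2m more non-zeros than [NIP]:
print(sizes["VCIP"]["nonzeros"] - sizes["NIP"]["nonzeros"])  # 33728
```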
Figure 3.16: The optimality gap comparisons of [NIP], [VCIP], [DNIP], and [DVCIP] in the BA model (gap plotted against g / n).

Figure 3.17: The optimality gap comparisons of [NIP], [VCIP], [DNIP], and [DVCIP] in the ER model (gap plotted against pr / n).

Figure 3.18: The optimality gap comparisons of [NIP], [VCIP], [DNIP], and [DVCIP] in the WS model (gap plotted against r - nei - n).
Lastly, we compare all four methods in terms of the optimality gaps to solidify our point for the cases where a method cannot reach the optimal solution. Figs. 3.16, 3.17, and 3.18 clearly show that both decomposition implementations perform better than their corresponding IPs. Fig. 3.16 illustrates that [DNIP] is the best method when the graph follows the properties of the BA model: when it cannot reach the optimal solution, its optimality gap does not exceed 5.66%, whereas both IP models return optimality gaps of over 12.5% for the instances shown in Fig. 3.16. The ER model turns out to be the most challenging, with even the decomposition methods struggling to converge to the optimal solution for certain instances (see Fig. 3.17), for the potential reasons discussed earlier. Yet [DNIP] and [DVCIP] never return an optimality gap larger than 9.84% and 10.44%, respectively. As for the WS model, Fig. 3.18 depicts that as the number of nodes grows, both IP models start returning poorer optimality gaps, with few exceptions. On the other hand, both decomposition implementations show a strong performance on the instances with up to 800 nodes; when the number of nodes is 800 or more, the average optimality gaps become 6% and 6.4% in [DNIP] and [DVCIP], respectively, which is still considerably better than solving the IP models directly.
3.5.2 Protein-Protein Interaction Networks (PPINs)
In this section, we analyze the datasets of two organisms: i) Helicobacter pylori (HP) and ii) Staphylococcus aureus (SA), obtained from Szklarczyk et al. (2015). Each dataset is converted into a PPIN as follows. Each protein is represented by a node, and two nodes are connected by an edge if there exists an interaction between the corresponding proteins. Each interaction is associated with an interaction score defined within the range [0, 1000].

With this configuration, the resulting networks are highly dense graphs with diameter equal to six. The numbers of nodes and edges are (n = 1,570, m = 89,507) and (n = 2,852, m = 146,783) for HP and SA, respectively. Hence, we prune the interactions that fall below a certain threshold. In this study, we set the interaction threshold κ to 600, 500, 400, and 300 for HP and to 500, 400, 300, and 200 for SA. As a result, we obtain four networks per organism. In addition, we increase the time limit to 10,800 seconds (i.e., 3 hours) due to the size of the networks.
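The pruning step can be sketched as follows; `prune` is our illustrative helper (not the code used in the study), and the protein names and scores are made up:

```python
# Keep an interaction edge only if its score meets the threshold kappa.
def prune(edges_with_scores, kappa):
    """edges_with_scores: iterable of (protein_u, protein_v, score in [0, 1000])."""
    return [(u, v) for u, v, s in edges_with_scores if s >= kappa]

raw = [("p1", "p2", 950), ("p1", "p3", 420), ("p2", "p3", 610)]
print(prune(raw, 600))  # [('p1', 'p2'), ('p2', 'p3')]
```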
Table 3.6: The computational results for Helicobacter pylori (n = 1,570)

κ m | [NIP]: Time(sec) Gap(%) BBNodes | [VCIP]: Time(sec) Gap(%) BBNodes | [DNIP]: Time(sec) Gap(%) BBNodes | [DVCIP]: Time(sec) Gap(%) BBNodes
600 17735 | 888.09 0 7617 | 3117.88 0 43721 | 78.19 0 595 | 415.9 0 1522
500 27570 | TL 16.20 59510 | TL 17.28 63273 | 741.73 0 2709 | 3356.8 0 8329
400 33663 | TL 18.32 68789 | TL 21.16 55947 | 9572.51 0 28965 | TL 5.69 11412
300 45123 | TL 15.03 53843 | TL 13.53 36859 | TL 4.32 12271 | TL 6.32 9815
We first share the computational results for HP (see Table 3.6). As κ decreases, the difficulty of solving the problem increases since the graph gets denser. We initially point out that [VCIP] shows the worst performance: when κ = 600, it takes 51 minutes to reach an optimal solution while all other methods converge to optimality in under 15 minutes. In addition, when κ is set to 500 and 400, we obtain the worst optimality gaps with [VCIP]. This is an interesting finding, since [VCIP] showed a marginally better performance than [NIP] on the randomly generated graphs, as discussed in the previous section. On the other hand, [DNIP] outperforms the other three methods, reaching the optimal solution in three instances out of four. Even though no method reaches the optimal solution when κ is 300, [DNIP] provides the best optimality gap (4.32%).
We now share Table 3.7 with the results for SA. Once again, we observe that [VCIP] shows a poorer performance than the others. For instance, when κ is set to 400, even though the three other methods converge to the optimal solution, [VCIP] returns an optimality gap of 16.37%. Similar to the results for HP, [DNIP] produces the best optimality gaps when no method reaches the optimal solution. Yet, even though [DNIP] gives the best optimality gap when κ = 200, the result is not as good as in the other instances (i.e., 18.09%); therefore, it might be better to increase the time limit when κ ≤ 200. Lastly, it is worth mentioning that [NIP] reaches the optimal solution roughly two times faster than both decomposition methods when κ = 400.
Table 3.7: The computational results for Staphylococcus aureus (n = 2,852)

κ m | [NIP]: Time(sec) Gap(%) BBNodes | [VCIP]: Time(sec) Gap(%) BBNodes | [DNIP]: Time(sec) Gap(%) BBNodes | [DVCIP]: Time(sec) Gap(%) BBNodes
500 21549 | 65.18 0 415 | 89.39 0 621 | 94.19 0 0 | 45.34 0 43
400 30276 | 202.13 0 2576 | TL 16.37 29888 | 429.81 0 1557 | 504.22 0 957
300 45645 | TL 37.30 48008 | TL 32.64 36084 | TL 3.40 10957 | TL 13.20 6671
200 87607 | TL 26.54 21999 | TL 27.93 18250 | TL 18.09 5873 | TL 27.99 2687
Our computational results on the real-world PPINs indicate that [DNIP] is the best method among the four, reaching the optimal solution for most of the instances for both organisms tested (i.e., a 75% and 50% success rate for HP and SA, respectively). On the other hand, the new IP formulation shows a better performance than the existing formulation in the literature, in contrast to the observation made in the previous section. We interpret this from two points of view: i) [NIP] might be more effective on larger and denser graphs, and/or ii) [NIP] works better specifically on PPINs, which carry different characteristics (e.g., following different probability distributions) than the well-known network models.
3.6 Conclusion
In this chapter, we first introduce a new IP formulation for the SDC problem, where the goal is to identify the induced star with the largest open neighborhood. We then show that while the SDC problem can be efficiently solved on tree graphs, it remains NP-complete on bipartite and split graphs via a reduction from the set cover problem. In addition, we implement a decomposition algorithm inspired by Benders decomposition, together with several acceleration techniques, for both the new IP formulation and the existing formulation in the literature. Finally, we share extensive computational results on three well-known network models (Barabási–Albert, Erdős–Rényi, and Watts–Strogatz) and on large-scale PPINs generated for two organisms (Helicobacter pylori and Staphylococcus aureus).
Our findings include: i) the existing formulation performs better with respect to solution time and solution quality when solving the IP models via a branch-and-cut process on randomly generated graphs; ii) the new formulation starts showing its effectiveness on real networks as size and density increase; iii) the decomposition approaches significantly outperform both IP models on every network model; and iv) the decomposition approach based on the new IP model is a more effective decomposition framework than the one designed around the previously proposed IP model.
In the future, it might be interesting to investigate the weighted SDC problem and analyze
the impact of the weights on the identification of the essential proteins, rather than employing
thresholds to cut off less frequent protein-protein interactions. In addition, from an algorithmic
perspective, it could be a good direction to accelerate the decomposition implementations by: i)
working on determining new valid inequalities and ii) incorporating clique inequalities especially for
triangles.
Chapter 4
The Stochastic Pseudo-Star Degree
Centrality
We show that the SPSDC problem is NP-complete on general graphs, trees, and windmill graphs in Section 4.1. Next, we introduce a non-linear binary optimization model for the SPSDC problem and convert it into a linear form via McCormick inequalities (see Section 4.2). Section 4.3 discusses the solution methodology that we adopt (i.e., Benders decomposition), and we present the algorithmic enhancements in Section 4.4. We focus on the data generation phase and provide a wide range of computational experiments in Section 4.5. Lastly, we summarize our contributions and share our insights for future research in Section 4.6.
4.1 Complexity Discussion
We first discuss the computational complexity of the problem of detecting the node that serves as the center of the stochastic pseudo-star with the maximum connection probability. Below, we present the decision version of the problem.
Definition 12. (Stochastic Pseudo-Star Degree Centrality) Given an undirected graph G = (V, E), a probability vector ~p, a user-defined value θ, and a positive real number T, does there exist an induced pseudo-star S_k centered at some node k such that the total assignment probability is at least T?
We show that the SPSDC problem is NP-complete under the assumption that P ≠ NP.

Theorem 5. The stochastic pseudo-star degree centrality problem is NP-complete.
Proof. Given an instance <G, ℓ> of SDC, let us generate an instance <G, T, ~p, θ> of SPSDC where T = ℓ, ~p = ~1 (the all-ones vector), and θ = 0. With this construction, one can see that the SPSDC problem solves the SDC problem, because (i) no leaf can share an edge according to Ineq. (2.1), and (ii) the objective function becomes the maximization of the number of nodes in the open neighborhood. Hence, the proof is relatively straightforward, and we can conclude that the problem at hand is NP-complete.
In Chapter 3, the SDC problem is shown to be solvable in polynomial time on trees via an algorithm running in O(m). However, we show that the SPSDC problem remains NP-complete even when the given graph is a tree, via a reduction from the knapsack problem.
Definition 13. (Knapsack) Given a set of q items (I = {1, 2, ..., q}) with sizes s_1, s_2, ..., s_q and values v_1, v_2, ..., v_q, a capacity C, and a value V, does there exist a subset K ⊆ {1, 2, ..., q} such that the total size of the subset is less than or equal to C and its total value is greater than or equal to V?
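For concreteness, the decision version above can be checked by the textbook dynamic program over capacities (integer sizes assumed; this sketch is ours and is not part of the reduction that follows):

```python
# Decide the knapsack question: is there a subset with total size <= C
# and total value >= V?
def knapsack_decision(sizes, values, C, V):
    best = [0] * (C + 1)  # best[c] = max value achievable with total size <= c
    for s, v in zip(sizes, values):
        for c in range(C, s - 1, -1):  # iterate downward so each item is used once
            best[c] = max(best[c], best[c - s] + v)
    return best[C] >= V

print(knapsack_decision([2, 3, 4], [3, 4, 6], 5, 7))  # True: items of sizes 2 and 3
```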
Theorem 6. The stochastic pseudo-star degree centrality problem is NP-complete on trees.
Proof. Given an instance < ~s, ~v, C, V > of the knapsack problem, we create an instance < G(V,E), ℓ, ~p, θ > of the SPSDC problem. The underlying graph G(V,E) is a tree whose nodes and edges are defined as follows (see Fig. 4.1 for a visual representation of the network):

V = {d_1, d_2, d_3, d_4} ∪ {d_2^i, d_3^i, d_4^i : i ∈ I} ∪ {1, 2, ..., q} ∪ {1′, 2′, ..., q′} ∪ {1′′, 2′′, ..., q′′}

E = {(d_1, i) : i ∈ I} ∪ {(d_1, d_2), (d_1, d_3), (d_1, d_4)} ∪ {(d_j, d_j^i) : j = 2, 3, 4, i ∈ I} ∪ {(i, i′) : i ∈ I} ∪ {(i, i′′) : i ∈ I},

where each i ∈ I represents an item and the remaining nodes are considered dummy nodes, 5q + 3 in total. For the sake of simplicity, let us assume that q ≥ 2 and v_i ∈ Z^+, which does not change the complexity of the knapsack problem. We set

ℓ = V/v_max + 3q + Σ_{j∈I} e^(−s_j/C),

where v_max := max{v_i : i ∈ I}.
Let us consider the SPSDC problem for a given node d on a tree, which helps us define both ~p and θ. Since trees are acyclic graphs, we are not concerned with connections between leaf nodes. In such a scenario, any node j adjacent to d can contribute to the objective in two different ways: if it is a leaf node, its contribution is Σ_{k∈N(j): k≠d} p_kj; otherwise, it is p_dj. Moreover, the only constraint that must be satisfied is the feasibility condition, defined as

Σ_{j∈N(d)} p_dj ≥ (1 − θ).

We are now ready to determine the probability values used in network G. First, we assign p_{d_1 i} = e^(−s_i/C) for all i ∈ I. We then set p_{i i′} = v_i/v_max and p_{i i′′} = e^(−s_i/C) for all i ∈ I. All remaining probability values are set equal to one. Lastly, we set θ = (1 − 1/e). Our transformation is presented in Fig. 4.1, where edges are labeled with their probability values.
Figure 4.1: The transformation of Knapsack $\langle \vec{s}, \vec{v}, C, V \rangle$ to an instance $\langle G(V,E), \ell, \vec{p}, \theta \rangle$ of Stochastic Pseudo-Star Degree Centrality on a tree.
We examine the problem in order to demonstrate that the induced pseudo-star with the largest pseudo-star degree centrality is centered at $d_1$. We consider each possible center:

i. If dummy node $d_1$ is selected as the center, then $d_2$, $d_3$, and $d_4$ are directly selected as leaf nodes, which adds $3q$ to the objective. The other candidate leaf nodes are in set $I$. Recall that for any node $i \in I$, the objective value is guaranteed to increase by at least $e^{-s_i/C}$ regardless of whether $i$ is a leaf or a neighbor node. Thus, one can see that the objective is bounded below by $3q$.
ii. If node $d_j$ is selected as the center, where $j = 2, 3, 4$, then the pseudo-star centered at $d_j$ selects $d_1$ as the only leaf, since adding another node into the star would only decrease the objective value. Such a pseudo-star clearly satisfies the feasibility condition, since $1 > (1 - \theta)$, and the pseudo-star degree centrality becomes $(q + 2) + \sum_{i \in I} e^{-s_i/C} < 2q + 2$.

iii. If node $d_i^j$ is selected as the center node, where $j = 2, 3, 4$ and $i \in I$, then the corresponding hub $d_j$ is the only possible leaf node, and the pseudo-star degree centrality would be $q$.

iv. If node $i \in I$ is selected as the center node, then the objective can become at most $\frac{v_i}{v_{\max}} + 3 + e^{-s_i/C} + \sum_{j \in I: j \ne i} e^{-s_j/C} < q + 4$.

v. If node $i'$, where $i \in I$, is selected as the center node, then there are two possibilities: the objective is either $\frac{v_i}{v_{\max}}$ or $2e^{-s_i/C}$. As a result, the objective is bounded above by two.

vi. If node $i''$, where $i \in I$, is selected as the center node, then, similar to the previous point, there are two possibilities: the objective is either $e^{-s_i/C}$ or $e^{-s_i/C} + \frac{v_i}{v_{\max}}$. Thus, the objective is bounded above by two.

The points above indicate that the best pseudo-star is centered at $d_1$; thus, we continue our analysis using $d_1$ as our basis.
Now, suppose there exists a feasible knapsack instance $K$ such that the total value is greater than or equal to $V$. We argue that there is a feasible pseudo-star with centrality greater than or equal to $\ell$. Consider the pseudo-star centered at $d_1$ with leaf nodes $d_2, d_3, d_4$ and $j \in K$. First, we examine the feasibility condition of the star:

$\prod_{(d_1,j): j \in K} e^{-s_j/C} \ge 1 - \left(1 - \tfrac{1}{e}\right) \;\xRightarrow{\text{take the log of both sides}}\; \sum_{(d_1,j): j \in K} \log\left(e^{-s_j/C}\right) \ge \log\left(\tfrac{1}{e}\right)$

We then examine the generic knapsack capacity constraint below:

$\sum_{j \in K} s_j \le C \;\xRightarrow{\text{divide by } C}\; \sum_{j \in K} \frac{s_j}{C} \le 1 \;\xRightarrow{\text{raise } e \text{ to the value on each side}}\; e^{\sum_{j \in K} s_j/C} = \prod_{j \in K} e^{s_j/C} \le e \;\xRightarrow{\text{take the log}}\; \sum_{j \in K} \log\left(e^{s_j/C}\right) \le \log(e) \;\xRightarrow{\text{multiply by } -1}\; \sum_{j \in K} \log\left(e^{-s_j/C}\right) \ge \log\left(\tfrac{1}{e}\right)$

One can see that the feasibility condition is satisfied since the knapsack feasibility is satisfied. For each $j \in K$, the contribution to the objective is $\frac{v_j}{v_{\max}} + e^{-s_j/C}$, which in total is at least $\frac{V}{v_{\max}} + \sum_{(d_1,j): j \in K} e^{-s_j/C}$. Each $j \notin K$ becomes a neighbor node covered by $d_1$, which increases the objective by $\sum_{j \notin K} e^{-s_j/C}$. Nodes $d_2$, $d_3$, and $d_4$ increase the objective by $3q$, as explained in Point (i). Overall, the pseudo-star centered at $d_1$ produces an objective of $3q + \frac{V}{v_{\max}} + \sum_{j \in I} e^{-s_j/C} \ge \ell$.
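The chain of implications above says that, under the chosen probabilities and $\theta = 1 - \frac{1}{e}$, the knapsack capacity constraint and the pseudo-star feasibility condition coincide. This can be checked numerically; the sketch below uses illustrative item sizes and capacity of our own choosing:

```python
import math

theta = 1.0 - 1.0 / math.e          # as set in the reduction

def knapsack_capacity_ok(sizes, C):
    """Generic knapsack capacity constraint: total size at most C."""
    return sum(sizes) <= C

def pseudo_star_feasible(sizes, C, theta):
    """Feasibility condition of the tree instance: the product of the
    center-to-leaf probabilities e^{-s_j / C} must be at least 1 - theta."""
    return math.prod(math.exp(-s / C) for s in sizes) >= 1.0 - theta

C = 10.0
feasible_set = [2.0, 3.0, 4.0]      # total size 9  <= C
infeasible_set = [5.0, 4.0, 3.0]    # total size 12 >  C

assert knapsack_capacity_ok(feasible_set, C) and pseudo_star_feasible(feasible_set, C, theta)
assert not knapsack_capacity_ok(infeasible_set, C) and not pseudo_star_feasible(infeasible_set, C, theta)
```

Both checks agree because $\sum_j s_j \le C \iff \prod_j e^{-s_j/C} \ge e^{-1} = 1 - \theta$, which is exactly the derivation above.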
Alternatively, suppose we have a pseudo-star that yields an objective greater than or equal to $\ell$. This pseudo-star must be centered at $d_1$ by Points (ii)-(vi). We can extract a knapsack solution by isolating the leaf nodes selected in set $I$. The leaf nodes selected satisfy the feasibility condition; in other words, the knapsack capacity constraint is satisfied, which can be verified by working the steps above "backwards".

We know that nodes $d_2$, $d_3$, and $d_4$ contribute $3q$ to the objective, implying that the rest of the objective value is contributed by nodes in $I$, each being either a leaf node or a neighbor node. Let $L$ be the set of leaf nodes selected from $\{1, 2, \ldots, q\}$. Then the objective increases by $\sum_{j \in L} e^{-s_j/C} + \sum_{j \notin L} e^{-s_j/C} = \sum_{j \in I} e^{-s_j/C}$, whose value is always the same and completely independent of which or how many nodes are in $L$. In addition, for each $j \in L$, node $j$ adds $\frac{v_j}{v_{\max}}$ to the objective, and this total contribution must be at least $\frac{V}{v_{\max}}$ since the pseudo-star is assumed to have an objective bounded below by $\ell$. Thus, we can isolate the nodes in $L$ and rescale the summation of the $\frac{v_j}{v_{\max}}$ terms by multiplying it by $v_{\max}$. As a result, we obtain a knapsack solution, based on the nodes in $L$, whose objective is at least $V$.
In addition, Chapter 3 showed that the SDC problem has a trivial unique optimal solution on a windmill graph. However, we present a reduction from the knapsack problem to prove that the SPSDC problem preserves its complexity: the problem remains NP-complete on windmill graphs.
Theorem 7. The stochastic pseudo-star degree centrality problem is NP-complete on windmill graphs.

Proof. Given a knapsack instance, we create an SPSDC instance on a windmill graph whose construction is presented below. We create $q + 1$ cliques of size three, all of which are connected to a universal vertex named $d$ (see Fig. 4.2 for the visualization):

$V = \{d, d_1, d_2, d_3\} \cup \{1, 2, \ldots, q\} \cup \{1', 2', \ldots, q'\} \cup \{1'', 2'', \ldots, q''\}$

$E = \bigcup_{i=1}^{3}\{(d, d_i)\} \cup \bigcup_{i=2}^{3}\{(d_1, d_i)\} \cup \{(d_2, d_3)\} \cup \bigcup_{i=1}^{q}\{(d, i)\} \cup \bigcup_{i=1}^{q}\{(d, i')\} \cup \bigcup_{i=1}^{q}\{(d, i'')\} \cup \bigcup_{i=1}^{q}\{(i, i')\} \cup \bigcup_{i=1}^{q}\{(i, i'')\} \cup \bigcup_{i=1}^{q}\{(i', i'')\}$

Prior to setting the probabilities in our reduction, we first identify $\rho \in (0, 1)$ such that $2q\rho < \min\left\{\frac{1}{v_{\max}}, e^{-s_1/C}, \ldots, e^{-s_q/C}, \frac{1}{e}\right\}$. This helps us ensure that we cannot create a feasible pseudo-star (i) when it is centered at node $d$ with any node $i'$ and/or $i''$ selected as a leaf, and (ii) when it is centered at node $i'$ or $i''$ with $d$ being a leaf node. We then set $\ell = \frac{V}{v_{\max}} + \sum_{i \in I} e^{-s_i/C} + 2$ and $\theta = 1 - \frac{1}{e}$.

Now we assign the probability values. We set $p_{di} = p_{ii''} = e^{-s_i/C}, \forall i \in I$; $p_{ii'} = \frac{v_i}{v_{\max}}, \forall i \in I$; $p_{dd_1} = 1$; and $p_{jd_1} = 1, \forall j \in N(d_1)$. Lastly, $\rho$ is assigned to the rest of the edges, as depicted in Fig. 4.2. Let us examine the potential objective values for pseudo-stars centered at each node.
Figure 4.2: The transformation of Knapsack $\langle \vec{s}, \vec{v}, C, V \rangle$ to an instance $\langle G(V,E), \ell, \vec{p}, \theta \rangle$ of Stochastic Pseudo-Star Degree Centrality on a windmill graph.
i. If node $d$ is selected as the center of the pseudo-star, then node $d_1$ is selected as a leaf node, which does not strain the feasibility condition and increases the objective by two. Then, node $i \in I$ in each clique becomes the only possible node to be selected as a leaf. In this scenario, the objective is bounded below by $\sum_{i \in I} e^{-s_i/C} + 2q\rho + 2 + \min_{i \in I} \frac{v_i}{v_{\max}}$.

ii. If node $i \in I$ is the center node, then node $d$ is preferred as a leaf node over nodes $i'$ and $i''$ in order to have access to the rest of the network. The objective value can be at most $\sum_{i \in I} e^{-s_i/C} + 2q\rho + \frac{v_i}{v_{\max}}$.

iii. If the pseudo-star is centered at node $d_1$, then node $d$ becomes the only leaf node, and the objective becomes $\sum_{i \in I} e^{-s_i/C} + 2q\rho + 2$.

iv. A pseudo-star centered at any other node cannot have $d$ as a leaf and thus cannot compete with the other pseudo-stars in terms of the objective value.

Our discussion above shows that the pseudo-star with the largest SPSDC is obtained when it is centered at node $d$. Also, note that both the pseudo-star and knapsack instances obtained in each direction of the proof must satisfy the feasibility conditions for the same reasons presented in Theorem 6.
Suppose we are given a knapsack instance $K$ whose total value is greater than or equal to $V$. Let us examine the pseudo-star centered at $d$ with leaf nodes $d_1$ and $j \in K$. While $d_1$ contributes two to the objective, each node $j \in K$ produces an objective of $\frac{v_j}{v_{\max}} + e^{-s_j/C}$, as discussed in Theorem 6. Hence, we obtain a pseudo-star whose objective is at least $\ell$.

Now, suppose we have a pseudo-star instance with an objective of at least $\ell$. Such a pseudo-star must be centered at node $d$ due to Points (ii)-(iv) and the selection of $\rho$, where $\rho < \frac{1}{v_{\max}}$. The pseudo-star cannot have any node $i'$, $i''$, $d_2$, or $d_3$ as a leaf, since $\rho$ is guaranteed to be less than $1 - \theta$. Let us now show how the objective value is calculated by considering the rest of the nodes as candidate leaves. The objective value increases by $\sum_{(d,i): i \in I} e^{-s_i/C}$ regardless of whether $i \in I$ is a leaf or a neighbor, due to the way we constructed the network. In addition, $d_1$ becomes a leaf, since it increases the objective by two as a leaf while only increasing it by one as a neighbor. Then, the leaf nodes included from $I$ must increase the objective by $\frac{V}{v_{\max}}$ to ensure that the total objective is at least $\ell$. We can then isolate the leaf nodes selected in $I$ and obtain a knapsack instance whose objective is at least $V$, similar to the previous proof.
4.2 Mathematical Formulation

In this section, we propose an optimization model to solve the SPSDC problem that extends the improved formulation proposed for the SDC problem in Chapter 3. The model contains three sets of binary variables: (i) $x_i$ is 1 if node $i$ is selected as the center and 0 otherwise; (ii) $y_i$ is 1 if node $i$ is selected as a leaf node and 0 otherwise; and (iii) $z_{ij}$ is 1 if pseudo-star element $i$ covers node $j$ in the pseudo-star's open neighborhood and 0 otherwise. The formulation is:

IP:
$\max \sum_{(i,j) \in E} p_{ij} z_{ij}$  (4.1a)
s.t. $x_i + y_i + \sum_{j \in N(i)} z_{ji} \le 1 \quad \forall i \in V$  (4.1b)
$z_{ij} \le x_i + y_i \quad \forall (i,j) \in E$  (4.1c)
$y_i \le \sum_{j \in N(i)} x_j \quad \forall i \in V$  (4.1d)
$\sum_{i \in V} x_i = 1$  (4.1e)
$\sum_{(i,j) \in E} \log(p_{ij}) x_i y_j + \sum_{i<j:(i,j) \in E} \log(1 - p_{ij}) y_i y_j \ge \log(1 - \theta)$  (4.1f)
$x_i, y_i \in \{0, 1\} \quad \forall i \in V$  (4.1g)
$z_{ij} \in \{0, 1\} \quad \forall (i,j) \in E$  (4.1h)
The objective function (4.1a) maximizes the total probability of neighborhood assignments. Constraints (4.1b) indicate that a node $i$ can be either (i) the center, (ii) a leaf, or (iii) selected in the open neighborhood and assigned to a node $j$ that has an edge into $i$. For case (iii) to hold, node $j$ has to be connected to the center or a leaf, which is guaranteed by Constraints (4.1c). Note that case (iii) ensures the unique assignment of a neighbor node to the pseudo-star. While Constraints (4.1d) make sure that each leaf node is connected to the center node, Constraint (4.1e) forces the model to select a single pseudo-star. Constraint (4.1f) states that the selected pseudo-star satisfies the feasibility condition. Lastly, Constraints (4.1g)-(4.1h) enforce the binary conditions on the variables.

The proposed model is a nonlinear binary optimization problem in which the numbers of variables and constraints are both $O(m)$. Thus, it remains a challenging problem to solve even if the given graph is small and sparse. However, we can linearize Constraint (4.1f) with the well-known McCormick inequalities. We introduce variables representing the products of binary variables as follows: $a_{ij} = x_i y_j, \forall (i,j) \in E$, and $b_{ij} = y_i y_j, \forall i < j : (i,j) \in E$. We then obtain the following linear model, which is equivalent to IP.
LIP:
$\max$ (4.1a)  (4.2a)
s.t. (4.1b)-(4.1e), (4.1g)-(4.1h)
$\sum_{(i,j) \in E} \log(p_{ij}) a_{ij} + \sum_{i<j:(i,j) \in E} \log(1 - p_{ij}) b_{ij} \ge \log(1 - \theta)$  (4.2b)
$a_{ij} \ge x_i + y_j - 1 \quad \forall (i,j) \in E$  (4.2c)
$b_{ij} \ge y_i + y_j - 1 \quad \forall i<j:(i,j) \in E$  (4.2d)
$a_{ij} \in \{0, 1\} \quad \forall (i,j) \in E$  (4.2e)
$b_{ij} \in \{0, 1\} \quad \forall i<j:(i,j) \in E$  (4.2f)
We first note that since $0 \le p_{ij} \le 1$, each $\log(p_{ij})$ value is non-positive. This implies that whenever the $b_{ij}$ and/or $a_{ij}$ variables take a positive value, the left-hand side (LHS) of Constraint (4.2b) decreases. As a result, assigning a positive value to either variable when it is not "necessary" would only strain the feasibility condition and does not impact the objective function. That is why the McCormick upper bound (UB) constraints (e.g., $b_{ij} \le y_i$ and $b_{ij} \le y_j$) are not needed during the linear transformation, and we omit them. Also, we will be using the same $a_{ij}$ and $b_{ij}$ variables whenever McCormick inequalities are introduced for the same transformations. Although we end up with a linear model, the numbers of variables and constraints are still bounded by $O(m)$. We propose a decomposition algorithm to solve the model at scale, which is discussed in detail in the next section.
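The reason the lower-bound inequalities alone suffice can be seen directly: with non-positive coefficients in (4.2b), the model pushes each $a_{ij}$ (and $b_{ij}$) to its smallest feasible value, which for binary arguments is exactly the product. A minimal sketch:

```python
from itertools import product

def mccormick_lb_value(x, y):
    """Smallest a in [0, 1] satisfying the lower-bound inequality
    a >= x + y - 1 (the only McCormick constraint we keep)."""
    return max(0, x + y - 1)

# Because each log(p_ij) coefficient in (4.2b) is non-positive, the model
# drives a_ij down to this smallest feasible value, which for binary
# arguments equals the product x_i * y_j:
for x, y in product((0, 1), repeat=2):
    assert mccormick_lb_value(x, y) == x * y
```

The usual upper-bound pair ($a_{ij} \le x_i$, $a_{ij} \le y_j$) would only be needed if the objective or a constraint rewarded pushing $a_{ij}$ up, which is never the case here.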
4.3 Solution Methodology

We use Benders Decomposition (BD) as our solution methodology. Our method removes constraints (4.1b)-(4.1c) and (4.1f) to design a master problem (MP) whose aim is to identify a candidate pseudo-star. We then obtain two different subproblems (SPs), in which we focus on the feasibility (i.e., Constraint (4.1f)) and the open neighborhood (i.e., Constraints (4.1b)-(4.1c)) components of the problem separately, in that order. This is because the feasibility component has no impact on the objective.

At every candidate solution, we first check the feasibility condition. If the condition does not hold, then we eliminate the current solution via either Benders feasibility cuts or logic-based Benders cuts (LBBCs). If the pseudo-star is feasible, then we proceed to the next SP to check whether an optimality cut that aims to approximate the objective value (i.e., the open neighborhood with the maximum total probability assignment) can be generated. Below we present the MP without the optimality, feasibility, and LBB cuts, where $t$ represents the estimation of the true objective. Benders cuts will be presented shortly and are incorporated into the MP at every iteration as needed.

$\text{MP} = \max_{t} \left\{ t : \text{(4.1d)}, \text{(4.1e)}, \text{(4.1g)}, \; t \le \text{UB} \right\}$
4.3.1 Benders Feasibility Cuts

An important observation is that the McCormick variables (i.e., $a_{ij}$ and $b_{ij}$) used to linearize IP can be relaxed and will still take binary values, for the same reason that we do not include the UB constraints.

Proposition 4. Variables $a_{ij}$ and $b_{ij}$ take binary values when they are relaxed.

Proof. Without loss of generality (WLOG), let us examine $a_{ij}$. If both $x_i$ and $y_j$ take the value of one, then $a_{ij} = 1$ by Constraint (4.2c). If (a) both are zero or (b) either of them is zero, then $a_{ij}$ becomes free. However, since increasing $a_{ij}$ would only decrease the LHS of Constraint (4.2b), the model would not prefer to assign a positive value to $a_{ij}$. Even in a degenerate case where Constraint (4.2b) is satisfied with $a_{ij}$ at a positive value, we can set $a_{ij} = 0$ and obtain the same objective value. As a result, we obtain a binary optimal solution when variable $a_{ij}$ is relaxed.
Proposition 4 enables us to generate traditional Benders feasibility cuts. The second important observation is that the feasibility condition has two components: the connections between the center and the leaf nodes (i.e., $x_i y_j$) and those among the leaf nodes (i.e., $y_i y_j$). Hence, one can generate two different cuts: (i) a local feasibility cut, when the infeasibility involves both the center and leaf nodes (i.e., $\prod_{j \in L} p_{kj} \prod_{i,j \in L : (i,j) \in E} (1 - p_{ij}) < 1 - \theta$), or (ii) a global feasibility cut, when the infeasibility occurs directly within the set of leaf nodes (i.e., $\prod_{i,j \in L : (i,j) \in E} (1 - p_{ij}) < 1 - \theta$).

We refer to the latter as global because it applies to any potential center that could be connected to that set of leaf nodes. We first present the traditional Benders local feasibility problem. Let $\delta$, $\nu_{ij}$, and $\mu_{ij}$ be the penalty variables defined for Constraints (4.2b), (4.2c), and (4.2d), respectively. They approximate how much the current fixed solution must be perturbed to satisfy the subproblem constraints.
BLF:
$\min \delta + \sum_{(i,j) \in E} \nu_{ij} + \sum_{i<j:(i,j) \in E} \mu_{ij}$  (4.3a)
s.t. $\sum_{(i,j) \in E} \log(p_{ij}) a_{ij} + \sum_{i<j:(i,j) \in E} \log(1 - p_{ij}) b_{ij} + \delta \ge \log(1 - \theta)$  (4.3b)
$a_{ij} + \nu_{ij} \ge x_i + y_j - 1 \quad \forall (i,j) \in E$  (4.3c)
$b_{ij} + \mu_{ij} \ge y_i + y_j - 1 \quad \forall i<j:(i,j) \in E$  (4.3d)
$a_{ij}, \nu_{ij} \in \mathbb{R}^+ \quad \forall (i,j) \in E$  (4.3e)
$b_{ij}, \mu_{ij} \in \mathbb{R}^+ \quad \forall i<j:(i,j) \in E$  (4.3f)
$\delta \in \mathbb{R}^+$  (4.3g)
We then take the dual of the problem, where dual variables $\pi$, $\tau$, and $\upsilon$ correspond to Constraints (4.3b), (4.3c), and (4.3d), respectively.

DBLF:
$\max \log(1 - \theta)\pi + \sum_{(i,j) \in E} (x_i + y_j - 1)\tau_{ij} + \sum_{i<j:(i,j) \in E} (y_i + y_j - 1)\upsilon_{ij}$  (4.4a)
s.t. $\log(p_{ij})\pi + \tau_{ij} \le 0 \quad \forall (i,j) \in E$  (4.4b)
$\log(1 - p_{ij})\pi + \upsilon_{ij} \le 0 \quad \forall i<j:(i,j) \in E$  (4.4c)
$0 \le \pi \le 1$  (4.4d)
$0 \le \tau_{ij} \le 1 \quad \forall (i,j) \in E$  (4.4e)
$0 \le \upsilon_{ij} \le 1 \quad \forall i<j:(i,j) \in E$  (4.4f)
The following is what we call a local Benders feasibility cut, which can be added to the MP to eliminate the infeasible candidate solution:

$\log(1 - \theta)\pi + \sum_{(i,j) \in E} \tau_{ij}(x_i + y_j - 1) + \sum_{i<j:(i,j) \in E} \upsilon_{ij}(y_i + y_j - 1) \le 0$  (4.5)

However, if the infeasibility arises from the selected leaf nodes alone, without taking the center node into consideration, then we solve a smaller LP to obtain a global feasibility cut. WLOG, let us use the same penalty variables (i.e., $\delta$ and $\mu_{ij}$) and define the following feasibility problem.
BGF:
$\min \delta + \sum_{i<j:(i,j) \in E} \mu_{ij}$  (4.6a)
s.t. $\sum_{i<j:(i,j) \in E} \log(1 - p_{ij}) b_{ij} + \delta \ge \log(1 - \theta)$  (4.6b)
$b_{ij} + \mu_{ij} \ge y_i + y_j - 1 \quad \forall i<j:(i,j) \in E$  (4.6c)
$b_{ij}, \mu_{ij} \in \mathbb{R}^+ \quad \forall i<j:(i,j) \in E$  (4.6d)
$\delta \in \mathbb{R}^+$  (4.6e)
WLOG, let variables $\pi$ and $\upsilon$ be the dual variables corresponding to Constraints (4.6b) and (4.6c). The dual of BGF can be presented as follows.

DBGF:
$\max \log(1 - \theta)\pi + \sum_{i<j:(i,j) \in E} (y_i + y_j - 1)\upsilon_{ij}$  (4.7a)
s.t. $\log(1 - p_{ij})\pi + \upsilon_{ij} \le 0 \quad \forall i<j:(i,j) \in E$  (4.7b)
$0 \le \pi \le 1$  (4.7c)
$0 \le \upsilon_{ij} \le 1 \quad \forall i<j:(i,j) \in E$  (4.7d)

In this case, we obtain a tighter feasibility cut than Ineq. (4.5), since it is not tied to any center node. The constraint is:

$\log(1 - \theta)\pi + \sum_{i<j:(i,j) \in E} (y_i + y_j - 1)\upsilon_{ij} \le 0$  (4.8)
Note that both Benders feasibility cuts introduced above are associated with dual solutions, which are mostly fractional. Our preliminary results indicate that such feasibility cuts fail to yield quick convergence even on small-scale instances. Hence, in the following section, we examine LBBCs. In Section 4.5.3, we test both sets of cuts and discuss their impact on the solution time.
4.3.2 Logic-Based Benders Cuts

For a fixed pseudo-star $S_k$ centered at node $k$ with a set of leaf nodes denoted by $L$, if the feasibility condition is not satisfied, then we can generate a generic no-good cut that aims to change the current solution by removing a single leaf node from $S_k$. Note that the cut need not consider adding another leaf node, since adding a new leaf node would only decrease the LHS of (2.2). Hence, we define the following LBBC:

$\sum_{j \in L} y_j \le (|L| - 1)x_k + |L|(1 - x_k)$  (4.9)
Theorem 8. The LBB feasibility cut (4.9) is valid.

Proof. To prove that an LBBC is valid, we show that (i) the constraint cuts off the current master solution, since it is infeasible, and (ii) it does not eliminate a globally feasible solution. We use the same methodology to prove the similar theorems presented in the rest of this chapter.

Note that if node $k$ is selected as the center node (i.e., $x_k = 1$), then the right-hand side (RHS) implies that at least one of the leaf nodes in $L$ of $S_k$ must be turned off, thereby eliminating the current solution. Otherwise, the center node alternates without enforcing any restriction on the nodes in $L$; thus, the infeasible solution is eliminated. As a result, pseudo-star $S_k$ is guaranteed to be removed from consideration.

In the following iterations, when we obtain a new candidate pseudo-star (feasible or not), if it is centered at a node different than $k$, then it is clear that the cut does not eliminate a feasible solution, since the RHS becomes the aggregation of the binary restrictions on the leaf nodes, in other words, a trivial constraint. If $k$ is the candidate center node in an alternative $S'_k$, then since the RHS forces at least one leaf node in $L$ to change, it guarantees that $S'_k \ne S_k$. It also ensures that the only solution removed is $S_k$; hence, no globally feasible solution is removed.
Note that cut (4.9) does not aggressively change the current solution and is not effective in general, because it only targets eliminating $S_k$ rather than identifying the subset of nodes at the "root" of its infeasibility. Thus, we can design an integer SP with a fixed center $k$ to check whether more leaf nodes must be removed for $S_k$ to become feasible:

$\text{WF} := \max_{\vec{y} \in \{0,1\}} \left\{ \sum_{j \in L} y_j : \sum_{j \in L} \log(p_{kj}) x_k y_j + \sum_{i<j:\, i,j \in L} \log(1 - p_{ij}) y_i y_j \ge \log(1 - \theta) \right\}$

Model WF identifies the maximum number of leaf nodes that can be selected from $L$ to obtain a feasible pseudo-star structure via a knapsack-type constraint. Since its nonlinearities come from products of two binary variables, we can again use the McCormick inequalities. We then obtain an equivalent linear formulation:

$\text{LWF} := \max_{\vec{y}, \vec{b} \in \{0,1\}} \left\{ \sum_{j \in L} y_j : \sum_{j \in L} \log(p_{kj}) x_k y_j + \sum_{i<j:\, i,j \in L} \log(1 - p_{ij}) b_{ij} \ge \log(1 - \theta),\; b_{ij} \ge y_i + y_j - 1, \forall i<j:\, i,j \in L \right\}$
Let $\delta^*$ be the optimal objective value of LWF. We then define a new LBBC:

$\sum_{j \in L} y_j \le \delta^* x_k + |L|(1 - x_k)$  (4.10)
Theorem 9. The LBB feasibility cut (4.10) is valid.

Proof. Similar to Theorem 8, the second component of the RHS (i.e., $|L|(1 - x_k)$) guarantees that the current infeasible solution is cut off and that no globally feasible solution is eliminated. Therefore, we examine the non-trivial scenario when $x_k = 1$.

The objective of LWF selects as many leaf nodes as possible while ensuring feasibility through the constraint defined in WF. First, it makes sure that at least one leaf node is removed from the candidate pseudo-star; hence $\delta^* \le |L| - 1$. As a result, the cut removes the current infeasible solution. Second, it implies that any alternative candidate pseudo-star $S'_k$ with more than $\delta^*$ leaf nodes is infeasible. Since the cut only prevents $S'_k$ from having more than $\delta^*$ leaf nodes, and all solutions with at most $\delta^*$ leaf nodes remain available, it does not cut off any globally feasible solution.
This cut is stronger than cut (4.9), since $\delta^* \le |L| - 1$; however, it still depends on the selection of $x_k$ as the center, meaning that once the center node changes, the cut no longer helps. We therefore call (4.10) a local LBBC. The $|L|$ term plays the role of a big-M; hence, the question becomes whether we can further improve cut (4.10).

Based on the same argument as in the previous section, if the infeasibility occurs directly due to the connections between the selected leaf nodes (i.e., $\prod_{i<j:\, i,j \in L} (1 - p_{ij}) < 1 - \theta$), then we can focus on a smaller IP model to obtain a better cut. This leads to a different integer SP, as well as a more general and stronger cut:

$\text{SF} := \max_{\vec{y} \in \{0,1\}} \left\{ \sum_{j \in L} y_j : \sum_{i<j:\, i,j \in L} \log(1 - p_{ij}) y_i y_j \ge \log(1 - \theta) \right\}$
Similar to model WF, we have nonlinear terms and use the McCormick inequalities:

$\text{LSF} := \max_{\vec{y}, \vec{b} \in \{0,1\}} \left\{ \sum_{j \in L} y_j : \sum_{i<j:\, i,j \in L} \log(1 - p_{ij}) b_{ij} \ge \log(1 - \theta),\; b_{ij} \ge y_i + y_j - 1, \forall i<j:\, i,j \in L \right\}$

Let $\Delta^*$ be the optimal objective of LSF. We then define the following LBBC, which does not depend on variable $x$ and is called a global LBBC:

$\sum_{j \in L} y_j \le \Delta^*$  (4.11)
Theorem 10. The LBB feasibility cut (4.11) is valid.

Proof. LSF guarantees that $\Delta^* < |L|$, as a result of which the current infeasible solution is eliminated. The cut also carries the information of how many nodes in $L$ can be selected by any pseudo-star. Note that it does not necessarily guarantee feasibility, since depending on the selection of the center node we can still face a feasibility problem. Yet it ensures that no globally feasible solution is removed, since in any scenario, having more than $\Delta^*$ leaf nodes from $L$ is directly infeasible regardless of which node is selected as the center.

One can see that all LBB feasibility cuts (4.11) could be prepopulated and added into LIP. Yet, since the number of such cuts is bounded by $O(n^2 m)$, it is not practical to incorporate them all in advance, and we instead generate them on the fly in our solution method.
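For small leaf sets, the bound $\Delta^*$ of model SF can be computed by brute force, which may help build intuition for the global LBBC. The function and instance below are our own illustration; the feasibility test is stated in product form, equivalent to the log form above:

```python
import math
from itertools import combinations

def global_lbbc_bound(L, p, theta):
    """Brute-force counterpart of model SF: the largest number of leaves from
    L whose pairwise product of (1 - p_ij) stays at least 1 - theta.
    `p` maps unordered leaf pairs to edge probabilities (missing = no edge)."""
    for r in range(len(L), 0, -1):
        for subset in combinations(L, r):
            prod = math.prod(1.0 - p.get(frozenset(pair), 0.0)
                             for pair in combinations(subset, 2))
            if prod >= 1.0 - theta:
                return r
    return 0

# Hypothetical instance: three mutually adjacent leaves, all p_ij = 0.5.
L = [1, 2, 3]
p = {frozenset(pair): 0.5 for pair in combinations(L, 2)}
theta = 0.6
# Any two leaves give product 0.5 >= 0.4; all three give 0.125 < 0.4.
assert global_lbbc_bound(L, p, theta) == 2
```

Cut (4.11) would then allow at most two of these three leaves in any pseudo-star, regardless of the chosen center, which is exactly the "global" property argued in Theorem 10.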
4.3.3 Optimality Cuts

Once the fixed pseudo-star satisfies the feasibility condition, we proceed to solve an SP to generate an optimality cut. Given a fixed solution $(\vec{x}, \vec{y})$, we define the following primal problem.

$\varphi(x, y)$:
$\max \sum_{(i,j) \in E} p_{ij} z_{ij}$  (4.12a)
s.t. $\sum_{j \in N(i)} z_{ji} \le 1 - x_i - y_i \quad \forall i \in V$  (4.12b)
$z_{ij} \le x_i + y_i \quad \forall (i,j) \in E$  (4.12c)
$z_{ij} \in \{0, 1\} \quad \forall (i,j) \in E$  (4.12d)

Here we can use LP duality to generate the dual formulation by relaxing variables $z_{ij}$. The relaxation of variable $z_{ij}$ produces binary solutions when passing an incumbent solution to $\varphi(x, y)$, because the constraint matrix is totally unimodular. Let $\beta_i$ and $\gamma_{ij}$ be the dual variables corresponding to Constraints (4.12b) and (4.12c), respectively. The dual of $\varphi(x, y)$ is presented as follows.
$\Phi(x, y) := \min_{\beta \ge 0, \gamma \ge 0} \left\{ \sum_{i \in V} (1 - x_i - y_i)\beta_i + \sum_{(i,j) \in E} (x_i + y_i)\gamma_{ij} : \beta_i + \gamma_{ji} \ge p_{ji}, \forall (j,i) \in E \right\}$

We observe that the constraint set of the dual formulation $\Phi(x, y)$ does not depend on the fixed MP solution. Also, the constraint set is always closed and bounded; in other words, we are not concerned with feasibility, as expected. Whenever a violated solution is identified, we generate the following optimality cut and add it to the MP:

$t \le \sum_{i \in V} \beta_i(1 - x_i - y_i) + \sum_{(i,j) \in E} \gamma_{ij}(x_i + y_i)$  (4.13)

Lastly, we illustrate our Benders implementation in Fig. 4.3. Note that here we show LBBCs (i.e., constraints (4.10) and (4.11)) as feasibility cuts. The only change needed to focus on Benders feasibility cuts instead is the type of SP solved and cut generated in the lower portion of the figure.
Figure 4.3: The illustration of the Benders Decomposition algorithm including logic-based Benders cuts.
4.4 Algorithmic Enhancements
In this section, we present the acceleration techniques that we adopt to speed up our Benders implementation. We note that any technique applicable to the full LIP is directly adopted there as well, to allow a fair comparison in our computational testing.
4.4.1 Algorithmic Approach for Optimality Cuts

We observe that $\Phi(x, y)$ can be solved by a direct algorithm rather than with a commercial solver. More importantly, we can separate the problem over each node $i$, thereby enabling us to generate multiple cuts at every iteration. Below, we first show how to divide $\Phi(x, y)$ over each node as $\Phi_i(x, y)$ and then propose an algorithm that identifies the optimal solution of $\Phi_i(x, y)$ for a given $i$. This algorithm works for both incumbent and fractional solutions, i.e., it follows modern BD. The problem over node $i$ is:

$\Phi_i(x, y) := \min_{\beta_i, \gamma \ge 0} \left\{ (1 - x_i - y_i)\beta_i + \sum_{j \in N(i)} (x_j + y_j)\gamma_{ji} : \beta_i + \gamma_{ji} \ge p_{ji}, \forall j \in N(i) \right\}$

We first restate the objective function of the MP as $\sum_{i \in V} t_i$ and separate cut (4.13) over each node as shown below:

$t_i \le \beta_i(1 - x_i - y_i) + \sum_{j \in N(i)} \gamma_{ji}(x_j + y_j), \quad \forall i \in V$  (4.14)

To solve each $\Phi_i(x, y)$, we follow this procedure. First, for the sake of simplicity, assume that for a fixed node $i$, every node in $N(i)$ is indexed from 1 to $l$ such that $p_{1i} \ge p_{2i} \ge \cdots \ge p_{li}$. If there exists an index $j$ such that $\sum_{k=1}^{j} (x_k + y_k) > (1 - x_i - y_i)$, then we identify the minimum $j$ satisfying the inequality and set $\beta_i = p_{ji}$; otherwise, we set $\beta_i = 0$. Then, we assign $\gamma_{ji} = \max\{p_{ji} - \beta_i, 0\}, \forall j \in N(i)$.
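The procedure can be sketched as follows; the function and argument names are ours, and the inputs are assumed to be pre-sorted as described above:

```python
def solve_phi_i(p_sorted, xy, cap):
    """Direct solution of the separated dual problem Phi_i(x, y).

    p_sorted : probabilities p_{ji} of node i's neighbors, sorted descending
    xy       : the corresponding values x_j + y_j (same order)
    cap      : the coefficient 1 - x_i - y_i of beta_i
    Returns (beta_i, gamma) with gamma[j] = max(p_sorted[j] - beta_i, 0)."""
    beta = 0.0
    running = 0.0
    for j, weight in enumerate(xy):
        running += weight
        if running > cap:
            # minimum index whose cumulative (x + y) weight exceeds cap
            beta = p_sorted[j]
            break
    gamma = [max(pj - beta, 0.0) for pj in p_sorted]
    return beta, gamma

# Toy check: neighbor probabilities (0.9, 0.6, 0.3), weights (1, 0, 1),
# and cap = 1 (node i is neither center nor leaf).
beta, gamma = solve_phi_i([0.9, 0.6, 0.3], [1, 0, 1], 1)
assert abs(beta - 0.3) < 1e-9
# Dual objective cap*beta + sum(weight*gamma) = 0.9, matching the primal
# optimum of covering i through its single best available neighbor.
assert abs(1 * beta + sum(w * g for w, g in zip([1, 0, 1], gamma)) - 0.9) < 1e-9
```

In the toy check, the primal of $\Phi_i$ can assign at most one neighbor to cover $i$, and only neighbors with $x_j + y_j = 1$ are usable, so the optimum is the single edge of probability $0.9$; the sketch recovers the same value on the dual side.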
4.4.2 Greedy Heuristic and Warm-Start

Our preliminary experiments on the SPSDC problem indicate that our selected commercial solver, CPLEX, has difficulty determining an initial feasible solution, as well as improving the optimality gaps, both when solving LIP directly and when solving the problem via BD. Therefore, we design a greedy heuristic that produces an induced pseudo-star for every node in order to test the impact of a warm-start (see Alg. 4).

Given a node $i$ and pseudo-star $S_i$ centered at $i$, let the uncovered (i.e., not yet covered by an element of $S_i$) first- and second-degree nodes of $i$ be represented by $R \subseteq (N^2(i) \cup N(i)) \setminus \bigcup_{j \in S_i} N(j)$. We then define $\bar{R}$ as the complement of $R$. We let $h_j$ be the index of the element in $S_i$ that is assigned to a neighbor node $j$ in $\bar{R}$. For a node $j \in N(i)$, we define $u_j$, the contribution of node $j$, as the total increase in the objective if it is selected as a leaf node, where $u_j := \sum_{k \in R \cap N(j)} p_{jk} + \sum_{k \in \bar{R} \cap N(j)} \max\{p_{jk} - p_{h_j j}, 0\}$. Finally, we let $\zeta_j$ be the total probability value that node $j$ adds to $S_i$ if $j$ is selected as a leaf node (i.e., $\zeta_j = p_{ij} \prod_{k \in S_i \setminus \{i\}} (1 - p_{kj})$).
Algorithm 4: Greedy Heuristic
Input: $i \in V$
1:  $S_i \leftarrow \{i\}$
2:  $C \leftarrow N(i)$   # candidate leaf nodes
3:  $\sigma \leftarrow 1$   # total probability of $S_i$
4:  $z_{ij} \leftarrow 1, \forall j \in N(i)$   # initially assign the center to every node in $N(i)$
5:  $R \leftarrow N^2(i)$
6:  while $C \neq \emptyset$ do
7:      $j^* \leftarrow \arg\max_{j \in C} u_j \zeta_j$
8:      if $\zeta_{j^*} \sigma < 1 - \theta$ then
9:          $C \leftarrow C \setminus \{j^*\}$
10:     else
11:         $S_i \leftarrow S_i \cup \{j^*\}$
12:         $z_{i j^*} \leftarrow 0$
13:         $C \leftarrow C \setminus \{j^*\}$
14:         $R \leftarrow R \setminus N(j^*)$
15:         $\sigma \leftarrow \zeta_{j^*} \sigma$
16:         for $k \in N(j^*)$ do
17:             if $\exists h_k \in S_i : z_{h_k k} = 1$ then
18:                 if $p_{h_k k} < p_{j^* k}$ then
19:                     $z_{h_k k} \leftarrow 0$
20:                     $z_{j^* k} \leftarrow 1$
21:                 else
22:                     continue
23:             else
24:                 $z_{j^* k} \leftarrow 1$
25: return $S_i$, $\vec{z}$
Given a node $i$, the heuristic identifies the candidate leaf node with the highest weight function (i.e., $w_j = u_j \zeta_j$, $j \in C$) among the candidate leaf nodes, and that node is added to $S_i$ as long as it does not violate the feasibility condition. If the condition would be violated, the node is removed from $C$. Once we obtain one candidate pseudo-star centered at each node, we evaluate the objective value of each (i.e., $\sum_{(i,j) \in E} p_{ij} z_{ij}$) and warm-start both IP and MP with the best solution.
4.4.3 Valid Inequalities

While we aim to help the solver improve the primal bounds via warm-start, it is also important to use valid inequalities to help with the dual bounds. For this purpose, we use the heuristic algorithm proposed in Chapter 3 and adapt it to our problem. While more details and pseudocode can be found in Section 3.4.2 and Appendix A, here we informally explain the heuristic and our slight modification.

First, we note that the heuristic remains a valid UB even though we are no longer concerned with a deterministic objective in the SPSDC problem. For a given node $i$ and candidate induced star $S_k$, let $\delta_{S_k}$ be the UB. We initially set $\delta_{S_k} = |N(i) \cup N^2(i)|$. The heuristic identifies each node $j$ in $N(i)$ that creates a unique path to a node in $N^2(i)$ and decreases $\delta_{S_k}$ for each $j$ identified.

Once the bound is obtained, we sum the $\delta_{S_i}$ largest probability values in the set $V_i = \{p_{ij} : j \in N(i)\} \cup \{p_{jk} : j \in N(i), k \in N(i) \cup N^2(i)\}$. Let this summation be represented by $\sigma_i$. Then the following is a valid inequality that can be placed in the MP:

$t_i \le \sigma_i x_i, \quad \forall i \in V$  (4.15)

Note that we use the same bound for the objective function (4.2a) in LIP. In addition, since we look for a unique assignment between a neighbor node and a pseudo-star element, the contribution of each node to the objective is bounded above by its largest probability connection. The following is a valid inequality that can only be used in the MP:

$t_i \le \max_{j \in N(i)} p_{ji}, \quad \forall i \in V$  (4.16)
4.4.4 Separation of Fractional Solutions
Our preliminary experiments indicate that the initial MP quickly ends up being overloaded
with feasibility cuts, thus limiting its ability to solve the problem. In addition, having fractional
values for the center variable increases the difficulty of the feasibility separation problem. Therefore,
in our implementation, fractional solutions are only separated when all variables xi are binary and
the leaf variables are fractional. Otherwise, we let the solver continue its branching process.
65
When it comes to separating fractional y solutions, we adapt two different strategies. First,
we treat each yi having a fractional value as a leaf node and conduct the feasibility test accordingly.
In other words, we apply a rounding heuristic to turn the fractional solution into an integer solution.
If the current solution is not feasible, then we proceed to solve a feasibility problem. If the feasibility
condition is met, then we focus on the dual problem with the original fractional solutions. As a
second approach, we follow the standard procedure and perform the feasibility test with the
original fractional values. Employing the latter strategy turns out to be the most effective, since the
solver generates fewer user-defined cuts and branches to fewer nodes before reaching the
optimum in most of the instances.
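The two strategies can be sketched as follows (a minimal Python illustration; `is_feasible` stands in for the feasibility subproblem described above, and the helper names are ours, not the dissertation's):

```python
# Sketch of the two separation strategies for fractional leaf solutions y.

def round_leaves(y_frac, threshold=0.5):
    """Strategy 1: round fractional leaf values to a 0/1 assignment."""
    return {j: 1 if v >= threshold else 0 for j, v in y_frac.items()}

def separate(y_frac, is_feasible):
    """Decide which subproblem to solve for a fractional solution.

    is_feasible is a caller-supplied test standing in for the feasibility
    subproblem; if the rounded solution fails it, a feasibility cut is
    generated, otherwise the dual (optimality) problem is solved on the
    original fractional values.
    """
    if not is_feasible(round_leaves(y_frac)):
        return "solve feasibility problem"
    return "solve dual problem on fractional y"

y = {1: 0.8, 2: 0.3, 3: 0.55}
print(round_leaves(y))  # {1: 1, 2: 0, 3: 1}
print(separate(y, lambda sol: sum(sol.values()) <= 2))
```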
4.5 Experimental Results
We perform all the experiments using the CPLEX 12.8.1 Java API on a laptop with a
3.10 GHz Intel Core i7-6500 processor and 16 GB of RAM. We change the default CPLEX settings
during the decomposition implementation. Similar to Chapter 3, we switch the MIP emphasis to
optimality over feasibility, use strong branching (i.e., VarSel = 3), and set the heuristic frequency
to 1,000 (i.e., RINSHeur = 1000). Furthermore, we set the number of threads to the number of cores
on the laptop (i.e., 4) both when solving the IP directly and when solving the model via BD.
4.5.1 Networks Based on the Watts-Strogatz Model
Due to the complexity of our model, it becomes challenging to directly apply it to PPINs
of the scale available. Therefore, we randomly generate network instances for testing purposes.
The instances are created based on the Watts-Strogatz (WS) model, which is also called the
small-world model. In such models, we observe local clusters and a small average path length that is
tuned by the rewiring probability. We select the small-world network for two reasons. First, one
can observe a large number of local clusters in PPINs. Second, the diameter in PPINs is
relatively small. For instance, we take the dataset of two organisms Helicobacter Pylori (HP) with
1,570 nodes and Staphylococcus Aureus (SA) with 2,853 nodes as our reference (Szklarczyk et al.,
2015). In both networks, the diameter is six.
In WS models, one can tune the neighborhood parameter (nei) and rewiring probability
(rp) to generate different network instances. We consider instances with |V| ∈ {500, 750, 1000},
nei ∈ {12, 14, 16}, and rp ∈ {0.3, 0.5, 0.7}, which in total produces 27 instances.
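A minimal pure-Python version of this generation procedure might look as follows (a simplified stand-in for a library Watts-Strogatz generator; with rp = 0 it reduces to the ring lattice):

```python
import random

def watts_strogatz(n, nei, rp, seed=0):
    """Build a WS small-world graph: a ring lattice on n nodes where each
    node is joined to its nei nearest neighbors on each side, with every
    lattice edge rewired to a random endpoint with probability rp."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):                        # ring lattice
        for k in range(1, nei + 1):
            j = (i + k) % n
            adj[i].add(j)
            adj[j].add(i)
    for i in range(n):                        # rewiring pass
        for k in range(1, nei + 1):
            j = (i + k) % n
            if rng.random() < rp:
                w = rng.randrange(n)
                while w == i or w in adj[i]:  # avoid self-loops and multi-edges
                    w = rng.randrange(n)
                adj[i].discard(j)
                adj[j].discard(i)
                adj[i].add(w)
                adj[w].add(i)
    return adj

g = watts_strogatz(500, 12, 0.3)
print(len(g))  # 500 nodes
```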
4.5.2 Calculation of Probability Values
In this section, we present the methodology that we use to identify the probability values
associated with the edges. In PPINs, there exist interaction scores in (1, 1000), where a higher
score implies a stronger interaction between two proteins. We normalize the interaction scores and
plot the distribution of the normalized scores (see Figures 4.4 and 4.5 for HP and SA, respectively).
Figure 4.4: Distribution of Interaction Scores in HP
Figure 4.5: Distribution of Interaction Scores in SA
We observe that the normalized scores show a right-skewed distribution, which resembles
both the gamma distribution with a shape parameter less than or equal to one and the exponential
distribution with a rate parameter around 1.5. Therefore, generating probability values according
to either distribution can be acceptable. We will be using the exponential distribution with rate
parameter 1.5 to generate the probability values associated with edges. It is important to mention
that one can also use Monte Carlo sampling on real data sets in order to generate both network
samples and probability values. However, sampling from a PPIN would favor that specific network.
Hence, we prefer our proposed random generation process over Monte Carlo sampling in order
to demonstrate wider applicability.
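Under this choice, the edge probabilities can be sampled as in the sketch below (one assumption on our part: since Exp(1.5) has support (0, ∞), draws above 1 are resampled so that every value remains a valid probability):

```python
import random

def edge_probability(rng, rate=1.5):
    """Draw an edge probability from an exponential with the given rate,
    resampling any draw above 1 so it stays a valid probability."""
    p = rng.expovariate(rate)
    while p > 1.0:
        p = rng.expovariate(rate)
    return p

rng = random.Random(42)
probs = [edge_probability(rng) for _ in range(10_000)]
print(min(probs) > 0.0 and max(probs) <= 1.0)  # True
```

Rejection keeps the right-skewed shape observed in Figures 4.4 and 4.5 while restricting the support to (0, 1].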
4.5.3 Computational Experiments
We set a time limit of 5,400 seconds. In initial testing, we set θ = 0.99. Before getting into
a detailed analysis, we share a summary table of the experimental results. In Table
4.1, we compare six methods including i) LIP, ii) LIP with warm-start (LIP-WS), iii) BD with the
LBB cuts (BD-LB) used as feasibility cuts, iv) BD-LB with warm-start (BD-LB-WS), v) BD with
the traditional Benders feasibility cuts (BD-TB), and vi) BD-TB with warm-start (BD-TB-WS).
For each method, we report the number of instances solved to optimality, the ratio between the
optimal solutions and the total number of instances, the average optimality gap calculated over all
the instances, as well as the number of times the method achieved the best performance across all
six methods. The best performance is first evaluated according to the optimality gaps. If more than
one method returns the optimal solution for the same test instance, then we examine the time spent
to reach the optimal. In addition, we use the bold font to indicate the best method for each criteria.
Table 4.1: Summary of results (27 Instances)
LIP LIP-WS BD-LB BD-LB-WS BD-TB BD-TB-WS
Optimal                  8       7       17     16     11     13
Percentage (%)           30      26      63     60     41     49
Average Gap (%)          168.88  181.56  7.50   8.51   64.98  54.33
Best Performance         2       0       17     7      0      1
One can clearly observe that our BD implementation including the LBBCs shows the best
performance. Although warm-start does not seem to be effective in improving the performance, we
further examine its impact as θ varies. On the other hand, using the LIP with or without warm-
start does not even produce an average optimality gap below 100%. Even though the BD with the
traditional feasibility cuts performs relatively better than solving the problem directly via LIP, it
still cannot compete with both BD-LB and BD-LB-WS. This implies that LBBCs show a better
performance than the traditional Benders cuts in dealing with the infeasibility. We believe that this
is because LBBCs carry more specific information and tell the model exactly which leaf nodes cause
the infeasibility (i.e., see LBBC (4.11)).
We now move into a detailed analysis and share the computational results obtained through
all six methods for each instance. In Table 4.2, we report the following outputs: i) time spent to
reach the solution in seconds, ii) the final optimality gap in percentage, and iii) the number of branch
and bound (BB) nodes visited by the solver. Note that if the optimal solution is not reached within
the time limit (TL), then we write TL in the corresponding cells of the table. Also, similar to Table 4.1, we
use bold font to indicate which method performs the best on each network instance.
Overall, we observe that warm-start does not have a beneficial impact on the performance
of either LIP or BD-LB. The model LIP shows a better performance than LIP-WS in nearly all
the instances, with four exceptions for which the optimality gap differences are negligible (e.g.,
Table 4.2: The computational results with θ = 0.99. For each (|V|, nei, rp) instance, the table reports the Time (sec), Gap (%), and number of BB Nodes for LIP, LIP-WS, BD-LB, BD-LB-WS, BD-TB, and BD-TB-WS.
Figure 4.6: Solution time comparison between BD-LB and BD-LB-WS (y-axis: Time (sec); x-axis: |V| - nei - p; series: BD-LB, BD-LB-WS)
Figure 4.7: Optimality gap comparison between BD-LB and BD-LB-WS (y-axis: Gap; x-axis: |V| - nei - p; series: BD-LB, BD-LB-WS)
Figure 4.8: Solution time comparison between BD-TB and BD-TB-WS (y-axis: Time (sec); x-axis: |V| - nei - p; series: BD-TB, BD-TB-WS)
Figure 4.9: Optimality gap comparison between BD-TB and BD-TB-WS (y-axis: Gap; x-axis: |V| - nei - p; series: BD-TB, BD-TB-WS)
(1000, 12, 0.5), (1000, 14, 0.5), and (1000, 16, 0.5)). As for BD including the LBB cuts, we present two
figures for illustration purposes. Fig. 4.6 represents the solution time comparisons between BD-LB
and BD-LB-WS where we compare the instances solved to optimality with both methods. BD-LB
reaches the optimal solution faster than BD-LB-WS in 11 out of 16 instances. Warm-start results
in visiting more BB nodes, which might explain the higher solution times. Fig.
4.7 illustrates the comparison of optimality gaps when both BD-LB and BD-LB-WS fail to reach
the optimal solution. BD-LB returns a better optimality gap than BD-LB-WS in 80% of the samples
presented in the figure. However, when we look at the BD including the traditional feasibility
cuts, we observe a completely reversed trend: BD-TB-WS outperforms BD-TB in most of the
instances (i.e., 18 out of 27), which implies that warm-start impacts the performance in a positive
way. Moreover, Figs. 4.8 and 4.9 illustrate the comparison of solution times and optimality gaps
between BD-TB and BD-TB-WS, respectively. While the warm-start version beats BD-TB in eight
instances out of eleven in terms of solution time, as presented in Fig. 4.8, it also performs better with
respect to the optimality gaps in 60% of the instances presented in Fig. 4.9.
After comparing each method with its warm-start variant, based on the "winner" cases
we proceed to compare LIP, BD-LB, and BD-TB-WS, where it can be seen that BD-LB significantly
outperforms the other two with respect to both solution time and solution quality. LIP shows
quite a poor performance, especially in the networks with more than 500 nodes, where the average
optimality gap turns out to be 244.47%. The reason behind this could be two-fold. First, the number
of BB nodes pruned by the solver is relatively small (see Table 4.2), which indicates that the size of
the model makes it difficult for the solver to explore new branches. We believe that Eq.
(4.2b) defined in LIP (i.e., the feasibility condition) in particular might cause numerical issues due to its
large number of non-zero coefficients. Second, by looking at the engine logs, we observe that
the number of feasible solutions identified by the solver within the time limit is quite small. Thus, the solver
has a hard time both reaching the optimal solution and determining a feasible solution. When
analyzing the results returned by BD-TB-WS, we see that instances containing more than 500 nodes
and having a nei parameter of 14 or higher are the most challenging ones, where only a single instance
is solved to optimality (i.e., (750, 14, 0.7)) and the average optimality gap is 116.58% among those
12 instances.
Since the performance of the warm-start variant was somewhat "close" to that of the best algorithm
(i.e., BD-LB) for the initial θ value, we further investigate the impact of warm-start. We now solve
the problem for θ ∈ {0.95, 0.9, 0.8}. In Table 4.3, we present the computational experiments conducted
via BD-LB and BD-LB-WS by varying θ values.
One can see that the warm-start strategy becomes quite useful as we decrease the θ value.
This is because the feasibility condition becomes harder to satisfy and, therefore, identifying feasible
integer solutions to the problem that have objectives close to the dual bounds becomes more difficult.
BD-LB-WS outperforms BD-LB in 59%, 78%, and 82% of the instances with θ = 0.95, θ = 0.9, and
θ = 0.8, respectively. More importantly, we do not have a single instance for which BD-LB reaches
the optimal while BD-LB-WS cannot. With θ = 0.95, when BD-LB returns optimality gaps over
200% for instances (1000, 14, 0.3) and (1000, 16, 0.7), we obtain optimality gaps of 28.79% and 40.05%
for those two instances via BD-LB-WS. Furthermore, with θ = 0.9, while BD-LB produces an average
optimality gap of 406.31% for instances (500, 16, 0.5), (750, 14, 0.3), (750, 16, 0.7), and (1000, 16, 0.7),
BD-LB-WS reaches the optimal within 3,478 seconds on average for the same instances. Lastly, with
θ = 0.8, BD-LB-WS reaches the optimal while BD-LB fails to do so for instance (1000, 16, 0.5). BD-
LB-WS performs better than BD-LB especially with respect to the solution time for the instances
where both methods reach the optimal. For such 16 instances, BD-LB-WS returns the optimal
solution faster than BD-LB in 11 of them. As a result, we continue our analysis with BD-LB-WS
Table 4.3: The computational results with different θ values via BD-LB and BD-LB-WS. For each (|V|, nei, rp) instance, the table reports the Time (sec), Gap (%), and number of BB Nodes for BD-LB and BD-LB-WS with θ = 0.95, θ = 0.9, and θ = 0.8.
for Table 4.3.
On one hand, the BD implementation with the warm-start performs quite well when
θ = 0.95 and θ = 0.90, where the numbers of instances solved to optimality are 22 and 23, re-
spectively. While BD-LB on average branches to 35,249 nodes with θ = 0.99, in both of these cases
the average number of BB nodes visited is roughly 24,500. This could be a good indication for
having a smaller average solution time for the instances solved to optimality. For example, our
Benders implementations produce average solution times of 2,426 and 1,394 seconds when θ is 0.99 and 0.9,
respectively. On the other hand, the optimality gaps returned for the instances where we fail to
attain the optimal turn out to be quite bad with θ = 0.95 and θ = 0.90 where the average gaps are
255.77% and 428.05%, respectively. We note that the reason behind this failure is that CPLEX is not
able to improve the dual bounds. Hence, it could be quite useful to identify new valid inequalities,
especially upper bounds, similar to the ones proposed in Section 4.4.3, in future research. We expect
that tighter upper bounds could increase the performance of BD to a great extent.
However, as we decrease the θ value further (i.e., to 0.8), the performance of BD again decreases,
and the number of instances solved to optimality drops to 17. First, the average number of BB
nodes pruned by the solver is 26,563; it is worth mentioning that this is still better than the numbers
obtained with θ = 0.99. Next, the average optimality gap for the instances where the optimal is
not obtained turns out to be 552.82%, which indicates that we are far away from the optimal and
extra computational effort is required for such instances. It is important to note that during our
preliminary experiments we observed that the solver starts performing relatively better in the branching
process, and converges to the optimum quicker, once the optimality gap goes below 100%. However, it is
hard to gain insight into the performance of a black-box solver. At this point, we believe that
the more we decrease the θ value, the worse the performance we will observe, since the problem
gets harder. Our intuition about what makes the problem harder is the existence of the feasibility
constraint, which plays the role of a chance constraint. Whenever we decrease θ, the RHS also decreases
when considering the inequality as a less-than-or-equal constraint.
4.6 Conclusion
In this chapter, we introduce a new centrality metric called the stochastic pseudo-star degree
centrality (SPSDC) for which we propose a non-linear binary optimization model. We study the
complexity of the problem and show that it is NP-complete on general graphs as well as on trees
and windmill graphs. We implement a branch-and-Benders approach strengthened by logic-based
Benders cuts and several other acceleration techniques (e.g., valid inequalities and the generation of
multi-cuts). Our decomposition approach greatly outperforms solving the model via a commercial solver
in terms of both solution time and quality. Our test cases are generated according to the
small-world network model, which resembles real-world protein-protein interaction networks (PPINs).
The deterministic star degree centrality concept was shown to be an effective centrality metric for
detecting essential proteins in PPINs, and our proposed centrality metric can add to the set of
proteins to explore for essentiality.
In a future study, it might be worth examining new acceleration techniques so that BD can be used
to solve large-scale PPINs. This would open the door to analyzing large-scale
biological networks in order to test the performance of this new centrality metric with respect
to detecting essential proteins. In addition, it might be interesting to identify a new application
area where the SPSDC can be utilized. One good example might be to investigate network
resilience in financial networks in order to detect the most important financial entities in a market.
Chapter 5
Optimizing the Response for
Arctic Mass Rescue Events*
In this chapter, we propose an integer programming (IP) model to respond to a large-scale
mass rescue event in the Arctic. In Section 5.1, we first motivate our work and discuss the necessity of
an optimization model for an Arctic mass rescue event. Section 5.2 summarizes the related research
and explains our contribution. Section 5.3 gives a brief problem description and provides the reader
with an illustrative example. The details of our optimization problem are provided in Section 5.4,
where we explain each constraint in detail. We then introduce the solution methodologies and discuss
our attempts to solve the model in Section 5.5. Our experimental results and our findings are shared
in Section 5.6. The chapter is summarized and future research is presented in Section 5.7.
5.1 Introduction
The Arctic has been experiencing large-scale changes in several respects over the last decade.
Due to climate change and global warming, the air temperature is increasing
(Przybylak and Wyszynski, 2020) while sea ice thickness is rapidly decreasing (Shalina et al., 2020). In
addition, demographic changes, including rises in both the number of non-indigenous people living
in the region and the birth rates, are observed (Heleniak, 2020). Researchers indicate that unless
proper measures are taken, the infrastructure systems of the Arctic will be at risk by 2050 (Hjort
*The paper has been accepted at Transportation Research Part E: Logistics and Transportation Review.
et al., 2018).
Maritime activities regarding tourism and the economy have been advancing as a result of
longer ice-free seasons (Messner, 2020; Østhagen, 2020). For instance, the Crystal Serenity, the
largest cruise ship to date to voyage in coastal Arctic waters, sailed between Anchorage, Alaska and
New York City through the Northwest Passage with 1,000 passengers and over 600 crew members in
August 2016 and 2017 (Waldholz, 2016). In preparation for this event, a tabletop exercise (McNutt,
2016) was organized in collaboration with Crystal Cruises, the Canadian Coast Guard, Transport
Canada, the Department of Defense (U.S. Air Force), and the U.S. Coast Guard (USCG) in
2016. This exercise identified gaps in Arctic maritime search and rescue resources and highlighted
the impacts of the resource gaps on evacuees being rescued. The conversations and activities
suggested the need for greater attention to Arctic mass rescue operations, and for greater visibility and
coordination of Arctic emergency response. We refer the interested reader to Elmhadhbi et al. (2020)
and Sarma et al. (2020) who highlight the importance of coordination between different emergency
responders during disaster response.
Recent changes in Arctic industrial activities, defense and tourism have amplified the need
for attention to resource availability and evacuee impacts during an Arctic mass rescue event (MRE).
Ship traffic and maritime activity in the region have increased, and will likely continue to increase in
the future (Østhagen, 2020) without agreements to limit the number of ships entering the region.
In 2021, in recognition of these trends, the U.S. Navy and the USCG, for the first time, issued a
joint Arctic strategy that cites expectations for increased Arctic maritime traffic due to commercial
shipping, natural resource exploration, tourism and military presence (Eckstein, 2021). Increased
Arctic maritime traffic occurs in waters that are largely uncharted because they have never been
ice-free in modern times. Only 4.1% of Arctic waters have been charted using modern multi-beam
sonar techniques (National Oceanic & Atmospheric Administration, 2021). Some waters were last
surveyed by Captain Cook using hand-held ropes and lead lines in the 18th century (Hoag, 2016).
Risks associated with maritime trade, and needs to consider personnel evacuation on ships, are
therefore significant and rising as maritime traffic increases, uncharted waters are increasingly
ice-free, and the size of passenger vessels increases (Statista Research Department, 2020).
Meanwhile, the oil and gas industry plays a major role in the economy of the region
and the lives of the people who live there (Morgunova, 2020). From a political perspective, both
Russia and China are seeking economic benefits via expanding oil and gas exploration activities
and are making new investments in the region (Stepien et al., 2020; Ilinova and Chanysheva, 2020).
Yet, the oil price war between Russia and Saudi Arabia that took place in 2020 together with the
fallout from the COVID-19 pandemic has negatively impacted the oil drilling activities in the region
for United States (U.S.) based companies. For example, ConocoPhillips, one of the largest oil
companies, announced that it would halt all drilling operations on the North Slope of Alaska (Hanlon,
2020). The Spring 2020 Revenue report released by the Alaska Department of Revenue forecasts
a $1.15 billion loss in revenue from oil in the current and next fiscal years (Alaska Department of
Revenue, 2020). Hence, on the U.S. side, new drilling activities are not expected to take place
in the near future, although most of the existing oil-based activities continue to operate.
Arctic emergency response occurs in a setting that requires balancing activities related to
territorial disputes (Schofield and Østhagen, 2020); fishing and subsistence economies; endangered
species and wildlife habitats; industrial and commercial activity; and military operations (Allison
and Mandler, 2018; Ruskin, 2018; Humpert, 2019). Impacts from these activities can be particularly
significant in remote, seasonally variable, and infrastructure-poor settings with sparse populations
such as the Arctic.
The increasing number of visitors to the region is concerning due to the size of Arctic
communities. In Arctic Alaska, the largest community is Utqiagvik (formerly known as Barrow),
which has a population of 4,335. The number of people on the Crystal Serenity was 34.6% of
Utqiagvik's population and would exceed the population of most Arctic communities (see Table
5.1). Of further concern is that the health care system in Alaska was not designed for surges
resulting from potential Arctic MREs. There are currently 17 trauma centers in Alaska; only
two are Level II (Alaska Department of Health and Social Services, 2018) (Level I centers handle the
most severe emergencies), and both are located in Anchorage (700 miles away from Utqiagvik). The only
trauma centers in Arctic Alaska are in Nome, Kotzebue, and Utqiagvik (see Table 5.1). It is neither
reasonable nor desirable for evacuees to stay in Arctic Alaska communities for a long time
during an MRE. Communities in the Arctic are not equipped to host a large number of evacuees
for an extended period of time. In essence, responding to an MRE in Arctic Alaska is much
more difficult than responding to one in the continental United States, since an influx of 1,600 people
would significantly strain the infrastructure of Arctic communities.
Maritime response operations require two sets of activities: (i) evacuating people from an
affected area to "safe zones" (e.g., in our case, out of the Arctic), and (ii) providing them with
Table 5.1: Data on communities in Arctic Alaska (U.S. Bureau of the Census, 2019; Alaska Department of Health and Social Services, 2018)

Location                 Nome      Kotzebue  Point Hope  Point Lay  Atqasuk  Wainwright  Utqiagvik  Crystal Serenity
Population Number        3797      3245      692         247        237      584         4335       1600
% of Pop. of Passengers  39.50%    46.22%    216.76%     607.28%    632.92%  34.60%      34.60%     —
Trauma Center Status     Level IV  Level IV  —           —          —        —           Level IV   —
the logistics support (i.e., relief commodities) throughout a period of time. In most maritime mass
rescues, once evacuees in distress are brought to shore, the response is often considered complete
since existing infrastructure typically has the ability to handle the influx of passengers. However, in
an Arctic MRE, two steps are required because of limited Arctic shelter, medical, food, and sanitary
infrastructure. Transporting evacuees from the cruise ship out of the Arctic by sea is neither feasible
nor preferred; for example, moving evacuees by sea to Anchorage from the North Slope of Alaska
could take more than 10 days, and this assumes the ship could hold and support the evacuees for that
length of time. As a result, maritime evacuation during this type of event comprises two aspects:
moving evacuees from the location of the evacuation (e.g., a cruise ship) to local Arctic communities
and then out of the Arctic (e.g., into Anchorage, Alaska); and providing evacuees their basic needs
through allocating resources and equipment. Such an evacuation process was seen most recently
in the grounding of the Akademik Ioffe, which ran aground about 45 miles away from Kugaaruk,
Canada (in the Arctic) on August 24, 2018 (Struzik, 2018). The sister ship of the Akademik Ioffe
reached it in 16 hours and brought all passengers to Kugaaruk (Humpert, 2018).
This chapter, which is the first work to model both maritime mass rescue evacuation and
logistics support, highlights the impacts and costs of resource constraints and unavailability, and
the impact on evacuees of those resource constraints during an Arctic MRE. Because an infusion of
evacuees in Arctic communities will strain the communities' existing infrastructure and resources,
our model considers the communities' capacities to handle the evacuees, given available shelter,
medical facilities, and airport capacity, as well as system capabilities to bring resources and equipment
into the area to support the evacuees during the Arctic MRE. This work, therefore, captures the
characteristics of an infrastructure-poor setting such as the Arctic, and models the two requirements
of MREs (i.e., evacuation and logistics support), which are unique research contributions. It is the first
work, to the best of our knowledge, to quantitatively assess disaster response to Arctic MREs, falling
into the broad area of "smart" disaster management (Neelam and Sood, 2020), where quantitative
tools are used to assess disaster response.
Outside of Arctic Alaska, the situation where two phases of transportation may be
required for an evacuation could arise in other applications in remote regions, especially when
considering tourism. For example, evacuating tourists from sudden onset wildfires may involve
moving them immediately out of the area impacted by the event (e.g., using buses or cars) and
then sending them home from these safe locations using aircraft. A similar situation could arise in
popular remote trekking areas (e.g., the Himalayas) should avalanches occur, preventing the trekkers
from leaving the remote area. In this case, helicopters may be used to move the trekkers out of the
remote region to local communities prior to sending them home. A major finding of our analysis
on Arctic MREs is that the transportation resources are a major bottleneck in the process, which
would also provide insights into these other applications.
The remainder of this chapter is organized as follows: Section 5.2 summarizes the related
research. Section 5.3 gives a brief problem description. The details of our optimization problem
are provided in Section 5.4. We then introduce the solution methodologies in Section 5.5. Our
experimental results and findings are shared in Section 5.6. The chapter is summarized and future
research is presented in Section 5.7.
5.2 Literature Review
An Arctic mass rescue operation is similar to evacuating people from an area either before,
during, or after a disaster, with important distinctions, especially since the closest communities to
incident sites are relatively small and we still need to move the evacuees out of the Arctic for
the reasons discussed in Section 5.1. The following subsections discuss the areas most closely related
to our work.
5.2.1 Evacuation Models with Relief Distribution
At a high level, in evacuation models, evacuees are transported from an affected area to safe
zones, such as shelters, hospitals, or distribution centers, and the required commodities are delivered
from major supply centers to support them. While Uster and Dalal (2017) develop a multi-objective
mixed-integer linear programming model to help integrate the evacuation process
and relief material distribution after a foreseeable natural disaster (e.g., a hurricane), Stauffer and
Kumar (2021) analyze the importance of taking the disposal cost of unused items into consideration
when making initial resource deployment decisions before a predictable disaster. Sabouhi et al.
(2019) design an optimization model whose goal is to provide relief commodities to evacuees and
transport them to shelters in the aftermath of a natural disaster, along with making routing and
scheduling decisions for the vehicles used during the evacuation. Setiawan et al. (2019) propose
three different models to determine the best distribution center locations to obtain the optimal relief
resource deployment after a sudden-onset disaster (e.g., an earthquake). In another study, Li et al.
(2020b) address a scenario-based hybrid robust and stochastic network design problem to identify
the best integrated logistics decisions in terms of relief commodity and casualty distribution. Shu
et al. (2021) propose a network design model making emergency support location and supply pre-
positioning decisions and design a cutting plane algorithm to solve it. Zhong et al. (2020) similarly
look at a network design model and a detailed vehicle routing problem to deliver pre-positioned
goods to key distribution points (which could include shelters).
There are several shortcomings of applying this previous work to an Arctic MRE. First,
none of these studies consider deprivation costs, which are critical in post-disaster humanitarian
logistics models in order to capture the actual impact of the event on people (Holguín-Veras et al.,
2013). Second, they do not consider the potential to transport relief commodities between the "safe
zones" during the response, which is important in our situation since we can move existing stockpiles
between Arctic communities. Third, these previous studies do not consider moving evacuees out of
the "safe zones" (Arctic communities) toward another location (Anchorage) and measuring the time
to reach this final location. To the best of our knowledge, our study is the first to consider all these
features in an optimization model for Arctic MREs.
5.2.2 Prioritizing Victims During a Disaster
The concept of effectively prioritizing victims from a disaster has been well-studied. The
idea is to quickly triage victims in order to group them together and prioritize who receives relief
commodities. Existing triage methods include START (Elbaih and Alnasser, 2020) and SALT (McKee
et al., 2020). Sung and Lee (2016) use a survival probability function to prioritize victims in
order to optimize the transport of victims in ambulances to available hospitals in a mass casualty
incident. Liu et al. (2019) develop a multi-objective optimization model that identifies temporary
medical service facility locations and distributes the casualties to those facilities by taking casualty
triage and limited resources into consideration. Rambha et al. (2021) propose a stochastic model
to identify the optimal patient distribution at a hospital after a hurricane where patients are cate-
gorized based on risk levels. Finally, Farahani et al. (2020) survey the operations research literature
on mass casualty management and express the importance of on-site triage for successful disaster
management.
The limitation of this previous work is that it does not model how relief commodity
allocation decisions can impact the priority level of the victims (in our case, the evacuees). We
believe that modeling the role that deprivation time plays in increasing priority levels is important and,
further, will help to better capture the impact of the event on the evacuees.
5.2.3 Modeling the Impact of Relief Commodities
It is likely that during a large-scale, non-routine event there will be a surge in demand
for relief commodities; therefore, the allocation of the scarce relief commodities is of utmost
importance in order to minimize the impact of the event. For example, Rodríguez-Espíndola et al.
(2020) propose a multi-objective, stochastic optimization model to mitigate the shortages seen in relief
aid, shelter, and healthcare support during the disaster preparedness process. The authors show that
shelter allocation decisions play a significant role in coping with deprivation of relief resources and
its impact on evacuees. Li et al. (2018) employ a simulation model to emphasize the importance
of having explicit knowledge of the scarce vaccine inventory at hand in the case of an influenza
pandemic. The authors indicate that enhancing the visibility of inventory levels in vaccines brings
several benefits including increasing the vaccine allocation efficiency and decreasing the impact of
the pandemic.
Doan and Shaw (2019) discuss stochastic optimization techniques to allocate scarce relief
resources among multiple locations in the face of multiple, simultaneous disasters. This work high-
lights the influence of political aspects (e.g., inequities between different regions) during resource
allocation. Ramirez-Nafarrate et al. (2021) study a location-allocation problem to overcome the
trade-off between insufficient relief resources and limited response time, and provide a heuristic al-
gorithm to solve it. Lastly, we refer the reader to Ye et al. (2020) who provide an extensive review
on successful management of disaster relief inventory.
This previous literature demonstrates that relief allocation plays an important role in the
aftermath of a disaster. This is especially important in the Arctic context since it is expected that
existing resources and equipment in Arctic communities will not be able to support the evacuees
and, therefore, we must correctly plan how to allocate resources and equipment from a central hub
(such as Anchorage). It further stresses the importance of dynamically updating our allocations
over the duration of the response, factoring in the planned movements of evacuees out of the Arctic
communities.
5.2.4 Deprivation Costs in Humanitarian Logistics
In our application, the evacuees have demand for relief commodities and it is likely that
we will not be able to fulfill all demand. Holguín-Veras et al. (2013) were the first to argue that
deprivation costs should be used instead of simply penalizing unmet demand, as the former better
captures the true costs of human suffering. The authors discuss the ethical implications of prioritizing
the deprivation costs of the response as opposed to its logistics costs. A key finding
is that the actual estimation of the true parameters of the deprivation cost is not a primary concern:
simply including a deprivation cost function is what matters. Following up on this work, Pérez-
Rodríguez and Holguín-Veras (2015) propose an innovative mathematical model to address the
challenges of inventory allocation in the aftermath of a disaster based on the notions of welfare
economics and deprivation costs. The objective of the model is to minimize the social cost incurred
during the response, and the authors examine a heuristic method to solve this problem. In addition, Yu
et al. (2019) propose a nonlinear integer programming model to measure the performance of resource
allocation after a large-scale disaster by considering three metrics: efficiency, effectiveness, and
equity. The authors capture the effectiveness component through deprivation costs.
We will incorporate the concept of deprivation cost since it is more suitable and realistic
than penalizing unmet demands in a large-scale disaster. We discretize the deprivation cost function
and further consider situations in which fulfilling resource demands does not eliminate the entire
deprivation cost. While the model introduced by Pérez-Rodríguez and Holguín-Veras (2015) has a
non-linear and non-convex objective function, we propose an integer linear programming model (via
discretization) with a similar objective component that aims to minimize the impact of unmet
demands on the evacuees.
5.2.5 Arctic Alaska and Emergency Response
Any tactical operation performed in Arctic Alaska would face major challenges due to (i) the
remoteness of the region, (ii) the lack of infrastructure throughout the Arctic, and (iii) the difficulty
of operating in Arctic conditions. Thus, existing policies and approaches for a MRE would not be
fully applicable and must be adapted to understand an Arctic event. In the literature, there are a
few social (i.e., non-operations-research-based) studies conducted specifically for Arctic emergency
response events. Fjørtoft and Berg (2020) emphasize the importance of preparedness
for sustaining safer maritime and offshore operations in the Arctic Ocean. While Rogers et al. (2020)
provide arguments on the potential challenges that could be faced during Arctic SAR events,
Pavlov (2020) discusses the issues and limitations expected to occur in oil spill incidents. Afenyo
et al. (2020) review risk assessment techniques for oil spills in the Arctic. Kelman (2020) examines
the need for and importance of settlement and shelter after an emergency response event in the
Arctic.
To the best of our knowledge, Garrett et al. (2017) are the first to develop an optimization model
for an Arctic emergency response event. The authors create a mixed-integer linear programming
model to understand how to site oil spill response resources to increase response capabilities in
Arctic Alaska. Their oil spill response modeling introduced the concept of follow-up tasks to deal
with the likely situation of missing the deadlines of certain key response tasks, directly modeling
the remoteness of the region, where previous research would not be applicable. The researchers
address some policy questions, such as stockpile and infrastructure investments, that can be utilized
in long-term planning efforts. We complement this work by examining a different type of emergency
response, namely Arctic MREs. Future work in Arctic emergency response could consider the role
of unmanned vehicles (Aiello et al., 2020), especially given the harsh environments that the response
may be operating in.
5.2.6 Our Contribution
In this research, we create a mass rescue model whose objectives are to minimize the impact
of a maritime accident in the Arctic on the evacuees and minimize the average time required for the
evacuees to move out of the Arctic. We believe that this is the first work presenting an optimization
model designed specifically for an Arctic MRE, which is increasingly important as maritime activities
in the area are projected to increase in the near future.
Most importantly, our model and quantitative analysis can be used to assess gaps in Arctic
MRE capabilities and can thus be used to prioritize investments to improve these capabilities.
Beyond these technical contributions, our work is important since it introduces an important area
where future transportation will likely take place due to changes in the Arctic.
Although detailed passenger evacuation aboard vessels has been well-studied (e.g., Hu et al.
(2019)), models to assess the gaps in passenger evacuation in remote and infrastructure-poor settings
have received less attention despite their practical importance. In Arctic workshops and tabletop
exercises, emergency response leadership acknowledged that an Arctic MRE would likely not
accommodate all passengers and would overwhelm Arctic villages because of inadequacies in evacu-
ation transport, support logistics, and medical, berthing, sanitary, and housing requirements (Arctic
Domain Awareness Center, 2016). Tabletop exercises, such as the Arctic Incident of National Sig-
nificance (Arctic Domain Awareness Center, 2016) and Arctic Maritime Horizons Workshop (Arctic
Domain Awareness Center, 2021), help to lay out the challenges of Arctic emergency response. Our
work contributes to these exercises since it seeks to quantify the impact of inadequacies. It also
determines the gaps in planning exercises and preparation phases in terms of transportation and
logistics operations and reveals the importance of the role of optimization in emergency response in
the Arctic. Therefore, it moves beyond tabletop exercises and highlights the human costs associated
with large-scale disaster response in infrastructure-poor settings: evacuees will not be evacuated in
a timely manner, or at all, and there could be significant strain on local communities.
5.3 Problem Description
Arctic mass rescue events (MREs) require moving evacuees from a distressed ship, transport-
ing them to Arctic communities, and then transporting them out of the Arctic (we focus specifically
on moving them to Anchorage) to complete the operations. This needs to occur while supporting the
evacuees as well. Movement within our problem can be represented with a transportation network,
an example of which is in Fig 5.1.
The modeling process for our study involved observing tabletop planning exercises and
stakeholder interviews to form some of the core assumptions of our model. We observed the
Northwest Passage Tabletop Exercise in 2017, which involved a variety of stakeholders from Canada
and the United States and served as a planning exercise to understand the response to a MRE.
This helped to highlight some of the considerations that would go into decision-making in real-time.
We also asked initial scoping questions to officials in District 17 of the United States Coast Guard
(USCG), which covers the entire state of Alaska. These officials had significant experience in search
and rescue (including participating in the aforementioned tabletop exercise). We were also able
[Network nodes: Anchorage, Nome, Kotzebue, Atqasuk, Point Lay, Point Hope, Utqiagvik, and Wainwright; example arc distances: 670, 695, 545, 535, and 717 miles.]
Figure 5.1: Visualization of the transportation network in the North Slope
to answer important questions from the practitioner's perspective in building our model and data,
including:
• What would be the process of moving evacuees out of the Arctic? Answer: evacuate them using
vessels to Arctic communities and then use air assets to move them out of these communities.
• What type of assets would be used to transport evacuees to shore? Answer: a combination of
USCG vessels and vessels of opportunity.
• Where and how would evacuees be transported once on-shore? Answer: a combination of
federal, state, and privately-owned aircraft.
• How would the Air National Guard and the U.S. Air Force be involved? Answer: they would
be significant in terms of the logistics required to support evacuees with resources and assets.
5.3.1 Important Concepts Used in Modeling
In order to model the impact of the event on the evacuees, we introduce three important
concepts: the priority levels of the evacuees; the relief commodities (classified as either resources
or equipment) and their role; and how to model when evacuees are deprived of those relief
commodities.
5.3.1.1 Priority Level
The priority level of an evacuee is meant to model his/her medical status, where a lower
priority status is associated with lower severity. If the demands (needs) of an evacuee are not
fulfilled, then their priority level may increase. Alternatively, the level may decrease with appropriate
medical care (although we note that this is not likely to occur during the event given the limitations of
health care facilities and the number of medical personnel in Arctic Alaska). We aim to make logistics
decisions in order to minimize the deterioration of evacuees' existing medical states and to transport
them to Anchorage as soon as possible in order to provide service there. It is important to note that
having a higher priority level does not necessarily mean that it is best to provide relief commodities
to a person since it may be important to be proactive and to prevent the medical status of the
other evacuees from getting worse. Further, certain relief commodities may only be necessary for
certain priority levels. Our proposed modeling will focus on allocating relief commodities in order
to minimize the cumulative impact of the event across all evacuees.
5.3.1.2 Relief Commodities
Relief commodities are defined as items given to evacuees in order to meet their basic needs.
Example commodities include food, water, shelter, and bedding. Based on these examples, it is
clear that a finer categorization into resources and equipment is necessary to capture the differences
between consumable and non-consumable commodities. Resources are defined as commodities where
the evacuees will have a recurring demand for them. Equipment can be viewed as a âone-timeâ
demand that, once fulfilled, is satisfied. Further, equipment will become available once someone
assigned the equipment leaves the particular Arctic community, e.g., a bed can be reassigned to
another person. The re-allocation of equipment plays an important role in our model due to i) the
limited number of stock in the region, ii) non-consumability, and iii) the non-transportability of
certain equipment.
The demand for resources is likely similar for all priority levels, although missing the demand
may result in more severe impacts for higher priority levels (which we will discuss in the next section)
or may result in an evacuee increasing their priority level. However, the equipment needs for priority
levels will change since medical support (via a bed in a medical center) is necessary for the highest
priority level. This fact will complicate our models as the evacuee may be using equipment (e.g., a
normal bed) when they enter the highest priority level and only release the equipment once their new
equipment demand is met. We assume equipment demand is satisfied (except for medical support)
while the evacuees are in transit since assets are already equipped to a certain extent.
5.3.1.3 Modeling the Impact of Deprivation on the Evacuees
The idea of deprivation-based penalty costs (Pérez-Rodríguez and Holguín-Veras, 2015) is
to capture the fact that the longer an evacuee goes without having their basic needs (e.g., food and
water) met, the more impactful it is on the evacuee. For instance, six hours without water does not
have one-quarter of the impact on the human body that 24 hours without water does. Hence,
the deprivation cost is computed as an exponential-like function of the discrete deprivation time
(Holguín-Veras et al., 2013). Furthermore, note that assuming that met demands fully eliminate the
deprivation cost, which would imply that all the impact of being without resources is alleviated, is not
realistic.
To illustrate the hysteretic behavior of the deprivation cost function,
Fig 5.2 depicts the costs as the time without a resource increases. Suppose an evacuee has not been
provided with water for eight time periods (from A to C). For simplicity, also assume that if the
demand is satisfied at time t corresponding to period 8 (at point C), the deprivation time declines to
period 5, in other words, to point B. Note that the curves D-E and C-B are identical. This implies
that even though the demand is met, some amount of deprivation cost is still incurred due to the human
suffering caused by the high deprivation time.
Figure 5.2: Illustration of the deprivation cost function (adapted from Pérez-Rodríguez and Holguín-Veras (2015)). The horizontal axis shows deprivation time (in periods), the vertical axis shows deprivation cost, and points A, B, C, D, and E mark the hysteretic path, with the drop from C (s = 8) to B (s = 5) spanning three time periods.
Holguín-Veras et al. (2013) propose a continuous generic deprivation cost function, shown in
Eq 5.1:

γ(δ_it) = e^(1.5031 + 0.1172 δ_it) − e^(1.5031)    (5.1)
where δ_it is the deprivation time at node i, representing an evacuee, in time t. We will adapt this idea
to account for both the length of resource deprivation (defined as s_r) and equipment deprivation
(defined as s_e) and the priority level adjustment (defined as p). We note that when the demand of
an evacuee for resources is met, we may not decrease s_r all the way to one, in order to capture the
hysteretic behavior.
Eq 5.2 demonstrates how to compute the deprivation time as a function of s_r and s_e, and Eq
5.3 defines the adapted deprivation cost function.

δ_t = α s_r + (1 − α) s_e,    0 ≤ α ≤ 1    (5.2)

κ(p, δ_t) = e^(1.5031 + 0.1172 p δ_t) − e^(1.5031)    (5.3)
where α is a non-negative constant which is preferably set close to 1 to emphasize the importance of
s_r, since equipment deprivation is not nearly as impactful as resource deprivation. During so-called
shoulder-season MREs, where a lack of access to heat and shelter can have detrimental health
impacts (Mak et al., 2011), we can tune α appropriately.
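As a minimal sketch of evaluating Eqs (5.2) and (5.3), the following could be used; the default α = 0.9 is purely illustrative, not a calibrated parameter of our model:

```python
import math

def deprivation_time(s_r, s_e, alpha=0.9):
    # Eq. (5.2): weighted combination of resource (s_r) and
    # equipment (s_e) deprivation times; alpha near 1 stresses s_r.
    return alpha * s_r + (1 - alpha) * s_e

def deprivation_cost(p, delta_t):
    # Eq. (5.3): priority-adjusted exponential deprivation cost;
    # equals zero when the deprivation time is zero.
    return math.exp(1.5031 + 0.1172 * p * delta_t) - math.exp(1.5031)
```

Note the superlinear growth: 24 periods without resources costs far more than four times the cost of 6 periods, which is exactly the behavior the deprivation cost function is meant to capture.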
5.3.2 Objectives
There are many different criteria that may be used to evaluate the response. First, it is
necessary to examine the average evacuation time of the evacuees through the different "stages" of the
response efforts (i.e., off the ship and then out of the Arctic). Second, it is necessary to understand
the impact of the response on the evacuees, which will be measured through the use of deprivation
costs. Third, it may be necessary to understand the variable costs incurred during the response.
We now discuss each of these in more detail.
The average evacuation time consists of the time evacuees leave the cruise ship and the time
evacuees arrive at Anchorage. Given the fact that we are evacuating a distressed cruise ship, we
will enforce a penalty cost (in terms of time) for evacuees left on the cruise ship at the end of the
response horizon. The evacuees that are left on the cruise ship or in the Arctic communities would
still be evacuated, but outside of the "desired" target time of our planning horizon. In addition, we
seek to move evacuees out of the Arctic communities and, therefore, we impose a similar penalty for
evacuees in an Arctic community at the end of the response horizon. In most situations, it is likely
that the evacuation time criterion will be quite important since it helps measure when people return
to stable conditions.
During the evacuation, we aim to make sure that evacuees are properly taken care of,
as best as possible. To this end, we examine the current status of evacuees in each time
period. The current status of evacuees is modeled by a network called the "status
network" consisting of nodes (p, s_r, s_e), where p represents the priority level, s_r represents the time
without resources, and s_e represents the time without equipment. Each status is associated with
a deprivation cost, where higher priority levels and deprivation times imply higher costs (using Eq.
(5.3)).
Based on examining just these two criteria, we have a multi-criteria decision making problem
(MCDMP). We refer the reader to Triantaphyllou (2000) and Chankong and Haimes (2008) for more
details and further discussions on MCDMPs. In the MCDMP evacuation literature, work has used
the weighted sum method (Stepanov and Smith, 2009) and the ε-constraint method (Jenkins et al.,
2019). In preliminary modeling efforts, we considered the deprivation costs and evacuation times as
separate objectives and explored the efficient frontier between these two objectives using a weighted
objective. However, there were only two efficient solutions: (1) the one we present in this chapter,
which focuses on the evacuation objective and then does the best to support the evacuees during
the response, and (2) one in which evacuees stay on the ship as long as possible to consume
resources/equipment there since it is well-stocked. Solution (1) found that there was enough
time and available air cargo capacity to "prep" the villages for the incoming evacuees in order to
support their basic needs; solution (2) was not practical since there is a desire to move the evacuees
off the distressed ship as quickly as possible, which was confirmed by our partners. In rare events,
the evacuation might not begin immediately upon rescue ships arriving at the incident location, since
hasty evacuation might cause detrimental cascading events (e.g., during poor weather). In this case,
no evacuation decisions could be made until the poor weather lifted, and our model would "start"
once these decisions begin.
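For intuition, the efficient frontier over two minimization criteria can be extracted with a simple dominance filter; the candidate (deprivation cost, evacuation time) pairs below are purely hypothetical, not results from our model:

```python
def pareto_efficient(points):
    """Return the nondominated (deprivation cost, evacuation time)
    pairs, where lower is better in both criteria: a pair is kept
    unless some other pair is at least as good in both components."""
    return [s for s in points
            if not any(q != s and q[0] <= s[0] and q[1] <= s[1]
                       for q in points)]

# Hypothetical candidate responses (deprivation cost, evacuation time):
candidates = [(10, 50), (30, 20), (25, 60), (35, 25)]
print(pareto_efficient(candidates))  # [(10, 50), (30, 20)]
```

On these hypothetical candidates, only two points survive the filter, mirroring the situation described above in which the weighted objective produced only two efficient solutions.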
In general, passenger vessel evacuation principles and operations are codified in interna-
tional agreements through the International Maritime Organization (IMO), the branch of the United
Nations that regulates global maritime shipping. The IMO Polar Code (International Maritime Organization,
2016), to which the U.S. is a signatory, defines the international regulations for maritime
operations in the Arctic. The IMO (United Nations (2020)), in its Safety of Life at Sea principle,
states that human life takes precedence over all other considerations in an evacuation. In general,
we have followed this code in examining our modeling process, although we should discuss MRE
costs.
We now discuss the operational/physical costs incurred during the transportation of evacuees.
We note that much of these costs are paid pre-response (e.g., if the USCG responds, the
personnel in the response are salaried and, as a second example, stockpiles of dedicated response
resources are often maintained). In terms of variable costs, it is initially assumed that the responsible
party (RP), likely the operator of the cruise ship, will assume the costs of search, rescue, recovery
and salvage operations. However, the U.S. Oil Pollution Act of 1990 requires that when the RP is
not solvent, unable to assume the costs, or cannot be located, the event is federalized and the response
and rescue operations are funded federally, including through the Harbor Maintenance Trust Fund
(US EPA, 2020).
In terms of examining costs, we focus on variable costs associated with the response. For
example, if a plane carries relief commodities to a village and leaves the location without taking any
evacuees, we consider such an operation a "cost" since it has not moved evacuees out of the Arctic. On
the other hand, if the plane leaves its location with evacuees on board, then incorporating operating
costs into the objective would only change the trade-offs between costs and evacuation times if we chose
to increase the evacuation time portion of the objective. We ran some experiments and discovered
that when we restrict the number of air operations in which a plane goes into a village with resources
and/or equipment and leaves without picking anyone up, the solutions obtained remain the same
in every incident compared to the "original" setting (see Section 5.6.2). This implies that the model
produces the same objective with or without incorporating the operational costs of air flights. In
other words, even if we associated each air operation with a cost, the model would produce the same
or similar solutions in each incident as the ones we have obtained, unless we prioritized cost above
evacuation. A similar observation would occur should we begin limiting the number of ships used
in moving passengers from the cruise ship to the villages. Therefore, despite these costs being a
potential criterion to evaluate the response, they do not need to be examined in more detail in our
experiments.
5.3.3 Assumptions
We now discuss some of the underlying assumptions within our model. We examine a
deterministic planning environment which implies that the priority levels of evacuees as well as
the number of assets involved in the response are known in advance. We assume that there are
deployment times for the assets to model the fact that they may need to prepare to help with the
response. We assume partial allocation amongst evacuees within the same group (in a flow network)
and that equipment demand may be met in transit with the exception of medical equipment demand.
We also assume that there is no financial restriction on procurement and transportation of any
resource and equipment (e.g., see the discussion in Section 5.1). We also assume that there is a
location (for example, Anchorage, Alaska) that has enough resources to fully support evacuees once
they arrive there. This means the response for that evacuee is "complete." As for the cruise ship,
we assume that there is an adequate amount of resource and equipment stock for a certain amount
of time to take care of the evacuees' needs on board. We lastly assume that we will not distribute
resources for consumption during travel (i.e., in transit).
Transportation and allocation decisions are performed at the end of each time period. Hence,
the evacuation event is initiated at t = 1. If the resource demand is not satisfied for an evacuee in
a time period, then s_r will increase by 1 and may cause a "jump" in priority level. This assumption
is considered realistic since the resources to be dispatched (i.e., water and food) have vital importance
in terms of the impact on a human. If s_e > 1 and equipment demand is not met, then s_e will
increase by one; however, unmet equipment demand will not cause an increase in medical status. If s_e = 1, then
the evacuee has equipment and, therefore, we can view s_e = 1 as an absorbing state,
i.e., the equipment demand will remain satisfied. If resource demand is satisfied, then s_r will
decrease according to the flow arcs (Section 5.3.4) connecting (p, s_r, s_e) nodes in the status network.
If equipment demand is satisfied, then s_e is set to 1.
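A minimal sketch of this per-period update is shown below; the priority-jump threshold and the size of the hysteretic rollback are hypothetical illustration values, not calibrated model data (our model encodes these transitions via flow arcs, Section 5.3.4):

```python
def next_status(p, s_r, s_e, resource_met, equipment_met,
                jump_threshold=10, rollback=3):
    """One-period status update for an evacuee at a location.

    (p, s_r, s_e): priority level, periods without resources, and
    periods without equipment. `jump_threshold` and `rollback` are
    hypothetical parameters for illustration only.
    """
    if resource_met:
        # Hysteresis: s_r drops, but need not return all the way to 1.
        s_r = max(1, s_r - rollback)
    else:
        s_r += 1
        if s_r > jump_threshold:
            p += 1  # unmet resource demand can cause a priority "jump"
    if equipment_met or s_e == 1:
        s_e = 1  # s_e = 1 is an absorbing state
    else:
        s_e += 1  # unmet equipment demand never raises the priority level
    return p, s_r, s_e
```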
There are five decisions that can be implemented for an evacuee in a time period, which
determine their status in the next period: (i) an evacuee may receive all the required resources and
equipment; (ii) an evacuee may receive neither resources nor have their equipment demand met; (iii)
an evacuee may receive only the required resources but not have their equipment demand met; (iv)
an evacuee may have their equipment demand met but not be provided with resources; or (v) an
evacuee may be transported to another location, either a community or Anchorage, via an asset,
which implies that the equipment demand is satisfied. We create five sets of flow arcs to utilize in the
balance constraints (Section 5.4.2) in order to model the impact of these situations on the evacuees.
In the next section, the flow arcs designed to model the status of evacuees are introduced. We
further explicitly discuss how priority levels might change after each decision.
5.3.4 Flow Arcs
We design five different sets of flow arcs to understand how the five possible decisions
impacting an evacuee (based on resource and equipment allocation decisions) will impact their
status. Remember that each (p, sr, se) is represented as a node in the status network. An arc is
present between node (p, sr, se) and (p′, sr′, se′) if the decision represented by the corresponding arc
set causes a status change from (p, sr, se) to (p′, sr′, se′). We will use (p′, sr′, se′) to represent the
updated status of the evacuee. The first four arc sets are taken into consideration when an evacuee
is in a location. The sets of flow arcs are:
i. The Resource Satisfied Set (ERSS): When an evacuee receives only the required resources, the
ERSS is utilized to decide the (p′, sr′, se′) status of the evacuee in the following time period.
We will always have that p′ = p (since priority level cannot increase due to unmet equipment
demand) and sr′ < sr. If se = 1, then se′ = 1. Otherwise, se′ = se + 1.
ii. The Equipment Satisfied Set (EESS): The EESS is utilized when equipment is the only com-
modity allocated to an evacuee. In this case, we have that se′ = 1 and sr′ = sr + 1. The priority
level, however, may increase by 1, i.e., p ≤ p′ ≤ p + 1, if sr was the last time period before a
priority level jump. There is an exception, though: if p is the transition priority level
(meaning that a person needs medical support but has yet to be assigned to a medical shelter),
then satisfying the equipment demand also has the priority level jump to a different priority
level.
iii. Both Resource and Equipment Satisfied Set (EBSS): Both resource and equipment demands
are met. In this case, we have that p′ = p, sr′ ≤ sr (they may be equal if sr = 1), and se′ = 1.
iv. Both Resource and Equipment Non-Satisfied Set (EBNS): The worst-case scenario is not being
able to allocate any resource or equipment to an evacuee. The EBNS is utilized to determine
the (p′, sr′, se′) status of an evacuee if no resource or equipment is provided to them. We have
that sr′ = sr + 1, se′ = se + 1, and p ≤ p′ ≤ p + 1, where p′ = p + 1 if sr was the threshold for
the priority level jump.
v. Travel Set (ETS): It is unlikely that the evacuees would receive any resources while traveling on
rescue ships. Therefore, when an evacuee is being transported with an asset, the corresponding
sr increases based on the travel time. If an evacuee is being transported from location i to
location j via asset a with travel time τija ∈ Z+, then the evacuee's resource period increases
by τija and becomes sr + τija once the evacuee reaches location j. Note that se keeps increasing
only for those who are in either the highest priority level or transition priority level, since no
medical service can be provided on an asset. Lastly, the priority level might go up if sr reaches
the same bound set in the EESS.
5.3.4.1 Illustrative Example
We provide an illustrative example to elaborate on how the allocation decisions are con-
ducted and start our discussion with the assumptions specifically made for this example. We only
consider allocating one resource for the sake of simplicity. Thus, only resource allocation decisions
are taken into consideration: (i) an evacuee receives the required resource or (ii) an evacuee cannot
receive the required resource.
Note that all the decisions (i.e., transportation and allocation) are made at the end of each
time period. We are currently in time period 4 of the evacuation and seven evacuees have reached
communities. There are two priority levels and three communities. We focus on a single type of
resource (e.g., food), so we are only keeping track of sr, and travel between the communities requires
one time period.
Figure 5.3: Evacuees in community 1 at time 4 (grid of priority levels p = 1, 2 versus resource periods sr = 1, . . . , 5)
Figure 5.4: Evacuees in community 2 at time 4 (grid of priority levels p = 1, 2 versus resource periods sr = 1, . . . , 5)
Fig 5.3 and Fig 5.4 illustrate the status of the seven evacuees across the two communities
in which they are located. In Community 1, we have one evacuee with (p = 1, sr = 2), three evacuees
with (p = 1, sr = 3), and one with (p = 2, sr = 4). In Community 2, we have two evacuees with
(p = 2, sr = 3). In Community 1, there is enough food to satisfy three evacuees' demands. There is
not enough food in Community 2 to satisfy the demands of the evacuees.
In order to understand the (p,sr) status of the evacuees in the next time period, t = 5,
resource allocation decisions are made and evacuees move along arcs represented in Fig 5.5 and Fig
5.6. Note that the "priority jump" occurs when sr = 3.
The following allocation decisions were made in time period t = 4 resulting in the movements
pictured in Fig 5.7 and Fig 5.8:
Figure 5.5: Movements when resource demand is not satisfied (arcs on the p versus sr grid)
Figure 5.6: Movements when resource demand is satisfied (arcs on the p versus sr grid)
Figure 5.7: Evacuees in community 1 at time 5 (grid of priority levels p = 1, 2 versus resource periods sr = 1, . . . , 5)
Figure 5.8: Evacuees in community 3 at time 5 (grid of priority levels p = 1, 2 versus resource periods sr = 1, . . . , 5)
• In Community 1, food is allocated to the evacuee at node (p = 2, sr = 4) due to their higher
priority level and this person will transition to (p = 2, sr = 2) in the next time period. The
two other units of food are distributed to two evacuees in (p = 1, sr = 3), which prevents them
from jumping to the next priority level (see Fig 5.7).
• The third evacuee in Community 1 with (p = 1, sr = 3) is transported to Community 3 by a
plane. When this evacuee arrives in Community 3, his status level will be (p = 2, sr = 4) since
he reached the jump (see Fig 5.8). The evacuee in Community 1 with (p = 1, sr = 2) will not
receive food and therefore transitions to (p = 1, sr = 3) in the next time period.
• The two evacuees who are in Community 2 depart the community towards Community 1. They
do not receive food and, therefore, they will arrive in Community 1 in the next time period at
(p = 2, sr = 4). The logic behind such a transportation decision could be that the "grouping"
of evacuees will make it easier (and quicker) to move the group to Anchorage in the coming
time periods, thus making better use of plane capacities. For example, we may then choose
to transport all 6 evacuees in Community 1 to Anchorage in time period 5 (thus arriving in
time period 6), whereas moving the 2 evacuees directly to Anchorage would result in the other
4 (currently in Community 1) not arriving in Anchorage until time period 7, since we must
travel from Community 2 to Anchorage and then to Community 1 and then back to Anchorage.
The average evacuation time with the "grouping" would be 6, while the average evacuation time
without it would be 6.33.
The summary of all the decisions conducted between time 4 and 5, along with the consequences
of these decisions, is provided in Table 5.2. Note that a transportation decision also implies a
non-satisfied demand.

Table 5.2: Decisions conducted at the end of time 4 and their consequences

                         Evacuee 1  Evacuee 2  Evacuee 3  Evacuee 4    Evacuee 5  Evacuee 6    Evacuee 7
Beginning   Location     Comm. 1    Comm. 1    Comm. 1    Comm. 1      Comm. 1    Comm. 2      Comm. 2
of Time 4   Priority     1          1          1          1            2          2            2
            Period       2          3          3          3            4          3            3
End of      Decision     Non-Sat.   Satisfied  Satisfied  Transported  Satisfied  Transported  Transported
Time 4
Beginning   Location     Comm. 1    Comm. 1    Comm. 1    Comm. 3      Comm. 1    Comm. 1      Comm. 1
of Time 5   Priority     1          1          1          2            2          2            2
            Period       3          1          1          4            2          4            4
The importance of modeling these allocation decisions, rather than allowing a greedy allo-
cation of resources, is that it helps decrease the impact of the event on the evacuees. For example,
we have examined a test scenario in which 75 evacuees in the lowest priority level are in a village with
144 available units of a relief resource over a horizon of six periods. If a greedy allocation is used,
i.e., allocating this resource whenever there is a demand for it, then we obtain a deprivation cost
of 438.87 and 69 evacuees jump to the next priority level. On the other hand, when using our
optimization model to allocate resources, we observe that not a single evacuee jumps to the next
priority level and the total deprivation cost turns out to be nearly half of the greedy one (i.e., 283.89).
The use of the model allows us to allocate relief resources efficiently and identify the bottlenecks in
the logistical decisions.
5.4 An Optimization Model for Arctic MREs
We present the optimization model for Arctic MREs in this section. Our model and analysis
assumes that there is a centralized decision-maker (or, equivalently, full coordination and awareness
by all involved agencies). This is reasonable as we are using it to assess capability gaps and under-
stand where vulnerabilities exist in potential response efforts. This further means that we do not
need to specifically consider the areas of responsibility for an individual organization.
It is our goal to capture all features of the problem to truly identify "gaps" in response
capabilities. In our study, the majority of the parameters presented in Table 5.5 (e.g., airport
capacities, hosting capacities) can be gathered from existing data sources. In this regard, we
provide a wide range of what-if analyses (see Section 5.6.3) to understand key factors surrounding
policies within Arctic Alaska. The deprivation cost function and its parameters are hard
to estimate, but Holguín-Veras et al. (2013) discuss that simply including this type of cost function
is often sufficient for modeling purposes (as opposed to capturing its exact parameters).
The definitions of sets, variables, and parameters are shown in Table 5.3, Table 5.4, and
Table 5.5, respectively. Note that we use C. Ship, Anc., P. Shelter, and Med. for the cruise ship,
Anchorage, portable shelter, and medical support, respectively, as abbreviations.
Table 5.3: Set definitions

Set               Definition
A                 Transportation assets (fixed-wing aircraft, large and small ships)
Aa                Planes (fixed-wing aircraft)
CR                Consumable resources (water, food)
RE                Reusable equipment (portable shelters, sleeping bags, medical support)
T                 Time periods
Sr                Periods representing the amount of time passed without access to resources
Se                Periods representing the amount of time passed without access to equipment
C                 Locations (the cruise ship, communities, and Anchorage)
V                 Communities
P                 Priority levels
V                 Set of nodes of the status network, where each node is represented by priority level p ∈ P
                  and periods sr ∈ Sr, se ∈ Se
ERSS              Set of arcs showing the transitions between each pair of nodes u = (pi, srj, sek) and
                  v = (pl, srm, sen), where u, v ∈ V, for satisfied resource and non-satisfied equipment demands
EESS              Set of arcs showing the transitions between each pair of nodes u = (pi, srj, sek) and
                  v = (pl, srm, sen), where u, v ∈ V, for non-satisfied resource and satisfied equipment demands
EBSS              Set of arcs showing the transitions between each pair of nodes u = (pi, srj, sek) and
                  v = (pl, srm, sen), where u, v ∈ V, for satisfied resource and equipment demands
EBNS              Set of arcs showing the transitions between each pair of nodes u = (pi, srj, sek) and
                  v = (pl, srm, sen), where u, v ∈ V, for non-satisfied resource and equipment demands
ETSτ              Set of arcs showing the transitions between each pair of nodes u = (pi, srj, sek) and
                  v = (pl, srm, sen), where u, v ∈ V, in τ time periods of transit
ARSN(p, sr, se)   The set ARSN(p, sr, se) = {(p′, sr′, se′) | ((p′, sr′, se′), (p, sr, se)) ∈ ERSS}
AESN(p, sr, se)   The set AESN(p, sr, se) = {(p′, sr′, se′) | ((p′, sr′, se′), (p, sr, se)) ∈ EESS}
ABSN(p, sr, se)   The set ABSN(p, sr, se) = {(p′, sr′, se′) | ((p′, sr′, se′), (p, sr, se)) ∈ EBSS}
ABNN(p, sr, se)   The set ABNN(p, sr, se) = {(p′, sr′, se′) | ((p′, sr′, se′), (p, sr, se)) ∈ EBNS}
ATNτ(p, sr, se)   The path set ATNτ(p, sr, se) = {(p′, sr′, se′) | ((p′, sr′, se′), (p, sr, se)) ∈ ETSτ}, where
                  d((p′, sr′, se′), (p, sr, se)) = τ states that there exist paths of length τ from (p′, sr′, se′)
                  to (p, sr, se)
5.4.1 Objective function
The objective function of our mass rescue operation model is:
Table 5.4: Variable definitions

Variable      Definition
Irit          the amount of resource r ∈ CR in location i ∈ C at time t ∈ T
Beit          the amount of equipment e ∈ RE in location i ∈ C at time t ∈ T
grijat        the amount of resource r ∈ CR sent from location i ∈ C to location j ∈ C via asset a ∈ A at time t ∈ T
heijat        the amount of equipment e ∈ RE sent from location i ∈ C to location j ∈ C via asset a ∈ A at time t ∈ T
fpsrseijat    the number of people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se sent from location i ∈ C to
              location j ∈ C via asset a ∈ A at time t ∈ T
Xait          whether asset a ∈ A is in location i ∈ C at time t ∈ T; Xait = 1 if a is in i, Xait = 0 otherwise
Zait          whether asset a ∈ A stays in location i ∈ C at time t ∈ T; Zait = 1 if a stays in i, Zait = 0 otherwise
Yaijt         whether asset a ∈ A leaves location i ∈ C at time t ∈ T heading towards location j ∈ C; Yaijt = 1 if a
              departs, Yaijt = 0 otherwise
Deipsrset     the amount of equipment e ∈ RE in location i ∈ C used for people in priority p ∈ P with periods
              sr ∈ Sr, se ∈ Se at time t ∈ T
Kripsrset     the amount of resource r ∈ CR in location i ∈ C used for people in priority p ∈ P with periods
              sr ∈ Sr, se ∈ Se at time t ∈ T
Qpsrseit      the number of people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se who require resource and
              equipment in location i ∈ C at time t ∈ T
BSpsrseit     the number of people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se whose resource and equipment
              demand is met in location i ∈ C at time t ∈ T
BNpsrseit     the number of people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se whose resource and equipment
              demand is not met in location i ∈ C at time t ∈ T
ESpsrseit     the number of people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se whose resource demand is not
              met, while equipment demand is met, in location i ∈ C at time t ∈ T
RSpsrseit     the number of people in priority p ∈ P with periods sr ∈ Sr, se ∈ Se whose resource demand is met,
              while equipment demand is not met, in location i ∈ C at time t ∈ T
Table 5.5: Parameter definitions

Parameter     Definition
αrp           the amount of resource r ∈ CR required to satisfy an evacuee's demand in priority p ∈ P
ζep           the amount of equipment e ∈ RE required to satisfy an evacuee's demand in priority p ∈ P
ρri           the amount of resource r ∈ CR positioned in location i ∈ C at time t = 1
ξei           the amount of equipment e ∈ RE positioned in location i ∈ C at time t = 1
νpsrsei       the number of evacuees in priority p ∈ P with periods sr ∈ Sr, se ∈ Se in location i ∈ C at time t = 1
µa            the maximum cargo capacity of asset a ∈ A
Ψa            the maximum passenger capacity of asset a ∈ A
σai           whether location i ∈ C is the closest location to asset a ∈ A at time t = 1
Ωa            the travel time of asset a ∈ A to the closest location
θai           whether asset a ∈ A can land at location i ∈ C; θai = 1 if a can land, θai = 0 otherwise
τaij          the travel time of asset a ∈ A from location i ∈ C to location j ∈ C
ωr            the weight of one unit of resource r ∈ CR
εe            the weight of one unit of equipment e ∈ RE
χi            the ground capacity of location i ∈ C
κpsrse        the deprivation cost for a person in priority p ∈ P with periods sr ∈ Sr, se ∈ Se
Λpsrseijl     the cumulative in-transit deprivation cost for an evacuee in priority p ∈ P with periods sr ∈ Sr,
              se ∈ Se traveling from location i ∈ C to location j ∈ C, which takes l time units
ψi            the available capacity of location i ∈ C to host evacuees
γi            the available capacity of public spaces in location i ∈ C
p′max         the transition priority level, in which an evacuee needs medical support but has yet to be assigned
              to a medical facility
tlim          the earliest time period when an evacuee can be in the transition priority
pmax          the highest priority level
minimize:   ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} ∑_{i∈C\{'Anc.'}} ∑_{t∈T} κpsrse Qpsrseit + ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} ∑_{i∈C} ∑_{j∈C} ∑_{a∈A} ∑_{t∈T} Λpsrseijτija fpsrseijat   (5.1)

+ ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} ∑_{i∈V} 2|T| Qpsrsei|T| + ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} 3|T| Qpsrse'C. Ship'|T|

+ ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} ∑_{i∈C} ∑_{a∈A} ∑_{t∈T} (t + τi'Anc.'a) fpsrsei'Anc.'at + ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} ∑_{j∈C} ∑_{a∈A} ∑_{t∈T} t fpsrse'C. Ship'jat
The objective function has six components. The first two components examine the total
deprivation costs associated with evacuees in each location excluding Anchorage and in transit,
respectively. The following two components help to drive evacuees, if possible, to Anchorage and off
of the cruise ship, respectively, by incurring penalties for those that remain in the communities or
on the cruise ship at the end of the planning horizon. The fifth component is focused on the total
evacuation time of evacuees arriving in Anchorage while the sixth component is focused on the total
evacuation time of moving people off of the cruise ship.
5.4.2 Constraints
We present the constraints based on two categories: those on how we use the assets to move
evacuees and resources and those modeling the allocation decisions and their impact on the status
on the evacuees.
5.4.2.1 Asset Constraints
The asset constraints presented in this section are grouped into two categories. The first
group contains capacity-based constraints, and the second focuses on initial assignments and routing
of the assets.
Capacity Constraints
∑_{e∈RE\{'Med.'}} εe heijat + ∑_{r∈CR} ωr grijat ≤ µa Yaijt   ∀a ∈ Aa, ∀i ∈ C, ∀j ∈ C, ∀t ∈ T   (5.2)
Constraint (5.2) ensures that the total weight of resources and equipment carried by a plane does
not exceed its capacity. Medical support is not considered due to the fact that we assume it cannot
be transported.
∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} fpsrseijat ≤ Ψa Yaijt   ∀a ∈ A, ∀i ∈ C, ∀j ∈ C, ∀t ∈ T   (5.3)

Constraint (5.3) ensures that an asset cannot carry more evacuees than its passenger capacity.
Constraint (5.3) ensures that an asset cannot carry more evacuees than its passenger capacity.
∑_{a∈Aa} Xait ≤ χi   ∀i ∈ C, ∀t ∈ T   (5.4)
Constraint (5.4) guarantees that the number of planes landing at an airport does not violate the
airport capacity in a location (i.e., the communities and Anchorage) during a time period.
Positioning and Travel Constraints
Xait = 0   ∀a ∈ A, ∀i ∈ C, ∀t ∈ T : t < Ωa   (5.5)

Xai(t=Ωa) = σai   ∀a ∈ A, ∀i ∈ C   (5.6)
Constraints (5.5) and (5.6) make the initial assignment of each asset by taking the deployment times
into consideration. This ensures that the asset goes to the closest (acceptable) community to prepare
for deployment.
∑_{i∈C} Xait ≤ 1   ∀a ∈ A, ∀t ∈ T   (5.7)
Constraint (5.7) implies that an asset can be located in at most one location during a time period.
Yaijt ≤ θaj   ∀a ∈ A, ∀i ∈ C, ∀j ∈ C, ∀t ∈ T   (5.8)
Constraint (5.8) prevents an asset from landing at locations not meeting its required specifications.
Xait = Zait + ∑_{j∈C} Yaijt   ∀a ∈ A, ∀i ∈ C, ∀t ∈ T   (5.9)
Constraint (5.9) ensures that in each time period, an asset either stays in its location or travels to
another one.
Xait = Zai(t−1) + ∑_{j∈C} Yaji(t−τaji)   ∀a ∈ A, ∀i ∈ C, ∀t ∈ T \ {1}   (5.10)
Constraint (5.10) ensures that if an asset is at location i at time t, then either asset a stayed in
location i at time t − 1 or asset a left location j at time t − τaji to arrive at location i.
5.4.2.2 Resource and equipment allocation and its impact on the status of the evacuees
The constraints introduced in this section capture the resource and equipment allocation
decisions to the evacuees and the influence of this allocation on their status.
Resource and Equipment Balance Constraints
Iri(t=1) = ρri − ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} Kripsrse(t=1)   ∀r ∈ CR, ∀i ∈ C   (5.11)
Constraint (5.11) initiates the resource inventories in each location at t = 1. Note that no trans-
portation decision is conducted during the first time period. This is because each asset takes at least
one time unit to be assigned to the initial locations (i.e., deployment time).
Bei(t=1) = ξei − ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} Deipsrse(t=1)   ∀e ∈ RE \ {'P. Shelter'}, ∀i ∈ C   (5.12)

B'P. Shelter'i(t=1) = γi + ξ'P. Shelter'i − ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} D'P. Shelter'ipsrse(t=1)   ∀i ∈ C   (5.13)
Constraints (5.12) and (5.13) position equipment in each location at t = 1, incorporating the public
spaces located in each community into its "shelter" inventory level (Constraint (5.13)).
Irit + ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} Kripsrset + ∑_{j∈C} ∑_{a∈A} grijat = Iri(t−1) + ∑_{j∈C} ∑_{a∈A} grjia(t−τaji)   (5.14)
∀r ∈ CR, ∀i ∈ C, ∀t ∈ T \ {1}
Constraint (5.14) is the resource inventory balance equation. At time t, the inventory level in each
location is equal to the amount of the resources remaining from the previous time period and the
resources transported from other locations. Furthermore, resources in the current location can be
carried to the other locations at time t and can be distributed to the evacuees.
Beit + ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} Deipsrset + ∑_{j∈C} ∑_{a∈A} heijat = Bei(t−1) + ∑_{j∈C} ∑_{a∈A} hejia(t−τaji)   (5.15)
+ ∑_{p∈P\{pmax, p′max}} ∑_{sr∈Sr} ∑_{j∈C} ∑_{a∈A} fpsr(se=1)ija(t−1) + ∑_{sr∈Sr} ∑_{se∈Se} ∑_{j∈C} ∑_{a∈A} fp′maxsrseija(t−1)
+ ∑_{sr∈Sr} ∑_{se∈Se} BSp′maxsrsei(t−1) + ∑_{sr∈Sr} ∑_{se∈Se} ESp′maxsrsei(t−1)
∀e ∈ RE \ {'Med.'}, ∀i ∈ C, ∀t ∈ T \ {1}
B'Med.'it + ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} D'Med.'ipsrset = B'Med.'i(t−1) + ∑_{sr∈Sr} ∑_{j∈C} ∑_{a∈A} fpmaxsr(se=1)ija(t−1)   (5.16)
∀i ∈ C, ∀t ∈ T \ {1}
Constraints (5.15) and (5.16) are equipment inventory balance equations similar to Constraint
(5.14). However, since equipment is considered non-consumable, the equipment of those who depart
the location during the previous time period becomes available at time t. Further, recall that those
in the transition priority (i.e., p′max) will release their "normal" equipment once they are assigned
the medical support necessary for their priority level (e.g., they will move from a bed to a bed in
the medical center). Further, as mentioned previously, medical support is only provided in medical
centers, which are non-transportable. Hence, an individual equipment balance constraint (see
Constraint (5.16)) is generated for medical support.
B'P. Shelter'it + ∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} D'P. Shelter'ipsrset + ∑_{p∈P\{pmax, p′max}} ∑_{sr∈Sr} BSpsr(se=1)it   (5.17)
+ ∑_{p∈P\{pmax, p′max}} ∑_{sr∈Sr} ESpsr(se=1)it + ∑_{sr∈Sr} ∑_{se∈Se} BNp′maxsrseit + ∑_{sr∈Sr} ∑_{se∈Se} RSp′maxsrseit ≥ γi
∀i ∈ C, ∀t ∈ T
Constraint (5.17) ensures that public spaces are not transported to other communities by ensuring
that the total capacity of a location never goes below the true capacity. In particular, the left-hand
side of the constraint sums up the inventory of shelter at location i carrying over into the next period,
the amount of shelter assigned in t, the number of people in normal priority levels (not pmax or p′max)
that currently have shelter (se = 1), and the number of people at p′max that are currently using the
normal shelter. The constraint ensures this summation is greater than or equal to the capacity of
public space.
Evacuees Balance Constraints
Qpsrsei(t=1) = νpsrsei   ∀p ∈ P, ∀sr ∈ Sr, ∀se ∈ Se, ∀i ∈ C   (5.18)
Constraint (5.18) assigns the initial populations in each location. Clearly, the ship is the only
location where evacuees are located at t = 1. We further constrain the number of evacuees that can
be in an Arctic community at a particular time:
∑_{p∈P} ∑_{sr∈Sr} ∑_{se∈Se} Qpsrseit ≤ ψi   ∀i ∈ C, ∀t ∈ T   (5.19)
We now describe the constraints governing the transitions of the evacuees into different statuses
both out of and into different time periods. We first present the constraints where se = 1, i.e., the
equipment demand has already been met, for all priorities besides the transition priority (p′max).
Qpsr(se=1)it = BSpsr(se=1)it + ESpsr(se=1)it + ∑_{j∈C} ∑_{a∈A} fpsr(se=1)ijat   ∀p ∈ P \ {p′max},   (5.20)
∀sr ∈ Sr, ∀i ∈ C, ∀t ∈ T

Qpsr(se=1)it = ∑_{(p′,sr′,se′)∈ABSN(p,sr,se=1)} BSp′sr′se′i(t−1) + ∑_{(p′,sr′,se′)∈AESN(p,sr,se=1)} ESp′sr′se′i(t−1)   ∀p ∈ P \ {p′max},   (5.21)
∀sr ∈ Sr, ∀i ∈ C, ∀t ∈ T \ {1}
Constraint (5.20) implies that after receiving the required equipment, evacuees either stay in se = 1
by continuing to "receive equipment" (i.e., once equipment demand is met, no extra allocation
decision is made after the first assignment) or they move to another location. Constraint (5.21)
states that evacuees can be in the absorbing equipment state if and only if they receive the required
equipment (i.e., BSpsrseit and ESpsrseit) during the previous time period. As mentioned before,
evacuees cannot arrive at a location with se = 1 and, therefore, transportation decisions are not
included in Constraints (5.20)-(5.21). We now turn our attention to the constraints for se ≠ 1 and
all priorities besides the transition priority (p′max).
Qpsrseit = BSpsrseit + BNpsrseit + ESpsrseit + RSpsrseit + ∑_{j∈C} ∑_{a∈A} fpsrseijat   (5.22)
∀p ∈ P \ {p′max}, ∀sr ∈ Sr, ∀se ∈ Se \ {1}, ∀i ∈ C, ∀t ∈ T

Qpsrseit = ∑_{(p′,sr′,se′)∈ABNN(p,sr,se)} BNp′sr′se′i(t−1) + ∑_{(p′,sr′,se′)∈ARSN(p,sr,se)} RSp′sr′se′i(t−1)   (5.23)
+ ∑_{j∈C} ∑_{a∈A} ∑_{(p′,sr′,se′)∈ATNτaji(p,sr,se)} fp′sr′se′jia(t−τaji)   ∀p ∈ P \ {p′max}, ∀sr ∈ Sr, ∀se ∈ Se \ {1}, ∀i ∈ C,
∀t ∈ T \ {1}
Constraint (5.22) indicates that any of the five allocation and/or transportation decisions can be
made for evacuees with p ≠ p′max and se ≠ 1: they can have both their demands satisfied, BSpsrseit,
they can have neither demand satisfied, BNpsrseit, they can have just their equipment demand
satisfied, ESpsrseit, they can have just their resource demand satisfied, RSpsrseit, or they can be
transported out of i, fpsrseijat. Constraint (5.23) captures how evacuees can end up in location i at
time t with a particular status where p ≠ p′max and se ≥ 2: they can have both demands unsatisfied,
they can have just their resource demand satisfied, or they can arrive from another location. We
now present the constraints governing the behavior of evacuees with the transition priority level,
p′max.
Qp′maxsrseit = BSp′maxsrseit + BNp′maxsrseit + RSp′maxsrseit + ESp′maxsrseit   (5.24)
+ ∑_{j∈C} ∑_{a∈A} fp′maxsrseijat   ∀sr ∈ Sr, ∀se ∈ Se, ∀i ∈ C, ∀t ∈ T \ {1, . . . , tlim}

Qp′maxsrseit = ∑_{(p′,sr′,se′)∈ABNN(p′max,sr,se)} BNp′sr′se′i(t−1) + ∑_{(p′,sr′,se′)∈ARSN(p′max,sr,se)} RSp′sr′se′i(t−1)   (5.25)
+ ∑_{(p′,sr′,se′)∈AESN(p′max,sr,se)} ESp′sr′se′i(t−1)   ∀sr ∈ Sr, ∀se ∈ Se, ∀i ∈ C, ∀t ∈ T \ {1, . . . , tlim}
The first difference is that once equipment demand is met in the transition priority level, then we
move the evacuee into the highest demand level (recall that the transition priority level is meant to
represent an evacuee who already has "normal" equipment demand met but then requires "normal
plus medical" equipment demand). The second difference in these constraints is that no evacuee can
reach a location in a transition priority, thus altering Constraint (5.25). There are two reasons for
this: i) equipment demand of evacuees in normal priority levels is satisfied in transit, ii) evacuees
in the transition priority getting on an asset transition to the highest priority level due to releasing
the current equipment being held. One can arrive into the status by having their equipment demand
satisfied if they were making the jump to the transition priority level.
Allocating Resources and Equipment to Demand Constraints
αrp (BSpsrseit + RSpsrseit) = Kripsrset   ∀p ∈ P \ {p′max}, ∀r ∈ CR, ∀sr ∈ Sr, ∀se ∈ Se \ {1}, ∀i ∈ C, ∀t ∈ T   (5.26)

αrp BSpsr(se=1)it = Kripsr(se=1)t   ∀p ∈ P \ {p′max}, ∀sr ∈ Sr, ∀r ∈ CR, ∀i ∈ C, ∀t ∈ T   (5.27)
Constraints (5.26) and (5.27) connect the satisfied flow decisions for resources, for evacuees in
location i at time t with a certain status, to the allocation decisions made for evacuees in that
location at that time with that status. Note that since Constraint (5.27) is created for those who
are in se = 1, it does not contain the RS component, since the equipment demand is already satisfied
for those with se = 1.
αrp′max (BSp′maxsrseit + RSp′maxsrseit) = Krip′maxsrset   ∀r ∈ CR, ∀sr ∈ Sr, ∀se ∈ Se, ∀i ∈ C,   (5.28)
∀t ∈ T \ {1, . . . , tlim}

ζep′max (BSp′maxsrseit + ESp′maxsrseit) = Deip′maxsrset   ∀e ∈ RE, ∀sr ∈ Sr, ∀se ∈ Se, ∀i ∈ C,   (5.29)
∀t ∈ T \ {1, . . . , tlim}
Constraints (5.28) and (5.29) connect the satisfied flow decisions for resources and equipment, re-
spectively, for evacuees in a location at a time t with a certain status in the transition priority level,
to the amount that is allocated to evacuees with that status in the transition priority level at that
location at that time.
ζep (BSpsrseit + ESpsrseit) = Deipsrset   ∀p ∈ P \ {p′max}, ∀sr ∈ Sr, ∀se ∈ Se \ {1}, ∀e ∈ RE,   (5.30)
∀i ∈ C, ∀t ∈ T
Constraint (5.30) ensures that equipment is allocated to satisfy the demand of those evacuees who
will have their equipment demands satisfied in location i at time t. The last constraints focus on
variable restrictions.
Irit, Beit, Kripsrset, Deipsrset, grijat, heijat, fpsrseijat, Qpsrseit, BSpsrseit, BNpsrseit,   (5.31)
ESpsrseit, RSpsrseit, mijat, wit ∈ Z+   ∀r ∈ CR, ∀e ∈ RE, ∀i ∈ C, ∀p ∈ P, ∀sr ∈ Sr, ∀se ∈ Se, ∀a ∈ A, ∀t ∈ T

Xait, Zait, Yaijt ∈ {0, 1}   ∀a ∈ A, ∀i ∈ C, ∀j ∈ C, ∀t ∈ T   (5.32)
5.5 Overview of Solution Methodologies
The mathematical model is a large-scale IP that has characteristics similar to problems in
evacuation and resource allocation. It is, therefore, important to recognize that solving our model
directly with a commercial solver may be time-prohibitive and that customized solution approaches
may be necessary. We describe two heuristic approaches for identifying quality solutions quickly.
As shown in the Appendix, solving the IP using a warm-start heuristic solution outperforms solving
the IP directly (see Section A.2).
5.5.1 Conservative One-by-One Heuristic (COBOH)
We approach the problem by asking the following question: How can the model be solved if
we consider the problem through a practitioner's eyes? The focus would likely be on allocating
assets to move the evacuees around and then using available capacity to bring relief commodities
when possible. A practitioner would naturally carry the evacuees to the closest available villages
via the available ships in a greedy manner. The practitioner would use the planes to transport
everyone to Anchorage. In other words, first a ship carries a certain number of evacuees to a village,
then a plane takes action and carries the evacuees to Anchorage at some point later in time. This
pair of operations can be repeated in an iterative way by taking all the capacity constraints into
consideration.
This heuristic focuses on the transportation decisions only; we then optimize the resource
allocation decisions with fixed transportation decisions. In other words, we look at the best possible
use of the response resources once we know when we planned on evacuating passengers from the
cruise ship to the villages and from the villages to Anchorage. Therefore, we are examining the best
possible resource allocation decisions whereas, in practice, triage and rationing may be implemented
to make these decisions. The insights that we expect to obtain from the heuristic approach are:
i) when we really need an OR model for an Arctic MRE, and ii) the benefits of applying the complex
model to determine the response decisions rather than focusing solely on the transportation decisions.
The pseudocode of the heuristic can be found in the Appendix (see Section A.1). Here, we focus on
explaining the heuristic in an informal way.
Each asset is assigned to the initial location by taking the deployment times into consid-
eration. As a second step, all available ships are routed towards the cruise ship to assist with the
evacuation. We then start our iterative method and examine the ship and plane sets in sequence.
For every ship, we calculate the maximum number of evacuees that can be transported to
each village reachable from the cruise ship, together with the earliest arrival time. A higher ratio
of evacuees carried to arrival time implies that we can carry more evacuees within a shorter time
period to a village. Each ship links with the one village which has the highest ratio for that ship.
The number of evacuees that can be carried to a village depends on the hosting capacity of the
village, the passenger capacity of the ship, and the number of evacuees currently at the village.
Once every ship is associated with a village, we pick the ship with the earliest arrival time. In the
case of a tie, we prefer the ship with the higher ratio.
We then proceed to make a transportation decision for a plane. Each plane is examined
one by one and a similar analysis is conducted with slight changes. For every village, we compute
the minimum positive population over the remainder of the evacuation time horizon after the
earliest possible departure time. The calculation of the minimum number of evacuees that can
be transported from a village is a significant step since it ensures feasibility with respect to the
population numbers in the locations.
The number of evacuees that can be carried by the plane is set as the minimum of the village's
population of evacuees and the passenger capacity of the plane. Then, we essentially proceed in
the same way as the ship portion of the heuristic except that we also examine the airport capacities
and make sure that the constraint related to airport capacities is not violated. If a plane cannot
be associated with any village, then there is no way to utilize the plane for the remainder of the
horizon. In this case, we make an idle transportation decision for that plane. The asset either stays
in its current location or moves to another location while checking the airport capacity.
If a ship or a plane reaches the end of the time horizon, we eliminate the corresponding asset
from consideration. The iterative method continues until both the ship and plane sets become
empty, implying that either no more transportation decisions are required or all evacuees have arrived in
Anchorage. Once the heuristic terminates, we obtain a full set of transportation variables. We refer
to this heuristic as the one-by-one heuristic since we allocate assets individually, in what is
essentially a greedy manner.
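A highly simplified skeleton of the outer loop, with the per-asset decision logic abstracted into hypothetical callbacks, might look as follows. This is a sketch of the iteration structure only, not the dissertation's pseudocode:

```python
def one_by_one(ships, planes, horizon, step_ship, step_plane):
    """Outer loop of a one-by-one style heuristic (illustrative sketch).

    step_ship / step_plane are hypothetical callbacks that commit one
    transportation decision for an asset and return the asset's next
    free time, or None when the asset can no longer be used.
    Assets are dropped once they are done or the horizon is reached.
    """
    while ships or planes:
        for pool, step in ((ships, step_ship), (planes, step_plane)):
            for asset in list(pool):       # iterate over a copy while removing
                t = step(asset)
                if t is None or t >= horizon:
                    pool.remove(asset)     # asset finished or out of horizon
```

The loop terminates once both asset pools are empty, matching the stopping condition described above.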
5.5.2 Optimizing Transportation Heuristic (OTH)
In the second method, we focus on optimizing just the transportation decisions, prioritizing
the evacuation first and then optimizing support decisions based on these "greedy" evacuation
decisions. In particular, we move all the transportation variables related to the assets and evacuees
(i.e., X_{aij}, Y_{aijt}, and f_{p,s_r,s_e,i,j,a,t}) into an optimization problem and ignore the ones related to the relief
materials (i.e., g_{rijat} and h_{eijat}). With this conversion, we intend to optimize the transportation
decisions. We first define two new variables, shown in Table 5.6, that focus on the number of people in
a location and/or being transported, and we "remove" the f_{p,s_r,s_e,i,j,a,t} variables since they are dictated
by relief/support decisions.
Table 5.6: New variables defined

Variable  | Definition
w_{it}    | the number of people staying in location i ∈ C at time t ∈ T
m_{ijat}  | the number of people leaving location i ∈ C to go to location j ∈ C at time t ∈ T
We rearrange the last four components of Objective Function (5.1) and update the
correlated constraints. Then, the modified IP focusing on evacuation (EvacIP) can
be represented as:
(EvacIP):  min  ∑_{i∈C} ∑_{a∈A} ∑_{t∈T} (t + τ_{i,Anc,a}) m_{i,Anc,a,t} + ∑_{j∈C} ∑_{a∈A} ∑_{t∈T} t · m_{C.Ship,j,a,t}
                + ∑_{i∈V} 2|T| w_{i,|T|} + 3|T| w_{C.Ship,|T|}                                        (5.33)

s.t.  m_{ijat} ≤ Ψ_a Y_{aijt},                              ∀a ∈ A, ∀i ∈ C, ∀j ∈ C, ∀t ∈ T           (5.34)

      w_{i,1} = ∑_{p∈P} ∑_{s_r∈S_r} ∑_{s_e∈S_e} ν_{p,s_r,s_e,i},                ∀i ∈ C               (5.35)

      w_{it} + ∑_{j∈C} ∑_{a∈A} m_{ijat} ≤ ρ_i,                        ∀i ∈ C, ∀t ∈ T                 (5.36)

      w_{it} + ∑_{j∈C} ∑_{a∈A} m_{ijat} = w_{i,t−1} + ∑_{j∈C} ∑_{a∈A} m_{j,i,a,t−τ_{aji}},   ∀i ∈ C, ∀t ∈ T   (5.37)

      Constraints (5.4)–(5.10), (5.32)

      w_{it} ∈ Z_+,                                          ∀i ∈ C, ∀t ∈ T                          (5.38)

      m_{ijat} ∈ Z_+,                                        ∀i ∈ C, ∀j ∈ C, ∀a ∈ A, ∀t ∈ T.        (5.39)
Note that Constraints (5.34), (5.35), and (5.36) replace Constraints (5.3), (5.18), and (5.19),
respectively, and play the same role. The left-hand side of Constraint (5.37) captures the number of
evacuees staying in and leaving location i at time t. The right-hand side of the constraint determines: i)
how many evacuees stayed in location i at time t − 1, and ii) how many evacuees left other locations
to arrive at i at time t − τ_{aji}.
In particular, EvacIP answers the following question: What happens when we prefer to
focus on the evacuation decisions without worrying about distributing any relief resources? We then
use the model as a heuristic approach and warm start the original IP model via the partial solution
obtained through EvacIP.
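Constraint (5.37) is a standard flow-conservation identity, and it can be sanity-checked on toy data. The function below and its data layout are our own illustration (a single asset, for brevity), not part of the EvacIP implementation:

```python
def balanced(w, m, tau):
    """Check the evacuee flow balance of Constraint (5.37) on toy data.

    w[i][t]    : evacuees staying in location i at time t
    m[i][j][t] : evacuees leaving i for j at time t (single asset)
    tau[j][i]  : travel periods from j to i
    """
    locs = list(w)
    horizon = len(w[locs[0]])
    for i in locs:
        for t in range(1, horizon):
            # People at i at time t, plus those departing i at time t ...
            out = w[i][t] + sum(m[i][j][t] for j in locs if j != i)
            # ... must equal those staying from t-1 plus arriving flows.
            arrivals = sum(
                m[j][i][t - tau[j][i]]
                for j in locs if j != i and t - tau[j][i] >= 0
            )
            if out != w[i][t - 1] + arrivals:
                return False
    return True
```

For example, two locations A and B with one period of travel between them: if 2 evacuees leave A at time 0, they appear at B at time 1, and the balance holds at every location and period.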
5.6 Computational Study: Data Set Description and Baseline
Analysis
The objective of our computational study is to analyze different potential response events
in Arctic Alaska and obtain insights into policy questions by applying our novel MRE model. All
the experiments are conducted in the Optimization Programming Language (OPL) using CPLEX
12.8.1 as the IP solver on a Dell machine with an Intel Core i7-8700 CPU at 3.20 GHz and 64 GB RAM.
5.6.1 Case Study Description
In this section, we discuss the data collected for our case study. We utilize online sources
that are publicly available from institutions operating in Arctic Alaska. We have discussed the
application and data with experts from the region. The case studies were also created based on
discussions with District 17 of the USCG, although the data and model have yet to be fully verified
and validated with them.
We separate our test cases into five incident locations. The purpose is to identify the areas
where there are "capability gaps" for MREs from Anchorage through the Northwest Passage. The
largest cruise ship that has entered the region is the Crystal Serenity (Waldholz, 2016), and we have
used its planned route, shown in Figure 5.9, for selecting the five incident locations. We are interested in
the region starting from the Bering Strait, through the Chukchi Sea, and into the Beaufort Sea. The
number of evacuation time periods is set equal to sixteen and there are three priority levels; we
assume that 65%, 25%, and 10% of the evacuees are at levels 1, 2, and 3, respectively. We consider MREs
where there are 800, 1200, and 1600 people on the cruise ship.
Incident | Location      | Coordinates
1        | Bering Strait | 65°50'59.9"N 168°27'31.7"W
2        | Chukchi Sea   | 67°24'08.0"N 167°56'48.3"W
3        | Chukchi Sea   | 69°44'00.7"N 166°54'59.8"W
4        | Chukchi Sea   | 71°15'24.1"N 160°23'27.4"W
5        | Beaufort Sea  | 71°26'05.8"N 154°57'50.3"W

Figure 5.9: Incident locations selected on the Crystal Serenity's planned route (Waldholz, 2016)
In our data, Utqiagvik, Nome, Kotzebue, Point Hope, Point Lay, and Wainwright are the
communities where the evacuees can be housed in an Arctic MRE. Note that Point Hope, Point Lay,
and Wainwright are relatively small (i.e., ones that have a population of fewer than 1000 people)
but are included since they are located in the North Slope Borough, which has a robust emergency
management department (Brooks, 2020). Each community has an airport implying that it is feasible
to take off and land there via certain planes (Federal Aviation Administration, 2019). It would be
inappropriate for large planes to use airports in the small villages due to their short runways. For
instance, Point Hope, Point Lay, and Wainwright each have a single runway of no longer than 4,500
feet (GCR, 2017), which does not meet the normal landing requirements of the HC-130H or Boeing
737-700. Communities also have a number of small boats and vehicles that can be used for local
transport (i.e., to shuttle evacuees from offshore ships to shore or to help move them from the shore
to the airport). Since there are no capacity issues regarding such local transportation operations, we
do not model them. We also consider the potential use of the inland community of Atqasuk as a
pre-positioning site for resources and equipment. Each community has a carrying capacity standing
for the number of evacuees that can be hosted, which is set equal to 40% of its population (see Table
5.7).
Table 5.7: Populations and capacities in locations

Location          | Nome  | Kotzebue | Point Hope | Point Lay | Atqasuk | Wainwright | Utqiagvik | Anchorage
Num. of People    | 3,841 | 3,266    | 709        | 269       | 244     | 584        | 4,438     | 294,356
Carrying Capacity | 1,536 | 1,306    | 283        | 107       | 97      | 233        | 1,775     | —
Airport Capacity  | 3     | 3        | 1          | 1         | 1       | 1          | 3         | 5
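The carrying capacities in Table 5.7 are consistent with taking 40% of each population and rounding down; a quick check using the populations from Table 5.7:

```python
from math import floor

# Populations from Table 5.7 (Anchorage omitted; no carrying capacity there).
populations = {
    "Nome": 3841, "Kotzebue": 3266, "Point Hope": 709, "Point Lay": 269,
    "Atqasuk": 244, "Wainwright": 584, "Utqiagvik": 4438,
}

# Carrying capacity = 40% of population, rounded down.
capacity = {loc: floor(0.4 * pop) for loc, pop in populations.items()}
```

For instance, Nome's capacity works out to floor(0.4 × 3,841) = 1,536, matching the table.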
The distance between each pair of locations is calculated in miles via Google Maps. Travel routes
are separated into two categories: i) sea distance (i.e., travel that accounts for the shoreline) and ii)
air distance. Discussions with USCG suggested that transportation directly from the cruise ship to
Anchorage is undesirable since (1) the ships that the evacuees would be moved to (including sister
ships) are not designed for passenger travel and (2) it could take a significant amount of time to
reach Anchorage from the Arctic via the sea. To obtain the number of time periods a trip requires,
we calculate the travel time between two locations via each asset as ⌈(distance)/(6 × cruise speed)⌉,
where we assume there are 6 hours per time period.
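Assuming the cruise speeds in Table 5.8 are in miles per hour, the travel-time rule can be expressed directly (the function name is ours):

```python
from math import ceil

def travel_periods(distance_mi, cruise_speed_mph):
    """Number of 6-hour time periods needed to cover `distance_mi`
    at `cruise_speed_mph`, rounded up as in the text."""
    return ceil(distance_mi / (6 * cruise_speed_mph))

# Example with the WLB 206 buoy tender from Table 5.8 (17.3 mph):
# a 200-mile leg takes ceil(200 / 103.8) = 2 periods.
```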
The available assets play an important role in transportation and logistics operations. Ex-
amining various tabletop exercises (Coast Guard News, 2016; McNutt, 2016) has led us to incor-
porate a set of available assets owned by the USCG, Alaska Air National Guard (AANG), North Slope
Borough (NSB), U.S. Air Force (USAF), and the commercial airlines operating in the region (i.e.,
Alaska Airlines and Ravn Alaska).
Moving all evacuees directly out of the Arctic (where evacuees can be supported), or even
into a single Arctic community, is not practical since the existing infrastructure and transportation
assets may not be capable of providing sufficient support or it may not be desirable to have evacuees
on those assets for long periods of time. While planes are utilized to transfer people from the
communities to Anchorage, as well as to deliver commodities to the communities, rescue ships are
only used to carry people from the cruise ship to the communities. As a result, the cargo capacity
of ships is set to zero. Aircraft are not considered in the operation of taking people off the cruise
ship. Note that helicopters are not specifically modeled in the set of air assets, although they would
play an important role in the response in transporting high-priority evacuees off the cruise ship,
possibly lightering passengers from the cruise ship to rescue ships, and moving responders onto the
ship. The main reason for not including helicopters is that they would be used in conjunction with
rescue ships except in cases of extreme medical duress.
Since military and commercial assets require mobilization time, we assume that planes
owned by USAF and commercial airlines will be available to support the MRE within 24 hours of
the event. The USCG's, AANG's, and NSB's assets tend to be dedicated to this type of emergency more
than those of the USAF and commercial airlines.
Given the assumptions mentioned, we include relevant planes that we believe would be
available for the response (see Table 5.8). In our baseline experiment, we have examined the length
of the runway required for each plane to land and have disallowed the landing of large planes, such
as the HC 130H, Lockheed HC-130, and Boeing 737-700 in the small villages (Point Lay, Point
Hope, and Wainwright). This restriction will be lifted in certain analyses, which would represent
investments in the runways of these small villages. However, we note that who pays the costs
associated with the response is a question outside the scope of this paper (e.g., the cruise ship
company or its insurance may pay the costs of the response back to the federal government). We
create scenarios in which only a subset of these assets is ready to use in order to observe the
impact of different asset types. For ships, the passenger capacity is set as the maximum number of
crew members allowed on board. For each plane, we use 60% of its cargo capacity to ensure that
resources and/or equipment can be loaded without problems and without examining a detailed packing
plan.
We now discuss some remaining assumptions used when creating this data set. We focus
on situations where ships are the only asset type that can carry evacuees from a cruise ship to the
communities, because planes cannot land on ships. We assume evacuees need equipment such as
shelter (either in "public space", such as a school, or a portable shelter) and sleeping bags. Public space
is different from a portable shelter since it cannot be transported. We further assume that the time
to refuel an asset is sufficiently small compared to the travel time of the asset and, therefore, does
not need to be accounted for in our model.
Assets are assigned to the closest locations, which are assumed to be known a priori, when
Table 5.8: List of assets (Griner, 2013; USCG, 2016; Office of Aviation Forces, 2019; Sherman, 2000; United States Air Force, 2008; Alaska Airlines, 2020; Brady, 2019; RavnAir Alaska, 2020; Cessna, 2019)

Asset            | Type                 | Owner           | Available Num. | Num. in Baseline | Passenger Cap. | Cargo Cap. (lbs) | Cruise Speed (mi.)
HC 130H          | Aircraft             | USCG            | 2 | 2 | 92  | 51,000 | 374
Lockheed HC-130  | Aircraft             | AANG            | 1 | 1 | 20  | 30,000 | 251
Learjet 31A      | Aircraft             | NSB             | 2 | 2 | 6   | 2,000  | 441
Boeing 737-700   | Aircraft             | Alaska Airlines | 1 | 1 | 124 | 16,505 | 460
Beechcraft 1900C | Aircraft             | Ravn Alaska     | 1 | 1 | 12  | 2,030  | 250
WLB 206          | Buoy Tender          | USCG            | 1 | 1 | 86  | 0      | 17.3
WLB 212          | Buoy Tender          | USCG            | 1 | 1 | 86  | 0      | 17.3
WLM 175          | Buoy Tender          | USCG            | 1 | 1 | 24  | 0      | 13.8
282 WMEC         | Endurance Cutter     | USCG            | 1 | 1 | 99  | 0      | 13.8
378 WHEC         | Endurance Cutter     | USCG            | 1 | 1 | 160 | 0      | 12.7
154 WPC          | Fast Response Cutter | USCG            | 2 | 1 | 24  | 0      | 32.2
the rescue event is initiated. The majority of the planes are located around Anchorage and Kodiak
with a few exceptions. For instance, the Learjet 31A type aircraft is often positioned in Utqiagvik
(Griner, 2013) and will be deployed there. These initial locations for the planes are kept the same
regardless of the incident area throughout our analysis. On the other hand, since Coast Guard ships
are actively used for normal operations, we prefer not to fix locations for the ships. Ships are deployed
to their initial locations based on the incident area, meaning that initial locations may vary across
instances. For example, data for the incidents can capture the case where certain ships (e.g., sister
ship(s)) âmoveâ with the cruise ship in order to respond to an incident.
The stock levels of available resources and equipment in each location are illustrated in Table
5.9. We assume that the cruise ship would have enough supplies to satisfy all evacuees' demands for
the first six time periods of the response. We assume that no resource can be taken out from the
ship when an evacuee is placed on a ship. Equipment demand will be satisfied while the evacuees are
on the ship. Given the relative population sizes of Utqiagvik, Kotzebue and Nome, we assume that
there is some level of water and food that can be used in the MRE. In addition, as a result of having
Level 4 Trauma Centers, there is medical support in Utqiagvik, Nome, and Kotzebue. We assume
a large stockpile in Anchorage for all the commodity types. Public facilities (e.g., churches, sports
centers, etc.) could be utilized in response events in lieu of portable shelters and will be included in
our analysis. We also note that the level of resources available in Anchorage is at least an order
of magnitude larger than the levels available in the villages; therefore, for the purposes of modeling
the MRE, we do not need to capture allocation decisions there.
We then share the list of resources and equipment together with the unit weight and the re-
quired amount for each priority level in Table 5.10. Lastly, Table 5.11 presents the initial deployment
locations for each ship used in the baseline experiment according to each incident location.
Table 5.9: Initial inventory in each location

Name             | Nome | Kotzebue | Point Hope | Point Lay | Atqasuk | Wainwright | Utqiagvik | Anchorage
Water            | 150  | 150      | 0          | 0         | 300     | 0          | 200       | 5000
Food             | 150  | 150      | 0          | 0         | 300     | 0          | 200       | 5000
Sleeping Bag     | 0    | 0        | 0          | 0         | 75      | 0          | 0         | 500
Portable Shelter | 0    | 0        | 0          | 0         | 50      | 0          | 0         | 150
Public Space     | 200  | 200      | 50         | 50        | 50      | 50         | 250       | 2000
Medical Support  | 25   | 25       | 0          | 0         | 0       | 0          | 45        | 400
Table 5.10: Resource and equipment list (Division of Homeland Security & Emergency Management, 2019; World Health Organization, 2019)

Name             | Type      | Weight (lbs) | Priority Level 1 | Priority Level 2 | Priority Level 3′ | Priority Level 3
Water            | Resource  | 1.54         | 1 | 2 | 3 | 3
Food             | Resource  | 2.35         | 1 | 1 | 1 | 1
Sleeping Bag     | Equipment | 7.50         | 1 | 1 | 0 | 0
Portable Shelter | Equipment | 26.40        | 1 | 1 | 0 | 0
Medical Support  | Equipment | —            | 0 | 0 | 1 | 1
Table 5.11: Initial deployment locations for ships

Ship     | Incident 1 | Incident 2 | Incident 3 | Incident 4 | Incident 5
WLB 206  | Point Lay  | Nome       | Nome       | Utqiagvik  | Utqiagvik
WLB 207  | Utqiagvik  | Point Lay  | Utqiagvik  | Nome       | Kotzebue
WLB 212  | Point Hope | Point Lay  | Wainwright | Point Lay  | Point Lay
WLM 175  | Point Hope | Point Hope | Point Hope | Point Hope | Point Lay
282 WMEC | Kotzebue   | Point Hope | Point Hope | Wainwright | Wainwright
378 WHEC | Nome       | Nome       | Utqiagvik  | Utqiagvik  | Utqiagvik
154 WPC  | Kotzebue   | Kotzebue   | Kotzebue   | Kotzebue   | Kotzebue
Lastly, we represent the flow arcs designed for the evacuation balance constraints. If an
evacuee receives equipment at any time period, the corresponding s_e becomes one regardless of the
previous s_r and s_e. Recall that this implies that the evacuee's equipment demand is fully met. On
the other hand, if the resource demand is satisfied, then the changes demonstrated in
Table 5.12 take place according to the evacuee's priority level. Since a priority level symbolizes the
seriousness of an evacuee's medical situation, decreases in periods occur more slowly for higher priority
levels. As for the changes in priority levels, alternation takes place mainly based on the value of s_r.
The situations where an evacuee's priority level increases are shown in Tables 5.13 and 5.14 (i.e., the arc
jumps that take place between the layers).
Table 5.12: Changes in s_r when resource demand is met

Priority Level   | Transition
Priority Level 1 | s_r → s_r − 1
Priority Level 2 | If s_r ≤ 4, then s_r → s_r − 1; otherwise s_r → s_r − 3
Priority Level 3 | If s_r ≤ 3, then s_r → s_r − 1; otherwise s_r → s_r − 2
Table 5.13: Jumps in A^BNN (p, s_r, s_e)

Priority Level   | Jump (from → to)
Priority Level 1 | (p = 1, s_r = 4, s_e ≥ 2) → (p = 2, s_r = 5, s_e + 1)
Priority Level 2 | (p = 2, s_r = 8, s_e ≥ 2) → (p = 3, s_r = 9, s_e + 1)
Priority Level 3 | —
Table 5.14: Jumps in A^ESN (p, s_r, s_e)

Priority Level   | Jump (from → to)
Priority Level 1 | (p = 1, s_r = 5, s_e ≥ 1) → (p = 2, s_r = 6, s_e = 1)
Priority Level 2 | (p = 2, s_r = 9, s_e ≥ 1) → (p = 3, s_r = 10, s_e = 1)
Priority Level 3 | —
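The transition rules in Tables 5.12 and 5.13 can be encoded as simple lookup functions. The function names are ours, and the code is an illustrative sketch of the state dynamics rather than the model implementation:

```python
def next_resource_state(priority, s_r):
    """Update s_r when an evacuee's resource demand is met (Table 5.12).
    Higher priority levels lose periods more slowly."""
    if priority == 1:
        return s_r - 1
    if priority == 2:
        return s_r - 1 if s_r <= 4 else s_r - 3
    # Priority level 3
    return s_r - 1 if s_r <= 3 else s_r - 2

def jump_bnn(p, s_r, s_e):
    """Priority-level jumps between layers in A^BNN (Table 5.13);
    returns the new (p, s_r, s_e) state, or None if no jump applies."""
    if p == 1 and s_r == 4 and s_e >= 2:
        return (2, 5, s_e + 1)
    if p == 2 and s_r == 8 and s_e >= 2:
        return (3, 9, s_e + 1)
    return None
```

For example, a priority-2 evacuee with s_r = 7 whose resource demand is met transitions to s_r = 4, while a priority-1 evacuee at (p = 1, s_r = 4, s_e = 2) jumps to priority level 2.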
5.6.2 Baseline Experiment
During our experiments, we set a time limit of 60 minutes. If the solution method does not
converge to the optimal solution within the time limit, the best solution obtained by that point is
reported together with its optimality gap. Lastly, for each experiment, we examine a total of fifteen different
scenarios consisting of five different incident locations along with three different levels of evacuees.
We start our discussion with a baseline experiment where we assume that there is a sufficient set of
resources and conditions (e.g., travel times) are in an ideal setting. We will vary certain parameters
from this baseline (e.g., when planes are available) in examining critical aspects of the response. For
this baseline, we consider only a subset of the previously described assets during the response (see
Table 5.8).
We discuss the computational performance of various approaches to solve the problem in
the Appendix (see Section A.2). Of note, warm-starting CPLEX with either heuristic significantly
outperforms directly solving the model. Further, the intuitive one-by-one heuristic results in solutions with gaps
well over 10%, thus indicating the importance of using optimization to examine response efforts.
Based on this analysis, we will conduct the remaining experiments by warm-starting the IP with the
solution identified by OTH.
We now provide detailed analysis on the baseline experiment. Fig 5.10 depicts the objective
values for each scenario. The total average evacuation time is computed as the sum of the average
time to leave the cruise ship and the average time to arrive at Anchorage. Recall that lower objectives
indicate a more "successful" response since we are minimizing the total of the evacuation time
and the impact on the evacuees.
It is important to mention that the model is able to successfully complete the response
event in all the scenarios except Incident 1-1600 and Incident 3-1600. The failure in
Incident 1-1600 stems from the fact that the travel times from Incident 1 to the closest communities
of Kotzebue and Nome are longer compared to the other incident areas. In fact, the model does not
transport any evacuees to the other northern communities (see Fig 5.11) since there exist sufficient hosting
and airport capacities in both communities. As for Incident 3, since large planes, which comprise
87% of the total capacity provided by all the planes, cannot be utilized in the small villages, the
model transports a number of evacuees to the farther communities (e.g., Kotzebue and Utqiagvik).
This results in higher travel times when using ships, which also causes another negative impact since
it takes more time to evacuate people from the cruise ship.

Figure 5.10: Objective values in the baseline experiment (total average evacuation, staying in ship, staying in a village, deprivation cost for commodities, and deprivation cost during travel)

Figure 5.11: The villages used in the baseline experiment (the number of evacuees transported to Nome, Kotzebue, Point Hope, Point Lay, Wainwright, and Utqiagvik in each scenario)
Further, we observe high penalty costs due to leaving some evacuees in the villages in
Incident 3 when there are 1200 and 1600 passengers. The underdeveloped runways prevent
large planes from landing in the small villages close to Incident 3, thus delaying the
transport of evacuees or causing them to go to large villages far from the incident location.
For instance, the number of evacuees who cannot make it to Anchorage and have to stay in
Point Hope and Point Lay by the end of the rescue operation is equal to 200 and 267 for 1200 and
1600 passengers, respectively. These people would not stay in these villages indefinitely, but there
are significant penalties for them being there at the end of the horizon.
Overall, the transportation decisions have the greatest influence on the objective. When
evacuating people from the incident and from the local communities is delayed, it not only increases
the total evacuation time, but it exponentially increases the total deprivation costs due to the limited
amount of available resources. However, the bottleneck is the transportation decisions concerning
passengers. We observed that the planes are able to move resources and equipment into the villages
at or before the time evacuees arrive and, therefore, the "arrival time" into the village
has the most impact on deprivation costs. Thus, resources and equipment enter the Arctic
quickly enough to support a rescue operation. Although we did not specifically model the concept
of an Arctic fulfillment package, our results show that if these packages are the quickest way to
provide resources and equipment, then they play an important role in the response.
It can be seen that the worst response performances are observed in Incidents 1 and 3. This
indicates a strong interplay between the proximity of the incident area to
the local communities and their capacities. Even though we have plenty of capacity around Incident
1, as a result of long travel distances, the evacuees suffer and the rescue event is challenging. While
we have close communities located around Incident 3, these small villages have limitations on how
they can be used during the response (e.g., the types of planes that can land there). Hence, the
rescue event is still challenging.
Another significant finding is that the response to Incident 4 is slightly worse than the
response to Incident 5 in each scenario (i.e., 800, 1200, and 1600 passengers). This is somewhat
counter-intuitive in the sense that Incident 4 could more easily take advantage of Point Lay, Wain-
wright and Utqiagvik. However, the response to Incident 5 performs better since the incident is
closer to the larger community of Utqiagvik.
Lastly, we provide the list of the villages used in each incident location in Fig 5.11, where
each column shows the villages together with the number of evacuees transported there. Overall,
Utqiagvik and Kotzebue are important large communities and Point Hope stands as a significant
small community. We will now analyze how the transportation decisions are affected based on
different situations faced by the response.
Managerial insights: It would be important to either increase the number of ships around
Nome and Kotzebue or position ships with improved capacity or speed in order to address the
"response gap" in this area. It could also be quite useful to incorporate infrastructure devel-
opments in small villages to be able to utilize larger aircraft and/or host more evacuees. In our
remaining analysis, we will focus on the impacts of such decisions.
5.6.3 âWhat Ifâ Analysis
In this section, we focus on examining various what-if scenarios that alter the data associated
with our baseline experiment to understand key issues around response capabilities. Our experiments
address: (i) the improvement in response when new infrastructure is developed in the Arctic, (ii)
the impact on response when it faces challenges (e.g., weather), and (iii) situations that combine (i)
and (ii).
5.6.3.1 Experiment 1: Improving Infrastructure in the Arctic
There may be opportunities to invest in improving infrastructure in order to increase the
"slack" in these systems so that they may be better able to handle emergency response. We look
to answer the following question: "How much positive effect may be seen when airport and hosting
capacities are increased and the runway lengths are upgraded in the small villages?" Here, the
airport and hosting capacities are increased by one and by 20%, respectively, and runway lengths are
upgraded so that any type of aircraft can land.
Figure 5.12: The total objective values in the baseline experiment and Experiment 1

Figure 5.13: The deprivation costs incurred during travel in the baseline experiment and Experiment 1
First, we point out that the investments that are proposed for the small villages did not
improve the response for Incidents 1 and 5 (see Fig 5.12). Reaching the small villages still takes
the same amount of time via the ships. Hence, utilizing the closest villages, which are known to
have high capacities, as in the baseline, is still preferred. For example, although Wainwright has
improved its capabilities, Utqiagvik still has significant response capacity and we use it as the center
of the response in Incident 5.
We do see improvements of 15%–25% and 30%–45% in response capabilities
for Incidents 2 and 3, respectively (see Fig 5.12). For Incident 2, while on average 84% of the
evacuees are transported to Kotzebue in the baseline experiment, this ratio drops sharply to 7% in
Experiment 1. Point Hope becomes a more appealing location to move the evacuees since there are
major infrastructural improvements in the small villages. We observe a significant decrease in the
deprivation costs during travel (see Fig 5.13). This is because all the ships can reach the cruise ship
from Point Hope within one time period while it takes, on average, 1.8 periods to reach Kotzebue.
We observe a very similar pattern in Incident 3 and the model no longer transports any evacuee to
Utqiagvik. Further, improvements in the objective value in Incident 3 occur due to the fact that no
evacuee is left in the villages as a result of the improvements to the airports.
As for Incident 4, Wainwright becomes nearly as important as Utqiagvik by hosting roughly
half of the total evacuees. As a result, the total average evacuation time and the deprivation costs for
commodities decrease. Yet, we observe only a 3.5% decline on average in terms of the total objective.
We believe the reason for such a small decrease is that, although Wainwright is highly
utilized, it takes longer to reach Wainwright than Utqiagvik from the cruise ship. For
example, while we do not observe any deprivation costs during travel in the baseline experiment in
Incident 4, this trend changes in Experiment 1.
Managerial insights: We believe that infrastructure investments in terms of both improving
the runways and increasing the hosting capacities in Point Hope and Wainwright would be quite
beneficial. Both communities could play an important role in different incidents due to their central
locations in areas between larger Arctic communities. However, our results suggest that additional
infrastructure investment in some communities may have limited benefit, as incidents close to Nome,
Kotzebue, and Utqiagvik are responded to better than those incidents in more remote areas.
5.6.3.2 Experiment 2: Restricting Air Transportation as a Result of Bad Weather Conditions
We ask the following question: "What is the (negative) impact on response capabilities when
air operations are impacted by weather conditions?" To answer this, we introduce a new constraint
such that no flight is operated between t = 1 and t = 8, which models a storm that
grounds air operations.
We provide a comparison of the objective values with the baseline experiment in Fig 5.14.
When air operations are restricted as a result of bad weather conditions seen in the region, the model
fails to bring everyone to Anchorage when there are 1600 passengers in every incident, which is why
there is a high penalty cost for leaving some evacuees in the villages. In addition, the model
leaves 57 more evacuees on the cruise ship in Incident 3 when there are 1600 passengers. It is worth
mentioning that airport restrictions in the small villages create a bottleneck and remain tight after
the air operations are started. Therefore, this indicates that investments to improve the airports in
the small villages would be quite beneficial in a response.
Restricting the air transportation has another negative impact since we can no longer move
resources and equipment into villages. This results in a significant increase in the deprivation costs
for commodities in every scenario (i.e., 63% on average). For instance, the deprivation cost increases
more than fivefold in Incident 5-1600 (e.g., from 2,000.83 to 10,620.59). This indicates that
resource and equipment stockpiles are not sufficient to support evacuees for long periods of time
without replenishment.

Figure 5.14: The total objective values in the baseline experiment and Experiment 2

Figure 5.15: The percentage increase in the objective in Experiment 3 compared to the baseline experiment
In terms of the impact on response capabilities, Fig 5.15 provides the increase in the objective
functions across all incidents from the baseline to this particular situation. We can view large gaps
as significantly decreasing response capabilities. Incidents 4 and 5 are most impacted in terms of
an increase to the objective. This is because we were able to evacuate people quickly through
Utqiagvik for these incidents in the baseline but since the planes are now grounded, we now need
to have evacuees wait in this community. Incident 3 experiences the smallest relative increase. This
is due to the fact that the arrival times into the villages from the distressed ship were a significant
part of the objectives, and this does not change as air operations are grounded. We observe that the
percentage increase decreases roughly linearly in Incidents 1, 2, and 3 as the number of passengers
increases. Meanwhile, the objective value rises by approximately half in Incident 3, and the response
becomes nearly identical to Incident 1 in terms of the objective values.
Managerial insights: It could be useful to stockpile more relief commodities in larger villages,
including both Kotzebue and Utqiagvik, to be used during an emergency response event when
infrastructure development is not possible. The response tends to favor utilizing larger villages to
transport evacuees. Hence, if an infrastructure improvement is not possible in the region, then
holding extra relief commodities in larger villages would be preferred as an alternative in order to
ensure longer support for evacuees as they arrive in the larger communities or as resources are
transported to the smaller communities where evacuees may be.
5.6.3.3 Experiment 3: Decreasing the Speed of Ships Due to Navigating with Sea Ice
Weather conditions cause problems not only for air operations but also for ships traveling
at sea. In particular, there may be sea ice in and around the ships as they travel in the Arctic.
The USCG owns polar-class icebreakers should ships become iced-in. Navigation in the uncertain
conditions surrounding sea ice may also reduce the speed at which ships can travel. We, therefore,
examine the response under conditions where the ship travel times increase to move between
the cruise ship and the villages. In this case, the travel time of each ship is increased by one time
unit. One important observation in this case is that the model fails to evacuate everyone from the
cruise ship in all the incident locations with 1600 passengers, as shown in Fig 5.16. Note that the
model does not utilize a different transportation path for the evacuees in any of the incidents and
uses the same villages as presented in Fig 5.11, but may leave more evacuees on the cruise ship.
Figure 5.16: The number of evacuees remaining on the cruise ship at |T| in the baseline experiment and Experiment 3
Figure 5.17: The total number of tours completed by the ships in the baseline experiment and Experiment 3
The impact to response capabilities can be explained by examining the number of “tours”
that are made from the cruise ship to the villages by ships. We define a tour as a ship's travel
from a village to the cruise ship and from the cruise ship back to a village. Fig 5.17 compares the
number of tours under the baseline and this experiment with 1200 and 1600 passengers. Note that
the number of tours does not really change with 800 passengers due to the available total passenger
capacity provided by the ships. Under Experiment 3, the total number of tours made by the ships
decreases significantly from the baseline when there are 1600 evacuees for Incidents 1, 2, and 3
(i.e., a 35% decrease) and decreases slightly for Incidents 4 and 5. In these cases, we no longer have
the capacity to evacuate everyone from the cruise ship within the planning horizon. This implies
that a larger number of ships may need to be present in challenging navigation conditions (which
causes its own problems) in order to achieve the same response as our baseline experiments.
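The tour count above can be sketched from a ship's itinerary; the list-of-locations representation here is an illustrative assumption, not the model's actual data structure:

```python
# Count "tours" for one ship, where a tour is a trip from a village to the
# cruise ship followed by a trip from the cruise ship back to a village.
def count_tours(itinerary, cruise_ship="CruiseShip"):
    tours = 0
    for prev, curr in zip(itinerary, itinerary[1:]):
        # A tour completes each time the ship leaves the cruise ship
        # for a village.
        if prev == cruise_ship and curr != cruise_ship:
            tours += 1
    return tours

itinerary = ["Nome", "CruiseShip", "Kotzebue", "CruiseShip", "Point Hope"]
print(count_tours(itinerary))  # 2
```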
Furthermore, Incidents 2 and 4 are negatively impacted as a result of the evacuees who are
left in the villages. For instance, the number of evacuees sent to Point Hope increases by around
40% with 1200 and 1600 passengers compared to the baseline experiment in Incident 2. Since the
ship speeds are decreased, the model sends more evacuees to Point Hope due to its closeness to
the incident area (i.e., Incident 2) in spite of its limitations. As for Incident 4, the number of
evacuees sent to Wainwright doubles when there are 1600 passengers. It is worth mentioning that
no hosting capacity of a village is fully utilized here, confirming that the airport limitations (not
hosting capacity) are the bottleneck of the small villages.
Managerial insights: We have identified that a larger number of ships may need to be present
in challenging navigation conditions (which causes its own problems) in order to achieve the same
response as our baseline experiments. In addition, increasing airport capacities in terms of upgrading
the runways is more critical than investing in the hosting capacities. Therefore, our results suggest
airport improvements as a critical aspect of potentially improving emergency response.
5.6.3.4 Experiment 4: Increasing Infrastructure in Small Villages and Decreasing the
Speed of Ships
We now examine whether improving the infrastructure systems in the small villages improves
response capabilities (similar to Experiment 1) when ships are moving slower than their ideal speed
(similar to Experiment 3). We increase the airport capacities by one, improve the length of their
runways, and decrease the speed of ships by one time unit. We then determine the improvement
in response capabilities from Experiment 3 due to the infrastructure improvements. We do not see
any improvement in response capabilities from Experiment 3 for Incidents 1 and 5 (see Fig 5.18),
since almost all evacuees in these incidents are routed through Nome, Kotzebue, and Utqiagvik. We
see slight improvements in response capabilities in Incident 4 and major improvements in Incidents
2 and 3 (on average a 5.84%, 20.66%, and 24.12% decrease, respectively), which are similar to the
ones from the baseline to Experiment 1.
The impact of the extra airport capacity together with the upgraded runways is twofold.
First, the model completely changes the transportation path for the evacuees in Incidents 2, 3, and
4. For example, while there are evacuees transported to Kotzebue and Utqiagvik in Incident 3, all
the evacuees leaving the cruise ship are transported only to Point Hope and Point Lay in Experiment
4. Second, in Incident 4, although it does not change the number of evacuees left in the cruise ship,
Figure 5.18: The total objective values in Experiment 3 and Experiment 4
it decreases the number of evacuees (i.e., from 184 to 24) that cannot make it out of the Arctic due
to the improvements in Wainwright.
We emphasize that Point Hope is the key village for both Incidents 2 and 3. When improving
the airport infrastructure in Point Hope, the model no longer uses Kotzebue and Utqiagvik in Incident
3. More evacuees are carried to Point Hope in Incident 2 while Kotzebue is used less. This is
an important observation indicating that airport investments in Point Hope might be critical in
improving response capabilities.
Managerial insights: This experiment reveals that airport investments in Point Hope would
likely be important (depending on their feasibility) in improving response capabilities in the face
of challenging situations. Thus, we believe that Point Hope could be the key location in the entire
region for the infrastructure investments.
5.6.3.5 Experiment 5: Increasing the Number of Evacuees in Higher Priority Levels
Here, we increase the number of evacuees in Priority 2 (i.e., to 35% of the total evacuees) and
decrease the number in Priority 1 (i.e., to 55% of the total evacuees). Our goal is to test whether a)
transportation decisions would change, and b) logistics decisions would experience major changes.
In this analysis, the core transportation decisions remain the same and, therefore, the evacuation
portion of the objective remains the same. Thus, priority is still given to this piece of the MRE. As
expected, we do see a slight increase in deprivation costs since more evacuees are at a higher priority
level. The only major increase occurs in Incident 3 (e.g., an increase of 11% for the deprivation costs
when evacuating 1600 people). This is because the evacuees stay longer in the Arctic and the penalty
from the change to the priority levels in this experiment accumulates.
Managerial insights: Although each air asset uses its maximum capacity, enough relief commodities
cannot be carried when the priority levels have shifted. Hence, improving the infrastructure
in one of the small villages and/or pre-positioning relief commodities might be an effective solution
for such a scenario.
5.7 Conclusion
In this chapter, we have focused on how to respond to an MRE in Arctic Alaska. Our contribution
to this area is a novel IP model whose main objective is to evacuate people from the distressed
ship to the local villages around the Arctic and transport them to Anchorage while minimizing the
negative impact of the event on them. We conduct extensive analysis of potential MREs along the
route the Crystal Serenity traveled around Arctic Alaska. Our work helps to focus on concerns
about Arctic MREs that are increasingly likely to occur given the shift in Arctic maritime
transportation and tourism.
The human costs we identify and the emergency response gaps we model apply to situations
broader than the U.S. Arctic and affect all Arctic nations to varying degrees. This work helps to
make the case that optimization models can address operational gaps for Arctic MREs, where these
gaps have been practically recognized but not modeled prior to our efforts. Our work models the
tradeoffs that policy makers, regulators, and transportation logistics professionals must consider as
transportation in remote and infrastructure-poor settings increases due to climate and ecosystem
changes.
Highlights obtained from our computational analysis are as follows. For accidents occurring
around Nome and Kotzebue (Incidents 1 and 2), a major issue is that not everyone would be able
to evacuate from the cruise ship within a reasonable evacuation horizon due to the long distances
between the incident and Arctic villages. This is due both to the speed at which ships can travel and
the number of ships involved in the response. Therefore, in order to mitigate this vulnerability, it
is suggested that additional ships be made available to respond to an incident in this area (near
the Bering Strait). In addition, the response capabilities for these incidents are the least sensitive
to both infrastructure improvements and challenges in the response.
The most impactful change in improving response capabilities for Incident 2 is improving the
airport capacity and upgrading the runway of Point Hope, since it is closer to Incident 2 than
Kotzebue but has significantly fewer people. In addition, this investment would help the response
to Incident 3. When infrastructure investments are made in the small villages, evacuees no longer
travel to the farther communities, and Point Hope plays the central role during the rescue operation.
One recommendation to help improve response capabilities in the Arctic would be to invest in the
necessary capacities to have Point Hope (or a similar village in the area) play a more significant role
in the response. Note that Point Hope may not be the only option (it was in our case study since it
is in the North Slope Borough) for these potential upgrades. Wales sits at the narrowest part of the
Bering Strait and could also significantly impact response capabilities.
We further observed how critical a role the village of Utqiagvik (the largest village in Arctic Alaska)
plays in responding to MREs. The responses to Incidents 4 and 5 route the majority of evacuees
through this village (although for Incident 4, it collaborates with Wainwright in the response).
These experiments indicate that the “ground” capacity of Utqiagvik is sufficient to move evacuees
through it during the response. We do assume that Coast Guard ships are relatively close to the
incidents when they occur, which indicates that these ships, or others of similar size, should
be in the area while the cruise ship travels through it.
In terms of future work, it will be critical to understand the practical feasibility of the
optimized responses. Although the model was built based on discussions with subject matter experts,
the output of the model has not been carefully vetted with both those involved in the response and
those that represent the villages that would be impacted. This type of vetting may lead to the
discovery that certain core assumptions in the model should be updated. Community buy-in to the
optimization model will allow for its practical deployment. Further, we can improve upon this work
by modeling how infrastructure investments should be made across Arctic Alaska to best improve
our overall response capabilities. It is our long-term goal to build such infrastructure investment
models that not only account for response capabilities but also capture the benefits (or negative
impacts) of the infrastructure development on the communities in which it is built.
Chapter 6
Conclusion
In this dissertation, we study i) a group-based centrality metric called star degree centrality
(SDC) under both deterministic and stochastic settings, and ii) an Arctic emergency response event.
We first introduce the SDC and stochastic pseudo-SDC (SPSDC) problems and then examine each
problem by proposing integer programming (IP) formulations, studying their complexity for different
network structures, as well as developing decomposition-based exact solution methods. We then
move to emergency response events and examine Arctic mass rescue events. We design a
large-scale IP model that integrates transportation and logistics decisions to rescue evacuees after
a maritime accident. We propose a heuristic solution method and provide a wide range of what-if
analyses.
Appendices
Appendix A
A.1 Pseudocode of Conservative One-by-One Heuristic
In this section, we present the pseudocode of the Conservative One-by-One Heuristic.
Algorithm 5: Initialization
Input: A, C
1: lastTime[i] := the last time period when asset i is used
2: lastLoc[i] := the location of asset i at lastTime
3: for a ∈ A do
4:   for c ∈ C do
5:     if σ_{a,c} = 1 then
6:       X_{a,c,σ_a} ← 1
7:       lastTime[a] ← σ_a
8:       lastLoc[a] ← c
Algorithm 6: ConservativeOneByOneHeuristic
Input: A, A_a, C
1: Initialization(A, C)
2: for v ∈ A \ A_a do
3:   SendAsset(v)
4: nextShip ← true
5: nextPlane ← true
6: while nextShip or nextPlane do
7:   if nextShip then
8:     nextShip ← ShipAssignment(A \ A_a)
9:   if nextPlane then
10:    nextPlane ← PlaneAssignment(A_a)
11: Finalize(A, A_a)
Algorithm 7: PopulationUpdate
Input: i ∈ C, depart, j ∈ C, arrival, carry
1: for depart ≤ t ≤ |T| do
2:   w_{i,t} ← w_{i,t} − carry
3: for arrival ≤ t ≤ |T| do
4:   w_{j,t} ← w_{j,t} + carry
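Algorithm 7 translates almost directly into Python; the dict-of-lists representation of the population profile w below is an illustrative assumption:

```python
# A direct Python rendering of Algorithm 7 (PopulationUpdate): moving `carry`
# evacuees from location i (departing at `depart`) to location j (arriving at
# `arrival`) updates the population profile w for every remaining period.
def population_update(w, i, depart, j, arrival, carry, horizon):
    for t in range(depart, horizon + 1):
        w[i][t] -= carry
    for t in range(arrival, horizon + 1):
        w[j][t] += carry

horizon = 4
w = {"CruiseShip": [100] * (horizon + 1), "Nome": [0] * (horizon + 1)}
population_update(w, "CruiseShip", 1, "Nome", 3, 40, horizon)
print(w["CruiseShip"])  # [100, 60, 60, 60, 60]
print(w["Nome"])        # [0, 0, 0, 40, 40]
```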
Algorithm 8: AirportCheck
Input: a ∈ A_a, ν ∈ V, depart, t, transit
1: decision ← true
2: if transit then
3:   airportTime ← depart − t
4:   num ← number of planes located in ν at airportTime
5:   if num + 1 > κ_ν then
6:     decision ← false
7:   else
8:     while airportTime > lastTime[a] do
9:       num ← number of planes located in lastLoc[a] at airportTime
10:      if num + 1 > κ_{lastLoc[a]} then
11:        decision ← false
12:        break
13:      else
14:        airportTime −= 1
15: else
16:   airportTime ← lastTime[a]
17:   while airportTime ≤ depart do
18:     num ← number of planes located in ν at airportTime
19:     if num + 1 > κ_ν then
20:       decision ← false
21:       break
22:     else
23:       airportTime += 1
24: return decision
Algorithm 9: SendAsset
Input: a ∈ A
1: if a ∈ A \ A_a then
2:   target ← CruiseShip
3: else
4:   target ← Anchorage
5: arrival ← lastTime[a] + τ_{a,lastLoc[a],target}
6: if arrival ≤ |T| then
7:   Y_{lastLoc[a],target,a,lastTime[a]} ← 1
8:   X_{a,target,arrival} ← 1
9:   lastTime[a] ← arrival
10:  lastLoc[a] ← target
11: else
12:   while lastTime[a] < |T| do
13:     X_{a,lastLoc[a],lastTime[a]+1} ← 1
14:     lastTime[a] += 1
15:   A ← A \ {a}
Algorithm 10: ShipAssignment
Input: A \ A_a, V
1: map[] ← null
2: for v ∈ A \ A_a do
3:   stay ← false
4:   r_v ← −M
5:   if lastTime[v] = |T| then
6:     A ← A \ {v}
7:     next v
8:   for ν ∈ V do
9:     t ← τ_{v,Ship,ν}
10:    if ∄ t > 0 then
11:      next v
12:    popCruiseShip ← the min. population in CruiseShip between lastTime[v] and |T|
13:    minCap ← the last min. non-zero available capacity in ν between lastTime[v] + t and |T|
14:    minTime ← the corresponding time period of minCap
15:    arrival ← max(lastTime[v] + t, minTime)
16:    if minCap = 0 or arrival > |T| then
17:      next ν
18:    if lastTime[v] + t < minTime then
19:      stay ← true
20:    carry ← min(µ_v, popCruiseShip, minCap)
21:    if r_v < carry/arrival then
22:      r_v ← carry/arrival
23:      map[v] ← (ν, r_v, arrival, carry, stay)
24: v* ← argmin_{m ∈ map} m.get(arrival); if there is a tie, then v* ← argmax_{m ∈ map} m.get(r_v)
25: (ν*, r_{v*}, arrival*, carry*, stay*) ← map[v*]
26: t* ← τ_{v*,CruiseShip,ν*}
27: if stay* then
28:   depart ← arrival* − t*
29:   Y_{CruiseShip,ν*,v*,depart} ← 1
30:   X_{v*,ν*,arrival*} ← 1
31:   PopulationUpdate(CruiseShip, depart, ν*, arrival*, carry*)
32:   while lastTime[v*] < depart do
33:     X_{v*,CruiseShip,depart} ← 1
34:     depart −= 1
35: else
36:   Y_{CruiseShip,ν*,v*,lastTime[v*]} ← 1
37:   X_{v*,ν*,arrival*} ← 1
38:   PopulationUpdate(CruiseShip, lastTime[v*], ν*, arrival*, carry*)
39: lastTime[v*] ← arrival*
40: lastLoc[v*] ← ν*
41: SendAsset(v*)
42: rem ← the remaining population in CruiseShip after arrival*
43: if rem = 0 or A \ A_a = ∅ then
44:   return false
45: else
46:   return true
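The selection rule near the end of Algorithm 10 (earliest arrival, ties broken by the largest carry/arrival ratio) can be sketched as follows; the candidate-tuple layout is an illustrative assumption:

```python
# Pick the ship with the earliest arrival; break ties by the largest
# carry/arrival ratio. Candidates map each ship to an illustrative tuple
# (village, ratio, arrival, carry, stay).
def select_ship(candidates):
    # min over (arrival, -ratio) implements "argmin arrival, tie-break argmax ratio"
    return min(candidates, key=lambda v: (candidates[v][2], -candidates[v][1]))

candidates = {
    "ship_a": ("Point Hope", 0.8, 5, 120, False),
    "ship_b": ("Kotzebue", 1.2, 5, 150, True),  # same arrival, better ratio
    "ship_c": ("Nome", 2.0, 7, 200, False),
}
print(select_ship(candidates))  # ship_b
```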
Algorithm 11: PlaneAssignment
Input: A_a, V
1: map[] ← null
2: for a ∈ A_a do
3:   transit ← false
4:   r_a ← −M
5:   if lastTime[a] = |T| then
6:     A_a ← A_a \ {a}
7:     next a
8:   for ν ∈ V do
9:     t ← 0
10:    if ν = lastLoc[a] then
11:      minPop ← the last min. non-zero available population in ν between lastTime[a] + t and |T|
12:      depart ← the corresponding time period of minPop
13:    else if θ_{ν,i} = 1 then
14:      t ← τ_{a,lastLoc[a],ν}
15:      minPop ← the last min. non-zero available population in ν between lastTime[a] + t and |T|
16:      depart ← the corresponding time period of minPop
17:      transit ← true
18:    else
19:      continue
20:    canLand ← AirportCheck(a, ν, depart, t, transit)
21:    if depart ≥ |T| or minPop = 0 or not canLand then
22:      next a
23:    carry ← min(minPop, ρ_a)
24:    if r_a < carry/depart then
25:      r_a ← carry/depart
26:      map[a] ← (ν, r_a, depart, carry, transit)
27: for a ∈ A_a do
28:   if a ∉ map then
29:     num ← number of planes located in lastLoc[a] at lastTime[a] + 1
30:     if num + 1 ≤ κ_{lastLoc[a]} then
31:       X_{a,lastLoc[a],lastTime[a]+1} ← 1
32:       lastTime[a] += 1
33:     else
34:       for ν ∈ V do
35:         t ← τ_{a,lastLoc[a],ν}
36:         totalPop ← the total population in ν after lastTime[a] + t
37:         num ← number of planes located in ν at lastTime[a] + t
38:         if totalPop = 0 and num + 1 ≤ κ_ν then
39:           Y_{lastLoc[a],ν,a,lastTime[a]} ← 1
40:           X_{a,ν,lastTime[a]+t} ← 1
41:           lastTime[a] ← lastTime[a] + t
42:           lastLoc[a] ← ν
43:           break
44: a* ← argmin_{m ∈ map} m.get(depart); if there is a tie, then a* ← argmax_{m ∈ map} m.get(r_a)
45: (ν*, r_{a*}, depart*, carry*, transit*) ← map[a*]
46: if transit* then
47:   t ← τ_{a*,lastLoc[a*],ν*}
48:   leave ← depart* − t
49:   Y_{lastLoc[a*],ν*,a*,leave} ← 1
50:   X_{a*,ν*,leave} ← 1
51:   lastLoc[a*] ← ν*
52:   arrival ← depart* + τ_{a*,ν*,Anchorage}
53:   PopulationUpdate(ν*, depart*, Anchorage, arrival, carry*)
54:   while lastTime[a*] < depart* do
55:     X_{a*,ν*,depart*} ← 1
56:     depart* −= 1
57: else
58:   arrival ← depart* + τ_{a*,ν*,Anchorage}
59:   while lastTime[a*] ≤ depart* do
60:     X_{a*,ν*,depart*} ← 1
61:     depart* −= 1
62:   PopulationUpdate(ν*, depart*, Anchorage, arrival, carry*)
63: lastTime[a*] ← depart*
64: SendAsset(a*)
65: if A_a = ∅ then
66:   return false
67: else
68:   return true
Algorithm 12: Finalize
Input: A, A_a
1: for a ∈ A_a do
2:   while lastTime[a] ≤ |T| do
3:     num ← number of planes located in lastLoc[a] at lastTime[a]
4:     if num + 1 ≤ κ_{lastLoc[a]} then
5:       X_{a,lastLoc[a],lastTime[a]} ← 1
6:       lastTime[a] += 1
7:     else
8:       loc ← a location with enough airport capacity at t = lastTime[a] + τ_{a,lastLoc[a],loc} in which plane a can land (i.e., θ_{loc,i} = 1)
9:       Y_{lastLoc[a],loc,a,lastTime[a]} ← 1
10:      X_{a,loc,t} ← 1
11:      lastTime[a] ← t
12:      lastLoc[a] ← loc
13: for v ∈ A \ A_a do
14:   while lastTime[v] ≤ |T| do
15:     lastTime[v] += 1
16:     X_{v,lastLoc[v],lastTime[v]} ← 1
A.2 Method Selection
In this section, we compare the performance of solving the IP directly against warm-starting with
the OTH and the COBOH. We first conduct our comparison in the baseline experiment. We then
create another setting in which the runway lengths of the airports located in the small villages are
assumed to be upgraded, implying that all plane types can land in and take off from those airports.
With the latter experiment, our goal is to examine the performance of the solution methods when
there is a change in the baseline setup, and we observed that the decisions become more difficult in
this setting.
We present how the two heuristic approaches performed with respect to their initial optimality
gaps in Tables A1 and A2, which correspond to the baseline experiment and the experiment with
the upgraded runways, respectively. The optimality gaps are calculated as follows. After obtaining
a solution vector via a heuristic method, the IP model is warm-started with the transportation
variables X_{a,i,t} and Y_{i,j,a,t}. Note that even though we could also include the number of evacuees
transported (i.e., m_{i,j,a,t}) in the solution vector, our preliminary experiments indicated that providing
only the former variables yields better initial objective values. After warm-starting the model, the
initial objective value and the best bound reported by the solver at the end of the time limit are
used as an upper bound (UB) and a lower bound (LB), respectively. The optimality gap is then
computed as (UB − LB)/LB.
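As a small sketch of this gap calculation (the numbers here are hypothetical):

```python
def optimality_gap(ub, lb):
    """Relative optimality gap (UB - LB) / LB used to compare warm starts."""
    return (ub - lb) / lb

# e.g., a warm-started objective of 105 against a solver best bound of 100
print(f"{optimality_gap(105.0, 100.0):.2%}")  # 5.00%
```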
Recall that the proposed heuristic approaches focus solely on the transportation decisions and,
Table A1: Comparison of the initial optimality gaps in the baseline experiment

Incident        1      1      1      2      2      2      3      3      3      4      4      4      5      5      5
Num. of People  800    1200   1600   800    1200   1600   800    1200   1600   800    1200   1600   800    1200   1600
OTH             0.54%  0.01%  0.05%  0.22%  1.01%  0.16%  2.53%  3.09%  2.65%  0.01%  0.08%  0.77%  0.40%  0.09%  0.36%
COBOH           0.38%  0.88%  0.65%  12.36% 18.81% 6.45%  52.78% 24.99% 16.57% 23.43% 15.57% 9.89%  0.05%  0.09%  0.33%

Table A2: Comparison of the initial optimality gaps in the experiment with the upgraded runways

Incident        1      1      1      2      2      2      3      3      3      4      4      4      5      5      5
Num. of People  800    1200   1600   800    1200   1600   800    1200   1600   800    1200   1600   800    1200   1600
OTH             0.06%  0.39%  0.20%  0.62%  2.41%  0.66%  0.09%  0.05%  0.16%  3.99%  1.71%  1.71%  4.63%  3.84%  2.10%
COBOH           0.38%  0.95%  1.29%  4.10%  5.72%  7.69%  6.51%  6.64%  8.06%  6.69%  7.13%  7.90%  0.05%  0.09%  0.33%
in essence, the OTH solves the transportation version of the model to optimality. In both
experiments, it can be seen that the OTH produces a better initial optimality gap than the
COBOH in most of the instances. However, the COBOH outperforms the OTH with respect
to the initial optimality gaps in Incident 5 in both experiments. In addition, it produces a better
initial gap in Incident 1-800 in the baseline experiment. This can be explained by the observation
that if there are large local communities capacity-wise around an incident area (e.g., Incidents 1
and 5), the COBOH gives a good approximation of the optimal solution. An important
observation with respect to these tables is that significant improvements in terms of the objective can
be obtained by using optimization-based approaches (i.e., the OTH) as opposed to relying on intuitive
approaches to responding to the mass rescue event.
We now focus on the final results obtained with the three methods in the two experiments. Tables
A3 and A4 summarize the comparisons in terms of the solution times and the final optimality gaps
for the baseline experiment and the experiment with the upgraded runways, respectively. In the
baseline experiment, while the number of instances solved to optimality is similar across the three
methods, the IP model solved with CPLEX did not return any incumbent solution for Incident
3-1200. As for the experiment with the upgraded runways, while the IP model reaches the optimal
solution via CPLEX in six scenarios, the problem is solved to optimality in seven scenarios with
the support of the heuristic approaches. However, while CPLEX did not produce any solution
within an hour for Incident 4-1600, it returned a poor solution (i.e., an optimality gap of 5.90%)
for Incident 4-1200. Overall, we observe that since the larger local communities with higher
capacities (i.e., Kotzebue, Nome, and Utqiagvik) are closer to Incidents 1, 2, and 5, it becomes
relatively easier to obtain a high-quality solution with CPLEX for those specific incident areas in
both experiments.
As for warm-starting the model with the solutions generated by the heuristic approaches,
Table A3: Comparison of the solution methods in the baseline experiment [Time (in mins), Gap (%)]

                            IP              OTH             COBOH
Incident    Num. of People  Time    Gap     Time    Gap     Time    Gap
Incident 1  800             16.43   0.00    9.72    0.00    13.05   0.00
Incident 1  1200            60.00   0.26    51.00   0.00    60.00   0.07
Incident 1  1600            60.00   0.12    60.00   0.03    60.00   0.15
Incident 2  800             12.22   0.00    4.49    0.00    14.41   0.00
Incident 2  1200            13.89   0.00    13.49   0.00    6.42    0.00
Incident 2  1600            51.55   0.00    60.00   0.14    60.00   0.19
Incident 3  800             60.00   1.65    60.00   1.55    60.00   4.00
Incident 3  1200            60.00   —       60.00   0.47    60.00   0.59
Incident 3  1600            60.00   0.89    60.00   1.19    60.00   0.87
Incident 4  800             41.10   0.00    7.62    0.00    13.78   0.00
Incident 4  1200            24.29   0.00    6.61    0.00    7.02    0.00
Incident 4  1600            28.31   0.00    4.82    0.00    8.41    0.00
Incident 5  800             3.69    0.00    3.01    0.00    2.53    0.00
Incident 5  1200            3.57    0.00    3.13    0.00    2.79    0.00
Incident 5  1600            4.46    0.00    3.07    0.00    2.86    0.00
even though the OTH produces a better initial solution (see Tables A1 and A2) in most of the
cases, when it comes to the overall performance in terms of the solution time, we do not observe a
consistent difference. The warm-start with the OTH produces better final results in more scenarios
than the warm-start with the COBOH (e.g., Incident 1-1200 and Incident 1-1600 in both Tables A3
and A4).
Table A4: Comparison of the solution methods in the experiment with the upgraded runways

                            IP              OTH             COBOH
Incident    Num. of People  Time    Gap     Time    Gap     Time    Gap
Incident 1 800 45.28 0.00 17.78 0.00 26.02 0.00
Incident 1 1200 60.00 0.31 60.00 0.14 60.00 0.24
Incident 1 1600 60.00 0.29 60.00 0.12 60.00 0.27
Incident 2 800 7.60 0.00 11.23 0.00 9.53 0.00
Incident 2 1200 60.00 0.85 60.00 0.05 60.00 0.05
Incident 2 1600 60.00 1.01 60.00 0.19 52.44 0.00
Incident 3 800 39.74 0.00 60.00 0.04 60.39 0.04
Incident 3 1200 60.00 1.19 60.00 0.02 60.00 0.02
Incident 3 1600 60.00 1.54 60.00 0.16 60.00 0.30
Incident 4 800 60.00 0.02 7.20 0.00 33.34 0.00
Incident 4 1200 60.00 5.90 10.35 0.00 60.35 0.02
Incident 4 1600 60.00 — 60.00 0.05 60.39 0.04
Incident 5 800 5.90 0.00 5.71 0.00 3.80 0.00
Incident 5 1200 7.39 0.00 5.12 0.00 4.07 0.00
Incident 5 1600 8.11 0.00 5.77 0.00 5.27 0.00
Bibliography
Adulyasak, Y., Cordeau, J.-F., and Jans, R. (2015). Benders decomposition for production routingunder demand uncertainty. Operations Research, 63(4):851â867.
Afenyo, M., Khan, F., and Ng, A. K. (2020). Assessing the risk of potential oil spills in the Arcticdue to shipping. In Maritime Transport and Regional Sustainability, pages 179â193. Elsevier.
Ahat, B., Ekim, T., and TaskÄąn, Z. C. (2017). Integer programming formulations and Bendersdecomposition for the maximum induced matching problem. INFORMS Journal on Computing,30(1):43â56.
Aiello, G., Hopps, F., Santisi, D., and Venticinque, M. (2020). The employment of unmanned aerialvehicles for analyzing and mitigating disaster risks in industrial sites. IEEE Transactions onEngineering Management, 67(3):519â530.
Akers, S. B., Harel, D., and Krishnamurthy, B. (1994). The star graph: An attractive alternative tothe n-cube. Proceedings of the International Conference on Parallel Processing, pages 393â400.
Akers, S. B. and Krishnamurthy, B. (1989). A group-theoretic model for symmetric interconnectionnetworks. IEEE Transactions on Computers, 38(4):555â566.
Alaska Airlines (2020). Our aircraft. https://www.alaskaair.com/content/travel-info/our-
aircraft/. (Accessed on 11/03/2020).
Alaska Department of Health and Social Services (2018). Trauma system in Alaska. http://dhss
.alaska.gov/dph/Emergency/Pages/trauma/default.aspx. (Accessed on 12/29/2019).
Alaska Department of Revenue (2020). tax.alaska.gov/programs/documentviewer/viewer.aspx?1583r.http://tax.alaska.gov/programs/documentviewer/viewer.aspx?1583r. (Accessed on09/05/2020).
Alibeyg, A., Contreras, I., and Fernandez, E. (2018). Exact solution of hub network design problemswith profits. European Journal of Operational Research, 266(1):57â71.
Alkaabneh, F., Diabat, A., and Elhedhli, S. (2019). A Lagrangian heuristic and GRASP for thehub-and-spoke network system with economies-of-scale and congestion. Transportation ResearchPart C: Emerging Technologies, 102:249â273.
Allison, E. and Mandler, B. (2018). Oil and gas in the U.S. Arctic. https://www.americangeosciences.org/geoscience-currents/oil-and-gas-us-arctic. (Accessed on 01/07/2020).
Arctic Domain Awareness Center (2016). Arctic-related incidents of national significance workshop.https://arcticdomainawarenesscenter.org/Downloads/PDF/Arctic%20IoNS/ADAC Arctic%
20IoNS%202016 Report 160906.pdf. (Accessed on 03/03/2021).
135
Arctic Domain Awareness Center (2021). Arctic maritime horizons workshop. https://arcticdo
mainawarenesscenter.org/Events. (Accessed on 03/22/2021).
Ashtiani, M., Salehzadeh-Yazdi, A., Razaghi-Moghadam, Z., Hennig, H., Wolkenhauer, O., Mirzaie,M., and Jafari, M. (2018). A systematic survey of centrality measures for protein-protein interac-tion networks. BMC Systems Biology, 12(1):80.
Aykin, T. (1994). Lagrangian relaxation based approaches to capacitated hub-and-spoke network design problem. European Journal of Operational Research, 79(3):501–523.
Bai, L. and Rubin, P. A. (2009). Combinatorial Benders cuts for the minimum tollbooth problem. Operations Research, 57(6):1510–1522.
Banerjee, A., Chandrasekhar, A. G., Duflo, E., and Jackson, M. O. (2013). The diffusion of microfinance. Science, 341(6144):1236498.
Bavelas, A. (1948). A mathematical model for group structures. Applied Anthropology, 7(3):16–30.
Bavelas, A. (1950). Communication patterns in task-oriented groups. The Journal of the Acoustical Society of America, 22(6):725–730.
Behbahani, H., Nazari, S., Kang, M. J., and Litman, T. (2019). A conceptual framework to formulate transportation network design problem considering social equity criteria. Transportation Research Part A: Policy and Practice, 125:171–183.
Benders, J. F. (1962). Partitioning procedures for solving mixed-variables programming problems. Numerische Mathematik, 4(1):238–252.
Bhowmick, S. S. and Seah, B. S. (2015). Clustering and summarizing protein-protein interaction networks: A survey. IEEE Transactions on Knowledge and Data Engineering, 28(3):638–658.
Bley, A. and Rezapour, M. (2016). Combinatorial approximation algorithms for buy-at-bulk connected facility location problems. Discrete Applied Mathematics, 213:34–46.
Bonacich, P. (1972). Factoring and weighting approaches to status scores and clique identification. Journal of Mathematical Sociology, 2(1):113–120.
Bonacich, P. (1987). Power and centrality: A family of measures. American Journal of Sociology, 92(5):1170–1182.
Botton, Q., Fortz, B., Gouveia, L., and Poss, M. (2013). Benders decomposition for the hop-constrained survivable network design problem. INFORMS Journal on Computing, 25(1):13–26.
Brady, C. (2019). Boeing 737 Detailed Technical Data. http://www.b737.org.uk/techspecsdetailed.htm. (Accessed on 12/04/2019).
Brooks, A. D. (2020). Search & Rescue in The North Slope Borough. http://www.north-slope.org/departments/search-rescue. (Accessed on 01/20/2020).
Canca, D., De-Los-Santos, A., Laporte, G., and Mesa, J. A. (2017). An adaptive neighborhood search metaheuristic for the integrated railway rapid transit network design and line planning problem. Computers & Operations Research, 78:1–14.
Cessna (2019). Cessna Caravan. https://cessna.txtav.com/en/turboprop/caravan. (Accessed on 12/04/2019).
Chankong, V. and Haimes, Y. Y. (2008). Multiobjective decision making: Theory and methodology. Courier Dover Publications.
Chen, L. and Miller-Hooks, E. (2012). Resilience: an indicator of recovery capability in intermodal freight transport. Transportation Science, 46(1):109–123.
Chiang, W.-K. and Chen, R.-J. (1998). Topological properties of the (n, k)-star graph. International Journal of Foundations of Computer Science, 9(02):235–248.
Chou, Z.-T., Hsu, C.-C., and Sheu, J.-P. (1996). Bubblesort star graphs: A new interconnection network. In Proceedings of 1996 International Conference on Parallel and Distributed Systems, pages 41–48. IEEE.
Coast Guard News (2016). Coast guard partners industry conduct mass rescue tabletop exercise in Anchorage Alaska. https://coastguardnews.com/coast-guard-partners-industry-conduct-mass-rescue-tabletop-exercise-in-anchorage-alaska/2016/04/21/. (Accessed on 12/29/2019).
Contreras, I., Cordeau, J.-F., and Laporte, G. (2011). Benders decomposition for large-scale uncapacitated hub location. Operations Research, 59(6):1477–1490.
Cordeau, J.-F., Furini, F., and Ljubic, I. (2019). Benders decomposition for very large scale partial set covering and maximal covering location problems. European Journal of Operational Research, 275(3):882–896.
Crainic, T. G., Hewitt, M., Toulouse, M., and Vu, D. M. (2016). Service network design with resource constraints. Transportation Science, 50(4):1380–1393.
Dalal, J. and Uster, H. (2017). Combining worst case and average case considerations in an integrated emergency response network design problem. Transportation Science, 52(1):171–188.
Dangalchev, C. (2006). Residual closeness in networks. Physica A: Statistical Mechanics and its Applications, 365(2):556–564.
Day, K. and Tripathi, A. (1992). Arrangement graphs: a class of generalized star graphs. Information Processing Letters, 42(5):235–241.
De Corte, A. and Sorensen, K. (2016). An iterated local search algorithm for water distribution network design optimization. Networks, 67(3):187–198.
Division of Homeland Security & Emergency Management (2019). Resource catalog. https://www.ready.alaska.gov/SEOC/ResourceCatalog. (Accessed on 12/04/2019).
Doan, X. V. and Shaw, D. (2019). Resource allocation when planning for simultaneous disasters. European Journal of Operational Research, 274(2):687–709.
Eckstein, M. (2021). New Arctic strategy calls for regular presence as a way to compete with Russia, China. https://news.usni.org/2021/01/05/new-arctic-strategy-calls-for-regular-presence-as-a-way-to-compete-with-russia-china. (Accessed on 03/08/2021).
Elbaih, A. H. and Alnasser, S. R. (2020). Teaching approach for START triage in disaster management. Medicine, 9(4):4.
Elmhadhbi, L., Karray, M.-H., Archimede, B., Otte, J. N., and Smith, B. (2020). A semantics-based common operational command system for multiagency disaster response. IEEE Transactions on Engineering Management, pages 1–15.
Emde, S., Polten, L., and Gendreau, M. (2020). Logic-based Benders decomposition for scheduling a batching machine. Computers & Operations Research, 113:104777.
Enayaty-Ahangar, F., Rainwater, C. E., and Sharkey, T. C. (2019). A logic-based decomposition approach for multi-period network interdiction models. Omega, 87:71–85.
Eskandarpour, M., Dejax, P., and Peton, O. (2017). A large neighborhood search heuristic for supply chain network design. Computers & Operations Research, 80:23–37.
Estrada, E. (2006). Virtual identification of essential proteins within the protein interaction network of yeast. Proteomics, 6(1):35–40.
Estrada, E. and Rodríguez-Velázquez, J. A. (2005). Subgraph centrality in complex networks. Physical Review E, 71:056103.
Everett, M. G. and Borgatti, S. P. (1999). The centrality of groups and classes. The Journal of Mathematical Sociology, 23(3):181–201.
Everett, M. G. and Borgatti, S. P. (2005). Extending centrality. Models and Methods in Social Network Analysis, 35(1):57–76.
Farahani, R. Z., Lotfi, M., Baghaian, A., Ruiz, R., and Rezapour, S. (2020). Mass casualty management in disaster scene: A systematic review of OR&MS research in humanitarian operations. European Journal of Operational Research, 287(3):787–819.
Fazel-Zarandi, M. M. and Beck, J. C. (2012). Using logic-based Benders decomposition to solve the capacity- and distance-constrained plant location problem. INFORMS Journal on Computing, 24(3):387–398.
Federal Aviation Administration (2019). Alaskan Region Airports Division. https://www.faa.gov/airports/alaskan/. (Accessed on 03/11/2020).
Fischetti, M., Ljubic, I., and Sinnl, M. (2016). Benders decomposition without separability: A computational study for capacitated facility location problems. European Journal of Operational Research, 253(3):557–569.
Fischetti, M., Ljubic, I., and Sinnl, M. (2017). Redesigning Benders decomposition for large-scale facility location. Management Science, 63(7):2146–2162.
Fjørtoft, K. and Berg, T. E. (2020). Handling the preparedness challenges for maritime and offshore operations in Arctic waters. In Arctic Marine Sustainability, pages 187–212. Springer.
Fortz, B., Gorgone, E., and Papadimitriou, D. (2017). A Lagrangian heuristic algorithm for the time-dependent combined network design and routing problem. Networks, 69(1):110–123.
Frank, S. M. and Rebennack, S. (2015). Optimal design of mixed AC-DC distribution systems for commercial buildings: A nonconvex generalized Benders decomposition approach. European Journal of Operational Research, 242(3):710–729.
Freeman, L. C. (1978). Centrality in social networks conceptual clarification. Social Networks, 1(3):215–239.
Friggstad, Z., Rezapour, M., Salavatipour, M. R., and Soto, J. A. (2019). LP-based approximation algorithms for facility location in buy-at-bulk network design. Algorithmica, 81(3):1075–1095.
Gabrel, V., Knippel, A., and Minoux, M. (1999). Exact solution of multicommodity network optimization problems with general step cost functions. Operations Research Letters, 25(1):15–23.
Garrett, R. A., Sharkey, T. C., Grabowski, M., and Wallace, W. A. (2017). Dynamic resource allocation to support oil spill response planning for energy exploration in the Arctic. European Journal of Operational Research, 257(1):272–286.
GCR (2017). AirportIQ 5010. https://www.airportiq5010.com/5010web/. (Accessed on 01/31/2020).
Gendron, B. (2019). Revisiting Lagrangian relaxation for network design. Discrete Applied Mathematics, 261:203–218.
Geoffrion, A. M. (1972). Generalized Benders decomposition. Journal of Optimization Theory and Applications, 10(4):237–260.
Goemans, M. X., Goldberg, A. V., Plotkin, S., Shmoys, D. B., Tardos, E., and Williamson, D. P. (1994). Improved approximation algorithms for network design problems. In Proceedings of the Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 223–232. Society for Industrial and Applied Mathematics.
Govindan, K., Jafarian, A., and Nourbakhsh, V. (2019). Designing a sustainable supply chain network integrated with vehicle routing: A comparison of hybrid swarm intelligence metaheuristics. Computers & Operations Research, 110:220–235.
Grimmer, B. (2018). Dual-based approximation algorithms for cut-based network connectivity problems. Algorithmica, 80(10):2849–2873.
Griner, C. (2013). Learjet 31A Rescue Bird in search and rescue. https://www.flickr.com/photos/air_traveller/10392962794. (Accessed on 12/04/2019).
Guo, C., Bodur, M., Aleman, D. M., and Urbach, D. R. (2021). Logic-based Benders decomposition and binary decision diagram based approaches for stochastic distributed operating room scheduling. INFORMS Journal on Computing.
Hanlon, T. (2020). ConocoPhillips shuts down North Slope drilling over coronavirus concerns. https://www.alaskapublic.org/2020/04/08/conocophillips-shuts-down-north-slope-drilling-over-coronavirus-concerns/. (Accessed on 09/05/2020).
Heleniak, T. (2020). The future of the Arctic populations. Polar Geography, pages 1–17.
Hjort, J., Karjalainen, O., Aalto, J., Westermann, S., Romanovsky, V. E., Nelson, F. E., Etzelmuller, B., and Luoto, M. (2018). Degrading permafrost puts Arctic infrastructure at risk by mid-century. Nature Communications, 9(1):1–9.
Hoag, H. (2016). NOAA is Updating its Arctic Charts to Prevent a Nautical Disaster. https://deeply.thenewhumanitarian.org/arctic/community/2016/08/29/noaa-is-updating-its-arctic-charts-to-prevent-a-nautical-disaster. (Accessed on 03/08/2021).
Holguín-Veras, J., Perez, N., Jaller, M., Van Wassenhove, L. N., and Aros-Vera, F. (2013). On the appropriate objective function for post-disaster humanitarian logistics models. Journal of Operations Management, 31(5):262–280.
Holmberg, K. (1994). On using approximations of the Benders master problem. European Journal of Operational Research, 77(1):111–125.
Hooker, J. N. (2007). Planning and scheduling by logic-based Benders decomposition. Operations Research, 55(3):588–602.
Hooker, J. N. and Ottosson, G. (2003). Logic-based Benders decomposition. Mathematical Programming, 96(1):33–60.
Hu, M., Cai, W., and Zhao, H. (2019). Simulation of passenger evacuation process in cruise ships based on a multi-grid model. Symmetry, 11(9):1166.
Humpert, M. (2018). Arctic cruise ship runs aground in Canada's northwest passage. https://www.highnorthnews.com/en/arctic-cruise-ship-runs-aground-canadas-northwest-passage. (Accessed on 12/29/2019).
Humpert, M. (2019). New satellite images reveal extent of Russia's military and economic build-up in the Arctic. https://www.highnorthnews.com/en/new-satellite-images-reveal-extent-russias-military-and-economic-build-arctic. (Accessed on 12/04/2019).
Igraph (2020). R igraph manual pages. https://igraph.org/r/doc. (Accessed on 12/07/2020).
Ilinova, A. and Chanysheva, A. (2020). Algorithm for assessing the prospects of offshore oil and gas projects in the Arctic. Energy Reports, 6:504–509.
International Maritime Organization (2016). International code for ships operating in polar waters. http://www.imo.org/en/MediaCentre/HotTopics/polar/Documents/POLAR%20CODE%20TEXT%20AS%20ADOPTED.pdf. (Accessed on 09/14/2020).
Jalili, M., Salehzadeh-Yazdi, A., Asgari, Y., Arab, S. S., Yaghmaie, M., Ghavamzadeh, A., and Alimoghaddam, K. (2015). Centiserver: A Comprehensive Resource, Web-Based Application and R Package for Centrality Analysis. PLOS ONE, 10(11):1–8.
Jenkins, P. R., Lunday, B. J., and Robbins, M. J. (2019). Robust, multi-objective optimization for the military medical evacuation location-allocation problem. Omega, page 102088.
Jeong, H., Mason, S. P., Barabasi, A.-L., and Oltvai, Z. N. (2001). Lethality and centrality in protein networks. Nature, 411(6833):41–42.
Joy, M. P., Brock, A., Ingber, D. E., and Huang, S. (2005). High-betweenness proteins in the yeast protein interaction network. BioMed Research International, 2005(2):96–103.
Kamath, R. S., Fraser, A. G., Dong, Y., Poulin, G., Durbin, R., Gotta, M., Kanapin, A., Le Bot, N., Moreno, S., Sohrmann, M., et al. (2003). Systematic functional analysis of the Caenorhabditis elegans genome using RNAi. Nature, 421(6920):231–237.
Kelman, I. (2020). Arctic humanitarianism for post-disaster settlement and shelter. Disaster Prevention and Management: An International Journal, 29(4):471–480.
Keskin, M. E. (2017). A column generation heuristic for optimal wireless sensor network design with mobile sinks. European Journal of Operational Research, 260(1):291–304.
Kleitman, D. J. and Winston, K. J. (1982). On the number of graphs without 4-cycles. Discrete Mathematics, 41(2):167–172.
Kloimullner, C. and Raidl, G. R. (2017). Full-load route planning for balancing bike sharing systems by logic-based Benders decomposition. Networks, 69(3):270–289.
Leavitt, H. J. (1951). Some effects of certain communication patterns on group performance. The Journal of Abnormal and Social Psychology, 46(1):38.
Leitner, M., Ljubic, I., Riedler, M., and Ruthmair, M. (2020). Exact approaches for the directed network design problem with relays. Omega, 91:102005.
Li, C., Lin, S., and Li, S. (2020a). Structure connectivity and substructure connectivity of star graphs. Discrete Applied Mathematics, 284:472–480.
Li, Y., Zhang, J., and Yu, G. (2020b). A scenario-based hybrid robust and stochastic approach for joint planning of relief logistics and casualty distribution considering secondary disasters. Transportation Research Part E: Logistics and Transportation Review, 141:102029.
Li, Z., Swann, J. L., and Keskinocak, P. (2018). Value of inventory information in allocating a limited supply of influenza vaccine during a pandemic. PLOS One, 13(10):e0206293.
Lin, L., Huang, Y., Hsieh, S.-Y., and Xu, L. (2020). Strong reliability of star graphs interconnection networks. IEEE Transactions on Reliability.
Liu, Y., Cui, N., and Zhang, J. (2019). Integrated temporary facility location and casualty allocation planning for post-disaster humanitarian medical service. Transportation Research Part E: Logistics and Transportation Review, 128:1–16.
Mak, L., Farnworth, B., Wissler, E. H., DuCharme, M. B., Uglene, W., Boileau, R., Hackett, P., and Kuczora, A. (2011). Thermal requirements for surviving a mass rescue incident in the Arctic: Preliminary results. In ASME 2011 30th International Conference on Ocean, Offshore and Arctic Engineering, pages 375–383. American Society of Mechanical Engineers Digital Collection.
McKee, C. H., Heffernan, R. W., Willenbring, B. D., Schwartz, R. B., Liu, J. M., Colella, M. R., and Lerner, E. B. (2020). Comparing the accuracy of mass casualty triage systems when used in an adult population. Prehospital Emergency Care, 24(4):515–524.
McNutt, C. (2016). Northwest Passage 2016 Exercise, After Action Report. https://www.hsdl.org/?abstract&did=802138. (Accessed on 12/29/2019).
Messner, S. (2020). Future Arctic shipping, black carbon emissions, and climate change. In Maritime Transport and Regional Sustainability, pages 195–208. Elsevier.
Morgunova, M. (2020). The global energy system through a prism of change: The oil & gas industry and the case of the Arctic. PhD thesis, KTH Royal Institute of Technology.
Nasirian, F., Pajouh, F. M., and Balasundaram, B. (2020). Detecting a most closeness-central clique in complex networks. European Journal of Operational Research, 283(2):461–475.
National Oceanic & Atmospheric Administration (2021). NOAA surveys the unsurveyed, leading the way in the U.S. Arctic. https://nauticalcharts.noaa.gov/updates/noaa-surveys-the-unsurveyed-leading-the-way-in-the-u-s-arctic/. (Accessed on 03/08/2021).
Neelam, S. and Sood, S. K. (2020). A scientometric review of global research on smart disaster management. IEEE Transactions on Engineering Management, 68(1):317–329.
Nguyen, H., Sharkey, T. C., Mitchell, J. E., and Wallace, W. A. (2020). Optimizing the recovery of disrupted single-sourced multi-echelon assembly supply chain networks. IISE Transactions, 52(7):703–720.
Nurre, S. G., Cavdaroglu, B., Mitchell, J. E., Sharkey, T. C., and Wallace, W. A. (2012). Restoring infrastructure systems: An integrated network design and scheduling (INDS) problem. European Journal of Operational Research, 223(3):794–806.
Office of Aviation Forces (2019). USCG Fixed Wing & Sensors Division (CG-7113). https://www.dco.uscg.mil/Our-Organization/Assistant-Commandant-for-Capability-CG-7/Office-of-Aviation-Force-CG-711/Fixed-Wing-Sensors-Division-CG-7113/. (Accessed on 12/04/2019).
Østhagen, A. (2020). Maritime Tasks and Challenges in the Arctic. In Coast Guards and Ocean Politics in the Arctic, pages 25–32. Springer.
Paraskevopoulos, D. C., Bektas, T., Crainic, T. G., and Potts, C. N. (2016). A cycle-based evolutionary algorithm for the fixed-charge capacitated multi-commodity network design problem. European Journal of Operational Research, 253(2):265–279.
Pavlov, V. (2020). Arctic marine oil spill response methods: Environmental challenges and technological limitations. In Arctic Marine Sustainability, pages 213–248. Springer.
Pérez-Rodríguez, N. and Holguín-Veras, J. (2015). Inventory-allocation distribution models for post-disaster humanitarian logistics with explicit consideration of deprivation costs. Transportation Science, 50(4):1261–1285.
Przybylak, R. and Wyszynski, P. (2020). Air temperature changes in the Arctic in the period 1951–2015 in the light of observational and reanalysis data. Theoretical and Applied Climatology, 139(1-2):75–94.
Rahmaniani, R., Crainic, T. G., Gendreau, M., and Rei, W. (2018). Accelerating the Benders decomposition method: Application to stochastic network design problems. SIAM Journal on Optimization, 28(1):875–903.
Rambha, T., Nozick, L. K., Davidson, R., Yi, W., and Yang, K. (2021). A stochastic optimization model for staged hospital evacuation during hurricanes. Transportation Research Part E: Logistics and Transportation Review, 151:102321.
Ramirez-Nafarrate, A., Araz, O. M., and Fowler, J. W. (2021). Decision assessment algorithms for location and capacity optimization under resource shortages. Decision Sciences, 52(1):142–181.
Rasti, S. and Vogiatzis, C. (2019). A survey of computational methods in protein–protein interaction networks. Annals of Operations Research, 276(1-2):35–87.
Ravi, R., Marathe, M. V., Ravi, S., Rosenkrantz, D. J., and Hunt III, H. B. (2001). Approximation algorithms for degree-constrained minimum-cost network-design problems. Algorithmica, 31(1):58–78.
RavnAir Alaska (2020). The Ravn Aircraft Fleet Specifications. https://www.flyravn.com/about-us/aircraft-fleet/. (Accessed on 12/04/2019).
Rodríguez-Espíndola, O., Alem, D., and Da Silva, L. P. (2020). A shortage risk mitigation model for multi-agency coordination in logistics planning. Computers & Industrial Engineering, 148:106676.
Rogers, D. D., King, M., and Carnahan, H. (2020). Arctic search and rescue: A case study for understanding issues related to training and human factors when working in the north. In Arctic Marine Sustainability, pages 333–344. Springer.
Roshanaei, V., Luong, C., Aleman, D. M., and Urbach, D. (2017). Propagating logic-based Benders' decomposition approaches for distributed operating room scheduling. European Journal of Operational Research, 257(2):439–455.
Ruskin, L. (2018). China seeks bigger role in Arctic. https://www.alaskapublic.org/2018/02/06/china-seeks-bigger-role-in-arctic/. (Accessed on 02/05/2020).
Rysz, M., Pajouh, F. M., and Pasiliao, E. L. (2018). Finding clique clusters with the highest betweenness centrality. European Journal of Operational Research, 271(1):155–164.
Sabouhi, F., Bozorgi-Amiri, A., Moshref-Javadi, M., and Heydari, M. (2019). An integrated routing and scheduling model for evacuation and commodity distribution in large-scale disaster relief operations: a case study. Annals of Operations Research, 283(1):643–677.
Saif, A. and Elhedhli, S. (2016). Cold supply chain design with environmental considerations: A simulation-optimization approach. European Journal of Operational Research, 251(1):274–287.
Samotij, W. (2015). Counting independent sets in graphs. European Journal of Combinatorics, 48:5–18.
Sarma, D., Das, A., Dutta, P., and Bera, U. K. (2020). A cost minimization resource allocation model for disaster relief operations with an information crowdsourcing-based MCDM approach. IEEE Transactions on Engineering Management, pages 1–21.
Schiermeyer, I. (2019). Maximum independent sets near the upper bound. Discrete Applied Mathematics, 266:186–190.
Schofield, C. and Østhagen, A. (2020). A Divided Arctic: Maritime Boundary Agreements and Disputes in the Arctic Ocean. In Handbook on Geopolitics and Security in the Arctic, pages 171–191. Springer.
Sen, S., Barnhart, C., Birge, J., Boyd, A., Fu, M., Hochbaum, D., Morton, D., Nemhauser, G., Nelson, B., Powell, W., et al. (2014). Operations research: A catalyst for engineering grand challenges. Technical report, National Science Foundation.
Setiawan, E., Liu, J., and French, A. (2019). Resource location for relief distribution and victim evacuation after a sudden-onset disaster. IISE Transactions, 51(8):830–846.
Shalina, E. V., Johannessen, O. M., and Sandven, S. (2020). Changes in Arctic Sea Ice Cover in the Twentieth and Twenty-First Centuries. In Sea Ice in the Arctic, pages 93–166. Springer.
Sherali, H. D., Bae, K.-H., and Haouari, M. (2010). Integrated airline schedule design and fleet assignment: Polyhedral analysis and Benders' decomposition approach. INFORMS Journal on Computing, 22(4):500–513.
Sherman, R. (2000). C-17 Globemaster III. https://fas.org/man/dod-101/sys/ac/c-17.htm. (Accessed on 12/04/2019).
Shu, J., Lv, W., and Na, Q. (2021). Humanitarian relief supply network design: Expander graph based approach and a case study of 2013 flood in northeast China. Transportation Research Part E: Logistics and Transportation Review, 146:102178.
Statista Research Department (2020). Cruise industry statistics & facts. https://www.statista.com/topics/1004/cruise-industry/. (Accessed on 03/08/2021).
Stauffer, J. M. and Kumar, S. (2021). Impact of incorporating returns into pre-disaster deployments for rapid-onset predictable disasters. Production and Operations Management, 30(2):451–474.
SteadieSeifi, M., Dellaert, N., Nuijten, W., and Van Woensel, T. (2017). A metaheuristic for the multimodal network flow problem with product quality preservation and empty repositioning. Transportation Research Part B: Methodological, 106:321–344.
Stepanov, A. and Smith, J. M. (2009). Multi-objective evacuation routing in transportation networks. European Journal of Operational Research, 198(2):435–446.
Stepien, A., Kauppila, L., Kopra, S., Kapyla, J., Lanteigne, M., Mikkola, H., and Nojonen, M. (2020). China's economic presence in the Arctic: Realities, expectations and concerns. In Chinese Policy and Presence in the Arctic, pages 90–136. Brill Nijhoff.
Struzik, E. (2018). In the melting Arctic, a harrowing account from a stranded ship. https://e360.yale.edu/features/in-the-melting-arctic-harrowing-account-from-a-stranded-ship. (Accessed on 03/11/2020).
Sung, I. and Lee, T. (2016). Optimal allocation of emergency medical resources in a mass casualty incident: Patient prioritization by column generation. European Journal of Operational Research, 252(2):623–634.
Szklarczyk, D., Franceschini, A., Wyder, S., Forslund, K., Heller, D., Huerta-Cepas, J., Simonovic, M., Roth, A., Santos, A., Tsafou, K. P., et al. (2015). STRING v10: Protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Research, 43(D1):D447–D452.
Taşkın, Z. C., Smith, J. C., and Romeijn, H. E. (2012). Mixed-integer programming techniques for decomposing IMRT fluence maps using rectangular apertures. Annals of Operations Research, 196(1):799–818.
Triantaphyllou, E. (2000). Multi-criteria decision making methods. In Multi-criteria Decision Making Methods: A Comparative Study, pages 5–21. Springer.
United Nations (2020). Passenger Vessels. London, UK: The International Maritime Organization (IMO). http://www.imo.org/en/OurWork/Safety/Regulations/Pages/PassengerShips.aspx. (Accessed on 09/05/2020).
United States Air Force (2008). HC-130P/N. https://www.106rqw.ang.af.mil/About-Us/Fact-Sheets/Display/Article/1041575/hc-130pn/. (Accessed on 12/04/2019).
U.S. Bureau of the Census (2019). The United States Census 2020. https://www.census.gov/. (Accessed on 12/29/2019).
US EPA (2020). Summary of the oil pollution act: Laws & regulations. https://www.epa.gov/laws-regulations/summary-oil-pollution-act#:~:text=33%20U.S.C.&text=The%20Oil%20Pollution%20Act%20(OPA,or%20unwilling%20to%20do%20so. (Accessed on 09/05/2020).
USCG (2016). Operational Assets. https://www.work.uscg.mil/Assets/. (Accessed on 12/04/2019).
Uster, H., Easwaran, G., Akcali, E., and Cetinkaya, S. (2007). Benders decomposition with alternative multiple cuts for a multi-product closed-loop supply chain network design model. Naval Research Logistics, 54(8):890–907.
Uster, H., Wang, X., and Yates, J. T. (2018). Strategic Evacuation Network Design (SEND) under cost and time considerations. Transportation Research Part B: Methodological, 107:124–145.
Veremyev, A., Prokopyev, O. A., and Pasiliao, E. L. (2017). Finding groups with maximum betweenness centrality. Optimization Methods and Software, 32(2):369–399.
Vogiatzis, C. and Camur, M. C. (2019). Identification of essential proteins using induced stars in protein–protein interaction networks. INFORMS Journal on Computing, 31(4):703–718.
Vogiatzis, C., Veremyev, A., Pasiliao, E. L., and Pardalos, P. M. (2015). An integer programming approach for finding the most and the least central cliques. Optimization Letters, 9(4):615–633.
Waldholz, R. (2016). On the scene with the Crystal Serenity. https://www.ktoo.org/2016/08/17/scene-crystal-serenity/. (Accessed on 03/25/2020).
Wang, J., Peng, W., and Wu, F.-X. (2013). Computational approaches to predicting essential proteins: A survey. PROTEOMICS–Clinical Applications, 7(1-2):181–192.
World Health Organization (2019). Publications on water sanitation and health. https://www.who.int/water_sanitation_health/publications/en/. (Accessed on 12/04/2019).
Wuchty, S. and Stadler, P. F. (2003). Centers of complex networks. Journal of Theoretical Biology, 223(1):45–53.
Ye, Y., Jiao, W., and Yan, H. (2020). Managing relief inventories responding to natural disasters: Gaps between practice and literature. Production and Operations Management, 29(4):807–832.
Yu, L., Yang, H., Miao, L., and Zhang, C. (2019). Rollout algorithms for resource allocation in humanitarian logistics. IISE Transactions, 51(8):887–909.
Zetina, C. A., Contreras, I., and Cordeau, J.-F. (2019). Exact algorithms based on Benders decomposition for multicommodity uncapacitated fixed-charge network design. Computers & Operations Research, 111:311–324.
Zhong, S., Cheng, R., Jiang, Y., Wang, Z., Larsen, A., and Nielsen, O. A. (2020). Risk-averse optimization of disaster relief facility location and vehicle routing under stochastic demand. Transportation Research Part E: Logistics and Transportation Review, 141:102015.