Intro to the ESUPS Optimization Model#

We’ll start with a simplified version of the model and then build from there.

Consider the scenario where we’re expecting to have a specific disaster and want to transport the supplies from warehouses with predefined stock as quickly as possible to the disaster. Perhaps it’s hurricane season, or a volcano has been showing increasing activity.

First, we list the input data we have:

  • \(I\) is the set of warehouses

  • \(\tau_i\) is the driving time to ship a single item from warehouse \(i \in I\) to the disaster

  • \(b_i\) is the starting inventory (buckets) at each warehouse \(i \in I\)

  • \(d\) is the demand we need to satisfy, i.e., the number of items we need to ship to the disaster

Then, let’s define our decision variables, i.e. the things we can vary.

  • \(y_i\) is the amount of supplies to send from warehouse \(i \in I\) to our disaster

Now that we have described what we can change with our variables, we can figure out how to represent the objective function!

\[ \begin{aligned} \min \sum_{i \in I} \tau_{i} \cdot y_i \end{aligned} \]

The constraints are:

\[\begin{split} \begin{aligned} \text{s.t.} & \sum_{i \in I} & y_{i} & = d & & \quad \text{(total supplies sent must meet demand)}\\ & & y_i & \leq b_i & \forall i \in I & \quad \text{(you can't send more than a warehouse has)}\\ &\text{} & y_{i} & \geq 0 &\forall i \in I & \quad \text{(you can't send negative supplies)} \end{aligned} \end{split}\]

Code the Basic Model#

So now let’s code and solve this model.

import gurobipy as gp
from gurobipy import GRB
import pandas as pd

path = "https://raw.githubusercontent.com/Gurobi/gurobi-casestudy-esups/refs/heads/main/docs/"
simple_Allocation = pd.read_csv(path + "data/simple_Allocation.csv")

# Prep data
BucketsNeeded = 13561
n = len(simple_Allocation)
t = simple_Allocation.drivingTime_hrs
b = simple_Allocation.Buckets

# Create model
model = gp.Model("simple_Allocation")
model.Params.OutputFlag = 0

# Add decision variables
y = model.addVars(n, name="Warehouse_Allocation")

# Add constraint to meet demand
model.addConstr(
    gp.quicksum(y[i] for i in range(n)) == BucketsNeeded, name="Meet_Demand"
)

# Add in warehouse_constraints
for i in range(n):
    model.addConstr(y[i] <= b[i], name=f"warehouse_endowment_{i}")

# Note we don't have a constraint for y >= 0 since it's assumed in the variable definition
# Add objective
objective = gp.quicksum(t[i] * y[i] for i in range(n))
model.setObjective(objective, GRB.MINIMIZE)

# Fire up the solver!
model.optimize()
Restricted license - for non-production use only - expires 2026-11-23

Now let’s analyze the results!

simple_Allocation_sol = simple_Allocation.copy()
simple_Allocation_sol["y"] = model.getAttr("X", y)
simple_Allocation_sol.sort_values("drivingTime_hrs")
depotCity drivingTime_hrs Buckets y
1 Ambatondrazaka 0.00 26 26.0
4 Antananarivo Renivohitra 6.00 9046 9046.0
11 Miarinarivo 7.00 3 3.0
14 Toamasina I 8.00 1580 1580.0
5 Antsohihy 10.00 610 610.0
2 Ambositra 11.00 41 0.0
7 Fenerive Est 11.00 6689 2296.0
0 Ambanja 14.00 375 0.0
9 Maintirano 14.25 2460 0.0
8 Mahajanga I 16.00 150 0.0
12 Morondava 16.00 736 0.0
10 Manakara 17.00 4235 0.0
6 Farafangana 19.00 5201 0.0
15 Toliara 22.00 1637 0.0
13 Sambava 23.00 5700 0.0
3 Ambovombe 26.00 2322 0.0

Now, you might think that if we want to position supplies to respond to a known disaster, we should just put them as close as possible. It’s an intuitive solution that can be solved with a simple greedy algorithm. But of course, life is never that simple.

Now that we’ve got the initial problem outlined, let’s start making it more realistic with two additions:

  1. Instead of preparing for only one disaster, let’s prepare for all the disasters that might occur.

  2. Instead of being an omniscient observer, let’s say we aren’t sure where the next disaster will be.

Now, these assumptions might be intimidating at first. How do you rigorously model information you don’t know? Isn’t this the realm of predictive models?

This reaction is totally normal and indeed, we often use predictive modeling when our information is incomplete. However, remember that this problem isn’t easily suited to ML techniques due to the low availability of data, small feature space, and large solution space. So how do we approach this with optimization? Well, now that we know how supplies will be dispensed for a given warehouse allocation, we can leverage the small set of historical disasters we do have to ask which allocations would have saved/minimized the time taken to respond to all of the disasters. Then, our optimization techniques will give us a small number that could feasibly satisfy this (where our constraints meet, see the introduction for more on this) and we can simply iterate over them.

But how do we go about it in practice?

Include All Disasters#

Before we begin, let’s explicitly define our new problem with the additional requirements outlined in the previous section so we’re all on the same page. Our first step is to add in the fact that there are more disasters than just one. We can do that by including a variable index to denote which disaster we’re talking about.

Let \(K\) be the set of disasters, and \(k\) be the index for the disaster scenario at hand, i.e., a storm, earthquake, or epidemic.

In scenario \(k\), the time for a warehouse \(i\) to send \(y_i\) items is \(\tau_{ik} \cdot y^k_i\)

Repeating for all warehouses gives us \( \sum_i \tau_{ik} \cdot y^k_i \)

Repeating for all disasters gives us \( \sum_k \sum_i \tau_{ik} \cdot y^k_i \)

The question that ESUPS, and subsequently this model, set out to answer is: Where should disaster relief supplies be stored to ensure they get to people in need as quickly and cheaply as possible in the event of a disaster?

By comparing the overall travel time for every single disaster that we have data on, we can make determinations on whether storing different quantities of materials in different warehouses would be better or worse given our historical data.

Before we get to solving, let’s add in one more factor. Different disasters occur at different rates. A landlocked nation may be less likely to experience a disaster caused by a tropical storm than an earthquake, so shouldn’t we weigh the response time to earthquakes more than that of a storm?

Accounting for Randomness#

One of the most difficult tasks that we can encounter in data science problems is randomness, or as it’s often called, “stochastic” elements. This can be seen in all kinds of ways, but in our case study, we’re going to look at how we account for not knowing which disaster will hit next (perhaps even several years in the future). The way we do this is to create a stochastic model, which may initially seem challenging—but in discrete events like this, it’s super easy.

If you’re familiar with expected value, this will quickly make sense—but no need for any prior experience! See the sub-notebook for a quick intro. The idea is that we weigh the outcome of the event (i.e., total travel time in this case) by the probability it occurs. For example, if an earthquake is 3 times as likely as a flood, we would rewrite the equation as:

\[0.75 \cdot\text{(travel time earthquake)} + 0.25 \cdot\text{(travel time flood)} \]

Updated Model#

Now that we’ve had some background on expected value, let’s return to our model. Remember, our goal is to account for the inherent randomness associated with different disasters occurring.

Let \(P^k\) be the probability of disaster \(k\) and \(t^k\) be the total travel time involved in disaster \(k\) from the previous sections. Using our definition of expected value, this gives us

\[ \sum_k P^k \cdot t^k\]

Now, you might notice that we have already written an equation for the total travel time for disaster. Substituting this in, we get

\[\sum_k P^k \sum_i \tau_{ik} \cdot y^k_i\]

In our case, turning this problem into a stochastic optimization problem that can account for probability only involves one more value per disaster. Our final task right now is to minimize the time required to get supplies to the disaster sites, given how likely each disaster is to occur:

\[ \min_y \sum_k P^k \sum_i \tau_{ik} \cdot y^k_i \]

And we can see the constraints are almost unchanged:

\[\begin{split} \begin{aligned} \text{s.t.} \quad \sum_{i} y^k_{i} & = d^k & \forall k & \quad \text{(total supplies sent must meet demand)}\\ y^k_i & \leq b_i & \forall i \in I,~ \forall k \in K & \quad \text{(you can't send more than a warehouse has)}\\ y^k_{i} & \geq 0 & \forall i \in I,~ \forall k \in K & \quad \text{(you can't send negative supplies)}\\ \end{aligned} \end{split}\]

Note that we made an important assumption when defining the warehouse capacity constraints: We assumed that not all disasters happen at the same time, and so we have the full warehouse capacity available for each disaster.

All the probabilities should sum up to one, i.e., . But how do we calculate the probabilities? Well, we have good long-term data on what disasters have affected which countries. For now, we can go through that data and calculate the probability of disaster by counting how many times it has occurred and dividing by the number of total disasters. You may have noticed that every single disaster will have the same probability, or in other words, this will produce a uniform distribution. As a result, can effectively be removed from our objective function, so our new objective will give the exact same results as our old one. We’ll talk about ways to potentially address this later on in the notebooks as an extension, but for the moment, the purpose of adding this is to show how easily the equation can be modified to be stochastic and to get you used to the idea. So now that we have the new objective function, we can let the model go ahead and use Gurobi to solve it for us!

However, we first need to read in the data and prepare it.

warehouses = pd.read_csv(path + "data/warehouses.csv")
warehouses
Type Lat Long Buckets
0 Warehouses -17.823700 48.426300 26
1 Warehouses -20.516700 47.250000 41
2 Warehouses -18.908500 47.537500 9046
3 Warehouses -17.384300 49.409800 2762
4 Warehouses -16.916700 49.900000 1682
5 Warehouses -15.430900 49.758300 1870
6 Warehouses -16.170200 49.774100 0
7 Warehouses -16.926100 49.587100 375
8 Warehouses -25.176133 46.089378 2322
9 Warehouses -25.031600 46.990000 0
10 Warehouses -24.206400 45.865800 610
11 Warehouses -23.350000 43.666700 1027
12 Warehouses -23.452200 45.078100 0
13 Warehouses -23.426050 47.059080 1870
14 Warehouses -22.816700 47.816700 3331
15 Warehouses -23.350500 47.608800 0
16 Warehouses -18.149900 49.402300 1580
17 Warehouses -16.950400 46.828100 0
18 Warehouses -15.716700 46.316700 150
19 Warehouses -13.680400 48.455500 375
20 Warehouses -18.960800 46.903600 3
21 Warehouses -18.064600 44.029500 2460
22 Warehouses -20.284700 44.317500 736
23 Warehouses -14.883300 50.283300 5700
24 Warehouses -14.266700 50.166700 0
25 Warehouses -14.876200 47.983500 610
26 Warehouses -22.150000 48.000000 4235
distanceMatrix_scenarios = pd.read_csv(path + "data/distanceMatrix_scenarios.csv")
distanceMatrix_scenarios
scenario warehouse drivingTime_hrs
0 0 0 21.00
1 0 1 30.00
2 0 2 25.00
3 0 3 32.00
4 0 4 35.75
... ... ... ...
457 21 16 26.00
458 21 17 19.00
459 21 18 48.00
460 21 19 33.00
461 21 20 15.00

462 rows × 3 columns

disasters = pd.read_csv(path + "data/disasters.csv")
# one bucket is needed for 2.5 people on average
total_buckets = warehouses["Buckets"].sum()
disasters["demand"] = (
    (disasters["People Impacted"] / 2.5).round().clip(upper=total_buckets).astype(int)
)
demand = disasters["demand"].values
probs = [1 / len(demand)] * len(demand)

disasters
Type Lat Long People Impacted demand
0 Storm -12.266700 49.283300 118000 40811
1 Storm -14.266700 50.166700 100215 40086
2 Epidemic -14.876200 47.983500 21976 8790
3 Epidemic -15.716700 46.316700 15172 6069
4 Flood -16.950400 46.828100 20000 8000
5 Flood -17.384300 49.409800 28223 11289
6 Storm -17.823700 48.426300 84309 33724
7 Storm -18.064600 44.029500 526200 40811
8 Storm -18.149900 49.402300 600000 40811
9 Epidemic -18.769800 46.050000 3055 1222
10 Storm -18.908500 47.537500 55345 22138
11 Storm -18.960800 46.903600 1900 760
12 Flood -19.866700 47.033300 23369 9348
13 Storm -20.284700 44.317500 13561 5424
14 Storm -21.236700 48.346100 500 200
15 Storm -21.447000 47.087200 736938 40811
16 Storm -22.150000 48.000000 162086 40811
17 Storm -22.396100 46.121700 369272 40811
18 Storm -22.816700 47.816700 100000 40000
19 Storm -23.350000 43.666700 70000 28000
20 Storm -25.031600 46.990000 540043 40811
21 Storm -25.176133 46.089378 250000 40811
m = len(demand)  # number of disasters/scenarios
t = distanceMatrix_scenarios.set_index(["scenario", "warehouse"]).squeeze().to_dict()
b = warehouses["Buckets"][warehouses["Buckets"] > 0].reset_index(drop=True)
n = len(b)  # number of warehouses

model_2 = gp.Model("multiple_disasters")
model_2.Params.OutputFlag = 0

# Amount to take per warehouse
y = model_2.addVars(t.keys(), name="Warehouse_Allocation")

# Add constraints to meet demand for each disaster scenario k
for k in range(m):

    # Demand constraints
    model_2.addConstr(y.sum(k, "*") == demand[k], name=f"Meet_Demand_K:{k}")

    # Warehouse constraints
    for i in range(n):
        model_2.addConstr(y[k, i] <= b[i], name=f"warehouse_endowment_K:{k}_I:{i}")

# Objective function to minimize the weighted driving time over all scenarios
objective = gp.quicksum(
    probs[k] * gp.quicksum(t[k, i] * y[k, i] for i in range(n)) for k in range(m)
)

# Optimize model
model_2.setObjective(objective, GRB.MINIMIZE)
model_2.optimize()

# Store results in dataframe a
a = pd.DataFrame(
    {
        "Warehouse allocation": model_2.getAttr("VarName", y),
        "Buckets": model_2.getAttr("X", y),
    }
)
a[:15]
Warehouse allocation Buckets
0 0 Warehouse_Allocation[0,0] 26.0
1 Warehouse_Allocation[0,1] 41.0
2 Warehouse_Allocation[0,2] 9046.0
3 Warehouse_Allocation[0,3] 2762.0
4 Warehouse_Allocation[0,4] 1682.0
5 Warehouse_Allocation[0,5] 1870.0
6 Warehouse_Allocation[0,6] 375.0
7 Warehouse_Allocation[0,7] 2322.0
8 Warehouse_Allocation[0,8] 610.0
9 Warehouse_Allocation[0,9] 1027.0
10 Warehouse_Allocation[0,10] 1870.0
11 Warehouse_Allocation[0,11] 3331.0
12 Warehouse_Allocation[0,12] 1580.0
13 Warehouse_Allocation[0,13] 150.0
14 Warehouse_Allocation[0,14] 375.0

Data Science Extension#

We’ve seen how the objective function uses the uniform distribution to solve the problem. But what if, knowing that climate change is increasingly energizing large storms, we decide that past hurricane impacts aren’t representative of what’s to come? In this section, we want to prompt you to come up with predictive elements to improve our models. Feel free to use some of the ideas below or go in an entirely new direction!

Case Study Focus: Coastal Eastern African Nations#

Using the disaster impact data for coastal Eastern African nations, can you develop a model to predict how these impacts might escalate for Madagascar in the coming years? Consider not only the historical data, but also factors such as changes in sea surface temperatures, shifting storm tracks, population growth along vulnerable coastlines, and evolving infrastructure resilience. Further, can you integrate this predictive model into an optimization framework to better allocate resources for disaster preparedness and response?

Potential Approaches to Explore:

  • Comparing Time Series Models: Traditional statistical time series models like ARIMAX (AutoRegressive Integrated Moving Average with Explanatory Variables) are commonly used to predict future values based on past data. How do these models compare with more advanced Recurrent Neural Network (RNN)-based approaches. like Long Short-Term Memory (LSTM) networks or Gated Recurrent Units (GRUs), in,capturing long-term dependencies, especially under non-stationary conditions induced by climate change?

  • Incorporating Geospatial Data: Geospatial features, such as latitude, longitude, elevation, and proximity to bodies of water, can play a crucial role in predicting the impact of tropical storms. Can we encode geospatial information using techniques like convolutional neural networks (CNNs) for spatial feature extraction, or leverage more specialized models such as Geographical Weighted Regression (GWR) or Graph Neural Networks (GNNs) to account for spatial dependencies?

  • Incorporating Climate Change Projections: Beyond just historical data, consider how future climate projections can be integrated into the model. Can we use downscaled climate model outputs or ensemble approaches to account for different climate scenarios? How would these scenarios affect the frequency and intensity of tropical storms affecting coastal Eastern African nations?

  • Feature Engineering with Climate Indicators: Introduce climate change indicators as predictive features. For example, how do trends in sea surface temperatures (SSTs), El Niño-Southern Oscillation (ENSO) phases, or the Atlantic Multi-decadal Oscillation (AMO) correlate with storm intensification? Would incorporating these indicators as additional time series variables enhance predictive accuracy?

  • Hybrid and Ensemble Models: Can we leverage hybrid models that combine both statistical and deep learning approaches, or use ensemble methods that aggregate predictions from multiple models? For example, combining ARIMAX for short-term predictions with LSTM for capturing long-term trends may provide a more comprehensive forecasting tool.

  • Optimization Integration: Once a reliable predictive model is established, how can it be integrated into an optimization framework for resource allocation? For example, can we build an optimization model that minimizes both the cost of disaster preparedness and the potential loss from future storm impacts?

  • Model Explainability and Decision-Making: How can we ensure that the model is interpretable for decision-makers? Consider the use of techniques like SHAP (SHapley Additive exPlanations) values or LIME (Local Interpretable Model-agnostic Explanations) to explain which factors contribute most to the model’s predictions, helping policymakers make informed decisions.

By combining predictive analytics with optimization, we can not only forecast future disaster impacts, but also develop actionable strategies for minimizing those impacts. The goal is to make our models both more accurate and more useful in real-world applications, driving better outcomes for communities at risk.