Intro to the ESUPS Optimization Model

Intro to the ESUPS Optimization Model#

We’ll start with a simplified version of the model and then build from there.

Consider the scenario where we’re expecting to have a specific disaster and want to transport the supplies from warehouses with predefined stock as quickly as possible to the disaster. Perhaps it’s hurricane season, or a volcano has been showing increasing activity.

First, we list the input data we have:

\(I\) is the set of warehouses
\(\tau_i\) is the driving time to ship a single item from warehouse \(i \in I\) to the disaster
\(b_i\) is the starting inventory (buckets) at each warehouse \(i \in I\)
\(d\) is the demand we need to satisfy, i.e., the number of items we need to ship to the disaster

Then, let’s define our decision variables, i.e. the things we can vary.

\(y_i\) is the amount of supplies to send from warehouse \(i \in I\) to our disaster

Now that we have described what we can change with our variables, we can figure out how to represent the objective function!

\[ \begin{aligned} \min \sum_{i \in I} \tau_{i} \cdot y_i \end{aligned} \]

The constraints are:

\[\begin{split} \begin{aligned} \text{s.t.} & \sum_{i \in I} & y_{i} & = d & & \quad \text{(total supplies sent must meet demand)}\\ & & y_i & \leq b_i & \forall i \in I & \quad \text{(you can't send more than a warehouse has)}\\ &\text{} & y_{i} & \geq 0 &\forall i \in I & \quad \text{(you can't send negative supplies)} \end{aligned} \end{split}\]

Code the Basic Model#

So now let’s code and solve this model.

import gurobipy as gp
from gurobipy import GRB
import pandas as pd

path = "https://raw.githubusercontent.com/Gurobi/gurobi-casestudy-esups/refs/heads/main/docs/"
simple_Allocation = pd.read_csv(path + "data/simple_Allocation.csv")

# Prep data
BucketsNeeded = 13561
n = len(simple_Allocation)
t = simple_Allocation.drivingTime_hrs
b = simple_Allocation.Buckets

# Create model
model = gp.Model("simple_Allocation")
model.Params.OutputFlag = 0

# Add decision variables
y = model.addVars(n, name="Warehouse_Allocation")

# Add constraint to meet demand
model.addConstr(
    gp.quicksum(y[i] for i in range(n)) == BucketsNeeded, name="Meet_Demand"
)

# Add in warehouse_constraints
for i in range(n):
    model.addConstr(y[i] <= b[i], name=f"warehouse_endowment_{i}")

# Note we don't have a constraint for y >= 0 since it's assumed in the variable definition
# Add objective
objective = gp.quicksum(t[i] * y[i] for i in range(n))
model.setObjective(objective, GRB.MINIMIZE)

# Fire up the solver!
model.optimize()

Restricted license - for non-production use only - expires 2026-11-23

Now let’s analyze the results!

simple_Allocation_sol = simple_Allocation.copy()
simple_Allocation_sol["y"] = model.getAttr("X", y)
simple_Allocation_sol.sort_values("drivingTime_hrs")

	depotCity	drivingTime_hrs	Buckets	y
1	Ambatondrazaka	0.00	26	26.0
4	Antananarivo Renivohitra	6.00	9046	9046.0
11	Miarinarivo	7.00	3	3.0
14	Toamasina I	8.00	1580	1580.0
5	Antsohihy	10.00	610	610.0
2	Ambositra	11.00	41	0.0
7	Fenerive Est	11.00	6689	2296.0
0	Ambanja	14.00	375	0.0
9	Maintirano	14.25	2460	0.0
8	Mahajanga I	16.00	150	0.0
12	Morondava	16.00	736	0.0
10	Manakara	17.00	4235	0.0
6	Farafangana	19.00	5201	0.0
15	Toliara	22.00	1637	0.0
13	Sambava	23.00	5700	0.0
3	Ambovombe	26.00	2322	0.0

Now, you might think that if we want to position supplies to respond to a known disaster, we should just put them as close as possible. It’s an intuitive solution that can be solved with a simple greedy algorithm. But of course, life is never that simple.

Now that we’ve got the initial problem outlined, let’s start making it more realistic with two additions:

Instead of preparing for only one disaster, let’s prepare for all the disasters that might occur.
Instead of being an omniscient observer, let’s say we aren’t sure where the next disaster will be.

Now, these assumptions might be intimidating at first. How do you rigorously model information you don’t know? Isn’t this the realm of predictive models?

This reaction is totally normal and indeed, we often use predictive modeling when our information is incomplete. However, remember that this problem isn’t easily suited to ML techniques due to the low availability of data, small feature space, and large solution space. So how do we approach this with optimization? Well, now that we know how supplies will be dispensed for a given warehouse allocation, we can leverage the small set of historical disasters we do have to ask which allocations would have saved/minimized the time taken to respond to all of the disasters. Then, our optimization techniques will give us a small number that could feasibly satisfy this (where our constraints meet, see the introduction for more on this) and we can simply iterate over them.

But how do we go about it in practice?

Include All Disasters#

Before we begin, let’s explicitly define our new problem with the additional requirements outlined in the previous section so we’re all on the same page. Our first step is to add in the fact that there are more disasters than just one. We can do that by including a variable index to denote which disaster we’re talking about.

Let \(K\) be the set of disasters, and \(k\) be the index for the disaster scenario at hand, i.e., a storm, earthquake, or epidemic.

In scenario \(k\), the time for a warehouse \(i\) to send \(y_i\) items is \(\tau_{ik} \cdot y^k_i\)

Repeating for all warehouses gives us \( \sum_i \tau_{ik} \cdot y^k_i \)

Repeating for all disasters gives us \( \sum_k \sum_i \tau_{ik} \cdot y^k_i \)

The question that ESUPS, and subsequently this model, set out to answer is: Where should disaster relief supplies be stored to ensure they get to people in need as quickly and cheaply as possible in the event of a disaster?

By comparing the overall travel time for every single disaster that we have data on, we can make determinations on whether storing different quantities of materials in different warehouses would be better or worse given our historical data.

Before we get to solving, let’s add in one more factor. Different disasters occur at different rates. A landlocked nation may be less likely to experience a disaster caused by a tropical storm than an earthquake, so shouldn’t we weigh the response time to earthquakes more than that of a storm?

Accounting for Randomness#

One of the most difficult tasks that we can encounter in data science problems is randomness, or as it’s often called, “stochastic” elements. This can be seen in all kinds of ways, but in our case study, we’re going to look at how we account for not knowing which disaster will hit next (perhaps even several years in the future). The way we do this is to create a stochastic model, which may initially seem challenging—but in discrete events like this, it’s super easy.

If you’re familiar with expected value, this will quickly make sense—but no need for any prior experience! See the sub-notebook for a quick intro. The idea is that we weigh the outcome of the event (i.e., total travel time in this case) by the probability it occurs. For example, if an earthquake is 3 times as likely as a flood, we would rewrite the equation as:

\[0.75 \cdot\text{(travel time earthquake)} + 0.25 \cdot\text{(travel time flood)} \]

Updated Model#

Now that we’ve had some background on expected value, let’s return to our model. Remember, our goal is to account for the inherent randomness associated with different disasters occurring.

Let \(P^k\) be the probability of disaster \(k\) and \(t^k\) be the total travel time involved in disaster \(k\) from the previous sections. Using our definition of expected value, this gives us

\[ \sum_k P^k \cdot t^k\]

Now, you might notice that we have already written an equation for the total travel time for disaster. Substituting this in, we get

\[\sum_k P^k \sum_i \tau_{ik} \cdot y^k_i\]

In our case, turning this problem into a stochastic optimization problem that can account for probability only involves one more value per disaster. Our final task right now is to minimize the time required to get supplies to the disaster sites, given how likely each disaster is to occur:

\[ \min_y \sum_k P^k \sum_i \tau_{ik} \cdot y^k_i \]

And we can see the constraints are almost unchanged:

\[\begin{split} \begin{aligned} \text{s.t.} \quad \sum_{i} y^k_{i} & = d^k & \forall k & \quad \text{(total supplies sent must meet demand)}\\ y^k_i & \leq b_i & \forall i \in I,~ \forall k \in K & \quad \text{(you can't send more than a warehouse has)}\\ y^k_{i} & \geq 0 & \forall i \in I,~ \forall k \in K & \quad \text{(you can't send negative supplies)}\\ \end{aligned} \end{split}\]

Note that we made an important assumption when defining the warehouse capacity constraints: We assumed that not all disasters happen at the same time, and so we have the full warehouse capacity available for each disaster.

All the probabilities should sum up to one, i.e., . But how do we calculate the probabilities? Well, we have good long-term data on what disasters have affected which countries. For now, we can go through that data and calculate the probability of disaster by counting how many times it has occurred and dividing by the number of total disasters. You may have noticed that every single disaster will have the same probability, or in other words, this will produce a uniform distribution. As a result, can effectively be removed from our objective function, so our new objective will give the exact same results as our old one. We’ll talk about ways to potentially address this later on in the notebooks as an extension, but for the moment, the purpose of adding this is to show how easily the equation can be modified to be stochastic and to get you used to the idea. So now that we have the new objective function, we can let the model go ahead and use Gurobi to solve it for us!

However, we first need to read in the data and prepare it.

warehouses = pd.read_csv(path + "data/warehouses.csv")
warehouses

	Type	Lat	Long	Buckets
0	Warehouses	-17.823700	48.426300	26
1	Warehouses	-20.516700	47.250000	41
2	Warehouses	-18.908500	47.537500	9046
3	Warehouses	-17.384300	49.409800	2762
4	Warehouses	-16.916700	49.900000	1682
5	Warehouses	-15.430900	49.758300	1870
6	Warehouses	-16.170200	49.774100	0
7	Warehouses	-16.926100	49.587100	375
8	Warehouses	-25.176133	46.089378	2322
9	Warehouses	-25.031600	46.990000	0
10	Warehouses	-24.206400	45.865800	610
11	Warehouses	-23.350000	43.666700	1027
12	Warehouses	-23.452200	45.078100	0
13	Warehouses	-23.426050	47.059080	1870
14	Warehouses	-22.816700	47.816700	3331
15	Warehouses	-23.350500	47.608800	0
16	Warehouses	-18.149900	49.402300	1580
17	Warehouses	-16.950400	46.828100	0
18	Warehouses	-15.716700	46.316700	150
19	Warehouses	-13.680400	48.455500	375
20	Warehouses	-18.960800	46.903600	3
21	Warehouses	-18.064600	44.029500	2460
22	Warehouses	-20.284700	44.317500	736
23	Warehouses	-14.883300	50.283300	5700
24	Warehouses	-14.266700	50.166700	0
25	Warehouses	-14.876200	47.983500	610
26	Warehouses	-22.150000	48.000000	4235

distanceMatrix_scenarios = pd.read_csv(path + "data/distanceMatrix_scenarios.csv")
distanceMatrix_scenarios

	scenario	warehouse	drivingTime_hrs
0	0	0	21.00
1	0	1	30.00
2	0	2	25.00
3	0	3	32.00
4	0	4	35.75
...	...	...	...
457	21	16	26.00
458	21	17	19.00
459	21	18	48.00
460	21	19	33.00
461	21	20	15.00

462 rows × 3 columns

disasters = pd.read_csv(path + "data/disasters.csv")
# one bucket is needed for 2.5 people on average
total_buckets = warehouses["Buckets"].sum()
disasters["demand"] = (
    (disasters["People Impacted"] / 2.5).round().clip(upper=total_buckets).astype(int)
)
demand = disasters["demand"].values
probs = [1 / len(demand)] * len(demand)

disasters

	Type	Lat	Long	People Impacted	demand
0	Storm	-12.266700	49.283300	118000	40811
1	Storm	-14.266700	50.166700	100215	40086
2	Epidemic	-14.876200	47.983500	21976	8790
3	Epidemic	-15.716700	46.316700	15172	6069
4	Flood	-16.950400	46.828100	20000	8000
5	Flood	-17.384300	49.409800	28223	11289
6	Storm	-17.823700	48.426300	84309	33724
7	Storm	-18.064600	44.029500	526200	40811
8	Storm	-18.149900	49.402300	600000	40811
9	Epidemic	-18.769800	46.050000	3055	1222
10	Storm	-18.908500	47.537500	55345	22138
11	Storm	-18.960800	46.903600	1900	760
12	Flood	-19.866700	47.033300	23369	9348
13	Storm	-20.284700	44.317500	13561	5424
14	Storm	-21.236700	48.346100	500	200
15	Storm	-21.447000	47.087200	736938	40811
16	Storm	-22.150000	48.000000	162086	40811
17	Storm	-22.396100	46.121700	369272	40811
18	Storm	-22.816700	47.816700	100000	40000
19	Storm	-23.350000	43.666700	70000	28000
20	Storm	-25.031600	46.990000	540043	40811
21	Storm	-25.176133	46.089378	250000	40811

m = len(demand)  # number of disasters/scenarios
t = distanceMatrix_scenarios.set_index(["scenario", "warehouse"]).squeeze().to_dict()
b = warehouses["Buckets"][warehouses["Buckets"] > 0].reset_index(drop=True)
n = len(b)  # number of warehouses

model_2 = gp.Model("multiple_disasters")
model_2.Params.OutputFlag = 0

# Amount to take per warehouse
y = model_2.addVars(t.keys(), name="Warehouse_Allocation")

# Add constraints to meet demand for each disaster scenario k
for k in range(m):

    # Demand constraints
    model_2.addConstr(y.sum(k, "*") == demand[k], name=f"Meet_Demand_K:{k}")

    # Warehouse constraints
    for i in range(n):
        model_2.addConstr(y[k, i] <= b[i], name=f"warehouse_endowment_K:{k}_I:{i}")

# Objective function to minimize the weighted driving time over all scenarios
objective = gp.quicksum(
    probs[k] * gp.quicksum(t[k, i] * y[k, i] for i in range(n)) for k in range(m)
)

# Optimize model
model_2.setObjective(objective, GRB.MINIMIZE)
model_2.optimize()

# Store results in dataframe a
a = pd.DataFrame(
    {
        "Warehouse allocation": model_2.getAttr("VarName", y),
        "Buckets": model_2.getAttr("X", y),
    }
)

a[:15]

		Warehouse allocation	Buckets
0	0	Warehouse_Allocation[0,0]	26.0
	1	Warehouse_Allocation[0,1]	41.0
	2	Warehouse_Allocation[0,2]	9046.0
	3	Warehouse_Allocation[0,3]	2762.0
	4	Warehouse_Allocation[0,4]	1682.0
	5	Warehouse_Allocation[0,5]	1870.0
	6	Warehouse_Allocation[0,6]	375.0
	7	Warehouse_Allocation[0,7]	2322.0
	8	Warehouse_Allocation[0,8]	610.0
	9	Warehouse_Allocation[0,9]	1027.0
	10	Warehouse_Allocation[0,10]	1870.0
	11	Warehouse_Allocation[0,11]	3331.0
	12	Warehouse_Allocation[0,12]	1580.0
	13	Warehouse_Allocation[0,13]	150.0
	14	Warehouse_Allocation[0,14]	375.0

Data Science Extension#

We’ve seen how the objective function uses the uniform distribution to solve the problem. But what if, knowing that climate change is increasingly energizing large storms, we decide that past hurricane impacts aren’t representative of what’s to come? In this section, we want to prompt you to come up with predictive elements to improve our models. Feel free to use some of the ideas below or go in an entirely new direction!

Case Study Focus: Coastal Eastern African Nations#

Using the disaster impact data for coastal Eastern African nations, can you develop a model to predict how these impacts might escalate for Madagascar in the coming years? Consider not only the historical data, but also factors such as changes in sea surface temperatures, shifting storm tracks, population growth along vulnerable coastlines, and evolving infrastructure resilience. Further, can you integrate this predictive model into an optimization framework to better allocate resources for disaster preparedness and response?

Potential Approaches to Explore:

Comparing Time Series Models: Traditional statistical time series models like ARIMAX (AutoRegressive Integrated Moving Average with Explanatory Variables) are commonly used to predict future values based on past data. How do these models compare with more advanced Recurrent Neural Network (RNN)-based approaches. like Long Short-Term Memory (LSTM) networks or Gated Recurrent Units (GRUs), in,capturing long-term dependencies, especially under non-stationary conditions induced by climate change?
Incorporating Geospatial Data: Geospatial features, such as latitude, longitude, elevation, and proximity to bodies of water, can play a crucial role in predicting the impact of tropical storms. Can we encode geospatial information using techniques like convolutional neural networks (CNNs) for spatial feature extraction, or leverage more specialized models such as Geographical Weighted Regression (GWR) or Graph Neural Networks (GNNs) to account for spatial dependencies?
Incorporating Climate Change Projections: Beyond just historical data, consider how future climate projections can be integrated into the model. Can we use downscaled climate model outputs or ensemble approaches to account for different climate scenarios? How would these scenarios affect the frequency and intensity of tropical storms affecting coastal Eastern African nations?
Feature Engineering with Climate Indicators: Introduce climate change indicators as predictive features. For example, how do trends in sea surface temperatures (SSTs), El Niño-Southern Oscillation (ENSO) phases, or the Atlantic Multi-decadal Oscillation (AMO) correlate with storm intensification? Would incorporating these indicators as additional time series variables enhance predictive accuracy?
Hybrid and Ensemble Models: Can we leverage hybrid models that combine both statistical and deep learning approaches, or use ensemble methods that aggregate predictions from multiple models? For example, combining ARIMAX for short-term predictions with LSTM for capturing long-term trends may provide a more comprehensive forecasting tool.
Optimization Integration: Once a reliable predictive model is established, how can it be integrated into an optimization framework for resource allocation? For example, can we build an optimization model that minimizes both the cost of disaster preparedness and the potential loss from future storm impacts?
Model Explainability and Decision-Making: How can we ensure that the model is interpretable for decision-makers? Consider the use of techniques like SHAP (SHapley Additive exPlanations) values or LIME (Local Interpretable Model-agnostic Explanations) to explain which factors contribute most to the model’s predictions, helping policymakers make informed decisions.

By combining predictive analytics with optimization, we can not only forecast future disaster impacts, but also develop actionable strategies for minimizing those impacts. The goal is to make our models both more accurate and more useful in real-world applications, driving better outcomes for communities at risk.