Intro to the ESUPS Optimization Model#
We’ll start with a simplified version of the model and then build from there.
Consider the scenario where we’re expecting to have a specific disaster and want to transport the supplies from warehouses with predefined stock as quickly as possible to the disaster. Perhaps it’s hurricane season, or a volcano has been showing increasing activity.
First, we list the input data we have:
\(I\) is the set of warehouses
\(\tau_i\) is the driving time to ship a single item from warehouse \(i \in I\) to the disaster
\(b_i\) is the starting inventory (buckets) at each warehouse \(i \in I\)
\(d\) is the demand we need to satisfy, i.e., the number of items we need to ship to the disaster
Then, let’s define our decision variables, i.e. the things we can vary.
\(y_i\) is the amount of supplies to send from warehouse \(i \in I\) to our disaster
Now that we have described what we can change with our variables, we can figure out how to represent the objective function!
The constraints are:
Code the Basic Model#
So now let’s code and solve this model.
import gurobipy as gp
from gurobipy import GRB
import pandas as pd
path = "https://raw.githubusercontent.com/Gurobi/gurobi-casestudy-esups/refs/heads/main/docs/"
simple_Allocation = pd.read_csv(path + "data/simple_Allocation.csv")
# Prep data
BucketsNeeded = 13561
n = len(simple_Allocation)
t = simple_Allocation.drivingTime_hrs
b = simple_Allocation.Buckets
# Create model
model = gp.Model("simple_Allocation")
model.Params.OutputFlag = 0
# Add decision variables
y = model.addVars(n, name="Warehouse_Allocation")
# Add constraint to meet demand
model.addConstr(
gp.quicksum(y[i] for i in range(n)) == BucketsNeeded, name="Meet_Demand"
)
# Add in warehouse_constraints
for i in range(n):
model.addConstr(y[i] <= b[i], name=f"warehouse_endowment_{i}")
# Note we don't have a constraint for y >= 0 since it's assumed in the variable definition
# Add objective
objective = gp.quicksum(t[i] * y[i] for i in range(n))
model.setObjective(objective, GRB.MINIMIZE)
# Fire up the solver!
model.optimize()
Restricted license - for non-production use only - expires 2026-11-23
Now let’s analyze the results!
simple_Allocation_sol = simple_Allocation.copy()
simple_Allocation_sol["y"] = model.getAttr("X", y)
simple_Allocation_sol.sort_values("drivingTime_hrs")
| depotCity | drivingTime_hrs | Buckets | y | |
|---|---|---|---|---|
| 1 | Ambatondrazaka | 0.00 | 26 | 26.0 |
| 4 | Antananarivo Renivohitra | 6.00 | 9046 | 9046.0 |
| 11 | Miarinarivo | 7.00 | 3 | 3.0 |
| 14 | Toamasina I | 8.00 | 1580 | 1580.0 |
| 5 | Antsohihy | 10.00 | 610 | 610.0 |
| 2 | Ambositra | 11.00 | 41 | 0.0 |
| 7 | Fenerive Est | 11.00 | 6689 | 2296.0 |
| 0 | Ambanja | 14.00 | 375 | 0.0 |
| 9 | Maintirano | 14.25 | 2460 | 0.0 |
| 8 | Mahajanga I | 16.00 | 150 | 0.0 |
| 12 | Morondava | 16.00 | 736 | 0.0 |
| 10 | Manakara | 17.00 | 4235 | 0.0 |
| 6 | Farafangana | 19.00 | 5201 | 0.0 |
| 15 | Toliara | 22.00 | 1637 | 0.0 |
| 13 | Sambava | 23.00 | 5700 | 0.0 |
| 3 | Ambovombe | 26.00 | 2322 | 0.0 |
Now, you might think that if we want to position supplies to respond to a known disaster, we should just put them as close as possible. It’s an intuitive solution that can be solved with a simple greedy algorithm. But of course, life is never that simple.
Now that we’ve got the initial problem outlined, let’s start making it more realistic with two additions:
Instead of preparing for only one disaster, let’s prepare for all the disasters that might occur.
Instead of being an omniscient observer, let’s say we aren’t sure where the next disaster will be.
Now, these assumptions might be intimidating at first. How do you rigorously model information you don’t know? Isn’t this the realm of predictive models?
This reaction is totally normal and indeed, we often use predictive modeling when our information is incomplete. However, remember that this problem isn’t easily suited to ML techniques due to the low availability of data, small feature space, and large solution space. So how do we approach this with optimization? Well, now that we know how supplies will be dispensed for a given warehouse allocation, we can leverage the small set of historical disasters we do have to ask which allocations would have saved/minimized the time taken to respond to all of the disasters. Then, our optimization techniques will give us a small number that could feasibly satisfy this (where our constraints meet, see the introduction for more on this) and we can simply iterate over them.
But how do we go about it in practice?
Include All Disasters#
Before we begin, let’s explicitly define our new problem with the additional requirements outlined in the previous section so we’re all on the same page. Our first step is to add in the fact that there are more disasters than just one. We can do that by including a variable index to denote which disaster we’re talking about.
Let \(K\) be the set of disasters, and \(k\) be the index for the disaster scenario at hand, i.e., a storm, earthquake, or epidemic.
In scenario \(k\), the time for a warehouse \(i\) to send \(y_i\) items is \(\tau_{ik} \cdot y^k_i\)
Repeating for all warehouses gives us \( \sum_i \tau_{ik} \cdot y^k_i \)
Repeating for all disasters gives us \( \sum_k \sum_i \tau_{ik} \cdot y^k_i \)
The question that ESUPS, and subsequently this model, set out to answer is: Where should disaster relief supplies be stored to ensure they get to people in need as quickly and cheaply as possible in the event of a disaster?
By comparing the overall travel time for every single disaster that we have data on, we can make determinations on whether storing different quantities of materials in different warehouses would be better or worse given our historical data.
Before we get to solving, let’s add in one more factor. Different disasters occur at different rates. A landlocked nation may be less likely to experience a disaster caused by a tropical storm than an earthquake, so shouldn’t we weigh the response time to earthquakes more than that of a storm?
Accounting for Randomness#
One of the most difficult tasks that we can encounter in data science problems is randomness, or as it’s often called, “stochastic” elements. This can be seen in all kinds of ways, but in our case study, we’re going to look at how we account for not knowing which disaster will hit next (perhaps even several years in the future). The way we do this is to create a stochastic model, which may initially seem challenging—but in discrete events like this, it’s super easy.
If you’re familiar with expected value, this will quickly make sense—but no need for any prior experience! See the sub-notebook for a quick intro. The idea is that we weigh the outcome of the event (i.e., total travel time in this case) by the probability it occurs. For example, if an earthquake is 3 times as likely as a flood, we would rewrite the equation as:
Updated Model#
Now that we’ve had some background on expected value, let’s return to our model. Remember, our goal is to account for the inherent randomness associated with different disasters occurring.
Let \(P^k\) be the probability of disaster \(k\) and \(t^k\) be the total travel time involved in disaster \(k\) from the previous sections. Using our definition of expected value, this gives us
Now, you might notice that we have already written an equation for the total travel time for disaster. Substituting this in, we get
In our case, turning this problem into a stochastic optimization problem that can account for probability only involves one more value per disaster. Our final task right now is to minimize the time required to get supplies to the disaster sites, given how likely each disaster is to occur:
And we can see the constraints are almost unchanged:
Note that we made an important assumption when defining the warehouse capacity constraints: We assumed that not all disasters happen at the same time, and so we have the full warehouse capacity available for each disaster.
All the probabilities should sum up to one, i.e., . But how do we calculate the probabilities? Well, we have good long-term data on what disasters have affected which countries. For now, we can go through that data and calculate the probability of disaster by counting how many times it has occurred and dividing by the number of total disasters. You may have noticed that every single disaster will have the same probability, or in other words, this will produce a uniform distribution. As a result, can effectively be removed from our objective function, so our new objective will give the exact same results as our old one. We’ll talk about ways to potentially address this later on in the notebooks as an extension, but for the moment, the purpose of adding this is to show how easily the equation can be modified to be stochastic and to get you used to the idea. So now that we have the new objective function, we can let the model go ahead and use Gurobi to solve it for us!
However, we first need to read in the data and prepare it.
warehouses = pd.read_csv(path + "data/warehouses.csv")
warehouses
| Type | Lat | Long | Buckets | |
|---|---|---|---|---|
| 0 | Warehouses | -17.823700 | 48.426300 | 26 |
| 1 | Warehouses | -20.516700 | 47.250000 | 41 |
| 2 | Warehouses | -18.908500 | 47.537500 | 9046 |
| 3 | Warehouses | -17.384300 | 49.409800 | 2762 |
| 4 | Warehouses | -16.916700 | 49.900000 | 1682 |
| 5 | Warehouses | -15.430900 | 49.758300 | 1870 |
| 6 | Warehouses | -16.170200 | 49.774100 | 0 |
| 7 | Warehouses | -16.926100 | 49.587100 | 375 |
| 8 | Warehouses | -25.176133 | 46.089378 | 2322 |
| 9 | Warehouses | -25.031600 | 46.990000 | 0 |
| 10 | Warehouses | -24.206400 | 45.865800 | 610 |
| 11 | Warehouses | -23.350000 | 43.666700 | 1027 |
| 12 | Warehouses | -23.452200 | 45.078100 | 0 |
| 13 | Warehouses | -23.426050 | 47.059080 | 1870 |
| 14 | Warehouses | -22.816700 | 47.816700 | 3331 |
| 15 | Warehouses | -23.350500 | 47.608800 | 0 |
| 16 | Warehouses | -18.149900 | 49.402300 | 1580 |
| 17 | Warehouses | -16.950400 | 46.828100 | 0 |
| 18 | Warehouses | -15.716700 | 46.316700 | 150 |
| 19 | Warehouses | -13.680400 | 48.455500 | 375 |
| 20 | Warehouses | -18.960800 | 46.903600 | 3 |
| 21 | Warehouses | -18.064600 | 44.029500 | 2460 |
| 22 | Warehouses | -20.284700 | 44.317500 | 736 |
| 23 | Warehouses | -14.883300 | 50.283300 | 5700 |
| 24 | Warehouses | -14.266700 | 50.166700 | 0 |
| 25 | Warehouses | -14.876200 | 47.983500 | 610 |
| 26 | Warehouses | -22.150000 | 48.000000 | 4235 |
distanceMatrix_scenarios = pd.read_csv(path + "data/distanceMatrix_scenarios.csv")
distanceMatrix_scenarios
| scenario | warehouse | drivingTime_hrs | |
|---|---|---|---|
| 0 | 0 | 0 | 21.00 |
| 1 | 0 | 1 | 30.00 |
| 2 | 0 | 2 | 25.00 |
| 3 | 0 | 3 | 32.00 |
| 4 | 0 | 4 | 35.75 |
| ... | ... | ... | ... |
| 457 | 21 | 16 | 26.00 |
| 458 | 21 | 17 | 19.00 |
| 459 | 21 | 18 | 48.00 |
| 460 | 21 | 19 | 33.00 |
| 461 | 21 | 20 | 15.00 |
462 rows × 3 columns
disasters = pd.read_csv(path + "data/disasters.csv")
# one bucket is needed for 2.5 people on average
total_buckets = warehouses["Buckets"].sum()
disasters["demand"] = (
(disasters["People Impacted"] / 2.5).round().clip(upper=total_buckets).astype(int)
)
demand = disasters["demand"].values
probs = [1 / len(demand)] * len(demand)
disasters
| Type | Lat | Long | People Impacted | demand | |
|---|---|---|---|---|---|
| 0 | Storm | -12.266700 | 49.283300 | 118000 | 40811 |
| 1 | Storm | -14.266700 | 50.166700 | 100215 | 40086 |
| 2 | Epidemic | -14.876200 | 47.983500 | 21976 | 8790 |
| 3 | Epidemic | -15.716700 | 46.316700 | 15172 | 6069 |
| 4 | Flood | -16.950400 | 46.828100 | 20000 | 8000 |
| 5 | Flood | -17.384300 | 49.409800 | 28223 | 11289 |
| 6 | Storm | -17.823700 | 48.426300 | 84309 | 33724 |
| 7 | Storm | -18.064600 | 44.029500 | 526200 | 40811 |
| 8 | Storm | -18.149900 | 49.402300 | 600000 | 40811 |
| 9 | Epidemic | -18.769800 | 46.050000 | 3055 | 1222 |
| 10 | Storm | -18.908500 | 47.537500 | 55345 | 22138 |
| 11 | Storm | -18.960800 | 46.903600 | 1900 | 760 |
| 12 | Flood | -19.866700 | 47.033300 | 23369 | 9348 |
| 13 | Storm | -20.284700 | 44.317500 | 13561 | 5424 |
| 14 | Storm | -21.236700 | 48.346100 | 500 | 200 |
| 15 | Storm | -21.447000 | 47.087200 | 736938 | 40811 |
| 16 | Storm | -22.150000 | 48.000000 | 162086 | 40811 |
| 17 | Storm | -22.396100 | 46.121700 | 369272 | 40811 |
| 18 | Storm | -22.816700 | 47.816700 | 100000 | 40000 |
| 19 | Storm | -23.350000 | 43.666700 | 70000 | 28000 |
| 20 | Storm | -25.031600 | 46.990000 | 540043 | 40811 |
| 21 | Storm | -25.176133 | 46.089378 | 250000 | 40811 |
m = len(demand) # number of disasters/scenarios
t = distanceMatrix_scenarios.set_index(["scenario", "warehouse"]).squeeze().to_dict()
b = warehouses["Buckets"][warehouses["Buckets"] > 0].reset_index(drop=True)
n = len(b) # number of warehouses
model_2 = gp.Model("multiple_disasters")
model_2.Params.OutputFlag = 0
# Amount to take per warehouse
y = model_2.addVars(t.keys(), name="Warehouse_Allocation")
# Add constraints to meet demand for each disaster scenario k
for k in range(m):
# Demand constraints
model_2.addConstr(y.sum(k, "*") == demand[k], name=f"Meet_Demand_K:{k}")
# Warehouse constraints
for i in range(n):
model_2.addConstr(y[k, i] <= b[i], name=f"warehouse_endowment_K:{k}_I:{i}")
# Objective function to minimize the weighted driving time over all scenarios
objective = gp.quicksum(
probs[k] * gp.quicksum(t[k, i] * y[k, i] for i in range(n)) for k in range(m)
)
# Optimize model
model_2.setObjective(objective, GRB.MINIMIZE)
model_2.optimize()
# Store results in dataframe a
a = pd.DataFrame(
{
"Warehouse allocation": model_2.getAttr("VarName", y),
"Buckets": model_2.getAttr("X", y),
}
)
a[:15]
| Warehouse allocation | Buckets | ||
|---|---|---|---|
| 0 | 0 | Warehouse_Allocation[0,0] | 26.0 |
| 1 | Warehouse_Allocation[0,1] | 41.0 | |
| 2 | Warehouse_Allocation[0,2] | 9046.0 | |
| 3 | Warehouse_Allocation[0,3] | 2762.0 | |
| 4 | Warehouse_Allocation[0,4] | 1682.0 | |
| 5 | Warehouse_Allocation[0,5] | 1870.0 | |
| 6 | Warehouse_Allocation[0,6] | 375.0 | |
| 7 | Warehouse_Allocation[0,7] | 2322.0 | |
| 8 | Warehouse_Allocation[0,8] | 610.0 | |
| 9 | Warehouse_Allocation[0,9] | 1027.0 | |
| 10 | Warehouse_Allocation[0,10] | 1870.0 | |
| 11 | Warehouse_Allocation[0,11] | 3331.0 | |
| 12 | Warehouse_Allocation[0,12] | 1580.0 | |
| 13 | Warehouse_Allocation[0,13] | 150.0 | |
| 14 | Warehouse_Allocation[0,14] | 375.0 |
Data Science Extension#
We’ve seen how the objective function uses the uniform distribution to solve the problem. But what if, knowing that climate change is increasingly energizing large storms, we decide that past hurricane impacts aren’t representative of what’s to come? In this section, we want to prompt you to come up with predictive elements to improve our models. Feel free to use some of the ideas below or go in an entirely new direction!
Case Study Focus: Coastal Eastern African Nations#
Using the disaster impact data for coastal Eastern African nations, can you develop a model to predict how these impacts might escalate for Madagascar in the coming years? Consider not only the historical data, but also factors such as changes in sea surface temperatures, shifting storm tracks, population growth along vulnerable coastlines, and evolving infrastructure resilience. Further, can you integrate this predictive model into an optimization framework to better allocate resources for disaster preparedness and response?
Potential Approaches to Explore:
Comparing Time Series Models: Traditional statistical time series models like ARIMAX (AutoRegressive Integrated Moving Average with Explanatory Variables) are commonly used to predict future values based on past data. How do these models compare with more advanced Recurrent Neural Network (RNN)-based approaches. like Long Short-Term Memory (LSTM) networks or Gated Recurrent Units (GRUs), in,capturing long-term dependencies, especially under non-stationary conditions induced by climate change?
Incorporating Geospatial Data: Geospatial features, such as latitude, longitude, elevation, and proximity to bodies of water, can play a crucial role in predicting the impact of tropical storms. Can we encode geospatial information using techniques like convolutional neural networks (CNNs) for spatial feature extraction, or leverage more specialized models such as Geographical Weighted Regression (GWR) or Graph Neural Networks (GNNs) to account for spatial dependencies?
Incorporating Climate Change Projections: Beyond just historical data, consider how future climate projections can be integrated into the model. Can we use downscaled climate model outputs or ensemble approaches to account for different climate scenarios? How would these scenarios affect the frequency and intensity of tropical storms affecting coastal Eastern African nations?
Feature Engineering with Climate Indicators: Introduce climate change indicators as predictive features. For example, how do trends in sea surface temperatures (SSTs), El Niño-Southern Oscillation (ENSO) phases, or the Atlantic Multi-decadal Oscillation (AMO) correlate with storm intensification? Would incorporating these indicators as additional time series variables enhance predictive accuracy?
Hybrid and Ensemble Models: Can we leverage hybrid models that combine both statistical and deep learning approaches, or use ensemble methods that aggregate predictions from multiple models? For example, combining ARIMAX for short-term predictions with LSTM for capturing long-term trends may provide a more comprehensive forecasting tool.
Optimization Integration: Once a reliable predictive model is established, how can it be integrated into an optimization framework for resource allocation? For example, can we build an optimization model that minimizes both the cost of disaster preparedness and the potential loss from future storm impacts?
Model Explainability and Decision-Making: How can we ensure that the model is interpretable for decision-makers? Consider the use of techniques like SHAP (SHapley Additive exPlanations) values or LIME (Local Interpretable Model-agnostic Explanations) to explain which factors contribute most to the model’s predictions, helping policymakers make informed decisions.
By combining predictive analytics with optimization, we can not only forecast future disaster impacts, but also develop actionable strategies for minimizing those impacts. The goal is to make our models both more accurate and more useful in real-world applications, driving better outcomes for communities at risk.