Airline seat allocation problem#

Attribution#

The following problem statement is adapted from an exercise and examples presented by Birge and Louveaux (2011).

  • Birge, J. R., & Louveaux, F. (2011). Introduction to stochastic programming. Springer Science & Business Media.

The adaptations include a change to parameters for consistency among reformulations of the problem, and additional treatments for sample average approximation (SAA) with chance constraints.

Problem description#

An airline is deciding how to partition a new plane for the Amsterdam-Buenos Aires route. This plane can seat 200 economy-class passengers. A section can be created for first-class seats, but each of these seats takes the space of 2 economy-class seats. A business-class section can also be created, but each of these takes the space of 1.5 economy-class seats. The profit for a first-class ticket is three times the profit of an economy ticket, while a business-class ticket has a profit of two times an economy ticketā€™s profit. Once the plane is partitioned into these seating classes, it cannot be changed.

The airlines knows that the plane will not always be full in every section. The airline has initially identified three scenarios to consider with about equal frequency:

  1. Weekday morning and evening traffic;

  2. Weekend traffic; and

  3. Weekday midday traffic.

Under Scenario 1 the airline thinks they can sell as many as 20 first-class tickets, 50 business-class tickets, and 200 economy tickets. Under Scenario 2 these figures are 10 , 24, and 175, while under Scenario 3, they are 6, 10, and 150, respectively. The following table summarizes the forecast demand for these three scenarios.

Scenario

First-class seats

Business-class seats

Economy-class seats

(1) weekday morning and evening

20

50

200

(2) weekend

10

24

175

(3) weekday midday

6

10

150

Average Scenario

12

28

175

The goal of the airline is to maximize ticket revenue. For marketing purposes, the airline will not sell more tickets than seats in each of the sections (hence no overbooking strategy). We further assume customers seeking a first-class or business-class seat will not downgrade if those seats are unavailable.

Installation and imports#

# install dependencies and select solver
%pip install -q amplpy pandas matplotlib numpy scipy

SOLVER = "highs"

from amplpy import AMPL, ampl_notebook

ampl = ampl_notebook(
    modules=["highs"],  # modules to install
    license_uuid="default",  # license to use
)  # instantiate AMPL object and register magics
Using default Community Edition License for Colab. Get yours at: https://ampl.com/ce
Licensed to AMPL Community Edition License for the AMPL Model Colaboratory (https://colab.ampl.com).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Problem data#

Pandas DataFrames and Series are used to encode problem data in the following cell, and to encode problem solutions in subsequent cells.

# scenario data
demand = pd.DataFrame(
    {
        "morning and evening": {"F": 20, "B": 50, "E": 200},
        "weekend": {"F": 10, "B": 24, "E": 175},
        "midday": {"F": 6, "B": 10, "E": 150},
    }
).T

# global revenue and seat factor data
capacity = 200
revenue_factor = pd.Series({"F": 3.0, "B": 2.0, "E": 1.0})
seat_factor = pd.Series({"F": 2.0, "B": 1.5, "E": 1.0})

Analytics#

Prior to optimization, a useful first step is to prepare an analytics function to display performance for any given allocation of seats. The first-stage decision variables are the number of seats allocated for each class \(c\in C\). We would like to provide a analysis showing the operational consequences for any proposed allocation of seats. For this purpose, we create a function seat_report() that show the tickets that can be sold in each scenario, the resulting revenue, and the unsatisfied demand (ā€˜spillageā€™).

To establish a basis for analyzing possible solutions to the airlineā€™s problem, this function first is demonstrated for the case where the airplane is configured as entirely economy-class.

# function to report analytics for any given seat allocations
def seat_report(seats, demand):
    classes = seats.index

    # report seat allocation
    equivalent_seats = pd.DataFrame(
        {
            "seat allocation": {c: seats[c] for c in classes},
            "economy equivalent seat allocation": {
                c: seats[c] * seat_factor[c] for c in classes
            },
        }
    ).T
    equivalent_seats["TOTAL"] = equivalent_seats.sum(axis=1)
    print("\nSeat Allocation")
    display(equivalent_seats)

    # tickets sold is the minimum of available seats and demand
    tickets = pd.DataFrame()
    for c in classes:
        tickets[c] = np.minimum(seats[c], demand[c])
    print("\nTickets Sold")
    display(tickets)

    # seats unsold
    unsold = pd.DataFrame()
    for c in classes:
        unsold[c] = seats[c] - tickets[c]
    print("\nSeats not Sold")
    display(unsold)

    # spillage (unmet demand)
    spillage = demand - tickets
    print("\nSpillage (Unfulfilled Demand)")
    display(spillage)

    # compute revenue
    revenue = tickets.dot(revenue_factor)
    print(
        f"\nExpected Revenue (in units of economy ticket price): {revenue.mean():.2f}"
    )

    # charts
    fig, ax = plt.subplots(2, 1, figsize=(8, 6))
    revenue.plot(ax=ax[0], kind="barh", title="Revenue by Scenario")
    ax[0].plot([revenue.mean()] * 2, ax[0].get_ylim(), "--", lw=1.4)
    ax[0].set_xlabel("Revenue")

    tickets[classes].plot(
        ax=ax[1], kind="barh", rot=0, stacked=False, title="Demand Scenarios"
    )
    demand[classes].plot(
        ax=ax[1],
        kind="barh",
        rot=0,
        stacked=False,
        title="Demand Scenarios",
        alpha=0.2,
    )
    for c in classes:
        ax[1].plot([seats[c]] * 2, ax[1].get_ylim(), "--", lw=1.4)
    ax[1].set_xlabel("Seats")
    fig.tight_layout()

    return


# a trial solution
seats_all_economy = pd.Series({"F": 0, "B": 0, "E": 200})
seat_report(seats_all_economy, demand)
Seat Allocation
F B E TOTAL
seat allocation 0.0 0.0 200.0 200.0
economy equivalent seat allocation 0.0 0.0 200.0 200.0
Tickets Sold
F B E
morning and evening 0 0 200
weekend 0 0 175
midday 0 0 150
Seats not Sold
F B E
morning and evening 0 0 0
weekend 0 0 25
midday 0 0 50
Spillage (Unfulfilled Demand)
F B E
morning and evening 20 50 0
weekend 10 24 0
midday 6 10 0
Expected Revenue (in units of economy ticket price): 175.00
../../_images/990310f94d7c6f3812afffe80640504f2f5e160fc994b31a573a701db31928f4.png

Model 1. Deterministic solution for the average demand scenario#

A common starting point in stochastic optimization is to solve the deterministic problem where future demands are fixed at their mean values and compute the corresponding optimal solution. The resulting value of the objective has been called the expectation of the expected value problem (EEV) by Birge, or the expected value of the mean (EVM) solution by others.

Let us introduce the set \(C\) of the three possible classes, i.e., \(C=\{F,B,E\}\). The objective function is to maximize ticket revenue.

\[ \max_{s_c, t_c} \quad \sum_{c\in C} r_c t_c \]

where \(r_c\) is the revenue from selling a ticket for a seat in class \(c\in C\).

Let \(s_c\) denote the number of seats of class \(c \in C\) installed in the new plane. Let \(f_c\) be the scale factor denoting the number of economy seats displaced by one seat in class \(c \in C\). Then, since there is a total of 200 economy-class seats that could fit on the plane, the capacity constraint reads as

\[ \sum_{c\in C} f_c s_c \leq 200, \]

Let \(\mu_c\) be the mean demand for seats of class \(c\in C\), and let \(t_c\) be the number of tickets of class \(c\in C\) that are sold. To ensure we do not sell more tickets than available seats nor more than demand, we need to add two more constraints:

\[\begin{split} \begin{align*} t_c & \leq s_c \qquad \forall \, c\in C \\ t_c & \leq \mu_c \qquad \forall \, c\in C \end{align*} \end{split}\]

Lastly, both ticket and seat variables need to be nonnegative integers, so we add the constraints \(\bm{t}, \bm{s} \in \mathbb{Z}_+\).

The following cell presents an AMPL model implementing this model.

%%writefile airline_deterministic.mod

param capacity;

set CLASSES;

param seat_factor{CLASSES};
param demand{CLASSES};
param revenue_factor{CLASSES};

# first stage variables and constraints
var seats{CLASSES} integer >= 0;

s.t. plane_seats: sum{c in CLASSES}(seats[c] * seat_factor[c]) <= capacity;

# second stage variable and constraints
var tickets{CLASSES} integer >= 0;

s.t. demand_limits {c in CLASSES}: tickets[c] <= demand[c];
s.t. seat_limits {c in CLASSES}: tickets[c] <= seats[c];

# objective
maximize revenue: sum{c in CLASSES}(tickets[c] * revenue_factor[c]);
Overwriting airline_deterministic.mod
def airline_deterministic(demand):
    # Create AMPL instance and load the model
    ampl = AMPL()
    ampl.read("airline_deterministic.mod")

    # load the data
    ampl.set["CLASSES"] = demand.columns.tolist()
    ampl.param["demand"] = demand.mean()
    ampl.param["revenue_factor"] = revenue_factor
    ampl.param["seat_factor"] = seat_factor
    ampl.param["capacity"] = capacity

    # set solver
    ampl.option["solver"] = SOLVER

    return ampl


# Solve a given model and return a Pandas series of the seats for each class
def airline_solve(model):
    model.solve()
    return pd.Series(model.var["seats"].to_dict()).reindex(index=["F", "B", "E"])


# Solve deterministic model to obtain the expectation of the expected value problem (EEV)
model_eev = airline_deterministic(demand)
seats_eev = airline_solve(model_eev)
seat_report(seats_eev, demand)
HiGHS 1.5.3: HiGHS 1.5.3: optimal solution; objective 226
2 simplex iterations
1 branching nodes
 

Seat Allocation
F B E TOTAL
seat allocation 12.0 28.0 134.0 174.0
economy equivalent seat allocation 24.0 42.0 134.0 200.0
Tickets Sold
F B E
morning and evening 12 28 134
weekend 10 24 134
midday 6 10 134
Seats not Sold
F B E
morning and evening 0 0 0
weekend 2 4 0
midday 6 18 0
Spillage (Unfulfilled Demand)
F B E
morning and evening 8 22 66
weekend 0 0 41
midday 0 0 16
Expected Revenue (in units of economy ticket price): 203.33
../../_images/7592897be106cf6f9f50440fe7ad8e153d856926e13185379f6d7801c99d18e8.png

Model 2. Two-stage stochastic optimization and its extensive form#

If we assume demand is not certain, we can formulate a two-stage stochastic optimization problem. The first-stage or here-and-now variables are the \(s_c\)ā€™s, those related to the seat allocations. Due to their dependence on the realized demand \(\boldsymbol{z}\), the variables \(t_c\)ā€™s describing the number of tickets sold, are second-stage or recourse decision variables. The full problem formulation is as follows: the first stage problem is

\[\begin{split} \begin{align*} \max \quad & \mathbb E_{z} Q(\boldsymbol{s},\boldsymbol{z}) \\ \text{s.t.} \quad & \sum_{c\in C} f_c s_c \leq 200,\\ & \boldsymbol{s} \in \mathbb{Z}_+, \end{align*} \end{split}\]

where \(Q(\boldsymbol{s},\boldsymbol{z})\) is the value of the second-stage problem, defined as

\[\begin{split} \begin{align*} Q(\boldsymbol{s},\boldsymbol{z}) := \max \quad & \sum_{c\in C} r_c t_{c}\\ \text{s.t.} \quad & t_c \leq s_c & \forall \, c\in C \\ & t_c \leq z_c & \forall \, c\in C \\ & \boldsymbol{t} \in \mathbb{Z}_+. \end{align*} \end{split}\]

In view of the assumption that there is only a finite number \(N=|S|\) of scenarios for ticket demand, we can write the extensive form of the two-stage stochastic optimization problem above and solve it exactly. To do so, we modify the second-stage variables \(t_{c,s}\) so that they are indexed by both class \(c\) and scenario \(s\). The expectation can thus be replaced with the average revenue over the \(N\) scenarios, that is

\[ \max \quad \sum_{s \in S} \frac{1}{N} \sum_{c\in C} r_c t_{c, s}, \]

where the fraction \(\frac{1}{N}\) appears since we assume that all \(N\) scenarios are equally likely. The first stage constraint remains unchanged, while the second stage constraints are tuplicated for each scenario \(s \in S\), namely

\[\begin{split} \begin{align*} t_{c,s} & \leq s_c & \forall \, c\in C \\ t_{c,s} & \leq z_{c, s} & \forall \, (c, s) \in C \times S \end{align*} \end{split}\]

The following cell presents an AMPL model implementing this model and solving it for the \(N=3\) equiprobable scenarios introduced above.

%%writefile airline_stochastic.mod

param capacity;

set CLASSES;
set SCENARIOS;

param demand{CLASSES, SCENARIOS};
param seat_factor{CLASSES};
param revenue_factor{CLASSES};

# first stage variables and constraints
var seats{CLASSES} integer >= 0;

s.t. plane_seats: sum{c in CLASSES}(seats[c] * seat_factor[c]) <= capacity;

# second stage variable and constraints
var tickets{CLASSES, SCENARIOS} integer >= 0;

s.t. demand_limits {c in CLASSES, s in SCENARIOS}: tickets[c, s] <= demand[c, s];
s.t. seat_limits {c in CLASSES, s in SCENARIOS}: tickets[c, s] <= seats[c];

# objective
maximize revenue: sum{c in CLASSES, s in SCENARIOS}(tickets[c, s] * revenue_factor[c]);
Overwriting airline_stochastic.mod
def airline_stochastic(demand):
    # Create AMPL instance and load the model
    ampl = AMPL()
    ampl.read("airline_stochastic.mod")

    # load the data
    ampl.set["CLASSES"] = demand.columns.tolist()
    ampl.set["SCENARIOS"] = demand.index.values.tolist()
    ampl.param["demand"] = demand.T
    ampl.param["revenue_factor"] = revenue_factor
    ampl.param["seat_factor"] = seat_factor
    ampl.param["capacity"] = capacity

    # set solver
    ampl.option["solver"] = SOLVER

    return ampl


# create and solve model for the three scenarios defined above
model_stochastic = airline_stochastic(demand)
seats_stochastic = airline_solve(model_stochastic)
seat_report(seats_stochastic, demand)
HiGHS 1.5.3: HiGHS 1.5.3: optimal solution; objective 628
7 simplex iterations
1 branching nodes
 

Seat Allocation
F B E TOTAL
seat allocation 10.0 20.0 150.0 180.0
economy equivalent seat allocation 20.0 30.0 150.0 200.0
Tickets Sold
F B E
morning and evening 10 20 150
weekend 10 20 150
midday 6 10 150
Seats not Sold
F B E
morning and evening 0 0 0
weekend 0 0 0
midday 4 10 0
Spillage (Unfulfilled Demand)
F B E
morning and evening 10 30 50
weekend 0 4 25
midday 0 0 0
Expected Revenue (in units of economy ticket price): 209.33
../../_images/9fd6ef7ba7921f5236c2da54ed639be251ab986b31ed3e1136ffff946d61b902.png

Model 3. Adding chance constraints#

The airline wishes a special guarantee for its clients enrolled in its loyalty program. In particular, it wants \(98\%\) probability to cover the demand of first-class seats and \(95\%\) probability to cover the demand of business-class seats (by clients of the loyalty program). First-class passengers are covered if they can purchase a first-class seat. Business-class passengers are covered if they purchase a business-class seat or upgrade to a first-class seat.

Assume the demand of loyalty-program passengers is normally distributed as \(z_F \sim \mathcal N(\mu_F, \sigma_F^2)\) and \(z_B \sim \mathcal N(\mu_B, \sigma_B^2)\) for first-class and business, respectively, where the parameters are given in the table below. For completeness, we also include the parameters for economy-class passengers.

\(\mu_{\cdot}\)

\(\sigma_{\cdot}\)

F

12

4

B

28

8

E

175

20

We further assume that the demands for first-class and business-class seats are independent of each other and of the scenario (time of the week).

Let \(s_F\) be the number of first-class seats and \(s_B\) the number of business seats. The probabilistic constraints are

\[ \mathbb P(s_F \geq z_F ) \geq 0.98, \qquad \text{ and } \qquad \mathbb P(s_F + s_B \geq z_F + z_B ) \geq 0.95. \]

These are can be written equivalently as linear constraints, specifically

\[ \frac{s_F - \mu_F}{\sigma_F} \geq \Phi^{-1}(0.98) \approx 2.054 \qquad \text{ and } \qquad \frac{(s_F + s_B) - (\mu_F + \mu_B)}{\sqrt{\sigma_F^2 + \sigma_B^2}} \geq \Phi^{-1}(0.95) \approx 1.645. \]

For the second constraint, we use the fact that the sum of the two independent random variables \(z_F\) and \(z_B\) is normally distributed with mean \(\mu_F + \mu_B\) and variance \(\sigma_F^2 + \sigma_B^2\).

We add these equivalent linear counterparts of the two chance constraints to the stochastic optimization model. Rather than writing a function to create a whole new model, we can use the prior function to create and add the two chance constraints using decorators.

# import the scipy.stats.norm object needed to use the quantile function of the standard normal distribution
from scipy.stats import norm

# define the mean and standard deviation of the demand for each class
mu = demand.mean()
sigma = {"F": 4, "B": 16, "E": 20}
display(pd.DataFrame({"mu": mu, "sigma": sigma}))


# create a new model with chance constraints that takes as input also
# the target quality of service (QoS) levels for classes F and B
def airline_cc(demand, QoSF=0.98, QoSFB=0.95):
    # create two-stage stochastic model as before
    m = airline_stochastic(demand)

    # add equivalent counterparts of the two chance constraints to the first stage problem
    # the two coefficients related the inverse CDF of the standard normal are computed using the norm.ppf function

    first_class = "s.t. first_class: seats['F'] - {} >= {};".format(
        mu["F"], norm.ppf(QoSF) * sigma["F"]
    )
    m.eval(first_class)
    business_class = "s.t. business_class: seats['F'] + seats['B'] - {} >= {};".format(
        mu["F"] + mu["B"], norm.ppf(QoSFB) * np.sqrt(sigma["F"] ** 2 + sigma["B"] ** 2)
    )
    m.eval(business_class)
    print(m.con["first_class"])
    print(m.con["business_class"])

    return m


# create and solve model
model_cc = airline_cc(demand)
seats_cc = airline_solve(model_cc)
seat_report(seats_cc, demand)
mu sigma
F 12.0 4
B 28.0 16
E 175.0 20
subject to first_class: seats['F'] - 12 >= 8.21499564252729;
subject to business_class: seats['F'] + seats['B'] - 40 >=
  27.127620970404912;
HiGHS 1.5.3: HiGHS 1.5.3: optimal solution; objective 531
4 simplex iterations
1 branching nodes
 

Seat Allocation
F B E TOTAL
seat allocation 21.0 47.0 87.0 155.0
economy equivalent seat allocation 42.0 70.5 87.0 199.5
Tickets Sold
F B E
morning and evening 20 47 87
weekend 10 24 87
midday 6 10 87
Seats not Sold
F B E
morning and evening 1 0 0
weekend 11 23 0
midday 15 37 0
Spillage (Unfulfilled Demand)
F B E
morning and evening 0 3 113
weekend 0 0 88
midday 0 0 63
Expected Revenue (in units of economy ticket price): 177.00
../../_images/1da413d0fe923c9ceb8700b941232d9433429f51b34a4ff6d0e2dc90e980fdce.png

Model 4. Solving the case of continuous demand distributions using the SAA method#

Let us now move past the simplifying assumption that there are only three equally likely scenarios and consider the case where the demand is described by a random vector \((z_F, z_B, z_E)\), where \(z_c\) is the demand for seats of class \(c\in C\). The demand for class \(c\) is assumed to be independent and normally distributed with mean \(\mu_c\) and variance \(\sigma_c^2\) as reported in the following table

\(\mu\)

\(\sigma\)

F

12

4

B

28

8

E

175

20

Note that we model the demand for each class using a continuous random variable, which is a simplification of the real world, where the ticket demand is always a discrete nonnegative number. Therefore, we round down all the obtained random numbers.

However, now that the number of scenarios is not finite anymore, we cannot solve the problem exactly. Instead, we can use the SAA method to approximate the expected value appearing in the objective function. The first step of the SAA is to generate a collection of \(N\) scenarios \((z_{F,s}, z_{B,s}, z_{E,s})\) for \(s=1,\ldots,N\). We can do this by sampling from the normal distributions with the given means and variances.

For sake of generality, we create a script to generate scenarios in which the three demands have a general correlation structure captured by a correlation matrix \(\bm{\rho}\).

Scenario generation (uncorrelated case)#

# sample size
N = 1000

# define the mean mu and standard deviation sigma of the demand for each class
mu = demand.mean()
sigma = {"F": 4, "B": 16, "E": 20}
classes = demand.columns

# build covariance matrix from covariances and correlations
s = np.array(list(sigma.values()))
S = np.diag(s) @ np.diag(s)
print("\nModel Covariance")
df = pd.DataFrame(S, index=classes, columns=classes)
display(df)

# generate N samples, round each demand entry to nearest integer, and correct non-negative values
seed = 0
rng = np.random.default_rng(seed)
samples = rng.multivariate_normal(list(mu), S, N).round()
demand_saa = pd.DataFrame(samples, columns=classes)
demand_saa[demand_saa < 0] = 0

# report sample means and standard deviations for each class
demand_saa_stats = pd.DataFrame(
    {
        "mu (mean)": mu,
        "sample mean": demand_saa.mean(),
        "sigma (std)": sigma,
        "sample std": demand_saa.std(),
    }
)
display(demand_saa_stats)

fig, ax = plt.subplots(1, 3, figsize=(10, 3))
for i, ci in enumerate(classes):
    demand_saa[ci].hist(ax=ax[i], bins=20)
    ax[i].set_title(f"Histogram demand class {ci}")
fig.tight_layout()
Model Covariance
F B E
F 16 0 0
B 0 256 0
E 0 0 400
mu (mean) sample mean sigma (std) sample std
F 12.0 11.889 4 4.088936
B 28.0 28.626 16 15.051534
E 175.0 172.966 20 19.839669
../../_images/1e721a0fdcf60686f831eaf9e338bad4cb7a1eea2d7b3988f3c27e1c4b6e2388.png

We also introduce a new function to report the solution and its features in this more general continuous demand case.

# function to report analytics for SAA case
def seat_report_saa(seats, demand):
    classes = seats.index

    # report seat allocation
    equivalent_seats = pd.DataFrame(
        {
            "seat allocation": {c: seats[c] for c in classes},
            "economy equivalent seat allocation": {
                c: seats[c] * seat_factor[c] for c in classes
            },
        }
    ).T
    equivalent_seats["TOTAL"] = equivalent_seats.sum(axis=1)
    print("\nSeat Allocation")
    display(equivalent_seats)

    # tickets sold
    tickets = pd.DataFrame()
    for c in classes:
        tickets[c] = np.minimum(seats[c], demand[c])

    print("\nMean Tickets Sold")
    display(tickets.mean())

    # seats unsold
    unsold = pd.DataFrame()
    for c in classes:
        unsold[c] = seats[c] - tickets[c]
    print("\nMean Seats not Sold")
    display(unsold.mean())

    # spillage (unmet demand)
    spillage = demand - tickets
    print("\nMean Spillage (Unfulfilled Demand)")
    display(spillage.mean())

    # compute revenue
    revenue = tickets.dot(revenue_factor)
    print(
        f"\nExpected Revenue (in units of economy ticket price): {revenue.mean():.2f}"
    )

    # charts
    fig, ax = plt.subplots(1, 1, figsize=(8, 4))
    revenue.hist(ax=ax, bins=20)
    ax.set_title("Revenue Histogram")

    fig, ax = plt.subplots(1, 3, figsize=(12, 3))
    for i, c in enumerate(classes):
        tickets[c].hist(ax=ax[i], bins=20, alpha=0.4)
        demand[c].hist(ax=ax[i], alpha=0.4)
        ax[i].legend(["Tickets Sold", "Demand"])
        ax[i].set_xlim(0, 300)

    fig.tight_layout()
    return

We can now solve the stochastic optimization problem using the generated \(N=1000\) scenarios. Note that we can use the previously defined function airline_stochastic to solve the model by simply calling it with a different dataframe as argument.

model_ssa = airline_stochastic(demand_saa)
seats_saa = airline_solve(model_ssa)
seat_report_saa(seats_saa, demand_saa)
HiGHS 1.5.3: HiGHS 1.5.3: optimal solution; objective 211180
2082 simplex iterations
1 branching nodes
 

Seat Allocation
F B E TOTAL
seat allocation 11.0 20.0 148.0 179.0
economy equivalent seat allocation 22.0 30.0 148.0 200.0
Mean Tickets Sold
F      9.764
B     17.416
E    147.056
dtype: float64
Mean Seats not Sold
F    1.236
B    2.584
E    0.944
dtype: float64
Mean Spillage (Unfulfilled Demand)
F     2.125
B    11.210
E    25.910
dtype: float64
Expected Revenue (in units of economy ticket price): 211.18
../../_images/d4361a2542b7a2a26ec9a1befc6a49d71b29f4bc1e6d48f440145daef26df76b.png ../../_images/66b9eb1315a6936379b7c4345337e403c30bc1ff0fdfa396f4384074633c9a5b.png

Model 5. Adding correlations between different demand types#

Now assume the ticket demand for the three categories is captured by a \(3\)-dimensional multivariate normal distribution mean \(\mu=(\mu_F, \mu_B, \mu_E)\), variances \((\sigma_F^2, \sigma_B^2, \sigma_E^2)\) and a symmetric correlation matrix

\[\begin{split} P = \left( \begin{array}{ccc} 1 & \rho_{FB} & \rho_{FE} \\ \rho_{BF} & 1 & \rho_{BE}\\ \rho_{EF} & \rho_{EB} & 1 \\ \end{array} \right) \end{split}\]

The covariance matrix is given by \(\Sigma = \text{diag}(\sigma)\ P\ \text{diag}(\sigma)\) or

\[\begin{split} \Sigma= \left( \begin{array}{ccc} \sigma_F^2 & \rho_{FB}\sigma_F\sigma_B & \rho_{FE}\sigma_F\sigma_E \\ \rho_{BF}\sigma_B\sigma_F & \sigma_B^2 & \rho_{BE}\sigma_B\sigma_E\\ \rho_{EF}\sigma_E\sigma_F & \rho_{EB}\sigma_E\sigma_B & \sigma_E^2 \\ \end{array} \right) \end{split}\]

We assume \(\rho_{FB} = 0.6\), \(\rho_{BE} = 0.4\) and \(\rho_{FE} = 0.2\).

We now sample \(N=1000\) scenarios from this multivariate correlated normal distribution and use the SAA method to approximate the solution of the two-stage stochastic optimization problem.

# sample size
N = 1000

# define the mean mu and standard deviation sigma of the demand for each class
mu = demand.mean()
sigma = {"F": 4, "B": 16, "E": 20}
display(pd.DataFrame({"mu": mu, "sigma": sigma}))

# correlation matrix
P = np.array([[1, 0.6, 0.2], [0.6, 1, 0.4], [0.2, 0.4, 1]])

# build covariance matrix from covariances and correlations
s = np.array(list(sigma.values()))
S = np.diag(s) @ P @ np.diag(s)

# create samples
np.random.seed(1)
samples = np.random.multivariate_normal(list(mu), S, N).round()

# truncate to integers and non-negative values
classes = demand.columns
demand_saa = pd.DataFrame(samples, columns=classes)
demand_saa[demand_saa < 0] = 0

df = pd.DataFrame(mu, columns=["mu"])
df["sample means"] = demand_saa.mean()
display(df)

df = pd.DataFrame(pd.Series(sigma), columns=["sigma"])
df["sample std dev"] = demand_saa.std()
display(df)

print("\nModel Covariance")
df = pd.DataFrame(S, index=classes, columns=classes)
display(df)

print("\nSample Covariance")
display(pd.DataFrame(demand_saa.cov()))

fig, ax = plt.subplots(3, 3, figsize=(10, 10))
for i, ci in enumerate(classes):
    for j, cj in enumerate(classes):
        if i == j:
            demand_saa[ci].hist(ax=ax[i, i], bins=20)
            ax[i, i].set_title(f"Histogram {ci}")
        else:
            ax[i, j].plot(demand_saa[ci], demand_saa[cj], ".", alpha=0.4)
            ax[i, j].set_xlabel(ci)
            ax[i, j].set_ylabel(cj)
            ax[i, j].set_title(f"Convariance: {ci} vs {cj}")
fig.tight_layout()
mu sigma
F 12.0 4
B 28.0 16
E 175.0 20
mu sample means
F 12.0 12.058
B 28.0 28.355
E 175.0 175.436
sigma sample std dev
F 4 4.046852
B 16 15.879708
E 20 19.503597
Model Covariance
F B E
F 16.0 38.4 16.0
B 38.4 256.0 128.0
E 16.0 128.0 400.0
Sample Covariance
F B E
F 16.377013 40.352763 15.402114
B 40.352763 252.165140 121.395616
E 15.402114 121.395616 380.390294
../../_images/a12e4629cc459b8aec1217a49d6a22c811a322246b4f3dc5d2f6c9a91ce0645f.png
model_ssa = airline_stochastic(demand_saa)
seats_saa = airline_solve(model_ssa)
seat_report_saa(seats_saa, demand_saa)
HiGHS 1.5.3: HiGHS 1.5.3: optimal solution; objective 210891
2103 simplex iterations
1 branching nodes
 

Seat Allocation
F B E TOTAL
seat allocation 11.0 20.0 148.0 179.0
economy equivalent seat allocation 22.0 30.0 148.0 200.0
Mean Tickets Sold
F      9.868
B     17.021
E    147.245
dtype: float64
Mean Seats not Sold
F    1.132
B    2.979
E    0.755
dtype: float64
Mean Spillage (Unfulfilled Demand)
F     2.190
B    11.334
E    28.191
dtype: float64
Expected Revenue (in units of economy ticket price): 210.89
../../_images/ee1060b2645ed4dbef2c74ea21412965e803bd6d8333793c520706035255c61e.png ../../_images/5565bdd6d4b707d1cb890a1fac4c5eacff03ec664ad8e480f538936440f29453.png

Model 6. Tackling chance constraints using SAA in the case of correlated demand#

The linear counterparts of the chance constraints used above in Model 3 were derived under the assumption of independent normal distributions of demand for first-class and business travel. That assumption no longer holds for the case where demand scenarios are sampled from correlated distributions.

This final model replaces the chance constraints by approximating them using two linear constraints that explicitly track unsatisfied demand. In doing so, we introduce two new sets of integer variables \(y_s\) and \(w_s\) and a big-M constant and approximate the true multivariate distribution with the empirical one obtained from the sample.

The first stage remains unchanged and so does the objective value of the second stage. The adjusted second-stage constraints are:

\[\begin{split} \begin{align*} t_c & \leq s_c & \forall c\in C \\ t_c & \leq z_{c, s} & \forall (c, s) \in C \times S \\ s_F + M y_s & \geq z_{F, s} & \forall s \in S\\ s_F + s_B + M w_s & \geq z_{F, s} + z_{B,s} & \forall s \in S \\ \frac{1}{N} \sum_{s\in S} y_s & \leq 1 - 0.98 \\ \frac{1}{N} \sum_{s\in S} z_s & \leq 1 - 0.95 \end{align*} \end{split}\]

where \(y_s\) and \(w_s\) are binary variables indicating those scenarios which do not satisfy the requirements of the airlineā€™s loyalty programs for first-class and business-class passengers.

The following cell implements this new model. Note that the running time for the cell can be up to a few minutes for a large number of scenarios.

%%writefile airline_final.mod

param capacity;

set CLASSES;
set SCENARIOS;

param nSCENARIOS := card(SCENARIOS);
param bigM;

param demand{CLASSES, SCENARIOS};
param seat_factor{CLASSES};
param revenue_factor{CLASSES};

param first_class_id symbolic;
param business_class_id symbolic;

param first_class_demand_cover_prop;
param business_class_demand_cover_prop;

# first stage variables and constraints
var seats{CLASSES} integer >= 0;

s.t. plane_seats: sum{c in CLASSES}(seats[c] * seat_factor[c]) <= capacity;

# second stage variable and constraints
var tickets{CLASSES, SCENARIOS} integer >= 0;
var first_class {SCENARIOS} binary;
var business_class {SCENARIOS} binary;

s.t. demand_limits {c in CLASSES, s in SCENARIOS}: tickets[c, s] <= demand[c, s];
s.t. seat_limits {c in CLASSES, s in SCENARIOS}: tickets[c, s] <= seats[c];

s.t. first_class_loyality {s in SCENARIOS}:
    seats[first_class_id] + bigM * first_class[s] >= demand[first_class_id, s];
s.t. first_class_loyality_rate:
    sum{s in SCENARIOS} first_class[s] <= (1 - first_class_demand_cover_prop) * nSCENARIOS;
s.t. business_class_loyality {s in SCENARIOS}:
    seats[first_class_id] + seats[business_class_id] + bigM * business_class[s] >=
        demand[business_class_id, s] + demand[first_class_id, s];
s.t. business_class_loyality_rate:
    sum{s in SCENARIOS} business_class[s] <= (1 - business_class_demand_cover_prop) * nSCENARIOS;

# objective
maximize revenue: sum{c in CLASSES, s in SCENARIOS}(tickets[c, s] * revenue_factor[c]);
Overwriting airline_final.mod
# define big M constant for chance constraints counterparts
bigM = 100


def airline_final(demand):
    # Create AMPL instance and load the model
    ampl = AMPL()
    ampl.read("airline_final.mod")

    # load the data
    ampl.set["CLASSES"] = demand.columns.tolist()
    ampl.set["SCENARIOS"] = demand.index.values.tolist()
    ampl.param["demand"] = demand.T
    ampl.param["revenue_factor"] = revenue_factor
    ampl.param["seat_factor"] = seat_factor
    ampl.param["capacity"] = capacity
    ampl.param["bigM"] = bigM
    ampl.param["first_class_id"] = "F"
    ampl.param["business_class_id"] = "B"
    ampl.param["first_class_demand_cover_prop"] = 0.98
    ampl.param["business_class_demand_cover_prop"] = 0.95

    # set solver
    ampl.option["solver"] = SOLVER

    return ampl


# create model
model = airline_final(demand_saa)
seats = airline_solve(model)
seat_report_saa(seats, demand_saa)
HiGHS 1.5.3: HiGHS 1.5.3: optimal solution; objective 170908
3591 simplex iterations
1 branching nodes
 

Seat Allocation
F B E TOTAL
seat allocation 20.0 54.0 79.0 153.0
economy equivalent seat allocation 40.0 81.0 79.0 200.0
Mean Tickets Sold
F    12.018
B    27.927
E    79.000
dtype: float64
Mean Seats not Sold
F     7.982
B    26.073
E     0.000
dtype: float64
Mean Spillage (Unfulfilled Demand)
F     0.040
B     0.428
E    96.436
dtype: float64
Expected Revenue (in units of economy ticket price): 170.91
../../_images/2aef9e5b30ba11871c73a4930a6258c1e2e483ae4d0a2ac1c4af20f3570c0383.png ../../_images/7eb8ec294f4442cd7c2950dc77f9a254c629cd5a901acde7c59b1bcc59e5a230.png

Exercise

Compared the results of using positive correlation in demand for first and business-class tickets results in lower expected revenue. What happens if there is no correlation, or there is negative correlation? Verify your predictions by simulation.