Dinner seating arrangement#

# install dependencies and select solver
%pip install -q amplpy

SOLVER = "cbc"

from amplpy import AMPL, ampl_notebook

ampl = ampl_notebook(
    modules=["cbc"],  # modules to install
    license_uuid="default",  # license to use
)  # instantiate AMPL object and register magics
from IPython.display import display
import networkx as nx
import matplotlib.pyplot as plt
import pandas as pd

Problem description#

Consider organizing a wedding dinner. Your objective is for guests from different families to mingle with each other. One way to encourage mingling is to seat people at the tables so that no more people than \(k\) members of the same family take a seat at the same table.

How could we solve a problem like this? First, we need the problem data – for each family \(f\) we need to know the number of its members \(m_f\), and for each table \(t\) we need to know its capacity \(c_t\). Using this data and the tools we learned so far, we can formulate this problem as a linear optimization (LO).

We can use variable \(x_{ft}\) for the number of persons from family \(f\) to be seated at table \(t\). Since we were not provided with any objective function, we can focus on finding a feasible solution by setting the objective function to be constant, say \(0\), which means that we do not differentiate between feasible solutions.

The mathematical formulation of this seating problem is:

\[\begin{split} \begin{align*} \min_{x_{ft}} \quad & 0 \label{ch4eq.Dina.problem.1}\\ \text{s.t.} \quad & \sum\limits_{f} x_{ft} \leq c_t & \forall \, t \in T \\ & \sum\limits_{t} x_{ft} = m_f & \forall \, f \in F \\ & 0 \leq x_{ft} \leq k. \end{align*} \end{split}\]

The constraints ensure that the seating capacity for each table is not exceeded, that each family is fully seated, and that the number of elements of each family at each table does not exceed the threshold \(k\).

Implementation#

The problem statement will be satisfied by finding a feasible solution, if one exists. Because no specific objective has been specified, the mathematical formulation uses a constant 0 as essentially a placeholder for the objective function. Some optimization solvers, however, issue warning messages if they detect a constant objective, and others will fail to execute at all. A simple work around for these cases is to replace the constant objective with a ‘dummy’ variable that doesn’t appear elsewhere in the optimization problem.

%%writefile table_seat.mod

set F;
set T;

param M{F};
param C{T};

param k;

var x{F, T} >= 0, <= k;
var dummy >= 0, <= 1, default 0;

minimize goal: dummy;

s.t. capacity {t in T}: sum{f in F} x[f, t] <= C[t];
s.t. seat {f in F}: sum{t in T} x[f, t] == M[f];
Overwriting table_seat.mod
def table_seat(members, capacity, k):
    m = AMPL()
    m.read("table_seat.mod")

    m.set["F"] = range(len(members))
    m.set["T"] = range(len(capacity))

    m.param["M"] = members
    m.param["C"] = capacity

    m.param["k"] = k

    m.option["solver"] = SOLVER

    return m


def get_solution(model):
    df = model.var["x"].to_pandas()
    df.index.names = ["family", "table"]
    df = df.unstack()
    df.columns = df.columns.get_level_values(1)

    return df.round(5)


def report(model, results, type=int):
    solve_result = model.get_value("solve_result")
    print(f"Solve result: {solve_result}")
    if solve_result == "solved":
        soln = get_solution(model).astype(type)
        display(soln)
        print(f'objective:       {model.obj["goal"].value()}')
        print(f"places at table: {list(soln.sum(axis=0))}")
        print(f"members seated:  {list(soln.sum(axis=1))}")

Let us now consider and solve a specific instance of this problem with six families with sizes \(m = (6, 8, 2, 9, 13, 1)\), five tables with capacities \(c = (8, 8, 10, 4, 9)\), and a threshold \(k=3\) for the number of members of each family that can be seated at the same table.

seatplan = table_seat(members=[6, 8, 2, 9, 13, 1], capacity=[8, 8, 10, 4, 9], k=3)
%time results = seatplan.solve()
report(seatplan, results, type=float)
cbc 2.10.7: cbc 2.10.7: optimal solution; objective 0
0 simplex iterations
CPU times: user 0 ns, sys: 1.35 ms, total: 1.35 ms
Wall time: 16.4 ms
Solve result: solved
table 0 1 2 3 4
family
0 0.0 3.0 1.0 0.0 2.0
1 3.0 0.0 3.0 0.0 2.0
2 0.0 0.0 0.0 0.0 2.0
3 1.0 2.0 3.0 3.0 0.0
4 3.0 3.0 3.0 1.0 3.0
5 1.0 0.0 0.0 0.0 0.0
objective:       0.0
places at table: [8.0, 8.0, 10.0, 4.0, 9.0]
members seated:  [6.0, 8.0, 2.0, 9.0, 13.0, 1.0]

A peculiar fact is that although we did not explicitly require that all variables \(x_{ft}\) be integer, the optimal solution turned out to be integer anyway. This is no coincidence as it follows from a certain property of the problem we solve. This also means we can solve larger versions of this problem with LO instead of MILO solvers to find integer solutions, gaining a large computational advantage.

Minimize the maximum group size#

Our objective was that we make members of different families mingle as much as possible. Is \(k = 3\) the lowest possible number for which a feasible table allocation exists or can we make the tables even more diverse by bringing this number down?

In order to find out, we change the objective function and try to minimize \(k\), obtaining the following problem:

%%writefile table_seat_minimize_max_group_at_table.mod

set F;
set T;

param M{F};
param C{T};

var x{F,T} >= 0;
var k >= 0;

minimize goal: k;
    
s.t. capacity {t in T}: sum{f in F} x[f,t] <= C[t];
s.t. seat {f in F}: sum{t in T} x[f,t] == M[f];
s.t. bound {f in F, t in T}: x[f,t] <= k;
Overwriting table_seat_minimize_max_group_at_table.mod
def table_seat_minimize_max_group_at_table(members, capacity):
    m = AMPL()
    m.read("table_seat_minimize_max_group_at_table.mod")

    m.set["F"] = range(len(members))
    m.set["T"] = range(len(capacity))

    m.param["M"] = members
    m.param["C"] = capacity

    m.option["solver"] = SOLVER

    return m

We now solve the same instance as before.

seatplan = table_seat_minimize_max_group_at_table(
    members=[6, 8, 2, 9, 13, 1], capacity=[8, 8, 10, 4, 9]
)
%time results = seatplan.solve()
report(seatplan, results, type=float)
cbc 2.10.7: cbc 2.10.7: optimal solution; objective 2.6
0 simplex iterations
CPU times: user 0 ns, sys: 1.47 ms, total: 1.47 ms
Wall time: 16.3 ms
Solve result: solved
table 0 1 2 3 4
family
0 0.0 2.6 0.8 0.0 2.6
1 2.6 0.2 2.6 0.0 2.6
2 0.0 0.0 0.8 0.0 1.2
3 2.6 2.6 2.4 1.4 0.0
4 2.6 2.6 2.6 2.6 2.6
5 0.2 0.0 0.8 0.0 0.0
objective:       2.6
places at table: [8.0, 8.0, 10.0, 4.0, 9.0]
members seated:  [6.0, 8.0, 2.0, 9.0, 13.0, 1.0]

Unfortunately, this solution is no longer integer. Mathematically, this is because the “structure” that previously ensured integer solutions at no extra cost has been lost as a result of making \(k\) a decision variable. To find the solution to this problem we would need to impose that the variables are integers.

Minimize number of tables#

We now add an integer variable y to minimize the number of tables in our model. In order to better compare the diferences between a LO and a MILO implementation, we introduce the AMPL option relax_integrality. When the option relax_integrality is set to 1 all the integer variables in the model are treated as linear. Conversely, if relax_integrality is set to 0 the integrality restrictions are restored in the model.

%%writefile table_seat_minimize_number_of_tables.mod

set F;
set T;

param M{F};
param C{T};

param k;

var x{F,T} integer >= 0, <= k;
var y{T} binary;

minimize goal: sum{t in T} y[t];

s.t. capacity{t in T}: sum{f in F} x[f,t] <= C[t] * y[t];
s.t. seat{f in F}: sum{t in T} x[f,t] == M[f];
Overwriting table_seat_minimize_number_of_tables.mod
def table_seat_minimize_number_of_tables(members, capacity, k, relax=False):
    m = AMPL()
    m.read("table_seat_minimize_number_of_tables.mod")

    m.set["F"] = range(len(members))
    m.set["T"] = range(len(capacity))

    m.param["M"] = members
    m.param["C"] = capacity
    m.param["k"] = k

    m.option["solver"] = SOLVER

    if relax:
        m.option["relax_integrality"] = 1

    return m
members = [6, 8, 2, 9, 13, 1]
capacity = [8, 8, 10, 4, 9]
k = 3

print("\nMILO problem\n")
seatplan = table_seat_minimize_number_of_tables(members, capacity, k)
%time results = seatplan.solve()
report(seatplan, results, type=int)

print("\nRelaxed problem\n")
seatplan = table_seat_minimize_number_of_tables(members, capacity, k, relax=True)
%time results = seatplan.solve()
report(seatplan, results, type=float)
MILO problem

cbc 2.10.7: cbc 2.10.7: optimal solution; objective 5
0 simplex iterations
CPU times: user 477 ”s, sys: 78 ”s, total: 555 ”s
Wall time: 17.3 ms
Solve result: solved
table 0 1 2 3 4
family
0 2 0 3 0 1
1 0 2 3 0 3
2 0 0 0 2 0
3 3 3 1 1 1
4 3 3 3 1 3
5 0 0 0 0 1
objective:       5.0
places at table: [8, 8, 10, 4, 9]
members seated:  [6, 8, 2, 9, 13, 1]

Relaxed problem

cbc 2.10.7: cbc 2.10.7: optimal solution; objective 5
0 simplex iterations
CPU times: user 464 ”s, sys: 75 ”s, total: 539 ”s
Wall time: 15.6 ms
Solve result: solved
table 0 1 2 3 4
family
0 0.0 2.0 1.0 0.0 3.0
1 0.0 2.0 3.0 0.0 3.0
2 1.0 0.0 0.0 1.0 0.0
3 3.0 3.0 3.0 0.0 0.0
4 3.0 1.0 3.0 3.0 3.0
5 1.0 0.0 0.0 0.0 0.0
objective:       5.0
places at table: [8.0, 8.0, 10.0, 4.0, 9.0]
members seated:  [6.0, 8.0, 2.0, 9.0, 13.0, 1.0]

Reformulation as max flow problem#

However, using an MILO solver is not necessarily the best approach for problems like this. Many real-life situations (assigning people to work/course groups) require solving really large problems. There are existing algorithms that can leverage the special network structure of the problem at hand and scale better than LO solvers. To see this we can visualize the seating problem using a graph where:

  • the nodes on the left-hand side stand for the families and the numbers next to them provide the family size

  • the nodes on the left-hand side stand for the tables and the numbers next to them provide the table size

  • each left-to-right arrow stand comes with a number denoting the capacity of arc \((f, t)\) – how many people of family \(f\) can be assigned to table \(t\).

If we see each family as a place of supply (people) and tables as places of demand (people), then we can see our original problem as literally sending people from families \(f\) to tables \(t\) so that everyone is assigned to some table, the tables’ capacities are respected, and no table gets more than \(k = 3\) members of the same family.

An AMPL version of this model is given in the next cell. After that we will show how to reformulate the calculation using network algorithms.

%%writefile table_seat_maximize_members_flow_to_tables.mod

set F;
set T;

param M{F};
param C{T};

param k;

var x{F,T} integer >= 0, <= k;

maximize goal: sum{f in F, t in T} x[f, t];

s.t. capacity {t in T}: sum{f in F} x[f,t] <= C[t];
s.t. seat {f in F}: sum{t in T} x[f,t] == M[f];
Overwriting table_seat_maximize_members_flow_to_tables.mod
def table_seat_maximize_members_flow_to_tables(members, capacity, k, relax=False):
    m = AMPL()
    m.read("table_seat_maximize_members_flow_to_tables.mod")

    m.set["F"] = range(len(members))
    m.set["T"] = range(len(capacity))

    m.param["M"] = members
    m.param["C"] = capacity
    m.param["k"] = k

    m.option["solver"] = SOLVER

    if relax:
        m.option["relax_integrality"] = 1

    return m


members = [6, 8, 2, 9, 13, 1]
capacity = [8, 8, 10, 4, 9]
k = 3

print("\nMILO problem\n")
seatplan = table_seat_maximize_members_flow_to_tables(members, capacity, k)
%time results = seatplan.solve()
report(seatplan, results, type=int)

print("\nRelaxed problem\n")
seatplan = table_seat_maximize_members_flow_to_tables(members, capacity, k, relax=True)
%time results = seatplan.solve()
report(seatplan, results, type=float)
MILO problem

cbc 2.10.7: cbc 2.10.7: optimal solution; objective 39
0 simplex iterations
CPU times: user 305 ”s, sys: 51 ”s, total: 356 ”s
Wall time: 17.3 ms
Solve result: solved
table 0 1 2 3 4
family
0 1 2 3 0 0
1 1 0 3 1 3
2 0 2 0 0 0
3 3 0 1 2 3
4 3 3 3 1 3
5 0 1 0 0 0
objective:       39.0
places at table: [8, 8, 10, 4, 9]
members seated:  [6, 8, 2, 9, 13, 1]

Relaxed problem

cbc 2.10.7: cbc 2.10.7: optimal solution; objective 39
0 simplex iterations
CPU times: user 398 ”s, sys: 67 ”s, total: 465 ”s
Wall time: 16 ms
Solve result: solved
table 0 1 2 3 4
family
0 0.0 3.0 0.0 0.0 3.0
1 1.0 1.0 3.0 0.0 3.0
2 0.0 0.0 2.0 0.0 0.0
3 3.0 3.0 2.0 1.0 0.0
4 3.0 1.0 3.0 3.0 3.0
5 1.0 0.0 0.0 0.0 0.0
objective:       39.0
places at table: [8.0, 8.0, 10.0, 4.0, 9.0]
members seated:  [6.0, 8.0, 2.0, 9.0, 13.0, 1.0]

By adding two more nodes to the graph above, we can formulate the problem as a slightly different flow problem where all the data is formulated as the arc capacity, see figure below. In a network like this, we can imagine a problem of sending resources from the root node “door” to the sink node “seat”, subject to the restriction that for any node apart from \(s\) and \(t\), the sum of incoming and outgoing flows are equal (balance constraint). If there exists a flow in this new graph that respects the arc capacities and the sum of outgoing flows at \(s\) is equal to \(\sum_{f \in F} m_f = 39\), it means that there exists a family-to-table assignment that meets our requirements.

def model_as_network(members, capacity, k):
    # create lists of families and tables
    families = [f"f{i}" for i in range(len(members))]
    tables = [f"t{j}" for j in range(len(capacity))]

    # create digraphy object
    G = nx.DiGraph()

    # add edges
    G.add_edges_from(["door", f, {"capacity": n}] for f, n in zip(families, members))
    G.add_edges_from([(f, t) for f in families for t in tables], capacity=k)
    G.add_edges_from([t, "seat", {"capacity": n}] for t, n in zip(tables, capacity))

    return G
members = [6, 8, 2, 9, 13, 1]
capacity = [8, 8, 10, 4, 9]

G = model_as_network(members, capacity=[8, 8, 10, 4, 9], k=3)

labels = {(e[0], e[1]): e[2] for e in G.edges(data="capacity")}
%time flow_value, flow_dict = nx.maximum_flow(G, 'door', 'seat')
CPU times: user 3.37 ms, sys: 0 ns, total: 3.37 ms
Wall time: 3.38 ms
families = [f"f{i:.0f}" for i in range(len(members))]
tables = [f"t{j:.0f}" for j in range(len(capacity))]

pd.DataFrame(flow_dict).loc[tables, families].astype("int")
f0 f1 f2 f3 f4 f5
t0 2 0 0 2 3 1
t1 0 1 1 3 3 0
t2 1 3 0 3 3 0
t3 0 1 0 0 3 0
t4 3 3 1 1 1 0

Even for this very small example, we see that network algorithms generate a solution significantly faster.