The 2d-Bin Packing Problem (2d-BPP) is a venerable optimization problem that is very simple to state but difficult to solve. We flesh out our idea in this unpublished paper, but this post will give a brief overview of the most interesting insight in the paper.

Stated simply, the 2d-BPP is: given identical rectangular bins of integer size and rectangular objects of (potentially) different integer sizes, assign objects to bins and positions, using as few bins as possible, such that no two objects overlap. Figure 1 illustrates a toy example where five objects are correctly packed into a single bin.

2d-BPP is a great way to abstract the problem of packing objects into a container, cutting a sheet of material into rectangular units, laying out the occupancy of a 3d printer bed, or even allocating CPU and memory resources to virtual machines. Simple variants allow objects to be rotated or to be arbitrary polygons. More complex variants include the 2D stock cutting problem, which minimizes the sum of material and cutting costs, motivated by industry needs. Because of the importance of the applications, 2d-BPP has been extensively explored and many approximate and exact solutions have been proposed. The approximate approaches are typically based on heuristics and can be very fast, but they are not guaranteed to find the optimal solution; in contrast, the exact approaches are typically based on Integer Linear Programming (ILP) and are generally limited to small instance sizes.

The current state-of-the-art exact method is based on the Combinatorial Benders Decomposition (CBD). It splits the problem into two parts: assigning objects to bins, and arranging objects within bins. Its speed comes from the way it quickly discovers which sets of objects cannot share a single bin and intelligently generalizes this knowledge to other sets of objects. At each step, it creates and solves sub-problems, corresponding to arranging objects within each bin, using a very simple method.

Instead, we’ll look at the erstwhile state-of-the-art approach Positions and Covering (P&C), and show how we can greatly improve it by recognizing that a key step can be rewritten using convolutions. The resultant algorithm is called Hough and Cover (H&C), and it is faster than CBD on a range of standard bin-packing datasets. Let’s begin by understanding the geometry behind P&C.

The Geometry of P&C

P&C is an ILP-based method, meaning we don’t directly solve the problem. Instead, we encode our problem with linear constraints and hand it over to a black-box solver. The solver returns a solution, which we then decode into a solution to the original problem.

For simplicity, we’ll assume that we have a single bin, and we are looking for any non-overlapping packing. We are given a bin of finite size \(\text{bin}_h \times \text{bin}_w\) and \(k\) objects to pack into that bin. Each object \(p\) has height \(h_p\) and width \(w_p\). We wish to assign the top-left corner of each object to some position such that no two objects overlap. All ILP-based methods must have some way to express the following constraints:

  1. Each object must appear exactly once.
  2. Each object must lie entirely within the bin.
  3. No two objects may overlap.

Positions and Covering is so named because it encodes the problem with two transformations: first positions, then coverings. (In a more general sense, any geometry-inspired ILP-based method can be thought of as transforming the representation of the problem to efficiently express constraints.)

Positions

For every object \(p\) we create a matrix of \(\{0, 1\}\)-decision variables the same size as the bin: \(A_p : \mathbb B^{\text{bin}_h \times \text{bin}_w}\). Each matrix should contain a single \(1\), corresponding to the top-left corner of the object. That is: \[ A_{p}[i, j] = \begin{cases} 1 & \text{if the top-left corner of object $p$ is at $(i, j)$} \\
0 & \text{otherwise} \end{cases} \]

For this matrix to be valid, we need to encode the first two of the three constraints. We do this by setting the sum of all positions in the matrix that correspond to valid top-left corners to be \(1\), and the remaining cells to be \(0\). You can see an example of this in the left side of Figure 2, where the object D is represented by a \(1\) at its top-left corner and zeros otherwise.
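As a rough illustration (not code from the paper), here is how the positions-stage constraints for a single object might be written down; I use the open-source PuLP modelling library as a stand-in solver interface, and the bin and object sizes are made-up toy values:

```python
import pulp

bin_h, bin_w = 6, 8    # toy bin size
h_p, w_p = 3, 4        # toy object size

prob = pulp.LpProblem("positions_stage", pulp.LpMinimize)

# One binary decision variable per cell of the bin for this object.
A = [[pulp.LpVariable(f"A_{i}_{j}", cat="Binary") for j in range(bin_w)]
     for i in range(bin_h)]

# Constraint 2: corners that would push the object outside the bin are fixed to 0.
for i in range(bin_h):
    for j in range(bin_w):
        if i > bin_h - h_p or j > bin_w - w_p:
            prob += A[i][j] == 0

# Constraint 1: exactly one valid top-left corner.
prob += pulp.lpSum(A[i][j]
                   for i in range(bin_h - h_p + 1)
                   for j in range(bin_w - w_p + 1)) == 1
```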

Coverings

To address the third constraint, P&C introduces the coverings transformation, which expands the \(1\)s in the positions matrices to cover all the cells each object occupies. You can see this in the right side of Figure 2, where the object D is represented by a \(1\) in every cell it covers.

For every object \(p\), we construct a coverings matrix \(C_p : \mathbb B^{\text{bin}_h \times \text{bin}_w}\), where: \[ C_{p}[i, j] = \begin{cases} 1 & \text{if object $p$ occupies cell $(i, j)$} \\
0 & \text{otherwise} \end{cases} \] We can generate the coverings matrix from the positions matrix by applying a sliding window of size \(h_p \times w_p\) to the positions matrix. Note that this construction is a convolution operation, which is exactly what we’ll later exploit to speed up the algorithm.

Finally, we need to add the constraint that no two objects may overlap, which we do by creating a single overlap matrix \(O : \mathbb Z^{\text{bin}_h \times \text{bin}_w}\) that counts the number of objects covering each position in the bin. \[ O[i, j] = \sum_{p} C_{p}[i, j] \qquad \forall i, \forall j \] We add the constraint that the overlap matrix is no more than \(1\) in each position, guaranteeing that objects do not overlap. This completes the description of P&C.
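To make the two transformations concrete, here is a small numeric sketch (plain numpy with made-up sizes and placements, not the ILP itself) that builds the positions and coverings matrices for two objects and checks the overlap constraint; the convolution with an all-ones kernel is the "sliding window" mentioned above, and scipy is used purely for illustration:

```python
import numpy as np
from scipy.signal import convolve2d

bin_h, bin_w = 4, 4
objects = {"D": (3, 3), "E": (1, 4)}       # toy sizes: name -> (h_p, w_p)
corners = {"D": (0, 0), "E": (3, 0)}       # toy placements: name -> top-left corner

O = np.zeros((bin_h, bin_w), dtype=int)    # overlap matrix
for name, (h, w) in objects.items():
    A = np.zeros((bin_h, bin_w), dtype=int)
    A[corners[name]] = 1                   # positions matrix: a single 1
    # Coverings matrix: convolve the positions matrix with an all-ones kernel.
    C = convolve2d(A, np.ones((h, w), dtype=int))[:bin_h, :bin_w]
    O += C

assert O.max() <= 1                        # no cell is covered by two objects
```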

P&C Complexity

We observe that constructing the coverings matrices is very expensive: for each object, it requires roughly one constraint per pair of a bin cell and an object cell. This corresponds to \(\mathcal O(k \times \text{bin}_h \times \text{bin}_w \times \max_p(h_p) \times \max_p(w_p)) \subseteq \mathcal O(k \times \text{bin}_h^2 \times \text{bin}_w^2)\) constraints, even though the number of variables is similar to the positions stage at \(\mathcal O(k \times \text{bin}_h \times \text{bin}_w)\). While the number of constraints is not always related to the time to solve these problems, we conjecture that a more compact representation will greatly speed up computation. We present a method that uses a much more compact representation and achieves greater speed.

Hough and Cover

The positions stage of Positions and Covering is a parameter-space transform better known as the Hough transform, a cornerstone of classic computer vision patented in the 1960s. We name our method in its honor. H&C uses the same positions stage as P&C; our main contribution is speeding up the coverings stage.

We begin by observing that the covering matrix in P&C can be constructed using this peculiar matrix pattern and the cumulative sum operation: \[ \textrm{CumSum}\left( \begin{array}{rrrr} 1 & 0 & 0 & -1 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
-1 & 0 & 0 & 1 \\
\end{array}\right) = \left( \begin{array}{rrrr} 1 & 1 & 1 & 0 \\
1 & 1 & 1 & 0 \\
1 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 \\
\end{array}\right) \]
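As a quick sanity check of that identity, here is the same pattern in numpy, where the two-dimensional cumulative sum is just a cumsum down the rows followed by one across the columns:

```python
import numpy as np

D = np.array([[ 1, 0, 0, -1],
              [ 0, 0, 0,  0],
              [ 0, 0, 0,  0],
              [-1, 0, 0,  1]])

# Two-dimensional cumulative sum: accumulate down the rows, then across the columns.
C = D.cumsum(axis=0).cumsum(axis=1)
print(C)
# [[1 1 1 0]
#  [1 1 1 0]
#  [1 1 1 0]
#  [0 0 0 0]]
```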

The delta matrix \(D_p : \mathbb Z^{\text{bin}_h \times \text{bin}_w}\) for each object \(p\) is constructed from its positions matrix \(A_p\) using this pattern. There are two key features of this matrix: first, all values are in \(\{-1, 0, 1\}\), and second, it has non-zero values in at most 4 positions. \[ D_{p}[i, j] = \begin{array}{ll} \phantom{+} A_{p}[i, j] & \text{(top-left)} \\
+ A_{p}[i-h_p, j-w_p] & \text{(bottom-right)} \\
- A_{p}[i-h_p, j] & \text{(bottom-left)} \\
- A_{p}[i, j-w_p] & \text{(top-right)} \\
\end{array} \qquad \forall p, \forall i, \forall j \] (Out-of-range indices are treated as zero.)
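Here is a numeric sketch of that construction (numpy, not the ILP encoding itself; `make_delta` is a helper name I've made up). The four shifted copies of \(A_p\) place the \(\pm 1\)s, and contributions that would land one row or column past the bin simply fall onto a padded edge that we truncate, as discussed below:

```python
import numpy as np

def make_delta(A, h, w):
    """Delta matrix for one object of size h x w, built from its positions matrix A."""
    bin_h, bin_w = A.shape
    D = np.zeros((bin_h + 1, bin_w + 1), dtype=int)     # one spare row and column
    D[:bin_h, :bin_w] += A                               # +1 at the top-left corner
    D[h:, w:] += A[:bin_h + 1 - h, :bin_w + 1 - w]       # +1 one past the bottom-right
    D[h:, :bin_w] -= A[:bin_h + 1 - h, :]                # -1 one past the bottom-left
    D[:bin_h, w:] -= A[:, :bin_w + 1 - w]                # -1 one past the top-right
    return D[:bin_h, :bin_w]                             # truncate the spare row/column

A = np.zeros((4, 4), dtype=int)
A[0, 0] = 1                                              # a 3x3 object at the top-left
D = make_delta(A, 3, 3)                                  # reproduces the pattern above
C = D.cumsum(axis=0).cumsum(axis=1)                      # recovers the 3x3 covering block
```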

The cumulative sum operation computes, for each cell of the input delta matrix, the sum of all values above and to the left of it, inclusive of the cell itself; this is the same two-dimensional cumulative sum available in numpy, pandas, or MATLAB. We can implement this operation in a straightforward way: \[ C_{p}[i, j] = \sum_{g=1}^{i} \sum_{h=1}^{j} D_{p}[g, h] \qquad \forall p, \forall i, \forall j \] Or, we can specify \(C\) recursively: \[ C_{p}[i,j] = C_{p}[i-1, j] + C_{p}[i, j-1] - C_{p}[i-1, j-1] + D_{p}[i, j] \qquad \forall p, \forall i, \forall j \]

Representing the cumulative sum operation as the latter instead of the former significantly reduces the number of constraints needed to express it: each cell now depends on a constant number of terms rather than on \(\mathcal O(i \times j)\) of them.
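The two forms compute the same thing; a small numpy check of the recurrence (a sketch using a random toy delta matrix) makes the equivalence concrete:

```python
import numpy as np

def cumsum2d_recursive(D):
    """2-D cumulative sum via the recurrence: each cell needs only O(1) terms."""
    H, W = D.shape
    C = np.zeros_like(D)
    for i in range(H):
        for j in range(W):
            up   = C[i - 1, j] if i > 0 else 0
            left = C[i, j - 1] if j > 0 else 0
            diag = C[i - 1, j - 1] if i > 0 and j > 0 else 0
            C[i, j] = up + left - diag + D[i, j]
    return C

D = np.random.randint(-1, 2, size=(5, 7))
assert np.array_equal(cumsum2d_recursive(D), D.cumsum(axis=0).cumsum(axis=1))
```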

Now for the key insight that speeds up this algorithm. So far, we have written the overlap matrix as: \[ O = \sum_{p=1}^k \text{CumSum}(D_{p}) \] Instead, because the cumulative sum is linear, we can write this as: \[ O = \text{CumSum}\left(\sum_{p=1}^k D_{p}\right) \] This does not change the meaning of the overlap matrix, but it does change the number of constraints needed to express it, and it is the key to the speedup of H&C. Given this insight, we can now describe the H&C algorithm.
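Before doing so, here is a quick numeric check (random toy delta matrices in numpy) that the two orderings really do produce the same overlap matrix:

```python
import numpy as np

def cumsum2d(M):
    return M.cumsum(axis=0).cumsum(axis=1)

rng = np.random.default_rng(0)
deltas = [rng.integers(-1, 2, size=(4, 4)) for _ in range(5)]

lhs = sum(cumsum2d(D) for D in deltas)   # P&C-style: one CumSum per object
rhs = cumsum2d(sum(deltas))              # H&C: a single CumSum of the summed deltas
assert np.array_equal(lhs, rhs)          # identical, because CumSum is linear
```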

H&C Algorithm

We first convert the positions matrix \(A_p\) to a delta matrix \(D_p\) for each object \(p\). We then compute the sum of all delta matrices, \(E : \mathbb Z^{\text{bin}_h \times \text{bin}_w}\), taking the sum over all objects as illustrated in Figure 3: \[ E[i, j] = \sum_{p=1}^k D_{p}[i, j] \qquad \forall i, \forall j \] We then construct the overlap matrix by taking the cumulative sum of this: \[ O = \textrm{CumSum}(E) \] Now \(O[i, j]\) contains the number of objects occupying position \((i, j)\). Finally, we add the constraint that no objects can overlap: \[ O[i, j] \leq 1 \qquad \forall i, \forall j \] This completes the H&C construction.
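Putting the pieces together, here is a numeric sketch of the H&C construction (numpy with made-up placements; it checks a given packing rather than solving for one):

```python
import numpy as np

bin_h, bin_w = 4, 4
# Toy instance: (h_p, w_p, top-left row, top-left col) for each object.
objects = [(3, 3, 0, 0),
           (1, 4, 3, 0)]

E = np.zeros((bin_h + 1, bin_w + 1), dtype=int)   # summed delta matrices (padded)
for h, w, i, j in objects:
    E[i,     j    ] += 1                          # top-left
    E[i + h, j + w] += 1                          # one past the bottom-right
    E[i + h, j    ] -= 1                          # one past the bottom-left
    E[i,     j + w] -= 1                          # one past the top-right

# One cumulative sum over the summed deltas yields the overlap matrix.
O = E.cumsum(axis=0).cumsum(axis=1)[:bin_h, :bin_w]
assert O.max() <= 1                               # the packing is overlap-free
```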

H&C Complexity

This re-ordering of operations greatly reduces the size of the optimization problem. Counting the constraints needed to encode this:

  • the delta matrix incurs a constant number of constraints per element of the positions matrix, for a total of \(\mathcal O(k \times \text{bin}_h \times \text{bin}_w)\) constraints;
  • the summation step touches each of those variables exactly once, incurring \(\mathcal O(k \times \text{bin}_h \times \text{bin}_w)\) constraint terms;
  • the cumulative-sum step touches each element of the overlap matrix a fixed number of times, leading to \(\mathcal O(\text{bin}_h \times \text{bin}_w)\) constraints.

The dominating term is \(\mathcal O(k \times \text{bin}_h \times \text{bin}_w)\), a tremendous improvement over the \(\mathcal O(k \times \text{bin}_h^2 \times \text{bin}_w^2)\) constraints of P&C.
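For a feel of the difference, plugging a made-up instance into those bounds (ignoring constant factors):

```python
# Toy instance: a 20x20 bin, 10 objects, largest object 10x10 (constants ignored).
k, bin_h, bin_w, max_h, max_w = 10, 20, 20, 10, 10

pc = k * bin_h * bin_w * max_h * max_w   # P&C coverings constraints: 400,000
hc = k * bin_h * bin_w                   # H&C constraints:             4,000
print(pc, hc)
```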

In the paper we conduct experiments showing that H&C creates smaller problems than P&C in practice, and that these problems solve faster. We’ll skip that here; instead, let’s talk about why this works.

Why does this work?

Thus far, we have introduced the delta matrix without providing theoretical justification for its construction. This relies heavily on some key properties of linear operations, which we briefly review.

Properties of Linear Operations

Linear operations are operations over vectors that preserve vector addition and scalar multiplication; that is, some function \(f\) is linear if and only if \(f(\vec a + \vec b) = f(\vec a) + f(\vec b)\) and if \(f(c \times \vec a) = c \times f(\vec a)\) for any vectors \(\vec a\) and \(\vec b\) and scalar \(c\). We rely on three key properties:

  1. Any operation that can be written as matrix multiplication (i.e., \(f(\vec b) = A \vec b\)) is linear. This also means that the sum of a vector, the sum of arbitrary subsets of a vector, and the convolution between two vectors are all linear operations.
  2. Matrix operations are associative and distributive. Convolution operations are also commutative, a fact that makes it possible for us to re-order the formation of delta matrices.
  3. If some function can be written as a matrix multiplication or convolution with a non-negative matrix, then it preserves convexity, which is necessary for solving our ILP.

Delta Matrix Explained

Now that we have these key properties, we can derive the structure of the delta matrix. Consider the one-dimensional mapping from positions to coverings: \[ (0, 1, 0, …, 0, 0, 0) \rightarrow (0, 1, 1, …, 1, 0, 0) \] We note that for an object \(m\) elements long in an \(n\)-element sequence, this will require \(\mathcal{O}(mn)\) constraints to express: at each of the \(n\) positions in the sequence, the value is determined by the values at up to \(m\) possible starting positions.

We observe that we can use the one-dimensional \(\text{CumSum}\) operation here as well: \[ \text{CumSum}((0, 1, 0, …, 0, -1, 0)) \rightarrow (0, 1, 1, …, 1, 0, 0) \] This has the property that only two values are needed to specify an object of any length, and so the number of constraints needed is reduced to \(\mathcal{O}(n)\) from \(\mathcal{O}(mn)\).
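In numpy, the one-dimensional version of the trick is a one-liner (toy lengths chosen arbitrarily):

```python
import numpy as np

n, m, start = 10, 4, 2              # sequence length, object length, start position
delta = np.zeros(n + 1, dtype=int)
delta[start] = 1                    # the object begins covering here...
delta[start + m] = -1               # ...and stops covering here
covering = np.cumsum(delta)[:n]     # 1s exactly at positions start..start+m-1
```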

Connecting this insight to the 2-dimensional case begins with observing that generating the coverings matrix is a convolution operation. That is, for some object \(p\): \[ C_p = A_p * \mathbb 1^{h_p \times w_p} = A_p * (\mathbb 1^{h_p \times 1} * \mathbb 1^{1 \times w_p}) \] where \(*\) denotes convolution, and \(\mathbb 1\) denotes a vector or matrix of ones. We replace each of the one-vectors with its delta equivalent: \[ = A_p * (\textrm{CumSum}(\text{delta}(h_p \times 1)) * \textrm{CumSum}(\text{delta}(1 \times w_p))) \] Since the cumulative sum can itself be written as a convolution, and convolution is associative and commutative, we can combine the two cumulative sums into a single two-dimensional one: \[ = A_p * \textrm{CumSum}(\text{delta}(h_p \times 1) * \text{delta}(1 \times w_p)) \] and, by the same properties, move the positions matrix inside the operation: \[ = \textrm{CumSum}(A_p * \text{delta}(h_p \times 1) * \text{delta}(1 \times w_p)) \] \[ = \textrm{CumSum}\left(A_p * \left(\begin{array}{r}\phantom- 1 \\
\phantom- 0 \\
\phantom- \vdots \\
\phantom- 0 \\
-1 \end{array}\right) * \left(\begin{array}{rrrrr} 1 & 0 & \cdots & 0 & -1\end{array}\right)\right) \] Our two-dimensional delta function is the convolution of the two one-dimensional delta vectors, so we can write the delta matrix as: \[ = \textrm{CumSum}\left(A_p * \left(\begin{array}{rrrrr} \phantom- 1 & 0 & \cdots & 0 & -1 \\
\phantom- 0 & 0 & & 0 & \phantom- 0 \\
& \vdots & & \vdots \\
\phantom- 0 & 0 & & 0 & \phantom- 0 \\
-1 & 0 & \cdots & 0 & \phantom- 1 \end{array}\right)\right) \] This is the rearrangement that allows us to reduce the number of constraints in H&C.
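A small numpy check of that last step (with a toy object size): the convolution of the two one-dimensional deltas, which here reduces to an outer product, is exactly the four-corner delta pattern, and its cumulative sum is the all-ones covering block:

```python
import numpy as np

h, w = 3, 4
col_delta = np.zeros((h + 1, 1), dtype=int)
col_delta[0, 0], col_delta[h, 0] = 1, -1
row_delta = np.zeros((1, w + 1), dtype=int)
row_delta[0, 0], row_delta[0, w] = 1, -1

# Convolving a column vector with a row vector is just their outer product.
delta_2d = col_delta * row_delta            # +/-1 in the four corners, zeros elsewhere
ones_block = delta_2d.cumsum(axis=0).cumsum(axis=1)[:h, :w]
assert (ones_block == 1).all()              # recovers the h x w all-ones matrix
```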

The delta matrix extends one additional row and column past the original object. That is, when convolving with the bottom-right-most position of \(A_p\), the delta matrix extends out past the rightmost column and lowest row of \(C_p\). We observe that the value in those cells in \(C_p\) after the cumulative sum operation must always be zero, which means they can be safely truncated.

Preprocessing, CBD, and Future Work

CBD, the current state-of-the-art method, uses a much less sophisticated technique than H&C for arranging objects in bins. In the CBD paper, the authors credit the performance of their method to two key factors: first, they solve as few arranging sub-problems as possible by generalizing effectively from one infeasible arrangement to as many others as possible; and second, their suite of preprocessing techniques. The preprocessing techniques alone are sufficient to solve some types of bin-packing problems (such as most of Class 9 of the Unibo dataset).

In principle, we could use H&C to solve the sub-problems and see if the resulting method is faster than CBD. However, the source code for CBD is not available, and there is a huge amount of engineering overhead in recreating the entire pipeline. Instead of expending that effort, I’ll just post this on the internet and hope that someone finds the basic technique behind H&C as cool as I do.