Introduction

The Knapsack Problem is a great problem for learning about NP-Hardness because it is simple to state.
It is also approachable because its approximation algorithms are not NP-Hard; they run in polynomial time.

Formulation of the Knapsack problem: \(\max\{\sum_{i=1}^n p_ix_i : \sum_{i=1}^n w_i x_i \leq b\wedge x\in\{0, 1\}^n \}\). To eliminate edge cases, let's also require that:

  1. Every item's profit is positive: \(p_i \in \mathbb{R}_+\)
  2. Every item's weight fits within the budget on its own: \(0 \leq w_i \leq b;\; w_i\in \mathbb{R}_+\)
  3. Indices start at 0 and end at n-1 in everything that follows.

Dynamic Programming

Dynamic Programming is possible if one of the vectors, \(\vec{w}\) or \(\vec{p}\), consists of integers. This makes things easy because it allows the Dynamic Programming algorithm to merge subsets of items that share the same total weight/profit.
On the contrary, if the weights and profits of the items were arbitrary points on the real number line, different subsets would essentially never share the same total profit/weight.
There are 2 DP formulations for solving the Knapsack problem:

  1. Primal: Maximizing the profits for a certain amount of weights.
  2. Dual: Minimizing the weights for a certain amount profits.
This is one of the applications of duality in Linear Programming.

Primal

Inputs: \(\forall w_i : w_i \in \mathbb{Z}_+\), \(\forall p_i: p_i \in \mathbb{R}_+\)
Observe that the other vector can still contain real numbers.

The complexity of the Primal exact algorithm is the size of the DP table: \(\mathcal{O}(n\cdot b)\).
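
Here is a minimal sketch of the primal table, assuming integer weights; the name knapsack_dp_primal is my own, and the table is kept as a single rolling row over capacities:

def knapsack_dp_primal(P, W, b):
    # T[c] = best profit achievable with total weight at most c.
    T = [0.0] * (b + 1)
    for p, w in zip(P, W):
        # Sweep capacities from high to low so each item is used at most once (0/1 knapsack).
        for c in range(b, w - 1, -1):
            T[c] = max(T[c], T[c - w] + p)
    return T[b]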

Dual

The dual minimizes the weight of the solution for a fixed profit. It allows the weights to be positive reals, but the profits have to be integers.
Inputs: \(\forall p_i: p_i \in \mathbb{Z}_+\), \(\forall w_i: w_i \in \mathbb{R}_+\)

The dual formulation has a table with \(\sum_{i} p_i\) rows; but the upper bound on the profit can be reduced using an approximation algorithm such as the Greedy Algorithm.
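
A minimal sketch of the dual table (the knapsack_dp_dual routine used in the benchmark script later also recovers the chosen item set, which this sketch omits):

def knapsack_dp_dual_sketch(P, W, b):
    # T[q] = minimum total weight needed to reach profit exactly q.
    INF = float("inf")
    Pmax = sum(P)
    T = [0.0] + [INF] * Pmax
    for p, w in zip(P, W):
        for q in range(Pmax, p - 1, -1):
            T[q] = min(T[q], T[q - p] + w)
    # The answer is the largest profit whose minimum weight still fits the budget.
    return max(q for q in range(Pmax + 1) if T[q] <= b)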


Approximation Algorithms

We need approximation algorithms because the exact DP algorithms are only pseudo-polynomial in the input size, and because a fast bound is exactly what the Branch and Bound algorithm will need later.

Greedy Algorithm

The greedy algorithm ranks the items by their value (measured as \(p_i/w_i\)), takes items in that order, and then takes a fractional amount of the first item that cannot fully fit into the budget.


          Rank all items by their value (p_i / w_i), highest first.
          while there is remaining budget:
            if the next item fits within the remaining budget:
              take it
            else:
              take a fractional amount of that item to exhaust the budget, then break
        

If the greedy algorithm's solution happens to be all integers, then it is the absolute optimum (this is almost impossible; if it happens on a Knapsack problem you made up yourself, you must be drunk).
This algorithm is extremely fast, and its fractional value provides an upper bound for a given Knapsack problem.
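
A minimal sketch of the fractional greedy described above; the name greedy_fractional is my own, and the return convention mirrors what the branch-and-bound pseudocode later assumes Problem.greedy() to return:

def greedy_fractional(P, W, b):
    # Rank item indices by value density p_i / w_i, best first (assumes positive weights).
    order = sorted(range(len(P)), key=lambda i: P[i] / W[i], reverse=True)
    taken, frac_item, value = [], None, 0.0
    budget = b
    for i in order:
        if W[i] <= budget:
            taken.append(i)
            budget -= W[i]
            value += P[i]
        else:
            # Take a fractional amount of the first item that does not fit, then stop.
            frac_item = i
            value += P[i] * (budget / W[i])
            break
    return taken, frac_item, value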

Claim I:

There does not exist any subset whose objective value is larger than the objective value of the greedy (fractional) solution (this is obvious if we view the greedy solution as the optimum of the LP relaxation).

Claim II:

The slack of the optimal solution is less than the minimum item weight in the greedy solution, and the optimal solution has an objective value that is at least 1/2 of the greedy algorithm's.

Proof:

\(\tilde{S}\): the integral solution produced by the greedy algorithm; \(S^+\): the optimal integral solution. Both are sets of indices.
Let k denote the slack variable: \[k := b - \sum_{i\in S^+} w_i\]
Given the optimality of \(S^+\), no element of \(\tilde{S}\) can be moved into \(S^+\) to increase its objective value, therefore: \[\min_{i\in\tilde{S}}\{w_i\} > k\] Now suppose the item with minimum weight in \(\tilde{S}\) satisfies \(\min_{i\in\tilde{S}}\{w_i\} > b/2\); then the greedy solution must consist of that single item (there is not enough budget for 2 such items), and the absolute optimum can be at most \(2\sum_{i\in\tilde{S}}p_i\), for otherwise \(S^+\) would contain an item with more value and less weight, contradicting how the greedy algorithm works.

For the other case, assume \(\min_{i\in\tilde{S}}\{w_i\} \leq b/2\); then \(k < b/2\), leaving not enough room for the absolute optimum to make more than double the profit of the greedy algorithm.

Dual with Rounding

The algorithm rounds the profits to integers and then uses the Dual DP (Dynamic Programming) to solve the problem. Here is the algorithm:
Define the multiplier as: \[m:= \frac{n}{\max_i\{p_i\}\epsilon}\] where \(0 < \epsilon < 1\); then define \(p_i' := \lfloor{m\cdot p_i}\rfloor\) as the rounded profits, and redefine \(p_i := m\cdot p_i\) (the rescaled profits).
Then use the Dual DP to solve the rounded problem.
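
Putting the two steps together, a minimal sketch (reusing knapsack_dp_dual_sketch from above; in the real code the knapsack_dp_dual routine from knapsack.core would take its place):

import math

def knapsack_fptas(P, W, b, eps):
    # Scale so that the largest profit becomes n / eps, then round down to integers.
    n = len(P)
    m = n / (max(P) * eps)
    P_rounded = [math.floor(m * p) for p in P]
    # Solve the rounded instance exactly with the dual DP.
    opt_rounded = knapsack_dp_dual_sketch(P_rounded, W, b)
    # Undoing the scaling gives a profit achievable by an actual feasible set, hence a
    # lower bound on the optimum; dividing by m * (1 - eps) instead gives the upper bound.
    return opt_rounded / m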

Claim I:

This algorithm asserts a lower bound relative to the optimal solution. Let \(p(S):= \sum_{i\in S}p_i\) (profits rescaled but not rounded), let the solution found with the rounded profits be \(\tilde{S}\), and let the absolute optimal solution be \(S^*\); then: \(p'(\tilde{S}) \geq p'(S^*) \geq (1 -\epsilon)p(S^*)\).

Proof:

First, \(p'(\tilde{S}) \geq p'(S^*)\) is obviously true: the DP is exact for the rounded inputs, so the absolute optimum cannot achieve a higher rounded profit than \(\tilde{S}\).
Start with these basic facts: \[\lfloor p_i \rfloor + 1 \geq p_i\] \[\lfloor p_i \rfloor \geq p_i - 1\] Summing over the at most \(n\) items of \(S^*\) gives: \[p'(S^*) \geq p(S^*) - n\] Now consider the quantity \(n\): \[\frac{n}{\epsilon} = m\cdot\max_{i}\{p_i\} \leq p(S^*)\] The \(p_i\) inside the maximum operator is not scaled, so \(m\cdot\max_i\{p_i\}\) is the largest rescaled profit, and \(S^*\) is at least as good as taking that single item (which is feasible by the preconditions). Hence \(n \leq \epsilon\, p(S^*)\), and: \[p'(S^*) \geq p(S^*) - n \geq p(S^*) - \epsilon p(S^*) = (1 - \epsilon)p(S^*)\]
Therefore this algorithm must give us a solution whose value is at least a \((1-\epsilon)\) fraction of the optimum.
Furthermore, it also gives us an upper bound on the solution: \[ \frac{p'(\tilde{S})}{1 - \epsilon} \geq p(S^*) \] and then: \[ \frac{p'(\tilde{S})}{m(1 - \epsilon)} \geq \frac{p(S^*)}{m} = Opt_{abs} \] which is an upper bound; as \(\epsilon \rightarrow 0\), \(m\) increases together with \(p'(\tilde{S})\), and the two sides eventually meet because the rounding error shrinks as the multiplier grows.

Claim II:

There is another rounding strategy that might appear more appealing, but I don't yet know how to prove it. The new multiplier is chosen to separate the profits in such a way that, after scaling, each item's profit lands in its own integer slot. Then m can be defined as: \[ m := \frac{2} { \underset{0 \leq i < j \leq n - 1}{\text{min}} \{|p_i - p_j|\} } \]

We need to assume that the profits of the items are unique to avoid the denominator being zero. In the case where all items have the same profit, no scaling or rounding is needed.

This scaling strategy ensures that the items' profits preserve their ordering after the scaling; it produces a smaller scaling factor and it "seems to" preserve the optimal solution (please contact me if you know how to prove or disprove this).
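
A minimal sketch of computing that multiplier, assuming distinct profits as stated above:

def ordering_preserving_multiplier(P):
    # m = 2 / (smallest pairwise profit gap); requires all profits to be distinct.
    gaps = [abs(P[i] - P[j]) for i in range(len(P)) for j in range(i + 1, len(P))]
    return 2 / min(gaps)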

Primal with Rounding

Primal Rounding rounds the weights of the items down to integers and then uses the primal DP to solve. This is generally a good idea; however, changing the constraint of the problem can be devastating and might even break one of the preconditions listed above.

The rounding strategy seeks a way of putting the weight of each item into its own integer slot so that the ordering of the items' weights is preserved after scaling and rounding. In this case, assuming the weights of all items are unique: \[ m := \frac{2} { \underset{0 \leq i < j \leq n - 1}{\text{min}} \{|w_i - w_j|\} } \]

Claim I:

Rounding the weights of the items down gives an upper bound, but that bound can be quite far off.

An Instance:

Consider these 2 items with the following weights: \[(1.099 + 0.1, 1.099 - 0.1)\] It is not hard to determine that the scaling factor will be \(m=10\). After scaling, the items have weights: \[(10.99 + 1, 10.99 - 1)\] Rounding down, we get: \[(11, 9)\] where each item's weight is off by 0.99. Now recover the weights by dividing the rounded values by the multiplier: \[(1.1, 0.9)\] so each item's weight is understated by 0.099.

In this case, the weights can be off by about 10% when the problem is solved with the rounded weights.
The key here is to choose items taking only 2 distinct values, and to choose those values so that the rounding error is maximized.
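
A quick numeric check of the instance above (the numbers in the comments are approximate because of floating-point noise):

weights = [1.099 + 0.1, 1.099 - 0.1]      # the two items from the instance above
m = 2 / abs(weights[0] - weights[1])      # smallest gap is 0.2, so m is about 10
scaled = [m * w for w in weights]         # about [11.99, 9.99]
rounded = [int(s) for s in scaled]        # [11, 9] after rounding down
recovered = [r / m for r in rounded]      # about [1.1, 0.9], each roughly 0.099 too small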

Now consider a more extreme case where the weights of the items take on only 2 values: \(\{1.499 + 0.5, 1.499 - 0.5\}\). Here the two weights are already a full unit apart, so effectively no scaling is needed (m = 1) and we can round down directly to get \(\{1, 0\}\); each item's weight is then off by 0.999, roughly 49.9% of the heavier item's weight.
Now imagine we have a list of items taking the values \[\lbrack 1.999, 0.999, 1.999, 0.999, \dots \rbrack\] repeating 2n times, so the errors can build up. Suppose the budget \(b\) is 2n, and assume that the items with weight \(0.999\) have no profit; then after rounding, the budget remains unchanged while each profitable item's weight is roughly halved, and as \(n\) gets arbitrarily large this lets us include about twice as many profitable items as the actual optimal solution.
Therefore, in the worst case, the upper bound obtained this way is about twice the optimal value of the solution. Comparatively, this is much slower than the greedy algorithm and yet has an equally bad worst case; how likely each of them is to hit its worst case is worth another lengthy discussion, but I believe it is unlikely.

Claim II:

(There is a way to place a better bound on the output, but it depends on the particular inputs to the algorithm.)

Why All This?

We are particularly interested in approximation algorithms because a tight upper bound serves the Branch and Bound algorithm well.

It's not smart to always use the same approximation algorithm for branch and bound either; depending on the particular inputs, some heuristics work better than others. Hence, if we can mix all the approximation algorithms together to produce the best upper bound, it will speed up the run-time of the Branch and Bound algorithm.


Branch & Bound Algorithm

The branch and bound algorithm uses a heuristic algorithm that produces an upper bound for a maximization problem. Here we are going to use different heuristics to aid the algorithm.

Determining Factors for B&B:

A branch and bound algorithm is an improvement on brute-force search for NP-Hard problems; instead of searching the entire solution space, it uses fast heuristic algorithms (approximation algorithms) to decide which subsets of the solution space it is going to investigate.
The 2 most important ingredients of the branch and bound algorithm are the bounding heuristic and the branching strategy.


The easiest heuristic to use is the greedy algorithm: it is fast, but not always very accurate.

Pseudocode

Define a class named Problem.

It has the following fields: I1, the indices of items already fixed to be included; I2, the indices of items still undecided; P and W, the profit and weight vectors; and B, the remaining budget.

It has one method, greedy(), which runs the fractional greedy heuristic on the undecided items and returns the indices of items taken integrally, the index of the fractionally taken item (None if the greedy solution is integral), and the objective value of the fractional solution.

The Code

Inputs: the root problem P, a warm-start incumbent solution S_star (a list of item indices), and its objective value Opt_star.

function:
def soln(P: Type[Problem], S_star, Opt_star):
    stack = [P]
    while stack is not empty:
        P0 = stack.pop()
        IntIdx, Frac, Opt = P0.greedy()
        if Opt > Opt_star:
            if Frac is None:
                Opt_star = Opt  # the greedy solution is all integers, hence optimal for this sub-problem
            else:
                # branch 1: force the fractional item in (budget shrinks by its weight)
                stack.push(Problem(P0.I1 + [Frac], P0.I2.remove(Frac), P0.P, P0.W, P0.B - P0.W[Frac]))
                # branch 2: force the fractional item out (budget unchanged)
                stack.push(Problem(P0.I1, P0.I2.remove(Frac), P0.P, P0.W, P0.B))
            if IntIdx has better profits than S_star:
                S_star = P0.I1 + IntIdx
    return S_star, Opt_star

Additional Notes

When the heuristic is set to the greedy algorithm, this branch and bound algorithm becomes equivalent to solving the same system of inequalities with the Branch and Bound algorithm for linear programs.

A mix of heuristics can be used; however, we would need to identify the characteristics of a sub-problem under which some other approximation algorithm gives a better bound or run-time than the greedy algorithm.

Finally, a "warm-start" feasible solution can be provided in advance to speed up the branch and bound process, this is helpful because a high value for the global objective value will cut some of the branching process.

In my code, I used the integral part of the greedy solution on the root problem as the warm start for the Branch and Bound algorithm, which is the cheapest warm start to obtain.
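
With the names from the pseudocode above, the warm start amounts to something like the following (P is the root Problem; the helper names are the ones assumed earlier, not the actual ones in my code):

IntIdx, Frac, Opt = P.greedy()             # fractional greedy on the root problem
S_star = IntIdx                            # integral part of the greedy solution
Opt_star = sum(P.P[i] for i in S_star)     # its integral objective value
S_best, Opt_best = soln(P, S_star, Opt_star)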

Pathological Inputs

We claim that the BB algorithm is exponential in the worst case; however, such cases are relatively rare for real-life problems. This particular pathological instance works the same way for the BB algorithm for linear programming, and it is taken from the course notes of Math 409; credit goes to Thomas Rothvoss, a professor at UW.

The pathological instance is formulated as the following: \[ \max \left\lbrace x_0 \,\middle|\, \frac{x_0}{2} + \sum_{i = 1}^n x_i = \frac{n}{2};\; x \in \{0, 1\}^{n+1} \right\rbrace \] with the constraints that \(n \geq 4\) and n is even (there is an odd number of items because the index starts at zero).

Proof:

Here, we assume that a warm-start solution is not given at the start of the algorithm.

Each node of the BB tree represents a sub-problem where some of the values \(x_i\) are fixed; model this with 2 index sets: \(I_0\) (variables fixed to 0) and \(I_1\) (variables fixed to 1).

At each node, we have a candidate solution \(\tilde{x}\), defined as: \[ \tilde{x}_i := \begin{cases} 1 & i = 0\\ 1 & i \in I_1\\ 0 & i \in I_0\\ \lambda & \text{else} \end{cases} \]

As long as \(|I_0| + |I_1| \leq \frac{n}{3}\) and \(\tilde{x}_0 = 1\), there is a feasible LP relaxation, so the node cannot be pruned yet.

Examine the following expression: \[ \frac{\tilde{x}_0}{2} + \sum_{i=1}^n \tilde{x}_i = \frac{1}{2} + |I_1| + (n - |I_0| - |I_1|)\lambda = \frac{n}{2} \] where the quantity \(n - |I_0| - |I_1|\) counts the variables that are still free and can be fractional.

Which gives: \[ \sum_{i=1}^n \tilde{x}_i = |I_1| + (n - |I_0| - |I_1|)\lambda = \frac{n}{2} - \frac{1}{2} \]

Remember that \(0 \leq |I_1| \leq \frac{n}{3}\) and \(|I_0| + |I_1| \leq \frac{n}{3}\), which implies: \[ n - |I_0| - |I_1| \geq \frac{2n}{3} \]

There always exists a choice of \(\lambda\) between zero and one such that the sub-problem is feasible. In addition, because n is even and there is a \(1/2\) on the right-hand side, the solution to the sub-problem must be fractional, hence non-integral.
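Concretely, solving for \(\lambda\) gives: \[ \lambda = \frac{\frac{n}{2} - \frac{1}{2} - |I_1|}{n - |I_0| - |I_1|} \] and under the bounds above the numerator is at least \(\frac{n}{6} - \frac{1}{2} > 0\) (for \(n \geq 4\)) and strictly smaller than the denominator, so \(0 < \lambda < 1\).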

As a result, the tree keeps branching to a depth of \(n/3\), meaning that the algorithm has exponential runtime.


Knapsack but Linear Programming

This is a linear programming problem in nature, and all of the above discussion is mostly of mathematical interest; in practice, reducing this problem to linear programming is the easiest route once a solver is available.
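
For instance, a minimal sketch using the PuLP modeling library (an assumption on my part; any MILP solver and modeling layer would do the same job):

import pulp

def knapsack_via_solver(P, W, b):
    prob = pulp.LpProblem("knapsack", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(len(P))]
    prob += pulp.lpSum(P[i] * x[i] for i in range(len(P)))        # objective: total profit
    prob += pulp.lpSum(W[i] * x[i] for i in range(len(W))) <= b   # knapsack constraint
    prob.solve()
    return [i for i in range(len(P)) if x[i].value() == 1]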

Extended Knapsack

The problem so far deals with binary choices, meaning for each item we only have the choice to include it or not. However, as we already showed inside the approximation algorithms, items can be rounded to the same weights, which is equivalent to having copies of the same item with a non-binary decision variable. Hence the new Knapsack problem is concerned with taking multiples of the same item, or more specifically: $$x_i \in \lbrack 0, c_i\rbrack \cap \mathbb{Z}_+$$ which is the only change made to the 0/1 Knapsack Problem.
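
With a solver, the only change relative to the sketch in the previous section is the variable declaration; for example, assuming c is the list of copy limits \(c_i\):

# c[i] is the copy limit for item i; x_i may now be any integer in [0, c[i]].
x = [pulp.LpVariable(f"x{i}", lowBound=0, upBound=c[i], cat="Integer")
     for i in range(len(P))]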

The major implementation challenge is an inevitable one: floating-point truncation errors. For this reason, we cannot get absolute optimality for every well-defined input. I think this is out of scope for this discussion of Knapsack and NP-Hardness, but I will review and analyze it in the future. The problem was discovered by feeding in items with an unrealistic distribution, which managed to produce inaccuracy in the optimum for both the COIN_CBC solver and my solver. This problem is profound and might be hard to address, but rational arithmetic is easy to do in Python, so we at least have something to fall back on; please don't worry about it too much, though it is a legitimate concern.

The branch and bound implementation of the Extended-Knapsack can be viewed here:

Branch and Bound Analogy

It's up to the reader to realize the trivial fact that the branch and bound algorithm for linear programming is equivalent to the branch and bound algorithm using greedy approximation as the heuristic.

Gomory Cut and Approximation

It's left as an exercise for the reader to contemplate the intuitive connection between using scaling and rounding as a heuristic inside the Branch and Bound algorithm versus using Gomory cuts on the linear programming relaxation.


Performance & Benchmark

For some readers, this section along with the intro might be the only parts they are interested in reading.

There is some huge disparity in time consumption here, so I will include all the Python code involved in producing this plot and then explain how it works.


from random import random as rnd
from numpy import random as np_rnd
from time import time
from knapsack.core import *
import sys; import os
from quick_csv import core
from quick_json import quick_json


def rand_problem_ints(range_:int, itemsCount:int, sparseness):
    assert 0 < sparseness < 1
    weights = [int(rnd() * range_) for I in range(itemsCount)]
    profits = [int(rnd() * range_) for I in range(itemsCount)]
    MaxWeights = int(sparseness * (sum(weights)))
    return profits, weights, MaxWeights


def run_solve_on(problemList, solver:callable):
    Time, Opt = [], []
    for P, W, B in problemList:
        Start = time()
        Value = solver(P,W,B)
        Time.append(time() - Start)
        print(f"Timer: {Time[-1]}")
        Opt.append(Value)
    return Time, Opt


def rand_problem_exponential(scale: int, itemsCount: int, saturation):
    assert 0 < saturation < 1
    Profits, Weights = np_rnd.exponential(scale=scale, size=(2, itemsCount))
    Profits, Weights = map(lambda x: int(x), Profits), map(lambda x:int(x), Weights)
    Profits, Weights = list(Profits), list(Weights)
    Budget = int(saturation*sum(W for W in Weights))
    return Profits, Weights, Budget


def bench_bb_with_dp(trials:int):

    def dp_solve(P, W, B):
        _, Opt = knapsack_dp_dual(P,W,B)
        return Opt

    def bb_solve(P, W, B):
        _, Opt = branch_and_bound(P, W, B)
        return Opt

    ItemCount = 20
    ItemProfitsWeightsRange = 1000
    KnapSackSparseness = 0.1
    ProblemList = [rand_problem_ints(ItemProfitsWeightsRange, ItemCount, KnapSackSparseness) for P in range(trials)]
    ProblemList += [rand_problem_exponential(ItemProfitsWeightsRange, ItemCount, KnapSackSparseness) for P in range(trials)]
    bb_time, bb_opt = run_solve_on(ProblemList, bb_solve)
    dp_time, dp_opt = run_solve_on(ProblemList, dp_solve)
    CSVHeader = ["bb_time", "dp_time", "bb_opt", "dp_opt"]
    CSVCols = [bb_time, dp_time, bb_opt, dp_opt]
    def MakeJsonResults():
        Json = {}
        Json["bb_time"], Json["dp_time"] = bb_time, dp_time
        return Json
    core.csv_col_save("bb, dp bench.csv", colHeader= CSVHeader, cols=CSVCols)
    quick_json.json_encode(MakeJsonResults(), filename="bb, dp bench.json")


    print("Tests Detailed: ")
    print(f"The same set of problem is run on both BB and DP, time and optimal value is recored. ")
    print("The optimal value for both solver should be the same and the time cost is the interests. ")
    print(f"Item Count: {ItemProfitsWeightsRange}, Item's Profits and Weight range: (0, {ItemProfitsWeightsRange}), "
          f"Knapsack Sparseness: {KnapSackSparseness}")


def main():
    bench_bb_with_dp(10)
    pass


def test():
    print(rand_problem_exponential(10000, 5, 0.5))


if __name__ == "__main__":
    print(f"swd: {os.getcwd()}")
    print(sys.argv)
    if len(sys.argv) != 1:
        test()
    else:
        main()

        

To compare the performance of the BB and DP algorithms, items with integer weights and profits are used, but the values are very large so that the instances are comparable to items whose profits and weights are floats.

There are 2 major ways a random problem is generated: by the "rand_problem_ints" and the "rand_problem_exponential" functions in the script.

Epilogue

We have explored 2 ways dynamic programming can solve the problem, seen how rounding the float values lets us approximate the solution, and finally used those approximations as heuristics for the Branch and Bound algorithm.

Throughout the exploration, we observed the intricacy of the zero-one knapsack problem and saw that Dynamic Programming is just the beginning: this problem in particular makes a very good reduced example for LP, where the simplex solving routine is substituted with the heuristics we developed in this session.

I wish to write more on this, but I have to pause the project here because of time limitations. There are still a lot of problems and subjects left unexplored, and many more connections to discover from this problem to other areas of mathematics and computer algorithms.