CPSC 320, Intermediate Algorithm Design and Analysis, at UBC
Algorithms
Algorithms are ordered sequences of instructions for finding a solution to a problem. A problem takes an input in some data structure and accepts a set of solutions in some data structure according to problem-specific requirements. An important branch of computer science is dedicated to designing algorithms for a given problem, and some algorithms are better than others.
Algorithm Analysis
The field of algorithm analysis is dedicated to evaluating and comparing the complexities of algorithms. Complexity is the amount of resources (space, time) an algorithm uses to arrive at a solution.
Reduction
Reduction is a technique in Algorithm Analysis
Reduction involves mapping a problem input to an input for a different problem, solving that new problem, and then mapping the new problem's solution to a solution to the original problem.
Reduction is useful in algorithm analysis because if a problem A can be successfully reduced to problem B, then any algorithm for B also solves A, so B takes at least as many resources as A, up to the cost of the mappings. For the reduction to be useful, the resources required for the mappings should be insignificant compared to the resources required to solve problem B.
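As a toy illustration (not from the course), here is the shape of a reduction in Python: finding a maximum is reduced to finding a minimum by mapping the input, solving the new problem, and mapping the solution back.

def solve_max(xs):
    # Toy reduction: MAX reduces to MIN.
    # Map the input of MAX to an input of MIN...
    mapped = [-x for x in xs]
    # ...solve the new problem with an existing MIN solver...
    min_solution = min(mapped)
    # ...and map MIN's solution back to a solution of MAX.
    return -min_solution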
A gadget is a unit of a problem specifically arranged to represent a fundamental unit of another problem. A good strategy for reductions is forming gadgets and then figuring out how to combine them to represent any given problem of the other type.
To prove that a reduction works, we must prove that a solution to problem A is correct if and only if the corresponding solution to problem B is correct (implications in both directions).
Asymptotic analysis
Asymptotic analysis is a technique in Algorithm Analysis
With respect to algorithms, asymptotic analysis is used to classify algorithms by how their complexities grow with the size of the input.
An asymptote of a function is another function such that the difference between the two approaches zero as one of the parameters of the functions approaches infinity.
Big O: $f \in O(g)$ means $f$ grows no faster than a constant multiple of $g$ (an asymptotic upper bound).
Big Omega: $f \in \Omega(g)$ means $f$ grows at least as fast as a constant multiple of $g$ (an asymptotic lower bound).
Big Theta: $f \in \Theta(g)$ means $f$ is bounded above and below by constant multiples of $g$ (a tight bound).
Small O: $f \in o(g)$ means $f$ grows strictly slower than $g$.
Small Omega: $f \in \omega(g)$ means $f$ grows strictly faster than $g$.
Many algorithms operate by iterating through possible solutions, improving them each time, and returning once the optimal solution is reached. Sometimes the algorithm can get "lucky" by arriving at the optimal solution in the first iteration (best case), and other times it might have to go through some maximum number of solutions first (worst case). It is common to characterize an algorithm by how the complexity grows in both of these cases separately. The average case is also sometimes considered.
Master theorem
Master theorem is used for recursive algorithms in Asymptotic analysis
Some algorithms call themselves on smaller inputs in order to solve a problem. This is called recursion. For example, merge sort sorts an array by calling itself on each half of the array.
The Master theorem is a method for obtaining the Big Theta class to which the solution of a recurrence belongs.
Given a recurrence of the form $T(n) = a\,T(n/b) + f(n)$, where $a \ge 1$ and $b > 1$:
If $f(n) = O(n^{\log_b a - \epsilon})$ for some $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$.
If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \log n)$.
If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some $\epsilon > 0$, and additionally $a\,f(n/b) \le c\,f(n)$ for some constant $c < 1$ and all sufficiently large $n$, then $T(n) = \Theta(f(n))$.
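As a quick worked example (binary search's recurrence, stated here for illustration): $T(n) = T(n/2) + \Theta(1)$, so $a = 1$, $b = 2$, and $n^{\log_b a} = n^0 = 1$. Since $f(n) = \Theta(1) = \Theta(n^{\log_b a})$, the second case applies and $T(n) = \Theta(\log n)$.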
Algorithm Design
We use algorithm analysis to guide how we design algorithms. A goal of computer science is to design algorithms which have the smallest worst case asymptotic run time. Another concern is the optimality of algorithms. To claim an algorithm provides the optimal solution to a problem, we must use logical proofs.
Greedy algorithm
Greedy algorithm is an Algorithm Design
A Greedy algorithm is an algorithm that makes the locally best choice at each step, maximizing a single value in each iteration of decision making.
Consider the problem of lighting the minimal number of lampposts on a street, given the length of the street, the positions of lampposts, the light radius of a lamppost, and the requirement that the whole length of the street be lit.
A greedy algorithm for solving this problem may repeatedly light the farthest lamppost that keeps the street covered, maximizing the distance between consecutive lit lampposts while ensuring no two are more than 2x the light radius apart (so no stretch of street is left dark).
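A minimal sketch of this greedy in Python (hypothetical interface; assumes positions is sorted ascending and a lamppost at p lights the interval [p - radius, p + radius]):

def min_lampposts(street_length, positions, radius):
    # Greedy: repeatedly light the lamppost that extends coverage farthest right.
    lit = []
    covered = 0                 # the street is lit on [0, covered]
    i, n = 0, len(positions)
    while covered < street_length:
        best = None
        # Among lampposts that can light the point `covered`,
        # take the one whose light reaches farthest to the right.
        while i < n and positions[i] - radius <= covered:
            best = positions[i]
            i += 1
        if best is None or best + radius <= covered:
            return None         # gap: no lamppost extends the coverage
        lit.append(best)
        covered = best + radius
    return lit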
There may or may not be an optimal greedy algorithm for a given problem. When no greedy algorithm is optimal, it is typically because locally best choices can trap the algorithm at a local optimum that is not the global one.
Here are two useful proof patterns for proving the optimality of a greedy algorithm: greedy stays ahead, and the exchange argument. To prove optimality we need to prove that the greedy solution is at least as good as any other solution.
The greedy stays ahead proof is a proof by induction: assuming the greedy solution is at least as good as any other solution after the first k decisions, show that it is still at least as good after decision k + 1.
The exchange argument starts from an arbitrary optimal solution. Then, without loss of generality, prove that exchanging an element of this solution for one that brings it closer to the greedy solution does not make the solution worse.
If this is true, then exchanges can be made until the optimal solution is transformed into the greedy solution, and the solution is made no worse, so the greedy solution is also optimal.
Divide and Conquer
Divide and Conquer is an Algorithm Design
Divide and conquer is an algorithm design where a problem is recursively broken into subproblems until they are trivial; the solutions to these subproblems are then combined until a solution to the original problem is formed.
An example of a divide and conquer algorithm is merge sort. In merge sort an array is recursively split in half until each subproblem contains only one element; a one-element sort problem is trivial. The sorted arrays returned as solutions to subproblems are then merged until they form one sorted array.
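A minimal merge sort sketch in Python (for illustration; not taken from the course):

def merge_sort(arr):
    # Divide: split until trivially sorted (0 or 1 elements).
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    # Conquer: merge the two sorted halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged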
As discussed in Master theorem, recursive algorithms like this give rise to recurrences; for merge sort, $T(n) = 2\,T(n/2) + \Theta(n)$, which the master theorem resolves to $\Theta(n \log n)$. The master theorem is used to do asymptotic analysis on these recurrences.
Prune and Search
Prune and Search is an Algorithm Design
Prune and search is an algorithm design where the set of possible solutions is recursively pruned until only one solution remains.
An important prune and search algorithm is binary search: given a sorted array, find the index of some query element. Binary search checks the middle element of the array, returns its index if it matches the query element, and otherwise calls itself on the half of the array that could contain the query (the larger half if the query is larger than the middle element, the smaller half otherwise).
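A minimal iterative sketch in Python (illustrative only):

def binary_search(arr, query):
    # Prune half of the remaining candidates each iteration.
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == query:
            return mid
        elif arr[mid] < query:
            lo = mid + 1   # query can only be in the right half
        else:
            hi = mid - 1   # query can only be in the left half
    return None            # query not present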
Prune and search is similar to divide and conquer in that it recursively creates subproblems, but is different because prune and search only considers one subproblem per iteration and there is no recombining of subproblem solutions.
Memoization
Memoization is an Algorithm Design
Memoization is an algorithm design where the problem is recursively split into subproblems and the solutions to these subproblems are stored in a table (the memoization table) so that each is computed no more than once.
Here is a memoized algorithm for the Fibonacci sequence:
def fib_mem(n, memo={}):
    # Return the n-th Fibonacci number, caching results in memo
    # (note: the default memo dict persists across calls).
    if n in memo:
        return memo[n]
    elif n == 1 or n == 0:
        return n
    else:
        # Cache the result so each subproblem is solved at most once.
        memo[n] = fib_mem(n - 1, memo) + fib_mem(n - 2, memo)
        return memo[n]
Memoization can reduce the run time of algorithms significantly. If the above algorithm recomputed the Fibonacci numbers for every subproblem, the solution would take exponential time, because increasing n by 1 would roughly double the number of calls to fib_mem. With memoization, the number of calls only increases by a constant as n is incremented.
Dynamic programming
Dynamic programming is an Algorithm Design
Dynamic programming is similar to memoization except it is iterative instead of recursive. A dynamic programming algorithm is bottom-up, meaning it begins at the trivial subproblems and works up to the original problem. This is unlike memoization, which is top-down: it begins at the original problem and splits into subproblems until the trivial cases are reached.
Here is the same algorithm written bottom-up:
def fib(n):
    # Fill a table of Fibonacci numbers from the base cases up.
    table = [0, 1]
    for i in range(2, n + 1):
        table.append(table[i - 1] + table[i - 2])
    return table[n]
Problems
There are two types of problems: decision problems, where the solution is one of true or false, and optimization problems, where the solution maximizes/minimizes some value.
For every optimization problem there is a corresponding decision problem: is there a solution where the value is at least (or at most) some threshold? If we can solve this decision problem in polynomial time, then we can binary search over the threshold, adding only a logarithmic factor, to obtain a solution to the optimization problem in polynomial time.
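A sketch of this idea, assuming a hypothetical monotone decision oracle decide(k) that is True for every k up to some unknown threshold (including lo) and False beyond it:

def max_feasible(lo, hi, decide):
    # Binary search over the answer: find the largest k in [lo, hi]
    # with decide(k) True, using O(log(hi - lo)) oracle calls.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if decide(mid):
            lo = mid       # mid is feasible; look for something larger
        else:
            hi = mid - 1   # mid is infeasible; so is everything above
    return lo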
NP Complexity
NP Complexity is a complexity class for Problems
Being in NP means that given a solution, there exists an algorithm that can verify it is correct in polynomial time.
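For instance, a polynomial-time verifier for Subset sum (defined below) might look like this sketch, where the certificate is a proposed list of distinct indices:

def verify_subset_sum(nums, target, certificate):
    # Verifying a proposed solution takes linear time,
    # even though finding one appears to be hard.
    if len(set(certificate)) != len(certificate):
        return False                      # indices must be distinct
    return sum(nums[i] for i in certificate) == target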
P is the set of problems that are, even in worst case, solvable in polynomial time. All problems in P are also in NP.
NP-Hard is the set of problems that are at least as difficult as any problem verifiable in polynomial time (in NP). For a problem to be in NP-Hard, every problem in NP must be reducible to it in polynomial time.
NP-complete is the intersection of NP and NP-Hard. Solutions to these problems are verifiable in polynomial time (in NP), and the problems are at least as hard as any other problem where solutions are verifiable in polynomial time (in NP-Hard).
The P vs NP problem is an unsolved problem in computer science. It asks: is every problem whose solutions are verifiable in polynomial time also solvable in polynomial time? It is generally assumed (but not proven) that P ≠ NP.
The Cook-Levin theorem shows that any problem in NP can be reduced to the SAT problem in polynomial time. This makes SAT NP-Hard by definition. The SAT problem is also in NP, therefore it is NP-Complete.
The SAT problem involves finding a valid assignment of variable truth values for a given boolean formula. For example, $(x_1 \lor \lnot x_2) \land (x_2 \lor x_3)$ is satisfied by setting $x_1$, $x_2$, and $x_3$ all to true.
We can view boolean formulas (in conjunctive normal form) as conjunctions of disjunction statements called clauses, such as $(x_1 \lor \lnot x_2)$, where each clause is a disjunction of variables or their negations.
There is a proof that the SAT problem can be reduced in polynomial time to a special case where every clause has exactly 3 literals. This is called 3-SAT.
Because SAT can be reduced to 3-SAT in polynomial time, and SAT is NP-Complete, we know 3-SAT is NP-Complete. This is very useful: a common proof of NP-Completeness for a problem is reducing 3-SAT to that problem in polynomial time.
NP
NP is a class of problems concerning NP Complexity
NP is the set of problems where a solution can be verified in polynomial time.
NP Hard
NP Hard is a class of problems concerning NP Complexity
NP-Hard is the set of problems that are at least as difficult as any problem verifiable in polynomial time (in NP). For a problem to be in NP-Hard, every problem in NP must be reducible to it in polynomial time.
A famous NP-Hard problem is the travelling salesman problem.
NP-Complete
NP-Complete is in NP Hard
NP-Complete is the intersection of NP and NP-Hard. A solution to these problems can be verified in polynomial time, and they are at least as hard as every other problem in NP.
A notable problem here is the integer factoring problem: given an integer, provide its integer factors. It is in NP, but it is not known to be NP-Complete, nor known to be in P (it is a candidate NP-intermediate problem). Its presumed hardness is used in RSA encryption, where two prime numbers (the private key) are used to decrypt information and their product (the public key) is used to encrypt. The public key is available to anyone so that anyone can encrypt information, but only those with the private key can decrypt it. The presumed hardness of such problems is also an enabling concept behind the cryptography used in blockchains. But blockchain is cringe, bad, and not useful.
If any NP-Complete problem were solvable in polynomial time, then every problem in NP would be, and P would equal NP.
SAT
SAT is a Problem
3-SAT
3-SAT is a specific case of SAT
SAT is reducible to 3-SAT. As SAT is NP-Complete, we know 3-SAT is NP-Complete. This means we can prove other problems NP-Complete by reducing 3-SAT to them in polynomial time.
Stable matching
Stable matching is a Problem
The Stable matching problem: given a set of applicants and a set of employers, each of whom has a preference order over the other side, find a stable matching. A stable matching is one where no applicant and employer prefer each other over those they are matched with.
The Gale–Shapley algorithm solves the stable matching problem. Applicants apply to employers in the order of their preference lists. Each employer tentatively gives an offer to the best applicant it has seen so far, revoking any previous offer. Applicants without an offer apply to their next choice. This continues until everyone has an offer.
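A minimal sketch in Python (illustrative; assumes equal numbers of applicants and employers, with complete preference lists given most-preferred first):

def gale_shapley(applicant_prefs, employer_prefs):
    # rank[e][a]: position of applicant a in employer e's list (lower is better)
    rank = {e: {a: i for i, a in enumerate(prefs)}
            for e, prefs in employer_prefs.items()}
    next_pick = {a: 0 for a in applicant_prefs}  # next employer to apply to
    match = {}                                   # employer -> applicant
    free = list(applicant_prefs)
    while free:
        a = free.pop()
        e = applicant_prefs[a][next_pick[a]]
        next_pick[a] += 1
        if e not in match:
            match[e] = a                 # first offer: tentatively accept
        elif rank[e][a] < rank[e][match[e]]:
            free.append(match[e])        # revoke the previous offer
            match[e] = a
        else:
            free.append(a)               # rejected: apply to next choice
    return match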
Clustering
Clustering is a Problem
Given a set of vectors (points) and a number k, partition the points into k subsets (clusters) so that the minimum distance between points in different clusters (the spacing) is maximized.
A greedy algorithm is optimal for solving this problem. We calculate the distances between every pair of points, sort the pairs by distance ascending, and then repeatedly merge the two closest clusters (Kruskal-style) until only k clusters remain.
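A sketch of this single-linkage greedy in Python (names hypothetical; assumes points are hashable and dist is any distance function):

from itertools import combinations

def max_spacing_clusters(points, k, dist):
    # Union-find over points; merge closest pairs until k clusters remain.
    parent = {p: p for p in points}
    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p
    edges = sorted(combinations(points, 2), key=lambda pq: dist(*pq))
    clusters = len(points)
    for p, q in edges:
        if clusters == k:
            break
        rp, rq = find(p), find(q)
        if rp != rq:
            parent[rp] = rq                # merge the two closest clusters
            clusters -= 1
    groups = {}
    for p in points:
        groups.setdefault(find(p), []).append(p)
    return list(groups.values())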
Subset sum
Subset sum is a Problem
Given a multiset (a set where elements may appear multiple times) of integers, is there a subset whose elements sum to a given target value?
This is a decision problem. It is NP-Complete.
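Although subset sum is NP-Complete, it has a pseudo-polynomial dynamic programming algorithm, running in O(n * target) time. A sketch, assuming non-negative integers:

def subset_sum(nums, target):
    # reachable[s] is True iff some subset of the numbers seen so far sums to s.
    reachable = [False] * (target + 1)
    reachable[0] = True                       # the empty subset sums to 0
    for x in nums:
        for s in range(target, x - 1, -1):    # descending: use each x at most once
            if reachable[s - x]:
                reachable[s] = True
    return reachable[target]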
Graph problems
Graph problems are Problems
Bipartite matching
Bipartite matching is a Graph problem
3-Dimensional matching
3-Dimensional matching is a Graph problem
Steiner tree
Steiner tree is a Graph problem
Minimum spanning tree
Minimum spanning tree is a Graph problem
Hamiltonian path
Hamiltonian path is a Graph problem
Traveling salesperson
Traveling salesperson is a Graph problem
Vertex cover
Vertex cover is a Graph problem
Graph colouring
Graph colouring is a Graph problem