# Algorithm Design and Analysis

These notes are for a version of the course taught by Prof. Bahman Kalantari in the spring of 2017.

Professor’s Website - http://www.cs.rutgers.edu/~kalantar/

Grading Policy:

- Homework & Quizzes: 15%
- Midterm I & II: 20% each
- Final Exam: 45%

Exam 1 Date: Feb. 15th, 2017

## Lecture 1 - 1/18/2017

The aim of this course is to study algorithms - routines or sets of steps to solve problems.

In this course we wish to analyze the runtime and memory efficiency of algorithms. We will explore how to best go about designing algorithms which have optimal runtimes.

## Lecture 2 1/23/2017

In this lecture we explore the growth of functions and some asymptotic notation.

Given two functions which map natural numbers to all real numbers

We write

To put that into english we can that that, if and only if (double implication) there exists some constant which is greater than 0 such that the condition that is satisfied for every which is greater than some reasonable then we can say that . (f is big-oh of g)

Example: , Our restriction is that is 1

**Q**: In this case, is ?

**A**: Yes, if we set then we get which is true for any which is greater than or equal to 1.

There are two other definitions similar to big-oh as well.

**Definition 2**

**Definition 3**

**Definition 4**

(Little-Oh)

**Review - The L’Hopital Rule**

Given two functions such that then we can also say that

**Theorem**

Suppose two functions: and . Our claim is that

Also suppose we look at functions that are exponential in nature (vs polynomial functions). Given the two functions and

Then if we continue to differentiate we find that we get .

If we are then to continuously differentiate our other function we get

Then if we are to take the limit of the ratio (now that we have applied L’Hopital’s rule) we get

We see from this that exponential function of a base raised to an input value power is infinitely worse than a high order polynomial function.

**Transitive Property of Big-Oh**

Given

**Fibonacci Sequence**

The sequence is defined as

A naive algorithm might do something like

```
def fib(n)
if n == 0 return 0
if n == 1 return 1
if n > 1 return fib(n-1) + fib(n-2)
```

However this algorithm isn’t efficient because we keep computing many of the same values over and over again for just a single calculation, while a simpler lookup array could be far faster.

Another way to calculate fibonacci values is via the following equation

Homework in the book. Pages 8 and 9.

- 1(a)-(j)
- 2
- 3
- 4(a)

### Lecture 3 - 1/25/2017

Review of last lecture:

Two functions

**Big Oh**

**Omega**

**Theta**

**Little Oh**

Given , and , which is larger (asymptotically)?

Assume

- Take log of each
- We immediately see that is smaller asymptotically smaller than

Given the functions floor and ceil we know

We also know that

We should also say that

Which gives us that is bounded.

Useful Summation Formulas

What about the summation: ??

We can generalize and make it

Say

Then

From this we can simply see that

Moving on to algorithms - **searching for a key in a sorted array**

A sorted array is an array with the condition such that

- One option is to run binary search.

Let be the worst case number of comparisons for binary search. We can then say that

We can say now that the worst case is , but what about the average case? Or the best case.

Let be the average # of comparisons in binary search, assuming all numbers are distinct.

Given that and

For Let the number of positions where is takes comparisons to find the correct location.

Assume the following table for

lb | L1 | L2 | L3 | L4 | L5 | rb | |

loc | 3 | 2 | 3 | 1 | 3 | 2 | 3 |

The average time is then given by

In more mathematical terms there are locations plus gaps which gives us locations.

The Average time then becomes

### Lecture 4 - 1/30/2017

Given a list of elements, we want to determine whether a specific element is contained in a list of the elements. We define

- - The worst case run time
- - The average case run time

In binary search we fine that .

Representing the number of items as giving

To calculate the average case we get

where is the number of position where it takes comparisons.

#### Divide and Conquer Algorithms

Many algorithms are recursive in nature and call themselves one or more times. This is called using a “divide and conquer” approach because the algorithms solve small subproblems before coming back together to complete the entire problem.

Steps

- Divide the problem into smaller subproblems
- Conquer: Solve the subproblems recursively. If the subproblem size is small we solve it with a straightforward approach.
- Combine solutions for subproblems into solutions for the original problem.

One of the classic divide and conquer problems is **mergesort**.

The key problem in mergesort is combining two sorted lists. The function might be defined as where

- A is an array
- p, q, and r are indices.

Assume is sorted and is sorted

The best approach will take no more than comparisons. This works by using two pointers and incrementing one of them on the list until we find a value which is larger than the other pointer. We switch pointer comparisons until we reach the end of one list. Then we append the rest of the other list.

Example:

```
Given the list
5 2 4 6 1 3 2 6
Show the mergesort steps
5 2 4 6 | 1 3 2 6
5 2 | 4 6 | 1 3 | 2 6
5 | 2 | 4 | 6 | 1 | 3 | 2 | 6
2 5 | 4 6 | 1 3 | 2 6
2 4 5 6 | 1 2 3 6
1 2 2 3 4 5 6 6
```

The algorithm for mergesort can be written:

```
margesort(A, p, r):
if p < r
q <-- floor( (p + r) / 2)
mergesort(A, p, q)
mergesort(A, q+1, r)
mergesort(A, p, q, r)
```

To analyze this type of algorithm we can relate the overall runtime to the runtime of the subproblems.

For mergesort, given number of comparison to sort a list of numbers with mergesort

This formula can be generalized to the form

Where

- number of subproblems
- = size of each subproblem
- time to divide
- time to combine

If we apply the mergesort algorithm to this to analyze the runtime we can first state that

Where

If we plug in into our final answer we find

In multiplying two numbers together we find that the way we normally do it on paper takes approximately operations. Wheras the c algorithm implemented in computers are able to do it in approximately

### Lecture 5 - 2/1/2017

*Review*

In divide and conquer algorithms we can represent the runtime as the following function

- : Number of subproblems
- : size of each subproblem
- : Cost to divide
- : Cost of combining

Proving that the non-base case for is

More proofs for solving recurrence relations

Using recurrence relations we can find the runtime of divide and conquer algorithms.

**Master Theorem** -

Where:

and

### Lecture 6 - 2/6/2017

Quicksort

- Split and compare elements of the from from left to right with A[1] starting with A[2]
- If an element is greater than A[1] it stays in place
- Otherwise, swap with the first key which is either greater than or equal to it in the greater section

- Finally swap A[1] with the last of the smaller keys (n-1 comparisons)

For a much better explanation with visuals see this link

In the worst possible case, quicksort is

With randomized recursion where we choose a random pivot index from 1 to n, then which is the average number of comparisons in randomized recursion.

We get that
**Theorem**: $$A(n) \leq cn\cdot log(n)

### Lecture 7 - 2/8/2017

*Sorry! No notes for today*

### Lecture 8 - 2/13/2017

Heaps

A structure which satisfies the heap property:

In human terms this means that given a key that we can say that a structure satisfies the heap property if the parent element of is always greater than or equal to its children.

This is different from a binary tree where we know the relationship between the left and right nodes must be that the left node is always less than the right. In a heap this isn’t true. In a heap all we know is that the parent is always greater than the children.

So when popping a new item from a heap we must ensure that the heap property is still satisfied. In order to do this we remove the item from the heap, then we fix the heap by sifting the new largest value of the roots children as the new root. Then look at that nodes children and bring the larger one up, and so on.

The recurrence relation for heapsort is

If we break this down by setting we can get

Then breaking down the recurrence relation we

By going until we get

## Lecture 10 - 2/27/17 - Graphs

Graphs are special constructs which include vertices and edges. We represent a graph by

Where represents the set of all vertices in the graph. represents the edges in the graph. When drawing a graph, an edge is associated with two vertices and is typically represented by a line drawn between the two vertices which it connects.

### BFS (Breadth-First Search)

Breadth-first search means starting from a specific node and searching for a value within a graph by exploring all nodes within a similar distance of the first before moving outwards to search more distant nodes.