proof by induction
proof by contradiction
proof by contrapositive
dis-proof by counter example
axiomatic definition of ADT behavior
List ADT
Stack ADT
Queue ADT
Bounded Stack ADT
Bounded Queue ADT
Map ADT
Set ADT
MSet ADT (multiset)
List data structure
general ops: get kth item
insert at slot k
delete at slot k
implementing lists:
as array
as linked cells
Sorted list are especially nice
order information can assist in list searching
binary search can find an item in O(log N) time worst case
special lists:
Stack
general ops: push, pop, top, empty, size
keep pointer to most recent slot (t)
push is insert at slot t
pop is delete from slot t
push, pop are O(1) since we access the top slot ddirectly via pointer
no searching through the list, no rearrangement of locations
this applies to implementations by links or array
Queue
general ops: add, remove, front, empty, size
keep pointers to front slot, back slot
add is insert at slot front
remove is delete from slot back
add, remove are O(1) since we access the slots directly by pointer
no searching through the list, no rearrangement of locations
this applies to implementations by links or array
Bubble sort
O(N^2) time required to sort list of N items
pro: easy to program
con: very slow, useless for sorting lists of any serious size
Tree data structure in general (n-ary tree)
Knuth transform, converts n-ary tree to binary tree
Binary Tree data structure
Binary Search Tree
Tree balancing: AVL, Splay
traversals:
in-order, pre-order, post-order
in-order gives sorted list from a BST
pre-order, post-order give lists dependent on tree structure
BST Sort
Not a real sorting algorithm name, but a way to remember that one fairly
efficient way to sort a list is to first build a binary search tree,
then traverse it in the right way
1) build a BST from the list: O(N log N)
one insert in a BST: O(log N) for tree with N elments
doing N inserts: O(N log N)
2) in-order traversal to get the sorted order of elements
O(N) since we have to visit each of the N nodes in the BST
So total worst case time complexity is
O(N log N) + O(N) which is O(N * (1+ log N))
which is O(N log N)
Pigeon hole principle
Hash Table, hash function
how to pick table size
how to make an effective hash function
what to do about collisions
hash into linked lists (or other collection data structures)
table size should be prime, and same the size as # elements expected
probing to find an open cell in the table
linear probing function
quadratic probing function
"expo" probing function
table size should be prime, and twice the # elements expected
rehasing to extend the table size
"Big Oh" notation for expressing growth rates of functions
List of length N
insert is O(N)
remove is O(1) once you find the element
find is O(N)
Queue containing N items
add is O(1)
remove is O(1)
front is O(1)
can't get middle elements like general List
Stack containing N items
push is O(1)
pop is O(1)
top is O(1)
can't get middle elements like general List
Binary Search Tree with N nodes
find is O(log N)
insert is O(log N)
remove is O(log N) because we have to find it to remove it
traversals are O(N) (list every element in some order)
we get ordering (sorting) information from the tree structure
degenerate case of BST is a List
Hash table of N elements
insert is O(1)
find is O(1)
delete is O(1)
get no ordering information from the table
Set
basic ops: add, delete, isIn (find), empty, delete
sets can be implemented as BST to make operations take
O(log N) worst case
add is O(log N) like insert in BST
delete is O(log N) like delete in BST
find is O(log N) like find in BST
isIn is a version of find, returns boolean rather than value record
empty is O(1)
Map
basic ops: (assume implemented as BST)
add (key, value) pair is O(log n)
delete (key, value) pair is O(log n)
get (key) is O(log N), returns the value stored with this key
can also get order-dependent information like
findMin, findMax in O(log N)
sort items in O(N)
basic ops: (assume implemented as Hashtable)
add (key, value) pair is O(1)
delete (key, value) pair is O(1)
get (key) is O(1), returns the value stored with this key
no order-dependent ops are possible
binary heap
binary tree with two properties:
heap order property
structure property
always complete binary tree, last level filled from L to R
representation of binary heap as array
heaps can be Min or Max
priority queue
implement priority queue as binary heap
operations:
getMin (Max)
O(1) worst, avg since it is at the root of the heap tree
removeMin (Max)
O(log N) worst case
O(log N) average case
add
O(log N) worst case
O(1) average case
build
nieve is O(N log N) by doing N adds
cleaver is O(N)
axioms for alternate version of priority queue
items are (value, priority) pairs
heap sort
build a binary heap
do repeated removeMin() to create the sort
basic graph theory
vertex (node), edge (arc), edge weight
undirected graph
directed graph
path in a graph
cycle in a graph
bi-partite graph
complete graph
planar graph
depth-first search (traversal) in a graph (use a stack)
breaadth-first search (traversal) in a graph (use a queue)
representation of a graph
adjacency matrix
useful when graph is dense ( |E| approx.= |V|^2 )
adjacency lists
useful when graph is sparse ( |E| << |V|^2 )
although an undirected edge between nodes A and B is the
same theoretically as two directed edges A --> B and B --> A,
we often represent them that way in adjacency lists for
an undirected graph
topological sort
Dijkstra's algorithm for shortest paths
spanning tree
minimum spanning tree
Kruskal's algorithm for MST
Prim's algorithm for MST
uses for graphs: course prerequisite structure, maps,
modeling digital networks, airline and auto routes,
flow of traffic, work flow in an organization,
finite state machines, document structure, hypermedia,
linguistic representations, all tree uses (trees are graphs)
tons and tons of stuff
Euler path
path through a graph that crosses each edge exactly once
Euler circuit
EP that begins and ends on the same node
Hamiltonian path
path through a graph that visits each node exactly once
Hamiltonian circuit
HP that begins and ends on the same node
Traveling salesman and P/NP/NP-complete
Halting problem and uncomputability
sorting
we can prove that no sort that uses only pairwise comparisons
can do better than N log N best case. This means that all
comparison sorts are Big-Omega(N log N), bounded below.
They might be worse, however, and some are.
insertion sort
O(N^2) worst case
sometimes used as the "finisher" in quicksort
selection sort
O(N^2) worst case
similar in concept to insertion sort but operationally it
manipulates the array a bit differently
heap sort
O(N log N) best case
O(N log N) worst case
O(N log N) average case
merge sort
O(N log N) best case
O(N log N) wosrt case
O(N log N) average case
cost: uses lots of extra memory, needs a second array as large
as the one you are sorting
quicksort
O(N log N) best case
O(N^2) worst case
O(N log N) average case
does not need much extra space
get extra efficiency by cutting off recursion when the lists
become smaller than, say, length 15 and then doing some other
sort like insertion sort on the small lists to finish up
bogosort
random shuffling until you get the one that is in order
O(infinity) worst case
an extreme, just an illustration, not a serious shuffle
sorting into buckets
a way to sort in O(N) time
can be used when the full range of possible input values is known
allocate an array of int A and initialize all slots to 0
then when you see an input number n, do A[n]++
this seems to violate the Big-Omega(N log N) proof, but this
sort does not do only pairwise comparisons. It effectively
does lots of comparisons at once by putting a number right
into an array slot. Since we believe TANSTAAFL, the tradeoff
is that is uses lots of memory (not all array slots are filled)
and you have to know in advance the range of inputs.
Other O(N) sorts
we can also sort N items in N time if we use parallel hardware
the Big-Oh complexity results for our sort algorithms have
all assumed we were running them on a single processor.
sorts are either stable or unstable
stable means it preserves the original order of items with
the same value when there are duplicate values in the input.
often a sort can be implemented to be stable or unstable, and
in many cases making it stable sacrifices some efficiency
bubble sort is stable
insertion sort is stable
selection sort as we did it in class is not stable
it can be made stable with no change in Big-Oh complexity
but at a cost in practical efficiency
merge sort is not necessarily stable but is easy to make so
(just make sure the merge takes from the left-most array if
it can for duplicates)
quick sort is not stable
Analysis of merge sort
solve the recurrance equation analyzing the behavior of mergesort
T(N) = 2*T(N/2) + N
we want to solve this so that we get something like
T(N) = ....
where the function T does not appear on the right hand side
we find T(N) is O(N log N)