COMP 410 Topics Page

proof by induction
proof by contradiction
proof by contrapositive
dis-proof by counter example


axiomatic definition of ADT behavior
  List ADT
  Stack ADT
  Queue ADT
  Bounded Stack ADT
  Bounded Queue ADT
  Map ADT
  Set ADT
  MSet ADT (multiset)


List data structure
  general ops: get kth item
               insert at slot k
               delete at slot k

implementing lists:
  as array
  as linked cells

Sorted list are especially nice
order information can assist in list searching
binary search can find an item in O(log N) time worst case
         
special lists:
  Stack
    general ops: push, pop, top, empty, size
    keep pointer to most recent slot (t)
    push is insert at slot t
    pop is delete from slot t
    push, pop are O(1) since we access the top slot ddirectly via pointer
      no searching through the list, no rearrangement of locations
      this applies to implementations by links or array
    
  Queue
    general ops: add, remove, front, empty, size
    keep pointers to front slot, back slot
    add is insert at slot front
    remove is delete from slot back
    add, remove are O(1) since we access the slots directly by pointer
      no searching through the list, no rearrangement of locations
      this applies to implementations by links or array


Bubble sort
  O(N^2) time required to sort list of N items
  pro: easy to program
  con: very slow, useless for sorting lists of any serious size


Tree data structure in general (n-ary tree)
  Knuth transform, converts n-ary tree to binary tree
Binary Tree data structure
Binary Search Tree

Tree balancing: AVL, Splay

traversals:
  in-order, pre-order, post-order
  in-order gives sorted list from a BST
  pre-order, post-order give lists dependent on tree structure

BST Sort
  Not a real sorting algorithm name, but a way to remember that one fairly 
  efficient way to sort a list is to first build a binary search tree,
  then traverse it in the right way
    1) build a BST from the list: O(N log N)
         one insert in a BST: O(log N) for tree with N elments
         doing N inserts: O(N log N)
    2) in-order traversal to get the sorted order of elements
         O(N) since we have to visit each of the N nodes in the BST
  So total worst case time complexity is
    O(N log N) + O(N) which is O(N * (1+ log N))
                      which is O(N log N)


Pigeon hole principle
Hash Table, hash function
  how to pick table size
  how to make an effective hash function
  what to do about collisions
    hash into linked lists (or other collection data structures)
      table size should be prime, and same the size as # elements expected
    probing to find an open cell in the table
      linear probing function
      quadratic probing function
      "expo" probing function
      table size should be prime, and twice the # elements expected
  rehasing to extend the table size
  

"Big Oh" notation for expressing growth rates of functions
  List of length N
    insert is O(N)
    remove is O(1) once you find the element
    find is O(N)

  Queue containing N items
    add is O(1)
    remove is O(1)
    front is O(1)
    can't get middle elements like general List

  Stack containing N items
    push is O(1)
    pop is O(1)
    top is O(1)
    can't get middle elements like general List

  Binary Search Tree with N nodes
    find is O(log N)
    insert is O(log N)
    remove is O(log N) because we have to find it to remove it
    traversals are O(N) (list every element in some order)
    we get ordering (sorting) information from the tree structure
    degenerate case of BST is a List
 
  Hash table of N elements
    insert is O(1)
    find is O(1)
    delete is O(1) 
    get no ordering information from the table

  Set
    basic ops: add, delete, isIn (find), empty, delete
    sets can be implemented as BST to make operations take
       O(log N) worst case 
       add is O(log N) like insert in BST
       delete is O(log N) like delete in BST
       find is O(log N) like find in BST
         isIn is a version of find, returns boolean rather than value record
       empty is O(1)
 
  Map
    basic ops: (assume implemented as BST)
      add (key, value) pair is O(log n)
      delete (key, value) pair is O(log n)
      get (key) is O(log N), returns the value stored with this key
      can also get order-dependent information like
        findMin, findMax  in O(log N)
        sort items in O(N)
    basic ops: (assume implemented as Hashtable)
      add (key, value) pair is O(1)
      delete (key, value) pair is O(1)
      get (key) is O(1), returns the value stored with this key
         no order-dependent ops are possible

binary heap 
  binary tree with two properties:
    heap order property
    structure property 
      always complete binary tree, last level filled from L to R
  representation of binary heap as array
  heaps can be Min or Max

priority queue
  implement priority queue as binary heap
  operations:
    getMin (Max) 
      O(1) worst, avg since it is at the root of the heap tree
    removeMin (Max) 
      O(log N) worst case
      O(log N) average case
    add 
      O(log N) worst case
      O(1) average case
    build 
      nieve is O(N log N) by doing N adds
      cleaver is O(N) 


axioms for alternate version of priority queue
  items are (value, priority) pairs

heap sort 
  build a binary heap
  do repeated removeMin() to create the sort

basic graph theory
  vertex (node), edge (arc), edge weight
  undirected graph
  directed graph
  path in a graph
  cycle in a graph
  bi-partite graph
  complete graph
  planar graph
  
  depth-first search (traversal) in a graph (use a stack)
  breaadth-first search (traversal) in a graph (use a queue)
  
  representation of a graph
    adjacency matrix
      useful when graph is dense ( |E| approx.= |V|^2 )
    adjacency lists
      useful when graph is sparse ( |E| << |V|^2 )
    although an undirected edge between nodes A and B is the 
      same theoretically as two directed edges A --> B and B --> A, 
      we often represent them that way in adjacency lists for 
      an undirected graph

  topological sort
  Dijkstra's algorithm for shortest paths
  spanning tree
  minimum spanning tree
    Kruskal's algorithm for MST
    Prim's algorithm for MST
  uses for graphs: course prerequisite structure, maps,
    modeling digital networks, airline and auto routes, 
    flow of traffic, work flow in an organization, 
    finite state machines, document structure, hypermedia,
    linguistic representations, all tree uses (trees are graphs)
    tons and tons of stuff 

  Euler path 
    path through a graph that crosses each edge exactly once
  Euler circuit
    EP that begins and ends on the same node
  Hamiltonian path
    path through a graph that visits each node exactly once
  Hamiltonian circuit
    HP that begins and ends on the same node
  Traveling salesman and P/NP/NP-complete
  Halting problem and uncomputability

sorting
  we can prove that no sort that uses only pairwise comparisons
    can do better than N log N best case.  This means that all
    comparison sorts are Big-Omega(N log N), bounded below.
    They might be worse, however, and some are.
insertion sort
  O(N^2) worst case
  sometimes used as the "finisher" in quicksort
selection sort
  O(N^2) worst case
  similar in concept to insertion sort but operationally it 
    manipulates the array a bit differently
heap sort
  O(N log N) best case
  O(N log N) worst case
  O(N log N) average case
merge sort
  O(N log N) best case
  O(N log N) wosrt case
  O(N log N) average case
  cost: uses lots of extra memory, needs a second array as large
        as the one you are sorting
quicksort
  O(N log N) best case
  O(N^2) worst case
  O(N log N) average case
  does not need much extra space
  get extra efficiency by cutting off recursion when the lists
    become smaller than, say, length 15 and then doing some other
    sort like insertion sort on the small lists to finish up
bogosort
  random shuffling until you get the one that is in order
  O(infinity) worst case
  an extreme, just an illustration, not a serious shuffle

sorting into buckets
  a way to sort in O(N) time
  can be used when the full range of possible input values is known
  allocate an array of int A and initialize all slots to 0
    then when you see an input number n, do A[n]++
  this seems to violate the Big-Omega(N log N) proof, but this
    sort does not do only pairwise comparisons.  It effectively
    does lots of comparisons at once by putting a number right
    into an array slot.  Since we believe TANSTAAFL, the tradeoff
    is that is uses lots of memory (not all array slots are filled)
    and you have to know in advance the range of inputs.

Other O(N) sorts
  we can also sort N items in N time if we use parallel hardware
    the Big-Oh complexity results for our sort algorithms have
    all assumed we were running them on a single processor.

sorts are either stable or unstable
  stable means it preserves the original order of items with 
    the same value when there are duplicate values in the input.
  often a sort can be implemented to be stable or unstable, and 
    in many cases making it stable sacrifices some efficiency
  bubble sort is stable
  insertion sort is stable
  selection sort as we did it in class is not stable
    it can be made stable with no change in Big-Oh complexity
    but at a cost in practical efficiency
  merge sort is not necessarily stable but is easy to make so
    (just make sure the merge takes from the left-most array if
     it can for duplicates)
  quick sort is not stable

Analysis of merge sort
solve the recurrance equation analyzing the behavior of mergesort
  T(N) = 2*T(N/2) + N
  we want to solve this so that we get something like
   
     T(N) = ....

  where the function T does not appear on the right hand side
  we find T(N) is O(N log N)

UNC-CH COMP 410

Topics covered in class

Midterm exam will cover this material

End Midterm material

Material after the Midterm