Basic graph theory
    vertex (node), edge (arc), edge weight
    undirected graph, directed graph
    path in a graph, cycle in a graph
    bipartite graph
        a tree is a bipartite graph
    complete graph
    planar graph
    depth-first search (traversal) in a graph (use a stack)
    breadth-first search (traversal) in a graph (use a queue)
    representation of a graph
        adjacency matrix
            useful when the graph is dense ( |E| approx.= |V|^2 )
        adjacency lists
            useful when the graph is sparse ( |E| << |V|^2 )
            although an undirected edge between nodes A and B is not
            theoretically the same as the two directed edges A --> B and
            B --> A, we often represent it that way in adjacency lists
            for an undirected graph
        (a BFS/DFS sketch over adjacency lists follows this section)
    Topological sort
    Single Source Shortest Path algorithm for shortest paths in unweighted digraphs
    Dijkstra's algorithm for shortest paths in weighted digraphs
    Spanning tree
    Minimum spanning tree
        Kruskal's algorithm for MST
        Prim's algorithm for MST
    Uses for graphs: course prerequisite structure, maps, modeling digital
        networks, airline and auto routes, flow of traffic, work flow in an
        organization, finite state machines, document structure, hypermedia,
        linguistic representations, all tree uses (trees are graphs)...
        tons and tons of stuff
    Euler path
        a path through a graph that crosses each edge exactly once
    Euler circuit
        an Euler path that begins and ends on the same node
        we have an efficient algorithm for finding an Euler path/circuit
    Hamiltonian path
        a path through a graph that visits each node exactly once
    Hamiltonian circuit
        a Hamiltonian path that begins and ends on the same node
        the traveling salesman problem is an example of a Hamiltonian
        path/circuit problem; no efficient (polynomial time, like O(N^2))
        algorithm is known for solving it
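Below is a minimal Java sketch of the adjacency-list representation and the
queue-based BFS described above; the class and method names are my own, not
from any course code. Swapping the queue for a stack (a Deque used LIFO)
turns the same loop into DFS.

    import java.util.*;

    public class GraphSketch {
        private final Map<String, List<String>> adj = new HashMap<>();

        // Store an undirected edge as two directed edges, as the notes describe.
        public void addUndirectedEdge(String a, String b) {
            adj.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
            adj.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
        }

        // BFS: visits nodes in order of edge-count distance from start.
        public List<String> bfs(String start) {
            List<String> order = new ArrayList<>();
            Set<String> visited = new HashSet<>();
            Queue<String> queue = new ArrayDeque<>();   // the queue is what makes it BFS
            visited.add(start);
            queue.add(start);
            while (!queue.isEmpty()) {
                String node = queue.remove();
                order.add(node);
                for (String nbr : adj.getOrDefault(node, List.of())) {
                    if (visited.add(nbr)) {             // add() is false if already seen
                        queue.add(nbr);
                    }
                }
            }
            return order;
        }

        public static void main(String[] args) {
            GraphSketch g = new GraphSketch();
            g.addUndirectedEdge("A", "B");
            g.addUndirectedEdge("A", "C");
            g.addUndirectedEdge("B", "D");
            System.out.println(g.bfs("A"));             // prints [A, B, C, D]
        }
    }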
//===========================================================================
Sorting

we can prove that no sort that uses only pairwise comparisons can do better
(faster) than N log N in the worst case. This means that all comparison
sorts are Big-Omega(N log N) in the worst case, bounded below. They might
be worse, however, and some are.

bubble sort
    O(N^2) worst case
insertion sort
    O(N^2) worst case
    works with pair-wise swaps (item with nearest neighbor)
    sometimes used as the "finisher" in quicksort
selection sort
    O(N^2) worst case
    similar in concept to insertion sort, but operationally it manipulates
    the array with possibly long-range swaps
heap sort
    O(N log N) best case
    O(N log N) worst case
    O(N log N) average case
merge sort
    O(N log N) best case
    O(N log N) worst case
    O(N log N) average case
    cost: often uses lots of extra memory; needs a second array as large
    as the one you are sorting
quicksort
    O(N log N) best case
    O(N^2) worst case
    O(N log N) average case
    does not need much extra space
    gets extra efficiency by cutting off recursion when the lists become
    smaller than, say, length 15, and then doing some other sort like
    insertion sort on the small lists to finish up
bogosort
    random shuffling until you get the one that is in order
    unbounded ("O(infinity)") worst case
    an extreme, just an illustration, not a serious sort
sorting into buckets
    a way to sort in roughly O(N) time (more precisely O(N + K), where K
    is the size of the range of possible input values)
    can be used when the full range of possible input values is known
    allocate an array of int A and initialize all slots to 0; then when
    you see an input number n, do A[n]++
    (a sketch of this sort follows the hashing notes below)
    this seems to violate the Big-Omega(N log N) proof, but this sort does
    not do only pairwise comparisons. It effectively does lots of
    comparisons at once by putting a number right into an array slot.
    Since we believe TANSTAAFL, the tradeoff is that it uses lots of
    memory (not all array slots are filled) and you have to know the range
    of inputs in advance.

sorts are either stable or unstable
    stable means the sort preserves the original order of items with the
    same value when there are duplicate values in the input
    often a sort can be implemented to be stable or unstable, and in many
    cases making it stable sacrifices some efficiency
    bubble sort is stable
    insertion sort is stable
    selection sort as we did it in class is not stable
        it can be made stable with no change in Big-Oh complexity, but at
        a cost in practical efficiency
    merge sort is not necessarily stable but is easy to make so (just make
        sure the merge takes from the left-most array when it can for
        duplicates)
    quicksort is not stable
    heap sort is not stable (the heap order is the issue)

//===========================================================================
Sets and Maps

ADT ops, behavior, and ways to implement them (lists, trees, hashing, etc.)

//===========================================================================
Hashing, Hash Tables, Hash Maps

Pigeonhole principle (chicken hole principle)
Hash table, hash function
    how to pick the table size
    how to make an effective hash function
    what to do about collisions
        hash into linked lists (or other collection data structures)
            table size should be prime, and the same size as the number
            of elements expected
            (a chaining sketch follows this section)
        probing to find an open cell in the table
            linear probing function
            quadratic probing function
            "expo" probing function
            table size should be prime, and twice the number of elements
            expected
        rehashing to extend the table size
Hash Maps... storing associated information/object along with a hashed key
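Here is a minimal sketch of the "hash into linked lists" (separate
chaining) collision strategy above, assuming String keys; the names are my
own. A fuller version would pick a prime table size near the expected
element count and rehash when the table gets too full.

    import java.util.LinkedList;

    public class ChainedHashSet {
        private final LinkedList<String>[] table;

        @SuppressWarnings("unchecked")              // generic array creation
        public ChainedHashSet(int tableSize) {      // ideally prime, near expected # elements
            table = new LinkedList[tableSize];
            for (int i = 0; i < tableSize; i++) {
                table[i] = new LinkedList<>();
            }
        }

        // Map a key to a table slot; mask the sign bit so the index is non-negative.
        private int indexFor(String key) {
            return (key.hashCode() & 0x7fffffff) % table.length;
        }

        public boolean contains(String key) {
            return table[indexFor(key)].contains(key);  // walk the chain in this slot
        }

        public void add(String key) {
            if (!contains(key)) {
                table[indexFor(key)].add(key);
                // A fuller version would count elements and rehash into a
                // larger prime-sized table when the load factor gets high.
            }
        }
    }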
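And here is the "sorting into buckets" sketch promised in the sorting notes
above, assuming the inputs are ints in a known range 0..maxValue (maxValue
is my stand-in for that known range):

    public class BucketSortSketch {
        // Sort ints known to lie in 0..maxValue without pairwise comparisons:
        // tally each value into its slot, then read the tallies back in order.
        public static int[] bucketSort(int[] input, int maxValue) {
            int[] counts = new int[maxValue + 1];   // one slot per possible value, all 0
            for (int n : input) {
                counts[n]++;                        // "when you see an input number n, do A[n]++"
            }
            int[] sorted = new int[input.length];
            int i = 0;
            for (int value = 0; value <= maxValue; value++) {
                for (int c = 0; c < counts[value]; c++) {
                    sorted[i++] = value;            // emit each value counts[value] times
                }
            }
            return sorted;
        }
    }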
//===========================================================================
binary heap

a binary heap is a completely different thing from the runtime heap...
    the runtime heap is not heap-structured, not related at all
    just unfortunate, and maybe a bit confusing, that they share the name "heap"

a binary tree with two properties:
    heap order property
    structure property
        always a complete binary tree, with the last level being filled
        from L to R but perhaps not full

representation of a binary heap as an array
    (see the min-heap sketch at the end of these notes)

heaps can be Min or Max

operations:
    getMin (Max)
        O(1) worst and average case, since it is at the root of the heap tree
    removeMin (Max)
        O(log N) worst case
        O(log N) average case
    add
        O(log N) worst case
        O(1) average case (75% of the heap elements are in the last 2 tree levels)
    incElt, decElt
        cost of searching for the Elt, plus the cost of rearranging the
        heap array when the priority is incremented or decremented
            searching the array linearly is O(N)
            using a HashMap (like Assn 4) is O(1)
            rearranging the heap array is O(log N)
    build
        simple way is O(N log N), by doing N adds
        clever way is O(N): if you have all the elements on hand, use the
        special "build"

//===========================================================================
priority queue

implementing a priority queue with a list or BST, and asymptotic analysis
implementing a priority queue using a binary heap, and asymptotic analysis
elements stored have a value and a priority
    the priority is used to put elements into the binary heap
operations:
    enq: add an item to the PrQUE
    deq: take an item out of the PrQUE; that item has the "highest" (or lowest) priority
    front: produce the element from the queue that has "highest" (or lowest) priority
    size, empty, etc...

//===========================================================================
Tree representations (linked cells, implicit as array)

We have represented a tree (binary tree) as
    -- linked cells (basic unbalanced BST, also used for balanced methods
       like AVL, Splay)
    -- an array (binary heap)
in the array representation, the links are implicit and are obtained via
array subscript arithmetic
-- know why the implicit, array representation works OK for a binary heap
   but not for the general case of a binary tree

//===========================================================================
Balanced BST

Tree balancing:
AVL:
    imbalance condition, rotations to re-balance
    rotations alter the tree structure but maintain the order property
    (see the rotation sketch at the end of these notes)
    worst case time complexity for insert, remove, contains
    show how an AVL tree is altered by an insert
Splay:
    no balance/imbalance condition is kept
    rather, we "splay" a node to the root on every operation that touches
    a node -- insert, delete, contains, findMin, findMax
    splaying moves a node up to the root via rotations (similar to AVL)
    that maintain the BST order properties
    asymptotic analysis:
        any ONE INDIVIDUAL operation has a possible worst case time
        complexity of O(N) for a splay tree of N nodes
        the same for insert, delete, contains, findMin, findMax...
        this means a splay tree can stretch out... get mis-shapen like an
        unbalanced BST
        BUT... if we do enough ops in a row that do splaying, the tree
        will collapse or "fold up" from ugly-shaped into something more
        balanced
        so we say that an op like insert, delete, contains, findMin,
        findMax, in a long enough sequence, will amortize out to O(log N)
        each... the long expensive ops are balanced in the sequence by
        several short, near constant cost ops
        So we say that insert, delete, contains, findMin, findMax all have
        worst case time complexity of "amortized O(log N)"
    splay trees give good performance for apps that exhibit locality of
    reference
        locality of reference... know it

Binary Tree terms
    full, perfect, complete
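The min-heap sketch referenced in the binary heap notes above: a minimal
array-backed heap with my own names, showing the implicit links via
subscript arithmetic and the percolate operations behind add and removeMin.
Real code (like Assn 4) would also keep a HashMap from element to array
index for O(1) lookup in incElt/decElt.

    import java.util.ArrayList;
    import java.util.List;

    public class MinHeapSketch {
        // Implicit array representation: the node at index i has children
        // at 2*i + 1 and 2*i + 2, and its parent at (i - 1) / 2 --
        // subscript arithmetic instead of stored links.
        private final List<Integer> a = new ArrayList<>();

        public int getMin() {                       // O(1): the min sits at the root
            return a.get(0);
        }

        public void add(int x) {                    // O(log N) worst case
            a.add(x);                               // new leaf in the next open slot
            int i = a.size() - 1;
            while (i > 0 && a.get((i - 1) / 2) > a.get(i)) {
                swap(i, (i - 1) / 2);               // percolate up past larger parents
                i = (i - 1) / 2;
            }
        }

        public int removeMin() {                    // O(log N) worst case
            int min = a.get(0);
            a.set(0, a.get(a.size() - 1));          // move the last leaf to the root
            a.remove(a.size() - 1);
            percolateDown(0);
            return min;
        }

        // The O(N) "build" would call this for every index from the last
        // internal node (a.size()/2 - 1) back down to the root at 0.
        private void percolateDown(int i) {
            while (2 * i + 1 < a.size()) {
                int child = 2 * i + 1;              // left child
                if (child + 1 < a.size() && a.get(child + 1) < a.get(child)) {
                    child++;                        // right child is smaller
                }
                if (a.get(i) <= a.get(child)) {
                    break;                          // heap order restored
                }
                swap(i, child);
                i = child;
            }
        }

        private void swap(int i, int j) {
            int t = a.get(i);
            a.set(i, a.get(j));
            a.set(j, t);
        }
    }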
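And the rotation sketch referenced in the AVL/Splay notes: a single right
rotation, the building block behind both AVL rebalancing and splaying. The
node shape and names are mine. The rotation changes the tree's shape but
preserves BST order, because every key in k1's right subtree lies between
k1 and k2.

    public class RotationSketch {
        static class BstNode {
            int key;
            BstNode left, right;
            BstNode(int key) { this.key = key; }
        }

        // "Rotate with left child": lift k1 up over k2 and return the new
        // root of this subtree. k1's old right subtree (keys between k1
        // and k2) becomes k2's left subtree, so BST order is preserved.
        static BstNode rotateRight(BstNode k2) {
            BstNode k1 = k2.left;
            k2.left = k1.right;
            k1.right = k2;
            return k1;                              // caller re-links k1 where k2 hung
        }
    }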