UNC-CH COMP 410

Topics covered in class


Midterm exam will cover this material

This is a guide, not a guarantee. The exam covers all material we have discussed in class up to and including skip lists, and all concepts covered in the code we have written. The following is a list of topics included in this material:

List data structure
  general ops: get kth item
               insert at slot k
               delete at slot k

implementing lists:
  as array
  as linked cells
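
A minimal sketch of the two implementations in Java (illustrative, not the
course's List class), showing why array access is direct while linked cells
must be walked:

  // array implementation: get(k) is one subscript operation
  class ArrayListSketch {
      int[] items = new int[100];
      int size = 0;
      int get(int k) { return items[k]; }   // O(1) direct access
  }

  // linked-cells implementation: get(k) walks k links from the head
  class LinkedListSketch {
      static class Node { int val; Node next; }
      Node head;
      int get(int k) {                      // assumes 0 <= k < size
          Node cur = head;
          for (int i = 0; i < k; i++) cur = cur.next;   // O(k) walk
          return cur.val;
      }
  }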

Ordered lists are especially nice
order information can assist in list searching
binary search can find an item in O(log N) time worst case
understand binary searching in an ordered sequence
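
A binary search sketch in Java (illustrative): each probe halves the
remaining range, giving the O(log N) worst case.

  // assumes a[] is sorted ascending; returns index of target or -1
  static int binarySearch(int[] a, int target) {
      int lo = 0, hi = a.length - 1;
      while (lo <= hi) {
          int mid = lo + (hi - lo) / 2;       // midpoint, overflow-safe
          if (a[mid] == target) return mid;
          if (a[mid] < target) lo = mid + 1;  // discard left half
          else                 hi = mid - 1;  // discard right half
      }
      return -1;                              // not found
  }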
         
special lists:
  Stack
    general ops: push, pop, top, empty, size
    keep pointer to most recent slot (t)
    push is insert at slot t
    pop is delete from slot t
    push, pop are O(1) since we access the top slot directly via pointer
      no searching through the list, no rearrangement of locations
      this applies to implementations by links or array
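
An array-backed stack sketch in Java (illustrative): t points at the top
slot, so push and pop touch only that slot.

  class IntStack {
      private int[] a = new int[16];
      private int t = -1;                    // pointer to most recent slot

      boolean empty()  { return t < 0; }
      int size()       { return t + 1; }
      int top()        { return a[t]; }      // assumes non-empty
      void push(int x) {                     // O(1): advance t, store there
          if (t + 1 == a.length) a = java.util.Arrays.copyOf(a, a.length * 2);
          a[++t] = x;                        // no shifting, no searching
      }
      int pop() { return a[t--]; }           // O(1): delete from slot t
  }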
    
  Queue
    general ops: add, remove, front, empty, size
    keep pointers to front slot, back slot
    add is insert at slot back
    remove is delete from slot front
    add, remove are O(1) since we access the slots directly by pointer
      no searching through the list, no rearrangement of locations
      this applies to implementations by links or array
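
A circular-array queue sketch in Java (illustrative; capacity fixed for
brevity): the front and back slots are tracked directly, so add and remove
never search or shift.

  class IntQueue {
      private int[] a = new int[16];
      private int front = 0, size = 0;

      boolean empty()  { return size == 0; }
      int size()       { return size; }
      int front()      { return a[front]; }  // assumes non-empty
      void add(int x) {                      // O(1): insert at the back slot
          a[(front + size) % a.length] = x;  // wrap around the array end
          size++;
      }
      int remove() {                         // O(1): delete from front slot
          int x = a[front];
          front = (front + 1) % a.length;
          size--;
          return x;
      }
  }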



Bubble sort
  O(N^2) time required to sort a list of N items
  pro: easy to program
  con: very slow, useless for sorting lists of any serious size
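
A bubble sort sketch in Java (illustrative, not the course's code):

  // repeatedly swap out-of-order neighbors; each pass bubbles the
  // largest remaining item to the end.  The two nested loops over
  // N items give the O(N^2) behavior noted above.
  static void bubbleSort(int[] a) {
      for (int pass = a.length - 1; pass > 0; pass--) {
          for (int i = 0; i < pass; i++) {
              if (a[i] > a[i + 1]) {         // out of order: swap neighbors
                  int tmp = a[i];
                  a[i] = a[i + 1];
                  a[i + 1] = tmp;
              }
          }
      }
  }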
Tree data structure in general (n-ary tree)
Binary Tree data structure
  any n-ary tree can be represented as a binary tree

Binary Search Tree
  order property: all nodes in the left child subtree have values smaller
    than the root, all nodes in the right child subtree have values greater
  traversals: in-order, pre-order, post-order, breadth-first
    in-order gives a sorted list from a BST
    pre-order, post-order give lists dependent on tree structure

BST Sort
  not a real sorting algorithm name, but a way to remember that a fairly
  efficient way to sort a list is to first build a binary search tree,
  then traverse it in the proper way
  average case analysis:
    1) build a BST from the list: O(N log N)
         one insert in a BST: O(log N) for a tree with N elements
         doing N inserts: O(N log N)
    2) in-order traversal to get the sorted order of elements: O(N),
       since we have to visit each of the N nodes in the BST
    so total avg case time complexity is O(N log N) + O(N),
    which is O(N * (1 + log N)), which is O(N log N)
  worst case analysis:
    1) build a BST from the list: O(N * N) is O(N^2)
         one insert in a BST: O(N) for a tree with N elements
         doing N inserts: O(N * N) is O(N^2)
    2) in-order traversal to get the sorted order of elements: O(N),
       since we have to visit each of the N nodes in the BST
    so total worst case time complexity is O(N^2) + O(N),
    which is dominated by O(N^2)
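
A BST sort sketch in Java (an illustration of the analysis above, not the
course's code):

  // BST sort: do N inserts into an (unbalanced) BST, then an in-order
  // traversal emits the items in ascending order.  Avg O(N log N);
  // worst case O(N^2) when the tree degenerates into a list.
  class BSTSort {
      static class Node { int val; Node left, right; Node(int v) { val = v; } }

      static Node insert(Node root, int v) {
          if (root == null) return new Node(v);          // found the spot
          if (v < root.val) root.left = insert(root.left, v);
          else              root.right = insert(root.right, v);
          return root;
      }

      // in-order: left subtree, node, right subtree
      static void inOrder(Node n, java.util.List<Integer> out) {
          if (n == null) return;
          inOrder(n.left, out);
          out.add(n.val);
          inOrder(n.right, out);
      }

      static java.util.List<Integer> sort(int[] items) {
          Node root = null;
          for (int v : items) root = insert(root, v);    // N inserts
          java.util.List<Integer> out = new java.util.ArrayList<>();
          inOrder(root, out);                            // O(N) traversal
          return out;
      }
  }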
"Big Oh" notation for expressing growth rates of functions List of length N insert is O(N) remove is O(1) once you find the element find is O(N) Queue containing N items add is O(1) remove is O(1) front is O(1) can't get middle elements like general List Stack containing N items push is O(1) pop is O(1) top is O(1) can't get middle elements like general List Binary Search Tree with N nodes find is O(log N) insert is O(log N) remove is O(log N) because we have to find it to remove it traversals are O(N) (list every element in some order) we get ordering (sorting) information from the tree structure degenerate case of BST is a List "Big Oh" is an upper bound on growth speed of an operation/algorithm but it is not necessarily a good (tight) upper bound if some operation F is O(N^2) in worst case time use then F is also O(N^3)... and then F is also O(2^N) etc. but O(N^2) is the lowest (tightest) bound of those given. we would say O(N^2) is the "best" worst case time complexity of those choices or the "least bad" worst case
Time-space trade-off
  -- sometimes we can gain some speed (save execution time) for an
     operation by using more space (memory)
  -- we can also sometimes design ways to use less space for an operation,
     but then lose speed when doing so
  -- examples? (one possibility is sketched below)
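
One classic example (an illustration of the trade-off, not something from
our class code): memoizing a recursive Fibonacci spends O(N) extra memory
on a table to cut the running time from exponential down to O(N).

  // Memoized Fibonacci: the memo table is the space we spend to avoid
  // recomputing the same subproblems (the time we save).
  static long fib(int n, long[] memo) {
      if (n <= 1) return n;                 // base cases
      if (memo[n] != 0) return memo[n];     // already computed: reuse it
      memo[n] = fib(n - 1, memo) + fib(n - 2, memo);
      return memo[n];
  }
  // usage: fib(50, new long[51])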
Recursion: it's turtles all the way down (until the armadillo)
  -- a programming pattern in which the code in the body of a function
     contains one or more calls to itself
  -- a well-formed recursive function must have
     -- a base case, which returns a simple result for a simple "small
        enough" instance of the problem defined by the parameter values;
        this result is returned without a recursive call
     -- a recursive case, in which a recursive call is made (a call to the
        very function being defined); the recursive call is made on a
        "smaller" instance of the problem being solved; the problem (its
        size) is defined by the parameters passed to the recursive call
  -- a recursive function may have more than one base case, and may have
     more than one recursive case
  -- the base case(s) should precede the recursive case(s)... testing for
     a base case is testing for stopping the recursion, and it is the
     first thing to do in the code
  -- proper termination (preventing infinite recursion) involves
     1) making sure each recursive call is passed a smaller problem
     2) the problem is decreasing towards a base case
     3) the base case checks are written so that the problem size
        eventually trips a base case (and does not bypass it)
  -- poorly formed recursions can blow out the call stack
  -- using recursion when iteration is better can blow out the call stack
  -- proven theory:
     -- any program that uses iteration (loops) and no recursion can be
        re-written to use recursion and no loops, while still computing
        the same function (results)
     -- any program that uses recursion and no loops can be re-written to
        use loops and no recursion, while still computing the same
        function (results)
     -- this means recursion gives us expressive convenience for some
        problems, but no extra computing power... there are no functions
        that we can compute with recursion but cannot compute without it
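
A tiny well-formed recursion in Java (illustrative): sum of the first n
non-negative integers.

  // base case first, recursive case on a strictly smaller problem
  static int sum(int n) {
      if (n <= 0) return 0;     // base case: stops the recursion
      return n + sum(n - 1);    // recursive call on a smaller instance
  }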
Runtime stack and runtime heap
  the runtime system is how a programming language does the computations
  that programs call for
  runtime stack allocates stack frames (call frames) when
  procedures/methods are called
    -- manages this dynamic memory in LIFO order, so the first call to end
       is the most recent one started
    -- a stack frame (call frame) has memory space for the parameter
       values sent in when a function is called, the return address of the
       code that called it, the local variables in the function, and some
       bookkeeping data
  runtime heap is dynamic memory from which objects are allocated on calls
  to "new"
    heap is not managed with any particular discipline like LIFO, etc.
    heap can get fragmented, stack cannot
    heap fragmentation is dealt with via garbage collection in Java
  how can the runtime stack blow out (run out of space)?
    -- too many concurrently active function calls
    -- this can happen with recursion... especially a recursion that is
       poorly written and does not hit a terminating base case
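
A sketch of how recursion can blow out the runtime stack (illustrative):

  // every call pushes a new stack frame, and no base case is ever
  // reached, so frames pile up until the runtime stack is exhausted
  static int noBase(int n) {
      return noBase(n + 1);     // problem grows instead of shrinking
  }
  // noBase(0) eventually throws java.lang.StackOverflowError in Java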
Binary heap
  a binary heap is a completely different thing from the runtime heap...
  the runtime heap is not heap-structured, not related at all; it is just
  unfortunate and maybe a bit confusing that they share the name "heap"
  binary tree with two properties:
    heap order property
    structure property
      always a complete binary tree, with the last level being filled from
      L to R but perhaps not full
  representation of a binary heap as an array
  heaps can be Min or Max
  operations:
    getMin (Max)    O(1) worst, avg, since it is at the root of the
                    heap tree
    removeMin (Max) O(log N) worst case
                    O(log N) average case
    add             O(log N) worst case
                    O(1) average case (75% of the heap elements are in the
                    last 2 tree levels)
    build           simple way is O(N log N) by doing N adds
                    clever way is O(N) if you have all the elements on
                    hand: use the special "build"

Priority queue
  implementing a priority queue with a list, BST
  implementing a priority queue using a binary heap
  elements stored have a value and a priority
    priority is used to put elements into the binary heap
  operations:
    enq: add item to the PrQUE
    deq: take item out of the PrQUE; that item is "highest" (or lowest)
         priority
    front: produce the element from the queue that has "highest" (or
           lowest) priority
    size, empty, etc...

Heap sort
  special "build" a binary heap
  do repeated removeMin() to create the ordered sequence
  worst case time complexity if we do repeated insert to construct the heap
  worst case time complexity if we do the special build to construct
  the heap
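
A minimal Min binary heap sketch in Java (illustrative, not the course's
code), stored in an array; repeated removeMin() on such a heap is the
extraction phase of heap sort:

  // Min-heap in an array; links are implicit:
  //   parent(i) = (i-1)/2, left(i) = 2i+1, right(i) = 2i+2
  class MinHeap {
      private int[] a = new int[16];
      private int size = 0;

      int getMin() { return a[0]; }          // min sits at the root: O(1)

      void add(int x) {                      // O(log N) worst case
          if (size == a.length) a = java.util.Arrays.copyOf(a, size * 2);
          int i = size++;
          a[i] = x;
          while (i > 0 && a[i] < a[(i - 1) / 2]) {   // percolate up
              int p = (i - 1) / 2;
              int tmp = a[i]; a[i] = a[p]; a[p] = tmp;
              i = p;
          }
      }

      int removeMin() {                      // O(log N) worst case
          int min = a[0];                    // assumes a non-empty heap
          a[0] = a[--size];                  // move last leaf to the root
          int i = 0;
          while (2 * i + 1 < size) {         // percolate down
              int c = 2 * i + 1;                         // left child
              if (c + 1 < size && a[c + 1] < a[c]) c++;  // smaller child
              if (a[i] <= a[c]) break;       // heap order restored
              int tmp = a[i]; a[i] = a[c]; a[c] = tmp;
              i = c;
          }
          return min;
      }
  }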
Sorting
  we have discussed and analyzed these sorting methods:
  -- Bubble Sort: rearranging the items in an array/list to put them in
     order; "gold standard of badness" since it's O(N^2) worst case as
     well as avg case
  -- keeping a basic List sorted (using the inSort operation)
  -- BST sort: build a binary search tree and do an in-order traversal
     (worst case analysis, and average case analysis)
  -- sorting with a binary heap
  -- sorting using a skip list

Tree representations
  we have represented a tree (binary tree) as
  -- linked cells (basic unbalanced BST, also used for balanced methods
     like AVL, Splay)
  -- array (binary heap)
     in the array representation, the links are implicit and are obtained
     via array subscript arithmetic (sketched below)
  -- know why the array representation works ok for a binary heap, but not
     for the general case of a binary tree
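
The implicit links, as subscript arithmetic on a 0-based array
(illustrative):

  // no pointers stored; neighbors are computed from the index
  static int parent(int i) { return (i - 1) / 2; }
  static int left(int i)   { return 2 * i + 1; }
  static int right(int i)  { return 2 * i + 2; }
  // this works for a binary heap because a complete tree packs the array
  // with no gaps; an arbitrarily shaped binary tree would leave many
  // slots empty, wasting space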
Balanced BST
  tree balancing:
  AVL: imbalance condition, rotations to re-balance
    rotations alter the tree structure but maintain the order property
    worst case time complexity for insert, remove, contains
    show how an AVL tree is altered by insert, remove
  Splay: another approach to balancing
    splaying is done any time a node is accessed
    splaying applies order-preserving rotations until the accessed node is
    moved to the root
    amortized worst case time complexity
    show how splaying works in a tree for insert, remove, find, etc.
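
A sketch of one AVL single rotation in Java (illustrative):

  static class Node { int val; Node left, right; }

  // single right rotation, used when the imbalance is in the left
  // child's left subtree; the shape changes but the BST order property
  // is preserved
  static Node rotateRight(Node k2) {
      Node k1 = k2.left;
      k2.left = k1.right;   // k1's right subtree: values between k1, k2
      k1.right = k2;        // old subtree root becomes the right child
      return k1;            // k1 is the new subtree root
  }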
Skip Lists
  like a linked list except items are kept in order
  alternative to balanced binary search trees
  multi-level linked cells
    each higher level skips over more of the items at the lower levels
  probabilistic data structure... meaning we roll dice to decide aspects
  of the structure as we build
    also means if we build a skip list twice with the same input data
    sequence we will almost certainly get two different linked structures
  average time complexity: find O(log N), insert O(log N)
  worst case time complexity: find O(N), insert O(N)
    however the worst case is probabilistically very, very unlikely
    the average case is probabilistically very likely
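
The "roll dice" part of a skip list, sketched in Java (illustrative): each
new item's height is chosen by coin flips, so about half the items reach
level 2, a quarter reach level 3, and so on.

  // flip a fair coin until tails (or the cap): heads means the new cell
  // also appears in the next level up
  static int randomLevel(java.util.Random rng, int maxLevel) {
      int level = 1;
      while (level < maxLevel && rng.nextBoolean()) level++;
      return level;
  }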