axiomatic definition of ADT behavior List ADT Stack ADT Queue ADT Bounded Stack ADT Bounded Queue ADT
List data structure general ops: get kth item insert at slot k delete at slot k implementing lists: as array as linked cells Ordered lists are especially nice order information can assist in list searching binary search can find an item in O(log N) time worst case special lists: Stack general ops: push, pop, top, empty, size keep pointer to most recent slot (t) push is insert at slot t pop is delete from slot t push, pop are O(1) since we access the top slot ddirectly via pointer no searching through the list, no rearrangement of locations this applies to implementations by links or array Queue general ops: add, remove, front, empty, size keep pointers to front slot, back slot add is insert at slot front remove is delete from slot back add, remove are O(1) since we access the slots directly by pointer no searching through the list, no rearrangement of locations this applies to implementations by links or array
Bubble sort O(N^2) time required to sort list of N items pro: easy to program con: very slow, useless for sorting lists of any serious size
Tree data structure in general (n-ary tree) Binary Tree data structure Any n-ary tree can be represented as a binary tree Binary Search Tree order property: all nodes in left child subtree have smaller value than root, all nodes in right child subtree have values greater traversals: in-order, pre-order, post-order in-order gives sorted list from a BST pre-order, post-order give lists dependent on tree structure breadth-first BST Sort Not a real sorting algorithm name, but a way to remember that sometimes fairly efficient way to sort a list is to first build a binary search tree, then traverse it in the right way Average case analysis: 1) build a BST from the list: O(N log N) one insert in a BST: O(log N) for tree with N elments doing N inserts: O(N log N) 2) in-order traversal to get the sorted order of elements O(N) since we have to visit each of the N nodes in the BST So total avg case time complexity is O(N log N) + O(N) which is O(N * (1 + log N)) which is O(N log N) Worst case analysis: 1) build a BST from the list: O(N * N) is O(N^2) one insert in a BST: O(N) for tree with N elments doing N inserts: O(N * N) is O(N^2) 2) in-order traversal to get the sorted order of elements O(N) since we have to visit each of the N nodes in the BST So total worst case time complexity is O(N^2) + O(N) which is dominated by O(N^2)
"Big Oh" notation for expressing growth rates of functions List of length N insert is O(N) remove is O(1) once you find the element find is O(N) Queue containing N items add is O(1) remove is O(1) front is O(1) can't get middle elements like general List Stack containing N items push is O(1) pop is O(1) top is O(1) can't get middle elements like general List Binary Search Tree with N nodes find is O(log N) insert is O(log N) remove is O(log N) because we have to find it to remove it traversals are O(N) (list every element in some order) we get ordering (sorting) information from the tree structure degenerate case of BST is a List "Big Oh" is an upper bound on growth speed of an operation/algorithm but it is not necessarily a good (tight) upper bound if some operation F is O(N^2) in worst case time use then F is also O(N^3)... and then F is also O(2^N) etc. but O(N^2) is the lowest (tightest) bound of those given. we would say O(N^2) is the "best" worst case time complexity of those choices or the "least bad" worst case
Time-space trade-off -- sometimes we can gain some speed (save execution time) for an operation by using more space (memory) -- we can also sometimes design ways to use less space for an operation, but then lose speed when doing so examples?
Recursion: it's turtles all the way down (until the armadillo) -- a programming pattern in which the code in the body of a function contains one or more calls to itself -- well-formed recursive function must have -- a base case, which returns a simple result for a simple "small enough" instance of the problem defined by the parameter values, and this result is returned without a recursive call -- a recursive case, in which a recursive call is made (a call to the very function being defined); the recursive call is made on a "smaller" instance of the problem being solved; the problem (its size) is defined by the parameters passed to the recursive call. -- a recursive function may have more that one base case, and may have more than one recursive case -- the base case(s) should proceed the recursive case(s)... testing for base case is testing for stopping the recursion, and it the first thing to do in the code -- proper termination (preventing infinite recursing) involves 1) making sure each recursive call is passed a smaller problem 2) the problem is decreasing towards a base case 3) the base case checks are written so that the problem size eventually trips a base case (and does not bypass it) -- poorly formed recursions can blow out the call stack -- using recursion when iteration is better can blow out the call stack -- proven theory: -- any program that uses iteration (loops) and no recursion, can be re-written to use recursion and no loops, while still computing the same function (results) -- any program that uses recursion no loops, can be re-written to use loops and no recursion, while still computing the same function (results) -- this means recursion give us expressive convenience for some problems, but no extra computing power... there are no function that we can compute with recursion, but cannot compute without it.
Runtime stack and runtime heap the runtime system is how a programming language does the computations that programs call for runtime stack allocates stack frames (call frames) when procedure/methods are called -- manages this dynamic memory is LIFO order, so the first call to end is the most recent one started -- a stack frame (call frame) has memory space for the parameter values sent in when a function is called, the return address of the code that called it, the local variables in the function , and some bookkeeping data runtime heap is dynamic memory from which objects are allocated on calls to "new" heap is not managed with any particular discipline like LIFO etc. heap can get fraqgmented, stack cannot heap fragmentation is dealt with via gargage collection in Java how can the runtime stack blow out (run out of space)? -- too many concurrently active function calls -- this can happen with recursion... especially a recursion that is poorly written and does not hit a terminating base case
binary heap a binary heap is a completely different thing from the runtime heap... runtime heap is not heap-structured, not related at all just unfortunate and maybe a bit confusing to share the name "heap" binary tree with two properties: heap order property structure property always a complete binary tree, with last level being filled from L to R but perhaps not full representation of binary heap as array heaps can be Min or Max operations: getMin (Max) O(1) worst, avg since it is at the root of the heap tree removeMin (Max) O(log N) worst case O(log N) average case add O(log N) worst case O(1) average case (75% of the heap elements in last 2 tree levels) build simple way is O(N log N) by doing N adds cleaver way is O(N) if you have all the elements on hand, use special "build" priority queue implement priority queue using binary heap elements stored have value and priority priority is used to put elements into binary heap operations: enq: add item to PrQUE deq: take item out of the PrQUE, and that item is "highest" (or lowest) priority front: produce element from queue that has "highest" (or lowest) priority size, empty, etc... heap sort special "build" a binary heap do repeated removeMin() to create the ordered sequence worst case time complexity if we do repeated insert to construct heap worst case time complexity if we do special build to construct heap
Sorting we have discussed and analyzed these sorting methods: -- Bubble Sort: rearranging the items in an array/list to put them in order "gold standard of badness" since it's O(N^2) worst case as well as avg case -- keeping a basic List sorted (using inSort operation) -- BST sort: build a binary search tree and do in-order traversal (worst case analysis, and average case analysis) -- Sorting with a binary heap
Tree representations we have represented a tree (binary tree) as -- linked cells (basic unbalanced BST, also used for balanced methods like AVL, Splay) -- array (binary heap) in array representation, the links are implicit and are obtained via array subscript arithmetic -- know why array representation works ok for binary heap, not for general case of binary tree