Binary/Splay Tree Implementation

Part 3: BST vs. Splay Tree Analysis

You are going to use the BST and Splay tree code you wrote for parts 1 and 2. Your main program will create BSTs and SPLTs, fill them with random data, measure the heights, and report.

What to Hand-in

Hand in your code as per the instructions under Assignment 4 Part 3 in the WorkBase.
Submit a single text file with the entire Java source... main class, random gen code, as well as your BST and SPLT classes. We should be able to cut and paste this single Java source into a Java IDE and have it compile, execute, and produce your output for grading.

Driver/tester main method

The BST and SPLT code is the same as from parts 1 and 2. The main driver class will be different. This time it will not be interactive. Rather it will simply execute and print some analytical output comparing your BST to your SPLT.

Do the following things in the main class:
  1. 15 times: create a BST, randomly fill it with 65535 strings, compute and save its height
  2. Compute the average height of the 15 BSTs
  3. 15 times: create a SPLT, randomly fill it with 65535 strings, compute its height, save the height
  4. Compute the average height of the 15 SPLTs
  5. Report results (see output form below)


If you did not write a method for your BST to compute height do it now. Height of the tree is the longest path from the root to a leaf, and we count links -- not nodes. Remember that the height of a tree with only a root (one node) is 0. The height of a perfect binary tree with 3 nodes (a root and two childen) is 1; the height of a perfect binary tree with 7 nodes is 2; etc.

You may compute this recursively as needed (a bit inefficient) or you may maintain the information in the nodes and keep it updated as inserts and deletes are done. Computing it recursively as needed is a bit inefficient ( O(N) ). Is it worse to call it after each insert and put the height in a field in the root to be read when the height method is called?

Write it carefully, and test it on small trees to make sure you have correctly handled the special extreme cases such as a tree that is a root and no children (height 0), a null tree (height -1, or an error), and nodes with one child there and one child null.

Then make your program generate and fill a BST 15 separate times, each time generating a BST with 65,535 nodes containing random character string values; 65,535 is (2^16)-1. This means generate a random string and insert it into the BST; do this 65,535 times. Remember to disallow duplicates and make sure the the tree has a full 65,535 uniquely-valued nodes.

For each of the 15 runs, compute the final tree height. Report the 15 heights, and also the average tree height over the 15 runs.

Sample Output

Make your ourput look this way. For ease of grading I would like the averages right up top, so keep the heights you compute in an array or some data structure so you can report them after the averages.

BST average height: 27.514
SPLT average height: 23.783
BST Runs, each tree with 65,535 random strings
1: height is 25
2: height is 26
3: height is 32
4: height is 29
15: height is 26

SPLT Runs, each tree with 65,535 random strings
1: height is 24
2: height is 21 
3: height is 19
4: height is 23
15: height is 22

Final Notes

Consider what you find, and mentally compare this average to what you would expect the height to be if this were a full binary tree (or nearly so, like an AVL tree) with the same number of nodes. Remember that the #nodes in a complete full binary tree with height h is 2^(h+1)-1. Examples:

   A root with two children leaves is 3 nodes; 
   height is 1; 
   the formula computes 2^(1+1)-1 = 2^2 - 1 = 4-1 = 3

   Complete full binary tree of height 2
   a root with 2 children, each child with 2 leaf children
   Formula: 2^(2+1)-1 = 2^3 - 1 = 8-1 = 7 nodes

   Your tree you know has 65,535 nodes
   2^(15+1)-1 = 2^16 - 1 
   so this says if it is full and complete is would have height 15
See how close yours comes to 15. It will be longer, since a BST will not work out to be complete and full when filled randomly.