COMP 590-096 Fall 2010
Assignment 4
Due date: Tuesday, December 14, 11:59 PM
Submission Instructions
This is the final assignment. There are four problems, each worth 20 points,
and the total assignment is worth 40 points. That is, you need to complete
only two problems for full credit, but if you turn in additional work, you
can earn up to 40 points of extra credit. Problems 1 and 2 are programming exercises,
and Problems 3 and 4 are written exercises.
As usual, you should turn in a report file named
lastname_firstname_assignment4.pdf or lastname_firstname_assignment4.doc,
and a source file named lastname_firstname_assignment4.zip. Note that if you
do not choose to do any of the programming problems, you do not have to turn in a
source file. Submission is via Blackboard as before.
Late policy: There will be absolutely no extensions.
Policy on outside sources: As usual, do not submit anybody else's work
as your own and acknowledge all outside sources used in the preparation of this
assignment.
Problem 1 (Programming): Naive Bayes Text Classification
In this problem, your goal is to implement a Naive Bayes text classifier as described
in the lecture. The two target classes
are:
- NIPS: Research articles published in the proceedings of Neural Information
Processing Systems,
a machine learning conference. There are 1500 documents in total, 100 of which will
be used for training and the rest for testing.
- NSF: Abstracts of research projects funded by the National Science Foundation.
There are 5000 documents in total, 1000 of which will be used for training and
the rest for testing (there is more training data for this class because its documents
tend to be shorter, and so give fewer word occurrence samples).
Details
- Data: The provided archive (4.2MB) contains the documents from
both classes already pre-processed into a bag-of-words format, the vocabulary words,
and the indices of the files to be used for training. Please see the readme.txt
file in the archive for details of the data format.
- Training: The goal of the training stage is to estimate the likelihoods
P(word | class1) and P(word | class2) for
all the vocabulary words. As explained in the lecture, the likelihood estimate for a given
word is:
P(W | class) = (Total # of occurrences of W in training documents from the class) /
(Total # of word occurrences in the training documents from the class).
In addition, as discussed in the lecture, you have to smooth the likelihoods to
ensure there are no zero counts. Laplacian (add-one) smoothing is a very simple method
that increases the observation count of every word by one. This corresponds to adding
1 to the numerator above and V to the denominator (where V is the number of words in the
vocabulary). A code sketch of both the training and testing computations appears after this list.
- Testing: You will perform maximum likelihood classification of test documents
according to the learned Naive Bayes model. That is, you will not be using a prior for each
document class (or equivalently, you will be assuming that the prior is uniform, i.e., a
0.5 chance of observing a document from each class). Suppose a document D consists of
words W1, ..., WV with counts C1, ..., CV. Then the likelihood of the document given a class is:
P(D | class) = P(W1 | class)^C1 ⋅ ... ⋅ P(WV | class)^CV.
As explained in the lecture slides, though, in order to avoid underflow, you need to work with the log of the above quantity.
After you compute the log-likelihoods for both classes for every test document, you will classify that
document as the class for which it gets the higher likelihood. Because you know which file the document
actually came from, you know its true class and can use this information to check whether your
classification is correct.
Important: Don't forget to exclude the training documents from the test set!
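
To make these computations concrete, here is a minimal Python sketch of the training and
testing steps. The document representation (a dictionary mapping vocabulary word indices to
counts) and the function names are assumptions for illustration; the actual data format is
described in the archive's readme.txt, so adapt as needed.

    import math
    from collections import Counter

    def train_class_likelihoods(train_docs, vocab_size):
        # Estimate smoothed log P(w | class) from one class's training
        # documents. Laplacian smoothing: add 1 to every word's count and
        # V (= vocab_size) to the total count.
        counts = Counter()
        for doc in train_docs:               # doc: {word_index: count}
            counts.update(doc)
        total = sum(counts.values())         # total word occurrences in class
        return [math.log((counts.get(w, 0) + 1) / (total + vocab_size))
                for w in range(vocab_size)]

    def log_likelihood(doc, log_lik):
        # log P(D | class) = sum over words of C_w * log P(w | class).
        return sum(c * log_lik[w] for w, c in doc.items())

    def classify(doc, log_lik_nips, log_lik_nsf):
        # Maximum-likelihood decision: no prior term, as specified above.
        if log_likelihood(doc, log_lik_nips) >= log_likelihood(doc, log_lik_nsf):
            return 'NIPS'
        return 'NSF'

Working in log space from the start also takes care of the underflow issue mentioned above.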
What to put in the report
- For each of the two classes, display the top 50 words that get the highest likelihood.
Intuitively, these are the most "characteristic" words for those classes. (To save space, separate
words by spaces instead of line breaks. Or better yet, do some kind of "tag cloud" display where
the size of the word corresponds to its likelihood).
- For each of the two classes, display the top two test documents that get the highest
likelihood scores. These are the most "characteristic" documents for their classes.
- Report the number of misclassifications for test documents from each of the classes
(that is, how many NIPS documents were misclassified as NSF and vice versa). Hint: this is
a pretty easy text classification problem, so the number of mistakes should be pretty low.
- Display each misclassified document. Do the mistakes make sense?
- (Optional, potentially worth some extra credit) Any additional analysis or data
exploration you think is interesting, for
example, plots of distributions of document likelihoods, relationship between document class
and document length, classification performance with smaller training set sizes, etc. Since this
text classification problem is pretty easy, is there some way to use an even simpler document
model and still do well? For example, can we ignore the word counts and simply look at whether
given words are present or absent? Can we reduce the vocabulary to only the most "informative"
words for each class, and how can we find such words?
Problem 2 (Programming): MDPs and Reinforcement Learning
Consider the following environment, similar to the one in Section 17.1 of the textbook:
[Figure: grid-world maze environment; not reproduced here.]
As in the textbook, the transition model is as follows: the intended outcome occurs
with probability 0.8, and with probability 0.1 each, the agent moves at a right angle to the
intended direction (to its left or right). If the move would make the agent walk into a wall,
the agent stays in the same place as before. The reward for each non-terminal state is -0.04.
- Assuming the known transition model and reward function listed above, find the optimal policy
and the utilities of
all the states using value iteration or policy iteration. Display the optimal policy
and the utilities of all the states, and plot utility estimates as a function of the number of
iterations as in Figure 17.5(a) (for value iteration, you should need no more than 50 iterations to
get convergence). In this question and the next one, use a discount factor of 0.99.
Reference utility values are provided with the assignment materials to help you
check the correctness of your answer. (A value iteration sketch appears after this list.)
- Now consider the reinforcement learning scenario in which the transition model and
the reward function are unknown to the agent, but the agent can observe the outcome of its
actions and the reward received in any state. (In the implementation, this means that the
successor states and rewards will be given to the agent by some black-box functions whose
parameters the agent doesn't have access to.) Use TD Q-learning, as in Section 21.3.2,
to learn an action-utility function. Experiment with different parameters for the exploration function
f as defined in Section 21.3.1, and report which choice works the best. For the learning rate
function, start with the suggestion on p. 837 of the book, i.e., alpha(t) = 60/(59+t), and
play around with the numbers to get the best behavior.
In the report, plot utility estimates and their RMS error as a function of
the number of trials (you will probably need several thousand), as in Figure 21.7. RMS stands
for root mean squared error:
RMS error = sqrt( (1/N) * Σ_s (U'(s) - U(s))^2 ),
where U'(s) is the estimated utility of state s, U(s) is the "true" utility as determined by
value iteration, and N is the number of states. (A Q-learning sketch appears after this list.)
- (Optional; for more extra credit) Design a more complicated maze environment of your own and run
the algorithms from parts 1 and 2 on it. How do the number of states and the complexity of the environment affect
convergence? How complex can you make the environment and still be able to learn the right
policy?
- (Optional; for more extra credit) Experiment with any advanced methods from Chapters 17 and 21 of the
textbook. For example, instead of Q-learning, try to use an explicit state-based representation
and learn the transition function of the environment.
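
For the first part, a minimal value iteration sketch in Python follows. Since the environment
figure is not reproduced above, the grid layout (dimensions, wall, terminal states) is a
stand-in modeled on the 4x3 world of Section 17.1, with 0-indexed (x, y) coordinates;
substitute the actual layout from the assignment figure.

    GAMMA, R_STEP = 0.99, -0.04
    WALLS = {(1, 1)}                               # assumed wall position
    TERMINALS = {(3, 2): 1.0, (3, 1): -1.0}        # assumed terminal rewards
    STATES = [(x, y) for x in range(4) for y in range(3) if (x, y) not in WALLS]
    ACTIONS = {'U': (0, 1), 'D': (0, -1), 'L': (-1, 0), 'R': (1, 0)}
    PERP = {'U': ('L', 'R'), 'D': ('L', 'R'), 'L': ('U', 'D'), 'R': ('U', 'D')}

    def move(s, a):
        # Deterministic effect of one action; stay put on hitting a wall or edge.
        nxt = (s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1])
        return nxt if nxt in STATES else s

    def transitions(s, a):
        # (probability, successor) pairs: 0.8 intended, 0.1 each right angle.
        p1, p2 = PERP[a]
        return [(0.8, move(s, a)), (0.1, move(s, p1)), (0.1, move(s, p2))]

    def value_iteration(n_iters=50):
        U = {s: 0.0 for s in STATES}
        for _ in range(n_iters):
            U = {s: TERMINALS[s] if s in TERMINALS else
                    R_STEP + GAMMA * max(
                        sum(p * U[s2] for p, s2 in transitions(s, a))
                        for a in ACTIONS)
                 for s in STATES}
        return U

The optimal policy can then be read off at each non-terminal state as the action maximizing
sum(p * U[s2] for p, s2 in transitions(s, a)).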
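For the second part, here is a sketch of one TD Q-learning trial, reusing ACTIONS and
TERMINALS from the sketch above. The step(s, a) interface, the start state, and the
exploration constants are assumptions to adapt and tune; the learning rate follows the book's
alpha(t) = 60/(59 + t), with t the visit count of the state-action pair.

    from collections import defaultdict

    def q_learning_trial(Q, Nsa, step, gamma=0.99, r_plus=2.0, n_e=5):
        # One trial of TD Q-learning. step(s, a) is a hypothetical black-box
        # returning (successor, reward); the transition model and reward
        # function behind it are hidden from the agent.
        def f(u, n):
            # Optimistic exploration function (Section 21.3.1): pretend any
            # action tried fewer than n_e times has the large value r_plus.
            return r_plus if n < n_e else u
        s = (0, 0)                                  # assumed start state
        while s not in TERMINALS:
            a = max(ACTIONS, key=lambda b: f(Q[(s, b)], Nsa[(s, b)]))
            Nsa[(s, a)] += 1
            alpha = 60.0 / (59.0 + Nsa[(s, a)])     # learning rate from p. 837
            s2, r = step(s, a)
            nxt = 0.0 if s2 in TERMINALS else max(Q[(s2, b)] for b in ACTIONS)
            # TD update: nudge Q(s, a) toward r + gamma * max_b Q(s2, b).
            Q[(s, a)] += alpha * (r + gamma * nxt - Q[(s, a)])
            s = s2

    # Usage sketch: Q, Nsa = defaultdict(float), defaultdict(int), then call
    # q_learning_trial(Q, Nsa, step) for several thousand trials.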
Problem 3 (Written): 20 Questions
Consider the problem of writing an AI program to play the 20 Questions Game.
Such programs exist, for example, at 20q.net. Write
a detailed proposal of how you would go about implementing such a program. Feel free
to search the Web for information on how such programs may work (and be sure to reference
any outside material you consulted), but do not constrain
yourself to describing an existing solution. Instead, you should try to come up
with your own solution based on what you have learned in this course or from other sources.
Your proposal should include the following aspects:
- The problem domain: In which domain would you want to play the game, and why?
What considerations are important to you in selecting the target domain?
- The model: How would you formulate the problem? What would be the underlying
representation (model structure) used in your program, and what parameters does it have?
- Model instantiation or learning: How would you instantiate the parameters of the
model or the knowledge base that it needs to operate on? If you plan to acquire these parameters
from training data, what is a feasible data source (give specific data collection scenarios or URLs
of possible sources)? How much training data do you think would be
required to learn a good model, and can your proposed source supply this amount?
- Testing or inference: Assuming you have your knowledge base and/or model parameters,
what strategy would you use to ask the user questions and how would you compute the answer?
How would you deal with noise in your knowledge base or unreliable user responses?
- Continuous improvement: How can you use user input to improve performance? Can you
use it to grow the knowledge base, to adjust the model parameters, or to learn a better
question-asking policy?
If you are not certain what the best solution is for some of the above questions,
you can list and discuss several alternatives along with their pros and cons.
For the most part, though, you should try to commit to a single overall solution
and flesh it out as much as possible.
If you don't like the 20 Questions game and want to write a proposal for a different
AI system, please let me know ASAP and we can discuss the possibility individually.
The target length of the writeup should be at least three pages,
single-spaced, 1-inch margins, 11-point font.
The more specific and detailed you are, the more points you will receive.
Problem 4 (Written): Recent Advances in AI
Choose at least three items from the class discussion board on AI in the news/on the Web.
These items should all share the same theme (e.g., autonomous driving or game playing).
If there aren't three items for some topic you are interested in, feel free to post more
until there are enough! Having chosen the items,
write a summary or synthesis report comparing and contrasting them.
In your writeup, try to touch on as many of the following aspects as possible:
- Problem definition: give an agent-based formulation of the overall problem
addressed in your chosen articles or web links. It doesn't have to be a complete "PEAS-style"
specification, but be sure to touch on the most relevant and interesting characteristics
of the problem, the environment, and the performance measure. What are the key technical
challenges in solving this problem?
- Technical solution: give an outline of the solution(s) proposed in the articles
or web links. Be as specific as possible about the technical components of that
solution, including algorithms, hardware, data, etc. If you feel that insufficient
detail is given in the items themselves, try to gather additional information from related
literature or make an educated guess based on your general knowledge and/or what you've
learned from the class. What are the most interesting/revolutionary aspects of the
technical approach? What are the similarities/differences between the different items
you chose?
- Connection to class material: How do the solution components relate to
topics or themes covered in this class? Be as specific and detailed as possible.
- Past context: What accounts for the recent progress made on this problem?
Possible factors may include (but are not limited to) advances in knowledge, changes in
scientific paradigm, increases in computing power, and increasing availability of training data.
List all that are relevant and elaborate on them.
- Future trends: What future developments do you anticipate for this problem/
technology? Is it basically "solved," or if not, what challenges still need to be
overcome? How quickly will progress continue to be made on this problem?
The target length of the writeup should be at least three pages,
single-spaced, 1-inch margins, 11-point font.
The more specific and detailed you are, the more points you will receive.