COSC-270: Artificial Intelligence

Project 5
Spring 2022

Due: T 5/3 @ 11:59 PM ET
9 points

Implement Russell and Norvig's grid world, depicted in Figure 17.1, as an environment, so that an agent can act in it and observe the outcomes but does not have access to the transition function.
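As a rough illustration of what "as an environment" might mean, here is a minimal sketch in Python. It is not the starter code, and the language, the class name GridWorld, and the reset/step interface are all assumptions made for illustration. The layout and dynamics follow the usual description of the 4x3 world (wall at (2,2), terminals at (4,3) and (4,2), a step reward of -0.04, and a 0.8/0.1/0.1 chance of moving in the intended or a perpendicular direction); check them against Figure 17.1 before relying on them. The point is that the transition probabilities live inside step, so the agent sees only states and rewards.

```python
import random


class GridWorld:
    """Sketch of a 4x3 grid world; states are 1-indexed (column, row) pairs."""

    # Hypothetical action names mapped to (dx, dy) offsets.
    ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
    # Directions perpendicular to each intended action.
    SLIPS = {"up": ("left", "right"), "down": ("left", "right"),
             "left": ("up", "down"), "right": ("up", "down")}

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._wall = (2, 2)
        self._terminals = {(4, 3): 1.0, (4, 2): -1.0}
        self._step_reward = -0.04  # assumed reward for non-terminal moves
        self._state = (1, 1)

    def reset(self):
        """Put the agent back in the start state and return that state."""
        self._state = (1, 1)
        return self._state

    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""
        # The transition model is hidden here: 0.8 intended, 0.1 per side.
        roll = self._rng.random()
        if roll < 0.8:
            actual = action
        elif roll < 0.9:
            actual = self.SLIPS[action][0]
        else:
            actual = self.SLIPS[action][1]
        dx, dy = self.ACTIONS[actual]
        x, y = self._state
        nx, ny = x + dx, y + dy
        # Bumping into the wall or the boundary leaves the agent in place.
        if (nx, ny) != self._wall and 1 <= nx <= 4 and 1 <= ny <= 3:
            self._state = (nx, ny)
        if self._state in self._terminals:
            return self._state, self._terminals[self._state], True
        return self._state, self._step_reward, False
```

An agent written against this sketch would call reset once per episode and then call step repeatedly until done is true, never reading the private fields.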

Implement Sutton and Barto's algorithm for Q-learning using an $\epsilon$-greedy strategy. You can find this algorithm in your lecture notes or on page 131 of Sutton and Barto's book Reinforcement Learning: An Introduction.
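For orientation, the following Python sketch shows tabular Q-learning with an $\epsilon$-greedy strategy, using the standard update Q(S,A) <- Q(S,A) + alpha [R + gamma max_a Q(S',a) - Q(S,A)]. It is an illustration under stated assumptions, not the required solution: the Corridor class is a hypothetical stand-in environment rather than the grid world, and the function names, the reset/step interface, and the parameter values (alpha, gamma, epsilon, number of episodes) are all chosen only for the example.

```python
import random
from collections import defaultdict


class Corridor:
    """Hypothetical stand-in environment: states 0..4, reward +1 on reaching 4."""

    ACTIONS = ("left", "right")

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == "right" else -1
        self.state = max(0, self.state + move)  # state 0 is a dead end
        done = self.state == 4
        return self.state, (1.0 if done else 0.0), done


def epsilon_greedy(q, state, actions, epsilon, rng):
    """With probability epsilon explore; otherwise pick a greedy action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(q[(state, a)] for a in actions)
    # Break ties at random so early episodes do not favor one action.
    return rng.choice([a for a in actions if q[(state, a)] == best])


def q_learning(env, actions, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Learn a table of action values from interaction with env alone."""
    rng = random.Random(seed)
    q = defaultdict(float)  # unseen (state, action) pairs start at 0
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            action = epsilon_greedy(q, state, actions, epsilon, rng)
            next_state, reward, done = env.step(action)
            # Terminal states have no successors, so their value is 0.
            if done:
                target = reward
            else:
                target = reward + gamma * max(q[(next_state, a)] for a in actions)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q


q = q_learning(Corridor(), Corridor.ACTIONS)
policy = {s: max(Corridor.ACTIONS, key=lambda a: q[(s, a)]) for s in range(4)}
print(policy)
```

Note that q_learning touches the environment only through reset and step, which is the property the grid world implementation needs to preserve.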

I put starter code for this project on cs-class. You can retrieve it by typing:

cs-class-1$ cp ~maloofm/cosc270/ ./

Instructions for Electronic Submission

In a file named HONOR, provide the following information:

In accordance with the class policies and Georgetown's Honor Code,
I certify that, with the exceptions of the course materials and those
items noted below, I have neither given nor received any assistance
on this project.

When you are ready to submit your project for grading, put your source files, Makefile, and honor statement in a zip file. Upload the zip file to Autolab using the assignment p5. Make sure you remove all debugging output before submitting.

Plan B

If Autolab is down, upload your zip file to Canvas.

Copyright © 2022 Mark Maloof. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed.