COSC 3450: Artificial Intelligence

Project 3
Spring 2024

Due: F 3/22 @ 5 PM ET
10 points

  1. Implement Neapolitan's algorithm for probability propagation in trees. Your implementation must be general, meaning that it must work for any tree-structured Bayesian network.

    I put some starter code on cs-class (aka class-1.cs.georgetown.edu). To get started, log on to cs-class and copy over the following zip file:

    cs-class-1% cd
    cs-class-1% cp ~maloofm/cosc3450/p3.zip ./
    cs-class-1% unzip p3.zip
    
    In the p3 directory, you will find partial class definitions for Drug, Node, and Network.

  2. Test your implementation using the example from lecture involving the patients who participated in a drug study. You can find this code in Drug.java. It instantiates the evidence that the doctor encountered a cured patient and prints the probability that the patient was part of the drug study.

  3. Implement the following network for determining if an employee is an insider as Insider.java. Let
    $a_{0} =$ the employee is not an insider,
    $a_{1} =$ the employee is an insider,
    $b_{0} =$ the employee was not seen using an external drive,
    $b_{1} =$ the employee was seen using an external drive,
    $c_{0} =$ the employee was not caught exfiltrating documents,
    $c_{1} =$ the employee was caught exfiltrating documents,
    $d_{0} =$ the employee did not conduct uncharacteristic document downloads, and
    $d_{1} =$ the employee conducted uncharacteristic document downloads.
    The network and its probabilities before initialization are as follows:
    After initialization, the state of the network is as follows:
    After instantiating that the employee was seen using an external drive, we have the following network:
    Note that this network is modeled after Neapolitan's network in Section 6.2.2.

  4. For both networks, the main method should print the entire state of all of the nodes in the network before initialization, after initialization, and after instantiation.
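
One way to sanity-check the numbers your implementation prints, independent of the propagation algorithm itself, is to compute a posterior directly with Bayes' rule on a small tree. The following is a minimal sketch under stated assumptions, not part of the starter code: the class name BruteForceCheck, its method names, and every probability in it are hypothetical and are not the values for the Drug or Insider networks. It assumes a root A with two binary children B and C, where B is instantiated to b1.

```java
public class BruteForceCheck {

    /**
     * P(a1 | b1) by Bayes' rule for a binary root A with binary child B.
     * All arguments are hypothetical illustration values, not course data.
     */
    static double posterior(double priorA1, double pB1GivenA1, double pB1GivenA0) {
        double joint1 = pB1GivenA1 * priorA1;          // P(b1, a1)
        double joint0 = pB1GivenA0 * (1.0 - priorA1);  // P(b1, a0)
        return joint1 / (joint1 + joint0);             // normalize by P(b1)
    }

    /**
     * P(c1 | b1) for a sibling C of B. In a tree, C is independent of B
     * given their common parent A, so we sum over the values of A.
     */
    static double siblingPosterior(double postA1, double pC1GivenA1, double pC1GivenA0) {
        return pC1GivenA1 * postA1 + pC1GivenA0 * (1.0 - postA1);
    }

    public static void main(String[] args) {
        // Hypothetical probabilities chosen only for illustration.
        double priorA1 = 0.1;
        double pB1GivenA1 = 0.7, pB1GivenA0 = 0.2;
        double pC1GivenA1 = 0.6, pC1GivenA0 = 0.05;

        double postA1 = posterior(priorA1, pB1GivenA1, pB1GivenA0);
        double postC1 = siblingPosterior(postA1, pC1GivenA1, pC1GivenA0);

        System.out.printf("P(a1 | b1) = %.4f%n", postA1);
        System.out.printf("P(c1 | b1) = %.4f%n", postC1);
    }
}
```

With these made-up inputs, the two posteriors work out to about 0.28 and 0.204. If you substitute a network's actual probabilities, the values should agree with what your propagation code reports after instantiation, up to floating-point rounding.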

Use of AI tools for code generation, refinement, and debugging

Per the course policy stated in the syllabus, you are free to use generative AI tools for code generation, refinement, and debugging provided that you include with your submission a link to or a transcript of your session that shows your queries. You must comply with the system's terms of use (or service), Georgetown's Acceptable Use Policy, any applicable licensing terms, and copyright law. Whew!

OpenAI's terms of use for their systems including ChatGPT stipulate that users “must own their queries.” To me, this seems like a sound guiding principle for the use of large language models for this class. The challenge is, you and I do not own much of the course material. The material in the textbook is protected by copyright. From the preface:

Copyright © 2021, 2010, 2003 by Pearson Education, Inc. or its affiliates, 221 River Street, Hoboken, NJ 07030. All Rights Reserved. Manufactured in the United States of America. This publication is protected by copyright, and permission should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise. For information regarding permissions, request forms, and the appropriate contacts within the Pearson Education Global Rights and Permissions department, please visit www.pearsoned.com/permissions/

As you can see, to transmit material from the book to a system such as ChatGPT, we would have to obtain permission prior to that use. I am fairly certain they would not grant us permission. Heck! They don't even produce an electronic version!

The algorithms in my lecture notes are based primarily on the textbook's. Consequently, they also are protected by the publisher's copyright. I have made my own modifications, but since they are minor modifications, the algorithms in my lecture notes are derivative works. It would be difficult for me to argue that they are new works. Crucially, I cannot use these algorithms as queries to ChatGPT because it would violate OpenAI's terms of use, Georgetown's Acceptable Use Policy, and copyright law.

One could (try to) argue that we can submit information from the book to ChatGPT under the fair-use provision of the copyright law. In order for the fair-use provision to apply there must be a transformative use. What constitutes a transformative use is complex, and I am not a lawyer, but I am fairly certain that copying material from the textbook or my lecture notes and submitting it to ChatGPT is not a transformative use. Furthermore, OpenAI's stipulation that you “own” your queries stymies the argument that a query consisting of someone else's intellectual property is transformative. How much transformation would be required before their intellectual property becomes yours?

The textbook's authors are nice enough to distribute the algorithms for the book without a fee, but these algorithms are protected by copyright or by the Creative Commons Attribution 4.0 International License. While the so-called CC BY 4.0 is much less restrictive than the protections afforded by copyright law, it does not grant ownership. It is designed primarily to give credit to the originator through modifications and redistribution of the work.

Given all of this, I think the best way to proceed with using AI tools for the class assignments is to paraphrase. Rather than copying protected material such as an algorithm and then pasting it into an AI tool, you should learn how the algorithm works and then query the AI tool based on your understanding. For example, you can ask ChatGPT to generate an implementation of the A* algorithm in Java. ChatGPT and other large language models were trained on all the algorithms and implementations that we are studying. You can use ChatGPT's implementation to help you understand the algorithm from class. You can use the algorithm from class to help you understand ChatGPT's implementation, including its possible flaws. With this understanding, you can use ChatGPT to refine its algorithm, or you can refine the implementation yourself.

Ultimately, I think you want to use your brain to devise queries based on your understanding of the material. This way the majority of the flow of information will be from ChatGPT to you and your project solution. While this may seem like a convoluted process, it should help you avoid breaking any rules.

Finally, as I said in class, this is new territory for me, and I am eager to learn from your experiences and perspectives. If you have other ideas for how to use AI tools for software development while navigating around the legal issues, it should make for an interesting class discussion.

Instructions for Electronic Submission

In a file named HONOR, provide the following information:
Name
NetID

In accordance with the class policies and Georgetown's Honor Code,
I certify that, with the exceptions of the course materials and those
items noted below, I have neither given nor received any assistance
on this project.

When you are ready to submit your project for grading, put your source files, Makefile, and honor statement in a zip file named submit.zip. Upload the zip file to Autolab using the assignment p3. Make sure you remove all debugging output before submitting.

Plan B

If Autolab is down, upload your zip file to Canvas.

Copyright © 2024 Mark Maloof. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed.