GUCL: Computation and Language @ Georgetown

Courses

Overview of CL course offerings (April 2025)
Document listing courses in CS, Linguistics, and other departments that are most relevant to students interested in computational linguistics.

COSC-3470 | Deep Learning

Sarah Bargal Upperclass Undergraduate

This course will focus on building state-of-the-art systems at the intersection of deep learning and computer vision. Students will be introduced to deep architectures and learning algorithms for various discriminative and generative computer vision tasks. The course will demonstrate how such tasks serve as the main building blocks in processing images and videos for applications such as self-driving cars, healthcare, surveillance, and human-computer interfaces.

COSC/LING-4467 | Speech & Audio Processing with Deep Neural Networks

Joe Garman Upperclass Undergraduate & Graduate

This course covers modern deep learning approaches for speech recognition, synthesis, and audio processing. Students learn PyTorch implementation of neural architectures, from foundational networks to state-of-the-art transformer models. Topics include basic text processing, audio feature extraction, automatic speech recognition, text-to-speech synthesis, and audio/music generation. The course emphasizes hands-on experience through weekly programming assignments using PyTorch. Prior programming experience in Python required; no previous signal processing or deep learning experience assumed. Designed for computational linguistics and computer science graduate students or advanced undergraduates.

COSC-5480 | Large Language Models

Grace Hui Yang Graduate

This course delves deep into the intricacies of Large Language Models (LLMs), offering students an understanding of their design, implementation, and applications. Beginning with foundational architectures such as transformers and attention mechanisms, students will journey through the evolution from early fundamental models to contemporary marvels like GPT-3, ChatGPT, and GPT-4. The course aims to provide a comprehensive overview of the historical and current state of LLMs, equipping students with the knowledge to design, train, and fine-tune LLMs for custom applications. It will also encourage critical discussions on the ethical, societal, and technical challenges associated with LLMs. Key topics covered in the course include (1) Foundations: Review of RNNs, LSTMs, Attention Mechanisms, and Transformers. (2) Architectural Deep Dive: Behind the design of GPT-3, BERT, and other leading models. (3) Training Paradigms: Techniques and challenges in training massive models. (4) Applications: chatbots, content generation, recommendation systems, and beyond. (5) Societal Impact: Ethical considerations, fairness, and bias in LLMs. (6) Technical Challenges: Model explainability, controllability, and safety concerns. (7) Future Directions: Where LLMs are headed and emerging research areas. The course assessments consist of monthly assignments involving practical implementations and model evaluations, exams covering theoretical and applied concepts, and an optional final project focusing on designing a custom application utilizing LLMs. Class participation and critical discussion sessions are also important components of student assessment.

COSC-5540 | Text Mining & Analysis

Nazli Goharian Graduate

This course covers various concepts and research areas in text search and mining. The structure of the course is a combination of lectures and student presentations. The lectures will cover search technologies, classification, text summarization, and opinion and sentiment mining, with applications across varying domains and formats, including scientific, health, and social media text. Students are assigned a related topic in the field for further study, implementation, experimentation, and in-class presentation.

COSC-8405 | Seminar in NLP

Nathan Schneider Graduate: Doctoral [2 credits]

This course will expose students to current research in natural language processing and computational linguistics. Class meetings will consist primarily of student-led reading discussions, supplemented occasionally by lectures or hands-on activities. The subtopics and reading list will be determined at the start of the semester; readings will consist of research papers, advanced tutorials, and/or dissertations.

Requirements: Familiarity with NLP using machine learning methods (for example satisfied by COSC-5402, Empirical Methods in NLP)

LING-2040/4400 | Computational Language Processing (a.k.a. Introduction to Natural Language Processing)

Amir Zeldes Undergraduate & Graduate

This course will introduce students to the basics of Natural Language Processing (NLP), a field that combines linguistics and computer science to produce applications, such as generative AI, that are profoundly impacting our society. We will cover a range of topics that form the basis of these exciting technological advances and will provide students with a platform for future study and research in this area. We will learn to implement simple representations such as finite-state techniques, n-gram models, and topic models in the Python programming language. Previous knowledge of Python is not required, but students should be prepared to invest the necessary time and effort to become proficient over the semester. Students who take this course will gain a thorough understanding of the fundamental methods used in natural language understanding, along with an ability to assess the strengths and weaknesses of natural language technologies based on these methods.
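
The n-gram models mentioned above illustrate how approachable the course's starting points are. As a purely illustrative sketch (not course material), a maximum-likelihood bigram model fits in a few lines of Python:

```python
from collections import Counter

def train_bigram(tokens):
    """Count bigram and unigram (context) frequencies from a token list."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    unigrams = Counter(tokens[:-1])
    return bigrams, unigrams

def bigram_prob(bigrams, unigrams, w1, w2):
    """Maximum-likelihood estimate of P(w2 | w1)."""
    if unigrams[w1] == 0:
        return 0.0
    return bigrams[(w1, w2)] / unigrams[w1]

tokens = "the cat sat on the mat".split()
bigrams, unigrams = train_bigram(tokens)
print(bigram_prob(bigrams, unigrams, "the", "cat"))  # 0.5: "the" is followed by "cat" once out of two occurrences
```

Real models add smoothing for unseen word pairs, one of the refinements a course like this covers.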

LING-4401/DSAN-5400 | Computational Linguistics with Advanced Python

Trevor Adriaanse Upperclass Undergraduate & Graduate

This course presents topics in Natural Language Processing (NLP) and Python programming for both text processing and analysis. The goal of this class is to explore both classical and modern techniques in NLP, with emphasis on hands-on application. We will examine topics such as text classification, model evaluation, nearest neighbors, and distributed representations. Applications include authorship identification, structured prediction, and semantic textual similarity, to name a few.

Programming topics include Python best practices, scientific computing libraries (e.g., NumPy, sklearn, etc.), exception handling, object-oriented programming, and more. By the end of this course, students will be able to program proficiently in Python, with enough comfort to reference software documentation and pseudocode to write sophisticated programs from scratch.

Requirements: Basic Python programming skills are required (for example satisfied by LING-4400, Computational Language Processing/Intro to NLP)
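
Model evaluation, one of the topics listed above, can be illustrated with a short stdlib-only sketch (the function name and toy labels are invented for illustration, not taken from the course):

```python
def precision_recall_f1(gold, pred, positive=1):
    """Compute precision, recall, and F1 for one class from parallel label lists."""
    tp = sum(1 for g, p in zip(gold, pred) if g == p == positive)   # true positives
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = [1, 1, 0, 1, 0]
pred = [1, 0, 0, 1, 1]
print(precision_recall_f1(gold, pred))  # each value is 2/3 on this toy data
```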

LING-4424 | All About Prepositions

Nathan Schneider Upperclass Undergraduate & Graduate

This course will take on the grammatical category of prepositions, which are hands-down some of the most intriguing and beguiling words once you get to know them. (How many prepositions are there in the previous sentence? The answer may surprise you!) We will look at their syntactic and semantic versatility in English and how they vary across languages. We will explore how they denote relations in space and time, as well as many other kinds of meanings. We will see why they are so hard to learn in a second language, and why they are difficult to define in dictionaries and teach to computers. The course will be project-based, including a significant project on a language other than English.

Prerequisites: Some background in syntactic description, e.g. satisfied by LING-2020, LING-4427, or LING-5127

LING-4427 | Computational Corpus Linguistics

Amir Zeldes Upperclass Undergraduate & Graduate

Digital linguistic corpora, i.e. electronic collections of written, spoken or multimodal language data, have become an increasingly important source of empirical information for theoretical and applied linguistics in recent years. This course is meant as a theoretically founded, practical introduction to corpus work with a broad selection of data, including non-standardized varieties such as language on the Internet, learner corpora and historical corpora. We will discuss issues of corpus design, annotation and evaluation using quantitative methods and both manual and automatic annotation tools for different levels of linguistic analysis, from parts-of-speech, through syntax to discourse annotation. Students in this course participate in building the corpus described here: https://corpling.uis.georgetown.edu/gum/

LING-4480 | Computational Linguistics Research Methods

Ethan Wilcox Upperclass Undergraduate & Graduate

Computational Linguistics is a fast-growing and fast-moving field. This course is intended to give advanced undergraduate and graduate students practice conducting original research in computational linguistics and to enhance their research and communication skills. It will serve as a platform for students to pursue an independent research project with guidance and oversight from faculty and peers. Students will be expected to bring their own pre-existing research topics/questions to the class. Over the course of the semester, they will select and present key research papers pertinent to their topic, and develop the project with the goal of writing an ACL-style conference proceedings paper. In addition to hands-on research, this class will provide a venue for students to learn CL-related skills that often fall through the cracks of other, content-focused courses. Possible workshop topics include data annotation, LaTeX, and working with pretrained language models, as well as communication skills such as poster and slide design. As a hands-on course whose content changes based on the instructor and students, this course can be repeated for credit.

DSAN-5800 | Advanced NLP

Chris Larson Graduate

This course provides a formalism for understanding the statistical machine learning methods that have come to dominate natural language processing. Divided into three core modules, the course explores (i) how language understanding is framed as a tractable statistical inference problem, (ii) a formal yet practical treatment of the DNN architectures and learning algorithms used in NLP, and (iii) how these components are leveraged in modern AI systems such as information retrieval, recommender systems, and conversational agents. In exploring these topics, the course exposes students to the foundational math, practical applications, current research directions, and software design that are critical to gaining proficiency as an NLP/ML practitioner. The course culminates in a capstone project, conducted over its final six weeks, in which students apply NLP to an interesting problem of their choosing. In past semesters students have built chatbots, code completion tools, and stock trading algorithms, to name a few. This course assumes a basic understanding of linear algebra, probability theory, and first-order optimization methods, as well as proficiency in Python.

This is an advanced course. Suggested prerequisites are DSAN 5000, DSAN 5100 and DSAN 5400. However, first-year students with the necessary math, statistics, and deep learning background will be considered.

ICOS-7710 | Cognitive Science Core Course

Abigail Marsh & Elissa Newport Graduate

A seminar in which important topics in cognitive science are taught by participating Georgetown faculty from the main and medical campuses. Required for the Cognitive Science concentration, available for Ph.D. students in other programs with instructor permission. (Can be taken more than once for credit.)

COSC-3450 | Artificial Intelligence

Mark Maloof Undergraduate

Artificial Intelligence (AI) is the branch of computer science that studies how to program computers to reason, learn, perceive, and understand. The lecture portion of the class surveys basic and advanced concepts and techniques of artificial intelligence, including search, knowledge representation, automated reasoning, uncertain reasoning, and machine learning. Specific topics include symbolic computing, state-space search, game playing, theorem proving, rule-based systems, Bayesian networks, probability estimation, rule induction, Markov decision processes, reinforcement learning, and ethical and philosophical issues. Applications of artificial intelligence are also discussed in domains such as medicine and computer security. Students complete midterm and final exams, and five programming projects using the Java programming language.

COSC-3590 | Data Mining

Nazli Goharian Upperclass Undergraduate

This course covers concepts and techniques in the field of data mining, including supervised and unsupervised algorithms such as naive Bayes, neural networks, decision trees, rule-based classifiers, distance-based learners, clustering, and association rule mining. Various issues in pre-processing the data are addressed, and text classification, social media mining, and recommender systems are covered. Students learn the material by building various data mining models, applying data pre-processing techniques, performing experimentation, and analyzing the results.

COSC-3440 | Deep Reinforcement Learning

Grace Hui Yang Undergraduate

Deep reinforcement learning is a machine learning area that learns how to make optimal decisions from interacting with an environment using deep neural networks. An intelligent agent observes the consequences of its actions in the environment and alters its behavior to maximize the expected return. We study algorithms and applications in deep reinforcement learning. Topics include deep neural networks, Markov decision processes, policy gradient methods, Q-learning (DQN), actor-critic methods, imitation learning, and other advanced topics. The course has lectures, readings, programming assignments, and exams.
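
The course covers deep Q-learning (DQN); the update rule at its core can be illustrated in tabular form on a toy chain environment (an illustrative sketch with made-up environment and hyperparameters, not course material):

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a toy chain: move left/right, reward 1 at the rightmost state."""
    random.seed(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # Q[state][action], actions: 0=left, 1=right
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action selection
            a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda x: Q[s][x])
            s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Q-learning update: bootstrap from the best next-state value
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning_chain()
# After training, moving right should have higher value in every non-terminal state
print(all(Q[s][1] > Q[s][0] for s in range(4)))
```

DQN replaces the table with a neural network that approximates Q-values, but the same bootstrapped update drives learning.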

LING-2040/4400 | Computational Language Processing (a.k.a. Introduction to Natural Language Processing)

Ethan Wilcox Undergraduate & Graduate

This course will introduce students to the basics of Natural Language Processing (NLP), a field that combines linguistics and computer science to produce applications, such as generative AI, that are profoundly impacting our society. We will cover a range of topics that form the basis of these exciting technological advances and will provide students with a platform for future study and research in this area. We will learn to implement simple representations such as finite-state techniques, n-gram models, and topic models in the Python programming language. Previous knowledge of Python is not required, but students should be prepared to invest the necessary time and effort to become proficient over the semester. Students who take this course will gain a thorough understanding of the fundamental methods used in natural language understanding, along with an ability to assess the strengths and weaknesses of natural language technologies based on these methods.

LING-4431 | LLMs for Computational Linguistics

Ethan Wilcox Upperclass Undergraduate & Graduate

Large language models (LLMs) are the foundational technology behind today’s most advanced artificial intelligence systems. They have revolutionized the field of natural language processing and challenged traditional ideas about how language is represented in the human mind. This course offers an introduction to LLMs from the perspective of computational linguistics. Through lectures and hands-on demonstrations, we will explore three interrelated questions: How do LLMs work at a technical level? How can they be used to process natural language data? And how can they be used to model human linguistic cognition? The first half of the course will cover technical foundations of LLMs, including the transformer architecture, tokenization, interpretability techniques, and scaling. The second half of the course will focus on LLMs’ applications in natural language processing, linguistics, and cognitive science. Topics will include security and privacy, ethical issues, multilingualism, and LMs as models of human language processing and acquisition. This course is appropriate for advanced undergraduates and graduate students. Students will gain experience training (small) language models, implementing basic interpretability techniques, and reading recent research papers in the area. Knowledge of programming in Python, and basic math and statistics, are required as a prerequisite.

COSC/LING-5402 | Empirical Methods in Natural Language Processing

Nathan Schneider Graduate

Systems of communication that come naturally to humans are thoroughly unnatural for computers. For truly robust information technologies, we need to teach computers to unpack our language. Natural language processing (NLP) technologies facilitate semi-intelligent artificial processing of human language text. In particular, techniques for analyzing the grammar and meaning of words and sentences can be used as components within applications such as web search, question answering, and machine translation.

This course introduces fundamental NLP concepts and algorithms, emphasizing the marriage of linguistic corpus resources with statistical and machine learning methods. As such, the course combines elements of linguistics, computer science, and data science. Coursework will consist of lectures, programming assignments (in Python), and a final team project. The course is intended for students who are already comfortable with programming and have some familiarity with probability theory.

LING-5444 | Machine Learning for Language Data

Amir Zeldes Graduate

In the past few years, the advent of abundant computing power and data has catapulted machine learning to the forefront of a number of fields of research, including Linguistics and especially Natural Language Processing. At the same time, general machine learning toolkits and tutorials make handling ‘default cases’ relatively easy, but are much less useful in handling non-standard data, less studied languages, low-resource scenarios and the need for interpretability that is essential for drawing robust inferences from data. This course gives a broad overview of the machine learning techniques most used for text processing and linguistic research. The course is taught in Python, covering both general statistical ML algorithms, such as linear models, SVMs, decision trees and ensembles, and current deep learning models, such as deep neural net classifiers, recurrent networks and contextualized continuous meaning representations. The course assumes good command of Python (ability to implement a program from pseudo-code) but does not require previous experience with machine learning.

Requirements: Intermediate Python (courses such as LING-4401: Computational Linguistics with Advanced Python provide a good preparation)
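
As a taste of the "general statistical ML algorithms" the course covers, a vanilla perceptron classifier over bag-of-words features can be sketched in pure Python (toy data and function names are invented for illustration):

```python
from collections import defaultdict

def perceptron_train(examples, epochs=10):
    """Train a vanilla perceptron with bag-of-words features; labels are +1 or -1."""
    w = defaultdict(float)
    for _ in range(epochs):
        for text, label in examples:
            score = sum(w[tok] for tok in text.split())
            if label * score <= 0:        # mistake (or tie): nudge weights toward the label
                for tok in text.split():
                    w[tok] += label
    return w

def predict(w, text):
    return 1 if sum(w[tok] for tok in text.split()) > 0 else -1

examples = [("good great fun", 1), ("bad awful boring", -1),
            ("great movie", 1), ("boring plot", -1)]
w = perceptron_train(examples)
print(predict(w, "great fun"), predict(w, "awful plot"))  # 1 -1
```

The same mistake-driven intuition carries over to the neural models covered later in the course, where updates follow gradients instead of raw feature counts.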

LING-8415 | Computational Discourse Models

Amir Zeldes Graduate

Recent years have seen an explosion of computational work on higher level discourse representations, such as entity recognition, mention and coreference resolution and shallow discourse parsing. At the same time, the theoretical status of the underlying categories is not well understood, and despite progress, these tasks remain very much unsolved in practice. This graduate level seminar will concentrate on theoretical and practical models representing how referring expressions, such as mentions of people, things and events, are coded during language processing. We will begin by exploring the literature on human discourse processing in terms of information structure, discourse coherence and theories about anaphora, such as Centering Theory and Alternative Semantics. We will then look at computational linguistics implementations of systems for entity recognition and coreference resolution and explore their relationship with linguistic theory. Over the course of the semester, participants will implement their own coding project exploring some phenomenon within the domain of entity recognition, coreference, discourse modeling or a related area.

ICOS-7712 | Cognitive Science Seminar

Abigail Marsh Graduate

A seminar in which graduate students and faculty interested in the cognitive sciences will read and discuss prominent articles across our fields. Can be repeated for credit.