Courses at Georgetown
- Empirical Methods in Natural Language Processing.
Graduate-level survey of NLP.
LING/COSC-5402: Spring 2024
LING/COSC-572: Fall 2016, Spring 2018, Spring 2019, Spring 2020, Spring 2021, Spring 2022, Spring 2023
Synopsis
Systems of communication that come naturally to humans are thoroughly unnatural for computers. For truly robust information technologies, we need to teach computers to unpack our language. Natural language processing (NLP) technologies facilitate semi-intelligent artificial processing of human language text. In particular, techniques for analyzing the grammar and meaning of words and sentences can be used as components within applications such as web search, question answering, and machine translation.
This course introduces fundamental NLP concepts and algorithms, emphasizing the marriage of linguistic corpus resources with statistical and machine learning methods. As such, the course combines elements of linguistics, computer science, and data science. Coursework will consist of lectures, programming assignments (in Python), and a final team project. The course is intended for students who are already comfortable with programming and have some familiarity with probability theory.
- All About Prepositions.
Upper undergraduate/graduate deep dive into the mysterious world of English prepositions and similar words in other languages.
LING-424: Fall 2022
This course will take on the grammatical category of prepositions, which are hands-down some of the most intriguing and beguiling words once you get to know them. (How many prepositions are there in the previous sentence? The answer may surprise you!) We will look at their syntactic and semantic versatility in English and how they vary across languages. We will explore how they denote relations in space and time, as well as many other kinds of meanings. We will see why they are so hard to learn in a second language, and why they are difficult to define in dictionaries and teach to computers. The course will be project-based, including a significant project on a language other than English.
Prerequisites: Some background in syntactic description, e.g. satisfied by LING-224, LING-427, or LING-367
- Doctoral Seminar on Natural Language Processing.
Reading seminar on current NLP research.
COSC-872: Spring 2019, Fall 2021
Synopsis
This course will expose students to current research in natural language processing and computational linguistics. Class meetings will consist primarily of student-led reading discussions, supplemented occasionally by lectures or hands-on activities. The subtopics and reading list will be determined at the start of the semester; readings will consist of research papers, advanced tutorials, and/or dissertations.
- Advanced Semantic Representation.
In-depth graduate-level exploration of representations, data, and algorithms for sentence semantics.
LING/COSC-672: Fall 2020, Spring 2022
This course will examine semantic representations for natural language from a computational/NLP perspective. Through readings, presentations, discussions, and hands-on exercises, we will put a semantic representation under the microscope to assess its strengths and weaknesses. For each representation we will confront questions such as: What aspects of meaning are and are not captured? How well does the representation scale to the large vocabulary of a language? What assumptions does it make about grammar? How language-specific is it? In what ways does it facilitate manual annotation and automatic analysis? What datasets and algorithms have been developed for the representation? What has it been used for? Representations covered in depth will include FrameNet (http://framenet.icsi.berkeley.edu), Universal Conceptual Cognitive Annotation (http://www.cs.huji.ac.il/~oabend/ucca.html), and Abstract Meaning Representation (http://amr.isi.edu/). Term projects will consist of (i) innovating on a representation's design, datasets, or analysis algorithms, or (ii) applying it to questions in linguistics or downstream NLP tasks.
LING/COSC-672: Fall 2018
Same description as above, except that in Fall 2018 the focus will be on Universal Conceptual Cognitive Annotation (UCCA); its relationship to other representations in the literature will also be considered.
LING/COSC-672: Spring 2017
Same description as above, except that in Spring 2017 the focus will be on the Abstract Meaning Representation (AMR); its relationship to other representations in the literature will also be considered.
- Cognitive Grammar.
Cognitive-functionalist accounts of how grammar structures meaning.
LING-485: Spring 2018 (with Lourdes Ortega)
The work of cognitive linguists falls under the general category of functional approaches to linguistic structure. Theories of grammar in Cognitive Linguistics emphasize the centrality of meaning and communicative context, and the role of cognitive processes such as memory and attention. This course will elucidate the major themes and core concepts of these theories, including Langacker’s Cognitive Grammar; categorization and prototype theory; frame semantics; Construction Grammar; metaphor and metonymy; and mental spaces and blending. Particular emphasis will be placed on applying these theories to analyze sentences in terms of both form and meaning. Implications for linguistic typology, first and second language acquisition, computational linguistics, language pedagogy, and discourse and ideology will also be explored, depending on the interests of the students taking the class.
- Algorithms for Natural Language Processing.
An introduction to NLP for undergraduates who are experienced programmers.
LING/COSC-272: Fall 2017
Human language technologies increasingly help us to communicate with computers and with each other. But every human language is extraordinarily complex, and the diversity seen in languages of the world is massive. Natural language processing (NLP) seeks to formalize and unpack different aspects of a language so computers can approximate human-like language abilities. In this course, we will examine the building blocks that underlie a human language such as English (or Japanese, Arabic, Tamil, or Navajo), and fundamental algorithms for analyzing those building blocks in text data, with an emphasis on the structure and meaning of words and sentences. Students will implement a variety of core algorithms for both rule-based and machine learning methods, and learn how to use computational linguistic datasets such as lexicons and treebanks. Text processing applications such as machine translation, information retrieval, and dialogue systems will be introduced as well.
This course is designed for undergraduates who are comfortable with the basics of discrete probability and possess solid programming skills, including the ability to use basic data structures and familiarity with regular expressions. COSC-160: Data Structures is the prerequisite for CS students, and LING-001 is the prerequisite for Linguistics students. Students who are new to programming or need a refresher are directed to LING-362: Introduction to NLP. The languages of instruction will be English and Python.
- Corpus Linguistics.
Summer 2017 (2017 Linguistic Institute, Lexington, KY, July 5–August 1) (with Amir Zeldes) [slides]
Corpus data is essential to many approaches to linguistics, including usage-based approaches to grammar, variationist sociolinguistics, and historical linguistics. Corpus building and evaluation have advanced tremendously over the past two decades, but the barriers to constructing one’s own corpus can be daunting: annotation interfaces are difficult to learn, Natural Language Processing tools can be highly complex to work with, and handling data requires more than basic computer skills. In this hands-on course we will learn to apply corpus methods to a dataset created during the course itself, focusing on the growing and challenging domain of social media. We will learn practical annotation schemes and consider how design choices impact our subsequent evaluation as we build and explore a small example corpus together.
- Foundations of Natural Language Processing.
INFR09028: Spring 2016 (University of Edinburgh School of Informatics) (with Sharon Goldwater)
This course covers some of the linguistic and algorithmic foundations of natural language processing. It builds on the material introduced in Informatics 2A and aims to equip students for more advanced NLP courses in years 3 or 4. The course is strongly empirical, using corpus data to illustrate both core linguistic concepts and algorithms, including language modeling, part-of-speech tagging, syntactic processing, the syntax-semantics interface, and aspects of semantic processing. Linguistic and algorithmic content will be interleaved throughout the course.
Other Courses
Scientific Tutorials
These are listed on the publications page.