Dr. Nazli Goharian

Text Mining and Analysis

Course Description:

After an introduction to information retrieval and data mining, the lectures will cover text classification, clustering, information extraction, opinion mining and sentiment analysis, and summarization. Various supervised and unsupervised methodologies that can be applied to these areas will be covered. Students will present and discuss research literature on various related topics (e.g., e-discovery, question answering, complex answers, social media health surveillance, etc.). This includes a review of the current literature (a survey) on a specific topic, the details of a few papers related to the survey, and the experimental results and evaluation of the students' own proposed approach.


Graduate students who are comfortable with programming and with presentation and discussion; undergraduate students with permission only.


Course handouts for most topics covered in class are available on the class Forum.

Office hours:

Will be announced on the class Forum.

Tasks & Grading (tentative; will be finalized by the first day of class):

Exam(s): 1-2 exams: 35%

Literature Survey and Presentations: 25%. Presentation of multiple papers assigned to students on the topic: provide an introduction and motivation for the topic; list the issues in this research area; provide a general overview or taxonomy of the existing approaches to these issues; describe the main papers (approaches); identify the shortcomings of the existing approaches. (Details will be given in class and posted on the class Forum.)

Proposal and evaluation of proposed approach: 40%. Based on the literature survey, the research questions are formulated. Students (in groups of 1-3) work on implementing baselines and their own improvements to existing methods, and provide an analysis of the results. A proposed approach may be changed while students are working on the problem; however, the change and its cause should be documented. Students must show how they validated the approach and results, either by a theoretical proof or empirically, in which case they must explain in detail the dataset, experimental plan, evaluation metrics, results, and analysis of the results. Students will present this work in class and write a detailed report. A demo may be required.

Academic Integrity:

Visit the Honor System Website at http://gervaseprograms.georgetown.edu/honor/