This is a seminar-style course devoted to recent research on statistical techniques for the automatic analysis of natural (human) language data. The instructor will give some lectures on the fundamentals and ask students to present topical papers. The topics covered in this course are: syntactic language models and large-scale distributed language models, search algorithms, statistical machine translation, and non-parametric Bayesian models/processes (Dirichlet, Pitman-Yor, Indian Buffet, etc.) for natural language processing.
Time: Monday/Wednesday 8:00 pm - 9:15 pm; Location: Medical Sciences 220
387, Joshi Center
Office hours: Tuesday/Thursday 2:30PM-4:00PM
Statistical Machine Translation
Cambridge University Press, 2010.
D. Jurafsky and J. Martin.
Speech and Language Processing, 2nd Edition
Prentice Hall, 2008.
Combinatorial Stochastic Processes
Paper Presentations. These should be done individually. You will read one or more papers and give a 45-minute class presentation. When someone else gives a presentation, you need to do three things: read the papers being presented, write reviews of those papers, and give a grade on the presentation.