CS707 Information Retrieval

Course Objective

Course Prerequisite

Course Description

This course covers models for information retrieval, techniques for indexing and searching, and algorithms for classification and clustering. Time permitting, it also covers support vector machines (SVM), latent semantic indexing, link analysis and ranking, and the MapReduce architecture and Hadoop, in varying degrees of detail.

Course Load

The course load includes a programming project (30 pts), a midterm exam (30 pts), and a final exam (40 pts).

Required Texts

Recommended Texts

Reference URLs


A letter grade (A/B/C/D/F) will be assigned at the end of the course.

Tentative Class Schedule and Syllabus

  Topics Addl. Reading
Class 1 Information Retrieval; The Boolean Model MIR-1
Class 2 The Vector Space Model : Term Weighting and Scoring IIR-6, MIR-2 
Class 3 Inverted Index Construction IIR-1, MIR-8.2
Class 4 Dictionary and Postings; Query Processing IIR-2, MIR-7.2
Class 5 Tolerant Retrieval (B-Trees) IIR-3
Class 6 Index Construction IIR-4, MG-5
Class 7 Map Reduce Architecture Hadoop
Class 8 Index Compression IIR-5, MG 3.3-4
Class 9 Vector Space Model : TF-IDF IIR-6.2-4
Class 10

Midterm Exam (Feb 2)

Class 11 Vector Space Model : Ranking Revisited IIR-6.1, IIR-7
Class 12 Evaluation in Information Retrieval IIR-8, MIR-3
Class 13 Relevance Feedback and Query Expansion IIR-9, MIR-5.2-4
Class 14 Text Classification and Naive Bayes IIR-13
Class 15 Vector Space Classification IIR-14
Class 16 Support Vector Machines IIR-15, Primer
Class 17 Flat and Hierarchical Clustering IIR-16, IIR-17
Class 18 Latent Semantic Indexing IIR-18, Refs
Class 19 Linear Algebra: Matrix Decompositions SVD-URL
Class 20 Wrap-Up


Class * Web Characteristics


Class * Web Search: Crawling and Indexes


Class * Link Analysis



Final Exam (5:45pm-7:45pm, March 13)



Assignments (Winter 2011/2012)

Exams (Winter 2012)