CS60092: Information Retrieval
CS60092 | |
---|---|
Course name | Information Retrieval |
Offered by | Computer Science & Engineering |
Credits | 3 |
L-T-P | 3-0-0 |
Previous Year Grade Distribution | |
Semester | Spring |
Syllabus
Syllabus mentioned in ERP
Introduction to Information Retrieval: The nature of unstructured and semistructured text. Inverted index and Boolean queries.
Text Indexing, Storage and Compression: Text encoding: tokenization, stemming, stop words, phrases, index optimization. Index compression: lexicon compression and postings list compression. Gap encoding, gamma codes, Zipf's Law. Index construction. Postings size estimation, merge sort, dynamic indexing, positional indexes, n-gram indexes, real-world issues.
Retrieval Models: Boolean, vector space, TF-IDF, Okapi, probabilistic, language modeling, latent semantic indexing. Vector space scoring. The cosine measure. Efficiency considerations. Document length normalization. Relevance feedback and query expansion. Rocchio.
Performance Evaluation: Evaluating search engines. User happiness, precision, recall, F-measure. Creating test collections: kappa measure, interjudge agreement.
Text Categorization and Filtering: Introduction to text classification. Naive Bayes models. Spam filtering. Vector space classification using hyperplanes; centroids; k-Nearest Neighbors. Support vector machine classifiers. Kernel functions. Boosting.
Text Clustering: Clustering versus classification. Partitioning methods. k-means clustering. Mixture of Gaussians model. Hierarchical agglomerative clustering. Clustering terms using documents.
Advanced Topics: Summarization, topic detection and tracking, personalization, question answering, cross-language information retrieval.
Web Information Retrieval: Hypertext, web crawling, search engines, ranking, link analysis, PageRank, HITS, XML and Semantic Web.
References
1. Manning, Raghavan and Schütze, Introduction to Information Retrieval, Cambridge University Press.
2. Baeza-Yates and Ribeiro-Neto, Modern Information Retrieval, Addison-Wesley.
3. Soumen Chakrabarti, Mining the Web, Morgan Kaufmann.
4. Survey by Ed Greengrass, available on the Internet.
Concepts taught in class
Student Opinion
How to Crack the Paper
Classroom resources
Additional Resources
Time Table
Day | 8:00-8:55 am | 9:00-9:55 am | 10:00-10:55 am | 11:00-11:55 am | 12:00-12:55 pm | 2:00-2:55 pm | 3:00-3:55 pm | 4:00-4:55 pm | 5:00-5:55 pm | |
---|---|---|---|---|---|---|---|---|---|---|
Monday | ||||||||||
Tuesday | ||||||||||
Wednesday | ||||||||||
Thursday | CSE-120 | CSE-120 | ||||||||
Friday | CSE-120 |