๐—œ๐—ป๐—ณ๐—ผ๐—ฟ๐—บ๐—ฎ๐˜๐—ถ๐—ผ๐—ป ๐—ฅ๐—ฒ๐˜๐—ฟ๐—ถ๐—ฒ๐˜ƒ๐—ฎ๐—น | ๐—–๐—ฆ๐Ÿฒ๐Ÿฌ๐Ÿฌ๐Ÿต๐Ÿฎ | ๐—•๐—ผ๐—ผ๐—น-๐—ฆ๐—ฒ๐—ฎ๐—ฟ๐—ฐ๐—ต, ๐—ฅ๐—ฎ๐—ป๐—ธ๐—ฒ๐—ฟ, ๐—ช๐—ผ๐—ฟ๐—ฑ๐—ก๐—ฒ๐˜ ๐—ฆ๐˜‚๐—บ๐—บ๐—ฎ๐—ฟ๐—ถ๐˜‡๐—ฒ๐—ฟ

Ecolash/Information-Retrieval

Information Retrieval (CS60092)

This repository contains code, data, and write-ups for the Information Retrieval course assignments. It is organized around three assignments, each with its own datasets and results, plus supporting materials (books, papers, slides).

Course

  • Course name: Information Retrieval
  • Author / Owner: Tuhin Mondal (22CS10087)

Repository Structure

Top-level layout (folders and representative files):

A1-Boolean AND Retrieval Using Inverted Index/
├─ bool.py
├─ indexer.py
├─ parser.py
├─ README.md
├─ requirements.txt
├─ run.sh
├─ Dataset/
└─ Output/
   ├─ queries.txt
   └─ results.txt

A2-Scoring and Evaluation/
├─ ranker.py
├─ evaluator.py
├─ README.md
├─ requirements.txt
├─ run.sh
├─ Dataset/
│  ├─ TIME_Documents.txt
│  ├─ TIME_Queries.txt
│  └─ TIME_Relevance.txt
├─ Evals/
└─ Ranks/

A3-Wordnet based Summarization/
├─ summarizer.py
├─ evaluator.py
├─ README.txt
└─ BBC_News_1K/
   ├─ Generated Summaries/
   ├─ News Articles/
   ├─ Summaries/
   └─ Evals/

Books/
Research Papers/
Slides/

Dependencies

  • Each assignment maintains its own requirements.txt. Common dependencies for the course include NLTK, NumPy, and scikit-learn (check each requirements.txt for exact pins).

Data and Outputs

  • Dataset/ directories inside assignments contain the raw documents and query/relevance files used for experiments.
  • Output/, Evals/, and Ranks/ folders contain generated outputs, ranked lists, and evaluation metrics. Do not overwrite these if you want to preserve results.
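
To illustrate the kind of metric that can be derived from a ranked list and a set of relevance judgments, here is a minimal sketch of precision@k and average precision. The function names and toy data are hypothetical, and this is not taken from the repository's evaluator.py, which may compute different metrics or use different file formats.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked doc IDs that are relevant.

    Divides by k even when fewer than k documents were returned.
    """
    if k <= 0:
        return 0.0
    top = ranked[:k]
    return sum(1 for doc_id in top if doc_id in relevant) / k


def average_precision(ranked, relevant):
    """Sum of precision at each rank holding a relevant doc, divided by
    the total number of relevant docs. Assumes `ranked` has no duplicates."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


# Toy data (hypothetical): one query's ranked doc IDs and its relevant set.
ranked = [3, 1, 7, 5]
relevant = {3, 5, 9}
print(precision_at_k(ranked, relevant, 2))
print(average_precision(ranked, relevant))
```

Averaging `average_precision` over all queries would give mean average precision (MAP), one common summary figure for a ranker.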

Course Details / Purpose

  • The repository is structured to support learning and evaluation of Information Retrieval concepts:
    • Boolean retrieval via inverted indexes
    • Scoring and ranking with evaluation against ground truth relevance
    • Summarization methods using lexical resources (WordNet)
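
The first concept above, Boolean AND retrieval over an inverted index, can be sketched as follows. This is a minimal illustration under simple assumptions (whitespace tokenization, lowercasing, in-memory postings); the helper names and toy documents are hypothetical and do not reflect the actual contents of indexer.py or bool.py.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased token to a sorted list of the doc IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            postings[token].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def intersect(a, b):
    """Merge two sorted postings lists, keeping IDs present in both."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def boolean_and(index, query):
    """Return doc IDs containing every query term (empty if any term is unseen)."""
    terms = query.lower().split()
    if not terms:
        return []
    # Start from the shortest postings lists to keep intermediate results small.
    lists = sorted((index.get(t, []) for t in terms), key=len)
    result = lists[0]
    for postings in lists[1:]:
        result = intersect(result, postings)
    return result


# Toy collection (hypothetical).
docs = {
    1: "information retrieval with inverted index",
    2: "boolean retrieval model",
    3: "inverted index for boolean retrieval",
}
index = build_index(docs)
print(boolean_and(index, "boolean retrieval"))
```

Keeping postings sorted allows the linear-time merge in `intersect`, which is the usual reason inverted indexes store document IDs in order.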
