HSKCTA/Exam_Prep_assistant

📘 SPPU Exam Preparation Chatbot

Automated Question Paper Scraper • Question Analysis • Trend Prediction • Study Insights

This project is an AI-powered exam preparation assistant built using Rasa + Python, capable of:

✔️ Fetching SPPU engineering question papers (pattern 2015/2019)
✔️ Scraping subject-wise PDFs directly from sppuquestionpapers.com
✔️ Extracting questions using regex + text-cleaning
✔️ Clustering & analyzing question trends (semantic + TF-IDF fallback)
✔️ Generating summaries of frequently asked questions
✔️ Creating a cluster analysis chart
✔️ Exporting results as JSON and PDF reports
✔️ Interacting through a friendly chat interface


Features

🔍 1. Automated Question Paper Scraping

  • Scrapes department → semester → subject → pattern tables.
  • Extracts all matching PDF links directly from the subject page.
  • Uses concurrency to download PDFs faster.
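
The concurrent download step could look roughly like the sketch below. It is not the project's downloader.py; the names `fetch_pdf` and `download_all`, the worker count, and the timeout are assumptions made for illustration. The fetch function is passed in as a parameter so the batching logic can be exercised without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch_pdf(url, timeout=30.0):
    """Download one PDF and return its raw bytes (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, out_dir, fetch=fetch_pdf, max_workers=8):
    """Download every URL in parallel; return (saved_paths, failed_urls)."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved, failed = [], []

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so failures can be reported.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                data = future.result()
            except Exception:
                # One bad link should not abort the whole batch.
                failed.append(url)
                continue
            # Name the file after the last URL path segment.
            name = Path(urlparse(url).path).name or "paper.pdf"
            target = out_path / name
            target.write_bytes(data)
            saved.append(target)

    return sorted(saved), sorted(failed)
```

Threads suit this step because the work is I/O-bound. Note that two URLs ending in the same file name would overwrite each other in this sketch.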

2. Intelligent Question Extraction

  • Extracts questions using robust patterns:

    • Q1) a) question [6]
    • Q2) b) ...
  • Cleans multi-line messy OCR text.
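
As an illustration of the pattern above, here is a minimal regex sketch that pulls out the question number, sub-part, text, and marks from text shaped like `Q1) a) question [6]`. The regular expressions and the `extract_questions` name are assumptions, not the project's parser.py, and real OCR output will likely need more cleaning than the whitespace collapse shown here.

```python
import re

# "Q1)" style main-question marker.
MAIN_RE = re.compile(r"\bQ\s*(\d+)\s*\)")
# "a) ... [6]" style sub-question: label, text (may span lines), marks.
PART_RE = re.compile(r"\b([a-z])\)\s*(.+?)\s*\[(\d+)\]", re.DOTALL)


def extract_questions(text):
    """Return a list of dicts, one per sub-question found in the text."""
    questions = []
    # With one capture group, re.split yields [prefix, num, body, num, body, ...].
    chunks = MAIN_RE.split(text)
    for number, body in zip(chunks[1::2], chunks[2::2]):
        for part, raw, marks in PART_RE.findall(body):
            questions.append({
                "question": int(number),
                "part": part,
                # Collapse line breaks and repeated spaces left by PDF extraction.
                "text": " ".join(raw.split()),
                "marks": int(marks),
            })
    return questions


sample = """Q1) a) Explain the three-level
architecture of a DBMS. [6]
b) Compare DBMS and file
systems. [4]
Q2) a) Write short note on normalization. [5]
"""
for q in extract_questions(sample):
    print(q)
```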

🤖 3. Semantic Question Clustering

Uses:

  • SentenceTransformer ("all-MiniLM-L6-v2") if available

  • TF-IDF + Agglomerative fallback if model unavailable

  • Groups similar questions under the same cluster

  • Shows:

    • Representative question
    • Cluster frequency
    • Topic trends
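
To make the grouping idea concrete, the sketch below clusters questions by TF-IDF cosine similarity using only the standard library. It is a simplified stand-in: it uses greedy threshold grouping rather than the agglomerative clustering or sentence embeddings the project relies on, and the function names and the 0.5 threshold are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one L2-normalised {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF keeps weights positive even for very common terms.
        vec = {t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
               for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    # Vectors are already normalised, so the dot product is the cosine.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def cluster_questions(questions, threshold=0.5):
    """Greedy grouping: join the first cluster whose seed is similar enough."""
    vectors = tfidf_vectors(questions)
    clusters = []  # each cluster is a list of indices; the first is the seed
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def summarize(questions, clusters, top_n=3):
    """Return (frequency, representative question) for the largest clusters."""
    ranked = sorted(clusters, key=len, reverse=True)[:top_n]
    return [(len(m), questions[m[0]]) for m in ranked]
```

Here the representative question is simply the first one that seeded the cluster, and cluster frequency is the number of members.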

📊 4. Visual Analysis

Generates:

  • Top cluster bar chart
  • Sent inline or as a file attachment, depending on what the chat channel supports

📝 5. Report Generation

Exports:

  • paper_analysis.json
  • paper_analysis.pdf

Including:

  • extracted questions
  • cluster info
  • topics
  • frequent questions
  • difficulty estimate
  • question type counts
  • embedded chart
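
A JSON export along these lines could be sketched as follows. The field names, the keyword-based question types, and the helper names are all assumptions for illustration; the actual schema written by reporting.py is not documented here, and the PDF export and difficulty estimate are left out.

```python
import json
from collections import Counter
from pathlib import Path

# Hypothetical keyword-to-type mapping; the project's real categories may differ.
TYPE_KEYWORDS = {
    "short_note": ("short note",),
    "comparison": ("compare", "differentiate"),
    "explanation": ("explain", "describe"),
}


def question_type(text):
    """Classify a question by the first keyword group it matches."""
    lowered = text.lower()
    for label, keywords in TYPE_KEYWORDS.items():
        if any(k in lowered for k in keywords):
            return label
    return "other"


def build_report(questions, clusters):
    """Assemble a plain dict; clusters are lists of indices into questions."""
    return {
        "extracted_questions": list(questions),
        "clusters": [
            {"representative": questions[m[0]], "frequency": len(m), "members": list(m)}
            for m in clusters
        ],
        "question_type_counts": dict(Counter(question_type(q) for q in questions)),
    }


def write_json_report(report, path="paper_analysis.json"):
    """Write the report as indented UTF-8 JSON and return the file path."""
    target = Path(path)
    target.write_text(json.dumps(report, indent=2, ensure_ascii=False), encoding="utf-8")
    return target
```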

💬 6. Rasa Chatbot Interface

Guides users step-by-step:

  • Department
  • Semester
  • Subject
  • Pattern (2015/2019)
  • Runs analysis and sends the summary + chart + reports

🏗️ Architecture

Rasa Chatbot
     │
     ├── Form (department, semester, subject, pattern)
     │
     ├── Action: ActionFetchAndAnalyze
     │        ├── sppu_scraper.py  → table-based scraper
     │        ├── downloader.py    → multi-threaded downloads
     │        ├── parser.py        → question extraction
     │        ├── clustering.py    → semantic & fallback clusters
     │        ├── reporting.py     → json + pdf export
     │        └── image utils      → plot & inline send
     │
     └── Output:
            - Pretty summary (markdown)
            - Cluster plot
            - JSON report
            - PDF report

🛠️ Installation

1️⃣ Clone the repository

git clone https://github.com/HSKCTA/Exam_Prep_assistant
cd Exam_Prep_assistant

2️⃣ Create a virtual environment

python3 -m venv venv
source venv/bin/activate

3️⃣ Install dependencies

pip install -r requirements.txt

4️⃣ Train Rasa

rasa train

5️⃣ Run the action server (keep it running in its own terminal)

rasa run actions

6️⃣ Run the chatbot (in a second terminal)

rasa shell

📂 Project Structure

.
├── actions/
│   ├── actions.py
│   ├── sppu_scraper.py
│   └── __init__.py
│
├── data/
│   ├── nlu.yml
│   ├── stories.yml
│   └── rules.yml
│
├── domain.yml
├── config.yml
├── README.md
└── requirements.txt

🧪 Usage

Start the bot (with the action server already running):

rasa shell

Example conversation:

User: Can you analyze question papers from the web?
Bot: Which department?
User: Computer Engineering
Bot: Which semester?
User: Sem 5
Bot: Which subject?
User: Database Management Systems
Bot: Which pattern?
User: 2019 pattern
Bot: Found 16 papers. Downloading...
Bot: Analysis Complete! (summary + chart + PDF report)

📄 Outputs

Markdown summary (sample)

**Top recurring question types:**
1. (3 times) Consider following schema
2. (3 times) Write short note on...
3. (2 times) Compare DBMS and File Systems...
...

Generated Files:

paper_analysis.json
paper_analysis.pdf
cluster_analysis_plot.png

🤝 Contributing

Pull requests are welcome! Ideas, improvements, or scraper fixes are appreciated.


📜 License

MIT License. Free for personal and academic use.


👤 Author

Hitesh Khare

About

When exams approach, many of us manually skim through past papers to see which topics are important. This tool does that work for you.
