SukanyaKarri/Project-Webisdom


Multi-Modal Vehicle Intelligence Platform

  • I built this Multi-Modal Vehicle Intelligence Platform to pull vehicle data from several sources into a single, clean service record.
  • It combines computer vision for classifying vehicles and detecting damage, an LLM for understanding what the customer wants, data engineering to tie in vehicle metadata, and FastAPI to serve the whole pipeline through a real-time endpoint.

Main idea and business requirement: take a vehicle image (for example, from CCTV) and the customer's text request, classify the vehicle, detect any damage, infer the customer's intent, and return structured output like the one below:

{
   "vehicle_type": "SUV",
   "detected_damage": ["scratch","dent"],
   "customer_intent": "insurance claim",
   "service_priority": "high"
}
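As a rough illustration of that record, here is a minimal sketch using only the standard library. The `ServiceRecord` class name and the default values are my assumptions for the example, not necessarily what the repo uses; only the four field names come from the sample output above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ServiceRecord:
    """Hypothetical container mirroring the four fields of the sample output."""
    vehicle_type: str
    detected_damage: List[str] = field(default_factory=list)
    customer_intent: str = "general inquiry"  # assumed default label
    service_priority: str = "low"             # assumed default label

    def to_json(self) -> str:
        # asdict keeps the declared field order, so the keys match the sample
        return json.dumps(asdict(self), indent=3)


record = ServiceRecord(
    vehicle_type="SUV",
    detected_damage=["scratch", "dent"],
    customer_intent="insurance claim",
    service_priority="high",
)
print(record.to_json())
```

Keeping the record in one typed structure means every pipeline stage fills in a known field rather than passing loose dictionaries around.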

System Architecture

FastAPI (main.py, routes.py) serves as the central hub connecting all components:

Computer Vision Pipeline:

  • models/vehicle_model.pth (saved PyTorch ResNet18 weights for vehicle classification)
  • detect_damage.py (lightweight damage detection logic using image-based features)
  • models/vehicle_model.py (vehicle type detection)

LLM Pipeline:

  • models/service_model.py (Groq API for customer intent)

Data Processing:

  • data/metadata.py (Pandas + Kaggle CarDekho data)
  • fusion_service.py (multi-modal logic + priority scoring)

API Endpoint: POST /analyze_vehicle

Data Pipeline

  1. Image Input → vehicle_model.py + detect_damage.py → vehicle_type & damage list
  2. Text Input → service_model.py (Groq LLM) → customer_intent
  3. Metadata → data/metadata.py → enrichment
  4. Fusion → fusion_service.py → final JSON output
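The four steps above can be sketched as one orchestration function. This is only an outline of the data flow: every stage is passed in as a plain callable, and the function name `analyze_vehicle`, the stage signatures, and the idea of looking up metadata by vehicle type are assumptions for illustration rather than the actual code in routes.py.

```python
from typing import Any, Callable, Dict, List


def analyze_vehicle(
    image_bytes: bytes,
    request_text: str,
    classify_vehicle: Callable[[bytes], str],
    detect_damage: Callable[[bytes], List[str]],
    classify_intent: Callable[[str], str],
    lookup_metadata: Callable[[str], Dict[str, Any]],
    fuse: Callable[[str, List[str], str, Dict[str, Any]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Run the four pipeline steps in order and return the fused record."""
    vehicle_type = classify_vehicle(image_bytes)   # step 1: image -> vehicle type
    damage = detect_damage(image_bytes)            # step 1: image -> damage list
    intent = classify_intent(request_text)         # step 2: text -> customer intent
    metadata = lookup_metadata(vehicle_type)       # step 3: enrichment (assumed keyed by type)
    return fuse(vehicle_type, damage, intent, metadata)  # step 4: final output


# Stub stages so the sketch runs without any model or API key.
result = analyze_vehicle(
    image_bytes=b"fake-image",
    request_text="I need to file a claim for a dent",
    classify_vehicle=lambda img: "SUV",
    detect_damage=lambda img: ["dent"],
    classify_intent=lambda text: "insurance claim",
    lookup_metadata=lambda vehicle_type: {"fuel_type": "petrol"},
    fuse=lambda v, d, i, m: {
        "vehicle_type": v,
        "detected_damage": d,
        "customer_intent": i,
        "metadata": m,
    },
)
print(result)
```

Passing the stages in as callables keeps each one replaceable, which fits the modular layout described above.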

What It Does

Vehicle Classification

I used ResNet18 with transfer learning: I fine-tuned it on the Kaggle Vehicle Classification dataset (~5,600 images) and adapted the last layers to my custom classes.

Damage Detection

I built a basic pipeline using the Kaggle Car Damage Detection dataset.

Since the dataset didn’t have fully structured labels for direct classification, I implemented a lightweight, feature-based approach:

  • The image is converted to grayscale and resized
  • A gradient-based score is calculated to capture pixel intensity changes
  • Higher variation generally indicates surface irregularities like dents or scratches

Based on a threshold:

  • Higher score → “possible dent/scratch”
  • Lower score → “no visible damage”

This acts as a baseline damage detection system, and the pipeline is designed to easily plug in advanced models like YOLO later.
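To make the idea concrete, here is a minimal pure-Python sketch of a gradient-based score with a threshold. It assumes the image has already been converted to grayscale and resized into a 2D list of pixel intensities. The mean-absolute-difference formula, the function names, and the threshold of 20.0 are illustrative assumptions, not necessarily what detect_damage.py does.

```python
from typing import Sequence


def gradient_score(gray: Sequence[Sequence[float]]) -> float:
    """Mean absolute intensity difference between neighbouring pixels."""
    if not gray or not gray[0]:
        return 0.0
    rows, cols = len(gray), len(gray[0])
    total, count = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # horizontal neighbour
                total += abs(gray[r][c + 1] - gray[r][c])
                count += 1
            if r + 1 < rows:  # vertical neighbour
                total += abs(gray[r + 1][c] - gray[r][c])
                count += 1
    return total / count if count else 0.0


def damage_label(gray: Sequence[Sequence[float]], threshold: float = 20.0) -> str:
    """Map the score to one of the two baseline labels (threshold is assumed)."""
    if gradient_score(gray) > threshold:
        return "possible dent/scratch"
    return "no visible damage"


smooth = [[100] * 4 for _ in range(4)]                             # uniform surface
rough = [[255 * ((r + c) % 2) for c in range(4)] for r in range(4)]  # checkerboard

print(gradient_score(smooth), damage_label(smooth))
print(gradient_score(rough), damage_label(rough))
```

A score like this also reacts to reflections, shadows, and busy backgrounds, which is one reason to treat it as a baseline only.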

Customer Intent

This part uses Groq's LLM API to work out whether the customer is after an insurance claim, a repair, or just has a general inquiry.
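A minimal sketch of how the intent step could be structured, with the LLM call abstracted behind a plain callable so it runs without an API key. The label set, the prompt wording, and the `complete` parameter are assumptions for illustration. In the real project, `complete` would wrap the Groq client, which is not shown here.

```python
from typing import Callable

# Assumed label set, based on the three intents described above.
INTENTS = ("insurance claim", "repair", "general inquiry")


def build_prompt(request_text: str) -> str:
    """Ask the model to pick exactly one label."""
    options = ", ".join(INTENTS)
    return (
        f"Classify the customer's request into exactly one of: {options}.\n"
        "Reply with the label only.\n\n"
        f"Request: {request_text}"
    )


def classify_intent(request_text: str, complete: Callable[[str], str]) -> str:
    """Send the prompt through `complete` and normalise the reply to a known label."""
    reply = complete(build_prompt(request_text)).strip().lower()
    for intent in INTENTS:
        if intent in reply:
            return intent
    return "general inquiry"  # fall back when the reply is not a known label


def fake_llm(prompt: str) -> str:
    """Stand-in for the real LLM call, used only to exercise the sketch."""
    return " Insurance claim.\n"


print(classify_intent("Someone hit my car, can I claim this?", fake_llm))
```

Normalising the reply against a fixed label set keeps unexpected model output from leaking into the final JSON.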

Multi-Modal Fusion

Combines the image analysis, the text intent, and a few business rules to produce the final JSON output.
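As a sketch of what such business rules might look like, the example below derives a priority from the damage list and the intent. The specific rules (damage plus an insurance claim means high, either damage or a repair request means medium, otherwise low) are hypothetical and are not taken from fusion_service.py.

```python
from typing import Any, Dict, List


def service_priority(damage: List[str], intent: str) -> str:
    """Hypothetical priority rules combining the image and text signals."""
    has_damage = bool(damage)
    if has_damage and intent == "insurance claim":
        return "high"
    if has_damage or intent == "repair":
        return "medium"
    return "low"


def fuse(vehicle_type: str, damage: List[str], intent: str) -> Dict[str, Any]:
    """Assemble the final record in the same shape as the sample output."""
    return {
        "vehicle_type": vehicle_type,
        "detected_damage": damage,
        "customer_intent": intent,
        "service_priority": service_priority(damage, intent),
    }


print(fuse("SUV", ["scratch", "dent"], "insurance claim"))
print(fuse("sedan", [], "general inquiry"))
```

Keeping the rules in one small function makes them easy to adjust without touching the vision or LLM stages.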

FastAPI Setup

  • Real-time endpoint at POST /analyze_vehicle.
  • Handles full pipeline inference in a single request.

Datasets I Used

  • Vehicle Classification: Kaggle - ~5600 images across types
  • Damage Detection: Kaggle Car Damage - dent/scratch/shatter classes
  • Customer Intent: Kaggle customer support data, used to shape the intent logic
  • Vehicle Metadata: Kaggle CarDekho - fuel type, year, ownership, price

Tech Stack

  • Python
  • FastAPI
  • PyTorch (for ResNet18)
  • Groq LLM API
  • Pandas for metadata

How to Run It

  1. Clone the repo: git clone <repo_link> && cd <project_folder>
  2. Install the dependencies: pip install -r requirements.txt
  3. Start the server: uvicorn main:app --reload
  4. Open the Swagger UI at http://127.0.0.1:8000/docs

Sample Workflow

Upload a vehicle image and type in the customer request. The pipeline then runs:

  • CV for vehicle type
  • image-based logic for damage detection
  • LLM for intent
  • metadata enrichment
  • final structured output

Next Steps: Further Enhancements

  • Upgrade to full YOLO damage detection
  • Fine-tune models on bigger datasets
  • Dockerize and deploy to the cloud
  • Add a dashboard UI

In a nutshell, it includes:

  • End-to-end pipeline
  • Multi-modal reasoning
  • Real-time API
  • Transfer learning for fast development
  • Modular design for easy upgrades

Made by: Sukanya
