DeepBench is a benchmarking framework developed under the TAHAI (TrustAdHocAI) project. It provides:
- Systematic evaluation of image classification models
- Quantitative analysis of model robustness against image perturbations
- Visual insights into performance degradation
The system consists of two integrated components:
- Benchmark Framework (Backend): Benchmarks models under controlled perturbations
- Analysis Dashboard (Frontend): Visualizes results and model comparison metrics
Designed for GPU-accelerated environments, DeepBench supports modern vision-language models, including those available through the Hugging Face and Ollama APIs.
Command-line tool for configuring experiments and running benchmarks
Core Functionality:
- Applies ~17 image transformations across multiple adjustable use cases
- Tests models with individual or ramped corruptions
- Stores results in MongoDB (remote) or TinyDB (local)
- TOML-configurable experiments
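A TOML experiment configuration might look like the following sketch. Every table and key name here (`experiment`, `model`, `perturbations`, `storage`, and their fields) is a hypothetical illustration of the kind of settings such a file could hold, not DeepBench's actual schema; consult the backend README for the real format.

```toml
# Hypothetical DeepBench experiment configuration (illustrative only)
[experiment]
name = "medical-robustness"
use_case = "medical"

[model]
source = "huggingface"          # or "ollama"
id = "example-org/example-vlm"  # placeholder model identifier

[perturbations]
types = ["gaussian_noise", "motion_blur"]
mode = "ramped"                 # or "individual"
severities = [1, 2, 3, 4, 5]

[storage]
backend = "tinydb"              # or "mongodb" for remote storage
path = "results.json"
```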
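A ramped corruption evaluates the same model on the same image at progressively stronger perturbation levels. The sketch below illustrates the idea with a single Gaussian-noise transformation; the function name, severity scale, and noise levels are illustrative assumptions, not DeepBench's actual API.

```python
import numpy as np

def gaussian_noise(image: np.ndarray, severity: int) -> np.ndarray:
    """Add zero-mean Gaussian noise; higher severity means stronger corruption.

    `image` is expected to hold float values in [0, 1]. The sigma ramp here
    is a hypothetical example, not DeepBench's calibrated severity scale.
    """
    sigma = [0.02, 0.05, 0.10, 0.20, 0.35][severity - 1]
    noisy = image + np.random.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0.0, 1.0)

# Ramped benchmark: produce one corrupted copy per severity level,
# then feed each copy to the model under test and record its accuracy.
image = np.full((32, 32, 3), 0.5)  # placeholder grey image in [0, 1]
ramp = [gaussian_noise(image, s) for s in range(1, 6)]
```

An "individual" corruption run would instead apply a single transformation at one fixed severity.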
Interactive visualization dashboard
Core Functionality:
- Model performance comparison across perturbation types and different metrics
- Use-case specific analysis (Medical, Autonomous Driving, etc.)
- The usage of each submodule is described in more detail in its own README file
Developed under the TAHAI (TrustAdHocAI) project:
- Quantifying model robustness boundaries
- Establishing trust metrics for vision systems
- Human-AI collaboration frameworks
Project Links:
- Paper Submission Status: submitted
- IFAF Project Page
- HTW Research Profile
- KI-Werkstatt Implementation
MIT License - See LICENSE for details.
Team:
- Mario Koddenbrock (Mario.Koddenbrock@HTW-Berlin.de)
- Erik Rodner (Erik.Rodner@HTW-Berlin.de)
- David Brodmann (David.Brodmann@htw-berlin.de)
- Rudolf Hoffmann (Rudolf.Hoffmann@student.htw-berlin.de)
Funding:
This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) via the Project ApplFM (528483508) and
the Institut für angewandte Forschung Berlin (IFAF, Berlin Institute for Applied Research) via Project TrustAdHocAI (TAHAI).