Ai safety feature #23

Shuyib · 2025-10-12T09:50:40Z

This pull request introduces significant improvements to the project's AI safety and model management workflows, focusing on integrating a robust Inspect AI safety layer, updating supported models, and enhancing Docker container startup reliability. The most important changes are grouped below by theme.

AI Safety Layer Integration

Added an Inspect AI safety layer guide (INSPECT_SAFETY_GUIDE.md) and integrated safety checks into app.py. User input is evaluated for prompt injection, jailbreaking, and other attacks before it reaches the LLM. Safety results are logged, and strict and normal modes are configurable.
Updated README.md to document the Inspect AI safety layer: its features, usage, integration points, and references.
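
For readers who want a feel for the flow described above (check the input, log the result, then decide whether to call the LLM), here is a minimal sketch. It is not the code in app.py and does not use the Inspect AI API. The pattern table, `SafetyResult`, `check_input`, and the rule that normal mode blocks on two or more flags while strict mode blocks on any flag are all assumptions made for illustration.

```python
import logging
import re
from dataclasses import dataclass, field

logger = logging.getLogger("safety_sketch")

# Hypothetical patterns, for illustration only; a real layer would be broader.
PATTERNS = {
    "prompt_injection": re.compile(
        r"ignore (all |any )?(previous|prior|above) instructions", re.I
    ),
    "jailbreak": re.compile(r"\b(DAN mode|developer mode|no restrictions)\b", re.I),
    "system_prompt_leak": re.compile(
        r"(reveal|show|print) (your |the )?system prompt", re.I
    ),
}


@dataclass
class SafetyResult:
    is_safe: bool
    flags: list[str] = field(default_factory=list)


def check_input(text: str, strict: bool = False) -> SafetyResult:
    """Flag suspicious input before it is sent to the LLM."""
    flags = [name for name, pattern in PATTERNS.items() if pattern.search(text)]
    # Assumed policy: strict mode blocks on any flag, normal mode on two or more.
    threshold = 1 if strict else 2
    result = SafetyResult(is_safe=len(flags) < threshold, flags=flags)
    logger.info("safety check: safe=%s flags=%s strict=%s",
                result.is_safe, result.flags, strict)
    return result
```

A caller would run `check_input(user_text, strict=True)` and skip the LLM call when `is_safe` is false.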

Model and Docker Management

Updated the base Ollama Docker image from version 0.3.3 to 0.6.8 and switched the default model from qwen2.5:0.5b to qwen3:0.6b. Added a robust entrypoint script to ensure the model is pulled on container startup, improving reliability and startup diagnostics.
Changed Ollama service port mappings in docker-compose.yml and docker-compose-all.yml to avoid conflicts and clarify external/internal port usage.
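
The startup sequence described above (start the server, wait until it responds, pull the model) can be sketched as follows. This is not the entrypoint script from the PR, which is not shown here. It is a Python illustration that assumes the `ollama` CLI is on the PATH and uses its `serve`, `list`, and `pull` subcommands. The function names, retry count, and delay are hypothetical.

```python
import subprocess
import time
from typing import Callable

MODEL = "qwen3:0.6b"  # default model named in this PR


def server_ready() -> bool:
    """Return True once `ollama list` succeeds, meaning the server answers."""
    result = subprocess.run(["ollama", "list"], capture_output=True)
    return result.returncode == 0


def wait_for_server(
    probe: Callable[[], bool],
    attempts: int = 30,  # assumed retry budget
    delay: float = 1.0,  # assumed pause between probes, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns True or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if probe():
            print(f"Ollama ready after {attempt} attempt(s)")
            return True
        sleep(delay)
    return False


def main() -> int:
    # Start the server in the background, as an entrypoint typically would.
    server = subprocess.Popen(["ollama", "serve"])
    if not wait_for_server(server_ready):
        print("Ollama did not become ready; giving up")
        server.terminate()
        return 1
    # Pull the model so the first request does not fail on a missing model.
    subprocess.run(["ollama", "pull", MODEL], check=True)
    # Keep the container alive for as long as the server runs.
    return server.wait()
```

Calling `main()` inside the container runs the whole sequence. The printed messages stand in for the startup diagnostics mentioned above.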

Documentation and Project Structure

Expanded documentation in README.md to include new recommended models (e.g., Gemma 27B), added references to the safety guide and implementation summary, and clarified file roles for easier onboarding and maintenance.

Minor Fixes

Corrected a typo in Makefile for improved clarity in linting instructions.

These changes collectively strengthen the project's security posture, improve model support, and make deployment and usage more robust and transparent.

Copilot AI and others added 10 commits September 30, 2025 04:48

Initial plan

527a1fa

Add Inspect AI safety layer with comprehensive testing

373fa29

Co-authored-by: Shuyib <12908522+Shuyib@users.noreply.github.com>

Add demo script, usage guide, and update documentation

7d8d87c

Co-authored-by: Shuyib <12908522+Shuyib@users.noreply.github.com>

Add comprehensive implementation summary document

d5a520e

Co-authored-by: Shuyib <12908522+Shuyib@users.noreply.github.com>

Update Dockerfile and Compose files for Ollama API; enhance safety checks and logging in Inspect AI layer

52046f3

Update Makefile

879c9db

remove sudo and edit typos

Add files via upload

31e878f

format code with black

Update README with Gemma model and VRAM details

3654bb3

Added Gemma model recommendation and clarified VRAM requirements.

Update README to include new files and directories for Docker setup and project documentation

c8754cb

Merge pull request #22 from Shuyib/copilot/fix-dd75908f-a10f-47d1-be71-f50d49393314

570974a

Add Inspect AI Safety Layer for Prompt Injection and Jailbreaking Detection

Shuyib merged commit 1541f5d into main Oct 12, 2025
7 checks passed
