# LinguaSafe

A multilingual LLM safety evaluation framework. Evaluates models against unsafe content across 12 languages and 5 harm categories, with support for jailbreak robustness testing.

Languages: English, Chinese, Arabic, Russian, Serbian, Thai, Korean, Vietnamese, Czech, Hungarian, Bengali, Malay

Categories: Explicit Content · Crimes & Illegal Activities · Harm & Misuse · Privacy & Property · Fairness & Justice

## Installation

```bash
uv venv --python 3.10
uv sync
```

Create `api.toml` (already in `.gitignore`):

```toml
[openai]
api_key = "sk-..."
models = ["gpt-4o-2024-11-20", "gpt-5-mini"]

[local]
base_url = "http://localhost:8000/v1"
models = ["Llama-3.1-8B-Instruct", "..."]

[oai_moderate]
api_key = "sk-..."
models = ["omni-moderation-latest"]
```

Each section maps to an API endpoint; `models` lists which model names route to that endpoint.
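The routing described above can be sketched as a simple lookup: a model name resolves to whichever config section lists it. This is an illustrative sketch only; the section and key names mirror the `api.toml` example, but `resolve_endpoint` and the inline dict are hypothetical, not part of LinguaSafe's actual code.

```python
# Mirrors the api.toml example as a plain dict for illustration.
CONFIG = {
    "openai": {"api_key": "sk-...", "models": ["gpt-4o-2024-11-20", "gpt-5-mini"]},
    "local": {"base_url": "http://localhost:8000/v1", "models": ["Llama-3.1-8B-Instruct"]},
    "oai_moderate": {"api_key": "sk-...", "models": ["omni-moderation-latest"]},
}

def resolve_endpoint(model: str, config: dict = CONFIG) -> str:
    """Return the config section whose `models` list contains `model`."""
    for section, settings in config.items():
        if model in settings.get("models", []):
            return section
    raise KeyError(f"no endpoint configured for model {model!r}")
```

With the example config, `resolve_endpoint("gpt-5-mini")` returns `"openai"` and `resolve_endpoint("omni-moderation-latest")` returns `"oai_moderate"`.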

## Usage

```bash
# Evaluate a model across all languages
python -m linguasafe.eval --model gpt-4o-2024-11-20 --assitSLM gpt-5-mini

# Single language
python -m linguasafe.eval --model gpt-4o-2024-11-20 --lang zh
```

Re-running the same command resumes from where it left off: already-completed tasks are skipped automatically.

Results are saved to `logs/res/`.
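The resume behavior can be sketched as skipping any task whose result file already exists under `logs/res/`. This is a guess at the mechanism for illustration; the file layout and the `run_pending` helper are assumptions, not LinguaSafe's actual implementation.

```python
import json
from pathlib import Path

def run_pending(tasks: list[dict], res_dir: str = "logs/res") -> int:
    """Run only tasks without a saved result file; return how many were executed."""
    out = Path(res_dir)
    out.mkdir(parents=True, exist_ok=True)
    executed = 0
    for task in tasks:
        # Hypothetical one-result-file-per-(model, lang) layout.
        res_file = out / f"{task['model']}_{task['lang']}.json"
        if res_file.exists():  # completed on a previous run: skip
            continue
        result = {"task": task, "status": "done"}  # placeholder for the real evaluation
        res_file.write_text(json.dumps(result))
        executed += 1
    return executed
```

Running the same task list twice executes everything on the first pass and nothing on the second, which is the resume behavior the README describes.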

## About

The official GitHub repository for the [LinguaSafe paper](https://arxiv.org/abs/2508.12733).
