AITLS-DLP/BE-2


AI-TLS-DLP Backend v1.1.0

Project Overview

A FastAPI backend service that detects Korean personal information (PII) using regular expressions combined with BERT-based NER.

It uses the psh3333/roberta-large-korean-pii5 model from Hugging Face to provide real-time PII detection and blocking.

✨ Key Features (v1.1.0)

  • ✅ JWT-based authentication: sign-up, login, token authentication
  • ✅ Regex-based PII detection: pattern matching for phone numbers, emails, and similar formats
  • ✅ BERT NER-based PII detection: personal-information entity recognition with a RoBERTa model
  • ✅ Real-time blocking decision: automatically decides whether to block based on the detected PII
  • ✅ RESTful API: high-performance API built on FastAPI
  • ✅ Automatic documentation: Swagger UI provided
  • ✅ PostgreSQL database: user information and authentication management
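
The regex-based detection listed above could look roughly like the sketch below. The patterns, the `PATTERNS` table, and the `find_pii_by_regex` helper are illustrative assumptions, not the project's actual code. Digit lookarounds are used instead of `\b` because Korean syllables count as word characters, so `\b` would not match between `5678` and a following particle such as `입니다`.

```python
import re

# Hypothetical patterns for illustration; the project's real patterns may differ.
PATTERNS = {
    # Korean mobile numbers such as 010-1234-5678 (hyphens optional).
    "PHONE_NUM": re.compile(r"(?<!\d)01[016789]-?\d{3,4}-?\d{4}(?!\d)"),
    # A deliberately simple ASCII-only email pattern.
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def find_pii_by_regex(text: str) -> list:
    """Return every pattern match with its type, value, and character span."""
    found = []
    for pii_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append(
                {
                    "type": pii_type,
                    "value": match.group(),
                    "start": match.start(),
                    "end": match.end(),
                }
            )
    # Report matches in the order they appear in the text.
    return sorted(found, key=lambda item: item["start"])
```

Regex matches of this kind are cheap and precise for fixed formats, which is presumably why the service pairs them with the NER model for free-form entities such as names.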

🏗️ Project Structure

DLP-BE/
├── app/
│   ├── main.py                # FastAPI entry point
│   ├── api/routers/
│   │   ├── auth.py           # Auth API (sign-up, login)
│   │   └── pii.py            # PII detection API (auth required)
│   ├── services/
│   │   └── pii_service.py    # Business logic
│   ├── ai/
│   │   ├── pii_detector.py   # RoBERTa PII detection model
│   │   └── model_manager.py  # Model singleton management
│   ├── schemas/
│   │   ├── auth.py           # Auth schemas
│   │   └── pii.py            # PII schemas
│   ├── models/
│   │   └── user.py           # User data model
│   ├── repository/
│   │   └── user_repo.py      # User data access layer
│   ├── db/
│   │   ├── base.py           # SQLAlchemy Base
│   │   └── session.py        # DB session management
│   ├── core/
│   │   ├── config.py         # Settings
│   │   ├── security.py       # JWT and password hashing
│   │   └── dependencies.py   # Auth dependencies
│   └── utils/
│       └── entity_extractor.py # BIO tag entity extraction
├── alembic/                   # DB migrations
│   └── versions/             # Migration files
├── docker-compose.yml         # PostgreSQL container
├── alembic.ini               # Alembic configuration
├── .env                      # Environment variables (must be created)
├── pyproject.toml            # Dependency management
└── CLAUDE.md                 # Development documentation
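
To illustrate what the BIO tag entity extraction in `utils/entity_extractor.py` might involve, here is a minimal sketch. The `Entity` class and `extract_entities` function are hypothetical names, and the sketch assumes tokens are plain surface pieces that can be joined directly (real tokenizer output may carry subword markers that need stripping). Confidence is taken as the mean of the token scores, which is one possible choice rather than the project's confirmed method.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    type: str
    value: str
    confidence: float  # mean of the per-token scores (an assumption)
    token_count: int


def _close(group) -> Entity:
    """Turn an open [type, tokens, scores] group into an Entity."""
    entity_type, tokens, scores = group
    return Entity(entity_type, "".join(tokens), sum(scores) / len(scores), len(tokens))


def extract_entities(tokens, tags, scores):
    """Merge B-/I- tagged tokens into entities; 'O' and orphan I- tags are skipped."""
    entities = []
    group = None  # currently open entity: [type, tokens, scores]
    for token, tag, score in zip(tokens, tags, scores):
        continues = (
            tag.startswith("I-") and group is not None and group[0] == tag[2:]
        )
        if continues:
            group[1].append(token)
            group[2].append(score)
            continue
        # Any other tag ends the open entity, if there is one.
        if group is not None:
            entities.append(_close(group))
            group = None
        if tag.startswith("B-"):
            group = [tag[2:], [token], [score]]
    if group is not None:
        entities.append(_close(group))
    return entities
```

Dropping an `I-` tag that has no matching `B-` before it is a conservative choice; a more lenient extractor could instead treat it as the start of a new entity.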

🚀 Quick Start (running on a new PC)

1. Requirements

  • Python 3.13+
  • Docker & Docker Compose
  • uv (or pip)

2. Clone the Project and Enter the Directory

git clone <repository-url>
cd DLP-BE

3. Configure Environment Variables

Create a .env file with the following contents:

# Database
DATABASE_URL=postgresql+asyncpg://admin:password123@localhost:5432/ai_tlsdlp

# JWT authentication
SECRET_KEY=dlp-secret-key-change-in-production-minimum-32-characters-required
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=30

# AI model settings
PII_MODEL_NAME=psh3333/roberta-large-korean-pii5
DEFAULT_PII_THRESHOLD=0.59

# App settings
DEBUG=True
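
As a rough picture of how these variables could be turned into typed settings, the sketch below reads them from the process environment with the standard library only. The `Settings` class and `load_settings` function are hypothetical; the project's `core/config.py` may well use a different mechanism, and this sketch does not parse the .env file itself (it assumes the variables are already exported).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    database_url: str
    secret_key: str
    algorithm: str
    access_token_expire_minutes: int
    pii_model_name: str
    default_pii_threshold: float
    debug: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, converting types explicitly."""
    env = os.environ if env is None else env
    return Settings(
        # Required values: a missing key raises KeyError early.
        database_url=env["DATABASE_URL"],
        secret_key=env["SECRET_KEY"],
        # Optional values fall back to the defaults shown in the .env example.
        algorithm=env.get("ALGORITHM", "HS256"),
        access_token_expire_minutes=int(env.get("ACCESS_TOKEN_EXPIRE_MINUTES", "30")),
        pii_model_name=env.get("PII_MODEL_NAME", "psh3333/roberta-large-korean-pii5"),
        default_pii_threshold=float(env.get("DEFAULT_PII_THRESHOLD", "0.59")),
        debug=env.get("DEBUG", "False").strip().lower() in {"1", "true", "yes"},
    )
```

Failing fast on a missing `DATABASE_URL` or `SECRET_KEY` tends to be easier to debug than discovering the gap on the first request.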

4. Install Dependencies

# Using uv (recommended)
uv sync

# Or using pip
pip install -e .

5. Start the PostgreSQL Container

# Make sure the Docker daemon is running first
docker-compose up -d

# Check the container status
docker-compose ps

6. Apply Database Migrations

# Create the users table with Alembic
alembic upgrade head

7. Run the Server

uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

8. View the API Documentation

  • Swagger UI: http://localhost:8000/docs

🔄 Summary: Setting Up the Same Environment on Another PC

Run the following in order:

# 1. Clone the project
git clone <repository-url> && cd DLP-BE

# 2. Create the .env file (copy the contents shown above)
nano .env

# 3. Install dependencies
uv sync

# 4. Start PostgreSQL
docker-compose up -d

# 5. Run DB migrations (creates tables)
alembic upgrade head

# 6. Run the server
uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

The API is now available at http://localhost:8000/docs.

📡 API Usage

1. Sign Up

Endpoint: POST /api/v1/auth/register

curl -X POST "http://localhost:8000/api/v1/auth/register" \
  -H "Content-Type: application/json" \
  -d '{
    "username": "testuser",
    "email": "test@example.com",
    "password": "password123",
    "full_name": "테스트 사용자"
  }'

Response:

{
  "id": 1,
  "username": "testuser",
  "email": "test@example.com",
  "full_name": "테스트 사용자",
  "is_active": true,
  "is_superuser": false
}

2. Login

Endpoint: POST /api/v1/auth/login

curl -X POST "http://localhost:8000/api/v1/auth/login" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "username=testuser&password=password123"

์‘๋‹ต:

{
  "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "token_type": "bearer"
}
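
For readers curious what an HS256 access token like the one above contains, the following standard-library sketch signs and verifies a token by hand. It is an illustration only: the project uses python-jose for this, the `create_access_token` and `verify_access_token` names are hypothetical, and the use of a `sub` claim for the username is an assumption. The header it produces encodes to the same `eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9` prefix seen in the example response.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret_key: str) -> bytes:
    return hmac.new(secret_key.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_access_token(subject, secret_key, expire_minutes=30, now=None):
    """Return header.payload.signature with an expiry `expire_minutes` from now."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "exp": now + expire_minutes * 60}  # "sub" is an assumption
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode()) for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret_key))}"


def verify_access_token(token, secret_key, now=None):
    """Return the payload if the signature is valid and unexpired, else None."""
    now = int(time.time()) if now is None else now
    try:
        signing_input, signature = token.rsplit(".", 1)
        expected = _sign(signing_input, secret_key)
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            return None
        payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    except (ValueError, IndexError):
        return None
    if not isinstance(payload, dict) or payload.get("exp", 0) <= now:
        return None
    return payload
```

With `ACCESS_TOKEN_EXPIRE_MINUTES=30`, a token issued this way stops verifying 30 minutes after creation, at which point the client would need to log in again.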

3. Get the Current User

Endpoint: GET /api/v1/auth/me (auth required)

curl -X GET "http://localhost:8000/api/v1/auth/me" \
  -H "Authorization: Bearer <access_token>"

4. PII Detection (auth required)

Endpoint: POST /api/v1/pii/detect

curl -X POST "http://localhost:8000/api/v1/pii/detect" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <access_token>" \
  -d '{"text": "제 이름은 홍길동이고 전화번호는 010-1234-5678입니다"}'

Example response (the input text means "My name is Hong Gil-dong and my phone number is 010-1234-5678"; the reason string reads "2 PII items detected (PERSON, PHONE_NUM)", and details lists each detected entity with its confidence):

{
  "has_pii": true,
  "reason": "개인정보 2개 탐지됨 (PERSON, PHONE_NUM)",
  "details": "탐지된 개인정보: PERSON '홍길동' (신뢰도: 95.0%), PHONE_NUM '010-1234-5678' (신뢰도: 89.0%)",
  "entities": [
    {
      "type": "PERSON",
      "value": "홍길동",
      "confidence": 0.95,
      "token_count": 2
    },
    {
      "type": "PHONE_NUM",
      "value": "010-1234-5678",
      "confidence": 0.89,
      "token_count": 7
    }
  ]
}
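
A client that receives a response like this might decide whether to block a message by filtering entities against a confidence threshold. The sketch below is a client-side illustration under that assumption; `should_block` is a hypothetical helper, the default of 0.59 simply mirrors `DEFAULT_PII_THRESHOLD` from the .env example, and the server's own blocking logic is not shown in this README.

```python
def should_block(response: dict, threshold: float = 0.59):
    """Return (block?, sorted entity types) for entities at or above the threshold."""
    confident = [
        entity
        for entity in response.get("entities", [])
        if entity["confidence"] >= threshold
    ]
    # Deduplicate types so repeated hits of the same kind are reported once.
    types = sorted({entity["type"] for entity in confident})
    return bool(confident), types
```

Raising the threshold trades recall for precision: fewer false alarms, but lower-confidence entities pass through unblocked.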

5. Health Check (auth required)

Endpoint: GET /api/v1/pii/health

curl -X GET "http://localhost:8000/api/v1/pii/health" \
  -H "Authorization: Bearer <access_token>"

⚠️ Note: As of v1.1.0, all PII APIs require JWT authentication!
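
The same calls can be made from Python. The sketch below only builds the requests with the standard library, mirroring the curl examples above; `BASE_URL` and the two builder functions are hypothetical names, and the host and port assume the local setup from the quick start.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000/api/v1"  # assumes the local quick-start server


def build_login_request(username: str, password: str) -> Request:
    """Login takes form-encoded credentials, as in the curl example."""
    body = urlencode({"username": username, "password": password}).encode("ascii")
    return Request(
        f"{BASE_URL}/auth/login",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_detect_request(text: str, access_token: str) -> Request:
    """PII detection takes a JSON body plus the Bearer token from login."""
    body = json.dumps({"text": text}).encode("utf-8")
    return Request(
        f"{BASE_URL}/pii/detect",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )
```

Once the server is running, each request can be sent with `urllib.request.urlopen(request)` and the body parsed with `json.load`; the `access_token` field of the login response feeds the second builder.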

🛠️ Tech Stack

  • Backend: FastAPI + Python 3.13
  • Database: PostgreSQL 15 + SQLAlchemy 2.0 (Async)
  • Authentication: JWT (python-jose) + bcrypt (passlib)
  • Migrations: Alembic
  • AI model: Transformers + PyTorch
  • PII model: psh3333/roberta-large-korean-pii5
  • Package management: uv
  • Containers: Docker + Docker Compose

📊 Performance

  • First request: ~2 s (includes model loading)
  • Subsequent requests: 100-300 ms
  • Maximum input: 512 tokens (about 1,000 characters)
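
The gap between the first request and later ones is consistent with loading the model lazily once and reusing it. A minimal sketch of that pattern follows; the `ModelManager` class here is a hypothetical stand-in for whatever `ai/model_manager.py` actually does, and the loader is injected so the sketch runs without downloading a model.

```python
import threading
from typing import Callable


class ModelManager:
    """Load a model once, on first use, and hand back the same object afterwards."""

    def __init__(self, loader: Callable[[], object]):
        # In a real service the loader might wrap something like
        # transformers.pipeline("token-classification", model=...); that is an assumption.
        self._loader = loader
        self._model = None
        self._lock = threading.Lock()

    def get(self):
        if self._model is None:
            # Double-checked locking: only one thread pays the loading cost.
            with self._lock:
                if self._model is None:
                    self._model = self._loader()
        return self._model
```

Calling `get()` at application startup instead of on the first request would move the loading delay out of the request path, at the cost of a slower boot.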

🗺️ Roadmap

Phase 1: Basic PII detection + authentication system (done) ✅

  • RoBERTa model integration
  • Regex-based pattern matching
  • Improved BIO tagging accuracy
  • Model performance optimization
  • JWT-based authentication/authorization system
  • PostgreSQL + Alembic migrations

Phase 2: Extended features (planned)

  • Document upload and parsing (PDF, DOCX)
  • Similarity-based document comparison (KoSimCSE)
  • Vector DB integration (ChromaDB)
  • Admin dashboard

Phase 3: Production readiness (planned)

  • Unit/integration tests
  • Logging and monitoring
  • Rate limiting and security hardening
  • Docker containerization (entire app)

🔧 Database Management

Creating a New Migration

# Auto-generate a migration file after changing the models
alembic revision --autogenerate -m "description"

# Apply the migration
alembic upgrade head

Rolling Back a Migration

# Revert to the previous revision
alembic downgrade -1

# Revert to a specific revision
alembic downgrade <revision_id>

Checking Migration History

alembic history
alembic current

📖 Documentation

See CLAUDE.md for detailed development documentation.

🤝 Contributing

This project is part of the KISIA project.

📄 License

Consult the project owner about this project's license.

📧 Contact

If you have questions about the project, please open an issue.


AI-TLS-DLP Backend v1.1.0 - PII detection system based on JWT authentication + regex + BERT NER
