Merged
2 changes: 1 addition & 1 deletion agent/prompts/doc_templates.py
@@ -103,5 +103,5 @@
 - Ensure commands and source_dir are coherent
 - Keep naming derived from app idea

-Return YAML only. No markdown fences.
+Return JSON with one key: 'content' whose value is the YAML string. No markdown fences.
 """.strip()
Empty file added agent/utils/__init__.py
Empty file.
88 changes: 88 additions & 0 deletions agent/utils/json_utils.py
@@ -0,0 +1,88 @@
import json
import logging
import re

logger = logging.getLogger(__name__)

# Directories and patterns that LLM-generated files must never target.
_BLOCKED_PATH_PREFIXES = (
    ".github/",
    ".git/",
    ".env",
    ".gradient/",
)

_BLOCKED_EXACT_FILES = {
    ".env",
    ".gitignore",
    "Dockerfile",
}


def parse_json_response(content: str, default: dict) -> dict:
    """Parse an LLM response that should contain JSON.

    Strips markdown fences, attempts ``json.loads``, and falls back to
    extracting the first ``{...}`` block via regex. On total failure
    the *default* dict is returned with the raw response attached under
    the ``raw_response`` key.
    """
    content = content.strip()
    if content.startswith("```"):
        content = re.sub(r"^```(?:json)?\n?", "", content)
        content = re.sub(r"\n?```$", "", content)

    try:
        return json.loads(content)
    except json.JSONDecodeError:
        json_match = re.search(r"\{[\s\S]*\}", content)
        if json_match:
            try:
                return json.loads(json_match.group())
            except json.JSONDecodeError:
                logger.warning(
                    "Failed to parse extracted JSON block (length=%d)",
                    len(json_match.group()),
                )

    logger.warning(
        "Returning default for unparseable LLM response (length=%d)",
        len(content),
    )
    result = dict(default)
    result["raw_response"] = content[:500]
    return result
Comment on lines +35 to +54

⚠️ Potential issue | 🟠 Major

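Callers break immediately when the top-level JSON is not an object.

json.loads() also returns lists and strings as successful parses, but this function passes them through with no type check. The callers of this utility, agent/nodes/code_generator.py and agent/nodes/doc_generator.py, call .get() on the result right away, so valid JSON such as [] or "..." from the model raises AttributeError. When the parsed value is not an object, the function should fall back to default.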

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent/utils/json_utils.py` around lines 35 - 54, The current JSON parsing
returns any JSON type (list/string/etc.), which breaks callers like
agent/nodes/code_generator.py and doc_generator.py that expect a dict; after
each successful json.loads (both the initial parse and the fallback
json_match.parse), verify the result is a dict (mapping) and if not, fall back
to creating result = dict(default) with result["raw_response"] = content[:500]
and return that; preserve existing warning logs where parsing fails but ensure
non-dict parsed values are treated the same as parse failures so callers always
receive a dict.
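
One way to apply the suggestion above, as a minimal sketch rather than the change that was actually merged: parse as before, but treat any non-dict result as a parse failure. The name `parse_json_object` is hypothetical, logging is omitted for brevity, and the fence marker is built with `"`" * 3` only to keep literal fences out of the example.

```python
import json
import re

FENCE = "`" * 3  # markdown fence marker, built indirectly for readability here


def parse_json_object(content: str, default: dict) -> dict:
    """Hypothetical variant of parse_json_response that always returns a dict."""
    content = content.strip()
    if content.startswith(FENCE):
        content = re.sub(r"^`{3}(?:json)?\n?", "", content)
        content = re.sub(r"\n?`{3}$", "", content)

    parsed = None
    try:
        parsed = json.loads(content)
    except json.JSONDecodeError:
        # Same fallback as the original: try the first {...} block.
        match = re.search(r"\{[\s\S]*\}", content)
        if match:
            try:
                parsed = json.loads(match.group())
            except json.JSONDecodeError:
                parsed = None

    # Lists, strings, numbers and null are valid JSON but not usable by
    # callers that immediately call .get(), so only a dict is accepted.
    if isinstance(parsed, dict):
        return parsed

    result = dict(default)
    result["raw_response"] = content[:500]
    return result
```

With this shape, a response of `[]` yields the default dict plus `raw_response` instead of a list, so a caller's `.get()` call stays safe.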



def slugify(value: object, *, max_length: int = 0, fallback: str = "vibedeploy-app") -> str:
    """Convert *value* to a URL/repo-safe slug.

    * Non-alphanumeric characters (except hyphens) are removed.
    * Whitespace and underscores become single hyphens.
    * Consecutive hyphens are collapsed.
    * If *max_length* > 0 the slug is truncated and trailing hyphens
      are stripped.
    """
    text = str(value) if value else ""
    clean = re.sub(r"[^a-zA-Z0-9\s-]", "", text).strip().lower()
    clean = re.sub(r"[\s_]+", "-", clean)
    clean = re.sub(r"-+", "-", clean)
    if max_length > 0:
        clean = clean[:max_length].strip("-")
    return clean or fallback


def is_safe_file_path(path: str) -> bool:
    """Return ``True`` when *path* is safe to write to a generated repo.

    Blocks sensitive directories (``.github/``, ``.git/``) and files
    (``.env``, ``Dockerfile``) that could be exploited via prompt
    injection.
    """
    normalized = path.lstrip("/")
    if normalized in _BLOCKED_EXACT_FILES:
        return False
    for prefix in _BLOCKED_PATH_PREFIXES:
        if normalized.startswith(prefix):
            return False
    return True
Comment on lines +75 to +88

⚠️ Potential issue | 🟠 Major

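The sensitive-path block can be bypassed with .. segments and nested .env files.

Checking only lstrip("/") and startswith() lets paths such as web/../.github/workflows/deploy.yml, src/../../.env, config/.env, and web/.env.local pass unchanged. If the goal is to defend against prompt injection, normalize the path first, then block . and .. segments and every sensitive path component.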

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent/utils/json_utils.py` around lines 75 - 88, The is_safe_file_path
function currently only strips leading slashes and checks startswith, which
allows traversal and nested sensitive names to bypass (e.g., web/../.github,
config/.env); fix by fully normalizing and sanitizing the path first (use
os.path.normpath or pathlib.Path.resolve-like normalization without following
symlinks), reject any path containing ".." segments or leading "..", split the
normalized path into components and ensure none of the components exactly match
entries in _BLOCKED_EXACT_FILES and none of the path prefixes match
_BLOCKED_PATH_PREFIXES (also treat names like ".env.local" as sensitive by
matching component startswith ".env"), and keep the final boolean behavior in
is_safe_file_path.
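
A component-wise check along the lines described above might look like the following. This is a sketch under stated assumptions, not the fix adopted in the PR: the function name `is_safe_file_path_strict` and the `BLOCKED_COMPONENTS` set are hypothetical, and it rejects `.` and `..` segments outright instead of resolving them.

```python
# Hypothetical block list: directory and file names rejected at any depth.
BLOCKED_COMPONENTS = {".git", ".github", ".gradient", ".gitignore", "Dockerfile"}


def is_safe_file_path_strict(path: str) -> bool:
    """Return True only if no path component is sensitive or a traversal segment."""
    # Treat backslashes as separators and drop empty segments, which also
    # discards leading and doubled slashes.
    segments = [s for s in path.replace("\\", "/").split("/") if s]
    if not segments:
        return False
    for segment in segments:
        if segment in (".", ".."):
            return False  # reject traversal instead of trying to resolve it
        if segment in BLOCKED_COMPONENTS:
            return False
        if segment.startswith(".env"):
            return False  # covers .env, .env.local, .env.production, ...
    return True
```

Rejecting `..` outright is a deliberately conservative choice: generated file paths rarely need it, and it avoids reasoning about what a normalized path would resolve to. Note that this sketch is stricter than the original in one more respect, since it blocks `Dockerfile` and `.gitignore` in subdirectories as well as at the repository root.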
