Commit 2f7eec6

Merge pull request #21 from AperturePlus/develop
Develop
2 parents bbb8f00 + 962ad2f commit 2f7eec6

23 files changed

Lines changed: 825 additions & 133 deletions

.dockerignore

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+.git
+.github
+.hypothesis
+.mypy_cache
+.pytest_cache
+.ruff_cache
+.venv
+.aci
+.compare
+__pycache__
+dist
+build
+tests
+doc
+*.pyc
+*.pyo
+*.pyd
+*.log
+uv.lock

Dockerfile

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+FROM python:3.12-slim
+
+ENV PYTHONDONTWRITEBYTECODE=1 \
+    PYTHONUNBUFFERED=1
+
+WORKDIR /build
+
+COPY pyproject.toml README.md ./
+COPY src ./src
+
+RUN pip install --no-cache-dir uv \
+    && uv pip install --system .
+
+WORKDIR /data
+
+ENTRYPOINT ["aci-mcp"]

README.md

Lines changed: 52 additions & 3 deletions
@@ -83,6 +83,7 @@ aci shell
 ```
 
 This launches an interactive REPL (Read-Eval-Print Loop) with:
+
 - Command history (up/down arrows to navigate)
 - Tab completion for commands
 - Persistent history across sessions
@@ -102,7 +103,7 @@ This launches an interactive REPL (Read-Eval-Print Loop) with:
 
 ### Example Session
 
-```
+```text
 $ aci shell
 
 _ ____ ___ ____ _ _ _
@@ -142,6 +143,7 @@ Search queries support inline modifiers to filter results:
 | `exclude:<pattern>` | Alias for `-path:` | `exclude:fixtures` |
 
 Multiple exclusions can be combined:
+
 ```bash
 aci search "database query -path:tests -path:fixtures"
 ```
@@ -190,13 +192,53 @@ ACI supports the Model Context Protocol (MCP), allowing LLMs to directly interac
 }
 ```
 
-2. Ensure `.env` exists in the working directory with required settings (see `.env.example`)
+1. Ensure `.env` exists in the working directory with required settings (see `.env.example`)
 
-3. Use natural language to interact with your codebase:
+2. Use natural language to interact with your codebase:
 - "Index the current directory"
 - "Search for authentication functions"
 - "Show me the index status"
 
+### Docker Sidecar Delivery
+
+For agentic coding tools, the recommended deployment model is a local Docker sidecar:
+
+- The code repository stays on the user's machine
+- The MCP server runs in a local container
+- Qdrant runs either as another local container or as a cloud endpoint
+- The embedding API uses the user's own API key
+
+Build the image:
+
+```bash
+docker build -t aci-mcp:latest .
+```
+
+If you want a local Qdrant container, start it separately:
+
+```bash
+docker run -d --name aci-qdrant -p 6333:6333 qdrant/qdrant:latest
+```
+
+Then configure your MCP client to launch ACI through Docker. A complete template is available in `mcp-config.docker.example.json`.
+
+Important runtime rules:
+
+- Mount the host source tree read-only into the container, for example `/workspace`
+- Persist `/data` as a Docker volume so `.aci/index.db` survives container restarts
+- Set `ACI_MCP_WORKSPACE_ROOT` for relative paths
+- Set `ACI_MCP_PATH_MAPPINGS` when the MCP client sends host-native absolute paths such as `D:\repo` or `/Users/alice/repo`
+
+Example mapping values:
+
+```text
+ACI_MCP_WORKSPACE_ROOT=/workspace
+ACI_MCP_PATH_MAPPINGS=D:\repo=/workspace
+ACI_MCP_PATH_MAPPINGS=/Users/alice/repo=/workspace
+```
+
+When path mappings are configured, MCP tools can accept the host path provided by the client and resolve it to the mounted container path automatically.
+
 ### Available MCP Tools
 
 | Tool | Description |
@@ -235,6 +277,7 @@ REINDEX=1 uv run python scripts/measure_mcp_search.py
 ### Debug Mode
 
 Set `ACI_ENV=development` in `.env` to enable debug logging:
+
 ```
 ACI_ENV=development
 ```
@@ -266,6 +309,7 @@ cp .env.example .env
 ```
 
 Key settings:
+
 | Variable | Description | Required |
 |----------|-------------|----------|
 | `ACI_EMBEDDING_API_KEY` | API key for embedding service | Yes |
@@ -275,6 +319,8 @@ Key settings:
 | `ACI_VECTOR_STORE_API_KEY` | Qdrant API key (for Qdrant Cloud) | No |
 | `ACI_VECTOR_STORE_HOST` | Qdrant host | No (defaults to localhost) |
 | `ACI_VECTOR_STORE_PORT` | Qdrant port | No (defaults to 6333) |
+| `ACI_MCP_WORKSPACE_ROOT` | Base directory for relative MCP paths inside the container/runtime | No |
+| `ACI_MCP_PATH_MAPPINGS` | Host-to-container path prefix mappings for MCP, separated by `;` | No |
 | `ACI_SERVER_HOST` | HTTP server host | No (defaults to 0.0.0.0) |
 | `ACI_SERVER_PORT` | HTTP server port | No (defaults to 8000) |
 | `ACI_ENV` | Environment (development/production) | No |
@@ -284,3 +330,6 @@ See `.env.example` for the full list of options.
 The CLI and HTTP server will attempt to auto-start a local Qdrant Docker container only when
 targeting a local endpoint (`localhost` / `127.0.0.1`). For cloud Qdrant (`ACI_VECTOR_STORE_URL`),
 it will not run Docker.
+
+When ACI itself is running inside a container, it will not attempt to launch nested Docker for Qdrant.
+In that setup, run Qdrant as a separate local container or point `ACI_VECTOR_STORE_URL` to Qdrant Cloud.
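
Review note: the path-mapping rules described in this README change can be sketched in a few lines of Python. This is a minimal illustration, not the code from this commit. The function names `parse_path_mappings` and `resolve_path`, the longest-prefix-first ordering, and the backslash normalization are all assumptions; only the `host=container` entry shape, the `;` separator, and the two environment variable names come from the README.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def parse_path_mappings(raw: str) -> list[tuple[str, str]]:
    """Parse 'host=container;host2=container2' into (host, container) pairs."""
    mappings: list[tuple[str, str]] = []
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last '=' so the host side is taken whole.
        host, sep, container = entry.rpartition("=")
        if not sep or not host or not container:
            raise ValueError(f"invalid mapping entry: {entry!r}")
        mappings.append((host, container))
    # Assumption: prefer the most specific (longest) host prefix.
    mappings.sort(key=lambda pair: len(pair[0]), reverse=True)
    return mappings


def _normalize(path: str) -> str:
    # Treat backslashes as separators so Windows-style paths compare cleanly.
    return path.replace("\\", "/").rstrip("/")


def resolve_path(path: str, mappings: list[tuple[str, str]], workspace_root: str) -> str:
    """Map a client-supplied path to a path inside the container."""
    norm = _normalize(path)
    for host, container in mappings:
        host_norm = _normalize(host)
        # Match whole path components only, so /repo does not match /repo2.
        if norm == host_norm or norm.startswith(host_norm + "/"):
            return container.rstrip("/") + norm[len(host_norm):]
    is_absolute = norm.startswith("/") or (len(norm) > 1 and norm[1] == ":")
    if is_absolute:
        return norm  # unmapped absolute path: returned unchanged in this sketch
    # Relative paths are resolved against the workspace root.
    return str(PurePosixPath(workspace_root) / norm)


mappings = parse_path_mappings(r"D:\repo=/workspace;/Users/alice/repo=/workspace")
print(resolve_path(r"D:\repo\src\main.py", mappings, "/workspace"))
print(resolve_path("src/app.py", mappings, "/workspace"))
```

Matching on whole path components is a deliberate choice here: a plain `startswith` on the raw prefix would also rewrite sibling directories such as `/Users/alice/repo2`.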

doc/README.zh-CN.md

Lines changed: 47 additions & 2 deletions
@@ -82,6 +82,7 @@ aci shell
 ```
 
 After launch you enter a REPL (Read-Eval-Print Loop), which includes:
+
 - Command history (browse with the up/down arrow keys)
 - Command auto-completion (Tab)
 - History persisted across sessions
@@ -190,13 +191,53 @@ ACI supports the Model Context Protocol (MCP), so LLMs can directly call your code
 }
 ```
 
-2. Ensure `.env` exists in the working directory and is fully configured (see `../.env.example`)
+1. Ensure `.env` exists in the working directory and is fully configured (see `../.env.example`)
 
-3. You can interact with the codebase in natural language, for example:
+2. You can interact with the codebase in natural language, for example:
 - "Index the current directory"
 - "Search for authentication-related functions"
 - "Show the current index status"
 
+### Delivering MCP as a Docker Sidecar
+
+For agentic coding tools, the recommended delivery model is a local Docker sidecar:
+
+- The code repository stays on the user's machine
+- The MCP server runs in a local container
+- Qdrant can be another local container or a cloud endpoint
+- The embedding API uses the user's own key
+
+Build the image:
+
+```bash
+docker build -t aci-mcp:latest .
+```
+
+If you use a local Qdrant container, start it separately:
+
+```bash
+docker run -d --name aci-qdrant -p 6333:6333 qdrant/qdrant:latest
+```
+
+Then have the MCP client launch ACI through Docker. For a complete template, see `mcp-config.docker.example.json`.
+
+Runtime conventions:
+
+- Mount the host source directory read-only into the container, for example at `/workspace`
+- Mount `/data` as a Docker volume so `.aci/index.db` is not lost when the container is recreated
+- Set `ACI_MCP_WORKSPACE_ROOT` for relative-path scenarios
+- If the MCP client sends host absolute paths, such as `D:\repo` or `/Users/alice/repo`, set `ACI_MCP_PATH_MAPPINGS`
+
+Example:
+
+```text
+ACI_MCP_WORKSPACE_ROOT=/workspace
+ACI_MCP_PATH_MAPPINGS=D:\repo=/workspace
+ACI_MCP_PATH_MAPPINGS=/Users/alice/repo=/workspace
+```
+
+Once configured, MCP tools can accept host paths sent by the client and automatically resolve them to the mounted paths inside the container.
+
 ### Available MCP Tools
 
 | Tool | Description |
@@ -277,10 +318,14 @@ cp .env.example .env
 | `ACI_VECTOR_STORE_API_KEY` | Qdrant API key (Qdrant Cloud) | No |
 | `ACI_VECTOR_STORE_HOST` | Qdrant host address | No (defaults to localhost) |
 | `ACI_VECTOR_STORE_PORT` | Qdrant port | No (defaults to 6333) |
+| `ACI_MCP_WORKSPACE_ROOT` | Base directory MCP uses to resolve relative paths inside the container/runtime | No |
+| `ACI_MCP_PATH_MAPPINGS` | Host path prefix to container path prefix mappings used by MCP, separated by `;` | No |
 | `ACI_SERVER_HOST` | HTTP server host address | No (defaults to 0.0.0.0) |
 | `ACI_SERVER_PORT` | HTTP server port | No (defaults to 8000) |
 | `ACI_ENV` | Runtime environment (development/production) | No |
 
 For the full configuration, see `../.env.example`.
 
 The CLI and HTTP server attempt to auto-start a local Qdrant Docker container only when the target is a local endpoint (`localhost` / `127.0.0.1`). When using cloud Qdrant (`ACI_VECTOR_STORE_URL`), Docker is not started.
+
+When ACI itself runs inside a container, it no longer attempts to start nested Docker to launch Qdrant. In that case, run Qdrant as a separate local container, or configure `ACI_VECTOR_STORE_URL` to point directly at Qdrant Cloud.
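
Review note: both READMEs state that Qdrant is auto-started only for a local endpoint and never from inside a container. The decision could look roughly like the sketch below. The function names and the `/.dockerenv` marker check are assumptions chosen for illustration; the commit excerpt does not show how ACI actually detects a container.

```python
from __future__ import annotations

from pathlib import Path
from urllib.parse import urlparse

# Hosts the README treats as "local endpoints".
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def running_in_container(marker: Path = Path("/.dockerenv")) -> bool:
    """Heuristic: Docker creates /.dockerenv inside containers it runs."""
    return marker.exists()


def should_autostart_qdrant(host: str, url: str | None, in_container: bool) -> bool:
    """Decide whether launching a local Qdrant container makes sense."""
    if in_container:
        # No nested Docker: Qdrant must be a sibling container or a cloud endpoint.
        return False
    # An explicit URL takes precedence over the host setting in this sketch.
    target = urlparse(url).hostname if url else host
    return target in LOCAL_HOSTS


print(should_autostart_qdrant("localhost", None, in_container=False))
print(should_autostart_qdrant("localhost", None, in_container=True))
```

The marker-file check is only a heuristic and does not cover every container runtime, so a real implementation may need an explicit opt-out as well.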

mcp-config.docker.example.json

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+{
+  "mcpServers": {
+    "aci": {
+      "command": "docker",
+      "args": [
+        "run",
+        "-i",
+        "--rm",
+        "-v",
+        "HOST_SOURCE_PATH:/workspace:ro",
+        "-v",
+        "aci-mcp-data:/data",
+        "-e",
+        "ACI_MCP_WORKSPACE_ROOT=/workspace",
+        "-e",
+        "ACI_MCP_PATH_MAPPINGS=HOST_SOURCE_PATH=/workspace",
+        "-e",
+        "ACI_EMBEDDING_API_URL",
+        "-e",
+        "ACI_EMBEDDING_API_KEY",
+        "-e",
+        "ACI_EMBEDDING_MODEL",
+        "-e",
+        "ACI_EMBEDDING_DIMENSION",
+        "-e",
+        "ACI_VECTOR_STORE_URL=http://host.docker.internal:6333",
+        "-e",
+        "ACI_VECTOR_STORE_VECTOR_SIZE=1024",
+        "aci-mcp:latest"
+      ],
+      "env": {
+        "ACI_EMBEDDING_API_URL": "https://api.openai.com/v1/embeddings",
+        "ACI_EMBEDDING_API_KEY": "your_api_key_here",
+        "ACI_EMBEDDING_MODEL": "text-embedding-3-small",
+        "ACI_EMBEDDING_DIMENSION": "1024"
+      }
+    }
+  }
+}
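
Review note: `HOST_SOURCE_PATH` appears twice in this template (the bind mount and the path mapping), and both occurrences need the same value. A small helper along the following lines could fill it in. The helper name `fill_host_source_path` and the trimmed template are illustrative assumptions, not part of this commit.

```python
import json

PLACEHOLDER = "HOST_SOURCE_PATH"


def fill_host_source_path(template_text: str, host_source_path: str) -> dict:
    """Return the parsed MCP config with every placeholder in `args` replaced."""
    config = json.loads(template_text)
    server = config["mcpServers"]["aci"]
    server["args"] = [arg.replace(PLACEHOLDER, host_source_path) for arg in server["args"]]
    return config


# Trimmed copy of the template, kept short for the example.
TEMPLATE = """
{
  "mcpServers": {
    "aci": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-v", "HOST_SOURCE_PATH:/workspace:ro",
        "-e", "ACI_MCP_PATH_MAPPINGS=HOST_SOURCE_PATH=/workspace",
        "aci-mcp:latest"
      ]
    }
  }
}
"""

config = fill_host_source_path(TEMPLATE, "/Users/alice/repo")
print(json.dumps(config, indent=2))
```

Working on the parsed JSON rather than the raw text means `json.dumps` handles escaping when the config is written back, which matters for Windows paths containing backslashes.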

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ classifiers = [
     "Topic :: Utilities",
 ]
 
-# 运行时核心依赖(字母顺序)
+# runtime dependencies
 dependencies = [
     "fastapi>=0.111.0",
     "httpx>=0.25.0",

src/aci/core/__init__.py

Lines changed: 13 additions & 9 deletions
@@ -41,11 +41,13 @@
     ScannedFile,
     get_default_registry,
 )
-from aci.core.tokenizer import (
-    TiktokenTokenizer,
-    TokenizerInterface,
-    get_default_tokenizer,
-)
+from aci.core.tokenizer import (
+    CharacterTokenizer,
+    SimpleTokenizer,
+    TiktokenTokenizer,
+    TokenizerInterface,
+    get_default_tokenizer,
+)
 from aci.core.watch_config import WatchConfig
 
 __all__ = [
@@ -70,10 +72,12 @@
     "TreeSitterParser",
     "SUPPORTED_LANGUAGES",
     "check_tree_sitter_setup",
-    # Tokenizer
-    "TokenizerInterface",
-    "TiktokenTokenizer",
-    "get_default_tokenizer",
+    # Tokenizer
+    "TokenizerInterface",
+    "TiktokenTokenizer",
+    "CharacterTokenizer",
+    "SimpleTokenizer",
+    "get_default_tokenizer",
     # Chunker
     "CodeChunk",
     "ChunkerConfig",

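Review note: this excerpt exports `CharacterTokenizer` and `SimpleTokenizer` but does not show their implementations. For readers unfamiliar with the idea, a dependency-free token counter selected by name might look like the sketch below. Every name here (`TokenCounter`, `CharacterCounter`, `WhitespaceCounter`, `make_counter`, the registry keys, and the 4-characters-per-token ratio) is hypothetical and deliberately differs from the real class names to avoid implying this is ACI's code.

```python
import math
from typing import Protocol


class TokenCounter(Protocol):
    """Hypothetical minimal interface: anything that can count tokens."""

    def count_tokens(self, text: str) -> int: ...


class CharacterCounter:
    """Estimate tokens as ceil(len(text) / chars_per_token)."""

    def __init__(self, chars_per_token: int = 4) -> None:
        if chars_per_token <= 0:
            raise ValueError("chars_per_token must be positive")
        self.chars_per_token = chars_per_token

    def count_tokens(self, text: str) -> int:
        return math.ceil(len(text) / self.chars_per_token)


class WhitespaceCounter:
    """Count whitespace-separated words as tokens."""

    def count_tokens(self, text: str) -> int:
        return len(text.split())


# Hypothetical registry keyed by a config string.
_REGISTRY = {"character": CharacterCounter, "simple": WhitespaceCounter}


def make_counter(name: str) -> TokenCounter:
    """Look up a counter by name, failing loudly on unknown names."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(
            f"unknown tokenizer {name!r}; expected one of {sorted(_REGISTRY)}"
        ) from None


print(make_counter("character").count_tokens("def foo(): pass"))
print(make_counter("simple").count_tokens("def foo(): pass"))
```

Counters like these trade accuracy for zero dependencies, which can be useful in a slim container image where downloading tokenizer data is undesirable.
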
src/aci/core/config.py

Lines changed: 2 additions & 0 deletions
@@ -133,6 +133,7 @@ class IndexingConfig:
         default_factory=lambda: _get_default("indexing", "chunk_overlap_lines", 2)
     )
     max_workers: int = field(default_factory=lambda: _get_default("indexing", "max_workers", 4))
+    tokenizer: str = field(default_factory=lambda: _get_default("indexing", "tokenizer", "tiktoken"))
 
 
 @dataclass
@@ -226,6 +227,7 @@ def apply_env_overrides(self) -> "ACIConfig":
             "ACI_INDEXING_MAX_CHUNK_TOKENS": ("indexing", "max_chunk_tokens", int),
             "ACI_INDEXING_CHUNK_OVERLAP_LINES": ("indexing", "chunk_overlap_lines", int),
             "ACI_INDEXING_MAX_WORKERS": ("indexing", "max_workers", int),
+            "ACI_TOKENIZER": ("indexing", "tokenizer", str),
             "ACI_INDEXING_FILE_EXTENSIONS": (
                 "indexing",
                 "file_extensions",

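Review note: the `(section, field, type)` tuples in this hunk suggest a table-driven override loop. The sketch below shows how such a table could be applied. The dataclasses `IndexingSketch` and `ConfigSketch` and the standalone `apply_env_overrides` function are stand-ins written for this note; the real `ACIConfig.apply_env_overrides` body is not shown in the diff.

```python
from dataclasses import dataclass, field
from typing import Mapping


@dataclass
class IndexingSketch:
    # Defaults mirror the values visible in the diff.
    max_workers: int = 4
    tokenizer: str = "tiktoken"


@dataclass
class ConfigSketch:
    indexing: IndexingSketch = field(default_factory=IndexingSketch)


# env var -> (section attribute, field name, type used to cast the raw string)
ENV_OVERRIDES = {
    "ACI_INDEXING_MAX_WORKERS": ("indexing", "max_workers", int),
    "ACI_TOKENIZER": ("indexing", "tokenizer", str),
}


def apply_env_overrides(config: ConfigSketch, environ: Mapping[str, str]) -> ConfigSketch:
    """Overwrite config fields for every override variable present in `environ`."""
    for var, (section, attr, cast) in ENV_OVERRIDES.items():
        raw = environ.get(var)
        if raw is None:
            continue  # variable not set: keep the default
        setattr(getattr(config, section), attr, cast(raw))
    return config


cfg = apply_env_overrides(
    ConfigSketch(), {"ACI_TOKENIZER": "character", "ACI_INDEXING_MAX_WORKERS": "8"}
)
print(cfg)
```

Note that the new variable is named `ACI_TOKENIZER` while its neighbours use the `ACI_INDEXING_` prefix; that may be intentional, but it is worth confirming before release.
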