diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
new file mode 100644
index 0000000..8b01fae
--- /dev/null
+++ b/.github/workflows/docs.yml
@@ -0,0 +1,52 @@
+name: Deploy docs to GitHub Pages
+
+on:
+  push:
+    branches:
+      - "main"
+    paths:
+      - "docs/**"
+      - "mkdocs.yml"
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: pages
+  cancel-in-progress: true
+
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+
+    steps:
+      - uses: actions/checkout@v5
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Install MkDocs
+        run: pip install -r docs/requirements.txt
+
+      - name: Build docs
+        run: mkdocs build --strict
+
+      - name: Setup Pages
+        uses: actions/configure-pages@v5
+
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: site/
+
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
diff --git a/docs/api.md b/docs/api.md
new file mode 100644
index 0000000..3e27eb2
--- /dev/null
+++ b/docs/api.md
@@ -0,0 +1,84 @@
+# API Reference
+
+## `guide` — Tag Builder
+
+A singleton instance of `Guide` for creating masked tags. Import it as `g` by convention:
+
+```python
+from gimkit import guide as g
+```
+
+### Methods
+
+| Method | Description |
+|---|---|
+| `g(name, desc, regex, content)` | Create a generic masked tag with optional attributes. |
+| `g.single_word(name)` | A single word without spaces (`\S+`). |
+| `g.select(name, choices)` | Choose one value from the given list of options. |
+| `g.datetime(name, require_date, require_time)` | A date and/or time string, e.g. `2023-10-05 14:30:00`. |
+| `g.person_name(name)` | A person's name, e.g. *John Doe*, *张三*. |
+| `g.phone_number(name)` | A phone number, e.g. *+1-123-456-7890*. |
+| `g.e_mail(name)` | An email address, e.g. *alice@example.com*. |
+
+### `MaskedTag` attributes
+
+| Attribute | Type | Description |
+|---|---|---|
+| `name` | `str | None` | Tag name for named access in results. |
+| `desc` | `str | None` | Natural-language description sent to the model. |
+| `regex` | `str | None` | Regex pattern constraining model output. |
+| `content` | `str | None` | Filled content (set after model inference). |
+
+---
+
+## `from_openai` — OpenAI Backend
+
+```python
+from gimkit import from_openai
+from openai import OpenAI
+
+model = from_openai(client: OpenAI, model_name: str)
+result = model(query, use_gim_prompt=True)
+```
+
+Returns a callable model. Supports both synchronous and asynchronous calls.
+
+---
+
+## `from_vllm` — vLLM Server Backend
+
+```python
+from gimkit import from_vllm
+
+model = from_vllm(base_url: str, model_name: str)
+result = model(query)
+```
+
+Requires `pip install gimkit[vllm]` on Linux.
+
+---
+
+## `from_vllm_offline` — vLLM Offline Backend
+
+```python
+from gimkit import from_vllm_offline
+
+model = from_vllm_offline(model_name: str)
+result = model(query)
+```
+
+Requires `pip install gimkit[vllm]` on Linux.
+
+---
+
+## `Query` and `Response`
+
+Low-level classes for working with GIM-formatted strings directly:
+
+```python
+from gimkit.contexts import Query, Response, infill
+
+query = Query(f"Hello, {g(name='word', desc='a single word')}!")
+response = Response(f"<|GIM_RESPONSE|>...<|/GIM_RESPONSE|>")
+result = infill(query, response)
+```
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 0000000..7894160
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,24 @@
+# GIMKit
+
+**Guided Infilling Modeling Toolkit** — precise structured text generation using language models.
+
+GIMKit lets you define placeholders (masked tags) in text and have a language model fill them in. It gives you fine-grained control over model outputs through a typed tag system with optional regex constraints.
+
+[![PyPI Version](https://img.shields.io/pypi/v/gimkit?label=pypi%20package)](https://pypi.org/project/gimkit)
+[![Python Versions](https://img.shields.io/pypi/pyversions/gimkit.svg)](https://pypi.org/project/gimkit)
+[![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS%20%7C%20Windows-lightgrey)](https://pypi.org/project/gimkit)
+
+---
+
+## Features
+
+- **Masked tag system** — embed typed placeholders directly in f-strings.
+- **Regex constraints** — restrict model output to specific patterns.
+- **Named access** — retrieve results by tag name or index.
+- **Multiple backends** — OpenAI, vLLM (server and offline).
+- **Small-model friendly** — designed to work well with compact open-source models.
+
+## Design Philosophy
+
+- **Stable over feature** — reliability and correctness are prioritized above new features.
+- **Small open-source model first** — designed to work well with small, freely available language models.
diff --git a/docs/installation.md b/docs/installation.md
new file mode 100644
index 0000000..837874a
--- /dev/null
+++ b/docs/installation.md
@@ -0,0 +1,25 @@
+# Installation
+
+## Standard
+
+Install GIMKit using pip:
+
+```bash
+pip install gimkit
+```
+
+## With vLLM support
+
+Install with the optional `vllm` extra to enable the vLLM backends:
+
+```bash
+pip install gimkit[vllm]
+```
+
+!!! note
+    vLLM is only supported on Linux. On Windows and macOS, omit the `[vllm]` extra.
+
+## Requirements
+
+- Python 3.10 or later
+- Linux, macOS, or Windows
diff --git a/docs/quickstart.md b/docs/quickstart.md
new file mode 100644
index 0000000..e0a2a64
--- /dev/null
+++ b/docs/quickstart.md
@@ -0,0 +1,35 @@
+# Quick Start
+
+Here is a minimal example using the OpenAI backend.
+
+## 1. Set up the client
+
+```python
+from openai import OpenAI
+from gimkit import from_openai, guide as g
+
+client = OpenAI()  # reads OPENAI_API_KEY from environment
+model = from_openai(client, model_name="gpt-4")
+```
+
+## 2. Create a query with masked tags
+
+```python
+result = model(f"Hello, {g(desc='a single word')}!", use_gim_prompt=True)
+print(result)  # Hello, world!
+```
+
+## 3. Run a structured form
+
+```python
+query = f"""
+Name: {g.person_name(name="name")}
+Email: {g.e_mail(name="email")}
+Favorite color: {g.select(name="color", choices=["red", "green", "blue"])}
+"""
+
+result = model(query, use_gim_prompt=True)
+print(result.tags["name"].content)  # e.g. Alice
+print(result.tags["email"].content)  # e.g. alice@example.com
+print(result.tags["color"].content)  # red | green | blue
+```
diff --git a/docs/requirements.txt b/docs/requirements.txt
new file mode 100644
index 0000000..2d9b3cf
--- /dev/null
+++ b/docs/requirements.txt
@@ -0,0 +1,2 @@
+mkdocs>=1.6.1
+mkdocs-material>=9.6.23
diff --git a/docs/usage.md b/docs/usage.md
new file mode 100644
index 0000000..256dbc1
--- /dev/null
+++ b/docs/usage.md
@@ -0,0 +1,87 @@
+# Usage Guide
+
+## Creating Masked Tags
+
+Use the `guide` helper (conventionally imported as `g`) to create masked tags:
+
+```python
+from gimkit import guide as g
+
+# Basic tag with description
+tag = g(name="greeting", desc="A friendly greeting")
+
+# Specialized tags
+name_tag = g.person_name(name="user_name")
+email_tag = g.e_mail(name="email")
+phone_tag = g.phone_number(name="phone")
+word_tag = g.single_word(name="word")
+
+# Selection from choices
+choice_tag = g.select(name="color", choices=["red", "green", "blue"])
+
+# Tag with regex constraint
+code_tag = g(name="code", desc="A 4-digit PIN", regex=r"\d{4}")
+```
+
+## Building Queries
+
+Masked tags can be embedded directly in Python f-strings:
+
+```python
+from gimkit import from_openai, guide as g
+from openai import OpenAI
+
+client = OpenAI()
+model = from_openai(client, model_name="gpt-4")
+
+query = f"""
+Name: {g.person_name(name="name")}
+Email: {g.e_mail(name="email")}
+Favorite color: {g.select(name="color", choices=["red", "green", "blue"])}
+"""
+
+result = model(query, use_gim_prompt=True)
+print(result)
+```
+
+## Accessing Results
+
+Tags in the result can be accessed by index or by name:
+
+```python
+result = model(query, use_gim_prompt=True)
+
+# Iterate over all tags
+for tag in result.tags:
+    print(f"{tag.name}: {tag.content}")
+
+# Access a specific tag by name
+print(result.tags["name"].content)
+
+# Access by index
+print(result.tags[0].content)
+
+# Modify tag content
+result.tags["email"].content = "REDACTED"
+```
+
+## Using vLLM
+
+```python
+from gimkit import from_vllm
+
+model = from_vllm(base_url="http://localhost:8000", model_name="your-model")
+result = model(query)
+```
+
+For offline inference without a running server:
+
+```python
+from gimkit import from_vllm_offline
+
+model = from_vllm_offline(model_name="your-model")
+result = model(query)
+```
+
+!!! note
+    `from_vllm` and `from_vllm_offline` require `pip install gimkit[vllm]` on Linux.
diff --git a/mkdocs.yml b/mkdocs.yml
new file mode 100644
index 0000000..2f297c7
--- /dev/null
+++ b/mkdocs.yml
@@ -0,0 +1,44 @@
+site_name: GIMKit
+site_url: https://sculptai.github.io/GIMKit
+site_description: Guided Infilling Modeling Toolkit — precise structured text generation with language models.
+site_author: Shichao Song
+repo_name: SculptAI/GIMKit
+repo_url: https://github.com/SculptAI/GIMKit
+
+theme:
+  name: material
+  palette:
+    - scheme: default
+      primary: indigo
+      accent: indigo
+      toggle:
+        icon: material/brightness-7
+        name: Switch to dark mode
+    - scheme: slate
+      primary: indigo
+      accent: indigo
+      toggle:
+        icon: material/brightness-4
+        name: Switch to light mode
+  features:
+    - navigation.top
+    - content.code.copy
+
+nav:
+  - Home: index.md
+  - Installation: installation.md
+  - Quick Start: quickstart.md
+  - Usage Guide: usage.md
+  - API Reference: api.md
+
+markdown_extensions:
+  - pymdownx.highlight:
+      anchor_linenums: true
+  - pymdownx.superfences
+  - pymdownx.inlinehilite
+  - admonition
+  - attr_list
+  - tables
+
+exclude_docs: |
+  requirements.txt
diff --git a/pyproject.toml b/pyproject.toml
index f32cb17..94e8119 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -41,7 +41,7 @@ vllm = [
 
 [project.urls]
 Homepage = "https://github.com/SculptAI/GIMKit"
-Documentation = "https://github.com/SculptAI/GIMKit"
+Documentation = "https://sculptai.github.io/GIMKit"
 Repository = "https://github.com/SculptAI/GIMKit"
 Issues = "https://github.com/SculptAI/GIM/issues"
 