Merged
81 changes: 81 additions & 0 deletions .github/prompts/sync-models.prompt.md
@@ -0,0 +1,81 @@
---
mode: 'agent'
description: 'Update src/tokenEstimators.json and src/modelPricing.json with missing models found in GitHub Copilot documentation.'
tools: ['fetch', 'read_file', 'write_file', 'search_files']
---

# Sync Copilot Model Data

Update `src/tokenEstimators.json` and `src/modelPricing.json` with missing models found in GitHub Copilot documentation.

## Requirements

1. Fetch and parse the GitHub Copilot supported models documentation page:
- URL: https://docs.github.com/en/copilot/reference/ai-models/supported-models
- Extract all model names from the "Supported AI models in Copilot" section
- Model names should be normalized (lowercase, hyphens instead of spaces)
2. Compare the extracted models list to:
- `src/tokenEstimators.json` - contains character-to-token ratio estimators
- `src/modelPricing.json` - contains pricing data (cost per million tokens)
3. For each model from the documentation that is missing from either JSON file:
- Add it to the appropriate JSON file(s)
- Use sensible defaults based on existing similar models

> **IMPORTANT**: Only add models that are **explicitly listed** on the documentation page above. Do NOT add models based on your own knowledge of AI models, third-party sources, or speculation about what models might exist. If a model is not present on that specific documentation page, it must not be added.
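
A minimal sketch of the normalization and comparison steps above, in Python. The function names and the sample data are hypothetical; only the lowercase-and-hyphen rule and the two target files come from the prompt, and fetching the documentation page is left out.

```python
import re


def normalize_model_name(name: str) -> str:
    """Lowercase the name and replace runs of whitespace with a single hyphen."""
    return re.sub(r"\s+", "-", name.strip().lower())


def find_missing(doc_models, estimators, pricing):
    """Return (missing_from_estimators, missing_from_pricing), each sorted."""
    normalized = {normalize_model_name(m) for m in doc_models}
    return (
        sorted(normalized - set(estimators)),
        sorted(normalized - set(pricing)),
    )


# Hypothetical sample data, not taken from the real files or the docs page.
doc_models = ["GPT-4o", "Claude Example 1", "Example Model X"]
estimators = {"gpt-4o": 0.25}
pricing = {
    "gpt-4o": {"input": 0.0, "output": 0.0},
    "claude-example-1": {"input": 0.0, "output": 0.0},
}

missing_estimators, missing_pricing = find_missing(doc_models, estimators, pricing)
print(missing_estimators)  # models to add to the estimators file
print(missing_pricing)     # models to add to the pricing file
```

Because the candidate set is built only from `doc_models`, nothing that is absent from the documentation page can end up in either result.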

## Token Estimators (`src/tokenEstimators.json`)

For missing models in tokenEstimators.json:
- Add new entry to the `estimators` object
- Use a default ratio of `0.25` (4 characters per token) unless you can infer from model family:
- GPT-4 models: `0.25`
- GPT-3.5 models: `0.25`
- Claude models: `0.25`
- o1 models: `0.25`
- Format example:
```json
"gpt-4o": 0.25
```
- Maintain alphabetical ordering within model families
- Group related models together (e.g., all gpt-4 variants, all claude variants)
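
One possible reading of the ordering rules above, sketched in Python. The `family` heuristic (the text before the first hyphen) and the model names other than `gpt-4o` are assumptions made for illustration; the prompt itself does not define how a family is detected.

```python
DEFAULT_RATIO = 0.25  # 4 characters per token, the default named in the prompt


def family(model: str) -> str:
    """Hypothetical family heuristic: the text before the first hyphen."""
    return model.split("-", 1)[0]


def insert_grouped(estimators: dict, missing: list, ratio: float = DEFAULT_RATIO) -> dict:
    """Insert missing models next to their family, alphabetically, keeping existing order."""
    items = list(estimators.items())
    for model in sorted(missing):
        if any(name == model for name, _ in items):
            continue  # never overwrite an existing entry
        same = [i for i, (name, _) in enumerate(items) if family(name) == family(model)]
        later = [i for i in same if items[i][0] > model]
        if later:
            index = later[0]          # before the first family member that sorts after it
        elif same:
            index = same[-1] + 1      # after the last member of the family
        else:
            index = len(items)        # no related entries: append at the end
        items.insert(index, (model, ratio))
    return dict(items)


# Hypothetical sample data.
existing = {"gpt-4": 0.25, "gpt-4o": 0.25, "claude-a": 0.25}
result = insert_grouped(existing, ["gpt-4-example", "claude-b", "example-x"])
print(list(result))
```

Existing entries keep their relative order, so a hand-grouped file is not reshuffled by the insertion.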

## Model Pricing (`src/modelPricing.json`)

For missing models in modelPricing.json:
- Add new entry to the `pricing` object
- Structure:
```json
"model-name": {
"input": 0.00,
"output": 0.00
}
```
- Where `input` and `output` are cost per million tokens
- Use `0.00` as default (pricing will need manual verification later)
- Add a note to the PR body that pricing needs verification
- Maintain alphabetical ordering within model families
- Group related models together

## Metadata Updates

- **ONLY** update the `lastUpdated` field in `src/modelPricing.json` to today's date (YYYY-MM-DD format) **if you added new models to the pricing file**
- If no models were added, do NOT update the `lastUpdated` field
- Do NOT modify the `sources` section unless you have specific pricing data
- Do NOT add models that are not explicitly listed on the documentation page — if a model is not on that page, skip it entirely
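
The pricing and metadata rules above could be sketched as follows. This assumes `lastUpdated`, `sources`, and `pricing` are top-level keys of `src/modelPricing.json`, which the prompt implies but does not state; the function name and sample document are hypothetical.

```python
from datetime import date

# Placeholder cost per million tokens; the prompt asks for manual verification later.
PLACEHOLDER = {"input": 0.00, "output": 0.00}


def add_missing_pricing(document: dict, missing: list, today: str) -> dict:
    """Return a copy with placeholder entries added; bump lastUpdated only if something was added."""
    pricing = dict(document["pricing"])
    added = [model for model in missing if model not in pricing]
    for model in added:
        pricing[model] = dict(PLACEHOLDER)
    updated = {**document, "pricing": pricing}  # "sources" is carried over untouched
    if added:
        updated["lastUpdated"] = today
    return updated


# Hypothetical sample document (assumed layout).
document = {
    "lastUpdated": "2025-01-01",
    "sources": {},
    "pricing": {"gpt-4o": {"input": 0.00, "output": 0.00}},
}
today = date.today().isoformat()  # YYYY-MM-DD

unchanged = add_missing_pricing(document, ["gpt-4o"], today)
changed = add_missing_pricing(document, ["example-model-x"], today)
print(unchanged["lastUpdated"], changed["lastUpdated"])
```

Note that Python's `json` module would serialize `0.00` as `0.0`, so a real implementation may need extra care to match the file's existing number formatting.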

## Output Format

1. Make all necessary changes to both JSON files
2. Ensure proper JSON formatting (2-space indentation)
3. Maintain existing structure and patterns
4. If no changes are needed, do nothing
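
A small sketch of the output rules above: serialize with 2-space indentation and leave the file alone when the data is unchanged. The helper name and the trailing newline are assumptions, and the temporary file stands in for the real JSON files.

```python
import json
import tempfile
from pathlib import Path


def write_if_changed(path: Path, data: dict) -> bool:
    """Write data as 2-space-indented JSON; return False without writing if the content is equal."""
    if path.exists() and json.loads(path.read_text(encoding="utf-8")) == data:
        return False
    # The trailing newline is an assumption about the existing file style.
    path.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "tokenEstimators.json"
    data = {"estimators": {"gpt-4o": 0.25}}
    first = write_if_changed(target, data)   # file did not exist yet
    second = write_if_changed(target, data)  # same data, nothing written
    written = target.read_text(encoding="utf-8")

print(first, second)
```

Comparing parsed data rather than raw text avoids rewriting a file whose only difference is whitespace.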

## Constraints

- **Only add models that appear on the GitHub Copilot supported models documentation page** (`https://docs.github.com/en/copilot/reference/ai-models/supported-models`). Do NOT add models from any other source, from your own training knowledge, or that you believe might exist — only what is explicitly listed on that page.
- Only modify `src/tokenEstimators.json` and `src/modelPricing.json`
- Do not open a PR (the workflow will handle that)
- Preserve all existing entries and formatting conventions
- Use consistent spacing and indentation with existing file style
- Models should be normalized (lowercase, hyphens instead of spaces)

46 changes: 46 additions & 0 deletions .github/prompts/sync-toolnames.prompt.md
@@ -0,0 +1,46 @@
---
mode: 'agent'
description: 'Scan microsoft/vscode-copilot-chat for model-facing tool identifiers and update src/toolNames.json with any missing entries.'
tools: ['read_file', 'write_file', 'search_files', 'run_in_terminal']
---

# Sync Tool Names from vscode-copilot-chat

Scan the `microsoft/vscode-copilot-chat` repo for model-facing tool identifiers and compare them to the existing `src/toolNames.json` in our repo (the `current` folder). Only make changes in our `current` folder.

## Requirements

1. The `microsoft/vscode-copilot-chat` repository has been checked out and is available in the workspace in the folder `vscode-copilot-chat`. Use the paths provided in the Context Paths section below.
2. In the vscode-copilot-chat repo, treat `src/extension/tools/common/toolNames.ts` as the source of truth for tool IDs.
- Extract tool IDs from:
- `export enum ToolName { ... }` (string literal values)
- `export enum ContributedToolName { ... }` (string literal values)
- Ignore TypeScript keys (enum member names). Only collect the **string values** (the model-facing tool names).
3. In *this* repo, load the existing mapping file `src/toolNames.json`. Treat its top-level keys as the set of already-known tool IDs.
4. Compute `missing = (upstreamToolIds - existingMappingKeys)`.
- Deduplicate.
- Sort ascending (stable, locale-insensitive).
5. For each missing tool ID, generate a default friendly name:
- Replace `_` / `-` / `.` with spaces.
- Split camelCase / PascalCase boundaries into words.
- Capitalize each word (Title Case).
- Preserve known acronyms in ALL CAPS: `MCP`, `API`, `URL`, `JSON`, `HTTP`, `HTTPS`, `CLI`, `UI`, `IO`, `ID`. Render the brand names `vscode` and `github` as `VSCode` and `GitHub`, as in the examples below.
- Examples:
- `github_api_tool` → `GitHub API Tool`
- `copilot_readFile` → `Copilot Read File`
- `mcp.io.github.git` → `MCP IO GitHub Git`
- `search_subagent` → `Search Subagent`
- `run_in_terminal` → `Run In Terminal`
- `vscode_command` → `VSCode Command`
6. Inject **only** the missing entries into our existing mapping object in the current repository, using the same style as the mapping (leading comma with space on each line), e.g.:
```
, "some_tool": "Some Tool"
```
7. Inject the missing entries inside the JSON object, matching the organic grouping of logically related tools if possible (e.g. if there are existing entries with the same prefix, group the new entry with them). If no related entries exist, add the new entry at the end of the file, but before the closing `}`.
8. Also print (as plain text, after the delta or NO_DELTA) the upstream commit SHA used for the scan and the exact file path scanned in upstream, for traceability.
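
Steps 4 to 6 above can be sketched in Python as follows. The helper names and the `SPECIAL` table are hypothetical; the table encodes the casing shown in the examples (`GitHub`, `VSCode`) alongside the listed acronyms, and the upstream IDs in the usage are sample data.

```python
import json
import re

# Casing overrides, keyed by lowercase word (assumed from the examples in step 5).
SPECIAL = {
    "vscode": "VSCode",
    "github": "GitHub",
    **{a.lower(): a for a in ["MCP", "API", "URL", "JSON", "HTTP", "HTTPS", "CLI", "UI", "IO", "ID"]},
}


def friendly_name(tool_id: str) -> str:
    """Turn a tool ID such as 'copilot_readFile' into 'Copilot Read File'."""
    spaced = re.sub(r"[_\-.]+", " ", tool_id)
    # Split camelCase / PascalCase boundaries: readFile -> read File, HTTPServer -> HTTP Server.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", spaced)
    spaced = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", spaced)
    words = spaced.split()
    return " ".join(SPECIAL.get(w.lower(), w[:1].upper() + w[1:].lower()) for w in words)


def missing_entry_lines(upstream_ids, existing_keys):
    """Deduplicate, sort by code point, and format each entry in leading-comma style."""
    missing = sorted(set(upstream_ids) - set(existing_keys))
    return [f", {json.dumps(tool_id)}: {json.dumps(friendly_name(tool_id))}" for tool_id in missing]


# Sample data: IDs reused from the examples in step 5.
lines = missing_entry_lines(
    ["run_in_terminal", "copilot_readFile", "mcp.io.github.git", "run_in_terminal"],
    {"run_in_terminal": "Run In Terminal"},
)
print("\n".join(lines))
```

Python's default string sort compares code points, which is one way to satisfy the locale-insensitive ordering in step 4.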

## Constraints
- Only modify our toolNames.json file.
- Do not open a PR.
- Do not include tools in the list that are not model-facing (only those defined in upstream `ToolName` / `ContributedToolName` string values).
- Be resilient to minor refactors (enum order changes, added comments, etc.).
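
A hedged sketch of how the enum string values might be collected in a way that tolerates reordering and added comments. The regular expressions and the `SAMPLE` source are illustrative only; the sample is not the real content of `toolNames.ts`, and a comment containing `}` or a value containing `//` would defeat this simple approach.

```python
import re

ENUM_BLOCK = re.compile(
    r"export\s+enum\s+(ToolName|ContributedToolName)\s*\{(.*?)\}", re.DOTALL
)
MEMBER_VALUE = re.compile(r"""=\s*(['"])(.*?)\1""")


def extract_tool_ids(source: str) -> set:
    """Collect string literal values from the two enums, ignoring member names and comments."""
    ids = set()
    for _enum_name, body in ENUM_BLOCK.findall(source):
        body = re.sub(r"/\*.*?\*/", "", body, flags=re.DOTALL)  # block comments
        body = re.sub(r"//[^\n]*", "", body)                     # line comments
        ids.update(value for _quote, value in MEMBER_VALUE.findall(body))
    return ids


# Hypothetical stand-in for the upstream file.
SAMPLE = """
export enum ToolName {
	// Model-facing name for the file reader
	ReadFile = 'read_file',
	/* Disabled = 'disabled_tool', */
	RunInTerminal = "run_in_terminal",
}

export enum ContributedToolName {
	ReadFile = 'copilot_readFile',
}

export enum Unrelated {
	Other = 'not_a_tool',
}
"""

tool_ids = extract_tool_ids(SAMPLE)
print(sorted(tool_ids))
```

Matching on the enum names rather than on line positions keeps the scan independent of member order, and stripping comments first keeps commented-out members out of the result.
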
4 changes: 2 additions & 2 deletions .github/workflows/check-models.yml
@@ -5,7 +5,7 @@
push:
paths:
- .github/workflows/check-models.yml
- - .github/workflows/prompts/sync-models-prompt.md
+ - .github/prompts/sync-models.prompt.md
- src/tokenEstimators.json
- src/modelPricing.json

@@ -76,9 +76,9 @@
env:
GH_TOKEN: ${{ secrets.GH_PAT }}
COPILOT_GITHUB_TOKEN: ${{ secrets.GH_PAT }}
run: |

Check failure on line 79 in .github/workflows/check-models.yml (GitHub Actions / run-actionlint):
shellcheck reported issue in this script: SC2086:info:23:4: Double quote to prevent globbing and word splitting

# Read the prompt from the file
- PROMPT=$(cat .github/workflows/prompts/sync-models-prompt.md)
+ PROMPT=$(cat .github/prompts/sync-models.prompt.md)

# Get the paths to relevant files
ESTIMATORS_PATH="${GITHUB_WORKSPACE}/src/tokenEstimators.json"
@@ -110,7 +110,7 @@

- name: Check for changes
id: changes
run: |

Check failures on line 113 in .github/workflows/check-models.yml (GitHub Actions / run-actionlint). shellcheck reported these issues in this script:
- SC2126:style:9:41: Consider using 'grep -c' instead of 'grep|wc -l'
- SC2126:style:14:53: Consider using 'grep -c' instead of 'grep|wc -l'
- SC2086:info:3:27: Double quote to prevent globbing and word splitting
- SC2086:info:25:26: Double quote to prevent globbing and word splitting
- SC2086:info:18:33: Double quote to prevent globbing and word splitting

if git diff --quiet src/tokenEstimators.json src/modelPricing.json; then
echo "No changes detected in model data files"
echo "changed=false" >> $GITHUB_OUTPUT
4 changes: 2 additions & 2 deletions .github/workflows/sync-toolnames.yml
@@ -7,7 +7,7 @@ on:
push:
paths:
- .github/workflows/sync-toolnames.yml
- - .github/workflows/prompts/sync-toolnames-prompt.md
+ - .github/prompts/sync-toolnames.prompt.md

permissions:
contents: read
@@ -85,7 +85,7 @@ jobs:
COPILOT_GITHUB_TOKEN: ${{ secrets.GH_PAT }}
run: |
# Read the prompt from the file
- PROMPT=$(cat .github/workflows/prompts/sync-toolnames-prompt.md)
+ PROMPT=$(cat .github/prompts/sync-toolnames.prompt.md)

# Get the path to the vscode-copilot-chat repo
UPSTREAM_PATH="${GITHUB_WORKSPACE}/vscode-copilot-chat"