diff --git a/.github/prompts/sync-models.prompt.md b/.github/prompts/sync-models.prompt.md
new file mode 100644
index 00000000..56350b25
--- /dev/null
+++ b/.github/prompts/sync-models.prompt.md
@@ -0,0 +1,81 @@
+---
+mode: 'agent'
+description: 'Update src/tokenEstimators.json and src/modelPricing.json with missing models found in GitHub Copilot documentation.'
+tools: ['fetch', 'read_file', 'write_file', 'search_files']
+---
+
+# Sync Copilot Model Data
+
+Update `src/tokenEstimators.json` and `src/modelPricing.json` with missing models found in GitHub Copilot documentation.
+
+## Requirements
+
+1. Fetch and parse the GitHub Copilot supported models documentation page:
+   - URL: https://docs.github.com/en/copilot/reference/ai-models/supported-models
+   - Extract all model names from the "Supported AI models in Copilot" section
+   - Model names should be normalized (lowercase, hyphens instead of spaces)
+2. Compare the extracted models list to:
+   - `src/tokenEstimators.json` - contains character-to-token ratio estimators
+   - `src/modelPricing.json` - contains pricing data (cost per million tokens)
+3. For each model from the documentation that is missing from either JSON file:
+   - Add it to the appropriate JSON file(s)
+   - Use sensible defaults based on existing similar models
+
+> **IMPORTANT**: Only add models that are **explicitly listed** on the documentation page above. Do NOT add models based on your own knowledge of AI models, third-party sources, or speculation about what models might exist. If a model is not present on that specific documentation page, it must not be added.
+
+## Token Estimators (`src/tokenEstimators.json`)
+
+For missing models in tokenEstimators.json:
+- Add new entry to the `estimators` object
+- Use a default ratio of `0.25` (4 characters per token) unless you can infer from model family:
+  - GPT-4 models: `0.25`
+  - GPT-3.5 models: `0.25`
+  - Claude models: `0.25`
+  - o1 models: `0.25`
+- Format example:
+  ```json
+  "gpt-4o": 0.25
+  ```
+- Maintain alphabetical ordering within model families
+- Group related models together (e.g., all gpt-4 variants, all claude variants)
+
+## Model Pricing (`src/modelPricing.json`)
+
+For missing models in modelPricing.json:
+- Add new entry to the `pricing` object
+- Structure:
+  ```json
+  "model-name": {
+    "input": 0.00,
+    "output": 0.00
+  }
+  ```
+- Where `input` and `output` are cost per million tokens
+- Use `0.00` as default (pricing will need manual verification later)
+- Add a note to the PR body that pricing needs verification
+- Maintain alphabetical ordering within model families
+- Group related models together
+
+## Metadata Updates
+
+- **ONLY** update `lastUpdated` field in `src/modelPricing.json` to today's date (YYYY-MM-DD format) **if you added new models to the pricing file**
+- If no models were added, do NOT update the `lastUpdated` field
+- Do NOT modify the `sources` section unless you have specific pricing data
+- Do NOT add models that are not explicitly listed on the documentation page; if a model is not on that page, skip it entirely
+
+## Output Format
+
+1. Make all necessary changes to both JSON files
+2. Ensure proper JSON formatting (2-space indentation)
+3. Maintain existing structure and patterns
+4. If no changes are needed, do nothing
+
+## Constraints
+
+- **Only add models that appear on the GitHub Copilot supported models documentation page** (`https://docs.github.com/en/copilot/reference/ai-models/supported-models`). Do NOT add models from any other source, from your own training knowledge, or that you believe might exist; add only what is explicitly listed on that page.
+- Only modify `src/tokenEstimators.json` and `src/modelPricing.json`
+- Do not open a PR (the workflow will handle that)
+- Preserve all existing entries and formatting conventions
+- Use consistent spacing and indentation with existing file style
+- Models should be normalized (lowercase, hyphens instead of spaces)
+
diff --git a/.github/prompts/sync-toolnames.prompt.md b/.github/prompts/sync-toolnames.prompt.md
new file mode 100644
index 00000000..a12a11dd
--- /dev/null
+++ b/.github/prompts/sync-toolnames.prompt.md
@@ -0,0 +1,46 @@
+---
+mode: 'agent'
+description: 'Scan microsoft/vscode-copilot-chat for model-facing tool identifiers and update src/toolNames.json with any missing entries.'
+tools: ['read_file', 'write_file', 'search_files', 'run_in_terminal']
+---
+
+# Sync Tool Names from vscode-copilot-chat
+
+Scan the `microsoft/vscode-copilot-chat` repo for model-facing tool identifiers and compare them to the existing `src/toolNames.json` in our repo in the `current` folder. Only make changes in our current folder.
+
+## Requirements
+
+1. The `microsoft/vscode-copilot-chat` repository has been checked out and is available in the workspace in the folder `vscode-copilot-chat`. Use the paths provided in the Context Paths section below.
+2. In the vscode-copilot-chat repo, treat `src/extension/tools/common/toolNames.ts` as the source of truth for tool IDs.
+   - Extract tool IDs from:
+     - `export enum ToolName { ... }` (string literal values)
+     - `export enum ContributedToolName { ... }` (string literal values)
+   - Ignore TypeScript keys (enum member names). Only collect the **string values** (the model-facing tool names).
+3. In *this* repo, load the existing mapping file `src/toolNames.json`. Treat its top-level keys as the set of already-known tool IDs.
+4. Compute `missing = (upstreamToolIds - existingMappingKeys)`.
+   - Deduplicate.
+   - Sort ascending (stable, locale-insensitive).
+5. For each missing tool ID, generate a default friendly name:
+   - Replace `_` / `-` / `.` with spaces.
+   - Split camelCase / PascalCase boundaries into words.
+   - Uppercase words (Title Case).
+   - Preserve known acronyms and product names in their canonical casing: `VSCode`, `MCP`, `GitHub`, `API`, `URL`, `JSON`, `HTTP`, `HTTPS`, `CLI`, `UI`, `IO`, `ID`.
+   - Examples:
+     - `github_api_tool` → `GitHub API Tool`
+     - `copilot_readFile` → `Copilot Read File`
+     - `mcp.io.github.git` → `MCP IO GitHub Git`
+     - `search_subagent` → `Search Subagent`
+     - `run_in_terminal` → `Run In Terminal`
+     - `vscode_command` → `VSCode Command`
+6. Inject **only** the missing entries into our existing mapping object in the current repository, using the same style as the mapping (leading comma with space on each line), e.g.:
+   ```
+   , "some_tool": "Some Tool"
+   ```
+7. Inject the missing entries inside the JSON object, matching the organic grouping of logically related tools if possible (e.g. if there are existing entries with the same prefix, group the new entry with them). If no related entries exist, add the new entry at the end of the file, but before the closing `}`.
+8. Also print (as plain text, after the delta or NO_DELTA) the upstream commit SHA used for the scan and the exact file path scanned in upstream, for traceability.
+
+## Constraints
+- Only modify our toolNames.json file.
+- Do not open a PR.
+- Do not include tools in the list that are not model-facing (only those defined in upstream `ToolName` / `ContributedToolName` string values).
+- Be resilient to minor refactors (enum order changes, added comments, etc.).
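Steps 4 and 5 of `sync-toolnames.prompt.md` (set difference, then friendly-name generation) can be sketched in Python as below. This is only an illustrative sketch, not code from this PR: the function names `missing_tool_ids` and `friendly_name` and the `SPECIAL_CASING` table are hypothetical, and the casing table is an assumption that follows the prompt's worked examples (`GitHub`, `VSCode`).

```python
import re

# Hypothetical casing table (assumption): keyed by lowercase word, with values
# taken from the casing used in the prompt's worked examples.
SPECIAL_CASING = {
    "vscode": "VSCode", "mcp": "MCP", "github": "GitHub", "api": "API",
    "url": "URL", "json": "JSON", "http": "HTTP", "https": "HTTPS",
    "cli": "CLI", "ui": "UI", "io": "IO", "id": "ID",
}


def missing_tool_ids(upstream_ids, existing_keys):
    """Step 4: deduplicated set difference, sorted by code point (locale-insensitive)."""
    return sorted(set(upstream_ids) - set(existing_keys))


def friendly_name(tool_id):
    """Step 5: derive a default display name from a tool ID."""
    # Replace `_`, `-`, and `.` separators with spaces.
    text = re.sub(r"[_\-.]+", " ", tool_id)
    # Split camelCase / PascalCase at each lowercase-or-digit to uppercase boundary.
    text = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    words = text.split()
    # Use the special casing when the word is known; otherwise capitalize the first letter.
    return " ".join(
        SPECIAL_CASING.get(word.lower(), word[:1].upper() + word[1:])
        for word in words
    )


if __name__ == "__main__":
    for tool_id in missing_tool_ids(
        ["run_in_terminal", "copilot_readFile", "copilot_readFile"],
        {"run_in_terminal": "Run In Terminal"},
    ):
        # Emit in the mapping's leading-comma style from step 6.
        print(f', "{tool_id}": "{friendly_name(tool_id)}"')
```

The sketch only splits at a lowercase-to-uppercase boundary, so an ID such as `readJSONFile` would need an extra rule; the prompt's six examples do not exercise that case.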
diff --git a/.github/workflows/check-models.yml b/.github/workflows/check-models.yml
index 49e91048..f1679242 100644
--- a/.github/workflows/check-models.yml
+++ b/.github/workflows/check-models.yml
@@ -5,7 +5,7 @@ on:
   push:
     paths:
       - .github/workflows/check-models.yml
-      - .github/workflows/prompts/sync-models-prompt.md
+      - .github/prompts/sync-models.prompt.md
       - src/tokenEstimators.json
       - src/modelPricing.json
 
@@ -78,7 +78,7 @@ jobs:
           COPILOT_GITHUB_TOKEN: ${{ secrets.GH_PAT }}
         run: |
           # Read the prompt from the file
-          PROMPT=$(cat .github/workflows/prompts/sync-models-prompt.md)
+          PROMPT=$(cat .github/prompts/sync-models.prompt.md)
 
           # Get the paths to relevant files
           ESTIMATORS_PATH="${GITHUB_WORKSPACE}/src/tokenEstimators.json"
diff --git a/.github/workflows/sync-toolnames.yml b/.github/workflows/sync-toolnames.yml
index 11bfcc57..1ad9a397 100644
--- a/.github/workflows/sync-toolnames.yml
+++ b/.github/workflows/sync-toolnames.yml
@@ -7,7 +7,7 @@ on:
   push:
     paths:
       - .github/workflows/sync-toolnames.yml
-      - .github/workflows/prompts/sync-toolnames-prompt.md
+      - .github/prompts/sync-toolnames.prompt.md
 
 permissions:
   contents: read
@@ -85,7 +85,7 @@ jobs:
           COPILOT_GITHUB_TOKEN: ${{ secrets.GH_PAT }}
         run: |
           # Read the prompt from the file
-          PROMPT=$(cat .github/workflows/prompts/sync-toolnames-prompt.md)
+          PROMPT=$(cat .github/prompts/sync-toolnames.prompt.md)
 
           # Get the path to the vscode-copilot-chat repo
           UPSTREAM_PATH="${GITHUB_WORKSPACE}/vscode-copilot-chat"