From 645330ac7c4b64fa87723eb79cdfe754d34c2323 Mon Sep 17 00:00:00 2001
From: Hayden Barnes
Date: Sun, 8 Mar 2026 16:21:43 -0400
Subject: [PATCH] Update documentation to match current codebase

Rewrote README in plain, accessible language. Removed duplicate
sections, technical jargon, and formatting clutter. Added trademarks
disclaimer. Updated agents.md with current font sizes, input control
values, and new sections for ChatInputControl, hotkey infrastructure,
efficiency mode, and native interop.
---
 README.md | 260 ++++++++++++++----------------------------------------
 agents.md |  55 ++++++++++--
 2 files changed, 117 insertions(+), 198 deletions(-)

diff --git a/README.md b/README.md
index ecdcf22..685f52e 100644
--- a/README.md
+++ b/README.md
@@ -1,35 +1,31 @@
 # GitHub Copilot Taskbar GUI
 
-image
-
-.NET 11 Preview WinUI 3 desktop application providing system tray access to GitHub Copilot CLI with automatic context awareness. Detects active focus, open applications, file system state, and running services to augment prompts with relevant environment information.
-
-## Features
-
-- **System Tray Integration**: Windows notification area icon for quick access
-- **WinUI 3 Interface**: Native Windows UI with Fluent Design and Mica/DesktopAcrylic backdrop
-- **Automatic Context Detection**:
-  - Active window focus (Explorer paths, Terminal with WSL distribution detection, IDEs)
-  - Open applications and visible windows
-  - Background services (Docker, databases, language servers)
-  - WSL distributions with smart Linux prompt detection
-  - Environment variables (PYTHONPATH, NODE_ENV, DOTNET_ROOT, filtered PATH, etc.)
-  - Screenshot capture when context ambiguous (LLM vision only when needed)
-- **Conversation History**: Last 10 messages included for context continuity
-- **Context Optimization**: Tiered detection (10-500ms) prioritizes fast operations
-- **Fallback Mechanisms**: Windows Accessibility API when Win32 insufficient
-- **Smart Command Execution**: Imperative commands executed immediately with partial progress reporting
-- **"Thinking..." Indicator**: Visual feedback while processing requests
-- **Chat Persistence**: SQLite storage for message history
-- **GitHub Copilot SDK**: Direct integration with Copilot CLI (v0.1.32, 5-minute timeout for complex operations)
-
-## Prerequisites
-
-- Windows 10 1809+ (Windows 11 recommended)
+Screenshot 2026-02-03 010803
+
+> This is an experimental proof-of-concept. APIs and features may change.
+
+A Windows desktop app that puts GitHub Copilot in your system tray. It watches what you are doing on your computer and automatically gives Copilot relevant context about your active windows, folders, terminal sessions, and running services so you can ask questions without explaining your setup.
+
+## What it does
+
+- Lives in the Windows system tray for quick access
+- Detects what you are working on: open folders, terminals, IDEs, browsers, and background services
+- Recognizes WSL distributions when you are using Windows Terminal with Linux
+- Takes a screenshot only when it cannot figure out the context from text alone
+- Remembers recent conversation so you can say things like "uninstall it" after asking to install something
+- Saves chat history locally in SQLite
+- Shows a "Thinking..." indicator while waiting for a response
+
+## Requirements
+
+- Windows 10 version 1809 or later (Windows 11 recommended)
 - .NET 11 Preview SDK
 - GitHub Copilot subscription
+- Authenticated with `gh auth login`
 
-## Build
+## Getting started
+
+Clone and build:
 
 ```powershell
 git clone https://github.com/sirredbeard/ghcopilot-taskbar-gui
@@ -39,202 +35,82 @@ dotnet build --configuration Release
 dotnet run
 ```
 
-## Usage
-
-Run `CopilotTaskbarApp.exe`. Application icon appears in system tray. Click to open chat interface.
+When you run the app, an icon appears in the system tray. Click it to open the chat window.
 
-**First Run**:
-1. CLI detection runs automatically (bundled with SDK)
-2. Authentication check runs
-3. If not authenticated, prompts for `gh auth login`
+On first launch the app checks for authentication. If you have not logged in yet, it will ask you to run `gh auth login`.
 
-**Context Gathering**:
-- Automatic on every query
-- Tier 1 (10-50ms): Active window detection via Win32 Z-order, WSL Unix prompt detection
-- Tier 2 (100-200ms): File explorer, applications, environment variables, screenshot (only if context ambiguous)
-- Tier 3 (500ms+): WSL distributions list, background services (developer scenarios only)
-- Screenshot skipped when strong text context exists (faster responses)
-- Environment variables collected: PATH (filtered), PYTHONPATH, NODE_ENV, JAVA_HOME, DOTNET_ROOT, etc.
-- Conversation history: Last 10 messages included for contextual awareness
+## How context detection works
 
-**Smart Features**:
-- **Context Continuity**: Remembers previous actions ("install podman" → "uninstall it" works)
-- **WSL Distribution Detection**: Recognizes "user@hostname:~" patterns, checks running distros
-- **Actionable Commands**: Executes imperative commands immediately (install, start, configure)
-- **Partial Progress**: Reports what succeeded even if later steps fail
-- **"Thinking..." Indicator**: Shows real-time feedback during long operations
+The app gathers context in three passes, stopping early when it has enough information:
 
-**Keyboard Shortcuts**:
-- `Enter`: Send message
-- `Up/Down`: Command history
+**Fast pass (under 50ms):** Checks which window is in the foreground. If it is File Explorer, a terminal, or an IDE like VS Code or Visual Studio, the app already has strong context and skips the slower steps.
 
-## Architecture
+**Medium pass (100-200ms):** Lists open windows and open Explorer folders. Takes a screenshot if the fast pass did not produce clear context.
 
-### Components
+**Full pass (500ms or more, developer scenarios only):** Checks for running WSL distributions, background services like Docker or databases, and collects environment variables like PATH, PYTHONPATH, NODE_ENV, JAVA_HOME, and DOTNET_ROOT.
 
-- **MainWindow**: WinUI 3 UI with system tray integration (System.Windows.Forms.NotifyIcon)
-- **CopilotService**: GitHub Copilot SDK client wrapper with pattern matching for type safety
-- **ContextService**: Multi-tiered context detection (Win32, Shell COM, UI Automation)
-- **ScreenshotService**: Automatic screen capture (Base64 JPEG, 1024px max)
-- **PersistenceService**: SQLite message storage
+When a WSL distribution is active in the terminal, the app detects the Linux prompt pattern and prioritizes that environment.
 
-### Technologies
+## Keyboard shortcuts
 
-- .NET 11 Preview (Partial trimming enabled, full AOT incompatible with WinUI 3 data binding)
-- WinUI 3 with Windows App SDK
-- GitHub Copilot SDK v0.1.32 (JSON-RPC over stdio)
-- System.Windows.Forms.NotifyIcon (official Microsoft API)
-- Windows Accessibility API (UI Automation fallback)
-- SQLite for persistence
+- Enter: send message
+- Up/Down arrows: scroll through previous messages
 
-## Project Structure
+## Project layout
 
 ```
 CopilotTaskbarApp/
-├── App.xaml.cs              # Entry point
-├── MainWindow.xaml.cs       # UI and tray integration
-├── CopilotService.cs        # SDK client wrapper
-├── ContextService.cs        # Tiered context detection
-├── ScreenshotService.cs     # Screen capture
-├── PersistenceService.cs    # SQLite storage
-├── CopilotCliDetector.cs    # CLI installation checks
-├── ChatMessage.cs           # Data model
-└── Assets/                  # Icons
+  App.xaml.cs              Application entry point
+  MainWindow.xaml.cs       Chat window and tray icon
+  CopilotService.cs        Talks to the Copilot SDK
+  ContextService.cs        Gathers context from your desktop
+  ScreenshotService.cs     Screen capture for ambiguous scenarios
+  PersistenceService.cs    Saves chat history in SQLite
+  CopilotCliDetector.cs    Checks CLI availability
+  ChatMessage.cs           Message data model
+  Win11ContextMenu.cs      Windows 11 styled context menu
+  TrayMenuWindow.xaml.cs   Tray popup menu
+  Native/                  Windows API interop
+  Controls/                Chat input control (in progress)
+  Commands/                Async command helpers
+  Assets/                  Icons
 ```
 
-**Data Directory**: `%LOCALAPPDATA%\CopilotTaskbarApp\chat.db`
+Chat history is stored at `%LOCALAPPDATA%\CopilotTaskbarApp\chat.db`.
 
-## Troubleshooting
+## Building for release
 
-**CLI not found**:
-```powershell
-winget install --id GitHub.Copilot
-copilot --version
-```
+The app builds for both x64 and ARM64. It ships as a self-contained package so users do not need .NET installed.
 
-**Authentication errors**:
 ```powershell
-gh auth login
-```
-
-**Subscription errors**: Verify GitHub Copilot access on your account.
-
-**SDK Notes**:
-- SDK communicates via JSON-RPC over stdio
-- Starts bundled CLI process automatically in server mode
-- Request timeout: 300 seconds (5 minutes) for complex multi-step operations
-
-**Connection issues**:
-1. Ensure you are authenticated (`gh auth login`)
-2. Restart application
-
-**Timeout issues**: For complex multi-step commands, try breaking into separate requests. Check debug logs (see Debugging section).
-
-## Debugging
-
-### Viewing CopilotService Debug Output
-
-Detailed diagnostics are available in VS Code Debug Console:
-
-**Setup**:
-1. Open project in VS Code
-2. Press **F5** to start debugging (or Run → Start Debugging)
-3. Debug Console opens automatically showing all output, filter by [CopilotService]
-
-**What You'll See**:
-```
-[CopilotService] ===== Request START at 14:23:45.123 =====
-[CopilotService] Stage 1 (CLI Start): 0.05s
-[CopilotService] Stage 2 (Session Create): 0.12s
-[CopilotService] ===== PROMPT (2345 chars) =====
-[CopilotService] You are a desktop assistant...
-[CopilotService] ===== END PROMPT =====
-[CopilotService] Stage 3 (Sending to model)...
-[CopilotService] Stage 3 (Model Response): 18.42s
-[CopilotService] Total request time: 18.59s
-[CopilotService] ===== RESPONSE (1234 chars) =====
-[CopilotService]
-[CopilotService] ===== END RESPONSE =====
-```
-
-**Diagnosing Timeouts**:
-Logs show exactly where delays occur:
-- **Stage 1** (CLI Start): Should be <1s after first request
-- **Stage 2** (Session Create): Usually <1s
-- **Stage 3** (Model Response): Where most time is spent (varies by complexity)
-
-If timeout occurs:
-```
-[CopilotService] TIMEOUT after 300.12s!
-[CopilotService]
+dotnet publish -c Release -r win-x64
+dotnet publish -c Release -r win-arm64
 ```
 
-## Known Issues
-
-### Input Control
-
-**Solution**: Application uses `AutoSuggestBox` for text input, which avoids the cursor spacing bug that affects `TextBox` and `RichEditBox` controls in WinUI 3.
-
-**Typography**: All text uses Segoe UI 16pt (standard content) with automatic Windows text scaling support.
+Output goes to `bin\Release\net11.0-windows10.0.19041.0\{runtime}\publish\`.
 
-See AGENTS.md for detailed technical analysis.
+## CI/CD
 
-### SDK/CLI Compatibility
+Three GitHub Actions workflows handle the release pipeline:
 
-**Symptom**: Authentication errors or session creation failures
-
-**Diagnosis**: Check debug logs (see Debugging section) for CLI startup or session errors
-
-**Resolution**: Ensure CLI is properly authenticated:
-```powershell
-gh auth login
-```
+- **build.yml**: Runs on pull requests. Builds both x64 and ARM64. When called by the release workflow, also publishes and uploads zip artifacts.
+- **release.yml**: Triggered manually with a version number. Calls the build workflow, creates a git tag, publishes a GitHub Release with both zips attached, and then triggers the WinGet submission.
+- **winget-release.yml**: Downloads release assets, computes hashes, generates WinGet manifest files, and submits them to the winget-pkgs repository.
 
 ## Troubleshooting
 
-**CLI not found**: The Copilot CLI is bundled with the SDK (no separate installation required)
+**Authentication errors**: Run `gh auth login` and try again.
 
-**Authentication errors**:
-```powershell
-gh auth login
-```
+**Subscription errors**: Make sure your GitHub account has an active Copilot subscription.
 
-**Subscription errors**: Verify GitHub Copilot access on your account
+**Timeouts**: The request timeout is 5 minutes. For complex requests, try breaking them into smaller steps. Debug logs in VS Code (press F5, then check the Debug Console) show exactly where the delay is.
 
-**SDK Notes**:
-- SDK communicates via JSON-RPC over stdio
-- Starts bundled CLI process automatically in server mode
-- Request timeout: 300 seconds (5 minutes)
-
-## Publishing
-
-**Self-contained deployment** (required for unpackaged WinUI 3):
-```powershell
-dotnet publish -c Release -r win-x64 # x64
-dotnet publish -c Release -r win-arm64 # ARM64
-```
-
-Output: `bin\Release\net11.0-windows10.0.19041.0\{runtime}\publish\`
-
-**Key Dependencies**:
-- `Microsoft.WindowsAppSDK` - WinUI 3 framework
-- `GitHub.Copilot.SDK` v0.1.32 - Copilot integration
-- `Microsoft.Data.Sqlite` - Persistence
-- `CommunityToolkit.WinUI.UI.Controls.Markdown` - Message rendering
-- Framework references: WindowsForms (NotifyIcon), WPF (UI Automation)
-
-## Technical Notes
-
-**Context Detection Performance**:
-- Strong context (Explorer/Terminal/IDE + WSL detection): 15-30ms (no screenshot)
-- Weak context (generic app): 250-400ms (includes screenshot + OCR)
-- Full developer context: 500-700ms
-
-**Timeout Handling**:
-- SDK timeout: 300 seconds (5 minutes)
-- UI timeout matches SDK
-- Staged diagnostics identify bottlenecks (CLI start, session create, model response)
+**Connection issues**: Restart the app. The Copilot CLI is bundled with the SDK and starts automatically.
 
 ## License
 
 [MIT License](LICENSE)
+
+## Trademarks
+
+GitHub, the GitHub logo, GitHub Copilot, and the GitHub Copilot logo are trademarks of GitHub, Inc. This project is not affiliated with, endorsed by, or sponsored by GitHub, Inc.
diff --git a/agents.md b/agents.md
index 5b6268f..0380efe 100644
--- a/agents.md
+++ b/agents.md
@@ -43,8 +43,11 @@ This is a WinUI 3 desktop application that:
 - Uses Windows Accessibility API (UI Automation) as fallback for enhanced context inference
 - Shows "Thinking..." placeholder while processing requests
 - Persists chat history in SQLite
+- Renders assistant responses with Markdown (CommunityToolkit.WinUI.UI.Controls.Markdown)
 - Targets .NET 11 Preview with partial trimming on ARM64 and x64
 - Full Native AOT disabled due to WinUI 3 incompatibility (data binding, XAML resources)
+- Includes Efficiency Mode utilities for process QoS and priority management
+- Has a prepared ChatInputControl with file attachment, drag-drop, model selection (not yet wired into MainWindow)
 
 ### Context Inference Strategy
 
@@ -99,6 +102,9 @@ The application infers user intent/questions/problems using a **tiered optimizat
 2. **Deployment**: Self-contained deployment required for unpackaged WinUI 3 applications
 3. **SDK Integration**: Direct usage of GitHub.Copilot.SDK NuGet package with JSON-RPC communication to bundled Copilot CLI
 4. **Authentication**: Authentication via GitHub CLI (`gh auth login`) required
+5. **Native Interop**: CsWin32 source generator for type-safe P/Invoke (NativeMethods.txt lists required Win32 APIs)
+6. **Window Subclassing**: WindowSubclassBase/WindowTrayHandler for WM_TRAYICON and WM_HOTKEY message routing
+7. **Efficiency Mode**: Process QoS level (Eco/Default/High) and priority management via SetProcessInformation
 
 ## System Prompt Guidelines
 
@@ -226,9 +232,9 @@ When timeout occurs, logs show:
 ```xaml
@@ -264,9 +270,9 @@ private async void InputBox_QuerySubmitted(AutoSuggestBox sender, AutoSuggestBox
 
 WinUI 3 XAML controls use **Segoe UI** (via `ContentControlThemeFontFamily`) with standardized sizes for Windows 11 native appearance. The WinForms-based `Win11ContextMenu` still uses **Segoe UI Variable Text** at 9pt, which is appropriate for that Win32 surface and does not affect WinUI 3 cursor rendering.
 
-- **Standard content**: 18px (chat messages, input box — `FontSize="18"`)
+- **Standard content**: 18px (chat messages, input box — `FontSize="18"`, `Padding="8,6,8,6"`)
 - **Secondary text**: 13px (timestamps, metadata, copy button icons — `CaptionTextBlockStyle`)
-- **Speaker labels / headers**: 18–19px SemiBold (`FlyoutSpeakerTextStyle`, `FlyoutHeaderTextStyle`)
+- **Speaker labels / headers**: 18-19px SemiBold (`FlyoutSpeakerTextStyle`, `FlyoutHeaderTextStyle`)
 - **Tray menu (XAML)**: `ControlContentThemeFontSize` / `BodyTextBlockStyle` (system-scaled, no hardcoded sizes)
 - **Tray menu (WinForms)**: Segoe UI Variable Text 9pt
 
@@ -276,8 +282,45 @@ WinUI 3 XAML controls use **Segoe UI** (via `ContentControlThemeFontFamily`) wit
 
 ### Input Control
 
-- **AutoSuggestBox** for message input:
+- **AutoSuggestBox** for message input (MainWindow.xaml):
   - Handles Enter key via `QuerySubmitted` event
   - Avoids TextBox/RichEditBox cursor spacing bug
   - Maintains Windows 11 native look and feel
   - Up/Down arrows for command history navigation
+
+- **ChatInputControl** (Controls/ folder, not yet wired into MainWindow):
+  - TextBox-based input with AcceptsReturn, TextWrapping, spell-check
+  - File attachment via drag-drop and clipboard paste
+  - Model selection dropdown (ObservableCollection)
+  - Supported file types: ~70 text extensions, images (.jpg, .png, .gif, .webp), PDFs
+  - Events: MessageSent, FileSendRequested, StreamingStopRequested, RequestHistoryItem
+  - IsStreaming dependency property to disable input during LLM response
+
+### Hotkey Infrastructure
+
+- RegisterHotKey/UnregisterHotKey available via NativeWindow
+- WindowTrayHandler routes WM_HOTKEY to HotKeyEventReceived event
+- Not yet wired to a specific key combination in MainWindow
+
+## CI/CD Workflows
+
+### build.yml
+- Triggers on PRs (all branches) and via `workflow_call` for release pipeline
+- Matrix build: x64 and ARM64 on windows-latest
+- Installs .NET 11 Preview SDK, restores, builds Release configuration
+- When called with `version` input: also publishes, zips, and uploads artifacts
+
+### release.yml
+- Manual trigger (`workflow_dispatch`) with version input
+- Calls build.yml to produce artifacts
+- Creates git tag (`v{version}`), pushes to origin
+- Creates GitHub Release with auto-generated release notes and both zip assets
+- Calls winget-release.yml to submit to WinGet
+
+### winget-release.yml
+- Triggers on GitHub release events or via `workflow_call`
+- Downloads release zips, computes SHA256 hashes
+- Generates three WinGet manifest files (version, locale, installer)
+- Package ID: `sirredbeard.CopilotTaskbarApp`
+- Installer type: zip with nested portable exe
+- Submits to microsoft/winget-pkgs via wingetcreate