Screenshot Anything (macOS only)

A Gemini CLI extension that lets the AI see your screen. Capture the full screen or a specific application window on demand for visual debugging, UI analysis, and context gathering.

Features

  • On-Demand Capture: Screenshots are only taken when you ask (e.g., "Look at my screen")
  • Window Locking: Focus captures on a specific application window using fuzzy search
  • Smart Countdown: 3-second visual countdown for full-screen captures; instant capture for locked windows
  • Optimized Images: Automatically resized and compressed for fast response times
  • Session History: View previously captured screenshots from the current session
  • Multi-Display Support: Capture from any connected display
  • Local Privacy: Images are processed locally and saved to a session folder for your reference

Requirements

  • macOS (this extension uses macOS-specific APIs)
  • Node.js v18 or later
  • Xcode Command Line Tools (for window enumeration)
    xcode-select --install
  • Screen Recording Permission (you'll be prompted on first use)
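
As a rough illustration of the Node.js requirement above, a version check could look like the sketch below. The helper name `meetsNodeRequirement` is hypothetical and is not part of the extension; it only parses a version string and compares the major version against 18.

```typescript
// Hypothetical helper (not part of the extension): returns true when a
// Node.js version string such as "v18.17.0" or "20.11.1" satisfies the
// "v18 or later" requirement.
function meetsNodeRequirement(version: string, minMajor: number = 18): boolean {
  // Accept an optional leading "v", then capture the major version digits.
  const match = /^v?(\d+)\./.exec(version.trim());
  if (match === null) {
    return false; // Unrecognized format: treat as unsupported.
  }
  return Number(match[1]) >= minMajor;
}
```

In a Node.js script, passing `process.version` to this function would check the running interpreter.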

Installation

From GitHub

gemini extensions install https://github.com/LyalinDotCom/ScreenshotExtension

The extension will automatically build during installation.

Local Development

  1. Clone the repository:

    git clone https://github.com/LyalinDotCom/ScreenshotExtension.git
    cd ScreenshotExtension
  2. Install dependencies:

    npm install
  3. Link to Gemini CLI:

    gemini extensions link .

Usage

Once installed, Gemini will automatically use the screenshot tool when you ask it to look at your screen.

Natural Language Examples

  • "Look at my screen and tell me what error is in the console."
  • "Take a screenshot and analyze the layout of this web page."
  • "I'm stuck on this screen, what should I click?"
  • "What version number is shown in the bottom corner?"

Custom Commands

Command          Description
/shot            Take a screenshot of the full screen or locked window
/list            List all open visible windows
/lock <query>    Lock captures to a window matching the query (e.g., /lock Chrome)
/clear           Clear the window lock, revert to full screen capture

Window Locking Workflow

  1. Use /list to see all available windows
  2. Use /lock VS Code to focus on a specific window
  3. Use /shot or ask in natural language; captures then show only that window
  4. Use /clear to go back to full screen capture
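
The README does not say how the fuzzy search behind /lock is implemented. The sketch below shows one plausible approach, assuming a case-insensitive substring match with a fallback to in-order character (subsequence) matching. The `WindowInfo` shape and the function names are hypothetical.

```typescript
// Hypothetical window record; the extension's real data shape is not documented.
interface WindowInfo {
  id: number;
  app: string;
  title: string;
}

// True when every character of `query` appears in `text` in the same order
// (case-insensitive), e.g. "vs code" within "visual studio code".
function isSubsequence(query: string, text: string): boolean {
  const q = query.toLowerCase();
  const t = text.toLowerCase();
  let qi = 0;
  for (let ti = 0; ti < t.length && qi < q.length; ti++) {
    if (t[ti] === q[qi]) qi++;
  }
  return qi === q.length;
}

// Prefer a direct substring match on "app title"; otherwise fall back to the
// looser subsequence match. Returns undefined when nothing matches.
function findWindow(query: string, windows: WindowInfo[]): WindowInfo | undefined {
  const label = (w: WindowInfo): string => `${w.app} ${w.title}`;
  const q = query.toLowerCase();
  const direct = windows.find((w) => label(w).toLowerCase().includes(q));
  if (direct !== undefined) return direct;
  return windows.find((w) => isSubsequence(q, label(w)));
}
```

Under these assumptions, a query such as "Chrome" would match by substring, while "VS Code" would reach "Visual Studio Code" only through the subsequence fallback.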

MCP Tools

This extension provides the following tools to the AI:

Tool                       Description
take_screenshot            Captures the screen or locked window, returns the image
view_screenshot_history    Lists or retrieves previously captured screenshots
list_windows               Lists all visible application windows
focus_window               Locks capture to a specific window by fuzzy search
clear_lock                 Removes the window lock
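
For context, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `take_screenshot`. The tool's argument schema is not documented in this README, so the arguments object is left empty as an assumption, and `buildToolCall` is a hypothetical helper.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and arguments in a tools/call request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {}
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Assumption: take_screenshot accepts an empty arguments object.
const screenshotRequest = buildToolCall(1, "take_screenshot");
console.log(JSON.stringify(screenshotRequest, null, 2));
```

In normal use the Gemini CLI sends these requests for you; the sketch only shows what travels over the wire.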

Limitations

  • macOS only: Uses CoreGraphics APIs and screencapture utility
  • Screen Recording permission required: Must be granted in System Settings
  • Xcode CLI tools required: Needed for Swift-based window enumeration
  • 3-second countdown: For full-screen captures only (locked windows capture instantly)

Troubleshooting

Permission Denied

On first run, macOS prompts you to allow screen recording. If you denied the prompt, grant access manually:

  1. Go to System Settings > Privacy & Security > Screen Recording
  2. Enable the toggle for the app you run Gemini CLI from (Terminal, iTerm, VS Code, etc.)
  3. Restart your terminal and the Gemini CLI

Window List Empty or Missing Apps

Ensure Xcode Command Line Tools are installed:

xcode-select --install

Multiple Displays

The extension defaults to the primary display (index 0). To capture another display, mention it in your prompt (e.g., "take a screenshot of my second monitor").
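
To illustrate how a display or window choice might translate into a call to the macOS screencapture utility, here is a minimal sketch that only builds the argument list. It is not the extension's actual invocation. The flags used (`-x` to silence the shutter sound, `-t` for the image format, `-D` for the display, `-l` for a window ID) come from the screencapture man page. Mapping the README's zero-based display index onto `-D`, which numbers the main display as 1, is an assumption, as are the `CaptureTarget` type and the function name.

```typescript
// Hypothetical description of what to capture.
interface CaptureTarget {
  displayIndex?: number; // zero-based as in this README; 0 = primary display
  windowId?: number;     // window ID of a locked window, if any
}

// Builds an argument list for the macOS `screencapture` utility.
function buildScreencaptureArgs(outputPath: string, target: CaptureTarget = {}): string[] {
  const args: string[] = ["-x", "-t", "jpg"]; // no sound, JPEG output
  if (target.windowId !== undefined) {
    // A locked window takes precedence over the display selection.
    args.push("-l", String(target.windowId));
  } else {
    // Assumption: screencapture counts displays from 1, so shift the index.
    args.push("-D", String((target.displayIndex ?? 0) + 1));
  }
  args.push(outputPath);
  return args;
}
```

The resulting array could be passed to `execFile("screencapture", args)` from `node:child_process`; the countdown and image optimization described elsewhere in this README would be separate steps.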

File Storage

Screenshots are saved locally to:

./captures/session-{uuid}/

Each session gets a unique folder. Images are saved as compressed JPEGs (resized to a width of 1280 px, 80% quality).
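
The storage layout and resize rule above can be sketched as two small helpers. Both names are hypothetical. Leaving images narrower than 1280 px untouched, rather than upscaling them, is an assumption the README does not confirm.

```typescript
// Hypothetical helper: builds the per-session folder path described above.
function sessionDir(sessionId: string, baseDir: string = "./captures"): string {
  return `${baseDir}/session-${sessionId}`;
}

// Hypothetical helper: scales dimensions down to a maximum width while
// preserving the aspect ratio. Assumption: narrower images are not upscaled.
function fitToWidth(
  width: number,
  height: number,
  maxWidth: number = 1280
): { width: number; height: number } {
  if (width <= maxWidth) {
    return { width, height };
  }
  const scale = maxWidth / width;
  return { width: maxWidth, height: Math.round(height * scale) };
}
```

For example, a 2560x1440 capture would scale by a factor of 0.5 to 1280x720, while an 800x600 window capture would keep its size under this assumption.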

License

MIT
