Background Vision Agent (Windows)
- One-command setup/run: powershell -ExecutionPolicy Bypass -File .\run.ps1
- Requires Python 3.9+ and an OPENAI_API_KEY in your user environment.
- Runs hidden (uses pythonw.exe) and listens for global hotkeys.
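Because the agent runs hidden, a missing prerequisite can fail silently. The sketch below shows one way to check the two stated requirements before launching. It is an illustration only: `check_prerequisites` is a hypothetical helper, not something run.ps1 or the agent is known to provide.

```python
import os
import sys


def check_prerequisites(env=None, version=None):
    """Return a list of problems; an empty list means both requirements are met.

    Hypothetical helper: `env` and `version` are injectable so the check can be
    exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    version = sys.version_info if version is None else version

    problems = []
    # The README asks for Python 3.9 or newer.
    if tuple(version[:2]) < (3, 9):
        problems.append("Python 3.9+ is required")
    # The README asks for OPENAI_API_KEY in the user environment.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

Running the script in the same user session that will start the agent prints nothing when both requirements are satisfied.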
Hotkeys
- Alt+Shift+1 — Capture active window (added to input buffer)
- Alt+Shift+2 — Send payload (buffered images + prompt) to OpenAI; save response
- Alt+Shift+3 — Action 3 (depends on mode)
  - Mode 1: Type the response character by character into the current input field
  - Mode 2: Clipboard mode. Primes the clipboard with the first character; each Ctrl+V advances to the next character
- Alt+Shift+4 — Reset program state (clears buffers and captured files)
- Alt+Shift+5 — Quit permanently (press 3x within 2 seconds); also deletes app data directory
- Alt+Shift+6 — Switch Action 3 mode (toggle between Mode 1 and Mode 2)
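Two of the behaviors above are stateful: the quit hotkey only fires after three presses within 2 seconds, and Mode 2 steps through the response one character per paste. The following is a minimal sketch of that logic, assuming a sliding time window for the quit confirmation. `TriplePressDetector` and `CharFeeder` are hypothetical names used for illustration, not the agent's actual classes, and no real hotkey or clipboard library is involved.

```python
import time


class TriplePressDetector:
    """Reports True once `required` presses land within `window` seconds."""

    def __init__(self, required=3, window=2.0, clock=time.monotonic):
        self.required = required
        self.window = window
        self.clock = clock  # injectable so the timing can be simulated
        self._presses = []

    def press(self):
        """Record one press; return True when the threshold is reached."""
        now = self.clock()
        # Drop presses that have fallen out of the window, then add this one.
        self._presses = [t for t in self._presses if now - t <= self.window]
        self._presses.append(now)
        if len(self._presses) >= self.required:
            self._presses.clear()
            return True
        return False


class CharFeeder:
    """Hands out a response one character at a time (Mode 2 style)."""

    def __init__(self, text):
        self._text = text
        self._index = 0

    def current(self):
        """Character that should be on the clipboard now, or None when done."""
        if self._index < len(self._text):
            return self._text[self._index]
        return None

    def advance(self):
        """Call after each paste; returns the next character to load."""
        self._index += 1
        return self.current()
```

In this sketch the caller would invoke `press()` from the Alt+Shift+5 handler and quit only when it returns True, and would copy `current()` to the clipboard, calling `advance()` after each Ctrl+V.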
Customize
- Edit defaults in bg_agent/config.py (hotkeys, model, prompt, typing speed). The endpoint is hardcoded: requests go through the official OpenAI Python SDK.
- App data directory (captures, response, logs): %LOCALAPPDATA%\BgVisionAgent.
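To locate the captures, response, and logs from a script, the app data path can be resolved from the environment. This is a sketch under the assumption that the directory is exactly the LOCALAPPDATA value joined with BgVisionAgent; `app_data_dir` is a hypothetical helper, not a function exported by the agent.

```python
import os
from pathlib import PureWindowsPath


def app_data_dir(env=None):
    """Resolve %LOCALAPPDATA%\\BgVisionAgent from an environment mapping."""
    env = os.environ if env is None else env
    base = env.get("LOCALAPPDATA")
    if not base:
        # LOCALAPPDATA is normally defined for every Windows user session.
        raise RuntimeError("LOCALAPPDATA is not set (expected on Windows)")
    # PureWindowsPath keeps Windows separators regardless of the host OS.
    return PureWindowsPath(base) / "BgVisionAgent"
```

Since quitting deletes this directory, copy out anything you want to keep before pressing the quit hotkey.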
Notes
- Only Windows is supported for now; the code is structured so that macOS/Linux window capture backends can be added later.
- No admin privileges are required. If a hotkey conflicts with another app, change it in bg_agent/config.py.
- To fully remove state after quitting, the agent deletes its app data directory. Source files and the virtual env remain unless manually removed.