openai-code-script-poc/README.md
Muzhen Gaming bf764fe683 first commit
2025-10-15 17:13:06 +08:00

Background Vision Agent (Windows)

  • One-command setup/run: powershell -ExecutionPolicy Bypass -File .\run.ps1
  • Requires Python 3.9+ and the OPENAI_API_KEY environment variable set for your user account.
  • Runs in the background without a console window (uses pythonw.exe) and listens for global hotkeys.
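
The two requirements above can be checked before launch. The following is a minimal sketch, not the check that run.ps1 actually performs; the function name `preflight` and its parameters are hypothetical and exist only for illustration.

```python
import os
import sys


def preflight(version_info=sys.version_info, env=None):
    """Return a list of problems; an empty list means the requirements are met.

    Hypothetical helper: the README only states the requirements,
    not how (or whether) the project verifies them.
    """
    env = os.environ if env is None else env
    problems = []
    # Requirement 1: Python 3.9 or newer.
    if tuple(version_info[:2]) < (3, 9):
        problems.append("Python 3.9+ is required")
    # Requirement 2: OPENAI_API_KEY must be present and non-empty.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```

Because the agent runs without a console, a check like this is most useful when run once from a normal terminal, where its output is visible.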

Hotkeys

  • Alt+Shift+1 — Capture active window (added to input buffer)
  • Alt+Shift+2 — Send payload (buffered images + prompt) to OpenAI; save response
  • Alt+Shift+3 — Action 3 (depends on mode)
    • Mode 1 (typing): types the response character by character into the current input field
    • Mode 2 (clipboard): primes the clipboard with the first character; each Ctrl+V advances to the next character
  • Alt+Shift+4 — Reset program state (clears buffers and captured files)
  • Alt+Shift+5 — Quit permanently (press 3x within 2 seconds); also deletes app data directory
  • Alt+Shift+6 — Switch Action 3 mode (toggle between Mode 1 and Mode 2)
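
The "press 3x within 2 seconds" rule for Alt+Shift+5 can be sketched as a small sliding-window detector. This is an illustrative sketch under assumptions, not the project's code: the class name `MultiPressDetector` and the injectable `clock` parameter are hypothetical.

```python
import time
from collections import deque


class MultiPressDetector:
    """Fires when `presses` key presses land within `window` seconds.

    Hypothetical helper; the defaults mirror the README (3 presses, 2 seconds).
    """

    def __init__(self, presses=3, window=2.0, clock=time.monotonic):
        self._presses = presses
        self._window = window
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._times = deque(maxlen=presses)  # keeps only the most recent presses

    def press(self):
        """Record one press; return True if the confirmation threshold is met."""
        now = self._clock()
        self._times.append(now)
        full = len(self._times) == self._presses
        if full and now - self._times[0] <= self._window:
            self._times.clear()  # start fresh after firing
            return True
        return False
```

Using a monotonic clock rather than wall-clock time keeps the window correct even if the system time changes between presses.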

Customize

  • Edit defaults in bg_agent/config.py (hotkeys, model, prompt, typing speed). The API endpoint is not configurable there; requests go through the official OpenAI Python SDK.
  • App data directory (captures, response, logs): %LOCALAPPDATA%\BgVisionAgent.
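
For reference, %LOCALAPPDATA%\BgVisionAgent can be resolved from Python roughly as follows. This is a sketch under assumptions: the function name `app_data_dir` is hypothetical, and the project may build the path differently.

```python
import ntpath
import os

APP_DIR_NAME = "BgVisionAgent"  # directory name taken from the README


def app_data_dir(env=None):
    """Return the app data directory path as a Windows-style string.

    Hypothetical helper. ntpath is used so the join uses backslashes
    regardless of the platform the code runs on.
    """
    env = os.environ if env is None else env
    base = env.get("LOCALAPPDATA")
    if not base:
        raise RuntimeError("LOCALAPPDATA is not set (expected on Windows)")
    return ntpath.join(base, APP_DIR_NAME)
```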

Notes

  • Only Windows is supported for now; the code is structured so that macOS/Linux window-capture backends can be added later.
  • No admin privileges are required. If a hotkey conflicts with another app, change it in bg_agent/config.py.
  • On permanent quit, the agent deletes its app data directory so that no state is left behind. Source files and the virtual environment remain until you remove them manually.
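
The cleanup described in the last note could look roughly like the sketch below. It is an assumption about the shape of the logic, not the agent's actual implementation; the function name `remove_app_data` and the directory-name guard are hypothetical.

```python
import shutil
from pathlib import Path


def remove_app_data(app_dir, expected_name="BgVisionAgent"):
    """Delete the app data directory; return True if it existed and is now gone.

    Hypothetical helper. The name check is a safety guard so that a wrong
    path (for example the whole LOCALAPPDATA folder) is never deleted.
    """
    path = Path(app_dir)
    if path.name != expected_name:
        raise ValueError(f"refusing to delete unexpected directory: {path}")
    if not path.is_dir():
        return False  # nothing to clean up
    shutil.rmtree(path, ignore_errors=True)
    return not path.exists()
```

Only the app data directory is touched here, which matches the note: the source checkout and the virtual environment are left in place.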