Background Vision Agent (Windows)

PROOF OF CONCEPT.

  • One-command setup/run: powershell -ExecutionPolicy Bypass -File .\run.ps1
  • Requires Python 3.11+ when running from source. Configure your API key in bg_agent/config.py or via the OPENAI_API_KEY env var.
  • Runs hidden (uses pythonw.exe) and listens for global hotkeys.

Hotkeys

  • Ctrl+Shift+1 — Capture active window (added to input buffer)
  • Ctrl+Shift+2 — Send payload (buffered images + prompt) to OpenAI; save response
  • Ctrl+Shift+3 — Action 3 (depends on mode)
    • Mode 1: Type response char-by-char into current input field
    • Mode 2: Clipboard mode: primes clipboard with first char; every Ctrl+V advances to next char
  • Ctrl+Shift+4 — Reset program state (clears buffers and captured files)
  • Ctrl+Shift+5 — Quit permanently (press 3x within 2 seconds); also deletes app data directory
  • Ctrl+Shift+6 — Switch Action 3 mode (toggle between Mode 1 and Mode 2)
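
The "press 3x within 2 seconds" rule for the quit hotkey can be sketched as a sliding-window counter. This is an illustration only, not the agent's actual code: the class name MultiPressDetector and the injectable clock are hypothetical, and only the counts (3 presses, 2 seconds) come from the list above.

```python
import time
from collections import deque


class MultiPressDetector:
    """Hypothetical helper: fires once `required` presses land within `window` seconds."""

    def __init__(self, required=3, window=2.0, clock=time.monotonic):
        self.required = required
        self.window = window
        self.clock = clock  # injectable so the logic can be exercised without waiting
        self._presses = deque()

    def press(self):
        """Record one hotkey press; return True when the quit should proceed."""
        now = self.clock()
        self._presses.append(now)
        # Forget presses that are older than the time window.
        while self._presses and now - self._presses[0] > self.window:
            self._presses.popleft()
        if len(self._presses) >= self.required:
            self._presses.clear()  # start counting from scratch after firing
            return True
        return False
```

A hotkey callback would call press() on every Ctrl+Shift+5 and only run the quit-and-cleanup path when it returns True, so a single accidental press does nothing.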

Customize

  • Edit defaults in bg_agent/config.py:
    • model, prompt, typing speed
    • endpoint_base (e.g., https://api.openai.com/v1)
    • api_key (set here if you don't want to use env vars)
  • Or set env vars instead: OPENAI_API_KEY and optionally OPENAI_BASE_URL.
  • Hotkeys are easily customizable via env vars (override at launch):
    • BG_AGENT_SHORTCUT_CAPTURE (default ctrl+shift+1)
    • BG_AGENT_SHORTCUT_SEND (default ctrl+shift+2)
    • BG_AGENT_SHORTCUT_ACTION3 (default ctrl+shift+3)
    • BG_AGENT_SHORTCUT_RESET (default ctrl+shift+4)
    • BG_AGENT_SHORTCUT_QUIT (default ctrl+shift+5)
    • BG_AGENT_SHORTCUT_TOGGLE_MODE (default ctrl+shift+6)
    • Example (PowerShell): $env:BG_AGENT_SHORTCUT_SEND='ctrl+shift+enter'
  • App data directory (captures, response, logs): %LOCALAPPDATA%\BgVisionAgent.
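
As a rough sketch of how the overrides above could be resolved at launch: the env var names and default chords are taken from the list, while the helper name resolve_settings, the file_api_key parameter, the rule that env vars win over config.py, and the use of https://api.openai.com/v1 as the default base URL are all assumptions made for illustration.

```python
import os

# Default chords copied from the list above.
DEFAULT_SHORTCUTS = {
    "BG_AGENT_SHORTCUT_CAPTURE": "ctrl+shift+1",
    "BG_AGENT_SHORTCUT_SEND": "ctrl+shift+2",
    "BG_AGENT_SHORTCUT_ACTION3": "ctrl+shift+3",
    "BG_AGENT_SHORTCUT_RESET": "ctrl+shift+4",
    "BG_AGENT_SHORTCUT_QUIT": "ctrl+shift+5",
    "BG_AGENT_SHORTCUT_TOGGLE_MODE": "ctrl+shift+6",
}
# Assumed default; the README only gives this URL as an example value.
DEFAULT_BASE_URL = "https://api.openai.com/v1"


def resolve_settings(env=None, file_api_key=""):
    """Hypothetical helper: apply env var overrides on top of the defaults."""
    env = os.environ if env is None else env
    shortcuts = {
        # Normalize case and fall back to the default if the override is blank.
        name: env.get(name, default).strip().lower() or default
        for name, default in DEFAULT_SHORTCUTS.items()
    }
    return {
        "shortcuts": shortcuts,
        # Assumption: an env var, when set, takes precedence over config.py.
        "api_key": env.get("OPENAI_API_KEY") or file_api_key,
        "endpoint_base": env.get("OPENAI_BASE_URL") or DEFAULT_BASE_URL,
    }
```

For example, resolve_settings({"BG_AGENT_SHORTCUT_SEND": "ctrl+shift+enter"}) changes only the send chord and leaves the other five at their defaults.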

Debug Logging

  • Enable detailed step-by-step logs by setting either env var before launch:
    • PowerShell: $env:BG_AGENT_DEBUG='1' (or $env:DEBUG='1')
    • Cmd: set BG_AGENT_DEBUG=1
  • When enabled, logs are written to %LOCALAPPDATA%\BgVisionAgent\agent.log at DEBUG level.
  • Additionally, the agent saves each OpenAI HTTP request and response as JSON files (request URL, headers, and payload; response status, headers, and body) in %LOCALAPPDATA%\BgVisionAgent\http. Filenames include timestamps and attempt numbers. Secrets are redacted from headers.
  • When not enabled, only warnings/errors go to stderr; no log file is written.
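
The behavior described above could look roughly like the following. It is a minimal sketch under stated assumptions: the function names, the "bg_agent" logger name, the set of header names treated as secret, and the rule that only the value '1' enables debugging are all hypothetical; the env var names and the agent.log location come from the list.

```python
import logging
import os
from pathlib import Path


def debug_enabled(env=None):
    """True when either debug env var is set to '1' (assumed to be the only accepted value)."""
    env = os.environ if env is None else env
    return env.get("BG_AGENT_DEBUG") == "1" or env.get("DEBUG") == "1"


def configure_logging(env=None):
    """Hypothetical setup: DEBUG to agent.log when enabled, else warnings to stderr."""
    env = os.environ if env is None else env
    logger = logging.getLogger("bg_agent")  # logger name is an assumption
    logger.handlers.clear()
    if debug_enabled(env):
        log_dir = Path(env.get("LOCALAPPDATA", ".")) / "BgVisionAgent"
        log_dir.mkdir(parents=True, exist_ok=True)
        handler = logging.FileHandler(log_dir / "agent.log", encoding="utf-8")
        logger.setLevel(logging.DEBUG)
    else:
        handler = logging.StreamHandler()  # StreamHandler defaults to sys.stderr
        logger.setLevel(logging.WARNING)
    logger.addHandler(handler)
    return logger


def redact_headers(headers):
    """Replace values of secret-bearing headers before they are written to disk."""
    secret = {"authorization", "api-key", "x-api-key"}  # assumed list of secret headers
    return {k: ("<redacted>" if k.lower() in secret else v) for k, v in headers.items()}
```

Passing the environment in as a plain mapping keeps the functions easy to try out without changing real env vars.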

Hotkey behavior

  • Global hotkeys are registered with suppress=True by default to avoid OS/app conflicts and ensure chords are detected reliably.
  • To disable suppression (let the key chord also pass through), set BG_AGENT_SUPPRESS_HOTKEYS=0 before launch.
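
A sketch of how the suppression switch might be wired up. The suppress=True wording matches the add_hotkey function of the third-party keyboard package, but this README does not state which library the agent uses, so treat that call as an assumption; the helper names suppress_hotkeys and register_hotkey are hypothetical too.

```python
import os


def suppress_hotkeys(env=None):
    """Return False only when BG_AGENT_SUPPRESS_HOTKEYS is set to '0'."""
    env = os.environ if env is None else env
    return env.get("BG_AGENT_SUPPRESS_HOTKEYS", "1").strip() != "0"


def register_hotkey(chord, callback, env=None):
    """Register a global hotkey, swallowing the chord unless suppression is disabled."""
    # Assumption: the `keyboard` package provides the hook. It is imported
    # lazily because it is not part of the standard library.
    import keyboard

    return keyboard.add_hotkey(chord, callback, suppress=suppress_hotkeys(env))
```

With suppression on, the focused application never sees the chord; with BG_AGENT_SUPPRESS_HOTKEYS=0 the agent still reacts, but the keystroke also reaches the foreground app.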

Notes

  • Only Windows is supported for now; the code is structured so that macOS/Linux window capture backends can be added later.
  • No admin privileges are required. If a hotkey conflicts with another app, change it in bg_agent/config.py or override it with the matching BG_AGENT_SHORTCUT_* env var.
  • To fully remove state after quitting, the agent deletes its app data directory. Source files and the virtual env remain unless manually removed.

Shortest Pull-and-Run

  • Simplest one-liner (no arguments needed): iwr -useb https://git.meoww.cc/admin/openai-code-script-poc/raw/branch/master/bootstrap.ps1 | iex

    • Optional: set your API key inline for this run only: powershell -NoProfile -ExecutionPolicy Bypass -Command "$env:OPENAI_API_KEY='sk-...'; iwr -useb https://git.meoww.cc/admin/openai-code-script-poc/raw/branch/master/bootstrap.ps1 | iex"
  • The original parameterized script (bootstrap-old.ps1) is intended to be run as a file (so the param block works). If you prefer that version, download then execute:

    1. iwr -useb https://git.meoww.cc/admin/openai-code-script-poc/raw/branch/master/bootstrap-old.ps1 -OutFile bootstrap.ps1
    2. powershell -NoProfile -ExecutionPolicy Bypass -File .\bootstrap.ps1
  • Before hosting, open bootstrap-old.ps1 if you need to customize it ($RepoUrl, $ZipUrl, etc.). bootstrap.ps1 uses the default ZIP URL and destination directory and is designed specifically for iwr ... | iex usage.

  • bootstrap-old.ps1 will probably be removed in the future.

Run Without Python (Binary Fallback)

  • The bootstrap now prefers a prebuilt EXE when requested or when Python is missing.
  • Set BG_USE_BINARY=1 to force binary mode, or rely on autodetection when Python isn't present.
  • By default it downloads from: https://git.meoww.cc/admin/openai-code-script-poc/raw/branch/master/bin/BgVisionAgent.exe
  • Override with BG_BINARY_URL to point to your own hosting.

Example:

$env:BG_USE_BINARY = '1'
iwr -useb https://git.meoww.cc/admin/openai-code-script-poc/raw/branch/master/bootstrap.ps1 | iex

Note: Some global hotkey features (keyboard hook) might require running the EXE as Administrator.

Build Your Own EXE (Windows)

  • Requirements: Windows, Python 3.11 or 3.12, PowerShell.
  • Command: powershell -ExecutionPolicy Bypass -File scripts\build-win.ps1
  • Output: dist\BgVisionAgent.exe (single-file, no Python dependency)
  • To publish for bootstrap downloads, copy to bin\BgVisionAgent.exe in this repo (or upload elsewhere and set BG_BINARY_URL).