AI Cowork watches what's on your screen, reads it with OCR, and lets you ask questions about your work. 100% local. Your data never leaves your machine.
v1.0.0 · Windows · 38 MB · or install from source →
Automatically captures your screen every few seconds using the blazing-fast mss library. Multi-monitor supported.
Reads all text on screen using Tesseract OCR with smart preprocessing — contrast enhancement, dual-mode scanning, garbled line filtering.
"What am I looking at?" "Summarize this page." "What was I doing 5 minutes ago?" — the AI knows because it was watching.
Exclude sensitive windows by title keyword — banking, passwords, private browsing. Capture auto-pauses when those windows are active.
Use Ollama locally (zero data leaves your machine), or connect to OpenAI or Anthropic Claude for even smarter answers.
One-click pause in the dashboard. When paused, nothing is captured or stored. Full control, always.
Keeps up to 50 screen observations in memory during your session. Nothing is written to disk — ever. Close the app, history is gone.
Switch LLM providers, set API keys, adjust capture intervals, manage privacy filters — all from the web dashboard. No restart needed.
Clean, modern interface with live screen preview and chat panel. Runs on localhost:8080. Open in any browser.
Screen capture, OCR, and LLM inference all happen on your machine. With Ollama, zero bytes leave your network.
Everything is stored in RAM only. Close the app and all observations are gone. No database, no files, no traces.
No analytics, no crash reports, no usage tracking. The app makes zero outbound connections (except to your chosen LLM).
Automatically skip capturing when sensitive apps are in focus. Set keywords like "bank", "password", "1password".
✓ Everything happens on localhost
No internet required when using Ollama
Free. Runs on your GPU via Docker. Your data never leaves your machine. Requires NVIDIA GPU.
Bring your own API key. Best quality answers. No GPU required. Pay per usage.
Bring your own API key. Great for long context. No GPU required. Pay per usage.
Clone, configure, run. It's that simple.
# Clone the repo
git clone https://github.com/Sami-Fd/ai-cowork.git
cd ai-cowork
# Configure
cp .env.example .env
# Install dependencies
pip install -r requirements.txt
# Start Ollama (optional — for local LLM)
docker compose up -d
# Run!
python app.py
Yes, AI Cowork is completely free and open-source under the MIT license. No paid plans, no subscriptions, no hidden costs.
No. When using Ollama (local LLM), all processing happens entirely on your machine. Zero data leaves your network. Cloud providers (OpenAI, Claude) are optional — the text is only sent if you choose to use them.
AI Cowork supports Ollama for 100% local inference (qwen2.5, llama3, mistral, etc.), plus OpenAI GPT-4o and Anthropic Claude as optional cloud backends. You can switch between them at any time from the dashboard.
Windows with Tesseract OCR installed. For local LLM: Docker and an NVIDIA GPU (4GB+ VRAM). For cloud LLMs: just an API key — no GPU needed.
No. Screen observations are kept in RAM only during your session. Nothing is written to disk. Close the app and all data is gone — no database, no files, no traces.
Yes. Add keywords like "bank", "1password", or "private" to the privacy filter. AI Cowork will automatically skip capturing when those windows are in focus.