Local LLMs with Ollama — Private Automation That Scales to Zero Cloud Cost

Cloud LLM bills add up. For personal automation and sensitive workflows, I run Ollama locally and orchestrate with n8n + Python.

Architecture (Job Hunt Automation pattern)

n8n cron → Next.js API → Python ingest → Ollama tailor → Gmail PDF → Postgres
  • Discover jobs via Playwright
  • Filter in dashboard
  • Compose cover letters + emails with local model
  • Track applications and replies
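
To make the "Ollama tailor" step concrete, here is a minimal sketch of how the Python stage might call a local Ollama server over its HTTP API. It assumes Ollama is listening on its default port (11434) and uses the `/api/generate` endpoint with streaming turned off. The function names, the prompt wording, and the `llama3.1:8b` model tag are illustrative assumptions, not the exact code behind this pipeline.

```python
import json
import urllib.request

# Default local Ollama endpoint; adjust if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(job_title: str, company: str, resume_summary: str,
                  model: str = "llama3.1:8b") -> dict:
    """Build a non-streaming /api/generate request body.

    The prompt template and default model tag are hypothetical examples.
    """
    prompt = (
        f"Write a concise cover letter for the {job_title} role at {company}.\n"
        f"Candidate background: {resume_summary}\n"
        "Keep it under 250 words."
    )
    return {"model": model, "prompt": prompt, "stream": False}


def tailor_cover_letter(payload: dict, url: str = OLLAMA_URL,
                        timeout: float = 120.0) -> str:
    """POST the payload to a running Ollama server and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False, Ollama returns one JSON object whose
    # "response" field holds the full completion.
    return body["response"]
```

Keeping prompt construction separate from the network call means the prompt can be tuned and unit-tested without a model loaded; only `tailor_cover_letter` needs the server running.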

When local wins

  • Secrets stay on your machine (encrypted with git-crypt)
  • Unlimited iterations during prompt tuning
  • No per-token anxiety for batch jobs

When cloud wins

  • Highest reasoning quality for architecture decisions
  • Multimodal (screenshots, PDFs) at scale
  • SLA-backed APIs for customer-facing agents

Hardware reality (Mac / Linux)

  • 8–16GB RAM: quantized 7B–8B models for drafting
  • 32GB+: 14B models for better instruction following
  • Always measure time-to-first-token vs cloud latency
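
One way to measure time-to-first-token is to stream the response and time the first non-empty chunk. The sketch below assumes the stream is newline-delimited JSON with `response` and `done` fields, which is how Ollama's `/api/generate` streams by default. The `open_stream` callable and the injectable `clock` are hypothetical helpers added here so the timing logic can be tested without a server.

```python
import json
import time
from typing import Callable, Iterable, Optional


def time_to_first_token(open_stream: Callable[[], Iterable[bytes]],
                        clock: Callable[[], float] = time.perf_counter
                        ) -> Optional[float]:
    """Return seconds from opening the stream to the first non-empty chunk.

    The timer starts before open_stream() is called, so connection setup
    and model load time are included. Returns None if no text arrives.
    """
    start = clock()
    for raw in open_stream():
        if not raw.strip():
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)
        if chunk.get("response"):
            return clock() - start
        if chunk.get("done"):
            break
    return None
```

Against a real server, `open_stream` could be a function that sends a `"stream": true` request with `urllib.request.urlopen` and returns the response object, which can be iterated line by line. Running the same measurement against a cloud API gives a like-for-like comparison. The first local call is typically slower because the model has to be loaded into memory.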

Takeaway

Hybrid LLM strategy: Ollama for volume and privacy; GPT/Claude for customer-facing agent peaks. Engineer the router — don't pick one religion.
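
A router along these lines can start as a few explicit rules. The sketch below is a hypothetical illustration: the `Task` fields and the rule ordering (sensitive data always stays local, even for customer-facing work) are assumptions drawn from the "when local wins" and "when cloud wins" lists above, not a description of a production router.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task descriptor; extend with whatever signals you track."""
    sensitive: bool = False             # contains secrets or private data
    customer_facing: bool = False       # needs SLA-backed availability
    multimodal: bool = False            # screenshots, PDFs, images
    needs_deep_reasoning: bool = False  # e.g. architecture decisions


def route(task: Task) -> str:
    """Return "local" or "cloud" for a task.

    Privacy is checked first so sensitive work never leaves the machine.
    """
    if task.sensitive:
        return "local"
    if task.customer_facing or task.multimodal or task.needs_deep_reasoning:
        return "cloud"
    # Default: drafting, batch jobs, and prompt tuning run on Ollama.
    return "local"
```

For example, `route(Task(customer_facing=True))` returns `"cloud"`, while `route(Task(sensitive=True, customer_facing=True))` returns `"local"` because the privacy rule takes precedence.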
