How I Self-Host an AI-Powered Website with Caddy, SearXNG, and a Local LLM

Most people use Vercel, Netlify, or a VPS to host their websites. I wanted something different: a fully self-hosted production site running from my own machine — complete with HTTPS, a live AI chatbot, and a real-time news feed. No cloud bills. No vendor lock-in. Full control.

This post breaks down exactly how rockyslabs.com runs — the stack, the architecture, and the lessons learned.

The Stack at a Glance

Here's what powers the site:

Caddy Server — Reverse proxy + static file server with automatic HTTPS (Let's Encrypt certificates, zero config)
Qwen 3.5 (2B-MTP) — A local LLM running via LM Studio for the site's AI chatbot
SearXNG — A private, self-hosted meta search engine that powers the live AI news feed
Windows + WinSW — Running everything as native Windows services for auto-start on boot
Dual-router NAT — Port forwarding through two consumer routers to reach the home server

Why self-host? Privacy, zero recurring cost, learning, and the satisfaction of saying "that server? it's in my room."

1. Caddy — The Easiest HTTPS You'll Ever Set Up

Caddy is criminally underrated. Unlike Nginx or Apache, it handles HTTPS automatically — just point your domain and it provisions a Let's Encrypt certificate with zero configuration. No certbot, no cron jobs, no renewal scripts.

Here's a simplified version of the Caddyfile that powers the site:

www.rockyslabs.com {
    # Serve the static site
    root * /path/to/site
    file_server

    # Proxy the AI chatbot to local vLLM
    handle_path /chat-api/* {
        reverse_proxy localhost:8000
    }

    # Proxy the news feed to local SearXNG
    handle_path /news-api/* {
        reverse_proxy localhost:8888
    }
}

That's it. Three blocks. Static files, chatbot proxy, and news proxy — all behind automatic HTTPS. Caddy handles certificate renewal, OCSP stapling, HTTP/2, and compression out of the box.

Running Caddy as a Windows Service

Since this runs on Windows, I use WinSW to wrap Caddy as a system service that starts on boot. No need to keep a terminal window open. The service runs silently in the background and auto-restarts on failure.

2. The AI Chatbot — Local Qwen via LM Studio

The site features a chatbot widget in the bottom-right corner. It's not calling OpenAI or Anthropic — it's hitting a local LLM running on my GPU.

Model: Qwen 3.5 2B-MTP (fast, small, surprisingly capable)
Inference engine: LM Studio with OpenAI-compatible API
Endpoint: localhost:8000/v1/chat/completions
Proxy path: Caddy exposes it as /chat-api/ — no CORS issues, no exposed ports

The frontend sends a standard OpenAI-format request to /chat-api/chat/completions, and Caddy transparently proxies it to the local LM Studio instance. From the browser's perspective, it's just talking to the same domain.

The key insight: you don't need a cloud AI API. A 2B parameter model on a single GPU handles conversational queries with sub-second latency.

3. The AI News Feed — SearXNG as a Private API

The homepage features a live "Top AI News" section that refreshes automatically. It works in two stages:

SearXNG fetches and aggregates results from multiple search engines (Google News, Bing News, DuckDuckGo) through a single private instance
The local LLM screens the results — filtering out false positives where "AI" is a person's name or a word in another language, and selecting the top articles

This gives you a curated, AI-verified news feed without any third-party API keys or subscriptions. SearXNG is open source and respects privacy — it doesn't track searches or leak data to upstream engines.

Deduplication and Quality Filtering

Raw search results are messy. The JavaScript on the page handles:

URL deduplication — Same article from multiple engines? Keep one.
Title similarity — Near-duplicate headlines get merged.
Content filtering — Articles without snippets or from Wikipedia get dropped.
AI screening — The local LLM picks the final top articles based on relevance and quality.

4. The Network — Dual-Router Port Forwarding

One challenge of self-hosting from home: most ISPs put you behind NAT, and many homes have multiple routers (ISP router → personal router → machine). My setup requires port forwarding on both routers:

ISP Router: Forward ports 80 and 443 to the personal router's IP
Personal Router: Forward ports 80 and 443 to the server machine's local IP
DNS: Point www.rockyslabs.com to the public IP (with a DDNS updater if your IP changes)

Once the ports reach Caddy, it handles everything — TLS termination, routing, and proxying.

5. Security Considerations

Self-hosting means you're responsible for security. A few things I've done:

Caddy auto-HTTPS — All traffic is encrypted, certificates auto-renew
No exposed service ports — vLLM and SearXNG only bind to localhost; Caddy is the only public-facing process
robots.txt blocks API paths — /chat-api/ and /news-api/ are disallowed from crawlers
Rate limiting — Basic request throttling on the proxy paths
Windows Firewall — Only ports 80 and 443 are open inbound

What I'd Do Differently

Use Linux — Windows works, but service management is easier on Linux with systemd. WinSW is a fine workaround, but it's an extra layer.
Add authentication to the chatbot — Right now it's open. A simple token or rate limit per IP would prevent abuse.
Cache the news feed server-side — Currently cached in the browser's localStorage. A server-side cache would be faster for first-time visitors.

The Result

The end result is a fully self-hosted, AI-powered website that costs exactly $0/month to run (beyond electricity and the domain name). It's fast, it's private, and it's entirely under my control.

If you're interested in building something similar, the entire approach is reproducible with free, open-source tools. No cloud required.

Want to see it live? You're already on it. The chatbot in the corner and the news feed at the top of the homepage are both running on this exact stack right now.

— Rakesh Ganesan
Rocky's Labs · 27 May 2025