YOUR AI STACK.
YOUR HARDWARE.
A production ready and secure AI infrastructure stack. Local LLM inference, workflow automation, and authentication — running on your own machine.
curl -fsSL https://tronexus.dev/install.sh -o install.sh && sudo bash install.sh
Everything you need.
Nothing you don't.
Pre-integrated, production-tested services that work together out of the box.
LOCAL LLM INFERENCE
Ollama runs your models locally on your GPU. Route to a dedicated inference server for better performance. No cloud, no cost per token.
AI CHAT INTERFACE
Open WebUI provides a full-featured, multi-model, multi-user interface. Switch models, manage conversations, and invite collaborators.
LLM ROUTING
LiteLLM proxies all model requests through a single OpenAI-compatible API. Mix local and remote models transparently.
WORKFLOW AUTOMATION
n8n connects your AI stack to the outside world. Build automations, webhooks, and AI-powered pipelines without code.
AUTH API
Google OAuth, JWT tokens with rotation, API keys with mandatory expiry, per-app role management. Built-in, not bolted on.
HARDENED UBUNTU
fail2ban, UFW, SSH hardening, automatic security updates, and sysctl tuning applied at install time. Professional-grade security by default.
AUTOMATIC TLS
Caddy handles HTTPS automatically via Let's Encrypt. Zero configuration, automatic renewal, HTTP/2 out of the box.
MONITORING & ALERTS
Daily Telegram alerts with AI-generated summaries. Container health, disk usage, security events, and API key expiry — all covered.
REMOTE INFERENCE
Offload LLM inference to a dedicated GPU machine on your local network. Keeps your primary server lean while maximising GPU utilisation.
Runs on real hardware.
No cloud account. No data leaving your machine. No subscription fees.
Start in under an hour.
Point to a fresh Ubuntu 24.04 machine. Run one command. Your AI stack handles the rest.