Under the hood

Built on open,
proven technology

No proprietary lock-in. Every component of AEGIS OS is replaceable, auditable, and built around tenant isolation, model policy and human-gated memory.

Engineering principles

The decisions that shaped the stack, and that guide every future choice.

Open by default
Prefer open-source over SaaS where reliability is comparable. You can always self-host the full stack.
Data sovereignty first
Your agent conversations, documents, and tenant data never leave the database without your explicit action.
Model agnosticism
LLM providers are swappable through LiteLLM, but model access is selected by tenant tier and intent before a call is made.
Multi-tenant by design
Tenant isolation at the database schema level, not just a WHERE clause, so data leakage is structurally impossible.
Lean runtime
No heavyweight frameworks. A vanilla JS frontend keeps the client bundle under 50 kB with zero build tooling.
Auditable logs
Every agent action and tool call is written to an audit log. You can reconstruct exactly what any agent did and why.
Idempotent processing
Channel events and business writes are designed around stable source identifiers, terminal outcomes, and resolve-before-create handlers.
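The resolve-before-create idea can be illustrated with a small sketch. This is not the AEGIS implementation: `EventStore`, `handle`, and the in-memory dictionary are hypothetical stand-ins for a durable table keyed by a stable source identifier.

```python
from dataclasses import dataclass, field


@dataclass
class EventStore:
    """In-memory stand-in (hypothetical) for a durable table keyed by source identifier."""
    outcomes: dict = field(default_factory=dict)

    def handle(self, source_id, create):
        # Resolve first: a known source_id returns its recorded terminal outcome.
        if source_id in self.outcomes:
            return self.outcomes[source_id]
        # Create only when nothing was resolved, then record the outcome.
        result = create()
        self.outcomes[source_id] = result
        return result
```

Replaying the same event with the same `source_id` returns the stored outcome instead of running the write a second time.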
Backend & API
Python 3.12
Language
The application runtime. Chosen for its mature AI/ML ecosystem, readable code, and broad library support for every integration we need.
Flask
Web framework
Lightweight WSGI framework handling all HTTP routing, blueprints, and session management. Jinja2 templating for server-rendered pages.
Gunicorn
WSGI server
Production WSGI server running 3 worker processes behind Nginx. Handles concurrent requests with graceful restarts on deploy.
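A Gunicorn configuration file matching this description might look like the fragment below. Only the worker count comes from the text above; the bind address and the graceful timeout are assumed values for illustration.

```python
# gunicorn.conf.py (illustrative sketch, not the deployed configuration)
workers = 3               # three worker processes, as described above
bind = "127.0.0.1:8000"   # assumed local address that Nginx proxies to
graceful_timeout = 30     # assumed seconds for in-flight requests to finish on restart
```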
Flask-Login
Auth sessions
Session-based authentication with secure cookie storage, login_required decorators, and remember-me persistent sessions.
Data Layer
PostgreSQL 16
Primary database
All structured data: users, tenants, agents, tasks, schedules, chat history, audit records, and durable channel job state. Multi-tenant isolation via per-tenant schemas (tenant_{id}) with public system tables for platform-level queues and metadata.
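As a rough sketch of how a per-tenant schema name could be derived and applied, the helpers below build the `tenant_{id}` name and a matching `search_path` statement. The function names are hypothetical, and integer tenant IDs are an assumption; the validation exists because schema names cannot be passed as bound query parameters.

```python
def tenant_schema(tenant_id):
    """Return the schema name for a tenant, assuming positive integer IDs."""
    if isinstance(tenant_id, bool) or not isinstance(tenant_id, int) or tenant_id <= 0:
        raise ValueError("tenant_id must be a positive integer")
    return f"tenant_{tenant_id}"


def search_path_sql(tenant_id):
    """Build a statement that scopes a connection to one tenant plus public tables."""
    return f'SET search_path TO "{tenant_schema(tenant_id)}", public'
```

Running the resulting statement at the start of each request means unqualified table names resolve to the tenant's own schema first.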
pgvector
Vector search
PostgreSQL extension for storing and querying 768-dim embedding vectors. Powers Knowledge Base and Memoria semantic search without a separate vector DB. All tenants unified at 768-dim for consistent cosine similarity.
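For reference, cosine similarity between two embedding vectors can be computed as in the plain-Python sketch below. In pgvector the `<=>` operator returns cosine distance, which is 1 minus this value; the function name here is illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

Keeping every tenant at the same dimension matters because the comparison is only defined for vectors of equal length.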
psycopg2
DB driver
Battle-tested synchronous PostgreSQL adapter. Used with connection pooling and RealDictCursor for ergonomic row-as-dict access patterns.
Docker
Database container
PostgreSQL runs in a Docker container on the VPS with persistent volume mounts. Isolated from the application process for clean upgrades. Enterprise self-hosted deployments use a full Docker Compose stack (pgvector, LiteLLM, AEGIS app, nginx, optional Ollama) for a complete on-premise setup in 2-3 hours.
AI & Agent Inference
OpenClaw Gateway
Agent orchestrator
Internal agent orchestration layer. Routes messages to the correct agent, loads SOUL personality profiles, and manages the tool-calling loop.
LiteLLM Proxy
Model router
Translates between provider APIs and applies the reconciled fallback chain. The app-level model-policy resolver decides which LiteLLM alias a tenant may use before the request reaches the proxy.
Model Policy Resolver
Tier-aware control plane
Maps semantic intent to allowed model aliases by tenant tier. Free/trial cannot reach premium GPT-4o-class spend; Business and Enterprise premium calls are quota-governed and auditable.
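A tier-aware resolver of this kind could be sketched as a lookup over allowed aliases. Everything below is an assumption made for illustration: the tier-to-alias table, the intent names, and the `groq-llama` and `gemini-flash` alias names are hypothetical. Only `openrouter-mini` and `openrouter-gpt4o` appear elsewhere on this page.

```python
# Hypothetical policy table: which model aliases each tenant tier may use.
ALLOWED = {
    "free": {"groq-llama", "gemini-flash"},
    "trial": {"groq-llama", "gemini-flash"},
    "pro": {"groq-llama", "gemini-flash", "openrouter-mini"},
    "business": {"groq-llama", "gemini-flash", "openrouter-mini", "openrouter-gpt4o"},
    "enterprise": {"groq-llama", "gemini-flash", "openrouter-mini", "openrouter-gpt4o"},
}

# Hypothetical preference order per intent, most capable alias first.
PREFERENCE = {
    "routing": ["groq-llama"],
    "chat": ["openrouter-mini", "groq-llama", "gemini-flash"],
    "deep_reasoning": ["openrouter-gpt4o", "openrouter-mini", "groq-llama"],
}


def resolve_alias(tier, intent):
    """Pick the first preferred alias that the tenant's tier is allowed to use."""
    allowed = ALLOWED[tier]
    for alias in PREFERENCE[intent]:
        if alias in allowed:
            return alias
    raise PermissionError(f"no allowed model for tier={tier!r}, intent={intent!r}")
```

Because the decision happens before any request is built, a free-tier call can never name a premium alias, regardless of what the proxy's fallback chain contains.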
Groq
Free-tier throughput
Ultra-fast free-tier inference via Llama models. Used for free/trial and routing paths where cost control matters; paid tiers use funded OpenRouter aliases for reliable capacity.
Google Gemini Flash
Free fallback
Gemini Flash is used as a free fallback in the proxy chain. Free/trial traffic never falls through to premium OpenRouter GPT-4o spend as a last resort.
OpenRouter + Ollama
Funded + local inference
OpenRouter backs funded `openrouter-mini` and premium `openrouter-gpt4o` aliases. Ollama supports local models such as Qwen and nomic-embed-text for low-egress or sovereign deployment paths.
text-embedding-3-small
Embedding model
768-dim embeddings for Knowledge Base ingestion and semantic search. The schema has been unified at 768 dimensions so cloud and local embedding paths remain compatible.
Integrations & Channels
Telegram Bot API
Primary notification channel
Agents deliver results, workflow completions, approvals, and deterministic read-command responses directly to Telegram. Long-polling keeps it free of public webhooks; inbound updates are durably recorded before processing so retries do not become duplicate actions or silent loss.
Google OAuth 2.0
Auth & workspace
Two separate flows: Google Sign-In for authentication (PKCE flow, no client secret in browser) and Google Workspace OAuth for Calendar/Drive access. Gmail inbox monitoring is configured separately through Email (IMAP).
HubSpot OAuth
CRM integration
OAuth 2.0 connection to HubSpot CRM. Sales agent reads contacts and company data to enrich outreach. Tokens stored encrypted per tenant.
HeyGen
AI video generation
Luna (Creative Producer) generates personalized avatar videos from agent-written scripts. Completed videos are delivered via Telegram.
ElevenLabs
AI voice synthesis
Voice clone synthesis for audio briefings. Faster than video: agents can produce audio updates in seconds and deliver them as voice messages.
Gmail SMTP/IMAP
Email channel
Inbound email parsing and outbound delivery via Gmail. Agents can read, draft, and send on behalf of your connected inbox.
Microsoft Bot Framework
Teams connector (Enterprise)
Azure Bot resource + Bot Connector REST API for Microsoft Teams. Employees message the AEGIS bot in Teams exactly as they message each other; all processing happens server-side. Token cache is thread-safe for multi-worker deployments. Enabled via TEAMS_ENABLED=true env var.
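A thread-safe token cache can be sketched with a lock around the refresh step. The `TokenCache` class and its `fetch` callable are hypothetical and make no real Bot Connector call. Note that a lock like this protects threads inside one process, so each worker process would hold its own cached token.

```python
import threading
import time


class TokenCache:
    """Caches one bearer token and refreshes it under a lock once it expires."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch  # hypothetical callable returning (token, ttl_seconds)
        self._clock = clock
        self._lock = threading.Lock()
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # Holding the lock for the whole check-and-refresh avoids duplicate fetches.
        with self._lock:
            if self._token is None or self._clock() >= self._expires_at:
                token, ttl = self._fetch()
                self._token = token
                self._expires_at = self._clock() + ttl
            return self._token
```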
python3-saml
SAML 2.0 SSO (Enterprise)
OneLogin's python3-saml library implements SAML 2.0 SP-initiated SSO. Enterprise customers paste their Azure AD / Entra Federation Metadata XML once; employees then log in with existing corporate credentials. SP certificate generated with OpenSSL, stored in environment variables.
Frontend & UI
Vanilla JS + CSS
Client-side
No frameworks, no build step, no hydration latency. The entire client bundle is under 50 kB. Pages render in <200 ms even on slow connections.
CSS Custom Properties
Design system
Theming via CSS variables: light/dark mode, brand colours, spacing. Instant theme switching with zero flash via an inline anti-flash script in <head>.
Jinja2
Server-side rendering
Flask's default template engine. All pages are server-rendered HTML: fast first paint, excellent SEO, no client-side routing complexity.
Inter (Google Fonts)
Typography
Variable-weight Inter for all UI text. Loaded asynchronously with a system-font fallback stack to prevent layout shift.
Infrastructure & DevOps
VPS (Linux)
Compute
Dedicated VPS running Ubuntu. Single-tenant capable: one AEGIS instance per customer is possible for maximum isolation and compliance.
Nginx
Reverse proxy
Handles TLS termination, HTTP→HTTPS redirects, static file serving, and proxying to Gunicorn. Security headers applied at the Nginx layer.
Let's Encrypt / HTTPS
TLS
Auto-renewed TLS certificates via Certbot. HSTS enabled. All traffic is encrypted in transit, with no plain-HTTP fallback in production.
systemd
Process management
Gunicorn and the scheduler run as systemd services with auto-restart on failure. Deployments are a git pull plus a service restart, completing in under 5 seconds with zero downtime.
GitHub
Version control & deploy
All application code is versioned on GitHub. Deploys are manual git pulls, giving you full control and an audit trail of exactly what ran when.
APScheduler
Cron scheduler
In-process cron scheduler for agent task and workflow schedules. Runs in the same Python process: no Celery, no Redis, no separate broker to manage.
Document Processing (Knowledge Base)
PyPDF2
PDF parsing
Extracts text from PDF files page-by-page. Pure Python, with no external binary dependencies.
python-docx
Word document parsing
Reads .docx files paragraph by paragraph. Supports modern OOXML format (Word 2007+).
openpyxl
Spreadsheet parsing
Reads .xlsx files sheet-by-sheet with read-only mode for memory efficiency. Extracts cell values as plain text.
python-pptx
Presentation parsing
Extracts text from .pptx slide shapes. Each slide becomes a labelled section in the Knowledge Base chunk.
Skills Engine & Memoria
Skills Engine
Python singleton (skill_engine.py)
Runs versioned business procedures: 23 skills total (8 system + 15 business templates) with per-tenant overrides, execution logs, and a self-improving feedback loop. Three skills auto-inject live CRM or Memoria context before the LLM call. Auto-Learn Mode (Pro+) runs nightly: it detects underperforming skills (avg rating < -0.2 over ≥3 runs), generates an improved prompt via LLM, auto-applies it, and notifies via Telegram.
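The underperformance rule stated above (average rating below -0.2 over at least 3 runs) can be written as a small predicate. The function name and the list-of-ratings input are assumptions for illustration; how ratings are stored and windowed is not described here.

```python
def is_underperforming(ratings, threshold=-0.2, min_runs=3):
    """True when there are enough runs and their mean rating is below the threshold."""
    if len(ratings) < min_runs:
        return False  # too few runs to judge the skill
    return sum(ratings) / len(ratings) < threshold
```

The minimum-run guard keeps a single bad rating from triggering a prompt rewrite.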
Memoria (Knowledge Graph)
pgvector + D3.js force graph
Stores agent insights, vault documents, and user knowledge as typed nodes with edges. Semantic search via pgvector cosine similarity. D3 force-directed graph for exploration. Crystallization service extracts 2-3 insights per agent response automatically.
Crystallization Service
Background LLM extraction
Fire-and-forget daemon threads extract business insights from every agent response. Rate-limited to 5 extractions per 10 min per tenant. Semantic linking finds related Memoria nodes and creates 'crystallized_from' edges automatically.
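One way to enforce a limit of 5 extractions per 10 minutes per tenant is a sliding-window counter, sketched below. Whether the service uses a sliding or a fixed window is not stated, so treat this as one possible reading; the class and method names are hypothetical, and the caller supplies the current time in seconds.

```python
from collections import defaultdict, deque


class TenantRateLimiter:
    """Sliding-window limiter: at most max_events per window_seconds per tenant."""

    def __init__(self, max_events=5, window_seconds=600):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # tenant_id -> timestamps of allowed events

    def allow(self, tenant_id, now):
        events = self._events[tenant_id]
        # Drop timestamps that have aged out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_events:
            return False
        events.append(now)
        return True
```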
Service Guardian
Heartbeat watchdog (guardian_service.py)
All background workers write heartbeats to PostgreSQL every loop cycle. Guardian detects stale workers, evicts zombie advisory locks via pg_terminate_backend, and sends admin Telegram alerts. Admin panel shows live service health.
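Stale-worker detection reduces to comparing each worker's last heartbeat with a maximum age. The sketch below assumes heartbeats have already been read into a dictionary; the 120-second threshold, the function name, and the worker names in the usage are illustrative, not values from the Guardian service.

```python
from datetime import datetime, timedelta, timezone


def stale_workers(heartbeats, now, max_age=timedelta(seconds=120)):
    """Return sorted names of workers whose last heartbeat is older than max_age."""
    return sorted(name for name, seen in heartbeats.items() if now - seen > max_age)
```

For example, with `now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)`, a worker last seen 10 minutes earlier is reported while one seen 30 seconds earlier is not.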
Trust Infrastructure
ai_audit_log · agent_runs · audit_logs
Model-policy decisions are appended to ai_audit_log; agent and skill executions record model, token, latency, cost and success fields where the routed path provides them. Skill runs also record a prompt_hash (SHA256[:16] of the template) and a schema_valid boolean, so prompt provenance is inspectable. Audit events (skill_executed, workflow_triggered, scheduled_task_fired) are written to audit_logs per tenant. The true post-call provider ledger and final migration of legacy call sites remain active hardening work.
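A prompt_hash of the form SHA256[:16] could be computed as below. The UTF-8 encoding and the use of the hexadecimal digest are assumptions; the text only states that the hash is the first 16 characters of a SHA-256 over the template.

```python
import hashlib


def prompt_hash(template):
    """First 16 hex characters of the SHA-256 digest of the template text."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()[:16]
```

Two runs that record the same hash used the same template text, which is what makes prompt provenance checkable after the fact.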

See it in production

AEGIS OS is live and taking new tenants. Operators can start free; enterprise reviewers can inspect the architecture, Q&A and runbook first.