Engineering principles
The decisions that shaped the stack, and that guide every future choice.
Open by default
Prefer open-source over SaaS where reliability is comparable. You can always self-host the full stack.
Data sovereignty first
Your agent conversations, documents, and tenant data never leave the database without your explicit action.
Model agnosticism
LLM providers are swappable. Groq, Gemini, Anthropic, or local Ollama: change a config line, not code.
Multi-tenant by design
Tenant isolation at the database schema level (not just a WHERE clause), so data leakage is structurally impossible.
Lean runtime
No heavyweight frameworks. Vanilla JS frontend keeps the client bundle under 50 kB with zero build tooling.
Auditable logs
Every agent action and tool call is written to an audit log. You can reconstruct exactly what any agent did and why.
Backend & API
Python 3.11
Language
The application runtime. Chosen for its mature AI/ML ecosystem, readable code, and broad library support for every integration we need.
Flask
Web framework
Lightweight WSGI framework handling all HTTP routing, blueprints, and session management. Jinja2 templating for server-rendered pages.
Gunicorn
WSGI server
Production WSGI server running 3 worker processes behind Nginx. Handles concurrent requests with graceful restarts on deploy.
Flask-Login
Auth sessions
Session-based authentication with secure cookie storage, login_required decorators, and remember-me persistent sessions.
Data Layer
PostgreSQL 16
Primary database
All structured data: users, tenants, agents, tasks, schedules, chat history. Multi-tenant isolation via per-tenant schemas (tenant_{id}).
pgvector
Vector search
PostgreSQL extension for storing and querying 768-dim embedding vectors. Powers Knowledge Base and Memoria semantic search without a separate vector DB. All tenants unified at 768-dim for consistent cosine similarity.
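A typical similarity query against such a table might look like the sketch below (table and column names are illustrative, not taken from the AEGIS schema); pgvector's `<=>` operator computes cosine distance, so `1 - distance` gives cosine similarity:

```sql
-- Top 5 knowledge-base chunks closest to a query embedding.
-- :query_embedding is a 768-dim vector bound by the application.
SELECT id, chunk_text,
       1 - (embedding <=> :query_embedding) AS cosine_similarity
FROM kb_chunks
ORDER BY embedding <=> :query_embedding
LIMIT 5;
```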
psycopg2
DB driver
Battle-tested synchronous PostgreSQL adapter. Used with connection pooling and RealDictCursor for ergonomic row-as-dict access patterns.
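Schema-level tenant isolation means every pooled connection must be pointed at the right schema before queries run. A minimal sketch of the pattern (the function name and validation are illustrative, not AEGIS's actual code); the returned statement would be passed to `cursor.execute()` on a psycopg2 connection checked out of the pool:

```python
def tenant_search_path(tenant_id: int) -> str:
    """Build a SET search_path statement for the tenant_{id} schema.

    Validating the id up front means the schema name can never carry
    injected SQL, even though it is interpolated into the statement.
    """
    if not isinstance(tenant_id, int) or tenant_id < 0:
        raise ValueError("tenant_id must be a non-negative integer")
    return f"SET search_path TO tenant_{tenant_id}, public"
```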
Docker
Database container
PostgreSQL runs in a Docker container on the VPS with persistent volume mounts. Isolated from the application process for clean upgrades. Enterprise self-hosted deployments use a full Docker Compose stack (pgvector, LiteLLM, AEGIS app, nginx, optional Ollama) for a complete on-premise setup in 2-3 hours.
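A Compose file for that enterprise stack might be shaped roughly like this (service names, images, and ports are assumptions for illustration, not the shipped file):

```yaml
services:
  db:
    image: pgvector/pgvector:pg16        # PostgreSQL 16 + pgvector
    volumes:
      - pgdata:/var/lib/postgresql/data  # persistent volume mount
  litellm:
    image: ghcr.io/berriai/litellm:main  # model router / proxy
    depends_on: [db]
  app:
    build: .                             # the AEGIS application
    depends_on: [db, litellm]
  nginx:
    image: nginx:stable
    ports: ["443:443"]
    depends_on: [app]
  # ollama:                              # optional, for air-gapped inference
  #   image: ollama/ollama

volumes:
  pgdata:
```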
AI & Agent Inference
OpenClaw Gateway
Agent orchestrator
Internal agent orchestration layer. Routes messages to the correct agent, loads SOUL personality profiles, and manages the tool-calling loop.
LiteLLM Proxy
Model router
Translates between API formats. Enables seamless switching between Groq, Gemini, Anthropic Claude, and local Ollama with automatic fallback chains.
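Provider switching is configuration, not code. A LiteLLM proxy config along these lines would wire up the primary and fallback models (the key names follow LiteLLM's documented schema, but the exact model identifiers and fallback syntax should be checked against your LiteLLM version):

```yaml
model_list:
  - model_name: primary-chat
    litellm_params:
      model: groq/llama-3.3-70b-versatile
      api_key: os.environ/GROQ_API_KEY
  - model_name: fallback-chat
    litellm_params:
      model: gemini/gemini-1.5-flash
      api_key: os.environ/GEMINI_API_KEY

router_settings:
  fallbacks:
    - primary-chat: [fallback-chat]
```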
Groq
Primary LLM provider
Ultra-fast inference via Llama 3.3 70B. Used as the primary model for all 7 agents: fast enough for real-time chat, powerful enough for complex reasoning.
Google Gemini
Fallback LLM
Gemini 1.5 Flash serves as automatic fallback when Groq hits rate limits. Transparent to the user: the agent response is the same quality.
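The fallback chain itself is simple in principle; a provider-agnostic sketch (the RateLimitError and the provider callables are illustrative stand-ins, not AEGIS internals):

```python
class RateLimitError(Exception):
    """Raised by a provider callable when its rate limit is hit."""

def complete_with_fallback(prompt, providers):
    """Try each (name, call) pair in order; first success wins."""
    failures = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except RateLimitError:
            failures.append(name)  # fall through to the next provider
    raise RuntimeError(f"all providers rate-limited: {failures}")
```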
Ollama
Local inference
Runs local models (Qwen 2.5 Coder, nomic-embed-text) for embedding generation and code tasks. Zero egress cost. Fully air-gapped option for sensitive deployments.
text-embedding-3-small
Embedding model
1536-dim embeddings for Knowledge Base ingestion and semantic search. Can be replaced with nomic-embed-text via Ollama for a fully local setup.
Integrations & Channels
Telegram Bot API
Primary notification channel
Agents deliver results, workflow completions, and Boardroom syntheses directly to your Telegram. Long-polling keeps it free of webhooks and public endpoints.
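Long-polling hinges on offset bookkeeping: per the Bot API contract, passing the highest seen update_id + 1 to the next getUpdates call acknowledges everything before it. A sketch of that loop logic with the network call injected (the fetch callable is an illustrative stand-in for the actual API request):

```python
def next_offset(updates, current):
    """Advance past every update we have seen; highest update_id + 1
    acknowledges all earlier updates on the next poll."""
    if not updates:
        return current
    return max(u["update_id"] for u in updates) + 1

def poll_once(fetch, offset):
    """One long-poll cycle: fetch(offset) stands in for getUpdates."""
    updates = fetch(offset)
    return updates, next_offset(updates, offset)
```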
Google OAuth 2.0
Auth & workspace
Two separate flows: Google Sign-In for authentication (PKCE flow, no client secret in browser) and Google Workspace OAuth for Calendar/Gmail agent access.
HubSpot OAuth
CRM integration
OAuth 2.0 connection to HubSpot CRM. Sales agent reads contacts and company data to enrich outreach. Tokens stored encrypted per tenant.
HeyGen
AI video generation
Luna (Creative Producer) generates personalized avatar videos from agent-written scripts. Completed videos are delivered via Telegram.
ElevenLabs
AI voice synthesis
Voice clone synthesis for audio briefings. Faster than video: agents can produce audio updates in seconds and deliver them as voice messages.
Gmail SMTP/IMAP
Email channel
Inbound email parsing and outbound delivery via Gmail. Agents can read, draft, and send on behalf of your connected inbox.
Microsoft Bot Framework
Teams connector (Enterprise)
Azure Bot resource + Bot Connector REST API for Microsoft Teams. Employees message the AEGIS bot in Teams exactly as they message each other; all processing happens server-side. Token cache is thread-safe for multi-worker deployments. Enabled via the TEAMS_ENABLED=true env var.
python3-saml
SAML 2.0 SSO (Enterprise)
OneLogin's python3-saml library implements SAML 2.0 SP-initiated SSO. Enterprise customers paste their Azure AD / Entra Federation Metadata XML once; employees then log in with existing corporate credentials. SP certificate generated with OpenSSL, stored in environment variables.
Frontend & UI
Vanilla JS + CSS
Client-side
No frameworks, no build step, no hydration latency. The entire client bundle is under 50 kB. Pages render in <200 ms even on slow connections.
CSS Custom Properties
Design system
Theming via CSS variables: light/dark mode, brand colours, spacing. Instant theme switching with zero flash via an inline anti-flash script in <head>.
Jinja2
Server-side rendering
Flask's default template engine. All pages are server-rendered HTML โ fast first paint, excellent SEO, no client-side routing complexity.
Inter (Google Fonts)
Typography
Variable-weight Inter for all UI text. Loaded asynchronously with a system-font fallback stack to prevent layout shift.
Infrastructure & DevOps
VPS (Linux)
Compute
Dedicated VPS running Ubuntu. Single-tenant capable: one AEGIS instance per customer for maximum isolation and compliance.
Nginx
Reverse proxy
Handles TLS termination, HTTP-to-HTTPS redirects, static file serving, and proxying to Gunicorn. Security headers applied at the Nginx layer.
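That proxy layer might be configured roughly like this (paths, the upstream port, and the header set are illustrative assumptions, not the production file):

```nginx
server {
    listen 443 ssl;
    server_name aegis.example.com;

    # Security headers applied at the proxy, not in the app
    add_header Strict-Transport-Security "max-age=31536000" always;
    add_header X-Content-Type-Options "nosniff" always;

    location /static/ {
        alias /srv/aegis/static/;         # Nginx serves static files directly
    }

    location / {
        proxy_pass http://127.0.0.1:8000; # Gunicorn upstream
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```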
Let's Encrypt / HTTPS
TLS
Auto-renewed TLS certificates via Certbot. HSTS enabled. All traffic is encrypted in transit; no plain-HTTP fallback in production.
systemd
Process management
Gunicorn and the scheduler run as systemd services with auto-restart on failure. Deployments are a git pull plus a service restart that completes in under 5 seconds.
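A unit file for the Gunicorn service might look like this sketch (paths and the app module are assumptions; the 3-worker count comes from the Gunicorn card above):

```ini
[Unit]
Description=AEGIS Gunicorn service
After=network.target

[Service]
WorkingDirectory=/srv/aegis
ExecStart=/srv/aegis/venv/bin/gunicorn -w 3 -b 127.0.0.1:8000 app:app
Restart=on-failure

[Install]
WantedBy=multi-user.target
```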
GitHub
Version control & deploy
All application code is versioned on GitHub. Deploys are manual git pulls, giving you full control and an audit trail of exactly what ran when.
APScheduler
Cron scheduler
In-process cron scheduler for agent task and workflow schedules. Runs in the same Python process โ no Celery, no Redis, no separate broker to manage.
Document Processing (Knowledge Base)
PyPDF2
PDF parsing
Extracts text from PDF files page-by-page. Pure Python โ no external binary dependencies.
python-docx
Word document parsing
Reads .docx files paragraph by paragraph. Supports modern OOXML format (Word 2007+).
openpyxl
Spreadsheet parsing
Reads .xlsx files sheet-by-sheet with read-only mode for memory efficiency. Extracts cell values as plain text.
python-pptx
Presentation parsing
Extracts text from .pptx slide shapes. Each slide becomes a labelled section in the Knowledge Base chunk.
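Whatever the source format, the extracted text is chunked before embedding. A minimal sliding-window sketch (the chunk size and overlap are illustrative defaults, not AEGIS's actual parameters):

```python
def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks with overlap, so sentences that
    straddle a boundary still appear intact in at least one chunk."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```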
Skills Engine & Memoria
Skills Engine
Python singleton (skill_engine.py)
Runs versioned business procedures: 23 skills total (8 system + 15 business templates) with per-tenant overrides, execution logs, and a self-improving feedback loop. Three skills auto-inject live CRM or Memoria context before the LLM call. Auto-Learn Mode (Pro+) runs nightly: detects underperforming skills (avg rating < -0.2 over ≥3 runs), generates an improved prompt via LLM, auto-applies it, and notifies via Telegram.
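The Auto-Learn trigger condition reduces to a small predicate; a sketch using the thresholds stated above (the function itself is illustrative, not AEGIS's code):

```python
def is_underperforming(ratings: list[float],
                       threshold: float = -0.2,
                       min_runs: int = 3) -> bool:
    """True when a skill has enough runs and its average rating has
    dropped below the threshold (avg < -0.2 over >= 3 runs)."""
    if len(ratings) < min_runs:
        return False                      # not enough signal yet
    return sum(ratings) / len(ratings) < threshold
```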
Memoria (Knowledge Graph)
pgvector + D3.js force graph
Stores agent insights, vault documents, and user knowledge as typed nodes with edges. Semantic search via pgvector cosine similarity. D3 force-directed graph for exploration. Crystallization service extracts 2-3 insights per agent response automatically.
Crystallization Service
Background LLM extraction
Fire-and-forget daemon threads extract business insights from every agent response. Rate-limited to 5 extractions per 10 min per tenant. Semantic linking finds related Memoria nodes and creates 'crystallized_from' edges automatically.
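The per-tenant cap (5 extractions per 10 minutes) is a sliding-window rate limit; a minimal sketch with the clock injected for testability (class and method names are illustrative):

```python
from collections import defaultdict, deque

class ExtractionLimiter:
    """Allow at most max_calls extractions per window_s seconds, per tenant."""

    def __init__(self, max_calls: int = 5, window_s: int = 600):
        self.max_calls = max_calls
        self.window_s = window_s
        self._calls = defaultdict(deque)   # tenant_id -> call timestamps

    def allow(self, tenant_id: int, now: float) -> bool:
        q = self._calls[tenant_id]
        while q and now - q[0] >= self.window_s:
            q.popleft()                    # drop calls outside the window
        if len(q) < self.max_calls:
            q.append(now)
            return True
        return False
```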
Service Guardian
Heartbeat watchdog (guardian_service.py)
All background workers write heartbeats to PostgreSQL every loop cycle. Guardian detects stale workers, evicts zombie advisory locks via pg_terminate_backend, and sends admin Telegram alerts. Admin panel shows live service health.
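Staleness detection is a comparison of heartbeat timestamps against a cutoff; a sketch (the 120-second threshold is an illustrative assumption, and in AEGIS the timestamps would come from PostgreSQL, not a dict):

```python
def stale_workers(heartbeats: dict[str, float],
                  now: float,
                  max_age_s: float = 120.0) -> list[str]:
    """Return the workers whose last heartbeat is older than max_age_s."""
    return [name for name, last in heartbeats.items()
            if now - last > max_age_s]
```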
Trust Infrastructure
agent_runs · prompt_hash · audit_logs
Every LLM call is logged to a per-tenant
agent_runs table: agent, model, trigger type, input/output tokens, latency, cost, and success flag. Every skill run records a prompt_hash (SHA256[:16] of the template) and a schema_valid boolean, so you always know which prompt version produced which output. Audit events (skill_executed, workflow_triggered, scheduled_task_fired) are written to audit_logs per tenant. 90-day retention, cleaned nightly by the guardian. Migrations are serialised with pg_advisory_xact_lock to prevent DDL races across Gunicorn workers.
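The prompt_hash convention (SHA256[:16] of the template) is reproducible in a couple of lines:

```python
import hashlib

def prompt_hash(template: str) -> str:
    """First 16 hex chars of the SHA-256 of the prompt template,
    matching the SHA256[:16] convention described above."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()[:16]
```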