Tags: transparency · open-source · stack · local-ai

The VORLUX AI Stack: Every Tool We Use, Nothing Hidden

Jacobo Gonzalez Jaspe

When we tell clients their AI will run locally with no cloud dependency, the natural follow-up is: “Okay, but what exactly are you running?” Fair question. If we’re asking you to trust us with your infrastructure, you deserve to see everything under the hood.

This post is our full technology disclosure. Every component, every tool, every decision — and why we made it. No proprietary black boxes. No vague references to “our AI platform.” Just the actual stack.

The Core Components

Here’s everything that powers VORLUX AI, from inference to interface:

| Layer | Technology | Role | Why This One |
|---|---|---|---|
| Inference | Ollama | LLM serving | Best local inference server, 14 models loaded |
| API | FastAPI + Python | REST API & orchestration | Fast, typed, async-native |
| Dashboard | Next.js | Internal operations dashboard | React ecosystem, SSR, real-time |
| Public Site | Astro | vorluxai.com | Static-first, fast, SEO-optimized |
| Database | SQLite | All persistence | Zero config, zero network, battle-tested |
| Automation | n8n | Workflow automation | Visual workflows, self-hosted |
| Search | FAISS + BM25 | RAG retrieval | Vector + keyword hybrid search |
| Scheduling | BackgroundScheduler | Cron jobs | 58 scheduled tasks, Python-native |
| Cache | Redis | Session & task cache | In-memory speed, Docker-hosted |
| Hardware | Mac M3 Pro 32GB | Primary server | Apple Silicon = best performance/watt |

Every single component either runs on our hardware or on the client’s hardware. Nothing phones home. Nothing sends telemetry. Nothing requires an internet connection to function.
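The FAISS + BM25 hybrid retrieval in the table above combines two signals: a keyword score (BM25) and a vector-similarity score, blended per document. Here is a minimal, dependency-free sketch of the idea. The corpus, weights, and scoring helpers are illustrative only; in the real stack the embeddings live in a FAISS index and come from a dedicated embedding model.

```python
import math

# Toy corpus: each doc has tokens (keyword side) and an embedding
# (vector side). Both the docs and the 2-D vectors are made up.
DOCS = [
    {"id": "a", "tokens": ["local", "ai", "gdpr"],      "vec": [1.0, 0.0]},
    {"id": "b", "tokens": ["cloud", "pricing", "aws"],  "vec": [0.0, 1.0]},
    {"id": "c", "tokens": ["local", "inference", "m3"], "vec": [0.9, 0.1]},
]

def bm25_score(query_tokens, doc, docs, k1=1.5, b=0.75):
    """Simplified BM25: IDF-weighted term frequency with saturation."""
    n = len(docs)
    avg_len = sum(len(d["tokens"]) for d in docs) / n
    score = 0.0
    for term in query_tokens:
        df = sum(1 for d in docs if term in d["tokens"])
        if df == 0:
            continue
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        tf = doc["tokens"].count(term)
        denom = tf + k1 * (1 - b + b * len(doc["tokens"]) / avg_len)
        score += idf * (tf * (k1 + 1)) / denom
    return score

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv) if nu and nv else 0.0

def hybrid_search(query_tokens, query_vec, docs, alpha=0.5, top_k=2):
    """Blend keyword and vector scores; alpha weights the vector side."""
    scored = []
    for doc in docs:
        kw = bm25_score(query_tokens, doc, docs)
        vec = cosine(query_vec, doc["vec"])
        scored.append((alpha * vec + (1 - alpha) * kw, doc["id"]))
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]

print(hybrid_search(["local", "ai"], [1.0, 0.0], DOCS))  # ['a', 'c']
```

The hybrid matters because each side fails differently: BM25 misses paraphrases, vectors miss exact identifiers and rare terms. Blending the two covers both.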

How It All Fits Together

```mermaid
flowchart TB
    subgraph CLIENT["Client Layer"]
        SITE["Astro Site<br/>vorluxai.com"]
        DASH["Next.js Dashboard<br/>:3000"]
    end

    subgraph API_LAYER["API & Orchestration"]
        API["FastAPI API<br/>:8090"]
        ORCH["Orchestrator<br/>:8091"]
        N8N["n8n Workflows<br/>:5678"]
    end

    subgraph INFERENCE["Inference Layer"]
        OLLAMA["Ollama<br/>14 Models<br/>:11434"]
        RAG["FAISS + BM25<br/>RAG Search"]
    end

    subgraph DATA["Data Layer"]
        SQLITE[("SQLite<br/>All Persistence")]
        REDIS[("Redis<br/>Cache<br/>:6379")]
    end

    subgraph AUTOMATION["Automation Layer"]
        SCHED["BackgroundScheduler<br/>58 Cron Jobs"]
        LOOPS["36 Autonomous<br/>Loops"]
    end

    SITE --> API
    DASH --> API
    API --> OLLAMA
    API --> RAG
    API --> SQLITE
    API --> REDIS
    ORCH --> API
    ORCH --> N8N
    SCHED --> API
    LOOPS --> ORCH
    RAG --> SQLITE

    style CLIENT fill:#0B1628,color:#FAFAFA
    style INFERENCE fill:#059669,color:#fff
    style DATA fill:#F5A623,color:#0B1628
```

The 14 Models We Run

Not every task needs the same model. We serve 14 models through Ollama, loading each on demand and routing every request to the right one:

  • Gemma 2 9B — General-purpose reasoning and conversation
  • Llama 3.3 70B — Complex analysis and long-form generation
  • Mistral Small 24B — Fast, capable mid-range inference
  • Phi-4 — Lightweight tasks, fast turnaround
  • Qwen 2.5 72B — Multilingual tasks, excellent for Spanish
  • Qwen 2.5 Coder 7B — Code generation and review
  • DeepSeek V3 — Technical reasoning
  • Plus 7 specialized variants for embeddings, summarization, and classification

All running on a single Mac M3 Pro with 32GB of unified memory. No GPU cluster. No data center. One machine on a desk in Valencia.
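At its simplest, that routing is a task-to-model map plus a call to Ollama's HTTP API. The sketch below is illustrative, not our production router: the task categories and fallback are made up for this example, though the model tags, the `/api/generate` endpoint, and port 11434 are standard Ollama.

```python
import json
import urllib.request

# Illustrative task → model map; the real routing table is richer.
MODEL_ROUTES = {
    "chat":    "gemma2:9b",
    "analysis": "llama3.3:70b",
    "code":    "qwen2.5-coder:7b",
    "spanish": "qwen2.5:72b",
    "fast":    "phi4",
}
DEFAULT_MODEL = "mistral-small:24b"

def route(task: str) -> str:
    """Pick the model tag for a task type, falling back to the default."""
    return MODEL_ROUTES.get(task, DEFAULT_MODEL)

def generate(task: str, prompt: str, host: str = "http://localhost:11434") -> str:
    """POST the prompt to Ollama's /api/generate endpoint."""
    body = json.dumps({
        "model": route(task),
        "prompt": prompt,
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        f"{host}/api/generate", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

print(route("code"))     # qwen2.5-coder:7b
print(route("unknown"))  # mistral-small:24b (fallback)
```

Because every model answers the same API, swapping one out is a one-line change to the map, which is exactly why model switching takes minutes, not weeks.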

36 Autonomous Loops, 58 Cron Jobs

The system doesn’t just respond to requests — it works autonomously. Here’s what runs around the clock:

  • Content loops: Research, draft, review, publish — fully automated content pipeline
  • Quality loops: Code review, test execution, knowledge base updates
  • Monitoring loops: Health checks every 60 seconds, auto-restart on failure
  • Business loops: Lead research, market analysis, competitive intelligence

The BackgroundScheduler manages 58 cron jobs that trigger these loops on precise schedules. The watchdog system ensures everything stays alive. If a service crashes at 3 AM, it restarts itself before anyone notices.

We detailed how this self-healing architecture works in our operations documentation.

Why Open-Source Matters

Every component in our stack is either open-source or built by us in-house. This isn’t ideological — it’s practical:

  1. No license fees — Our clients don’t pay software licenses. Hardware is the only cost.
  2. No vendor lock-in — If Ollama disappears tomorrow, we switch to llama.cpp or vLLM. Same models, different runtime.
  3. Full auditability — Regulated clients can inspect every line of code that touches their data, which directly supports GDPR Article 25’s requirement for data protection by design.
  4. Community support — 50,000+ GitHub stars across our core dependencies. These aren’t experimental toys.

Compared to Cloud-Dependent Stacks

| Aspect | VORLUX AI (Local) | Typical Cloud Stack |
|---|---|---|
| Data location | Your hardware | AWS/Azure/GCP |
| Monthly cost | EUR 0 after hardware | EUR 500–5,000+/mo |
| Latency | < 100 ms first token | 200–800 ms+ |
| Internet required | No | Yes |
| GDPR complexity | Minimal | Significant |
| Vendor lock-in | None | High |
| Model switching | Minutes | Days–weeks |
| Uptime dependency | Your power | Their SLA |
| Audit trail | Full local logs | Provider-dependent |

The cloud stack isn’t wrong for everyone. But for businesses processing sensitive data under European regulation, local deployment eliminates entire categories of risk. We explored this tradeoff in depth in our cost analysis.

What This Means for You

When we deploy AI for your business, you get this exact stack — adapted to your hardware and your workloads. Not a watered-down version. Not a hosted service with a “local” label. The real thing, running on metal you own.

The Edge AI for SMEs service we’re launching in May uses this same architecture, scaled down to hardware that fits on a shelf and a budget that fits a small business.

See It in Action

We run live demos of this stack during our free assessment calls. No slides, no mockups — the actual system, running actual models, processing actual queries in real time.

Book your free 15-minute assessment and see for yourself what local AI looks like when it’s built properly.

Tomorrow, we reveal exactly what services we’re launching and what they cost. No surprises — just like the stack.


This is post 2 of our Launch Week series. Yesterday: Local AI Readiness Checklist. Tomorrow: Our Services and Pricing.

External references: Ollama | n8n Workflow Automation | Astro Web Framework | GDPR Article 25 & Local AI


