Hardware + frameworks + deployment tooling for running AI locally. The stack VORLUX uses in production.
Budget Edge AI for clients who won't buy NVIDIA.
— Intel's Arc GPU via OpenVINO is viable; performance varies by model family.
8GB, ~€250, runs Llama-3.1-8B class workloads.
— Our default Edge AI recommendation for cost-conscious deployments.
~€700, surprisingly competitive when macOS is acceptable.
— MLX + unified memory means you get throughput well above what the price suggests.
llama.cpp: the C++ runtime that Ollama wraps.
— Go direct when you need quantization control Ollama hides.
MLX: Apple's on-device inference framework.
— The fastest path to good throughput on Apple Silicon; often beats llama.cpp for the same quantization.
OpenVINO: Intel's answer for their hardware.
— Worth a look if the client's stack is Intel-heavy; otherwise MLX + llama.cpp covers more ground.