Running local models on Macs gets faster with Ollama's MLX support
Summary
Ollama has added MLX support for running local large language models on Apple Silicon Macs, along with caching improvements and support for a new model compression format. The preview release supports a 35-billion-parameter Qwen3.5 model, requires at least 32 GB of RAM, and takes advantage of the neural accelerators in M5 GPUs. Running models locally offers privacy advantages, but it remains limited by hardware and model availability.
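
To give a rough sense of why a 35B model is paired with a 32 GB RAM floor, the back-of-the-envelope sketch below estimates the memory taken by the weights alone at a few precisions. The bit widths are illustrative assumptions, not figures from the release, and the helper name `weight_memory_gb` is hypothetical. The estimate ignores the context cache, activations, and whatever the operating system and other apps need.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes).

    Assumes every parameter is stored at the same precision, which is a
    simplification; real compression formats add some per-block overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Illustrative precisions only; the source does not state which one is used.
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(35, bits):.1f} GB")
```

Under these assumptions, only the 4-bit case (about 17.5 GB of weights) leaves meaningful headroom on a 32 GB machine, which suggests why compression support matters for running a model of this size locally.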