Strix Halo at Full Context — Why Your Decode Drops 64% and What Actually Fixes It · 16 May 2026 · 8 mins · Tags: Strix-Halo, Benchmarks, Rocm, Vulkan, Llama.cpp, Qwen, Mtp, Inference
Strix Halo LLM Serving: 25 tok/s at 151k Context Under 100W · 26 April 2026 · 7 mins · Tags: Ai, Homelab, Llm-Inference, Strix-Halo, Llama-Cpp, Qwen, Local-Ai