Qwen

Strix Halo at Full Context — Why Your Decode Drops 64% and What Actually Fixes It

16 May 2026·8 mins

Strix-Halo Benchmarks Rocm Vulkan Llama.cpp Qwen Mtp Inference

Strix Halo LLM Serving: 25 tok/s at 151k Context Under 100W

26 April 2026·7 mins

Ai Homelab Llm-Inference Strix-Halo Llama-Cpp Qwen Local-Ai