This repository is a fork of llama.cpp with better CPU and hybrid GPU/CPU performance, new SOTA quantization types, first-class Bitnet support, better DeepSeek performance via MLA, FlashMLA, fused MoE ...
Jackrong, the developer behind Qwopus, has released Gemopus, a family of Claude Opus-style fine-tunes built on Google's ...